ADR 0004 — No mutation, no eBPF in v1¶
Status. Accepted, in v1.
Context¶
Two large families of features are popular in the K8s policy ecosystem:
- Mutation, generation, cleanup — Kyverno's bread and butter. Modify incoming admission objects (defaulting, sidecar injection), create dependent resources, remove orphans on a schedule.
- eBPF / kernel-level enforcement — Tetragon, Falco. Hook syscalls and network events in-kernel, react in microseconds, block process exec.
Both are legitimate. Both are out of scope for Portal v1.
Decision¶
- v1 is non-mutating. Admission rules can
deny,warn, ordryrun. They cannot modify the object, cannot add a default, cannot inject a sidecar. - v1 has no resource generation. Portal does not create dependent resources (no auto-generated
NetworkPolicyper namespace, no defaultServiceAccount, etc.). - v1 has no cleanup. Portal does not delete or garbage-collect resources outside the scope of its own actions (the
evictaction evicts a pod the rule flagged; that's not "cleanup"). - v1 has no eBPF. Portal is a Go process talking to
kube-apiserver. It does not load BPF programs, does not read/sys/fs/bpf, does not require kernel features.
Rationale¶
Why not mutation in v1¶
- Conceptual integrity. Validation and mutation are different problems with different correctness properties. A mutation webhook must be idempotent; the rule schema would need new constructs (
patch,add,removeoperations). Tackling both at once in v1 is over-extension. - Kyverno already does it well. If you need mutation, run Kyverno. Portal coexists with Kyverno (see
../migration/coexistence-with-kyverno.md) — operators are not forced to pick. - The migration path is intact. podwatcher-poc was non-mutating; Portal v1 preserves that property exactly. Operators upgrading from podwatcher-poc don't encounter behavioural drift.
Mutation is plausibly v3. The rule schema would gain a mutate: block; the admission webhook would emit a JSON patch in the AdmissionResponse. The plumbing is well-understood. We're not doing it in v1 because shipping v1 with a tight scope beats shipping v1.5 with everything.
Why not eBPF, ever (for Portal)¶
eBPF is a different layer of the stack. Portal is layer 4 (admission), 5 (audit), 6 (network policy analysis), and — in v2 — layer 7 (K8s API audit log). eBPF lands at layer 7+8 (live process, live network, live syscalls).
- Operational profile is different. eBPF tools are DaemonSets, kernel-version-sensitive, often privileged. Portal is a Deployment, kernel-agnostic, non-privileged.
- The user already has tools. Tetragon and Falco are CNCF projects with significant adoption. Replicating their kernel-level work in Portal would duplicate effort and produce inferior results.
- v2's runtime story is API-audit-log-driven, not kernel-driven. The K8s API audit log gives Portal visibility into exec/attach/port-forward/secret-read at the API layer — broad enough for "did someone exec into a prod container" without descending into syscalls. v2 adds an
EventSourcethat consumes the audit log; the same v1 action engine handles the response.
If you need kernel-level runtime detection, run Tetragon or Falco. If you want their findings flowing through the same alerting and response pipeline as Portal's, a v3 candidate is a Tetragon/Falco gRPC consumer that wraps their events in api.EventMeta and emits to Portal's dispatcher.
Consequences¶
- The rule schema has no mutation verb. Adding one later is additive; no v1 rules will break.
- Documentation (this ADR plus
../security/threat-model.md) is explicit about what Portal does not cover. - The chart does not require any kernel capabilities, BPF helpers, or privileged containers. PSA-
restricted-clean. - The v2 plan has a clear seam: K8s audit log as new
EventSource; everything else reused. - The v3 plan can layer in mutation (new rule-schema block + new
MutationAdmissionWebhook) and an external runtime consumer (newEventSource). Both fit the existing module boundaries.