The problem and four ways to solve it
Every gateway deployment hits the same question eventually: "the built-in routing and auth get me 80% of the way, but I need to add custom logic for my specific use case." Maybe it's a fine-grained authorization rule that depends on JWT claims and MCP tool names together. Maybe it's injecting a header derived from the caller's identity. Maybe it's calling an external classification service on every LLM completion before it reaches the client.
Agentgateway and kgateway give you four mechanisms to extend the gateway's behaviour at request time, and each one is designed for a different class of problem:
- CEL (Common Expression Language) is a small, fast, side-effect-free expression language from Google. You write expressions directly in the policy CRD YAML. They compile once at config load, evaluate per-request inside the Rust dataplane, and need zero external infrastructure. This is the built-in default.
- OPA (Open Policy Agent) is a policy engine that evaluates Rego policies. It runs as a sidecar or remote service and the gateway calls it via the ext-authz protocol. Use it when your policy depends on external data (a user-role database, an entitlement store) or when you want to version and unit-test policies independently of the CRD.
- ext-authz (External Authorization) is a gRPC or HTTP callout the gateway makes on every request. Your service receives the request metadata, returns allow or deny, and can inject response headers. This is the full escape hatch for arbitrary auth logic: LDAP, custom token introspection, multi-factor step-up, legacy IdP integration.
- ext-proc (External Processing) is a bidirectional gRPC stream between the gateway and your service. Unlike ext-authz, it doesn't just gate traffic; it mutates it. Your service receives request/response headers and bodies, rewrites them, and sends them back. Use it when you need full programmatic body transformation mid-flight.
The rest of this post is a short guide to choosing between them. The headline: most teams never need to leave CEL. The platform is more powerful out of the box than it looks. But when you do need to extend, the hooks are there, each scoped to a specific kind of problem.
CEL: the built-in default
The first thing I tell every team ramping up on agentgateway or kgateway when they want to add custom policy, validation, or traffic handling: before you write any code, check whether a CEL expression in the policy CRD already does what you need. In my experience so far, it usually does.
CEL expressions cover six surfaces out of the box:
- Authorization rules (allow / deny / require against JWT claims, MCP tool names, source IPs, SPIFFE identities)
- Request and response transformation (header and body rewrite)
- Rate-limit descriptor computation (per-user, per-team, token-based cost)
- Logging and tracing enrichment (custom log fields, trace attributes, Prometheus labels)
- MCP tool-level filtering
- LLM request/response mutation
All declarative YAML. No sidecar, no external service, no separate deploy pipeline for your policy logic. The expression compiles once at config load and evaluates per-request inside the Rust dataplane.
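To make "compile once, evaluate per request" concrete, here is a rough Python analogy. It is not CEL and not the gateway's API: `compile_rule`, the context keys (`jwt`, `mcp_tool`), and the team and tool names are all hypothetical. It only sketches the shape of a side-effect-free rule that combines a JWT claim with an MCP tool name.

```python
from typing import Any, Callable, FrozenSet, Mapping

Context = Mapping[str, Any]


def compile_rule(required_team: str, allowed_tools: FrozenSet[str]) -> Callable[[Context], bool]:
    """Runs once, at 'config load'. Bad config fails here, not on a live request."""
    if not required_team or not allowed_tools:
        raise ValueError("rule needs a team and at least one tool")

    def evaluate(ctx: Context) -> bool:
        # Pure function of the request context: no I/O, no shared state.
        claims = ctx.get("jwt", {})
        return claims.get("team") == required_team and ctx.get("mcp_tool") in allowed_tools

    return evaluate


# Config load: build the rule once (hypothetical team and tool names).
rule = compile_rule("platform", frozenset({"deploy_service"}))

# Request time: evaluate the same compiled rule against each request's context.
print(rule({"jwt": {"team": "platform"}, "mcp_tool": "deploy_service"}))  # True
print(rule({"jwt": {"team": "platform"}, "mcp_tool": "drop_database"}))   # False
print(rule({"jwt": {"team": "sales"}, "mcp_tool": "deploy_service"}))     # False
```

In the gateway the equivalent would be a short expression in the policy CRD rather than code you deploy. The point of the sketch is the split between a one-time compile step and a cheap per-request check.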
OPA: policy-as-code with external data
OPA (Open Policy Agent) is the step up from CEL when your authorization decision depends on data that isn't in the request. CEL sees the JWT claims, the headers, the body, and the source/backend metadata. OPA can pull in a user-entitlement table, a deny-list that updates every 30 seconds, or a set of compliance rules versioned in git and unit-tested before deploy. The policy language is Rego, the runtime is the OPA agent (sidecar or remote), and the gateway calls it via the ext-authz gRPC protocol. From the gateway's perspective it's just another ext-authz backend, but the policy surface is dramatically richer.
The Solo ext-auth service ships an OPA plugin in the AuthConfig chain. You can inline Rego directly in the CRD or point at an OPA bundle server for dynamic updates. The AuthConfig plugin chain (JWT + OPA Rego + OAuth2 introspection + claimsToHeaders) is covered in Solo external auth service.
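The difference from CEL is easier to see in code. The sketch below is plain Python, not Rego and not the OPA API: `RefreshingDenyList`, `allow`, and the request keys are hypothetical stand-ins. It models a decision that combines request input (the JWT subject and the MCP tool name) with two pieces of external data: an entitlement table and a deny-list that is re-fetched once its 30-second TTL expires.

```python
import time
from typing import Callable, Iterable, Mapping, Optional


class RefreshingDenyList:
    """Caches an externally sourced deny-list and re-fetches it after a TTL."""

    def __init__(self, fetch: Callable[[], Iterable[str]],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: frozenset = frozenset()
        self._loaded_at: Optional[float] = None

    def contains(self, subject: str) -> bool:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._entries = frozenset(self._fetch())  # external data, not in the request
            self._loaded_at = now
        return subject in self._entries


def allow(request: Mapping, entitlements: Mapping, deny_list: RefreshingDenyList) -> bool:
    """Deny-list wins; otherwise the subject must be entitled to the tool."""
    subject = request["jwt"]["sub"]
    if deny_list.contains(subject):
        return False
    return request["mcp_tool"] in entitlements.get(subject, frozenset())


# Hypothetical data: in a real deployment this would come from a bundle or a database.
entitlements = {"alice": {"search"}, "bob": {"search", "deploy"}}
deny_list = RefreshingDenyList(fetch=lambda: {"bob"})

print(allow({"jwt": {"sub": "alice"}, "mcp_tool": "search"}, entitlements, deny_list))  # True
print(allow({"jwt": {"sub": "bob"}, "mcp_tool": "deploy"}, entitlements, deny_list))    # False
```

Because the decision is a plain function of input plus data, it can be unit-tested in isolation, which is the same property that makes Rego policies testable before deploy.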
ext-authz: bring your own auth service
When the logic doesn't fit Rego or CEL, you deploy your own gRPC or HTTP service and wire it into the gateway as an ext-authz backend. The gateway sends the request metadata (or the full headers and body, depending on config), your service returns OK or DENIED plus optional response headers, and the gateway enforces the decision. This is the full escape hatch: LDAP lookups, custom token introspection against a proprietary IdP, multi-factor step-up flows, risk scoring services. Anything you can write in a gRPC handler, you can run here.
The tradeoff is operational: you now own a service that sits in the request path of every call. It needs to be fast (the gateway blocks until it responds), highly available (if it's down, requests fail unless you configure fail-open, which has its own risks), and observable. Most teams that go this route already have an internal auth service they're trying to integrate, not one they're building from scratch for the gateway.
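The fail-open versus fail-closed choice is worth seeing in miniature. This is a Python model of the contract, not the ext-authz gRPC API: `Decision`, `enforce`, the two example services, and the header names are hypothetical. It sketches what the gateway side does when your service answers and when it doesn't.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Mapping


@dataclass(frozen=True)
class Decision:
    allowed: bool
    headers: Dict[str, str] = field(default_factory=dict)  # optional headers to inject


AuthService = Callable[[Mapping[str, str]], Decision]


def enforce(service: AuthService, metadata: Mapping[str, str], fail_open: bool = False) -> Decision:
    """Gateway-side view: call the auth service, then apply the failure policy."""
    try:
        return service(metadata)
    except (ConnectionError, TimeoutError):
        # Service down or too slow. fail_open=True lets traffic through unauthenticated.
        return Decision(allowed=fail_open)


def token_service(metadata: Mapping[str, str]) -> Decision:
    # Stand-in for custom token introspection against an internal IdP.
    if metadata.get("authorization") == "Bearer valid-token":
        return Decision(True, {"x-user-id": "alice"})
    return Decision(False)


def unreachable_service(metadata: Mapping[str, str]) -> Decision:
    raise ConnectionError("auth service unreachable")


print(enforce(token_service, {"authorization": "Bearer valid-token"}))
# Decision(allowed=True, headers={'x-user-id': 'alice'})
print(enforce(unreachable_service, {}).allowed)                  # False (fail-closed)
print(enforce(unreachable_service, {}, fail_open=True).allowed)  # True  (fail-open)
```

The last line is the risk in one picture: with fail-open, an outage of your auth service quietly turns into "allow everything".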
ext-proc: modify traffic, not just gate it
ext-proc is a different animal. It doesn't decide whether a request should proceed. It changes what the request (or response) looks like when it does. The gateway opens a bidirectional gRPC stream to your ext-proc service. Your service receives the request headers, can ask for the body, mutate both, and send them back. Same for the response path. Full streaming, both directions.
The use case that keeps coming up in MCP and LLM conversations is output gating: running every model completion through a DLP scanner, a PII stripper, or a prompt-injection detector before the client sees it. CEL can do simple regex-based redaction (the cookbook shows regexReplace() in LLM transformations), but if you need to call an external classification API or apply a model-based detector, ext-proc is the mechanism.
The latency cost is higher than ext-authz because the body crosses the wire to your service and back. Design accordingly: stream-process where possible, avoid buffering entire LLM completions in memory, and put the ext-proc service close to the gateway (same node or same availability zone).
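Here is a minimal sketch of what "stream-process, don't buffer" can look like inside an ext-proc-style service. It is plain Python with no gRPC, and the names (`redact_stream`, `SSN`, the mask text) are hypothetical. A fixed-length regex stands in for whatever detector you actually run. The idea is to hold back only the last few characters of each chunk, so a match split across chunk boundaries is still caught while memory stays bounded.

```python
import re
from typing import Iterable, Iterator

# Stand-in detector: a fixed-length, SSN-shaped pattern (11 characters).
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
MAX_MATCH_LEN = 11


def redact_stream(chunks: Iterable[str],
                  pattern: "re.Pattern[str]" = SSN,
                  max_len: int = MAX_MATCH_LEN,
                  mask: str = "[REDACTED]") -> Iterator[str]:
    """Redact matches in a chunked stream without buffering the whole body.

    Assumes every match is at most max_len characters and the mask itself
    can never match the pattern.
    """
    hold = max_len - 1  # longest possible unfinished match at the end of the buffer
    buf = ""
    for chunk in chunks:
        buf = pattern.sub(mask, buf + chunk)
        if len(buf) > hold:
            yield buf[:-hold]   # safe to release: no match can still start here
            buf = buf[-hold:]   # keep the tail in case a match continues
    if buf:
        yield buf               # end of stream: flush what is left


# A completion arriving in three chunks, with the sensitive value split across them.
chunks = ["My SSN is 123-", "45-67", "89, thanks"]
print("".join(redact_stream(chunks)))  # My SSN is [REDACTED], thanks
```

The buffer never grows beyond one chunk plus `max_len - 1` characters. A model-based detector would need a different windowing strategy, but the hold-back idea carries over.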
The gradient
Picture the four mechanisms laid out left to right in the order covered above: CEL, OPA, ext-authz, ext-proc. Start on the left. Move right only when you know why. Most teams never need to leave CEL. The ones that do move to OPA because they need external data, or to ext-authz because they have a legacy IdP that speaks a proprietary protocol. ext-proc is for a specific class of problem (body mutation, output gating) that the other three can't address.
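If it helps, the same decision can be written down as a small function. This is just the guidance above encoded in Python; the `Requirement` fields and `choose_mechanism` are hypothetical names, and real decisions will have more nuance than three booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    needs_external_body_processing: bool = False  # e.g. calling a classifier on completions
    needs_custom_auth_protocol: bool = False      # e.g. a legacy IdP with a proprietary protocol
    needs_external_policy_data: bool = False      # e.g. an entitlement store or deny-list


def choose_mechanism(req: Requirement) -> str:
    """Default to CEL; move to another mechanism only when a requirement forces it."""
    if req.needs_external_body_processing:
        return "ext-proc"
    if req.needs_custom_auth_protocol:
        return "ext-authz"
    if req.needs_external_policy_data:
        return "OPA"
    return "CEL"


print(choose_mechanism(Requirement()))                                     # CEL
print(choose_mechanism(Requirement(needs_external_policy_data=True)))      # OPA
print(choose_mechanism(Requirement(needs_custom_auth_protocol=True)))      # ext-authz
print(choose_mechanism(Requirement(needs_external_body_processing=True)))  # ext-proc
```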
The platform is powerful out of the box. Fine-grained authorization, per-user rate limits, claim-based header injection, observability enrichment: all CEL in a policy CRD. But when you do need to extend, the hooks are there, each one scoped to a specific kind of problem, and you don't have to choose blind.