Eight enterprise controls for MCP traffic, expressed in agentgateway policy, by Tom O'Rourke

Problem statement

The scenario

You may want to stand up an MCP server in your organisation. Maybe it's the GitHub MCP server, twenty-odd tools on it: search_repositories, get_pull_request, create_pull_request, merge_pull_request, create_or_update_file, delete_file, list_workflow_runs, the whole catalogue. You want different people across your organisation to be allowed to do different things with it. Your developers can read pull requests on every repo, but only your platform engineering team can merge them. Your contractors get search and read, nothing write. Your finance team doesn't see GitHub tools at all.

Token scope is the only control

The GitHub MCP server's only auth mechanism is the token the agent presents. Whether that's a human's OAuth token or a PAT, the agent presents one identity, the server authenticates the call, and that's the gate. The MCP server can drive a huge slice of the GitHub API, so you have to enable the permissions that you're comfortable granting your AI tools, and the only control surface you actually have is the scope of the token itself.

Which means: if you want fine-grained per-user access, your only lever is to mint multiple tokens, one per user, each scoped down to the narrowest set of permissions that user is allowed. You then have to provision them, rotate them, store them somewhere the agent can pick the right one at the right time, and audit who used which token to do what. The token is the policy. And tokens are a blunt instrument.

Pushing policy into the MCP server isn't the answer either

Every MCP server you pull off the shelf would have to reinvent authn and authz. The policy ends up living in tool code instead of in a policy layer. Rotating an identity provider signing key means redeploying the tool. That's not where enterprises want this to live. The pattern you already use for HTTP APIs, a policy layer in front, is what you want for MCP too.

What's missing from a stock MCP server

No per-call authz decision based on who the human is, what they're doing right now, and on what resource.
No way to say "this agent can create_pull_request on repo-a but only pull_request_read on repo-b".
No way to enforce "only between 09:00 and 17:00 UK time, only from a known network, only when the user is in the platform-eng AD group".
No external policy engine integration (OPA, Cedar, custom).
No central audit trail decoupled from GitHub's own audit log.
No rate limiting per-tool, per-user, per-tenant.
No content inspection. You can't block create_or_update_file if the diff contains a secret pattern.
No prompt injection or tool-poisoning defence at the proxy layer.

About this lab

The sister lab at kagent-agentcore-solo-kb uses the Solo Knowledge Base MCP because it's safe to demo against publicly, but the gateway pattern is identical. alice and bob carry Keycloak JWTs, the gateway validates them at the listener, and a CEL allow-list on the MCP backend decides which tools each one is allowed to see and call. Swap Solo KB for GitHub MCP and the shape of the policy doesn't change.

The rest of this page is the eight bullets above, each one mapped to the exact EnterpriseAgentgatewayPolicy (enterpriseagentgateway.solo.io/v1alpha1) field that closes the gap. Where a control needs to step outside what the v2.3 schema documents (per-argument CEL, time of day, CIDR), the section says so explicitly and points at the documented escape hatch — BYO ext-authz with traffic.extAuth.forwardBody.

The shared scaffold (gateway + JWT)

Every example below assumes a Gateway, an AgentgatewayBackend with an MCP target pointing at the GitHub MCP server, and a JWT validator EnterpriseAgentgatewayPolicy on the listener. This block is the prerequisite for the eight that follow — read it once, then the section snippets are the only thing that changes.

# Gateway + JWT validator + remote MCP backend. Validated against
# agentgateway 2.3 CRD schemas and the Solo docs use-cases
# "MCP Tool-Level Access Control" and "MCP Server HTTPS Connectivity".
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: jwt-keycloak
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    jwtAuthentication:
      mode: Strict
      providers:
        - issuer: "https://idp.example.com/realms/solo"
          jwks:
            remote:
              jwksPath: "/protocol/openid-connect/certs"
              backendRef:
                name: keycloak
                namespace: auth
                kind: Service
                port: 8080
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: github-mcp
  namespace: agentgateway-system
spec:
  mcp:
    targets:
      - name: github
        static:
          host: api.githubcopilot.com
          port: 443
          path: /mcp/
          policies:
            tls:
              sni: api.githubcopilot.com   # HTTPS:443 needs SNI
# The upstream GitHub PAT is injected on the HTTPRoute via a
# RequestHeaderModifier filter setting Authorization: Bearer <pat>
# — left out of this scaffold so the auth strategy doesn't leak
# into every section snippet. See the Solo docs use-case
# "MCP Server HTTPS Connectivity" for the HTTPRoute shape.

Everything that follows targets that AgentgatewayBackend. Add the snippets cumulatively or one at a time — each is self-contained.

Q1 Per-call authz on who the human is and what they're doing

The MCP server treats any caller holding a valid token as one identity with one permission set. We want a per-call decision keyed on the JWT subject (or group) and the MCP tool name.

Gap

A GitHub MCP server with a PAT can drive any GitHub API the PAT scope allows. The only knob is the scope on the token.

Solo answer

CEL allow-list at spec.backend.mcp.authorization.policy.matchExpressions[] reading jwt.<claim> and mcp.tool.name. Tools that don't match are auto-hidden from tools/list, and direct tools/call requests for them are rejected with a JSON-RPC error before the request reaches the upstream MCP server.

YAML

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-rbac
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
            # platform-eng and developers can read and search
            - 'jwt.groups.exists(g, g in ["platform-eng", "developers"]) && mcp.tool.name in ["get_pull_request", "search_repositories", "list_issues"]'
            # only platform-eng can merge or open PRs
            - 'jwt.groups.exists(g, g == "platform-eng") && mcp.tool.name in ["merge_pull_request", "create_pull_request"]'
            # contractors get search only
            - 'jwt.groups.exists(g, g in ["contractors"]) && mcp.tool.name == "search_repositories"'

matchExpressions is an OR-joined list. A tool call is allowed when any one expression returns true; groups are not aggregated across expressions for you. If platform-eng should also pick up everything a developer can do, that has to be expressed explicitly, either by listing both groups inside a single jwt.groups.exists(g, g in [...]) call (the pattern above) or by repeating the tool list under a second expression keyed on platform-eng. Finance gets no tools at all because no expression matches their JWT.

test sample JWTs + curl probes

# Four personas. Decoded payloads (the "groups" claim is what the
# allow-list reads):
#   alice    { "sub": "alice",    "groups": ["platform-eng"] }
#   bob      { "sub": "bob",      "groups": ["developers"]   }
#   carlos   { "sub": "carlos",   "groups": ["contractors"]  }
#   dana     { "sub": "dana",     "groups": ["finance"]      }
#
# Mint with step (any IdP that signs RS256 against the JWKS the
# scaffold validator points at works the same way):
ALICE=$(step crypto jwt sign --key dev.key --kid dev \
  --iss https://idp.example.com/realms/solo --aud github-mcp \
  --sub alice --exp $(($(date +%s)+3600)) \
  --payload '{"groups":["platform-eng"]}')

# tools/list  — alice sees the union (read+merge+create+search):
curl -sS -H "Authorization: Bearer $ALICE" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  https://gw.example.com/mcp/ | jq '.result.tools[].name'
# "get_pull_request"
# "search_repositories"
# "list_issues"
# "merge_pull_request"
# "create_pull_request"

# tools/call merge_pull_request as alice  — allowed (200):
curl -sS -H "Authorization: Bearer $ALICE" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call",
       "params":{"name":"merge_pull_request",
                  "arguments":{"owner":"solo-io",
                               "repo":"demos","pullNumber":42}}}' \
  https://gw.example.com/mcp/

# tools/call merge_pull_request as bob  — denied at the gateway.
# merge_pull_request is hidden from bob's catalogue, and a direct
# call is rejected with a JSON-RPC error response before reaching
# the upstream MCP server.
curl -sS -H "Authorization: Bearer $BOB" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call",
       "params":{"name":"merge_pull_request",
                  "arguments":{"owner":"solo-io",
                               "repo":"demos","pullNumber":42}}}' \
  https://gw.example.com/mcp/

# dana sees an empty tools list — no expression matches finance.

Q2 Per-resource scoping inside a tool argument

Not just "can alice call create_pull_request" but "can alice call create_pull_request against repo-a while only being able to read repo-b". The resource lives inside the tool's JSON-RPC arguments.

Gap

The GitHub MCP server enforces no per-repo policy of its own — the PAT either has repo:write on the target or it doesn't.

Solo answer

BYO ext-authz with traffic.extAuth.forwardBody.maxSize set, so a gRPC ext-authz service receives the JSON-RPC body and reads params.name and params.arguments.repo to make the decision.

The native MCP authz CEL surface in 2.3 is jwt.<claim> and mcp.tool.name. Per-argument decisions go through ext-authz: agentgateway forwards the JSON-RPC body to a gRPC ext-authz service via traffic.extAuth.forwardBody, and that service reads params.name and params.arguments.<field> to make the call.

YAML

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-byo-extauth
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: my-mcp-authz
        namespace: agentgateway-system
        port: 9001
      forwardBody:
        maxSize: 8192   # JSON-RPC bodies are small; 8 KiB covers them
      grpc: {}

With forwardBody set, the ext-authz service sees: the Authorization header (full JWT to decode), the URL path (which MCP server), the JSON-RPC method, params.name (the tool) and params.arguments (the per-tool inputs). A per-repo decision is one read of params.arguments.repo. Caveat noted in the Solo use-case: with ext-authz at the HTTP layer, tools/list is all-or-nothing — the ext-auth service can deny the entire list, but it can't filter individual tools out of the response. Use native MCP authz (Q1) for the visibility filter and ext-authz for the per-resource action filter — they layer.

go my-mcp-authz / main.go — gRPC ext-authz service

// Minimal Envoy v3 ext-authz that reads the JSON-RPC body
// agentgateway forwards (because forwardBody is set), decodes
// params.name + params.arguments.repo, and decides per-repo.
//
// Wire format: envoy.service.auth.v3.Authorization (CheckRequest /
// CheckResponse). agentgateway's gRPC ext-authz client speaks this.

package main

import (
    "context"
    "encoding/json"
    "log"
    "net"

    authv3 "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
    statuspb "google.golang.org/genproto/googleapis/rpc/status"
    "google.golang.org/grpc"
    "google.golang.org/grpc/codes"
)

// repo -> groups that may write to it. Read access is universal.
var writeACL = map[string]map[string]bool{
    "solo-io/demos":    {"platform-eng": true, "developers": true},
    "solo-io/internal": {"platform-eng": true},
}

type rpc struct {
    Method string `json:"method"`
    Params struct {
        Name      string                 `json:"name"`
        Arguments map[string]interface{} `json:"arguments"`
    } `json:"params"`
}

type svc struct{ authv3.UnimplementedAuthorizationServer }

func (s *svc) Check(_ context.Context, r *authv3.CheckRequest) (*authv3.CheckResponse, error) {
    body := r.GetAttributes().GetRequest().GetHttp().GetBody()
    var msg rpc
    if err := json.Unmarshal([]byte(body), &msg); err != nil ||
        msg.Method != "tools/call" {
        return ok(), nil // not a tool call: don't second-guess
    }

    groups := groupsFromJWT(r) // parse Authorization header
    repo, _ := msg.Params.Arguments["repo"].(string)
    owner, _ := msg.Params.Arguments["owner"].(string)
    full := owner + "/" + repo

    if isWrite(msg.Params.Name) {
        allowed := writeACL[full]
        for g := range groups {
            if allowed[g] {
                return ok(), nil
            }
        }
        return denied("repo " + full + " not writable by your groups"), nil
    }
    return ok(), nil
}

func ok() *authv3.CheckResponse {
    return &authv3.CheckResponse{
        Status: &statuspb.Status{Code: int32(codes.OK)},
        HttpResponse: &authv3.CheckResponse_OkResponse{
            OkResponse: &authv3.OkHttpResponse{},
        },
    }
}

func denied(msg string) *authv3.CheckResponse {
    return &authv3.CheckResponse{
        Status: &statuspb.Status{Code: int32(codes.PermissionDenied), Message: msg},
        HttpResponse: &authv3.CheckResponse_DeniedResponse{
            DeniedResponse: &authv3.DeniedHttpResponse{Body: msg},
        },
    }
}

func main() {
    lis, err := net.Listen("tcp", ":9001")
    if err != nil { log.Fatal(err) }
    g := grpc.NewServer()
    authv3.RegisterAuthorizationServer(g, &svc{})
    log.Println("my-mcp-authz listening on :9001")
    log.Fatal(g.Serve(lis))
}

yaml my-mcp-authz / k8s.yaml — Deployment + Service

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-mcp-authz
  namespace: agentgateway-system
spec:
  replicas: 2
  selector:
    matchLabels: { app: my-mcp-authz }
  template:
    metadata:
      labels: { app: my-mcp-authz }
    spec:
      containers:
        - name: authz
          image: ghcr.io/example/my-mcp-authz:0.1.0
          ports: [{ containerPort: 9001, name: grpc }]
          readinessProbe:
            tcpSocket: { port: 9001 }
          resources:
            requests: { cpu: 50m, memory: 64Mi }
            limits:   { cpu: 500m, memory: 256Mi }
---
apiVersion: v1
kind: Service
metadata:
  name: my-mcp-authz
  namespace: agentgateway-system
spec:
  selector: { app: my-mcp-authz }
  ports:
    - name: grpc
      port: 9001
      targetPort: 9001
      appProtocol: grpc

Q3 Time of day, source network, and group membership in the same rule

"Only between 09:00 and 17:00 UK time, only from a known network, only when the user is in the platform-eng AD group." Group membership is in the JWT; time and network are not.

Gap

The MCP server has no view of clock or source IP, and groups aren't sent to it in any standard form.

Solo answer

Group membership goes in the native CEL allow-list at backend.mcp.authorization.policy.matchExpressions[] (Q1). Time-of-day and CIDR matching delegate to the same ext-authz endpoint as Q2 — OPA / Cedar do them natively.

The native CEL surface in 2.3 covers jwt.<claim> and mcp.tool.name; clock and CIDR live elsewhere. Two patterns work: bake the decision into the JWT at mint time (the IdP issues an office-hours-on-call claim and the CEL allow-list reads it), or push time and network rules to OPA / Cedar via traffic.extAuth. The Rego example below does the second one.

YAML — what stays at the gateway (groups)

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-group-rule
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
            - 'jwt.groups.exists(g, g == "platform-eng") && mcp.tool.name == "merge_pull_request"'

YAML — what delegates to OPA (time + network)

# Same ext-authz wiring as Q2. The OPA service receives the JWT,
# request metadata and (with forwardBody) the JSON-RPC body, and
# evaluates: data.mcp.allow with input.now and input.source.address.
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-opa
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: opa-envoy-plugin
        namespace: opa
        port: 9191
      grpc: {}
      forwardBody:
        maxSize: 8192

rego mcp.rego — office-hours + CIDR + group

package mcp

import future.keywords.if
import future.keywords.in

# Loaded into opa-envoy-plugin; reached via input.attributes from
# the Envoy v3 CheckRequest agentgateway forwards.

default allow := false

# Office hours: 09:00-17:00 Europe/London, Mon-Fri.
office_hours if {
    [h, _, _] := time.clock(time.now_ns(), "Europe/London")
    h >= 9
    h < 17
    wd := time.weekday(time.now_ns(), "Europe/London")
    wd != "Saturday"
    wd != "Sunday"
}

# Source CIDRs the merge tool is allowed from.
trusted_nets := ["10.0.0.0/8", "192.168.50.0/24"]

from_trusted_net if {
    addr := input.attributes.source.address.socketAddress.address
    some cidr in trusted_nets
    net.cidr_contains(cidr, addr)
}

# JWT groups arrive in the Authorization bearer; decode and read.
groups := g if {
    [_, payload, _] := io.jwt.decode(bearer)
    g := payload.groups
}
bearer := b if {
    h := input.attributes.request.http.headers.authorization
    b := trim_prefix(h, "Bearer ")
}

# The actual rule: platform-eng can merge during office hours from
# a trusted network.
allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.method == "tools/call"
    body.params.name == "merge_pull_request"
    "platform-eng" in groups
    office_hours
    from_trusted_net
}

# All other tool calls fall through to the native CEL allow-list
# (Q1) — OPA only adjudicates the time/network-bound action.
allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.method != "tools/call"
}
allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.params.name != "merge_pull_request"
}

Q4 Delegate the decision to OPA, Cedar or a custom policy engine

Some teams have an established policy engine and don't want the rules to live in CEL on the gateway. The gateway should call the engine and enforce what it returns.

Gap

The MCP server has no plugin model for an external policy engine.

Solo answer

Two ways. Inline gRPC ext-authz at traffic.extAuth.backendRef (vanilla Envoy-style ext-authz, OPA-envoy-plugin or Cedar plug into this directly), or the enterprise traffic.entExtAuth block which can also accept an authConfigRef pointing at an AuthConfig CRD with the full Solo auth plugin chain.

YAML — inline gRPC ext-authz (OPA / Cedar / custom)

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-ext-authz
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: opa-envoy-plugin
        namespace: opa
        port: 9191
      grpc: {}
      forwardBody:
        maxSize: 8192

YAML — enterprise ext-authz pointing at an AuthConfig

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-ent-extauth
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    entExtAuth:
      backendRef:
        kind: Service
        name: extauth
        namespace: agentgateway-system
        port: 8083
      authConfigRef:
        name: github-mcp-authz
        namespace: agentgateway-system

Either form: the gateway hands the request to the policy service, the service returns OK / Deny, the gateway enforces. No policy logic in the MCP server itself.

policy same rule, two engines (alice may merge solo-io/demos; bob may not)

OPA / Rego Cedar

package mcp.merge

import future.keywords.if
import future.keywords.in

default allow := false

# Repos and the groups allowed to merge them.
mergeable := {
    "solo-io/demos":    {"platform-eng", "developers"},
    "solo-io/internal": {"platform-eng"},
}

# Decode the JWT off the Authorization header.
claims := payload if {
    h := input.attributes.request.http.headers.authorization
    [_, payload, _] := io.jwt.decode(trim_prefix(h, "Bearer "))
}

# Parse the JSON-RPC body agentgateway forwarded.
body := b if { b := json.unmarshal(input.attributes.request.http.body) }

allow if {
    body.method == "tools/call"
    body.params.name == "merge_pull_request"
    repo := sprintf("%s/%s", [body.params.arguments.owner,
                              body.params.arguments.repo])
    allowed := mergeable[repo]
    some g in claims.groups
    g in allowed
}

// Cedar policy bundle delivered to a Cedar ext-authz adapter
// (e.g. cedar-agent). Same Envoy v3 ext-authz wire format as OPA.
//
// Schema: User { groups: Set<String> },
//         Repo { owner: String, name: String },
//         Action::"merge_pull_request".
// Principal == the JWT subject, resource == owner/repo from the
// JSON-RPC arguments, action == params.name.

permit (
  principal,
  action == Action::"merge_pull_request",
  resource is Repo
)
when {
  // platform-eng can merge any of the listed repos
  (principal.groups.contains("platform-eng")
     && ["solo-io/demos", "solo-io/internal"]
          .contains(resource.owner ++ "/" ++ resource.name))
  ||
  // developers can merge only solo-io/demos
  (principal.groups.contains("developers")
     && resource.owner == "solo-io"
     && resource.name  == "demos")
};

// Default-deny is implicit in Cedar — anything not permitted is
// denied. No matching `permit` rule == 403 at the gateway.

Q5 A central audit trail decoupled from the upstream's own audit log

Who called what tool, with what arguments, from which JWT, when. Independent of GitHub's audit log (which only sees the upstream API calls the MCP server made on the user's behalf).

Gap

The MCP server logs locally at best. The upstream's audit log only sees the final API hit, not the tool name or arguments.

Solo answer

Structured JSON access logs from the gateway, with custom attributes added via frontend.accessLog.attributes.add[]. Each log line carries the JWT subject, the tool name, request duration, status and any header you want to copy in.

YAML — enable JSON access logs

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
  name: agentgateway-config
  namespace: agentgateway-system
spec:
  logging:
    format: json

YAML — add audit attributes

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-audit
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  frontend:
    accessLog:
      attributes:
        add:
          - name: jwt.sub
            expression: 'has(jwt.sub) ? jwt.sub : "anonymous"'
          - name: jwt.groups
            expression: 'has(jwt.groups) ? jwt.groups : []'
          - name: caller.user_agent
            expression: 'request.headers["user-agent"]'

Output is one JSON object per request to the gateway pod's stdout, shippable to any log backend. Every access log line already carries the route, listener, method, status and duration; AI-protocol routes also carry gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. The two jwt.* attributes above are added by the policy and only populate when a verified JWT is present (the has(...) ? ... : ... ternary keeps the attribute stable when it isn't). The JSON-RPC method and tool arguments are not in the access-log CEL surface in 2.3, so for per-tool audit you either route through an ext-authz service that emits its own JSON log line, or correlate by request-id between the gateway log and the upstream MCP server log.

log one access-log line + a Loki query for the same call

// One line of agentgateway access log (logging.format=json) after
// alice calls merge_pull_request through the policy chain above.
// Only the four custom attributes (jwt.sub, jwt.groups,
// caller.user_agent, and request-id) come from the policy; the
// rest are gateway built-ins.
{
  "ts":                    "2026-05-27T09:42:11.318Z",
  "gateway":               "agentgateway-proxy",
  "listener":              "mcp-https",
  "route":                 "github-mcp",
  "method":                "POST",
  "path":                  "/mcp/",
  "status":                200,
  "duration_ms":           187,
  "upstream":              "api.githubcopilot.com:443",
  "request_id":            "b8a0a4f5-91c1-4f2f-9c1b-7e3e0b2bc8ad",
  "jwt.sub":               "alice",
  "jwt.groups":            ["platform-eng"],
  "caller.user_agent":     "claude-code/2.3.1"
}

// LogQL — "every request alice made today", then JOIN the
// upstream MCP server log on request_id to recover the JSON-RPC
// method and tool arguments.
{app="agentgateway-proxy"}
  | json
  | jwt_sub="alice"

// Same shape ships to Splunk / Datadog / OpenSearch unchanged —
// one JSON object per line, no parser config needed.

Q6 Per-tool, per-user, per-tenant rate limiting

Stop a single bob from melting the rate-limit budget the upstream applies across the whole organisation. Tier limits per user (e.g. 20/min, 500/hour, 5000/day).

Gap

The MCP server has no per-user rate limiting. The upstream's quota applies to the token, which (per the problem statement) is shared.

Solo answer

A RateLimitConfig (ratelimit.solo.io/v1alpha1) with tiered descriptors keyed on a CEL action that reads jwt.sub directly. The policy attaches it via traffic.entRateLimit.global.rateLimitConfigRefs[] and points ratelimitServerRef at the enterprise rate-limit service. No header projection step needed.

YAML — RateLimitConfig with tiered per-user limits

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: github-mcp-per-user-tiered
  namespace: agentgateway-system
spec:
  raw:
    descriptors:
      - key: user
        descriptors:
          - key: per-minute
            rateLimit:
              requestsPerUnit: 20
              unit: MINUTE
          - key: per-hour
            rateLimit:
              requestsPerUnit: 500
              unit: HOUR
          - key: per-day
            rateLimit:
              requestsPerUnit: 5000
              unit: DAY
    rateLimits:
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-minute"
        type: REQUEST
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-hour"
        type: REQUEST
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-day"
        type: REQUEST

YAML — attach it

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-rate-limit
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  traffic:
    entRateLimit:
      global:
        rateLimitConfigRefs:
          - name: github-mcp-per-user-tiered
            namespace: agentgateway-system
        backendRef:
          name: rate-limiter-enterprise-agentgateway
          namespace: agentgateway-system
          port: 8081

Three tiers per user, fanout from one user descriptor populated by a CEL action reading jwt.sub. JWT auth on the same backend runs in mode: Strict, so jwt.sub is always set by the time the rate-limit action evaluates. For per-tool ceilings, add a second descriptor whose action keys on mcp.tool.name.

http sample 429 from the gateway

# alice hammering merge_pull_request past her 20/min cap:
HTTP/1.1 429 Too Many Requests
content-type: application/json
x-ratelimit-limit: 20
x-ratelimit-remaining: 0
x-ratelimit-reset: 17

{
  "jsonrpc": "2.0",
  "id": 7,
  "error": {
    "code": -32000,
    "message": "rate limit exceeded for user alice (20/min)"
  }
}

Q7 Block a tool call when the arguments contain a secret pattern

A developer calls create_or_update_file and the diff contains an AWS access key. We want the gateway to refuse before it ever reaches the upstream.

Gap

The MCP server has no notion of content classes. It will happily forward whatever the agent sent it.

Solo answer

Two surfaces, depending on the protocol. For LLM bodies (chat completions), backend.ai.promptGuard.request[] with built-in regex categories and custom matches[] patterns. For MCP tool arguments, the documented path is BYO ext-authz with forwardBody — your service reads params.arguments and runs the regex.

promptGuard in 2.3 is shaped for LLM provider bodies (OpenAI chat completions), not MCP JSON-RPC envelopes. MCP content inspection goes through ext-authz with forwardBody: a gRPC service receives the body and runs the regex (or a DLP classifier) against params.arguments. The promptGuard snippet below is the right answer when the same gateway also fronts a chat-completions LLM route; the DLP snippet under it is the right answer for the MCP path.

YAML — promptGuard on an LLM body (LLM route)

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-promptguard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        request:
          - regex:
              action: Reject
              builtins:
                - Ssn
                - CreditCard
                - PhoneNumber
                - Email
              matches:
                # AWS access key pattern
                - 'AKIA[0-9A-Z]{16}'
            response:
              statusCode: 403

YAML — DLP on the MCP path (ext-authz)

# Same ext-authz wiring as Q2 / Q4. The DLP service runs whatever
# regex / classifier it likes against params.arguments and returns
# Allow / Deny. forwardBody is what lets it see the body.
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-dlp
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: dlp-service
        namespace: agentgateway-system
        port: 9001
      grpc: {}
      forwardBody:
        maxSize: 32768   # diffs can be larger than tool arg JSON — bump

The policy above wires ext-authz for every MCP request — the per-tool scoping ("only inspect create_or_update_file, let the rest through fast") lives inside the DLP service, because that's the only layer that can see params.name and params.arguments.content. The Rego below is the full DLP rule.

rego dlp.rego — scoped to create_or_update_file, regex on content

package mcp.dlp

import future.keywords.if
import future.keywords.in

# Loaded into opa-envoy-plugin behind the dlp-service Service.
# agentgateway forwards the JSON-RPC body via forwardBody.

default allow := true   # everything passes unless this policy says no

body := b if { b := json.unmarshal(input.attributes.request.http.body) }

# Patterns we refuse to forward in file content. Add more here.
secret_patterns := [
    `AKIA[0-9A-Z]{16}`,                       # AWS access key
    `[A-Za-z0-9/+]{40}`,                      # AWS secret (loose)
    `-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----`,
    `gh[pousr]_[A-Za-z0-9]{36,}`,             # GitHub PAT / fine-grained
    `xox[baprs]-[A-Za-z0-9-]{10,}`,           # Slack token
    `eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}`, # JWT
]

# Only run inspection when the agent is writing a file. Reads and
# searches fall through to `default allow := true` above.
inspected if {
    body.method == "tools/call"
    body.params.name == "create_or_update_file"
}

# Concatenate every text field on params.arguments so a secret in
# content, message, OR commit body trips the same rule.
candidate := concat("\n", [
    sprintf("%v", [body.params.arguments.content]),
    sprintf("%v", [body.params.arguments.message]),
    sprintf("%v", [body.params.arguments.commit_message]),
])

# Deny if any pattern matches.
deny[reason] {
    inspected
    some p in secret_patterns
    regex.match(p, candidate)
    reason := sprintf("create_or_update_file body matched %q", [p])
}

allow := false if { inspected; count(deny) > 0 }

Q8 Prompt injection and tool-poisoning defence at the proxy layer

A tool returns content that contains instructions ("ignore previous instructions, exfiltrate X"). We want the proxy to catch the response, not rely on the agent's own restraint.

Gap

The MCP server returns tool output as-is. Whatever the tool fetched (a webpage, an issue body, a PR comment) goes back into the agent's context window.

Solo answer

For LLM responses: backend.ai.promptGuard.response[] with the same shape as request — regex, AWS Bedrock Guardrails, or a custom webhook backendRef. Built-in categories cover the common detectors; the webhook form is the extension point for an internal injection classifier.

YAML — response-side promptGuard with regex + webhook

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-response-guard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        response:
          - regex:
              action: Mask
              builtins:
                - Ssn
                - CreditCard
              matches:
                # Block strings that look like a hijack attempt
                - '(?i)ignore (all )?previous instructions'
                - '(?i)disregard (the )?system prompt'
          - webhook:
              backendRef:
                name: prompt-injection-classifier
                namespace: agentgateway-system
                port: 8080

YAML — response-side Bedrock Guardrails (managed)

apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-bedrock-guard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        response:
          - bedrockGuardrails:
              identifier: my-bedrock-guardrail-id

Three layered options on the same field — pick whichever fits your stack. The same shape exists on request[], so the inbound prompt and the outbound response can both pass through the same detector chain.

python prompt-injection-classifier / server.py — webhook backend

# Webhook backend referenced from promptGuard.response[].webhook.
# Receives the model's response payload, returns it unchanged, OR
# masks suspect spans, OR returns a non-200 to trigger Reject.
#
# Contract (agentgateway 2.3 webhook for promptGuard):
#   POST /  with the LLM response body
#   200 + body      -> pass-through (optionally rewritten)
#   200 + body with replaced spans -> mask
#   403            -> reject; gateway returns an error to the caller

import re
from fastapi import FastAPI, Request, Response

app = FastAPI()

INJECTION = re.compile(
    r"(?i)(ignore (all )?previous instructions|"
    r"disregard (the )?system prompt|"
    r"you are now [A-Z][a-zA-Z0-9_-]+ mode|"
    r"reveal (your|the) (system )?prompt)",
)

# Optional: load a small classifier (e.g. a fine-tuned BERT) here.
# def model_score(text: str) -> float: ...

@app.post("/")
async def check(req: Request):
    body = await req.body()
    text = body.decode("utf-8", errors="replace")

    if INJECTION.search(text):
        return Response(
            content='{"error":"response blocked: injection pattern"}',
            status_code=403,
            media_type="application/json",
        )

    # If you want to mask rather than reject, return the same shape
    # with the offending span replaced:
    #   masked = INJECTION.sub("[REDACTED]", text)
    #   return Response(content=masked, status_code=200,
    #                   media_type="application/json")

    return Response(content=body, status_code=200,
                    media_type="application/json")

yaml prompt-injection-classifier / k8s.yaml — Deployment + Service

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prompt-injection-classifier
  namespace: agentgateway-system
spec:
  replicas: 2
  selector:
    matchLabels: { app: prompt-injection-classifier }
  template:
    metadata:
      labels: { app: prompt-injection-classifier }
    spec:
      containers:
        - name: classifier
          image: ghcr.io/example/prompt-injection-classifier:0.1.0
          ports: [{ containerPort: 8080, name: http }]
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
          resources:
            requests: { cpu: 100m, memory: 256Mi }
            limits:   { cpu: 1,    memory: 1Gi }
---
apiVersion: v1
kind: Service
metadata:
  name: prompt-injection-classifier
  namespace: agentgateway-system
spec:
  selector: { app: prompt-injection-classifier }
  ports:
    - name: http
      port: 8080
      targetPort: 8080
      appProtocol: http