Problem statement
The scenario
You may want to stand up an MCP server in your organisation. Maybe
it's the GitHub MCP server, twenty-odd tools on it:
search_repositories, get_pull_request,
create_pull_request, merge_pull_request,
create_or_update_file, delete_file,
list_workflow_runs, the whole catalogue. You want
different people across your organisation to be allowed to do
different things with it. Your developers can read pull requests
on every repo, but only your platform engineering team can merge
them. Your contractors get search and read, nothing write. Your
finance team doesn't see GitHub tools at all.
Token scope is the only control
The GitHub MCP server's only auth mechanism is the token the agent presents. Whether that's a human's OAuth token or a PAT, the agent presents one identity, the server authenticates the call, and that's the gate. The MCP server can drive a huge slice of the GitHub API, so you have to enable the permissions that you're comfortable granting your AI tools, and the only control surface you actually have is the scope of the token itself.
Which means: if you want fine-grained per-user access, your only lever is to mint multiple tokens, one per user, each scoped down to the narrowest set of permissions that user is allowed. You then have to provision them, rotate them, store them somewhere the agent can pick the right one at the right time, and audit who used which token to do what. The token is the policy. And tokens are a blunt instrument.
Pushing policy into the MCP server isn't the answer either
Every MCP server you pull off the shelf would have to reinvent authn and authz. The policy ends up living in tool code instead of in a policy layer. Rotating an identity provider signing key means redeploying the tool. That's not where enterprises want this to live. The pattern you already use for HTTP APIs, a policy layer in front, is what you want for MCP too.
What's missing from a stock MCP server
- No per-call authz decision based on who the human is, what they're doing right now, and on what resource.
- No way to say "this agent can create_pull_request on repo-a but only pull_request_read on repo-b".
- No way to enforce "only between 09:00 and 17:00 UK time, only from a known network, only when the user is in the platform-eng AD group".
- No external policy engine integration (OPA, Cedar, custom).
- No central audit trail decoupled from GitHub's own audit log.
- No rate limiting per-tool, per-user, per-tenant.
- No content inspection. You can't block create_or_update_file if the diff contains a secret pattern.
- No prompt injection or tool-poisoning defence at the proxy layer.
About this lab
The sister lab at
kagent-agentcore-solo-kb
uses the Solo Knowledge Base MCP because it's safe to demo against
publicly, but the gateway pattern is identical. alice
and bob carry Keycloak JWTs, the gateway validates
them at the listener, and a CEL allow-list on the MCP backend
decides which tools each one is allowed to see and call. Swap
Solo KB for GitHub MCP and the shape of the policy doesn't
change.
The rest of this page is the eight bullets above, each one mapped
to the exact
EnterpriseAgentgatewayPolicy
(enterpriseagentgateway.solo.io/v1alpha1) field that
closes the gap. Where a control needs to step outside what the
v2.3 schema documents (per-argument CEL, time of day, CIDR), the
section says so explicitly and points at the documented escape
hatch — BYO ext-authz with
traffic.extAuth.forwardBody.
The shared scaffold (gateway + JWT)
Every example below assumes a Gateway, an
AgentgatewayBackend with an MCP target pointing at
the GitHub MCP server, and a JWT validator
EnterpriseAgentgatewayPolicy on the listener.
This block is the prerequisite for the eight that follow — read
it once, then the section snippets are the only thing that
changes.
# Gateway + JWT validator + remote MCP backend. Validated against
# agentgateway 2.3 CRD schemas and the Solo docs use-cases
# "MCP Tool-Level Access Control" and "MCP Server HTTPS Connectivity".
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: jwt-keycloak
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    jwtAuthentication:
      mode: Strict
      providers:
        - issuer: "https://idp.example.com/realms/solo"
          jwks:
            remote:
              jwksPath: "/protocol/openid-connect/certs"
              backendRef:
                name: keycloak
                namespace: auth
                kind: Service
                port: 8080
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: github-mcp
  namespace: agentgateway-system
spec:
  mcp:
    targets:
      - name: github
        static:
          host: api.githubcopilot.com
          port: 443
          path: /mcp/
          policies:
            tls:
              sni: api.githubcopilot.com # HTTPS:443 needs SNI
# The upstream GitHub PAT is injected on the HTTPRoute via a
# RequestHeaderModifier filter setting Authorization: Bearer <pat>.
# It is left out of this scaffold so the auth strategy doesn't leak
# into every section snippet. See the Solo docs use-case
# "MCP Server HTTPS Connectivity" for the HTTPRoute shape.
Everything that follows targets that AgentgatewayBackend.
Add the snippets cumulatively or one at a time — each is
self-contained.
Q1 Per-call authz on who the human is and what they're doing
The MCP server treats any caller holding a valid token as one identity with one permission set. We want a per-call decision keyed on the JWT subject (or group) and the MCP tool name.
Gap
A GitHub MCP server with a PAT can drive any GitHub API the PAT scope allows. The only knob is the scope on the token.
Solo answer
CEL allow-list at
spec.backend.mcp.authorization.policy.matchExpressions[]
reading jwt.<claim> and
mcp.tool.name. Tools that don't match are
auto-hidden from tools/list, and direct
tools/call requests for them are rejected with a
JSON-RPC error before the request reaches the upstream MCP
server.
YAML
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-rbac
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
            # platform-eng and developers can read and search
            - 'jwt.groups.exists(g, g in ["platform-eng", "developers"]) && mcp.tool.name in ["get_pull_request", "search_repositories", "list_issues"]'
            # only platform-eng can merge or open PRs
            - 'jwt.groups.exists(g, g == "platform-eng") && mcp.tool.name in ["merge_pull_request", "create_pull_request"]'
            # contractors get search only
            - 'jwt.groups.exists(g, g in ["contractors"]) && mcp.tool.name == "search_repositories"'
matchExpressions is an OR-joined list. A tool call is
allowed when any one expression returns true; groups are
not aggregated across expressions for you. If
platform-eng should also pick up everything a
developer can do, that has to be expressed
explicitly, either by listing both groups inside a single
jwt.groups.exists(g, g in [...]) call (the pattern
above) or by repeating the tool list under a second expression
keyed on platform-eng. Finance gets no tools at all
because no expression matches their JWT.
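The OR-join semantics can be sanity-checked offline before the policy is applied. The sketch below is a plain Python model of the three expressions above, not the gateway's CEL evaluator; RULES, is_allowed and visible_tools are hypothetical names introduced only for illustration.

```python
# Each entry mirrors one matchExpressions line:
# (groups that satisfy it, tools it covers).
RULES = [
    ({"platform-eng", "developers"},
     {"get_pull_request", "search_repositories", "list_issues"}),
    ({"platform-eng"}, {"merge_pull_request", "create_pull_request"}),
    ({"contractors"}, {"search_repositories"}),
]


def is_allowed(groups, tool):
    """True when any single rule matches both a caller group and the tool."""
    caller = set(groups)
    return any(bool(caller & rule_groups) and tool in rule_tools
               for rule_groups, rule_tools in RULES)


def visible_tools(groups):
    """Tools that would survive the tools/list filter for these groups."""
    all_tools = set().union(*(tools for _, tools in RULES))
    return sorted(t for t in all_tools if is_allowed(groups, t))


if __name__ == "__main__":
    for persona, groups in [("alice", ["platform-eng"]), ("bob", ["developers"]),
                            ("carlos", ["contractors"]), ("dana", ["finance"])]:
        print(persona, visible_tools(groups))
```

Running a model like this against every persona is a cheap way to spot a group that was meant to inherit a tool list but was never named in the matching expression.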
test sample JWTs + curl probes
# Four personas. Decoded payloads (the "groups" claim is what the
# allow-list reads):
#   alice  { "sub": "alice",  "groups": ["platform-eng"] }
#   bob    { "sub": "bob",    "groups": ["developers"] }
#   carlos { "sub": "carlos", "groups": ["contractors"] }
#   dana   { "sub": "dana",   "groups": ["finance"] }
#
# Mint with step (any IdP that signs RS256 against the JWKS the
# scaffold validator points at works the same way). step reads the
# extra claims as a JSON payload on stdin.
ALICE=$(echo '{"groups":["platform-eng"]}' | step crypto jwt sign \
  --key dev.key --kid dev \
  --iss https://idp.example.com/realms/solo --aud github-mcp \
  --sub alice --exp $(($(date +%s)+3600)))
BOB=$(echo '{"groups":["developers"]}' | step crypto jwt sign \
  --key dev.key --kid dev \
  --iss https://idp.example.com/realms/solo --aud github-mcp \
  --sub bob --exp $(($(date +%s)+3600)))

# tools/list: alice sees the union (read+merge+create+search):
curl -sS -H "Authorization: Bearer $ALICE" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  https://gw.example.com/mcp/ | jq '.result.tools[].name'
# "get_pull_request"
# "search_repositories"
# "list_issues"
# "merge_pull_request"
# "create_pull_request"

# tools/call merge_pull_request as alice: allowed (200):
curl -sS -H "Authorization: Bearer $ALICE" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call",
       "params":{"name":"merge_pull_request",
                 "arguments":{"owner":"solo-io",
                              "repo":"demos","pullNumber":42}}}' \
  https://gw.example.com/mcp/

# tools/call merge_pull_request as bob: denied at the gateway.
# merge_pull_request is hidden from bob's catalogue, and a direct
# call is rejected with a JSON-RPC error response before reaching
# the upstream MCP server.
curl -sS -H "Authorization: Bearer $BOB" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call",
       "params":{"name":"merge_pull_request",
                 "arguments":{"owner":"solo-io",
                              "repo":"demos","pullNumber":42}}}' \
  https://gw.example.com/mcp/

# dana sees an empty tools list: no expression matches finance.
Q2 Per-resource scoping inside a tool argument
Not just "can alice call create_pull_request" but
"can alice call create_pull_request against
repo-a while only being able to read
repo-b". The resource lives inside the tool's
JSON-RPC arguments.
Gap
The GitHub MCP server enforces no per-repo policy of its
own — the PAT either has repo:write on the
target or it doesn't.
Solo answer
BYO ext-authz with
traffic.extAuth.forwardBody.maxSize set, so a
gRPC ext-authz service receives the JSON-RPC body and reads
params.name and params.arguments.repo
to make the decision.
The native CEL allow-list (Q1) reads jwt.<claim> and mcp.tool.name.
Per-argument decisions go through ext-authz: agentgateway forwards
the JSON-RPC body to a gRPC ext-authz service via
traffic.extAuth.forwardBody, and that service reads
params.name and
params.arguments.<field> to make the call.
YAML
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-byo-extauth
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: my-mcp-authz
        namespace: agentgateway-system
        port: 9001
      forwardBody:
        maxSize: 8192 # JSON-RPC bodies are small; 8 KiB covers them
      grpc: {}
With forwardBody set, the ext-authz service sees:
the Authorization header (full JWT to decode),
the URL path (which MCP server), the JSON-RPC
method, params.name (the tool) and
params.arguments (the per-tool inputs). A per-repo
decision is one read of params.arguments.repo.
Caveat noted in the Solo use-case: with ext-authz at the HTTP
layer, tools/list is all-or-nothing — the
ext-auth service can deny the entire list, but it can't filter
individual tools out of the response. Use native MCP authz
(Q1) for the visibility filter and ext-authz for the
per-resource action filter — they layer.
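To make the list of inputs concrete, here is a minimal Python sketch of how an ext-authz service might pull its decision inputs out of the Authorization header and the forwarded JSON-RPC body. The function names are hypothetical, and the JWT payload is decoded without signature verification on the assumption that the gateway's JWT policy has already validated it at the listener.

```python
import base64
import json


def jwt_claims(authorization_header):
    """Decode the payload segment of a bearer JWT (no signature check)."""
    token = authorization_header.removeprefix("Bearer ").strip()
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def extract_decision_inputs(authorization_header, body):
    """Collect the fields a per-repo decision needs from header and body."""
    msg = json.loads(body)
    params = msg.get("params", {})
    args = params.get("arguments", {})
    return {
        "groups": jwt_claims(authorization_header).get("groups", []),
        "method": msg.get("method"),
        "tool": params.get("name"),
        "repo": f'{args.get("owner", "")}/{args.get("repo", "")}',
    }
```

Keeping the extraction separate from the decision makes the ACL logic easy to unit-test with hand-built tokens and bodies, without a running gateway.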
go my-mcp-authz / main.go — gRPC ext-authz service
// Minimal Envoy v3 ext-authz that reads the JSON-RPC body
// agentgateway forwards (because forwardBody is set), decodes
// params.name + params.arguments.repo, and decides per-repo.
//
// Wire format: envoy.service.auth.v3.Authorization (CheckRequest /
// CheckResponse). agentgateway's gRPC ext-authz client speaks this.
package main

import (
	"context"
	"encoding/base64"
	"encoding/json"
	"log"
	"net"
	"strings"

	authv3 "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
	statuspb "google.golang.org/genproto/googleapis/rpc/status"
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
)

// repo -> groups that may write to it. Read access is universal.
var writeACL = map[string]map[string]bool{
	"solo-io/demos":    {"platform-eng": true, "developers": true},
	"solo-io/internal": {"platform-eng": true},
}

// Tools treated as writes. Illustrative list; extend it to match the
// catalogue you expose.
var writeTools = map[string]bool{
	"create_pull_request":   true,
	"merge_pull_request":    true,
	"create_or_update_file": true,
	"delete_file":           true,
}

func isWrite(tool string) bool { return writeTools[tool] }

type rpc struct {
	Method string `json:"method"`
	Params struct {
		Name      string                 `json:"name"`
		Arguments map[string]interface{} `json:"arguments"`
	} `json:"params"`
}

type svc struct{ authv3.UnimplementedAuthorizationServer }

// groupsFromJWT reads the "groups" claim from the bearer token. It does
// not verify the signature: the gateway's JWT policy already did that
// at the listener.
func groupsFromJWT(r *authv3.CheckRequest) []string {
	h := r.GetAttributes().GetRequest().GetHttp().GetHeaders()["authorization"]
	parts := strings.Split(strings.TrimPrefix(h, "Bearer "), ".")
	if len(parts) != 3 {
		return nil
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil
	}
	var claims struct {
		Groups []string `json:"groups"`
	}
	if json.Unmarshal(raw, &claims) != nil {
		return nil
	}
	return claims.Groups
}

func (s *svc) Check(_ context.Context, r *authv3.CheckRequest) (*authv3.CheckResponse, error) {
	body := r.GetAttributes().GetRequest().GetHttp().GetBody()
	var msg rpc
	if err := json.Unmarshal([]byte(body), &msg); err != nil ||
		msg.Method != "tools/call" {
		return ok(), nil // not a tool call: don't second-guess
	}
	groups := groupsFromJWT(r) // parse Authorization header
	repo, _ := msg.Params.Arguments["repo"].(string)
	owner, _ := msg.Params.Arguments["owner"].(string)
	full := owner + "/" + repo
	if isWrite(msg.Params.Name) {
		allowed := writeACL[full] // nil for unknown repos, so they are denied
		for _, g := range groups {
			if allowed[g] {
				return ok(), nil
			}
		}
		return denied("repo " + full + " not writable by your groups"), nil
	}
	return ok(), nil
}

func ok() *authv3.CheckResponse {
	return &authv3.CheckResponse{
		Status: &statuspb.Status{Code: int32(codes.OK)},
		HttpResponse: &authv3.CheckResponse_OkResponse{
			OkResponse: &authv3.OkHttpResponse{},
		},
	}
}

func denied(msg string) *authv3.CheckResponse {
	return &authv3.CheckResponse{
		Status: &statuspb.Status{Code: int32(codes.PermissionDenied), Message: msg},
		HttpResponse: &authv3.CheckResponse_DeniedResponse{
			DeniedResponse: &authv3.DeniedHttpResponse{Body: msg},
		},
	}
}

func main() {
	lis, err := net.Listen("tcp", ":9001")
	if err != nil {
		log.Fatal(err)
	}
	g := grpc.NewServer()
	authv3.RegisterAuthorizationServer(g, &svc{})
	log.Println("my-mcp-authz listening on :9001")
	log.Fatal(g.Serve(lis))
}
yaml my-mcp-authz / k8s.yaml — Deployment + Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-mcp-authz
  namespace: agentgateway-system
spec:
  replicas: 2
  selector:
    matchLabels: { app: my-mcp-authz }
  template:
    metadata:
      labels: { app: my-mcp-authz }
    spec:
      containers:
        - name: authz
          image: ghcr.io/example/my-mcp-authz:0.1.0
          ports: [{ containerPort: 9001, name: grpc }]
          readinessProbe:
            tcpSocket: { port: 9001 }
          resources:
            requests: { cpu: 50m, memory: 64Mi }
            limits: { cpu: 500m, memory: 256Mi }
---
apiVersion: v1
kind: Service
metadata:
  name: my-mcp-authz
  namespace: agentgateway-system
spec:
  selector: { app: my-mcp-authz }
  ports:
    - name: grpc
      port: 9001
      targetPort: 9001
      appProtocol: grpc
Q3 Time of day, source network, and group membership in the same rule
"Only between 09:00 and 17:00 UK time, only from a known
network, only when the user is in the platform-eng
AD group." Group membership is in the JWT; time and network
are not.
Gap
The MCP server has no view of clock or source IP, and groups aren't sent to it in any standard form.
Solo answer
Group membership goes in the native CEL allow-list at
backend.mcp.authorization.policy.matchExpressions[]
(Q1). Time-of-day and CIDR matching delegate to the same
ext-authz endpoint as Q2 — OPA / Cedar do them natively.
The native CEL allow-list reads jwt.<claim> and mcp.tool.name;
clock and CIDR live elsewhere. Two patterns work: bake the
decision into the JWT at mint time (the IdP issues an
office-hours-on-call claim and the CEL allow-list
reads it), or push time and network rules to OPA / Cedar via
traffic.extAuth. The Rego example below does the
second one.
YAML — what stays at the gateway (groups)
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-group-rule
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
            - 'jwt.groups.exists(g, g == "platform-eng") && mcp.tool.name == "merge_pull_request"'
YAML — what delegates to OPA (time + network)
# Same ext-authz wiring as Q2. The OPA service receives the JWT,
# request metadata and (with forwardBody) the JSON-RPC body, and
# evaluates data.mcp.allow against the wall clock (time.now_ns())
# and input.attributes.source.address.
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-opa
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: opa-envoy-plugin
        namespace: opa
        port: 9191
      grpc: {}
      forwardBody:
        maxSize: 8192
rego mcp.rego — office-hours + CIDR + group
package mcp

import future.keywords.if
import future.keywords.in

# Loaded into opa-envoy-plugin; reached via input.attributes from
# the Envoy v3 CheckRequest agentgateway forwards.
default allow := false

# Office hours: 09:00-17:00 Europe/London, Mon-Fri.
# time.clock and time.weekday each take a single [ns, tz] array.
office_hours if {
    now := [time.now_ns(), "Europe/London"]
    [h, _, _] := time.clock(now)
    h >= 9
    h < 17
    wd := time.weekday(now)
    wd != "Saturday"
    wd != "Sunday"
}

# Source CIDRs the merge tool is allowed from.
trusted_nets := ["10.0.0.0/8", "192.168.50.0/24"]

from_trusted_net if {
    addr := input.attributes.source.address.socketAddress.address
    some cidr in trusted_nets
    net.cidr_contains(cidr, addr)
}

# JWT groups arrive in the Authorization bearer; decode and read.
groups := g if {
    [_, payload, _] := io.jwt.decode(bearer)
    g := payload.groups
}

bearer := b if {
    h := input.attributes.request.http.headers.authorization
    b := trim_prefix(h, "Bearer ")
}

# The actual rule: platform-eng can merge during office hours from
# a trusted network.
allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.method == "tools/call"
    body.params.name == "merge_pull_request"
    "platform-eng" in groups
    office_hours
    from_trusted_net
}

# All other tool calls fall through to the native CEL allow-list
# (Q1); OPA only adjudicates the time/network-bound action.
allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.method != "tools/call"
}

allow if {
    body := json.unmarshal(input.attributes.request.http.body)
    body.params.name != "merge_pull_request"
}
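The time and network predicates are easy to get wrong around daylight-saving changes and CIDR boundaries, so it can help to model them outside the policy engine first. The following is a small Python sketch of the same three conditions using only the standard library; it is not how OPA evaluates the Rego, and the function names are hypothetical. The evaluation instant is passed in explicitly so the rule can be exercised with fixed timestamps.

```python
from datetime import datetime, timezone
from ipaddress import ip_address, ip_network
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")
TRUSTED_NETS = [ip_network("10.0.0.0/8"), ip_network("192.168.50.0/24")]


def office_hours(now):
    """09:00-17:00 Europe/London, Monday to Friday; `now` must be tz-aware."""
    local = now.astimezone(LONDON)
    return local.weekday() < 5 and 9 <= local.hour < 17


def from_trusted_net(addr):
    """True when the source address falls inside any trusted CIDR."""
    ip = ip_address(addr)
    return any(ip in net for net in TRUSTED_NETS)


def may_merge(groups, now, addr):
    """All three conditions must hold, as in the Rego allow rule."""
    return ("platform-eng" in groups
            and office_hours(now)
            and from_trusted_net(addr))


# Example instant: a Wednesday morning in UTC, which is 09:30 in London
# because British Summer Time is in effect in late May.
EXAMPLE_NOW = datetime(2026, 5, 27, 8, 30, tzinfo=timezone.utc)
```

Passing UTC timestamps and converting inside office_hours keeps the summer-time offset out of the caller's hands, which is the same reason the Rego passes the zone name rather than a fixed offset.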
Q4 Delegate the decision to OPA, Cedar or a custom policy engine
Some teams have an established policy engine and don't want the rules to live in CEL on the gateway. The gateway should call the engine and enforce what it returns.
Gap
The MCP server has no plugin model for an external policy engine.
Solo answer
Two ways. Inline gRPC ext-authz at
traffic.extAuth.backendRef (vanilla
Envoy-style ext-authz, OPA-envoy-plugin or Cedar plug into
this directly), or the enterprise
traffic.entExtAuth block which can also accept
an authConfigRef pointing at an
AuthConfig CRD with the full Solo auth plugin
chain.
YAML — inline gRPC ext-authz (OPA / Cedar / custom)
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-ext-authz
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: opa-envoy-plugin
        namespace: opa
        port: 9191
      grpc: {}
      forwardBody:
        maxSize: 8192
YAML — enterprise ext-authz pointing at an AuthConfig
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-ent-extauth
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    entExtAuth:
      backendRef:
        kind: Service
        name: extauth
        namespace: agentgateway-system
        port: 8083
      authConfigRef:
        name: github-mcp-authz
        namespace: agentgateway-system
Either form: the gateway hands the request to the policy service, the service returns OK / Deny, the gateway enforces. No policy logic in the MCP server itself.
policy same rule, two engines (alice and bob may both merge solo-io/demos; only alice may merge solo-io/internal)
package mcp.merge

import future.keywords.if
import future.keywords.in

default allow := false

# Repos and the groups allowed to merge them.
mergeable := {
    "solo-io/demos": {"platform-eng", "developers"},
    "solo-io/internal": {"platform-eng"},
}

# Decode the JWT off the Authorization header.
claims := payload if {
    h := input.attributes.request.http.headers.authorization
    [_, payload, _] := io.jwt.decode(trim_prefix(h, "Bearer "))
}

# Parse the JSON-RPC body agentgateway forwarded.
body := b if { b := json.unmarshal(input.attributes.request.http.body) }

allow if {
    body.method == "tools/call"
    body.params.name == "merge_pull_request"
    repo := sprintf("%s/%s", [
        body.params.arguments.owner,
        body.params.arguments.repo,
    ])
    allowed := mergeable[repo]
    some g in claims.groups
    g in allowed
}
// Cedar policy bundle delivered to a Cedar ext-authz adapter
// (e.g. cedar-agent). Same Envoy v3 ext-authz wire format as OPA.
//
// Schema: User { groups: Set<String> },
//         Repo { owner: String, name: String },
//         Action::"merge_pull_request".
// Principal == the JWT subject, resource == owner/repo from the
// JSON-RPC arguments, action == params.name.
permit (
    principal,
    action == Action::"merge_pull_request",
    resource is Repo
)
when {
    // platform-eng can merge any of the listed repos. Cedar has no
    // string concatenation operator, so owner and name are compared
    // separately instead of building "owner/name".
    (principal.groups.contains("platform-eng")
        && resource.owner == "solo-io"
        && ["demos", "internal"].contains(resource.name))
    ||
    // developers can merge only solo-io/demos
    (principal.groups.contains("developers")
        && resource.owner == "solo-io"
        && resource.name == "demos")
};
// Default-deny is implicit in Cedar: anything not permitted is
// denied. No matching `permit` rule == 403 at the gateway.
Q5 A central audit trail decoupled from the upstream's own audit log
Who called what tool, with what arguments, from which JWT, when. Independent of GitHub's audit log (which only sees the upstream API calls the MCP server made on the user's behalf).
Gap
The MCP server logs locally at best. The upstream's audit log only sees the final API hit, not the tool name or arguments.
Solo answer
Structured JSON access logs from the gateway, with custom
attributes added via
frontend.accessLog.attributes.add[]. Each log
line carries the JWT subject, the tool name, request
duration, status and any header you want to copy in.
YAML — enable JSON access logs
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
  name: agentgateway-config
  namespace: agentgateway-system
spec:
  logging:
    format: json
YAML — add audit attributes
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-audit
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  frontend:
    accessLog:
      attributes:
        add:
          - name: jwt.sub
            expression: 'has(jwt.sub) ? jwt.sub : "anonymous"'
          - name: jwt.groups
            expression: 'has(jwt.groups) ? jwt.groups : []'
          - name: caller.user_agent
            expression: 'request.headers["user-agent"]'
Output is one JSON object per request to the gateway pod's
stdout, shippable to any log backend. Every access log line
already carries the route, listener, method, status and
duration; AI-protocol routes also carry
gen_ai.usage.input_tokens and
gen_ai.usage.output_tokens. The two
jwt.* attributes above are added by the policy
and only populate when a verified JWT is present (the
has(...) ? ... : ... ternary keeps the attribute
stable when it isn't). The JSON-RPC method and tool arguments are not in the
access-log CEL surface in 2.3, so for per-tool audit you
either route through an ext-authz service that emits its own
JSON log line, or correlate by request-id between the gateway
log and the upstream MCP server log.
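The request-id correlation can be sketched in a few lines of Python. This is only an illustration of the join: the field names request_id, tool and arguments on the ext-authz side are assumptions about what your own service would log, not a documented agentgateway format.

```python
import json


def correlate(gateway_lines, authz_lines):
    """Join gateway access-log lines with ext-authz log lines on request_id."""
    by_id = {}
    for line in authz_lines:
        rec = json.loads(line)
        by_id[rec["request_id"]] = rec

    merged = []
    for line in gateway_lines:
        rec = json.loads(line)
        extra = by_id.get(rec.get("request_id"), {})
        # Gateway fields win; the tool name and arguments are filled in
        # from the ext-authz record when one exists for the same request.
        merged.append({**rec,
                       "tool": extra.get("tool"),
                       "arguments": extra.get("arguments")})
    return merged
```

In practice the same join would usually run inside the log backend; the point of the sketch is that the audit record is only complete when both sides log the same identifier.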
log one access-log line + a Loki query for the same call
// One line of agentgateway access log (logging.format=json) after
// alice calls merge_pull_request through the policy chain above.
// Only the three custom attributes (jwt.sub, jwt.groups and
// caller.user_agent) come from the policy; every other field is
// emitted by the gateway itself.
{
  "ts": "2026-05-27T09:42:11.318Z",
  "gateway": "agentgateway-proxy",
  "listener": "mcp-https",
  "route": "github-mcp",
  "method": "POST",
  "path": "/mcp/",
  "status": 200,
  "duration_ms": 187,
  "upstream": "api.githubcopilot.com:443",
  "request_id": "b8a0a4f5-91c1-4f2f-9c1b-7e3e0b2bc8ad",
  "jwt.sub": "alice",
  "jwt.groups": ["platform-eng"],
  "caller.user_agent": "claude-code/2.3.1"
}
// LogQL: every request alice made in the selected time range. Match
// the results against the upstream MCP server log on request_id to
// recover the JSON-RPC method and tool arguments.
{app="agentgateway-proxy"}
  | json
  | jwt_sub="alice"
// Same shape ships to Splunk / Datadog / OpenSearch unchanged:
// one JSON object per line, no parser config needed.
Q6 Per-tool, per-user, per-tenant rate limiting
Stop a single bob from melting the rate-limit budget the upstream applies across the whole organisation. Tier limits per user (e.g. 20/min, 500/hour, 5000/day).
Gap
The MCP server has no per-user rate limiting. The upstream's quota applies to the token, which (per the problem statement) is shared.
Solo answer
A RateLimitConfig
(ratelimit.solo.io/v1alpha1) with tiered
descriptors keyed on a CEL action that reads
jwt.sub directly. The policy attaches it via
traffic.entRateLimit.global.rateLimitConfigRefs[]
and points ratelimitServerRef at the enterprise
rate-limit service. No header projection step needed.
YAML — RateLimitConfig with tiered per-user limits
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: github-mcp-per-user-tiered
  namespace: agentgateway-system
spec:
  raw:
    descriptors:
      - key: user
        descriptors:
          - key: per-minute
            rateLimit:
              requestsPerUnit: 20
              unit: MINUTE
          - key: per-hour
            rateLimit:
              requestsPerUnit: 500
              unit: HOUR
          - key: per-day
            rateLimit:
              requestsPerUnit: 5000
              unit: DAY
    rateLimits:
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-minute"
        type: REQUEST
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-hour"
        type: REQUEST
      - actions:
          - cel:
              key: user
              expression: 'jwt.sub'
          - genericKey:
              descriptorValue: "per-day"
        type: REQUEST
YAML — attach it
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-rate-limit
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: github-mcp
  traffic:
    entRateLimit:
      global:
        rateLimitConfigRefs:
          - name: github-mcp-per-user-tiered
            namespace: agentgateway-system
        backendRef:
          name: rate-limiter-enterprise-agentgateway
          namespace: agentgateway-system
          port: 8081
Three tiers per user, fanned out from one user
descriptor populated by a CEL action reading
jwt.sub. JWT auth on the listener runs in
mode: Strict, so jwt.sub is always
set by the time the rate-limit action evaluates. For per-tool
ceilings, add a second descriptor whose action keys on
mcp.tool.name.
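The interaction between the three tiers can be modelled with a short Python sketch: a request passes only if every tier still has budget. The fixed-window counting used here is an assumption chosen for simplicity, and the enterprise rate-limit service may well use a different algorithm; TieredLimiter and TIERS are hypothetical names.

```python
from collections import defaultdict

# tier name -> (window length in seconds, requests allowed per window)
TIERS = {
    "per-minute": (60, 20),
    "per-hour": (3600, 500),
    "per-day": (86400, 5000),
}


class TieredLimiter:
    def __init__(self):
        # (user, tier, window index) -> requests counted in that window
        self.counts = defaultdict(int)

    def allow(self, user, now):
        """Admit the request only if all three tiers have budget left."""
        keys = {tier: (user, tier, int(now // seconds))
                for tier, (seconds, _) in TIERS.items()}
        for tier, (_, limit) in TIERS.items():
            if self.counts[keys[tier]] >= limit:
                return False  # rejected requests are not counted
        for key in keys.values():
            self.counts[key] += 1
        return True
```

With these numbers the per-minute tier is what a bursty caller hits first; the hourly and daily tiers only bite on sustained traffic, which is the reason for stacking them.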
http sample 429 from the gateway
# alice hammering merge_pull_request past her 20/min cap:
HTTP/1.1 429 Too Many Requests
content-type: application/json
x-ratelimit-limit: 20
x-ratelimit-remaining: 0
x-ratelimit-reset: 17
{
  "jsonrpc": "2.0",
  "id": 7,
  "error": {
    "code": -32000,
    "message": "rate limit exceeded for user alice (20/min)"
  }
}
Q7 Block a tool call when the arguments contain a secret pattern
A developer calls create_or_update_file and the
diff contains an AWS access key. We want the gateway to refuse
before it ever reaches the upstream.
Gap
The MCP server has no notion of content classes. It will happily forward whatever the agent sent it.
Solo answer
Two surfaces, depending on the protocol. For LLM bodies
(chat completions), backend.ai.promptGuard.request[]
with built-in regex categories and custom matches[]
patterns. For MCP tool arguments, the documented path is
BYO ext-authz with forwardBody — your service
reads params.arguments and runs the regex.
promptGuard in 2.3 is shaped for LLM provider bodies
(OpenAI chat completions), not MCP JSON-RPC envelopes. MCP
content inspection goes through ext-authz with
forwardBody: a gRPC service receives the body and
runs the regex (or a DLP classifier) against
params.arguments. The promptGuard
snippet below is the right answer when the same gateway also
fronts a chat-completions LLM route; the DLP snippet under it
is the right answer for the MCP path.
YAML — promptGuard on an LLM body (LLM route)
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-promptguard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        request:
          - regex:
              action: Reject
              builtins:
                - Ssn
                - CreditCard
                - PhoneNumber
                - Email
              matches:
                # AWS access key pattern
                - 'AKIA[0-9A-Z]{16}'
            response:
              statusCode: 403
YAML — DLP on the MCP path (ext-authz)
# Same ext-authz wiring as Q2 / Q4. The DLP service runs whatever
# regex / classifier it likes against params.arguments and returns
# Allow / Deny. forwardBody is what lets it see the body.
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: github-mcp-dlp
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: agentgateway-proxy
  traffic:
    extAuth:
      backendRef:
        name: dlp-service
        namespace: agentgateway-system
        port: 9001
      grpc: {}
      forwardBody:
        maxSize: 32768 # diffs can be larger than tool-arg JSON, so raise the limit
The policy above wires ext-authz for every MCP request — the
per-tool scoping ("only inspect create_or_update_file,
let the rest through fast") lives inside the DLP service, because
that's the only layer that can see params.name and
params.arguments.content. The Rego below is the
full DLP rule.
rego dlp.rego — scoped to create_or_update_file, regex on content
package mcp.dlp

import future.keywords.contains
import future.keywords.if
import future.keywords.in

# Loaded into opa-envoy-plugin behind the dlp-service Service.
# agentgateway forwards the JSON-RPC body via forwardBody.
default allow := true # everything passes unless this policy says no

body := b if { b := json.unmarshal(input.attributes.request.http.body) }

# Patterns we refuse to forward in file content. Add more here.
secret_patterns := [
    `AKIA[0-9A-Z]{16}`, # AWS access key
    `[A-Za-z0-9/+]{40}`, # AWS secret (loose)
    `-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----`,
    `gh[pousr]_[A-Za-z0-9]{36,}`, # GitHub PAT / fine-grained
    `xox[baprs]-[A-Za-z0-9-]{10,}`, # Slack token
    `eyJ[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{20,}`, # JWT
]

# Only run inspection when the agent is writing a file. Reads and
# searches fall through to `default allow := true` above.
inspected if {
    body.method == "tools/call"
    body.params.name == "create_or_update_file"
}

# Concatenate every text field on params.arguments so a secret in
# content, message, OR commit body trips the same rule. object.get
# supplies "" for fields the caller did not send; a direct reference
# to a missing field would leave candidate undefined and silently
# disable the deny rule.
text_fields := ["content", "message", "commit_message"]

candidate := concat("\n", [s |
    some f in text_fields
    s := sprintf("%v", [object.get(body.params.arguments, f, "")])
])

# Deny if any pattern matches.
deny contains reason if {
    inspected
    some p in secret_patterns
    regex.match(p, candidate)
    reason := sprintf("create_or_update_file body matched %q", [p])
}

allow := false if {
    inspected
    count(deny) > 0
}
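Secret patterns are worth exercising against sample strings before they gate real traffic. Below is a minimal Python sketch of the same scoping and matching logic, useful as a test harness rather than as the DLP service itself; scan_tool_call and the pattern names are hypothetical, and only a subset of the patterns is reproduced.

```python
import re

SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),
    "github_token": re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),
}

# The loose 40-character pattern, kept separate to show its false positives.
LOOSE_AWS_SECRET = re.compile(r"[A-Za-z0-9/+]{40}")


def scan_tool_call(msg):
    """Return the names of patterns found in a create_or_update_file call."""
    if msg.get("method") != "tools/call":
        return []
    params = msg.get("params", {})
    if params.get("name") != "create_or_update_file":
        return []
    args = params.get("arguments", {})
    # Missing fields default to "" so one absent argument cannot skip the scan.
    text = "\n".join(str(args.get(f, "")) for f in ("content", "message"))
    return [name for name, rx in SECRET_PATTERNS.items() if rx.search(text)]


def loose_pattern_hits(text):
    """True when the loose 40-character pattern matches anywhere in text."""
    return LOOSE_AWS_SECRET.search(text) is not None
```

One thing a harness like this surfaces quickly: the loose 40-character pattern also matches any 40-character hexadecimal git commit SHA, so it is likely to need tightening or an allow-list before it is used on commit messages.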
Q8 Prompt injection and tool-poisoning defence at the proxy layer
A tool returns content that contains instructions ("ignore previous instructions, exfiltrate X"). We want the proxy to catch the response, not rely on the agent's own restraint.
Gap
The MCP server returns tool output as-is. Whatever the tool fetched (a webpage, an issue body, a PR comment) goes back into the agent's context window.
Solo answer
For LLM responses: backend.ai.promptGuard.response[]
with the same shape as request — regex, AWS Bedrock
Guardrails, or a custom webhook backendRef.
Built-in categories cover the common detectors; the
webhook form is the extension point for an internal
injection classifier.
YAML — response-side promptGuard with regex + webhook
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-response-guard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        response:
          - regex:
              action: Mask
              builtins:
                - Ssn
                - CreditCard
              matches:
                # Mask strings that look like a hijack attempt
                - '(?i)ignore (all )?previous instructions'
                - '(?i)disregard (the )?system prompt'
          - webhook:
              backendRef:
                name: prompt-injection-classifier
                namespace: agentgateway-system
                port: 8080
YAML — response-side Bedrock Guardrails (managed)
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
  name: openai-bedrock-guard
  namespace: agentgateway-system
spec:
  targetRefs:
    - group: agentgateway.dev
      kind: AgentgatewayBackend
      name: openai
  backend:
    ai:
      promptGuard:
        response:
          - bedrockGuardrails:
              identifier: my-bedrock-guardrail-id
Three layered options on the same field — pick whichever fits
your stack. The same shape exists on request[],
so the inbound prompt and the outbound response can both pass
through the same detector chain.
python prompt-injection-classifier / server.py — webhook backend
# Webhook backend referenced from promptGuard.response[].webhook.
# Receives the model's response payload, returns it unchanged, OR
# masks suspect spans, OR returns a non-200 to trigger Reject.
#
# Contract (agentgateway 2.3 webhook for promptGuard):
#   POST / with the LLM response body
#   200 + body                      -> pass-through (optionally rewritten)
#   200 + body with replaced spans  -> mask
#   403                             -> reject; gateway returns an error to the caller
import re

from fastapi import FastAPI, Request, Response

app = FastAPI()

INJECTION = re.compile(
    r"(?i)(ignore (all )?previous instructions|"
    r"disregard (the )?system prompt|"
    r"you are now [A-Z][a-zA-Z0-9_-]+ mode|"
    r"reveal (your|the) (system )?prompt)",
)

# Optional: load a small classifier (e.g. a fine-tuned BERT) here.
# def model_score(text: str) -> float: ...


@app.get("/healthz")
async def healthz():
    # Target of the readinessProbe in k8s.yaml; without this route the
    # probe gets a 404 and the pods never become Ready.
    return {"status": "ok"}


@app.post("/")
async def check(req: Request):
    body = await req.body()
    text = body.decode("utf-8", errors="replace")
    if INJECTION.search(text):
        return Response(
            content='{"error":"response blocked: injection pattern"}',
            status_code=403,
            media_type="application/json",
        )
    # If you want to mask rather than reject, return the same shape
    # with the offending span replaced:
    #   masked = INJECTION.sub("[REDACTED]", text)
    #   return Response(content=masked, status_code=200,
    #                   media_type="application/json")
    return Response(content=body, status_code=200,
                    media_type="application/json")
yaml prompt-injection-classifier / k8s.yaml — Deployment + Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prompt-injection-classifier
  namespace: agentgateway-system
spec:
  replicas: 2
  selector:
    matchLabels: { app: prompt-injection-classifier }
  template:
    metadata:
      labels: { app: prompt-injection-classifier }
    spec:
      containers:
        - name: classifier
          image: ghcr.io/example/prompt-injection-classifier:0.1.0
          ports: [{ containerPort: 8080, name: http }]
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
          resources:
            requests: { cpu: 100m, memory: 256Mi }
            limits: { cpu: 1, memory: 1Gi }
---
apiVersion: v1
kind: Service
metadata:
  name: prompt-injection-classifier
  namespace: agentgateway-system
spec:
  selector: { app: prompt-injection-classifier }
  ports:
    - name: http
      port: 8080
      targetPort: 8080
      appProtocol: http