The problem. When several agents reach the same MCP tool through a gateway, and a high-impact tool call needs a person to approve it, where does the approval live? If you build it into each agent, you are trusting every agent to implement the gate, and the approver ends up being whoever is driving that particular agent. That does not hold up when many agents share one tool but a single team owns the decision.
The cleaner answer is to enforce the approval at the gateway. The agent does not know the gate exists. The gateway catches the call, holds it, and asks a reviewer. Because the gate sits in one place in front of the tool, it applies the same way no matter how many agents call it, and approval is centralised and decoupled from the caller.
How it works
The gateway calls the ext-authz service before forwarding. That service holds the request open while a reviewer decides from a separate queue. Approve and the call is forwarded to the MCP server; reject or time out and the gateway returns a denial, so the tool never runs.
Step by step, for a single gated tool call:
- An agent issues an MCP tools/call through the gateway.
- The route for privileged tools carries an AgentgatewayPolicy with extAuth, so the gateway calls the external authorization service before forwarding.
- The ext-authz service reads the tool name and arguments from the body (via forwardBody, so the reviewer can see what is being asked) and holds the call open. The agent's tool call simply sits pending, with no retries and no error.
- A reviewer sees the pending request in a separate queue and approves or rejects it.
- On approve, the gateway forwards the call to the MCP server. On reject or timeout, the gateway returns a denial and the call never reaches the MCP server.
Handshake and discovery frames are not gated. Only tools/call needs a human, so the
ext-authz service lets initialize, tools/list and the like pass straight
through. There is nothing for a person to approve on those, and parking them would stall the session.
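The pass-through rule comes down to classifying each JSON-RPC frame by its method. Here is a minimal sketch of how a helper like parseMCP might do that, assuming the body is a single JSON-RPC object with the standard MCP tools/call shape (params.name and params.arguments). The struct and its field layout are illustrative and not taken from the lab code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcFrame models only the JSON-RPC fields needed to classify an MCP request.
type rpcFrame struct {
	Method string `json:"method"`
	Params struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"params"`
}

// parseMCP returns the tool name and arguments for a tools/call frame.
// For any other frame (initialize, tools/list, ...) it returns an empty
// tool name, which the caller treats as "pass straight through".
func parseMCP(body string) (string, map[string]any) {
	var f rpcFrame
	if err := json.Unmarshal([]byte(body), &f); err != nil {
		return "", nil // note: this sketch fails open on unparseable bodies
	}
	if f.Method != "tools/call" {
		return "", nil
	}
	return f.Params.Name, f.Params.Arguments
}

func main() {
	tool, args := parseMCP(`{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"run_migration","arguments":{"version":"v3"}}}`)
	fmt.Println(tool, args)

	tool, _ = parseMCP(`{"jsonrpc":"2.0","id":2,"method":"tools/list"}`)
	fmt.Println(tool == "")
}
```

This sketch lets through anything it cannot parse. A production gate would probably want to deny or park those bodies instead.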
The policy
This is the whole gateway-side configuration: an AgentgatewayPolicy that points the
privileged route at your approval service over gRPC, with the request body forwarded so the reviewer
has context.
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: privileged-extauth
  namespace: ops-tools
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: mcp-privileged   # only the privileged route is gated
  traffic:
    extAuth:
      backendRef:
        name: hitl-extauth   # your approval service
        namespace: hitl
        port: 9001
      forwardBody:
        maxSize: 65536       # so the reviewer sees the tool name + args
      grpc: {}               # Envoy ext-authz gRPC contract
gRPC over h2c. The ext-authz service is reached over the Envoy external
authorization gRPC contract, so its Kubernetes Service needs
appProtocol: kubernetes.io/h2c (HTTP/2 without TLS) on the gRPC port. Without it the
gateway cannot speak gRPC to the service.
Choosing which calls get gated
There are two ways to decide which tool calls hit the approval gate.
- By path. Put the risky tools on a separate endpoint on the MCP server and attach the extAuth policy only to that route. The MCP server exposes, say, /public and /privileged; the policy targets the /privileged HTTPRoute. The gate is visible in the topology rather than hidden in an expression, and it is the approach used in the lab below.
- By body inspection. Keep one endpoint and gate on the tool name carried in the JSON-RPC body. More flexible, since you do not have to split the server, but there is more logic to maintain and reason about.
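To give a feel for the extra logic the body-inspection option brings, here is a rough sketch of a gating decision keyed on tool name. The gatedTools set, the needsApproval function and the read_status tool are hypothetical names made up for this example. The choice to send unreadable bodies to a human is also an assumption, not something the lab prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gatedTools is a hypothetical set of tool names that require approval.
var gatedTools = map[string]bool{
	"run_migration": true,
}

// needsApproval reports whether a request body should be parked for review.
func needsApproval(body []byte) bool {
	if len(body) == 0 {
		return false // bodyless requests carry no tool call
	}
	var frame struct {
		Method string `json:"method"`
		Params struct {
			Name string `json:"name"`
		} `json:"params"`
	}
	if err := json.Unmarshal(body, &frame); err != nil {
		return true // assumption: fail closed, an unreadable body goes to a human
	}
	return frame.Method == "tools/call" && gatedTools[frame.Params.Name]
}

func main() {
	risky := []byte(`{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"run_migration"}}`)
	safe := []byte(`{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"read_status"}}`)
	fmt.Println(needsApproval(risky), needsApproval(safe))
}
```

With this approach the list of gated tools lives in the approval service rather than in the route topology, so it has to be kept in sync with the MCP server's tool catalogue.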
The approval service contract
The approval service is small, and it is the one piece you bring. It implements the Envoy ext-authz
gRPC contract. The trick that makes the whole pattern work is that the contract itself puts no
deadline on the Check() call, so the service can block on that call until a human
decides or the call times out. A tiny admin API on the side lets a queue UI list and resolve pending calls.
// gRPC: gateway calls Check() before forwarding. We park here.
func (e *extAuthServer) Check(ctx context.Context, req *auth_v3.CheckRequest) (*auth_v3.CheckResponse, error) {
	tool, args := parseMCP(req.GetAttributes().GetRequest().GetHttp().GetBody())

	// Non tools/call frames (initialize, tools/list, ...) pass straight through.
	if tool == "" {
		return ok(), nil
	}

	// Hold the call open until a reviewer decides (or it times out).
	decision := e.queue.park(tool, args) // blocks on a channel
	if decision.Approved {
		return ok(), nil // gateway forwards to the MCP server
	}
	return denied(decision.Reason), nil // gateway returns 403; MCP never sees it
}
The admin side is just two endpoints a queue UI talks to:
GET /pending # list calls currently parked, with tool name + args
POST /decide/{id} # {"approved": true|false, "reason": "..."}
Keep the queue out-of-band on purpose. The reviewer is not in the agent conversation, so a dedicated queue with an audit trail fits the role. Swap the sample UI for Slack, Backstage or ServiceNow as needed; the gateway-side contract does not change.
Proven end to end
This is not theory. On a single kind cluster, with the policy above attached to the
/privileged route, driving the MCP endpoint directly (no LLM in the loop, to isolate the
gateway behaviour):
# Discovery passes straight through, not parked
$ mcp-inspector --cli http://<gateway>/privileged/mcp --method tools/list
{ "tools": [ { "name": "run_migration", ... } ] } # returns immediately
# A real tool call parks at the gateway
$ mcp-inspector --cli http://<gateway>/privileged/mcp \
--method tools/call --tool-name run_migration --tool-arg version=v3
# ...blocks, waiting on approval...
# Meanwhile the call is visible in the reviewer queue
$ curl -s http://<extauth-admin>/pending
{ "pending": [ { "id": "p-19", "toolName": "run_migration",
"toolArgs": { "version": "v3" }, "path": "/privileged/mcp" } ] }
# Approve it
$ curl -X POST http://<extauth-admin>/decide/p-19 -d '{"approved":true}'
# → the parked call unblocks and returns:
{ "migrated": true, "from": "v2", "to": "v3" } # tool actually ran
# Reject a later call → the gateway denies it, the tool never runs
$ curl -X POST http://<extauth-admin>/decide/p-24 -d '{"approved":false}'
# → the MCP call fails with a denial; the DB is unchanged
The MCP server's own state confirms it: after one approved v3 migration and one rejected
v4, the schema is v3 and the audit log shows a single
v2 → v3 entry. The rejected call left no trace, because it never reached the server.
Why this fits "many agents, one approver"
The gate lives at the gateway, not inside any agent. Every agent that calls the tool hits the same approval queue, and the approver is a central reviewer rather than whoever is driving a given agent. Agents do not need to know the gate exists, and you do not have to trust each one to implement it. That is the decoupling you want when one tool is shared but one team owns the decision.
See also
- Full lab: End-User and Platform Approval Gates for MCP Agents (agent-side and gateway-side HITL side by side, on kind)
- Solo agentgateway documentation
- Model Context Protocol