Gateway-enforced human approval for MCP tool calls, by Tom O'Rourke

The problem. When several agents reach the same MCP tool through a gateway, and a high-impact tool call needs a person to approve it, where does the approval live? If you build it into each agent, you are trusting every agent to implement the gate, and the approver ends up being whoever is driving that particular agent. That does not hold up when many agents share one tool but a single team owns the decision.

The cleaner answer is to enforce the approval at the gateway. The agent does not know the gate exists. The gateway catches the call, holds it, and asks a reviewer. Because the gate sits in one place in front of the tool, it applies the same way no matter how many agents call it, and approval is centralised and decoupled from the caller.

How it works

The gateway calls an external authorization (ext-authz) service before forwarding a gated call. That service holds the request open while a reviewer decides from a separate queue. If the reviewer approves, the call is forwarded to the MCP server; if the reviewer rejects it or the wait times out, the gateway returns a denial and the tool never runs.

Step by step, for a single gated tool call:

1. An agent issues an MCP tools/call through the gateway.
2. The route for privileged tools carries an AgentgatewayPolicy with extAuth, so the gateway calls the external authorization service before forwarding.
3. The ext-authz service reads the tool name and arguments from the body (via forwardBody, so the reviewer can see what is being asked) and holds the call open. The agent's tool call simply sits pending, with no retries and no error.
4. A reviewer sees the pending request in a separate queue and approves or rejects it.
5. On approve, the gateway forwards the call to the MCP server. On reject or timeout, the gateway returns a denial and the call never reaches the MCP server.

Handshake and discovery frames are not gated. Only tools/call needs a human, so the ext-authz service lets initialize, tools/list and the like pass straight through. There is nothing for a person to approve on those, and parking them would stall the session.
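A minimal sketch of that pass-through check, assuming the forwarded body is a single JSON-RPC request object. The function name needsHuman is made up for illustration and is not part of the approval service shown later.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcFrame is the only part of a JSON-RPC 2.0 request this sketch looks at.
type rpcFrame struct {
	Method string `json:"method"`
}

// needsHuman (hypothetical name) reports whether a request body is an MCP
// tools/call. Everything else, including initialize, tools/list and bodies
// that do not parse as a JSON object, is reported as not gated.
func needsHuman(body []byte) bool {
	var f rpcFrame
	if err := json.Unmarshal(body, &f); err != nil {
		return false // assumption: unparseable bodies pass through rather than park
	}
	return f.Method == "tools/call"
}

func main() {
	frames := []string{
		`{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}`,
		`{"jsonrpc":"2.0","id":2,"method":"tools/list"}`,
		`{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"run_migration","arguments":{"version":"v3"}}}`,
	}
	for _, f := range frames {
		fmt.Println(needsHuman([]byte(f)))
	}
}
```

Passing unparseable bodies through is a choice made here for brevity. On a privileged route, a stricter service might prefer to deny anything it cannot parse.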

The policy

This is the whole gateway-side configuration: an AgentgatewayPolicy that points the privileged route at your approval service over gRPC, with the request body forwarded so the reviewer has context.

apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: privileged-extauth
  namespace: ops-tools
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: mcp-privileged        # only the privileged route is gated
  traffic:
    extAuth:
      backendRef:
        name: hitl-extauth        # your approval service
        namespace: hitl
        port: 9001
      forwardBody:
        maxSize: 65536            # so the reviewer sees the tool name + args
      grpc: {}                    # Envoy ext-authz gRPC contract

gRPC over h2c. The ext-authz service is reached over the Envoy external authorization gRPC contract, so its Kubernetes Service needs appProtocol: kubernetes.io/h2c (HTTP/2 without TLS) on the gRPC port. Without it the gateway cannot speak gRPC to the service.

Choosing which calls get gated

There are two ways to decide which tool calls hit the approval gate.

By path. Put the risky tools on a separate endpoint on the MCP server and attach the extAuth policy only to that route. The MCP server exposes, say, /public and /privileged; the policy targets the /privileged HTTPRoute. The gate is visible in the topology rather than hidden in an expression, and it is the approach used in the lab below.
By body inspection. Keep one endpoint and gate on the tool name carried in the JSON-RPC body. More flexible, since you do not have to split the server, but there is more logic to maintain and reason about.
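For the body-inspection option, the decision could look roughly like the sketch below. It assumes the standard MCP tools/call shape, where the tool name sits in params.name. The gatedTool helper, the gate list and the get_schema_version tool are hypothetical and only there to show the idea.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the parts of an MCP tools/call request used for gating.
type toolCall struct {
	Method string `json:"method"`
	Params struct {
		Name string `json:"name"`
	} `json:"params"`
}

// gatedTool (hypothetical helper) returns the tool name when the body is a
// tools/call for a tool listed in gated, and "" for everything else.
func gatedTool(body []byte, gated map[string]bool) string {
	var c toolCall
	if err := json.Unmarshal(body, &c); err != nil {
		return ""
	}
	if c.Method != "tools/call" || !gated[c.Params.Name] {
		return ""
	}
	return c.Params.Name
}

func main() {
	// Assumption: the list of risky tools is maintained alongside the service.
	gated := map[string]bool{"run_migration": true}

	risky := `{"jsonrpc":"2.0","id":7,"method":"tools/call","params":{"name":"run_migration","arguments":{"version":"v3"}}}`
	harmless := `{"jsonrpc":"2.0","id":8,"method":"tools/call","params":{"name":"get_schema_version","arguments":{}}}`

	fmt.Printf("%q %q\n", gatedTool([]byte(risky), gated), gatedTool([]byte(harmless), gated))
}
```

The gate list is the extra logic to maintain: a new risky tool that is not added to it goes through unreviewed, which is the trade-off against the path-based split.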

The approval service contract

The approval service is small, and it is the one piece you bring. It implements the Envoy ext-authz gRPC contract. What makes the whole pattern work is that the contract puts no deadline of its own on Check(): the call can stay open for as long as the gateway's ext-authz timeout and the caller's own request timeout allow, so the service blocks on that call until a human decides. Both timeouts therefore need to be longer than the slowest approval you expect. A tiny admin API on the side lets a queue UI list and resolve pending calls.

import (
    "context"

    auth_v3 "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
)

// gRPC: the gateway calls Check() before forwarding. We park here.
// parseMCP, ok, denied and e.queue are the service's own helpers.
func (e *extAuthServer) Check(ctx context.Context, req *auth_v3.CheckRequest) (*auth_v3.CheckResponse, error) {
    tool, args := parseMCP(req.GetAttributes().GetRequest().GetHttp().GetBody())

    // Non tools/call frames (initialize, tools/list, ...) pass straight through.
    if tool == "" {
        return ok(), nil
    }

    // Hold the call open until a reviewer decides (or it times out).
    // ctx is passed down so park() can stop waiting if the gateway cancels
    // the RPC, instead of leaking a parked call nobody is waiting for.
    decision := e.queue.park(ctx, tool, args) // blocks on a channel
    if decision.Approved {
        return ok(), nil                   // gateway forwards to the MCP server
    }
    return denied(decision.Reason), nil    // gateway returns 403; MCP never sees it
}

The admin side is just two endpoints a queue UI talks to:

GET  /pending        # list calls currently parked, with tool name + args
POST /decide/{id}    # {"approved": true|false, "reason": "..."}

Keep the queue out-of-band on purpose. The reviewer is not in the agent conversation, so a dedicated queue with an audit trail fits the role. Swap the sample UI for Slack, Backstage or ServiceNow as needed; the gateway-side contract does not change.

Proven end to end

This is not theory. The following was run on a single kind cluster, with the policy above attached to the /privileged route, driving the MCP endpoint directly (no LLM in the loop, to isolate the gateway behaviour):

# Discovery passes straight through, not parked
$ mcp-inspector --cli http://<gateway>/privileged/mcp --method tools/list
{ "tools": [ { "name": "run_migration", ... } ] }      # returns immediately

# A real tool call parks at the gateway
$ mcp-inspector --cli http://<gateway>/privileged/mcp \
    --method tools/call --tool-name run_migration --tool-arg version=v3
#   ...blocks, waiting on approval...

# Meanwhile the call is visible in the reviewer queue
$ curl -s http://<extauth-admin>/pending
{ "pending": [ { "id": "p-19", "toolName": "run_migration",
                 "toolArgs": { "version": "v3" }, "path": "/privileged/mcp" } ] }

# Approve it
$ curl -X POST http://<extauth-admin>/decide/p-19 -d '{"approved":true}'
#   → the parked call unblocks and returns:
{ "migrated": true, "from": "v2", "to": "v3" }         # tool actually ran

# Reject a later call → the gateway denies it, the tool never runs
$ curl -X POST http://<extauth-admin>/decide/p-24 -d '{"approved":false}'
#   → the MCP call fails with a denial; the DB is unchanged

The MCP server's own state confirms it: after one approved v3 migration and one rejected v4, the schema is v3 and the audit log shows a single v2 → v3 entry. The rejected call left no trace, because it never reached the server.

Why this fits "many agents, one approver"