Agentic / MCP Lab — Waypoint federation, JWT RBAC, OAuth2 token exchange

Six labs, each building on the previous. They share two assumptions:

The standup lab has passed (AG v2.3.3 + CRDs present, ambient mode on, Gloo UI optional).
The environment variables CLUSTER1=kind-east-ag and CLUSTER2=kind-west-ag are exported.

Labs run on $CLUSTER1 (east-ag) by default to match the upstream gist. The Bonus at the end shows the multicluster twist — federate orders-mcp from west-ag through the same waypoint.

Labs

LAB 1

Deploy MCP servers in `ai-tools` + caller pods in `ai-agents`

Two ambient namespaces, three MCP backends, two client identities. Mirrors Lab 2 of the upstream gist.

Create ambient namespaces

About — what this does & why

What: Creates two namespaces on east — ai-tools (MCP servers) and ai-agents (callers) — and labels both with istio.io/dataplane-mode=ambient.

Why: Splitting tools from callers makes the upcoming AuthZ policies legible — "allow callers from ai-agents to reach tools in ai-tools". Ambient enrolment is what gives every pod a SPIFFE identity for the workload-identity AuthZ in LAB 2, and it's what puts ztunnel on the wire so the waypoint can claim traffic later.

kubectl --context $CLUSTER1 create ns ai-tools
kubectl --context $CLUSTER1 label  ns ai-tools  istio.io/dataplane-mode=ambient --overwrite

kubectl --context $CLUSTER1 create ns ai-agents
kubectl --context $CLUSTER1 label  ns ai-agents istio.io/dataplane-mode=ambient --overwrite

Three MCP backends in `ai-tools`

inventory-mcp is a plain HTTP server. catalog-mcp and orders-mcp are MCP SSE servers running @modelcontextprotocol/server-everything via npx.

About — what this does & why

What: Deploys three workloads, each with its own ServiceAccount: inventory-mcp (a plain HTTP fetcher used as the SPIFFE-only target in LAB 2), catalog-mcp and orders-mcp (both running @modelcontextprotocol/server-everything over SSE — these are the MCP servers federated in LABs 3-5).

Why: Each workload gets its own SA so the SPIFFE identity in the mTLS handshake is per-server (spiffe://cluster.local/ns/ai-tools/sa/<name>). The MCP SSE backends are exposed on port 3001 with appProtocol: http so the waypoint knows to terminate L7. Three distinct backends are the minimum needed to show federation (LAB 3) and per-server tool scoping (LAB 4).

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata: { name: inventory-mcp, namespace: ai-tools }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: inventory-mcp, namespace: ai-tools }
spec:
  replicas: 1
  selector: { matchLabels: { app: inventory-mcp } }
  template:
    metadata: { labels: { app: inventory-mcp } }
    spec:
      serviceAccountName: inventory-mcp
      containers:
      - name: app
        image: ghcr.io/peterj/mcp-website-fetcher:main
        ports: [{ containerPort: 8000 }]
---
apiVersion: v1
kind: Service
metadata: { name: inventory-mcp, namespace: ai-tools }
spec:
  selector: { app: inventory-mcp }
  ports: [{ name: http, port: 80, targetPort: 8000 }]
---
apiVersion: v1
kind: ServiceAccount
metadata: { name: catalog-mcp, namespace: ai-tools }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: catalog-mcp, namespace: ai-tools }
spec:
  replicas: 1
  selector: { matchLabels: { app: catalog-mcp } }
  template:
    metadata: { labels: { app: catalog-mcp } }
    spec:
      serviceAccountName: catalog-mcp
      containers:
      - name: app
        image: node:20-alpine
        command: ["npx", "-y", "@modelcontextprotocol/server-everything", "sse"]
        ports: [{ containerPort: 3001 }]
---
apiVersion: v1
kind: Service
metadata: { name: catalog-mcp, namespace: ai-tools }
spec:
  selector: { app: catalog-mcp }
  ports: [{ name: http, port: 3001, targetPort: 3001, appProtocol: http }]
---
apiVersion: v1
kind: ServiceAccount
metadata: { name: orders-mcp, namespace: ai-tools }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: orders-mcp, namespace: ai-tools }
spec:
  replicas: 1
  selector: { matchLabels: { app: orders-mcp } }
  template:
    metadata: { labels: { app: orders-mcp } }
    spec:
      serviceAccountName: orders-mcp
      containers:
      - name: app
        image: node:20-alpine
        command: ["npx", "-y", "@modelcontextprotocol/server-everything", "sse"]
        ports: [{ containerPort: 3001 }]
---
apiVersion: v1
kind: Service
metadata: { name: orders-mcp, namespace: ai-tools }
spec:
  selector: { app: orders-mcp }
  ports: [{ name: http, port: 3001, targetPort: 3001, appProtocol: http }]
EOF

kubectl --context $CLUSTER1 -n ai-tools wait --for=condition=Ready pod --all --timeout=300s
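
The per-ServiceAccount identities described above follow a fixed pattern, so the expected SPIFFE IDs can be written down before any policy is applied. A minimal sketch, assuming the cluster.local trust domain used in this lab; spiffe_id is a hypothetical helper name, not part of kubectl or istioctl:

```shell
# Hypothetical helper: build the SPIFFE ID for a ServiceAccount.
# Usage: spiffe_id <namespace> <serviceaccount> [trust-domain]
spiffe_id() {
  printf 'spiffe://%s/ns/%s/sa/%s\n' "${3:-cluster.local}" "$1" "$2"
}

# Print the identities of the three MCP backends deployed above.
for sa in inventory-mcp catalog-mcp orders-mcp; do
  spiffe_id ai-tools "$sa"
done
```

The same helper gives the caller identities used in LAB 2, for example spiffe_id ai-agents dev-ui.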

Two client identities in `ai-agents`

dev-ui will be allowed; rogue-ui will be denied.

About — what this does & why

What: Deploys two long-running curl pods backed by distinct ServiceAccounts, dev-ui and rogue-ui, that we use to send traffic to the MCP servers.

Why: Two SPIFFE identities are needed to demonstrate the workload-identity AuthZ in the next lab — one allowed, one denied. sleep infinity keeps them around for repeated kubectl exec curl calls without needing port-forwards.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata: { name: dev-ui, namespace: ai-agents }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: dev-ui, namespace: ai-agents }
spec:
  replicas: 1
  selector: { matchLabels: { app: dev-ui } }
  template:
    metadata: { labels: { app: dev-ui } }
    spec:
      serviceAccountName: dev-ui
      containers:
      - name: c
        image: curlimages/curl:8.5.0
        command: ["sleep", "infinity"]
---
apiVersion: v1
kind: ServiceAccount
metadata: { name: rogue-ui, namespace: ai-agents }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: rogue-ui, namespace: ai-agents }
spec:
  replicas: 1
  selector: { matchLabels: { app: rogue-ui } }
  template:
    metadata: { labels: { app: rogue-ui } }
    spec:
      serviceAccountName: rogue-ui
      containers:
      - name: c
        image: curlimages/curl:8.5.0
        command: ["sleep", "infinity"]
EOF

kubectl --context $CLUSTER1 -n ai-agents wait --for=condition=Ready pod --all --timeout=120s

Baseline — both clients can reach `inventory-mcp` (no waypoint yet)

About — what this does & why

What: Hits inventory-mcp from both dev-ui and rogue-ui over the regular Service hostname, prints the HTTP status code.

Why: Establishes the "before" state — without a waypoint and without an AuthZ policy, ambient mTLS still encrypts the calls but doesn't filter them. Both clients should get 200. After LAB 2 the same command will return 200 for dev-ui and 403 for rogue-ui — the diff is what proves the policy is doing real work.

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui   -- curl -s -o /dev/null -w "dev-ui  → inventory-mcp: %{http_code}\n" http://inventory-mcp.ai-tools/
kubectl --context $CLUSTER1 -n ai-agents exec deploy/rogue-ui -- curl -s -o /dev/null -w "rogue-ui → inventory-mcp: %{http_code}\n" http://inventory-mcp.ai-tools/
# Both lines should end with 200.

LAB 2

Waypoint + workload-identity authz (SPIFFE)

Drop an enterprise-agentgateway-waypoint in front of inventory-mcp and gate access by the caller's mesh identity. No JWT yet: the decision uses only the SPIFFE identity from the peer mTLS certificate that ztunnel verifies.

Deploy the waypoint Gateway + parameters

About — what this does & why

What: Creates an EnterpriseAgentgatewayParameters CR pinning the istio cluster ID and trust domain, plus a Gateway of class enterprise-agentgateway-waypoint labelled istio.io/waypoint-for: service.

Why: The parameters CR is the per-waypoint config knob — istioClusterId tells the waypoint binary which clusterID to claim (it defaults to "Kubernetes", which istiod-gloo doesn't recognise), and trustDomain: cluster.local must match the SMC's trust domain or the SPIFFE peer-cert check fails. The waypoint-for: service label scopes this waypoint to Service-level interception (rather than pod-level), which is what we want for the AuthZ policy to apply by destination Service.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata: { name: waypoint-params, namespace: ai-tools }
spec:
  istioClusterId: east-ag
  ca:
    trustDomain: cluster.local
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agw-waypoint
  namespace: ai-tools
  labels:
    istio.io/waypoint-for: service
spec:
  gatewayClassName: enterprise-agentgateway-waypoint
  listeners:
  - name: mesh
    protocol: HTTP
    port: 15088
    allowedRoutes: { namespaces: { from: Same } }
  infrastructure:
    parametersRef:
      group: enterpriseagentgateway.solo.io
      kind: EnterpriseAgentgatewayParameters
      name: waypoint-params
EOF

kubectl --context $CLUSTER1 -n ai-tools wait \
  --for=condition=Ready pod -l gateway.networking.k8s.io/gateway-name=agw-waypoint \
  --timeout=180s

Route `inventory-mcp` through the waypoint

About — what this does & why

What: Labels the inventory-mcp Service with istio.io/use-waypoint=agw-waypoint, then queries ztunnel-config to confirm.

Why: This is the one knob that redirects traffic destined for the inventory-mcp Service through the waypoint instead of straight to a pod. Without the label, ztunnel keeps doing direct L4 delivery and the waypoint (and therefore the AuthZ policy) is bypassed entirely.

kubectl --context $CLUSTER1 -n ai-tools label svc inventory-mcp \
  istio.io/use-waypoint=agw-waypoint --overwrite

istioctl --context $CLUSTER1 ztunnel-config service | grep inventory-mcp
# ai-tools …  inventory-mcp …  agw-waypoint  1/1   ← waypoint claimed it

Apply the workload-identity authorization policy

About — what this does & why

What: Applies an EnterpriseAgentgatewayPolicy targeting the agw-waypoint Gateway that allows traffic only when source.identity.serviceAccount == "dev-ui".

Why: This is the identity-aware AuthZ that Solo's stack offers in place of hand-written AuthorizationPolicy. The source.identity.* attributes come from the SPIFFE identity that ztunnel verified during the peer mTLS handshake, so they can't be spoofed by a curl header. action: Allow with a single matchExpression is an explicit allowlist: anything else is denied with 403.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata: { name: waypoint-authz, namespace: ai-tools }
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: agw-waypoint
  traffic:
    authorization:
      action: Allow
      policy:
        matchExpressions:
        - 'source.identity.serviceAccount == "dev-ui"'
EOF

Test — `dev-ui` allowed, `rogue-ui` denied

About — what this does & why

What: Replays the baseline calls from LAB 1 — dev-ui and rogue-ui both curling inventory-mcp.

Why: The result is the proof. dev-ui should still see 200 and rogue-ui now sees 403 — same network path, same code, different SPIFFE identity. The 403 is generated by the waypoint, not by inventory-mcp itself; rogue-ui's request never reaches the application.

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui   -- curl -s -o /dev/null -w "dev-ui  → inventory-mcp: %{http_code}\n" http://inventory-mcp.ai-tools/
kubectl --context $CLUSTER1 -n ai-agents exec deploy/rogue-ui -- curl -s -o /dev/null -w "rogue-ui → inventory-mcp: %{http_code}\n" http://inventory-mcp.ai-tools/
# → dev-ui  → inventory-mcp: 200
#   rogue-ui → inventory-mcp: 403
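
The before/after matrix can also be checked mechanically rather than by eye. A minimal sketch: probe and expect_code are hypothetical helper names. probe only wraps the kubectl exec and curl call used above and needs the live cluster, while expect_code is plain shell.

```shell
# Hypothetical helper: run the lab's curl from a client Deployment, print only the status code.
probe() {  # probe <client-deployment> <url>
  kubectl --context "$CLUSTER1" -n ai-agents exec "deploy/$1" -- \
    curl -s -o /dev/null -w '%{http_code}' "$2"
}

# Hypothetical helper: compare an observed status code with the expected one.
expect_code() {  # expect_code <label> <expected> <actual>
  if [ "$2" = "$3" ]; then
    echo "PASS $1: $3"
  else
    echo "FAIL $1: expected $2, got $3"
    return 1
  fi
}

# Usage against the cluster, once the policy is applied:
#   expect_code dev-ui   200 "$(probe dev-ui   http://inventory-mcp.ai-tools/)"
#   expect_code rogue-ui 403 "$(probe rogue-ui http://inventory-mcp.ai-tools/)"
```

Running the same two checks with 200 expected for both clients reproduces the LAB 1 baseline.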

LAB 3

MCP multiplexing — one entry-point, many MCP servers

A virtual mcp-gateway Service (no backing pods) fronts an AgentgatewayBackend that federates both catalog-mcp and orders-mcp. Clients open one MCP session and see both tool catalogues namespaced as catalog_* / orders_*.

Virtual entry-point Service

About — what this does & why

What: Creates a Service called mcp-gateway in ai-tools whose selector matches no pods, labelled istio.io/use-waypoint: agw-waypoint.

Why: The waypoint owns this hostname end-to-end — there are no backing pods because the upstream is the AgentgatewayBackend applied in the next step. This is the agentgateway idiom for "give me a stable DNS name clients can use, and let me handle the multiplexing in policy". Without a Service, there's no name for ztunnel to resolve and HTTPRoutes have nowhere to land.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mcp-gateway
  namespace: ai-tools
  labels:
    istio.io/use-waypoint: agw-waypoint
spec:
  ports:
  - { name: http, port: 80, targetPort: 80, appProtocol: http }
  selector:
    nonexistent: "true"   # No pods — the waypoint owns this hostname
EOF

AgentgatewayBackend with two MCP targets

About — what this does & why

What: Defines an AgentgatewayBackend with two MCP SSE targets (catalog and orders) and an HTTPRoute binding the mcp-gateway Service to that backend.

Why: AgentgatewayBackend is the type that understands MCP semantics — it speaks the protocol to each upstream, multiplexes the tools/list responses, and namespaces tool names by target (catalog_* / orders_*) so a single MCP session sees a federated catalogue. The HTTPRoute is what binds it to the entry-point Service; Gateway API's parentRefs requires a routable parent.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata: { name: mcp-be, namespace: ai-tools }
spec:
  mcp:
    targets:
    - name: catalog
      static:
        host: catalog-mcp.ai-tools.svc.cluster.local
        port: 3001
        protocol: SSE
    - name: orders
      static:
        host: orders-mcp.ai-tools.svc.cluster.local
        port: 3001
        protocol: SSE
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata: { name: mcp-route, namespace: ai-tools }
spec:
  parentRefs:
  - { group: "", kind: Service, name: mcp-gateway }
  rules:
  - backendRefs:
    - { group: agentgateway.dev, kind: AgentgatewayBackend, name: mcp-be }
EOF

Call `tools/list` as `dev-ui` — both servers federated

About — what this does & why

What: From dev-ui, POSTs a JSON-RPC tools/list request to mcp-gateway and counts the tool names returned.

Why: Confirms federation is working end-to-end: one MCP session, two upstream MCP servers, names prefixed by target. The expected ~26 tools is 13 from catalog + 13 from orders — the agentgateway merged them into a single response.

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  sh -c 'curl -sS -X POST http://mcp-gateway.ai-tools/mcp \
    -H "Content-Type: application/json" \
    -H "Accept: application/json,text/event-stream" \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\"}"' \
  | tr "," "\n" | grep -E '"name"' | head -30
# → ~26 tools: 13 catalog_* + 13 orders_*
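
Counting tools per target makes the federation easier to verify than scanning the raw output. A minimal sketch, assuming tool names appear as "name":"<target>_<tool>" string fields in the response; tool_names and count_prefix are hypothetical helpers, and the sample response is made up for illustration, not captured from a real run:

```shell
# Hypothetical helper: read a tools/list response on stdin, print one tool name per line.
tool_names() {
  grep -oE '"name"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -E 's/.*:[[:space:]]*"([^"]*)"$/\1/'
}

# Hypothetical helper: count the names on stdin that start with a given target prefix.
count_prefix() {  # count_prefix <prefix>
  grep -c "^$1" || true   # grep -c exits non-zero on zero matches; the count is still printed
}

# Made-up sample response (three tools) for illustration only.
sample='{"jsonrpc":"2.0","id":1,"result":{"tools":[{"name":"catalog_echo"},{"name":"catalog_get-sum"},{"name":"orders_echo"}]}}'

printf '%s\n' "$sample" | tool_names | count_prefix catalog_
printf '%s\n' "$sample" | tool_names | count_prefix orders_
```

Against the cluster, pipe the output of the curl command above into tool_names instead of the sample.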

LAB 4

Admin-level MCP tool scoping

Restrict the federation to two tools (echo + get-sum). The waypoint filters tools/list and rejects any tools/call that names a forbidden tool with JSON-RPC -32602 Unknown tool.

About — what this does & why

What: Applies an EnterpriseAgentgatewayPolicy targeting the AgentgatewayBackend with a backend.mcp.authorization allowlist of two tool names (echo and get-sum).

Why: The previous AuthZ policy targeted the Gateway and saw HTTP attributes (the SPIFFE identity in particular). This one targets the backend and sees MCP-aware attributes (mcp.tool.name, mcp.tool.target, etc.) — the agentgateway parses MCP frames at L7 and evaluates the policy against the parsed JSON-RPC payload. This is the Solo answer to "filter tool calls at the gateway" rather than relying on per-app filters.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata: { name: mcp-tool-allowlist, namespace: ai-tools }
spec:
  targetRefs:
  - group: agentgateway.dev
    kind: AgentgatewayBackend
    name: mcp-be
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
          - 'mcp.tool.name == "echo"'
          - 'mcp.tool.name == "get-sum"'
EOF

Verify — 26 tools collapses to 4

About — what this does & why

What: Re-runs the same tools/list call as LAB 3 and checks the tool count.

Why: The tool list shrinks from ~26 to 4 (catalog_echo, catalog_get-sum, orders_echo, orders_get-sum) without restarting any pod or changing any MCP server config; the change is policy-only, applied at the gateway. This is the rug-pull defence at the gateway layer: a compromised upstream can advertise as many tools as it likes, but only those on the allowlist reach the caller.

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  sh -c 'curl -sS -X POST http://mcp-gateway.ai-tools/mcp \
    -H "Content-Type: application/json" \
    -H "Accept: application/json,text/event-stream" \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\"}"' \
  | tr "," "\n" | grep -E '"name"' | head
# → catalog_echo, catalog_get-sum, orders_echo, orders_get-sum
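
The allowlist is described as covering tools/call as well as tools/list. A minimal sketch of building a tools/call body and spotting the -32602 rejection: tools_call_body and rpc_error_code are hypothetical helpers, the message argument passed to echo is an assumption about the server's input schema, and the rejection shown is a made-up example shaped like a JSON-RPC error object.

```shell
# Hypothetical helper: build a JSON-RPC tools/call body from a tool name and a JSON arguments object.
tools_call_body() {  # tools_call_body <tool-name> <arguments-json>
  printf '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"%s","arguments":%s}}\n' "$1" "$2"
}

# Hypothetical helper: print the JSON-RPC error code found on stdin, or nothing if there is none.
rpc_error_code() {
  sed -nE 's/.*"error"[[:space:]]*:[[:space:]]*\{[^}]*"code"[[:space:]]*:[[:space:]]*(-?[0-9]+).*/\1/p'
}

# Made-up rejection for illustration only.
rejected='{"jsonrpc":"2.0","id":2,"error":{"code":-32602,"message":"Unknown tool"}}'

tools_call_body catalog_echo '{"message":"hi"}'
printf '%s\n' "$rejected" | rpc_error_code
```

To try it against the gateway, pass the output of tools_call_body as the -d body of the curl command above.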

LAB 5

JWT authentication + per-user RBAC

Add JWT validation on top of the SPIFFE-only authz. Two sample subjects (alice and bob) get different per-tool access bound to jwt.sub.

The JWTs and JWKS used in the upstream gist are sample-only (issuer http://lab6, valid until ~2036). Copy them from the gist into env vars ALICE_JWT, BOB_JWT, and the JWKS JSON below. Do not reuse them in production.

Apply JWT authentication on the waypoint

About — what this does & why

What: Applies an EnterpriseAgentgatewayPolicy on the waypoint Gateway with jwtAuthentication.mode: Strict, expecting a bearer token signed by issuer http://lab6 with audience lab6, validated against an inline JWKS.

Why: Adds user-level identity on top of the workload-level SPIFFE check from LAB 2. Strict means missing/invalid JWTs return 401 — they never reach an app. Inline JWKS keeps this lab self-contained; in production you'd point at a real IdP's JWKS URL. The jwt.* claims become available to the per-user RBAC policy in the next step.

kubectl --context $CLUSTER1 apply -f - <<EOF
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata: { name: mcp-jwt-auth, namespace: ai-tools }
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: agw-waypoint
  traffic:
    jwtAuthentication:
      mode: Strict
      providers:
      - issuer: "http://lab6"
        audiences: ["lab6"]
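        jwks:
          # The unquoted heredoc expands the LAB6_JWKS variable; export it as single-line JSON
          # (or replace the line under "inline" with the JWKS pasted from the upstream gist).
          inline: |
            ${LAB6_JWKS}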
EOF
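
Before sending a token through the waypoint, it can help to confirm that its claims match what the policy expects (issuer http://lab6, audience lab6). A minimal sketch that decodes the base64url payload of a JWT locally; jwt_payload and jwt_claim are hypothetical helpers, they do not verify the signature, and jwt_claim handles only string-valued claims:

```shell
# Hypothetical helper: print the decoded payload (claims) of a JWT. Does NOT verify the signature.
jwt_payload() {  # jwt_payload <jwt>
  p=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the padding; restore it so base64 -d accepts the input.
  case $(( ${#p} % 4 )) in
    2) p="$p==" ;;
    3) p="$p=" ;;
  esac
  printf '%s' "$p" | base64 -d
}

# Hypothetical helper: extract one string-valued claim from the decoded payload.
jwt_claim() {  # jwt_claim <jwt> <claim>
  jwt_payload "$1" | sed -nE 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"([^"]*)".*/\1/p'
}

# Usage, once ALICE_JWT is exported:
#   jwt_claim "$ALICE_JWT" iss
#   jwt_claim "$ALICE_JWT" sub
```

If iss does not match the issuer configured in the policy, Strict mode is expected to reject the request before any authorization rule is evaluated.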

Layer per-user tool RBAC on the backend

About — what this does & why

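What: Supersedes the LAB 4 tool allowlist with a richer one that joins jwt.sub against mcp.tool.name and mcp.tool.target: alice can call catalog_echo and orders_get-sum; bob is limited to catalog_echo. The new policy is a separate object (mcp-user-rbac), so applying it does not remove mcp-tool-allowlist; the LAB 4 policy is deleted first, otherwise its user-independent rules would still be present alongside the new ones.

Why: The policy engine evaluates user identity (from the validated JWT) and MCP semantics (from the parsed JSON-RPC) in the same expression. This is what makes per-user tool RBAC work at the gateway, with no per-user logic in any MCP server and no in-app filter. Different subjects get different surface areas.

# Remove the LAB 4 allowlist so that only the per-user rules apply
kubectl --context $CLUSTER1 -n ai-tools delete enterpriseagentgatewaypolicy mcp-tool-allowlist --ignore-not-found

kubectl --context $CLUSTER1 apply -f - <<'EOF'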
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata: { name: mcp-user-rbac, namespace: ai-tools }
spec:
  targetRefs:
  - group: agentgateway.dev
    kind: AgentgatewayBackend
    name: mcp-be
  backend:
    mcp:
      authorization:
        action: Allow
        policy:
          matchExpressions:
          - 'jwt.sub == "alice" && mcp.tool.name == "echo"    && mcp.tool.target == "catalog"'
          - 'jwt.sub == "alice" && mcp.tool.name == "get-sum" && mcp.tool.target == "orders"'
          - 'jwt.sub == "bob"   && mcp.tool.name == "echo"    && mcp.tool.target == "catalog"'
EOF
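
The three expressions amount to a small access table. A minimal sketch that models the table locally, assuming the expressions are alternatives (any match allows), which is how the lab describes the outcome; allowed is a hypothetical function, and the real decision is made by the gateway's policy engine, not by this code:

```shell
# Hypothetical local model of the three matchExpressions above.
allowed() {  # allowed <jwt.sub> <mcp.tool.name> <mcp.tool.target>
  case "$1|$2|$3" in
    'alice|echo|catalog')   return 0 ;;
    'alice|get-sum|orders') return 0 ;;
    'bob|echo|catalog')     return 0 ;;
    *)                      return 1 ;;   # allowlist: no matching rule means deny
  esac
}

# Print the expected visibility matrix for both subjects.
for sub in alice bob; do
  for tool in catalog_echo catalog_get-sum orders_echo orders_get-sum; do
    target=${tool%%_*}   # text before the first underscore, e.g. catalog
    name=${tool#*_}      # text after the first underscore, e.g. get-sum
    if allowed "$sub" "$name" "$target"; then verdict=allow; else verdict=deny; fi
    echo "$sub $tool $verdict"
  done
done
```

The allow rows of the printed matrix are the tool names each subject should see in the tests that follow.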

Test as alice — should see `catalog_echo` + `orders_get-sum`

About — what this does & why

What: Calls tools/list with alice's JWT in the Authorization header.

Why: Demonstrates the RBAC table from the policy in practice — alice gets exactly the two tools her clauses allow. Anything outside her allowlist is filtered server-side before the response reaches her client.

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  sh -c "curl -sS -X POST http://mcp-gateway.ai-tools/mcp \
    -H 'Authorization: Bearer ${ALICE_JWT}' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json,text/event-stream' \
    -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\"}'" \
  | tr "," "\n" | grep -E '"name"'

Test as bob — only sees `catalog_echo`

About — what this does & why

What: Same call as alice but with bob's JWT.

Why: Same SPIFFE caller (dev-ui), same MCP backend, same network path — the only thing that changes is the bearer token. The policy engine reads jwt.sub == "bob" and applies the narrower allowlist. This is what users mean by "user-aware RBAC at the gateway".

kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  sh -c "curl -sS -X POST http://mcp-gateway.ai-tools/mcp \
    -H 'Authorization: Bearer ${BOB_JWT}' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json,text/event-stream' \
    -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\"}'" \
  | tr "," "\n" | grep -E '"name"'

LAB 6

OAuth2 token exchange (RFC 8693) — on-behalf-of upstream calls

Enable the agentgateway's built-in Security Token Service. User JWTs from the upstream IdP are exchanged for short-lived OBO tokens signed by the AGW STS, with the original sub preserved. Upstreams gate on the STS issuer, not the user IdP.

This lab needs the AGW controller installed with tokenExchange.enabled: true. Re-run the AG helm install with the values fragment below, then continue.

Re-install AG with STS enabled

About — what this does & why

What: Re-runs the agentgateway helm install with tokenExchange.enabled: true, pointing the subject validator at a mock IdP's JWKS URL and using K8s ServiceAccount tokens for the actor/api validator.

Why: Token Exchange (RFC 8693) is gated behind a chart flag rather than enabled by default — turning it on adds the STS endpoint and the OBO token-minting code path. tokenExpiration: 24h bounds how long an exchanged token stays usable. --reuse-values preserves the licence key set during the standup install.

helm upgrade --install enterprise-agentgateway \
  oci://us-docker.pkg.dev/solo-public/enterprise-agentgateway/charts/enterprise-agentgateway \
  --kube-context $CLUSTER1 \
  --namespace agentgateway-system \
  --version v2.3.3 \
  --set licensing.licenseKey="$AGENTGATEWAY_LICENSE_KEY" \
  --reuse-values \
  -f - <<'EOF'
tokenExchange:
  enabled: true
  issuer: "enterprise-agentgateway.agentgateway-system.svc.cluster.local:7777"
  tokenExpiration: 24h
  subjectValidator:
    validatorType: remote
    remoteConfig:
      url: "http://mock-idp.tokenexchange-test.svc.cluster.local/.well-known/jwks.json"
  actorValidator: { validatorType: k8s }
  apiValidator: { validatorType: k8s }
EOF

Mock IdP + httpbin upstream in a test namespace

About — what this does & why

What: Creates an ambient namespace tokenexchange-test and deploys two pods: mock-idp (nginx serving the LAB 5 JWKS at /.well-known/jwks.json) and httpbin-up (a stand-in upstream that echoes the request back so we can inspect headers).

Why: Self-contained lab — no internet IdP required. httpbin-up serves as the "upstream" that gates on the STS issuer (rather than the user IdP) — the next two steps prove a user JWT goes through STS exchange before reaching it.

kubectl --context $CLUSTER1 create ns tokenexchange-test
kubectl --context $CLUSTER1 label  ns tokenexchange-test istio.io/dataplane-mode=ambient --overwrite

# Mock IdP serves the same JWKS as Lab 5. The unquoted heredoc expands the LAB6_JWKS
# variable (single-line JSON); alternatively paste the JWKS in place of that line.
kubectl --context $CLUSTER1 apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata: { name: mock-idp-jwks, namespace: tokenexchange-test }
data:
  jwks.json: |
    ${LAB6_JWKS}
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: mock-idp, namespace: tokenexchange-test }
spec:
  replicas: 1
  selector: { matchLabels: { app: mock-idp } }
  template:
    metadata: { labels: { app: mock-idp } }
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports: [{ containerPort: 80 }]
        volumeMounts:
        - { name: jwks, mountPath: /usr/share/nginx/html/.well-known }
      volumes:
      - name: jwks
        configMap:
          name: mock-idp-jwks
          items: [{ key: jwks.json, path: jwks.json }]
---
apiVersion: v1
kind: Service
metadata: { name: mock-idp, namespace: tokenexchange-test }
spec:
  selector: { app: mock-idp }
  ports: [{ name: http, port: 80, targetPort: 80, appProtocol: http }]
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: httpbin-up, namespace: tokenexchange-test }
spec:
  replicas: 1
  selector: { matchLabels: { app: httpbin-up } }
  template:
    metadata: { labels: { app: httpbin-up } }
    spec:
      containers:
      - { name: c, image: kennethreitz/httpbin, ports: [{ containerPort: 80 }] }
---
apiVersion: v1
kind: Service
metadata: { name: httpbin-up, namespace: tokenexchange-test }
spec:
  selector: { app: httpbin-up }
  ports: [{ name: http, port: 80, targetPort: 80, appProtocol: http }]
EOF

Exchange a user JWT for an OBO token

About — what this does & why

What: POSTs alice's JWT to the agentgateway STS endpoint as subject_token with grant type urn:ietf:params:oauth:grant-type:token-exchange and captures the returned access_token into $OBO.

Why: This is the RFC 8693 exchange step — the STS validates alice's JWT against the IdP's JWKS, then mints a short-lived OBO ("on-behalf-of") token signed by the STS itself but preserving the original sub claim. Downstream services trust the STS issuer rather than every user IdP they could conceivably see.

OBO=$(kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  curl -sS -X POST http://enterprise-agentgateway.agentgateway-system:7777/oauth2/token \
    -d "grant_type=urn:ietf:params:oauth:grant-type:token-exchange" \
    -d "subject_token=${ALICE_JWT}" \
    -d "subject_token_type=urn:ietf:params:oauth:token-type:jwt" \
  | sed -n 's/.*"access_token":"\([^"]*\)".*/\1/p')
echo "OBO token (first 40 chars): ${OBO:0:40}…"
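
The sed pipeline above leaves OBO empty when the exchange fails, which is easy to miss. A minimal sketch of a stricter parser: access_token_or_fail is a hypothetical helper, and both sample responses are made up for illustration, with field names following RFC 8693 and RFC 6749.

```shell
# Hypothetical helper: print access_token from a token-exchange response on stdin;
# print the raw response to stderr and fail when the field is missing.
access_token_or_fail() {
  resp=$(cat)
  tok=$(printf '%s' "$resp" | sed -nE 's/.*"access_token"[[:space:]]*:[[:space:]]*"([^"]*)".*/\1/p')
  if [ -z "$tok" ]; then
    echo "token exchange failed: $resp" >&2
    return 1
  fi
  printf '%s\n' "$tok"
}

# Made-up responses for illustration only.
ok='{"access_token":"aaa.bbb.ccc","issued_token_type":"urn:ietf:params:oauth:token-type:jwt","token_type":"Bearer","expires_in":86400}'
bad='{"error":"invalid_request","error_description":"subject_token is invalid"}'

printf '%s' "$ok"  | access_token_or_fail
printf '%s' "$bad" | access_token_or_fail || echo "exchange rejected"
```

In the lab command, this helper would take the place of the trailing sed so that a failed exchange stops the run instead of producing an empty bearer token.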

Proof — user JWT denied, OBO token accepted by the STS-gated waypoint

About — what this does & why

What: Sends the same request to httpbin-up twice — once with alice's raw user JWT, once with the OBO token from the previous step.

Why: The user JWT gets 401 (wrong issuer: httpbin-up's waypoint trusts the STS issuer, not http://lab6); the OBO gets 200. This presumes httpbin-up is fronted by a waypoint whose JWT policy accepts only the STS issuer; with no such policy attached, both calls return 200. This is the load-bearing demo of why token exchange exists: upstreams trust one local STS instead of an open set of user IdPs, and the sub claim survives so audit trails still attribute the call to alice.

echo "User JWT direct:"
kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  curl -s -o /dev/null -w "  HTTP %{http_code}\n" \
  -H "Authorization: Bearer ${ALICE_JWT}" \
  http://httpbin-up.tokenexchange-test/headers
# → HTTP 401 (wrong issuer)

echo "OBO token:"
kubectl --context $CLUSTER1 -n ai-agents exec deploy/dev-ui -- \
  curl -s -o /dev/null -w "  HTTP %{http_code}\n" \
  -H "Authorization: Bearer ${OBO}" \
  http://httpbin-up.tokenexchange-test/headers
# → HTTP 200 — OBO signed by STS, sub preserved

BONUS

Multicluster twist — federate `orders-mcp` from west-ag

Our standup is multicluster (east-ag + west-ag, peered over HBONE). The upstream workshop is single-cluster. Bolt-on: redeploy orders-mcp on west-ag instead of east-ag, label it global, and point the AgentgatewayBackend at the *.mesh.internal hostname so the same MCP session multiplexes across two clusters via the east-west HBONE GW.

Move orders-mcp to west-ag

About — what this does & why

What: Deletes the east-ag orders-mcp Deployment / Service / SA, then re-creates the same workload on west-ag in a freshly-labelled ambient ai-tools namespace. The west-side Service is labelled solo.io/service-scope: global and istio.io/global: "true".

Why: Moves the workload across the HBONE fabric so the federation in LAB 3 has to traverse two clusters. The global labels tell Solo Istio peering to register a *.mesh.internal hostname that resolves cross-cluster — without them, the workload is reachable only locally on west-ag and east's waypoint has no way to find it.

kubectl --context $CLUSTER1 -n ai-tools delete deploy/orders-mcp svc/orders-mcp serviceaccount/orders-mcp

# Recreate ai-tools on west and deploy the same SA + Deployment + Service
kubectl --context $CLUSTER2 create ns ai-tools
kubectl --context $CLUSTER2 label  ns ai-tools istio.io/dataplane-mode=ambient \
  topology.istio.io/network=west-ag --overwrite

kubectl --context $CLUSTER2 apply -n ai-tools -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata: { name: orders-mcp }
---
apiVersion: apps/v1
kind: Deployment
metadata: { name: orders-mcp }
spec:
  replicas: 1
  selector: { matchLabels: { app: orders-mcp } }
  template:
    metadata: { labels: { app: orders-mcp } }
    spec:
      serviceAccountName: orders-mcp
      containers:
      - name: app
        image: node:20-alpine
        command: ["npx", "-y", "@modelcontextprotocol/server-everything", "sse"]
        ports: [{ containerPort: 3001 }]
---
apiVersion: v1
kind: Service
metadata:
  name: orders-mcp
  labels:
    solo.io/service-scope: global
    istio.io/global: "true"
  annotations:
    networking.istio.io/traffic-distribution: Any
spec:
  selector: { app: orders-mcp }
  ports: [{ name: http, port: 3001, targetPort: 3001, appProtocol: http }]
EOF

Point the AgentgatewayBackend at the global hostname

About — what this does & why

What: Re-applies the AgentgatewayBackend with the orders target host changed to orders-mcp.ai-tools.svc.west-ag.mesh.internal (cluster-scoped global hostname).

Why: The agentgateway pod is itself an ambient workload — ztunnel intercepts its outbound DNS, resolves the *.svc.<cluster>.mesh.internal name to a synthetic VIP istiod programmed for the remote endpoint, and HBONE-tunnels the request to west-ag's east-west GW. Caller (dev-ui), JWT policy, and tool-scoping policy are all unchanged — the only difference is one upstream now lives in a different cluster, and the demo works the same way.

kubectl --context $CLUSTER1 apply -f - <<'EOF'
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata: { name: mcp-be, namespace: ai-tools }
spec:
  mcp:
    targets:
    - name: catalog
      static:
        host: catalog-mcp.ai-tools.svc.cluster.local
        port: 3001
        protocol: SSE
    - name: orders
      static:
        host: orders-mcp.ai-tools.svc.west-ag.mesh.internal
        port: 3001
        protocol: SSE
EOF

Now tools/list through the east-ag waypoint returns federated catalog (local) + orders (west cluster, via HBONE). One MCP session, cross-cluster, same JWT and tool scoping policies apply.

The cross-cluster *.mesh.internal hostname-by-cluster pattern works because the east-ag agentgateway pod is in ambient — ztunnel intercepts the outbound DNS resolution and routes to the synthetic VIP that istiod programs with the remote endpoint. Mesh-layer failover from the standup lab uses the same machinery.

Cleanup

About — what this does & why

What: Disables Token Exchange on the agentgateway controller, then deletes ai-tools / ai-agents / tokenexchange-test from east and ai-tools from west.

Why: Keeping the standup platform intact while ripping out only the lab artefacts means you can re-run any lab from scratch without rebuilding the clusters. Setting tokenExchange.enabled=false returns the agentgateway to the lab-1-through-5 baseline.

# Roll back the AG STS install (return tokenExchange.enabled to false)
helm upgrade --install enterprise-agentgateway \
  oci://us-docker.pkg.dev/solo-public/enterprise-agentgateway/charts/enterprise-agentgateway \
  --kube-context $CLUSTER1 --namespace agentgateway-system --version v2.3.3 \
  --set licensing.licenseKey="$AGENTGATEWAY_LICENSE_KEY" \
  --set tokenExchange.enabled=false

# Drop the lab namespaces (also removes all per-lab resources)
kubectl --context $CLUSTER1 delete ns ai-tools ai-agents tokenexchange-test --ignore-not-found
kubectl --context $CLUSTER2 delete ns ai-tools --ignore-not-found

Where to next

Standup lab — the platform underneath these labs
Cloud connectivity lab — cross-cluster failover, in-cluster L7 waypoint routing, egress control
rvennam — AgentGateway Waypoint Workshop — the upstream gist this lab ports
RFC 8693 — OAuth 2.0 Token Exchange
