The story: a chat box, an LLM, and one gateway policy that redacts PII in both directions.
The route forwards to Anthropic Claude. In front of it sits an
EnterpriseAgentgatewayPolicy with two promptGuard layers on each side
(request and response): the built-in regex masks for SSN /
credit card / email / phone / Canadian SIN, then a custom webhook for whatever the built-ins miss.
The custom webhook is a tiny Python FastAPI app implementing the
Guardrail Webhook API v0.1.0
contract — two endpoints, three actions per request (Pass, Mask,
Reject). It masks UK National Insurance Numbers, IBANs, and EU passport patterns; it
Rejects obvious prompt-injection prompts so they never hit the LLM.
Everything runs in a single kind cluster. Bring it up with ./scripts/quick.sh up;
full source at
github.com/tjorourke/solo-labs/tree/main/agentic-pii-guardrail-kind.
What you'll build
One LLM route, two guardrail layers per direction. Built-in regex (cyan) runs in-process inside agentgateway —
no extra hop. The webhook (amber) is your extension point for the patterns the built-ins don't ship and for
Reject
decisions like prompt injection. Same pod handles request and response.
Why two layers
| Concern | Built-in regex | vs | Custom webhook |
|---|---|---|---|
| What it catches | SSN, credit card, email, phone, Canadian SIN | vs | Anything you encode — UK NIN, IBAN, EU passport, internal account IDs, language-specific PII… |
| Where it runs | In-process inside agentgateway — no extra hop | vs | Sidecar Service — one sync HTTP RTT per guarded request |
| Action set | Mask or Reject |
vs | Pass / Mask / Reject (Reject not allowed on response) |
| How to change it | Edit the policy YAML — pick from the fixed builtin list | vs | Edit your own code (Python regex, a classifier model, Presidio, anything) |
| Best for | The 80% of PII that's universal | vs | Country-specific, domain-specific, behavioural (injection, runaway, etc.) |
Both layers run in order on the same policy. List item [0] runs first, then [1], etc.
In this lab the built-in regex.builtins mask runs before the webhook sees the prompt, so the
webhook never sees a raw SSN — only the masked one. The inspector UI flags that with a
"built-in regex ran first" badge.
Steps
1. Clone the repo and bring everything up
About — what this does & why
quick.sh up runs 01..04 in order, all idempotent. First run is ~4 minutes
(Docker image builds + Enterprise AGW + custom services). Subsequent runs re-apply manifests but
skip what's already there.
Three things you need before running:
gcloudinstalled and authenticated. The Solo Enterprise AGW Helm chart lives atus-docker.pkg.dev/solo-public/enterprise-agentgateway/charts/...— even though the repo is public, Google Artifact Registry refuses helm OCI pull without an authenticatedhelm registry login.02-agentgateway.shruns thegcloud auth print-access-token | helm registry logindance via theensure_gar_authhelper inlib.sh; you just needgcloudinstalled and logged in. On macOS:brew install --cask google-cloud-sdk && gcloud auth login.ANTHROPIC_API_KEY— the LLM call ends up at Anthropic Claude.AGENTGATEWAY_LICENSE_KEY—promptGuardships on the enterprise wrapper CRD, not the OSS one. The data plane refuses to start without a valid license.
Bashclone, set keys, bring up the kind cluster
# Prereq — one-time gcloud auth so helm can pull the chart from GAR.
# (02-agentgateway.sh will prompt for this automatically if you skip it.)
brew install --cask google-cloud-sdk # macOS, if you don't have gcloud
gcloud auth login
git clone https://github.com/tjorourke/solo-labs.git
cd solo/agentic-pii-guardrail-kind
export ANTHROPIC_API_KEY=sk-ant-...
export AGENTGATEWAY_LICENSE_KEY=... # ask your Solo account team
./scripts/quick.sh up
./scripts/port-forward.sh # leave running
Then open the inspector UI:
- http://localhost:8090 — Data Loss Prevention Inspector
2. What got deployed
About — the inventory
Two custom things and three platform installs. The platform installs come from upstream / Solo Helm charts; the two custom things are the lab itself.
| Component | Namespace | What it is |
|---|---|---|
agentgateway | agentgateway-system | Solo Enterprise AGW control plane (chart at us-docker.pkg.dev/solo-public) |
metallb | metallb-system | Gives the Gateway a routable IP |
pii-gateway | agentgateway-system | Gateway resource (GatewayClass enterprise-agentgateway) — listens on :80 |
anthropic (Backend) | agentgateway-system | AgentgatewayBackend with ai.provider.anthropic.model=claude-haiku-4-5 |
anthropic-guardrails (Policy) | agentgateway-system | EnterpriseAgentgatewayPolicy — promptGuard.request + promptGuard.response |
pii-guardrail-webhook | pii-demo | Python FastAPI — implements /request, /response, /events |
inspector-ui | pii-demo | Go HTMX — chat box + 3-column redaction trace |
3. The policy — two layers, one route
About — why both layers on the same list
promptGuard.request is a list. List items run in order. Item [0] is the built-in
regex mask; item [1] is the webhook. So a credit-card number is masked by the built-ins
before the webhook sees it — by design. The webhook only needs to know about patterns the built-ins
don't ship.
The webhook field is constrained by a CRD oneOf: a single list item can have exactly one of
regex, webhook, or openAIModeration. That's why we use two items.
YAMLyaml/agentgateway/promptguard-policy.yaml — the whole thing
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
name: anthropic-guardrails
namespace: agentgateway-system
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: HTTPRoute
name: anthropic
backend:
ai:
promptGuard:
request:
# Layer 1 — built-in regex masks. Zero code, ships with AGW.
- regex:
action: Mask
builtins: [CreditCard, Ssn, Email, PhoneNumber, CaSin]
# Layer 2 — custom webhook for the rest.
- webhook:
backendRef:
kind: Service
name: pii-guardrail-webhook
namespace: pii-demo
port: 8000
response:
- regex:
action: Mask
builtins: [CreditCard, Ssn, Email, PhoneNumber, CaSin]
- webhook:
backendRef:
kind: Service
name: pii-guardrail-webhook
namespace: pii-demo
port: 8000
4. The webhook — three actions, two endpoints, one process
About — the Guardrail Webhook API contract
agentgateway POSTs the (normalized) prompt to /request before calling the LLM, and the LLM
completion to /response before returning it to the client. Each endpoint must return one of
three actions:
PassAction— content unchanged, proceed.MaskAction— return the modifiedPromptMessages(request) orResponseChoices(response) inbody; AGW substitutes them.RejectAction— return an HTTPstatus_codeandbody; AGW returns that to the client and the LLM is never called. Only valid on/request.
The webhook below uses Mask for EU/UK PII patterns and Reject for prompt-injection patterns. The
/events endpoint is not part of the AGW contract — it's an admin endpoint the inspector UI
polls to render the "what the LLM saw" column.
Pythonsrc/guardrail-webhook/app.py — the request endpoint (shortened)
PII_PATTERNS = [
# UK National Insurance Number: 2 letters + 6 digits + 1 letter (e.g. QQ123456C).
("UK_NINO", re.compile(r"\b[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]\d{6}[A-D]\b")),
# IBAN: 2-letter country, 2 check digits, up to 30 alphanumeric.
("IBAN", re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")),
# EU/UK passport: one letter + 8 digits — matches a lot of national formats.
("EU_PASSPORT", re.compile(r"\b[A-Z]\d{8}\b")),
]
INJECTION_PATTERNS = [
re.compile(r"ignore (all|previous|the above) (instructions|rules|prompts)", re.I),
re.compile(r"disregard (all|previous|the above) (instructions|rules|prompts)", re.I),
re.compile(r"reveal (your |the )?system prompt", re.I),
]
@app.post("/request", response_model=GuardrailsPromptResponse)
def process_request(req: GuardrailsPromptRequest) -> GuardrailsPromptResponse:
# 1. Reject path — any injection pattern in any user/system message.
for msg in req.body.messages:
if msg.role in ("user", "system"):
hit = _check_injection(msg.content)
if hit:
return GuardrailsPromptResponse(action=RejectAction(
body="Request blocked by guardrail webhook: suspected prompt injection.",
status_code=403,
reason=f"prompt-injection pattern matched: {hit!r}",
))
# 2. Mask path — apply EU/UK PII regexes to every message.
redacted, all_matches, any_change = [], [], False
for msg in req.body.messages:
new_content, hits = _redact_text(msg.content)
if hits: any_change = True; all_matches.extend(hits)
redacted.append(Message(role=msg.role, content=new_content))
if any_change:
return GuardrailsPromptResponse(action=MaskAction(
body=PromptMessages(messages=redacted),
reason=f"masked: {', '.join(sorted(set(all_matches)))}",
))
# 3. Pass path — nothing to do.
return GuardrailsPromptResponse(action=PassAction(reason="no PII or injection detected"))
5. The inspector UI — three columns per round-trip
About — how the UI reconstructs the trace
The UI is stateless. On Send it:
- Records a timestamp.
- POSTs
/v1/messagesat the gateway with the user prompt as a single message. - Reads
/events?limit=10from the webhook and binds the first request + response event newer than the timestamp. - Renders three columns: original prompt, what the webhook recorded (Pass / Mask / Reject), and the LLM response (already past the response-side guard).
If the typed prompt and the webhook-recorded "Original" differ, the built-in regex must have run first. The UI surfaces that with a "built-in regex ran first" badge.
Walk through the demo
Open the inspector UI at http://localhost:8090. The four scenes below exercise each of the three actions (Pass, Mask, Reject) and both layers (built-in, webhook).
Scene 1 — clean prompt, no PII
PassAction for both the request and
the response.Scene 2 — SSN, masked by the built-in
regex.builtins: [Ssn, …]
fired before the webhook because it's earlier in the promptGuard.request list. Zero code,
ships with AGW, sub-microsecond cost.
Scene 3 — UK NIN, masked by the webhook
UK_NINO chip and the side-by-side
"redacted vs original" view from the webhook's audit ring: QQ123456C →
[REDACTED:UK_NINO]. Claude answers the pension question on the redacted form.
UK_NINO. The webhook is where you encode country-specific PII, internal account IDs,
Presidio recognizers, an SLM classifier — anything that doesn't fit a fixed regex enum, in code you
control.
Scene 4 — prompt injection, rejected
HTTP 403 body. The LLM was never called — zero tokens billed.
Reject over Mask for injection: a half-redacted injection
prompt is still an injection. RejectAction short-circuits the chain and returns the
configured status_code to the client; the LLM upstream call never happens.
Inspecting state while it runs
Bashpoke at the running cluster — recent decisions, policy status, webhook logs
# Most recent guardrail decisions, newest first
kubectl --context kind-pii -n pii-demo \
exec deploy/pii-guardrail-webhook -- \
wget -qO- http://localhost:8000/events | jq
# Policy attachment status — should show Accepted/Attached True
kubectl --context kind-pii -n agentgateway-system \
get enterpriseagentgatewaypolicy anthropic-guardrails -o yaml \
| grep -A6 conditions
# Webhook logs — every /request and /response call logs phase + action + matches
kubectl --context kind-pii -n pii-demo logs -f deploy/pii-guardrail-webhook
Teardown
./scripts/quick.sh teardown
Talking points
- Built-ins solve 80% with zero code. SSN, credit cards, emails, phones — keep them in
promptGuard.*.regex.builtins. Don't reinvent. - The webhook is your extension point. Country-specific PII, prompt injection, behavioural guardrails — any logic too contextual for a fixed enum, in code you control. The wire contract is 3 actions and 2 endpoints — small surface, easy to test in isolation.
- Request AND response. A user who never typed a SSN can still get one back from the LLM. Mask both directions with the same patterns.
- Reject is for unsafe content; Mask is for sensitive content. Injection prompts must be rejected (a half-redacted injection still works). PII should be masked (preserves the useful parts of the conversation).
- The webhook is sample code, not a Solo product. The Guardrail Webhook API is the Solo-supported integration. The Python server here is your own code, swappable for Presidio, a Bedrock guardrail call, an SLM classifier, or whatever your compliance team prefers.
Troubleshooting
| Symptom | Likely cause | → | Fix |
|---|---|---|---|
02-agentgateway.sh fails with FetchReference … not found or 401 Unauthorized |
Helm OCI auth to us-docker.pkg.dev missing or stale |
→ | gcloud auth login, then gcloud auth print-access-token | helm registry login -u oauth2accesstoken --password-stdin us-docker.pkg.dev (or just re-run 02-agentgateway.sh — it'll do this for you) |
Gateway pod CrashLoopBackoff immediately |
Missing or invalid AGENTGATEWAY_LICENSE_KEY |
→ | kubectl -n agentgateway-system logs deploy/<gateway-pod> — re-helm-install with a valid key |
| 403 on every request | Anthropic API key wrong or missing | → | kubectl -n agentgateway-system get secret anthropic-secret -o yaml — should be base64 of your real key in data.Authorization |
| Inspector UI columns 2 and 3 always identical to column 1 | Policy not attached to the HTTPRoute | → | kubectl -n agentgateway-system get enterpriseagentgatewaypolicy anthropic-guardrails -o yaml \| grep -A6 conditions |
/events empty even after sending |
Webhook Service not reachable from the gateway pod | → | kubectl -n pii-demo get svc pii-guardrail-webhook; check that the policy's webhook.backendRef.namespace matches |
Gateway IP stays <pending> for >2 min |
MetalLB pool exhausted | → | kubectl -n metallb-system get ipaddresspool kind-pool -o yaml and adjust |
Use the inspector against your own gateway
The inspector UI is generic. It speaks two LLM wire formats
(anthropic-messages, openai-chat) and the Guardrail
Webhook integration is optional — set webhook.url for the
redaction trace, leave it empty to run as a simple "you sent / LLM returned"
smoke-test client. There's a Helm chart at
charts/pii-inspector-ui.
Bashinstall against your own AGW (with webhook trace)
helm install inspector ./charts/pii-inspector-ui \
--namespace my-ai-demo --create-namespace \
--set agw.url=http://my-gateway.agentgateway-system.svc.cluster.local \
--set agw.format=anthropic-messages \
--set webhook.url=http://my-guardrail.guardrail-system.svc.cluster.local:8000
kubectl -n my-ai-demo port-forward svc/inspector-pii-inspector-ui 8090:80
open http://localhost:8090
Bashinstall against your own AGW (generic mode, no webhook yet)
# Two-column mode: what you sent, what came back. Useful when you don't have
# a Guardrail Webhook implementation yet but want to smoke-test a route — a
# Reject from a built-in regex shows up as the raw 403 body under column 3.
helm install inspector ./charts/pii-inspector-ui \
--namespace my-ai-demo --create-namespace \
--set agw.url=http://my-gateway.agentgateway-system.svc.cluster.local \
--set agw.format=openai-chat
Full README — values reference, recipes for each LLM format, the
/events JSON shape your webhook needs to expose for trace mode,
troubleshooting — lives next to the chart:
charts/pii-inspector-ui/README.md.
See also
- Inspector Helm chart README — drop this in front of any AGW LLM route
- Solo — Custom Guardrails Webhook API
- Solo — Webhook OpenAPI spec
- Solo — Prompt guards (built-in regex)
- Sibling lab — Two-Layer HITL for an MCP Agent on Kind
Versions
Built and verified on both editions:
v1.3.0v1.5.1v2.3.4v1.4.0