Single-cluster Istio hides the identity story. istiod
spins up its own CA, signs a SPIFFE cert for every workload, and
mTLS just works inside that one trust domain. There is no CRD to
write; it happens by default.
Multi-cluster breaks that. Two clusters, two auto-generated
roots, two trust domains that have never heard of each other.
A ztunnel in cluster-east tries to validate the
cert a ztunnel in cluster-west presents — and
refuses. The fix isn't "configure trust on both clusters". It's
"configure trust once, in the mgmt cluster, and let
the operator push the same intermediate everywhere." That's
what RootTrustPolicy is. Everything below is
the trust chain drawn out end to end.
How a SPIFFE identity is born
Top-down: the root lives off-cluster —
cert-manager ClusterIssuer, Vault PKI, or a
pre-applied Secret. gloo-mesh-mgmt-server reads the
RootTrustPolicy (which points at a CASource
and an IdentityProvider), asks that root to sign a
per-cluster intermediate, and ships it to each cluster's
gloo-mesh-agent over the relay. The agent drops it
into istio-system as the cacerts Secret.
istiod-gloo picks it up and starts signing SVIDs.
Every cluster's intermediate chains to the same root, so when
ztunnel-east meets ztunnel-west on the wire, both sides build
the chain and the handshake succeeds.
What's in the SVID handshake
The diagram above stops at "ztunnel has a cert." What does ztunnel
actually do with it? When a pod in cluster-east
talks to a pod in cluster-west, the local ztunnel
grabs the connection, opens an HBONE tunnel to the remote east-west
gateway, and the two sides swap X.509 certs whose SAN is a SPIFFE
URI. Step by step:
ztunnel ↔ ztunnel · HBONE mTLS
        cluster-east                                           cluster-west
┌──────────────────────────┐                           ┌──────────────────────────┐
│ ztunnel (client side)    │                           │ ztunnel (server side)    │
│ SA: payments/checkout    │                           │ SA: payments/ledger      │
└──────────────┬───────────┘                           └─────────────▲────────────┘
               │ ① TCP :15008 (HBONE) ───────────────────────────────│
               │                                                     │
               │ ② ClientHello                                       │
               │     ALPN: h2                                        │
               │     SNI : outbound_.15008_._.eastwest-gw.west       │
               │                                                     │
               │ ③ ServerHello + Certificate                         │
               │ ◀────────────────────────────────────────────────── │
               │     X.509 leaf:                                     │
               │       Subject: O=spiffe.io                          │
               │       SAN URI: spiffe://cluster.local/ns/payments/  │
               │                sa/ledger                            │
               │       Issuer : intermediate-cluster-west            │
               │     Chain: leaf → int-west → SHARED ROOT ✓          │
               │                                                     │
               │ ④ Certificate (client side, mTLS)                   │
               │ ──────────────────────────────────────────────────▶ │
               │     SAN URI: spiffe://cluster.local/ns/payments/    │
               │              sa/checkout                            │
               │     Chain: leaf → int-east → SHARED ROOT ✓          │
               │                                                     │
               │ ⑤ Finished · session keys derived                   │
               │ ◀─────────────────────────────────────────────────▶ │
               │                                                     │
               │ ⑥ HBONE CONNECT payments/ledger:8080                │
               │ ──────────────────────────────────────────────────▶ │
               │     (inner stream: original L4 / L7 payload)        │
               │                                                     │
               ▼                                                     ▼
Both peers chain the presented leaf back to the SAME root —
the one declared in RootTrustPolicy. If either side cannot build
that chain, the handshake fails before any HBONE bytes flow.
Two things to hold onto. The SPIFFE URI in the SAN is
the workload's identity — namespace plus ServiceAccount inside
the trust domain, not an IP and not a hostname. AuthorizationPolicy
rules key on this URI, full stop. And mTLS here is properly
symmetric: at step ④ the client side presents its own SVID, and
the server validates it against the same root. Cross-cluster
only works because both intermediates chain back to that one
shared root — which is the whole reason
RootTrustPolicy exists.
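To make "rules key on this URI" concrete, here is a minimal shell sketch of how a peer's SPIFFE URI could be compared with an AuthorizationPolicy principal. `principal_matches` is a hypothetical helper, not an Istio or Gloo command. It assumes principals are written without the `spiffe://` scheme and carry at most one leading or trailing `*`, which is how Istio documents exact, prefix, suffix, and presence matching.

```shell
# Sketch only: principal_matches is a hypothetical helper for illustration.
# Usage: principal_matches <spiffe-uri> <principal-pattern>
# Returns 0 when the identity satisfies the pattern, non-zero otherwise.
principal_matches() (
  # Principals omit the scheme, so strip "spiffe://" before comparing.
  id="${1#spiffe://}"
  pattern="$2"
  case "$pattern" in
    '*')   # presence match: any non-empty identity
      [ -n "$id" ] ;;
    \**)   # suffix match, e.g. "*/sa/checkout"
      suffix="${pattern#\*}"
      case "$id" in *"$suffix") true ;; *) false ;; esac ;;
    *\*)   # prefix match, e.g. "cluster.local/ns/payments/*"
      prefix="${pattern%\*}"
      case "$id" in "$prefix"*) true ;; *) false ;; esac ;;
    *)     # exact match
      [ "$id" = "$pattern" ] ;;
  esac
)
```

For example, `principal_matches spiffe://cluster.local/ns/payments/sa/checkout 'cluster.local/ns/payments/*'` succeeds, while the same identity tested against `cluster.local/ns/payments/sa/ledger` does not. Nothing in the comparison involves an IP or a hostname.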
The CRDs, group by group
🔐 RootTrustPolicy admin.gloo.solo.io/v2
This is the one knob. Pick the root, set the rotation policy, tell the mgmt-server where the per-cluster intermediates are coming from — in-cluster generator, cert-manager, or Vault. You write exactly one of these per mesh, in the mgmt cluster. If you're tempted to write a second, you've misunderstood it; back up.
RootTrustPolicy · generated CA (demo / lab: auto-generates root + intermediates)
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
name: shared-trust
namespace: gloo-mesh
spec:
config:
# Mgmt-server generates a root keypair on first apply, and signs
# a fresh intermediate for every registered workload cluster.
mgmtServerCa:
generated: {}
intermediateCertOptions:
# Rotate when 10% of the cert's lifetime is left.
secretRotationGracePeriodRatio: 0.10
# Bounce pods that hold an SVID so the new one is picked up.
autoRestartPods: true
RootTrustPolicy · cert-manager ClusterIssuer (production: root lives in cert-manager)
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
name: shared-trust
namespace: gloo-mesh
spec:
config:
intermediateCertOptions:
secretRotationGracePeriodRatio: 0.10
autoRestartPods: true
# Each workload cluster's intermediate is issued by a
# cert-manager ClusterIssuer. The mgmt-server creates a
# Certificate resource per cluster and waits for it to be
# signed before shipping the result over the relay channel.
agentCa:
certManager:
issuerRef:
group: cert-manager.io
kind: ClusterIssuer
name: gloo-mesh-root
RootTrustPolicy · HashiCorp Vault PKI (enterprise: root lives off-cluster in Vault)
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
name: shared-trust
namespace: gloo-mesh
spec:
config:
intermediateCertOptions:
secretRotationGracePeriodRatio: 0.10
autoRestartPods: true
agentCa:
vault:
# Sign intermediates against the pki_int mount; pki (root)
# itself stays sealed and never leaves Vault.
caPath: pki_int/sign/gloo-mesh
rolePath: pki_int/roles/gloo-mesh
server: https://vault.internal:8200
authMethod:
# The mgmt-server's ServiceAccount token is exchanged for
# a short-lived Vault token via the kubernetes auth method.
kubernetes:
mountPath: kubernetes
role: gloo-mesh
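The `secretRotationGracePeriodRatio: 0.10` setting in the policies above is easier to reason about with numbers. The sketch below only does the arithmetic implied by the comment "rotate when 10% of the cert's lifetime is left". How the mgmt-server computes the moment internally is an assumption, and `rotation_epoch` and `rotation_due` are hypothetical helper names.

```shell
# Sketch only: arithmetic for "rotate when <ratio> of the lifetime is left".
# rotation_epoch <not_before_epoch> <not_after_epoch> <ratio>
# Prints the Unix timestamp at which the grace period begins.
rotation_epoch() (
  awk -v nb="$1" -v na="$2" -v r="$3" \
    'BEGIN { printf "%.0f\n", na - (na - nb) * r }'
)

# rotation_due <now_epoch> <not_before_epoch> <not_after_epoch> <ratio>
# Succeeds once "now" has reached the start of the grace period.
rotation_due() (
  [ "$1" -ge "$(rotation_epoch "$2" "$3" "$4")" ]
)
```

With a 90-day intermediate (7776000 seconds) and a ratio of 0.10, `rotation_epoch 0 7776000 0.10` lands at second 6998400, which is day 81. The epochs can be read from a real cert with `openssl x509 -noout -dates` and converted with `date`.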
🪪 CASource & IdentityProvider admin.gloo.solo.io/v2
Two CRDs, one job each. CASource says where the
CA bytes physically come from — a Secret, a cert-manager
ClusterIssuer, a Vault PKI mount. IdentityProvider
pins the SPIFFE trustDomain and the per-cluster
name that ends up in the SVID URI. The names don't help — read
it as "source of bytes" plus "shape of URI".
CASource · BYO bundle in a Secret (references a Secret holding ca-cert + ca-key)
apiVersion: admin.gloo.solo.io/v2
kind: CASource
metadata:
name: byo-root
namespace: gloo-mesh
spec:
# The Secret must contain:
# ca-cert.pem: the intermediate cert
# ca-key.pem: its private key
# root-cert.pem: the root cert all clusters trust
# cert-chain.pem: the chain from the intermediate up to the root
secret:
name: shared-ca-bundle
namespace: gloo-mesh
IdentityProvider · SPIFFE trust-domain config (drives the SAN URI shape)
apiVersion: admin.gloo.solo.io/v2
kind: IdentityProvider
metadata:
name: spiffe-default
namespace: gloo-mesh
spec:
spiffe:
# Trust-domain segment of every SVID URI. Single value, shared
# by all clusters in this mesh.
trustDomain: cluster.local
# Per-cluster overlay: each KubernetesCluster gets a cluster name
# baked into istiod-gloo so it can write the cluster boundary into
# the workload SAN if you choose a per-cluster trust domain later.
clusterName: cluster-east
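The CASource example expects four keys in the referenced Secret, and a missing one is an easy mistake to make when assembling a bring-your-own bundle. A minimal sketch of a pre-flight check follows. `missing_cacerts_keys` is a hypothetical helper, and the key list is taken from the comment in the CASource example rather than from a schema.

```shell
# Sketch only: missing_cacerts_keys is a hypothetical pre-flight helper.
# Reads the Secret's key names on stdin (one per line) and prints every
# expected key that is absent. Prints nothing when the bundle is complete.
missing_cacerts_keys() (
  present="$(cat)"
  for key in ca-cert.pem ca-key.pem root-cert.pem cert-chain.pem; do
    # -x: whole-line match, -F: literal string, -q: quiet
    printf '%s\n' "$present" | grep -qxF -- "$key" || echo "$key"
  done
)

# Example (assumes kubectl and jq are available on your machine):
#   kubectl -n gloo-mesh get secret shared-ca-bundle -o json \
#     | jq -r '.data | keys[]' | missing_cacerts_keys
```

An empty result means all four keys are present. It says nothing about whether the certs inside actually chain to each other.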
📜 SVID format on the wire
Not a CRD — this is the actual cert your workload presents.
Here's the URI shape, and three different ways to peek at one in
flight depending on whether you're in sidecar mode, ambient
mode, or just hammering the gateway with openssl
from your laptop.
spiffe://... · the SVID URI shape (what lands in the leaf's SAN)
# Generic shape
spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
# A real one — the "checkout" ServiceAccount in the "payments"
# namespace inside a single shared trust domain "cluster.local":
spiffe://cluster.local/ns/payments/sa/checkout
# Inside the X.509 leaf, this lives in:
# X509v3 Subject Alternative Name:
# URI:spiffe://cluster.local/ns/payments/sa/checkout
# Istio authorization policies key on this URI, not on IP or
# hostname.
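When reading SANs out of cert dumps it helps to split the URI back into its parts. The sketch below does that for the generic shape shown above. `parse_spiffe_uri` is a hypothetical helper, and it deliberately rejects anything that is not exactly `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`.

```shell
# Sketch only: parse_spiffe_uri is a hypothetical helper for illustration.
# Prints the three components of an Istio-style SPIFFE URI, or fails with
# a message on stderr when the input does not match the expected shape.
parse_spiffe_uri() (
  parsed="$(printf '%s\n' "$1" | sed -nE \
    's#^spiffe://([^/]+)/ns/([^/]+)/sa/([^/]+)$#trust_domain=\1 namespace=\2 service_account=\3#p')"
  if [ -z "$parsed" ]; then
    echo "not an Istio-style SPIFFE URI: $1" >&2
    exit 1
  fi
  printf '%s\n' "$parsed"
)
```

Calling `parse_spiffe_uri spiffe://cluster.local/ns/payments/sa/checkout` prints `trust_domain=cluster.local namespace=payments service_account=checkout`.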
istioctl proxy-config secret · dump SDS from a sidecar pod (sidecar mode)
# List the SDS secrets the sidecar holds — there are usually two:
# default — the workload's own SVID
# ROOTCA — the trust bundle the sidecar validates
# peers against
istioctl proxy-config secret -n payments deploy/checkout
# Get the PEM of the leaf and pipe through openssl to see the SAN:
istioctl proxy-config secret -n payments deploy/checkout \
-o json \
| jq -r '.dynamicActiveSecrets[]
| select(.name=="default")
| .secret.tlsCertificate.certificateChain.inlineBytes' \
| base64 -d \
| openssl x509 -noout -text \
| grep -A1 'Subject Alternative Name'
ztunnel debug · workload identities (ambient mode; exact command may differ by ztunnel rev)
# ztunnel exposes a localhost admin endpoint (default :15000) with
# JSON over /config_dump, /workloads, /certs and friends. The exact
# path / command name moves between ztunnel revisions, so check the
# binary on your cluster.
#
# Approximate: dump the workload table on one node.
kubectl -n istio-system exec ds/ztunnel -- \
curl -s http://localhost:15000/config_dump | jq '.workloads'
# Approximate: dump the certs ztunnel currently holds.
kubectl -n istio-system exec ds/ztunnel -- \
curl -s http://localhost:15000/config_dump | jq '.certificates'
# Or, with a recent istioctl, the dedicated sub-commands
# (`istioctl x ztunnel-config` on older releases):
# istioctl ztunnel-config workload
# istioctl ztunnel-config certificate
# Match the command to the ztunnel rev installed on your
# cluster; `meshctl version` will tell you which it is.
openssl s_client · peek at the east-west gateway's cert (no kubectl needed if the LB is reachable)
# Open the eastwest gateway on :15008 (HBONE) and dump the cert it
# presents. Useful when validating that cluster-east and cluster-west
# both chain to the same root.
EW_GW=$(kubectl --context=cluster-west -n istio-eastwest \
get svc istio-eastwestgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
openssl s_client \
-connect $EW_GW:15008 \
-servername outbound_.15008_._.eastwest-gw.istio-eastwest.svc.cluster.local \
-showcerts < /dev/null \
| openssl x509 -noout -text \
| grep -E 'Subject:|Issuer:|URI:spiffe'
CLI — inspect identity at runtime
Once the policy is applied and the agents are talking, day-2 boils down to two questions: did the intermediate actually land in this cluster, and is this specific pod presenting the SVID it should be? Five commands, in roughly that order.
🔍 Day-2 — verify trust is wired end-to-end ops
Walk the chain the same way the bytes flow. Relay healthy? Did
cacerts show up in istio-system? Is
the pod (or ztunnel) actually serving the right SDS? And
finally, what does the gateway hand out when something on the
other side of the internet asks? If all four answer yes, trust
is wired.
meshctl check relay · is the agent connected at all? (cert distribution rides this channel)
# The intermediate + signing key are pushed from the mgmt-server to
# each cluster's gloo-mesh-agent over the relay channel (gRPC mTLS).
# If relay is unhealthy, cert rotation will silently lag — so this
# is the first thing to check when SVIDs go stale.
meshctl check relay \
--kubecontext=gloo-mgmt \
--remote-context=cluster-east
# Full health check (relay + agent + CA reconcile state):
meshctl check --kubecontext=cluster-east
kubectl get secret cacerts · the intermediate that landed (one per workload cluster)
# The agent writes the intermediate + key into a Secret called
# `cacerts` in istio-system. istiod-gloo mounts that Secret and
# uses it as its SPIFFE CA.
for C in cluster-east cluster-west cluster-central; do
echo "--- $C ---"
kubectl --context=$C -n istio-system get secret cacerts \
-o jsonpath='{.data.ca-cert\.pem}' \
| base64 -d \
| openssl x509 -noout -subject -issuer -dates
done
# Sanity-check: every cluster's intermediate should share the same
# Issuer DN (= the shared root).
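Comparing issuer lines by eye gets unreliable past two or three clusters. A minimal sketch of an automated comparison follows. `issuers_agree` is a hypothetical helper that treats each non-empty input line as one cluster's issuer and succeeds only when all of them are identical.

```shell
# Sketch only: issuers_agree is a hypothetical helper for illustration.
# Reads one issuer line per cluster on stdin. Succeeds when exactly one
# distinct non-empty value is seen, fails on a mismatch or empty input.
issuers_agree() (
  distinct="$(sort -u | awk 'NF { n++ } END { print n + 0 }')"
  [ "$distinct" -eq 1 ]
)

# Example (same kubectl / openssl pipeline as the loop above, one line each):
#   for C in cluster-east cluster-west cluster-central; do
#     kubectl --context="$C" -n istio-system get secret cacerts \
#       -o jsonpath='{.data.ca-cert\.pem}' | base64 -d \
#       | openssl x509 -noout -issuer
#   done | issuers_agree && echo "shared root" || echo "ROOT MISMATCH"
```

The check compares the DN text only. Two different roots that happen to share a DN would pass, so compare fingerprints of `root-cert.pem` if that is a concern in your environment.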
istioctl proxy-config secret · what SDS the sidecar holds (sidecar mode)
# A sidecar holds two SDS secrets:
# default — the workload's leaf SVID (rotated automatically)
# ROOTCA — the trust bundle it validates peers against
istioctl proxy-config secret -n payments deploy/checkout
# Expected output: two secrets, each with STATUS ACTIVE and VALID CERT
# true. If "default" is stuck in WARMING or its cert is not valid, the
# SDS exchange with istiod is failing.
ztunnel debug · what identity ztunnel is serving (ambient mode · command name varies)
# Ambient mode: each pod's identity is held by the local ztunnel,
# not by a sidecar. Hit ztunnel's admin endpoint to see the SVIDs
# it currently has and which workloads they map to.
kubectl -n istio-system exec ds/ztunnel -- \
curl -s http://localhost:15000/config_dump \
| jq '.workloads, .certificates'
# A recent istioctl wraps the same data in dedicated sub-commands
# (`istioctl x ztunnel-config` on older releases):
# istioctl ztunnel-config certificate
# Check `istioctl ztunnel-config --help` for the build you run; the
# exact sub-command names have moved between releases.
openssl s_client · peek at the gateway's cert from outside (no kubectl needed)
# Confirm every cluster's east-west gateway chains to the same root.
for CTX in cluster-east cluster-west cluster-central; do
  IP=$(kubectl --context="$CTX" -n istio-eastwest \
    get svc istio-eastwestgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  echo "--- $CTX ($IP) ---"
  # The leaf's issuer is the per-cluster intermediate, so keep only the
  # last cert in the presented chain and print its issuer instead. This
  # assumes the gateway sends its intermediate along with the leaf.
  openssl s_client \
    -connect "$IP:15008" \
    -servername outbound_.15008_._.eastwest-gw.istio-eastwest.svc.cluster.local \
    -showcerts < /dev/null 2>/dev/null \
    | awk '/BEGIN CERTIFICATE/{c=""} {c=c $0 "\n"} /END CERTIFICATE/{last=c} END{printf "%s", last}' \
    | openssl x509 -noout -issuer
done
# Every line should show the same Issuer DN: the root from
# RootTrustPolicy.
Trust & identity reference
Every moving part in the identity chain, and which cluster it lives in.
| Resource | Group | API | What it does | Lives where |
|---|---|---|---|---|
| RootTrustPolicy | trust | admin.gloo.solo.io/v2 | Top-level CRD that pins the shared root, names the issuer (generated / cert-manager / Vault), and sets rotation policy. | mgmt cluster · namespace gloo-mesh |
| CASource | source | admin.gloo.solo.io/v2 | Abstracts where the CA bytes come from: Secret, cert-manager ClusterIssuer or Vault PKI. Referenced by RootTrustPolicy. | mgmt cluster · namespace gloo-mesh |
| IdentityProvider | identity | admin.gloo.solo.io/v2 | Pins the SPIFFE trustDomain and per-cluster name. Shapes the SAN URI on every issued SVID. | mgmt cluster · namespace gloo-mesh |
| cacerts Secret | artifact | v1/Secret | The per-cluster intermediate cert + signing key + root chain that istiod-gloo loads as its SPIFFE CA. | each workload cluster · istio-system |
| SPIFFE SVID | artifact | X.509 + SAN URI | Per-workload identity cert. SAN URI = spiffe://<trust-domain>/ns/<ns>/sa/<sa>. Authorization policies key on this. | in-memory · held by sidecar or ztunnel |
| gloo-mesh-mgmt-server | component | - | Signs a per-cluster intermediate (or asks cert-manager / Vault to do so) and ships it to each agent over the relay channel. | mgmt cluster |
| gloo-mesh-agent | component | - | Receives the intermediate over relay and writes it as cacerts in istio-system; restarts pods if autoRestartPods is true. | each workload cluster |
| istiod-gloo | component | - | Mounts cacerts, becomes the SPIFFE CA, services SDS / CSR requests from local ztunnels (and sidecars). | each workload cluster · istio-system |
| ztunnel | component | - | Per-node L4 proxy. Holds each local workload's SVID, presents it in HBONE handshakes, validates peer certs against the trust bundle. | each workload cluster · DaemonSet |
Where to go from here
Read this next to the
Gloo Operator CRD map —
that's where RootTrustPolicy, CASource
and IdentityProvider sit alongside the lifecycle
CRDs that put Ambient on the cluster in the first place. And if
you want to see what the SVID earns its keep doing once it's on
the wire, the HBONE east-west
reference picks up where this page leaves off.
Upstream references worth bookmarking:
- SPIFFE concepts & SVID format: read this first if the URI shape looks weird to you.
- Istio security model: how istiod issues SVIDs and enforces mTLS.
- Istio Ambient architecture: ztunnel, HBONE, and where pod traffic actually goes.
- Solo docs: field-level reference for the CRDs above.