MastertheMesh
Solo Enterprise for Istio · Reference
Visual reference

HBONE + east-west — what a cross-cluster Ambient packet does

Tom O'Rourke
EMEA Field CTO · Solo.io

One TCP packet, from a pod in cluster-east to a service in cluster-west — hop by hop. What ztunnel wraps it in, why the remote east-west gateway on :15008 is on the path at all, what's actually in the SNI, and where the inner mTLS terminates.

HBONE · ztunnel · east-west gateway · SPIFFE · multi-network

HBONE is HTTP/2 over TLS, used as an L4 tunnel. Inner stream: a single HTTP/2 CONNECT that re-establishes the original TCP connection between two workloads. Outer envelope: a TLS 1.3 handshake whose SNI is the receiver's SPIFFE identity. TLS-in-TLS, identity in SNI, the gateway forwards by reading SNI alone. This page is the packet-level walkthrough — one hop at a time — so the design is concrete rather than hand-waved.
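Since the SNI carries a SPIFFE identity, it helps to see how one decomposes. The following is a minimal sketch, assuming the Istio-style path layout /ns/<namespace>/sa/<service-account> used in the examples on this page; the WorkloadIdentity type and parse_spiffe_id function are hypothetical names for illustration, not part of ztunnel or Istio.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkloadIdentity:
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Split an Istio-style SPIFFE ID into trust domain, namespace and SA."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # Assumed layout: /ns/<namespace>/sa/<service-account>
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError(f"unexpected path layout: {parsed.path!r}")
    return WorkloadIdentity(parsed.netloc, parts[1], parts[3])


identity = parse_spiffe_id("spiffe://cluster.local/ns/payments/sa/checkout")
print(identity.trust_domain, identity.namespace, identity.service_account)
```

The trust domain is the part the gateway compares against its accepted set; the namespace and service account identify the workload.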

Two clusters on different networks can't reach each other's pod IPs, so the destination cluster exposes exactly one well-known endpoint — the east-west gateway on :15008 — and that gateway knows how to read the SNI and hand the tunnel off to the right ztunnel. Below: the four hops, what each component does to the bytes, the CRDs that wire it up, and the five common failure modes.

The path, end to end

[Diagram: source cluster-east (network: network-east, trust: cluster.local) to destination cluster-west (network: network-west, trust: cluster.local) across a WAN / VPC peering link. The link is cross-network, so pod IPs are not routable.]
Boxes, in path order: Client Pod (app=cart, 10.10.5.20, SA: cart) → ztunnel-east (DaemonSet, per node; CNI redirect intercept; wraps packet in HBONE) → istio-eastwestgateway in cluster-west (Service: LoadBalancer, :15008, HBONE; terminates outer TLS) → ztunnel-west (DaemonSet, per node; unwraps HBONE, delivers plain L4) → Destination Pod (app=checkout, 10.20.7.30:8080).
Hops: 1 plain TCP, CNI redirect · 2 HBONE (TLS + HTTP/2) to :15008, SNI = spiffe://cluster.local/ns/payments/sa/checkout · 3 HBONE relay, SNI-routed · 4 plain TCP.
Notes: identity carried in SNI · outer TLS terminates at the gateway · inner mTLS is end-to-end ztunnel ↔ ztunnel. The source cluster does NOT use its own east-west gateway for outbound traffic; ztunnel-east dials the remote gateway IP directly. The east-west gateway in cluster-east (dashed box, lower-left) is inbound only and is not on this outbound path.
Legend: pod · ztunnel · east-west gateway · HBONE tunnel (TLS + HTTP/2) · plain TCP

Read it like this. The client pod fires a plain TCP packet at the destination IP. On the way out of the node, istio-cni grabs it and redirects it into ztunnel-east. ztunnel wraps it in HBONE (outer TLS 1.3, with the SNI carrying the destination workload's SPIFFE identity) and dials the remote cluster's east-west gateway on :15008. The gateway terminates the outer TLS, reads the SNI, and opens a fresh HBONE tunnel to ztunnel-west. ztunnel-west unwraps it and delivers a plain TCP segment into the destination pod. And the source cluster's own east-west gateway? Not on this path: it only matters for traffic coming in.

What happens at each hop

One block per hop on the diagram. The YAML or pseudo-config under each is the thing you'd actually touch — or grep for — to see what's happening at that point in the path.

STEP 01

Pod sends a TCP packet

The cart pod opens a TCP connection to checkout.payments.global:8080. From the app's point of view this is a bog-standard outbound socket — no client proxy, no SDK, no sidecar waiting. The trick happens one layer down: on the way out of the network namespace, istio-cni-node's redirect rules (iptables, nftables or eBPF, depending on how you installed) yank the packet into the local ztunnel before it leaves the node.

# Conceptually — istio-cni's redirect for ambient-enabled pods.
# (Don't run this; the real rules live in the istio-cni-node DaemonSet
#  and use marks + tproxy. This is the moral equivalent.)

iptables -t mangle -A PREROUTING \
  -i pod-veth \
  -p tcp \
  -j TPROXY \
    --tproxy-mark 0x111/0xfff \
    --on-port 15001 \
    --on-ip 127.0.0.1

# Result: every outbound TCP segment from app=cart arrives at
# ztunnel's outbound socket on the local node, with the original
# 5-tuple preserved.
STEP 02

ztunnel-east wraps it in HBONE

ztunnel-east looks up the destination in the config it received over xDS, spots that checkout lives in cluster-west on a different network, and dials the remote east-west gateway's LoadBalancer IP on :15008. It performs the outer TLS 1.3 handshake, with the SNI carrying the destination workload's SPIFFE identity and the client certificate carrying the source's, then opens a single HTTP/2 CONNECT stream addressed to the destination pod IP. From that point on, the original app bytes ride inside the CONNECT body. Transport on the outside, L4 reassembly on the inside.

# What ztunnel-east actually emits on the wire (conceptually).
#
# 1. Outer TLS 1.3 ClientHello to :15008
#    SNI:                       spiffe://cluster.local/ns/payments/sa/checkout
#    ALPN:                      h2
#    Client cert (mTLS):        spiffe://cluster.local/ns/payments/sa/cart
#
# 2. Inside the TLS session, a single HTTP/2 CONNECT stream:

CONNECT 10.20.7.30:8080 HTTP/2
:authority: 10.20.7.30:8080
baggage:    k8s.cluster.name=cluster-east,
            k8s.namespace.name=payments,
            k8s.pod.name=cart-7d9f-abcde
x-envoy-original-dst-host: 10.20.7.30:8080

# The original app bytes are now the body of that CONNECT stream.
# Outer TLS = transport. Inner CONNECT = "give me back my L4
# connection on the other side." mTLS validates both ends.
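The CONNECT request above can be assembled as a small header list. This is a minimal sketch, not ztunnel's implementation: build_hbone_connect is a hypothetical helper, it assumes an IPv4 destination, and it only covers the :method, :authority and baggage fields shown in the block above. An HTTP/2 CONNECT carries :method and :authority as its only pseudo-headers, which is why there is no :scheme or :path.

```python
def build_hbone_connect(dst_ip: str, dst_port: int,
                        baggage: dict[str, str]) -> list[tuple[str, str]]:
    """Build the header list for a CONNECT stream to dst_ip:dst_port."""
    headers = [
        (":method", "CONNECT"),
        (":authority", f"{dst_ip}:{dst_port}"),
    ]
    if baggage:
        # Baggage is a comma-separated list of key=value members.
        members = ",".join(f"{key}={value}" for key, value in baggage.items())
        headers.append(("baggage", members))
    return headers


for name, value in build_hbone_connect(
    "10.20.7.30",
    8080,
    {
        "k8s.cluster.name": "cluster-east",
        "k8s.namespace.name": "payments",
        "k8s.pod.name": "cart-7d9f-abcde",
    },
):
    print(f"{name}: {value}")
```

Everything the application sends afterwards travels as the body of that one stream, so no per-request headers are added.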
STEP 03

SNI-routed across the WAN

The istio-eastwestgateway in cluster-west is just Envoy, listening on :15008 for protocol HBONE. It terminates the outer TLS — but, importantly, not the inner stream. It reads the SNI, checks the trust domain is one it accepts, and opens a fresh HBONE tunnel to whichever ztunnel-west owns the destination pod's node. That's it. Stateless SNI router. It never sees the application bytes, only the encrypted CONNECT envelope.

# What the gateway sees on an incoming connection on :15008
#
# TLS ClientHello:  SNI = spiffe://cluster.local/ns/payments/sa/checkout
#                   ALPN = h2
#                   Client cert principal = spiffe://...sa/cart
#
# Routing decision:
#   - SNI's trust domain (cluster.local) is in the accepted set
#   - destination addr in the inner CONNECT (10.20.7.30:8080)
#     belongs to node-west-3
#   - dial node-west-3's ztunnel pod IP : 15008
#
# Service that fronts the gateway:

apiVersion: v1
kind: Service
metadata:
  name: istio-eastwestgateway
  namespace: istio-eastwest
spec:
  type: LoadBalancer
  selector:
    app: istio-eastwestgateway
  ports:
    - name: tls
      port: 15008
      targetPort: 15008
      protocol: TCP
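The routing decision listed in the comments above can be sketched as a single function. This is a rough model under stated assumptions, not the gateway's actual Envoy configuration: pick_upstream, ACCEPTED_TRUST_DOMAINS and ZTUNNEL_FOR_POD are hypothetical names, and the pod-to-ztunnel table is hard-coded here where a real gateway would learn it from the control plane.

```python
from urllib.parse import urlparse

HBONE_PORT = 15008

# Hypothetical accepted set and lookup table, for illustration only.
ACCEPTED_TRUST_DOMAINS = {"cluster.local"}
ZTUNNEL_FOR_POD = {"10.20.7.30": "10.20.7.1"}  # pod IP -> ztunnel pod IP on its node


def pick_upstream(sni: str, connect_authority: str) -> tuple[str, int]:
    """Return the (ztunnel IP, port) that the tunnel should be relayed to."""
    trust_domain = urlparse(sni).netloc
    if trust_domain not in ACCEPTED_TRUST_DOMAINS:
        raise PermissionError(f"trust domain {trust_domain!r} not accepted")
    pod_ip, _, _port = connect_authority.rpartition(":")
    if pod_ip not in ZTUNNEL_FOR_POD:
        raise LookupError(f"no ztunnel known for pod {pod_ip!r}")
    return ZTUNNEL_FOR_POD[pod_ip], HBONE_PORT


print(pick_upstream("spiffe://cluster.local/ns/payments/sa/checkout",
                    "10.20.7.30:8080"))
```

Note that the relay always targets :15008 on the ztunnel, regardless of the application port in the CONNECT authority.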
STEP 04

ztunnel-west unwraps and delivers

ztunnel-west terminates the inner HBONE tunnel, takes the SPIFFE principal off the peer cert, runs it against any L4 AuthorizationPolicy attached to the destination workload, and then opens a plain TCP connection to 10.20.7.30:8080. The destination pod sees an ordinary TCP segment with a node-local source IP — exactly what a non-ambient pod would see. mTLS stops at ztunnel-west; from there on, it's plain L4 inside the node. The workload is unmodified and sees no TLS.

# From the destination pod's view (tcpdump on lo / eth0):
#
#   src = 10.20.7.1   (ztunnel-west's pod IP on node-west-3)
#   dst = 10.20.7.30:8080  (checkout pod)
#   plain TCP, no TLS, application bytes intact
#
# The identity, audit and policy decision all happened inside
# ztunnel-west. The pod itself has no idea the traffic crossed
# a cluster boundary.
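The policy check ztunnel-west runs can be approximated as string matching on the peer principal. The sketch below is an assumption-laden model of ALLOW matching only: is_allowed and its helpers are hypothetical, and it assumes the principal format used by AuthorizationPolicy (the SPIFFE ID without the spiffe:// scheme) with exact, prefix (`abc*`), suffix (`*abc`) and presence (`*`) matches.

```python
SPIFFE_SCHEME = "spiffe://"


def principal_from_spiffe(spiffe_id: str) -> str:
    """Strip the scheme: policies list principals as <trust-domain>/ns/.../sa/..."""
    if not spiffe_id.startswith(SPIFFE_SCHEME):
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return spiffe_id[len(SPIFFE_SCHEME):]


def principal_matches(pattern: str, principal: str) -> bool:
    if pattern == "*":                      # presence: any non-empty principal
        return bool(principal)
    if pattern.startswith("*"):             # suffix match
        return principal.endswith(pattern[1:])
    if pattern.endswith("*"):               # prefix match
        return principal.startswith(pattern[:-1])
    return pattern == principal             # exact match


def is_allowed(peer_spiffe_id: str, allowed_principals: list[str]) -> bool:
    principal = principal_from_spiffe(peer_spiffe_id)
    return any(principal_matches(p, principal) for p in allowed_principals)


peer = "spiffe://cluster.local/ns/payments/sa/cart"
print(is_allowed(peer, ["cluster.local/ns/payments/sa/cart"]))
print(is_allowed(peer, ["cluster.local/ns/billing/*"]))
```

If the peer identity was never authenticated there is no principal to compare, which is why principal-based rules cannot match on unauthenticated traffic.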

The CRDs behind the flow

Three resources do almost all the work. A Gateway API Gateway declares the east-west listener on :15008. A ServiceEntry tells the source cluster the destination service exists and lives behind the remote gateway IP. A PeerAuthentication in STRICT mode makes the inner mTLS non-optional — STRICT is required; see failure mode four below.

🛣️ East-west Gateway gateway.networking.k8s.io/v1

Lives in the destination cluster (cluster-west). The istio-eastwest gatewayClass plus one HBONE listener on :15008 are what provision the LoadBalancer Service that ztunnel-east will dial across the WAN. Get the gatewayClass wrong and no Service is ever created: there's no error, just nothing to dial.

Gateway · east-west listener on :15008 · applied in cluster-west
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: istio-eastwestgateway
  namespace: istio-eastwest
  annotations:
    # Tell the controller to provision a LoadBalancer Service in front
    # of the Envoy Deployment that backs this Gateway.
    networking.istio.io/service-type: LoadBalancer
spec:
  gatewayClassName: istio-eastwest
  listeners:
    - name: cross-network
      port: 15008
      protocol: HBONE
      # The gateway accepts inbound HBONE from any peer cluster that
      # presents a cert in a trust domain the mesh accepts.
      allowedRoutes:
        namespaces:
          from: All

🛰️ ServiceEntry · the remote service networking.istio.io/v1

Lives in the source cluster (cluster-east). It tells the local control plane: checkout.payments.svc exists, it's on a different network, here's the gateway IP that fronts it. MESH_INTERNAL plus the right network label is what makes ztunnel-east treat this as in-mesh HBONE traffic instead of routing it like an external API call. Get this wrong and ztunnel-east bypasses the HBONE path entirely.

ServiceEntry · point at the remote east-west GW · applied in cluster-east
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: checkout-remote
  namespace: payments
spec:
  hosts:
    - checkout.payments.svc.cluster.local
  ports:
    - number: 8080
      name: http
      protocol: HTTP
  location: MESH_INTERNAL    # ← matters: keep this in-mesh, not external
  resolution: DNS
  endpoints:
    # The remote east-west gateway's LoadBalancer IP, *not* the pod IP.
    - address: 203.0.113.42
      network: network-west
      locality: us-east-2/us-east-2a
      ports:
        # ztunnel-east dials 15008 on the gateway, even though the
        # logical service port is 8080.
        http: 15008
      labels:
        cluster: cluster-west

🔐 PeerAuthentication · STRICT mTLS security.istio.io/v1

Applied in cluster-west (or mesh-wide) on the destination workload. Without STRICT, ztunnel-west will accept plain TCP from the gateway — the SPIFFE identity in the SNI and peer cert is silently dropped. HBONE on the wire, but no mTLS guarantee at the destination: the connection succeeds while the workload's identity is never authenticated.

PeerAuthentication · require mTLS at the destination · enforced by ztunnel-west
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: STRICT   # PERMISSIVE here would silently break identity

What goes wrong (and how it looks)

Five common failure modes for cross-cluster HBONE. Each one breaks at a specific hop above. Match the symptom to the hop and the cause usually follows.

1. Port 15008 not reachable externally. The Service is LoadBalancer, but the cloud LB, firewall, or security group never opened 15008 — or it was shipped as ClusterIP by mistake. ztunnel-east stalls on TCP connect, its logs show connect: connection refused or i/o timeout, and the destination pod sees nothing. Check the LB Service has 15008 exposed. Check the firewall lets it through.
2. Trust domain mismatch. The two clusters were installed with different SPIFFE trust domains (cluster.local vs mesh.solo) or different root CAs. The TCP connection to :15008 succeeds, then certificate validation fails during the outer TLS handshake: the gateway logs SSL_ERROR_BAD_CERT_DOMAIN or unknown CA, and ztunnel-east sees the connection reset mid-handshake. The symptom looks like a network issue; the cause is identity.
3. Istio revision skew between clusters. cluster-east is on 1-26, cluster-west is still on 1-24. Outer TLS completes, then the inner HTTP/2 CONNECT is rejected — an HTTP/2 :status 421 or a stream reset — because one side sent a header the other doesn't recognise. Keep IstioLifecycleManager's revision consistent across the fleet; upgrade in lockstep.
4. PeerAuthentication PERMISSIVE / DISABLE on the destination. ztunnel-west takes the connection but drops the inner mTLS layer. HBONE is still on the wire — a packet capture looks fine — but the workload's SPIFFE identity isn't enforced. L4 AuthorizationPolicy rules that match on principals: silently evaluate to no-match and fall through. Symptom: authz works in one cluster, breaks the moment traffic crosses the WAN.
5. ServiceEntry not synced yet (DNS unresolved). The ServiceEntry has been applied via Gloo or Argo but ztunnel-east hasn't seen the xDS push yet — or a fresh VirtualDestination hasn't reconciled the remote endpoints. The client pod gets connection refused on a port that has no listener anywhere, because ztunnel-east is firing the packet at the original pod IP instead of HBONE-tunnelling to the remote gateway. Often transient on first apply; if it persists, check xDS sync state.
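For failure mode 1, a quick reachability probe from a host in network-east separates "refused" from "timeout" before any TLS is involved. This is a minimal sketch using only the standard library; probe_tcp is a hypothetical helper, and the address in the usage comment is the example gateway IP from the ServiceEntry above.

```python
import socket


def probe_tcp(host: str, port: int = 15008, timeout: float = 3.0) -> str:
    """Attempt a bare TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered with a reset: no listener on that port.
        return "refused"
    except socket.timeout:
        # Nothing answered: typically a firewall or security group drop.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


# Usage, from a machine that should be able to reach the remote gateway:
#   print(probe_tcp("203.0.113.42"))
```

An "open" result only proves the TCP path; trust domain and revision problems (modes 2 and 3) show up later, during the TLS handshake or the CONNECT.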

Components on the path

One row per box the packet touches, in order. The Protocol in / Protocol out columns are what each side of that box carries on the wire — use them to map a tcpdump capture or a log line back to the right hop.

Hop | Component | Where it runs | What it does | Protocol in / out
- | Client Pod (pod) | cluster-east · payments ns · node-east-1 | Originates the outbound TCP connection to checkout.payments.global:8080. Knows nothing about HBONE. | app TCP / app TCP
1 | ztunnel-east (ztunnel) | cluster-east · DaemonSet · per node | CNI redirect catches the packet, ztunnel wraps it in an HBONE CONNECT stream over an mTLS session, dials the remote GW IP on :15008. | plain TCP / HBONE (TLS+H2)
2 | WAN / VPC peering | between clouds / regions / VPCs | Plain L3 transit. Doesn't see inside the TLS session. Only requirement: TCP/15008 reachable from network-east to network-west. | HBONE / HBONE
3 | istio-eastwestgateway (gateway) | cluster-west · istio-eastwest ns · LoadBalancer Service on :15008 | Terminates outer TLS, reads SPIFFE SNI, opens a fresh HBONE tunnel to the ztunnel that owns the destination pod's node. | HBONE / HBONE
4 | ztunnel-west (ztunnel) | cluster-west · DaemonSet · per node | Validates the peer SPIFFE identity, enforces L4 AuthorizationPolicy, unwraps the HBONE tunnel and opens a plain TCP connection to the destination pod. | HBONE / plain TCP
- | Destination Pod (pod) | cluster-west · payments ns · node-west-3 | Receives an ordinary TCP segment on :8080. Source IP is ztunnel-west's pod IP; identity has been authenticated upstream. | app TCP / app TCP
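One way to use the rows above when reading a capture: each component's outbound protocol should equal the next component's inbound protocol. The sketch below encodes the rows and checks that property. It is illustrative only; HOPS and broken_links are hypothetical names, and it assumes the "app TCP" on the pod rows is the same plain TCP that ztunnel sees.

```python
# (component, protocol in, protocol out), in path order.
# Assumption: "app TCP" on the pod rows is written as "plain TCP" here.
HOPS = [
    ("Client Pod", "plain TCP", "plain TCP"),
    ("ztunnel-east", "plain TCP", "HBONE"),
    ("WAN / VPC peering", "HBONE", "HBONE"),
    ("istio-eastwestgateway", "HBONE", "HBONE"),
    ("ztunnel-west", "HBONE", "plain TCP"),
    ("Destination Pod", "plain TCP", "plain TCP"),
]


def broken_links(hops):
    """Return (upstream, downstream) pairs whose protocols do not line up."""
    return [
        (up_name, down_name)
        for (up_name, _, up_out), (down_name, down_in, _) in zip(hops, hops[1:])
        if up_out != down_in
    ]


print(broken_links(HOPS))  # → []
```

If a capture shows plain TCP where this chain expects HBONE, the hop immediately upstream of that segment is the place to start looking.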

Where to go from here

The CRDs that wire this up — PeerAuthentication, ServiceEntry, Gateway API Gateway — are grouped on the Istio Ambient CRDs visual map. The east-west gateway itself gets rolled out by the Gloo Operator's GatewayLifecycleManager; the Gloo Operator across N clusters page walks through where it installs from and how trust is shared between clusters.

For the wire format itself, the Istio Ambient architecture docs cover HBONE in detail, and the ztunnel repo is the source of truth for what ztunnel actually does on receipt.