HBONE is HTTP/2 over TLS, used as an L4 tunnel. Inner
stream: a single HTTP/2 CONNECT that re-establishes the
original TCP connection between two workloads. Outer envelope: a TLS 1.3
handshake whose SNI is the source workload's SPIFFE identity. TLS-in-TLS,
identity in SNI, the gateway forwards by reading SNI alone. This page
is the packet-level walkthrough — one hop at a time — so the design is
concrete rather than hand-waved.
Two clusters on different networks can't reach each other's pod IPs, so the destination cluster exposes exactly one well-known endpoint — the east-west gateway on :15008 — and that gateway knows how to read the SNI and hand the tunnel off to the right ztunnel. Below: the four hops, what each component does to the bytes, the CRDs that wire it up, and the five common failure modes.
The path, end to end
Read it like this. Client pod fires a plain TCP packet at
the destination IP. On the way out of the node, istio-cni
grabs it and redirects into ztunnel-east. ztunnel wraps it
in HBONE — outer TLS 1.3, SNI carrying the source workload's SPIFFE
identity — and dials the remote cluster's east-west gateway on
:15008. The gateway terminates the outer TLS, reads the SNI,
opens a fresh HBONE tunnel to ztunnel-west. ztunnel-west
unwraps it and delivers a plain TCP segment into the destination pod.
And the source cluster's own east-west gateway? Not on this path — it
only matters for traffic coming in.
What happens at each hop
One block per hop on the diagram. The YAML or pseudo-config under each is the thing you'd actually touch — or grep for — to see what's happening at that point in the path.
Pod sends a TCP packet
The cart pod opens a TCP connection to
checkout.payments.global:8080. From the app's point of view
this is a bog-standard outbound socket — no client proxy, no SDK, no
sidecar waiting. The trick happens one layer down: on the way out of
the network namespace, istio-cni-node's redirect rules
(iptables, nftables or eBPF, depending on how you installed) yank the
packet into the local ztunnel before it leaves the node.
# Conceptually — istio-cni's redirect for ambient-enabled pods.
# (Don't run this; the real rules live in the istio-cni-node DaemonSet
# and use marks + tproxy. This is the moral equivalent.)
iptables -t mangle -A PREROUTING \
    -i pod-veth \
    -p tcp \
    -j TPROXY \
    --tproxy-mark 0x111/0xfff \
    --on-port 15001 \
    --on-ip 127.0.0.1
# Result: every outbound TCP segment from app=cart arrives at
# ztunnel's outbound socket on the local node, with the original
# 5-tuple preserved.
ztunnel-east wraps it in HBONE
ztunnel-east looks at the destination via xDS, spots that
checkout lives in cluster-west on a different
network, and dials the remote east-west gateway's LoadBalancer
IP on :15008. Outer TLS 1.3 handshake, SNI carrying the
source workload's SPIFFE identity, then a single
HTTP/2 CONNECT stream addressed to the destination pod IP.
From that point on, the original app bytes ride inside the CONNECT
body. Transport on the outside, L4 reassembly on the inside.
# What ztunnel-east actually emits on the wire (conceptually).
#
# 1. Outer TLS 1.3 ClientHello to :15008
#      SNI:                spiffe://cluster.local/ns/payments/sa/cart
#      ALPN:               h2
#      Client cert (mTLS): spiffe://cluster.local/ns/payments/sa/cart
#
# 2. Inside the TLS session, a single HTTP/2 CONNECT stream:
CONNECT 10.20.7.30:8080 HTTP/2
:authority: 10.20.7.30:8080
baggage: k8s.cluster.name=cluster-east,
         k8s.namespace.name=payments,
         k8s.pod.name=cart-7d9f-abcde
x-envoy-original-dst-host: 10.20.7.30:8080
# The original app bytes are now the body of that CONNECT stream.
# Outer TLS = transport. Inner CONNECT = "give me back my L4
# connection on the other side." mTLS validates both ends.
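The header set in the listing above can be assembled mechanically. The following is a minimal Python sketch, not ztunnel's implementation: `build_connect_headers` is a hypothetical helper, and the header names and values are simply the ones shown in the conceptual listing. It assumes the `baggage` value is a comma-separated list of `key=value` members.

```python
def build_connect_headers(
    dst_ip: str, dst_port: int, baggage: dict[str, str]
) -> list[tuple[str, str]]:
    """Assemble the header list for the inner HBONE CONNECT stream (hypothetical helper)."""
    authority = f"{dst_ip}:{dst_port}"
    # The baggage header is a comma-separated list of key=value members.
    baggage_value = ",".join(f"{key}={value}" for key, value in baggage.items())
    return [
        # An HTTP/2 CONNECT request carries :method and :authority;
        # :scheme and :path are omitted.
        (":method", "CONNECT"),
        (":authority", authority),
        ("baggage", baggage_value),
        ("x-envoy-original-dst-host", authority),
    ]


headers = build_connect_headers(
    "10.20.7.30",
    8080,
    {
        "k8s.cluster.name": "cluster-east",
        "k8s.namespace.name": "payments",
        "k8s.pod.name": "cart-7d9f-abcde",
    },
)
for name, value in headers:
    print(f"{name}: {value}")
```

The destination pod IP appears twice on purpose: once as the CONNECT target and once in the original-destination header, mirroring the listing.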
SNI-routed across the WAN
The istio-eastwestgateway in cluster-west is
just Envoy, listening on :15008 for protocol HBONE. It
terminates the outer TLS — but, importantly, not the inner
stream. It reads the SNI, checks the trust domain is one it accepts,
and opens a fresh HBONE tunnel to whichever ztunnel-west
owns the destination pod's node. That's it. Stateless SNI router. It
never sees the application bytes, only the encrypted CONNECT envelope.
# What the gateway sees on an incoming connection on :15008
#
# TLS ClientHello — SNI = spiffe://cluster.local/ns/payments/sa/cart
# ALPN = h2
# Client cert principal = spiffe://...sa/cart
#
# Routing decision:
# - SNI's trust domain (cluster.local) is in the accepted set
# - destination addr in the inner CONNECT (10.20.7.30:8080)
# belongs to node-west-3
# - dial node-west-3's ztunnel pod IP : 15008
#
# Service that fronts the gateway:
apiVersion: v1
kind: Service
metadata:
  name: istio-eastwestgateway
  namespace: istio-eastwest
spec:
  type: LoadBalancer
  selector:
    app: istio-eastwestgateway
  ports:
  - name: tls
    port: 15008
    targetPort: 15008
    protocol: TCP
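The trust-domain check the gateway performs on the SNI can be sketched in a few lines of Python. This is an illustrative model only: `parse_spiffe_id` and `trust_domain_accepted` are hypothetical names, and the sketch assumes workload identities always follow the `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>` shape used in the examples on this page.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class SpiffeId(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    parts = urlsplit(uri)
    segments = parts.path.strip("/").split("/")
    well_formed = (
        parts.scheme == "spiffe"
        and parts.netloc != ""
        and len(segments) == 4
        and all(segments)
        and segments[0] == "ns"
        and segments[2] == "sa"
    )
    if not well_formed:
        raise ValueError(f"not a workload SPIFFE ID: {uri!r}")
    return SpiffeId(parts.netloc, segments[1], segments[3])


def trust_domain_accepted(uri: str, accepted: set[str]) -> bool:
    """True only when the ID parses and its trust domain is in the accepted set."""
    try:
        return parse_spiffe_id(uri).trust_domain in accepted
    except ValueError:
        return False


peer = "spiffe://cluster.local/ns/payments/sa/cart"
print(parse_spiffe_id(peer))
print(trust_domain_accepted(peer, {"cluster.local"}))
```

An identity from a trust domain outside the accepted set, such as `mesh.solo` when only `cluster.local` is accepted, is rejected before any routing decision is made.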
ztunnel-west unwraps and delivers
ztunnel-west terminates the inner HBONE tunnel, takes the
SPIFFE principal off the peer cert, runs it against any L4
AuthorizationPolicy attached to the destination workload,
and then opens a plain TCP connection to 10.20.7.30:8080.
The destination pod sees an ordinary TCP segment with a node-local
source IP — exactly what a non-ambient pod would see. mTLS stops at
ztunnel-west; from there on, it's plain L4 inside the node. The
workload is unmodified and sees no TLS.
# From the destination pod's view (tcpdump on lo / eth0):
#
# src = 10.20.7.1 (ztunnel-west's pod IP on node-west-3)
# dst = 10.20.7.30:8080 (checkout pod)
# plain TCP, no TLS, application bytes intact
#
# The identity, audit and policy decision all happened inside
# ztunnel-west. The pod itself has no idea the traffic crossed
# a cluster boundary.
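To make the policy step concrete, here is a simplified Python model of how a `principals` list in an ALLOW rule might be evaluated against the peer identity. It is a sketch under stated assumptions, not ztunnel's code: the function names are hypothetical, only the `principals` field is modelled, and the patterns are assumed to use the `<trust-domain>/ns/<namespace>/sa/<service-account>` form with exact, prefix (`abc*`), suffix (`*abc`) and presence (`*`) matching.

```python
from typing import Optional


def principal_matches(pattern: str, principal: Optional[str]) -> bool:
    """Match one principals entry against the authenticated peer principal."""
    if not principal:
        # No authenticated identity: nothing can match, not even "*".
        return False
    if pattern == "*":
        return True
    if pattern.endswith("*"):
        return principal.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return principal.endswith(pattern[1:])
    return principal == pattern


def allowed(patterns: list[str], peer_spiffe_uri: Optional[str]) -> bool:
    """ALLOW when any pattern matches the peer cert's SPIFFE identity."""
    principal = (
        peer_spiffe_uri.removeprefix("spiffe://") if peer_spiffe_uri else None
    )
    return any(principal_matches(p, principal) for p in patterns)


cart = "spiffe://cluster.local/ns/payments/sa/cart"
print(allowed(["cluster.local/ns/payments/sa/cart"], cart))
print(allowed(["cluster.local/ns/payments/*"], cart))
print(allowed(["*"], None))
```

The last call is the interesting one: a peer with no authenticated identity matches nothing, which is why losing the identity upstream shows up as a policy no-match rather than a TLS error.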
The CRDs behind the flow
Three resources do almost all the work. A Gateway API Gateway
declares the east-west listener on :15008. A
ServiceEntry tells the source cluster the destination service
exists and lives behind the remote gateway IP. A
PeerAuthentication in STRICT mode makes the
inner mTLS non-optional — STRICT is required; see failure mode four below.
🛣️ East-west Gateway gateway.networking.k8s.io/v1
Lives in the destination cluster (cluster-west). The
istio-eastwest gatewayClass plus one HBONE listener on
:15008 are what provisions the LoadBalancer Service that
ztunnel-east will dial across the WAN. Get the gatewayClass wrong and
no Service is ever created — there's no error, just nothing to dial.
Gateway · east-west listener on :15008 · applied in cluster-west
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: istio-eastwestgateway
  namespace: istio-eastwest
  annotations:
    # Tell the controller to provision a LoadBalancer Service in front
    # of the Envoy Deployment that backs this Gateway.
    networking.istio.io/service-type: LoadBalancer
spec:
  gatewayClassName: istio-eastwest
  listeners:
  - name: cross-network
    port: 15008
    protocol: HBONE
    # The gateway accepts inbound HBONE from any peer cluster that
    # presents a cert in an accepted trust domain.
    allowedRoutes:
      namespaces:
        from: All
🛰️ ServiceEntry · the remote service networking.istio.io/v1
Lives in the source cluster (cluster-east). It tells the
local control plane: checkout.payments.svc exists, it's on
a different network, here's the gateway IP that fronts it.
MESH_INTERNAL plus the right network label
is what makes ztunnel-east treat this as in-mesh HBONE traffic instead
of routing it like an external API call. Get this wrong and ztunnel-east
bypasses the HBONE path entirely.
ServiceEntry · point at the remote east-west GW · applied in cluster-east
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: checkout-remote
  namespace: payments
spec:
  hosts:
  - checkout.payments.svc.cluster.local
  ports:
  - number: 8080
    name: http
    protocol: HTTP
  location: MESH_INTERNAL   # ← matters: keep this in-mesh, not external
  resolution: DNS
  endpoints:
  # The remote east-west gateway's LoadBalancer IP, *not* the pod IP.
  - address: 203.0.113.42
    network: network-west
    locality: us-east-2/us-east-2a
    ports:
      # ztunnel-east dials 15008 on the gateway, even though the
      # logical service port is 8080.
      http: 15008
    labels:
      cluster: cluster-west
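The three things that matter in this ServiceEntry (the `MESH_INTERNAL` location, a `network` on every endpoint, and the port mapping to 15008) are easy to check before applying. The sketch below is a hypothetical pre-apply lint in Python, not an Istio tool: `lint_service_entry` is an invented name, and it works on a plain dict so that it needs no YAML library.

```python
def lint_service_entry(service_entry: dict) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    spec = service_entry.get("spec", {})
    if spec.get("location") != "MESH_INTERNAL":
        problems.append("spec.location should be MESH_INTERNAL")
    for index, endpoint in enumerate(spec.get("endpoints", [])):
        if not endpoint.get("network"):
            problems.append(f"endpoints[{index}] has no network")
        if 15008 not in endpoint.get("ports", {}).values():
            problems.append(f"endpoints[{index}] maps no port to 15008")
    return problems


# The relevant fields of the manifest above, written as a dict.
checkout_remote = {
    "spec": {
        "location": "MESH_INTERNAL",
        "endpoints": [
            {
                "address": "203.0.113.42",
                "network": "network-west",
                "ports": {"http": 15008},
            }
        ],
    }
}
print(lint_service_entry(checkout_remote))
```

A check like this is cheap to run in CI; it catches the quiet misconfigurations that would otherwise only surface as traffic bypassing the HBONE path.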
🔐 PeerAuthentication · STRICT mTLS security.istio.io/v1
Applied in cluster-west (or mesh-wide) on the destination
workload. Without STRICT, ztunnel-west will accept plain TCP from the
gateway — the SPIFFE identity in the SNI and peer cert is silently
dropped. HBONE on the wire, but no mTLS guarantee at the destination:
the connection succeeds while the workload's identity is never
authenticated.
PeerAuthentication · require mTLS at the destination · enforced by ztunnel-west
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: STRICT   # PERMISSIVE here would silently break identity
What goes wrong (and how it looks)
Five common failure modes for cross-cluster HBONE. Each one breaks at a specific hop above. Match the symptom to the hop and the cause usually follows.
The east-west gateway Service is of type LoadBalancer, but the cloud LB, firewall, or
security group never opened 15008, or the Service was shipped as
ClusterIP by mistake. ztunnel-east stalls on TCP connect,
its logs show connect: connection refused or
i/o timeout, and the destination pod sees nothing. Check that
the LB Service has 15008 exposed and that the firewall lets it through.
The two clusters use different trust domains (cluster.local vs mesh.solo) or different root
CAs. The outer TLS handshake starts, then cert validation fails: the
gateway logs SSL_ERROR_BAD_CERT_DOMAIN or unknown CA, and
ztunnel-east sees the connection reset mid-handshake. The symptom looks like
a network issue; the cause is identity.
The clusters run different Istio revisions: cluster-east is on 1-26,
cluster-west is still on 1-24. Outer TLS
completes, then the inner HTTP/2 CONNECT is rejected with an
HTTP/2 :status 421 or a stream reset, because one side
sent a header the other doesn't recognise. Keep
IstioLifecycleManager's revision consistent
across the fleet; upgrade in lockstep.
PeerAuthentication at the destination is not STRICT, so the peer identity is
never authenticated. AuthorizationPolicy rules
that match on principals silently evaluate to no-match and
fall through. Symptom: authz works in one cluster, breaks the moment
traffic crosses the WAN.
The ServiceEntry has been applied via Gloo or Argo but
ztunnel-east hasn't seen the xDS push yet, or a fresh
VirtualDestination hasn't reconciled the remote endpoints.
The client pod gets connection refused on a port that has
no listener anywhere, because ztunnel-east is firing the packet at the
original pod IP instead of HBONE-tunnelling to the remote gateway.
Often transient on first apply; if it persists, check xDS sync state.
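The symptom-to-cause matching described above can be written down as a small lookup. This Python sketch is a triage aid built only from the log strings quoted in this section. The `SIGNATURES` table and `triage` function are hypothetical, and where the text doesn't say which component logs the 421 or stream reset, the sketch assumes ztunnel-east. The STRICT-mode failure has no log signature here, so it is left out.

```python
# (component where the symptom is observed, substring to look for, likely cause)
SIGNATURES = [
    ("ztunnel-east", "connection refused", "TCP/15008 not reachable on the remote gateway"),
    ("ztunnel-east", "i/o timeout", "TCP/15008 not reachable on the remote gateway"),
    ("gateway", "SSL_ERROR_BAD_CERT_DOMAIN", "trust domain or root CA mismatch"),
    ("gateway", "unknown CA", "trust domain or root CA mismatch"),
    ("ztunnel-east", "421", "Istio revision skew between clusters"),
    ("ztunnel-east", "stream reset", "Istio revision skew between clusters"),
    ("client pod", "connection refused", "ztunnel-east has not received the xDS push"),
]


def triage(component: str, log_line: str) -> list[str]:
    """Return the likely causes whose signature appears in the log line."""
    causes = []
    for observed_at, needle, cause in SIGNATURES:
        if observed_at == component and needle in log_line and cause not in causes:
            causes.append(cause)
    return causes


print(triage("ztunnel-east", "connect: connection refused"))
print(triage("client pod", "dial tcp: connection refused"))
print(triage("gateway", "TLS error: unknown CA"))
```

Keying on the component as well as the text matters: the same `connection refused` string points at a closed port when ztunnel-east logs it and at a missing xDS push when the client pod sees it.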
Components on the path
One row per box the packet touches, in order. The Protocol in /
Protocol out columns are what each side of that box carries on
the wire — use them to map a tcpdump capture or a log line
back to the right hop.
| Hop | Component | Where it runs | What it does | Protocol in / out |
|---|---|---|---|---|
| - | Client Pod | cluster-east · payments ns · node-east-1 | Originates the outbound TCP connection to checkout.payments.global:8080. Knows nothing about HBONE. | app TCP / app TCP |
| 1 | ztunnel-east | cluster-east · DaemonSet · per node | CNI redirect catches the packet, ztunnel wraps it in an HBONE CONNECT stream over an mTLS session, dials the remote GW IP on :15008. | plain TCP / HBONE (TLS+H2) |
| 2 | WAN / VPC peering | between clouds / regions / VPCs | Plain L3 transit. Doesn't see inside the TLS session. Only requirement: TCP/15008 reachable from network-east to network-west. | HBONE / HBONE |
| 3 | istio-eastwestgateway | cluster-west · istio-eastwest ns · LoadBalancer Service on :15008 | Terminates outer TLS, reads SPIFFE SNI, opens a fresh HBONE tunnel to the ztunnel that owns the destination pod's node. | HBONE / HBONE |
| 4 | ztunnel-west | cluster-west · DaemonSet · per node | Validates the peer SPIFFE identity, enforces L4 AuthorizationPolicy, unwraps the HBONE tunnel and opens a plain TCP connection to the destination pod. | HBONE / plain TCP |
| - | Destination Pod | cluster-west · payments ns · node-west-3 | Receives an ordinary TCP segment on :8080. Source IP is ztunnel-west's pod IP; identity has been authenticated upstream. | app TCP / app TCP |
Where to go from here
The CRDs that wire this up — PeerAuthentication,
ServiceEntry, Gateway API Gateway — are
grouped on the
Istio Ambient CRDs visual map.
The east-west gateway itself gets rolled out by the Gloo Operator's
GatewayLifecycleManager; the
Gloo Operator across N clusters
page walks through where it installs from and how trust is shared
between clusters.
For the wire format itself, the Istio Ambient architecture docs cover HBONE in detail, and the ztunnel repo is the source of truth for what ztunnel actually does on receipt.