Deploying kagent on EKS: the service-account token setting you need
By Tom O'Rourke

The short version. On EKS, set K8S_TOKEN_REVIEW: "true" in the kagent-enterprise-config ConfigMap and restart the controller. Everything below is why that setting exists and how to confirm it took.

The symptom

You install Solo Enterprise for kagent on EKS with OIDC and on-behalf-of delegation turned on. The front of the flow looks healthy. A user dispatches a task, the controller validates the user's OIDC token, maps the caller to a role, and mints an on-behalf-of token for the agent. The agent starts running. Then, the moment the agent calls back to the controller (to read or update its own task state, for example), the request returns 401 Unauthorized, and nothing useful in the logs explains why.

The agent's callback carries its Kubernetes service-account token, not the OIDC token. So the failure is narrow: the controller accepts OIDC tokens but rejects the agent's service-account token. The token is structurally valid. Its issuer, signing key, and audience are all correct. It still gets rejected. That combination is the EKS signature of this problem.
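To see for yourself that the token is structurally fine, you can decode its claims and read the issuer and audience directly. The helper below is a minimal sketch: jwt_segment is a hypothetical name, it only decodes and never verifies the signature, and it assumes a base64 command that accepts -d.

```shell
# Print one decoded segment of a JWT: 1 = header, 2 = payload.
# Decodes only; it does NOT verify the signature.
jwt_segment() {
  token=$1
  index=$2
  # JWT segments are base64url: map the URL-safe alphabet back to base64.
  seg=$(printf '%s' "$token" | cut -d. -f"$index" | tr '_-' '/+')
  # base64url drops the trailing '=' padding, so restore it before decoding.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

From inside the agent pod, something like jwt_segment "$(cat "$TOKEN_FILE")" 2 prints the payload, where TOKEN_FILE is wherever your deployment mounts the agent's projected token (that path is deployment-specific and not given here). Look at the iss and aud claims; passing 1 instead of 2 prints the header, which carries the kid of the signing key.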

How the controller validates a service-account token

The kagent controller can authenticate a service-account token one of two ways, and the right choice depends on the cluster.

Verify it locally (JWKS)

The controller fetches the cluster's public signing keys, then checks the token's signature and issuer itself. This is the default, and it is the right mode on a cluster whose API server both signs the projected tokens and serves the matching public keys, which covers most self-managed and local clusters.

Verdict: wrong fit on EKS.

Ask the API server (TokenReview)

Instead of fetching keys, the controller hands the token to the Kubernetes API server through the TokenReview API and asks whether it is valid for the kagent audience. The API server is the authority that issued the token, so it validates the token natively, with no key fetch involved.

Verdict: right fit on EKS.
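You can try the same hand-off by hand, which is a quick way to learn what the API server thinks of a token. The sketch below builds a TokenReview object for one token and one audience. tokenreview_manifest is a hypothetical helper name, and the literal audience string kagent in the usage note is an assumption; use whatever audience your agent tokens actually carry.

```shell
# Print a TokenReview manifest (authentication.k8s.io/v1) for one token
# and one audience.
tokenreview_manifest() {
  token=$1
  audience=$2
  # JWTs contain only base64url characters and dots, so the token can be
  # placed in the JSON string without escaping.
  printf '{"apiVersion":"authentication.k8s.io/v1","kind":"TokenReview","spec":{"token":"%s","audiences":["%s"]}}\n' \
    "$token" "$audience"
}
```

Piping it to the API server, for example tokenreview_manifest "$TOKEN" kagent | kubectl create -f - -o yaml, should return the object with its status filled in. status.authenticated tells you whether the token was accepted for that audience, and status.error carries the reason when it was not. The identity running kubectl needs create on tokenreviews for this to work.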

Why local verification is the wrong fit on EKS

EKS does service-account token signing differently from a typical cluster. Each EKS cluster has its own OIDC provider, hosted by AWS at a public URL, and the projected service-account tokens your pods receive are signed by a key published at that provider's JWKS endpoint. The token's issuer points at that AWS-hosted provider, and the public key that verifies it lives there too, on the public internet behind an AWS TLS certificate.

That setup leaves the local-verification path with no good source of keys, for two separate reasons:

The in-cluster keys can be the wrong keys. The API server's own local key endpoint is reachable from inside the cluster, but on EKS it can advertise a different key than the one the external OIDC provider publishes for the projected token. The controller then fetches a key set that does not contain the signing key, so the signature check fails.
The external keys can be unreachable. Pointing the controller at the AWS-hosted JWKS URL only helps if the controller can actually reach it and trust its certificate. In a locked-down or ambient-mesh network, that outbound path and its public TLS chain are exactly the kind of thing that gets restricted.

So local verification is left choosing between a key set that is unreachable and one that has the wrong key. Neither validates the token, which is why it fails even though the token itself is perfectly good. No JWKS URL, issuer string, or audience override changes that, because the problem is the validation mode, not its parameters.
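The wrong-key case is something you can check rather than take on faith: compare the kid in the token's header with the kids in the key set the cluster serves. The helper below is a rough sketch, with jwks_has_kid as a hypothetical name. It does a plain text match on the JSON instead of parsing it, which is enough for a spot check but not a real JWKS parser.

```shell
# Succeed when the JWKS JSON on stdin lists a key whose "kid" equals $1.
jwks_has_kid() {
  kid=$1
  # Strip whitespace so spacing in the JSON does not matter, then look for
  # the literal "kid":"<value>" pair (grep -F disables regex matching).
  tr -d ' \t\r\n' | grep -qF "\"kid\":\"${kid}\""
}
```

The API server's in-cluster key set is served at /openid/v1/jwks, so kubectl get --raw /openid/v1/jwks | jwks_has_kid "$KID" exits 0 when the signing key is present and non-zero when it is not, where KID is the kid value from the token's decoded header. A non-zero exit here is the first failure mode described above.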

The fix: TokenReview validation

Switch the controller to TokenReview validation. It then hands each service-account token to the EKS API server and trusts the API server's answer. The API server issued the token, so it validates it natively and sidesteps the whole external-key question. The setting is a single key on the controller's config:

# kagent-enterprise-config ConfigMap, in the controller's namespace
data:
  K8S_TOKEN_REVIEW: "true"

Apply it to the kagent-enterprise-config ConfigMap that the controller reads its validation settings from, then restart the controller so it picks up the change:

kubectl -n kagent patch configmap kagent-enterprise-config \
  --type merge -p '{"data":{"K8S_TOKEN_REVIEW":"true"}}'

kubectl -n kagent rollout restart deploy/kagent-controller

Keep this one in your upgrade runbook: a helm upgrade of the controller re-renders its config, so re-apply the patch and restart after an upgrade to keep TokenReview validation in place.
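For the runbook, the patch and restart can be wrapped in one function so the post-upgrade step is a single call. This is a sketch, not part of the product: enable_token_review is a hypothetical name, the namespace defaults to kagent to match the commands above, and the 120s timeout is an arbitrary choice.

```shell
# Re-apply the TokenReview setting, restart the controller, and wait for
# the rollout to finish. Stops at the first step that fails.
enable_token_review() {
  ns=${1:-kagent}
  kubectl -n "$ns" patch configmap kagent-enterprise-config \
    --type merge -p '{"data":{"K8S_TOKEN_REVIEW":"true"}}' &&
  kubectl -n "$ns" rollout restart deploy/kagent-controller &&
  kubectl -n "$ns" rollout status deploy/kagent-controller --timeout=120s
}
```

Run enable_token_review right after helm upgrade returns, or enable_token_review my-namespace if the controller lives somewhere other than kagent. The rollout status step makes the function exit non-zero if the controller does not come back within the timeout.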

Two things make this work out of the box, with nothing else to wire up:

The audience already lines up. TokenReview asks the API server to validate the token for the kagent audience, and the projected token the agent presents already carries that audience. No audience to configure.
The controller already has permission. The controller's ClusterRole ships with create on tokenreviews.authentication.k8s.io, so it can call the TokenReview API the moment you turn the setting on.
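If you want to confirm the permission rather than assume it, kubectl auth can-i can ask the API server on the controller's behalf. The sketch below uses a hypothetical helper name, can_review_tokens, and you have to supply the controller's actual service-account name, which is not stated here.

```shell
# Ask the API server whether a service account may create TokenReviews.
# Prints "yes" or "no"; the exit status is 0 only for "yes".
can_review_tokens() {
  ns=$1
  sa=$2
  kubectl auth can-i create tokenreviews.authentication.k8s.io \
    --as="system:serviceaccount:${ns}:${sa}"
}
```

Call it as can_review_tokens kagent NAME, with NAME replaced by the service account the controller pod runs as. Impersonating a service account with --as requires that your own user is allowed to impersonate, which cluster admins normally are.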

This is EKS-specific. On a kind, k3s, or typical self-managed cluster, the API server both signs the projected tokens and serves the matching keys, so local verification works and you can leave this setting off. EKS (and any cluster whose service-account token issuer is an external OIDC provider) is where TokenReview earns its place. The trade-off is one TokenReview call to the API server per validation, which is cheap, and TokenReview is the documented mode for exactly this situation.

Confirming the callback path is healthy

After the controller restarts, run the same flow that failed. Dispatch a task, let the agent start, and watch the agent's callback to the controller. The request that was returning 401 should now return 200, and the task should progress to completion instead of stalling at the first callback.

If it still returns 401 or 403 after the flip, check that the controller pod actually restarted and re-read the ConfigMap, and that the ConfigMap value is the quoted string "true" and not a bare YAML boolean, since ConfigMap data values must be strings. Those are the two usual reasons the new mode does not take effect.
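Both checks are quick to script. The sketch below reads the stored value back from the cluster; token_review_enabled is a hypothetical helper name and the namespace again defaults to kagent.

```shell
# Succeed only when the stored ConfigMap value is exactly "true".
token_review_enabled() {
  ns=${1:-kagent}
  value=$(kubectl -n "$ns" get configmap kagent-enterprise-config \
    -o jsonpath='{.data.K8S_TOKEN_REVIEW}')
  [ "$value" = "true" ]
}
```

token_review_enabled && echo "setting is in place" covers the ConfigMap side. For the restart side, kubectl -n kagent rollout status deploy/kagent-controller reports whether the latest rollout completed, and the pod ages in kubectl -n kagent get pods should be newer than your ConfigMap change.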

Checklist

kagent on EKS, the short version

Install Solo Enterprise for kagent the usual way. The only EKS-specific change is the validation mode.
Set K8S_TOKEN_REVIEW: "true" in the kagent-enterprise-config ConfigMap.
Restart the controller so it reloads its validation settings.
Re-apply the patch and restart after a helm upgrade, which re-renders the controller config.
No audience or RBAC changes needed: the agent token already uses the kagent audience and the controller already has create on tokenreviews.
Confirm by re-running the flow: the agent's callback returns 200 instead of a silent 401.
Leave the setting off on kind, k3s, and typical self-managed clusters, where local verification works.