Problem
The S3 archival feature currently only supports IRSA (IAM Roles for Service Accounts) on EKS. When spec.archival.provider.s3.roleName is set, the operator annotates all Temporal service accounts with eks.amazonaws.com/role-arn, which triggers AssumeRoleWithWebIdentity via the IRSA webhook.
This doesn't work on clusters using EKS Pod Identity (the newer, recommended mechanism), because:
- The validating webhook requires roleName or credentials; there's no way to opt out and let the default credential chain handle it (which is how Pod Identity works).
- Even if the IAM role exists with a Pod Identity trust policy (pods.eks.amazonaws.com), the operator forces IRSA by adding the SA annotation, and the Temporal server calls AssumeRoleWithWebIdentity, which fails with AccessDenied.
Observed behavior
With Pod Identity configured and roleName set:
operation: RegisterNamespace
error: WebIdentityErr: failed to retrieve credentials
caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
This error is returned by the Temporal frontend during RegisterNamespace (since it validates archival), but the operator surfaces it only as context deadline exceeded, which makes the real cause very hard to diagnose.
Expected behavior
The operator should support EKS Pod Identity for S3 archival. Possible approaches:
- Make roleName optional in the webhook validation for S3 on EKS, so users can rely on the default AWS credential chain (Pod Identity, instance profile, etc.) without setting roleName or credentials.
- Add a useDefaultCredentials: true option (or similar) to explicitly opt into the default credential chain.
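The second approach could look roughly like the following validation sketch in Go. S3ArchivalSpec, its field names, and validateS3Auth are simplified, hypothetical stand-ins for illustration; they are not the operator's actual types or webhook code.

```go
package main

import (
	"errors"
	"fmt"
)

// S3ArchivalSpec is a hypothetical, simplified stand-in for the operator's
// S3 archival settings. Field names are assumptions for illustration only.
type S3ArchivalSpec struct {
	RoleName              string
	CredentialsSecretName string
	// UseDefaultCredentials is the proposed opt-in; it does not exist today.
	UseDefaultCredentials bool
}

var (
	errNoAuth          = errors.New("s3 archival: set roleName, credentials, or useDefaultCredentials")
	errConflictingAuth = errors.New("s3 archival: useDefaultCredentials cannot be combined with roleName or credentials")
)

// validateS3Auth accepts either an explicit auth method (role or credentials)
// or an explicit opt-in to the default AWS credential chain, but not both.
func validateS3Auth(s S3ArchivalSpec) error {
	explicit := s.RoleName != "" || s.CredentialsSecretName != ""
	switch {
	case s.UseDefaultCredentials && explicit:
		return errConflictingAuth
	case !s.UseDefaultCredentials && !explicit:
		return errNoAuth
	}
	return nil
}

func main() {
	// Pod Identity style configuration: no role, no static credentials.
	fmt.Println(validateS3Auth(S3ArchivalSpec{UseDefaultCredentials: true}))
	// Nothing configured: rejected, as it is today.
	fmt.Println(validateS3Auth(S3ArchivalSpec{}))
}
```

An explicit flag keeps the current "something must be configured" guarantee, whereas simply making roleName optional would accept an empty spec silently.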
Additional notes
- The operator adds the eks.amazonaws.com/role-arn annotation to all service accounts (frontend, history, matching, worker), not just history. This means all pods attempt IRSA auth, even though only history and frontend need S3 access for archival.
- The context deadline exceeded error from the namespace controller obscures the real failure (AccessDenied on AssumeRoleWithWebIdentity). Better error propagation from the frontend's RegisterNamespace response would help debugging.
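For the first note, a narrower annotation rule might be sketched as below. This is a Go illustration under assumptions: serviceAccountAnnotations and servicesNeedingS3 are hypothetical names, the role ARN is a placeholder, and the set of services that need S3 access is taken from the note above rather than verified against the Temporal server.

```go
package main

import "fmt"

// irsaRoleARNAnnotation is the service account annotation used by IRSA.
const irsaRoleARNAnnotation = "eks.amazonaws.com/role-arn"

// servicesNeedingS3 lists the services assumed to need S3 access for
// archival (per the note above: frontend and history).
var servicesNeedingS3 = map[string]bool{
	"frontend": true,
	"history":  true,
}

// serviceAccountAnnotations is a hypothetical helper that returns the IRSA
// annotation only for services that need S3 access, and only when a role
// ARN is configured. Other services get no IRSA annotation.
func serviceAccountAnnotations(service, roleARN string) map[string]string {
	annotations := map[string]string{}
	if roleARN != "" && servicesNeedingS3[service] {
		annotations[irsaRoleARNAnnotation] = roleARN
	}
	return annotations
}

func main() {
	// Placeholder ARN for illustration only.
	roleARN := "arn:aws:iam::123456789012:role/example-archival"
	for _, svc := range []string{"frontend", "history", "matching", "worker"} {
		fmt.Println(svc, serviceAccountAnnotations(svc, roleARN))
	}
}
```

With an empty role ARN no service account is annotated at all, which is the behavior the default credential chain (and therefore Pod Identity) would need.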
Environment
- Operator version: v0.22.0
- Temporal server: 1.28.0
- EKS Auto Mode with Pod Identity
- Kubernetes: 1.35