[Bug] Worker Deployment Read API rate limit still exceeded in v1.4.0 #278

@gabriel-yahav

Description

v1.4.0 release notes mention:

"Omit DescribeVersion calls for drained versions to avoid hitting RPS limits"

However, I am still hitting Worker Deployment Read API rate limit errors in v1.4.0.
The error is coming from DescribeWorkerDeployment (the top-level deployment describe),
not from DescribeVersion on individual versions.

Error

"error":"unable to get Temporal worker deployment state: unable to describe worker
deployment X/y: Worker Deployment Read API rate limit exceeded for namespace "ABCD""

The stacktrace points to the standard reconciler loop in controller-runtime.

Root Cause (suspected)

The v1.4.0 optimization skips DescribeVersion for drained versions, which reduces
per-version calls. However, DescribeWorkerDeployment is called unconditionally on
every reconcile iteration regardless of the deployment's state. With multiple
TemporalWorkerDeployment CRs in the same Temporal namespace and a short reconcile
interval, these calls aggregate and exceed the APS (actions-per-second) limit on
Temporal Cloud.

Labels: bug