-
Notifications
You must be signed in to change notification settings - Fork 78
Add PINNED_UNTIL_CONTINUE_AS_NEW behavior and CONTINUE_AS_NEW_SUGGESTED_REASON
#657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…nue-as-new-signal
…nue-as-new-signal
temporal/api/enums/v1/workflow.proto
Outdated
| // Workflow History size is getting too large. | ||
| CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, there is "length too large" and "size too large". Is that correct? If so, we can either collapse this into one by removing the SIZE_ part of this enum, or we can separate the enums.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I vote for separating them. Users using external object storage will hit the length limit but not the size, and they may wonder, "I thought using external storage was supposed to fix my workflow history problems!"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the existing history related continue as new suggestion options are based on: limit.historySize.suggestContinueAsNew, limit.historyCount.suggestContinueAsNew, and history.maxTotalUpdates.suggestContinueAsNewThreshold, where the maxTotalUpdates one is about count, not size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can we have CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE and CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_COUNT_TOO_LARGE as independent enum values here? Or if both reasons must be represented as a single enum value, can it be called CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_TOO_LARGE instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can separate it. The place in history where we decide to suggest CaN can tell which one it is
temporal/api/enums/v1/workflow.proto
Outdated
| CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE = 1; | ||
|
|
||
| // Workflow Update registry is getting too large. | ||
| CONTINUE_AS_NEW_SUGGESTED_REASON_UPDATE_REGISTRY_TOO_LARGE = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this already exist today? I can't find the docs on it, but just want to understand it a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We document the update limit here: https://docs.temporal.io/workflow-execution/event#event-history-limits
openapi/openapiv3.yaml
Outdated
| enum: | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_UNSPECIFIED | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_UPDATE_REGISTRY_TOO_LARGE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not introduce the term "registry" since it's an implementation detail.
Would "too many updates" be sufficiently accurate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, "too many updates" is accurate and concise!
openapi/openapiv3.yaml
Outdated
| - CONTINUE_AS_NEW_SUGGESTED_REASON_UNSPECIFIED | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_UPDATE_REGISTRY_TOO_LARGE | ||
| - CONTINUE_AS_NEW_SUGGESTED_REASON_WORKER_DEPLOYMENT_VERSION_CHANGED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TARGET_WORKER... since we've now documented that?
temporal/api/enums/v1/workflow.proto
Outdated
| CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE = 1; | ||
|
|
||
| // Workflow Update registry is getting too large. | ||
| CONTINUE_AS_NEW_SUGGESTED_REASON_UPDATE_REGISTRY_TOO_LARGE = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We document the update limit here: https://docs.temporal.io/workflow-execution/event#event-history-limits
temporal/api/enums/v1/workflow.proto
Outdated
| // Workflow History size is getting too large. | ||
| CONTINUE_AS_NEW_SUGGESTED_REASON_HISTORY_SIZE_TOO_LARGE = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I vote for separating them. Users using external object storage will hit the length limit but not the size, and they may wonder, "I thought using external storage was supposed to fix my workflow history problems!"
| // either event count or bytes) is getting large. | ||
| // True if this workflow should continue-as-new soon. Reasons for this could be: | ||
| // - History size (in either event count or bytes) is getting large | ||
| // - Update registry size is getting large |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is an update registry?
| // The reason that suggest_continue_as_new is true, if it is. | ||
| // Unspecified if suggest_continue_as_new is false or if the reason is unknown. | ||
| temporal.api.enums.v1.ContinueAsNewSuggestedReason continue_as_new_suggested_reason = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put this right underneath suggest_continue_as_new, since they're semantically related
temporal/api/enums/v1/workflow.proto
Outdated
| // ContinueAsNewSuggestedReason specifies why SuggestContinueAsNew is true. | ||
| enum ContinueAsNewSuggestedReason { | ||
| // Reason will be Unspecified if the server is not configured to suggest continue as new, | ||
| // if the server is not configured to provide a reason for continue as new suggested, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no configuration, right? it's just whether it's a new enough server version to set this field. and if suggest_continue_as_new is false then no one should look at this. I think we don't really need a comment on the unspecified value.
| int64 build_id_redirect_counter = 7 [deprecated = true]; | ||
| // The reason that suggest_continue_as_new is true, if it is. | ||
| // Unspecified if suggest_continue_as_new is false or if the reason is unknown. | ||
| temporal.api.enums.v1.ContinueAsNewSuggestedReason continue_as_new_suggested_reason = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the word order switch throws me off every time. can we name this suggest_continue_as_new_reason / SuggestContinueAsNewReason to line up with the existing field?
| int64 build_id_redirect_counter = 7 [deprecated = true]; | ||
| // The reason that suggest_continue_as_new is true, if it is. | ||
| // Unspecified if suggest_continue_as_new is false or if the reason is unknown. | ||
| temporal.api.enums.v1.ContinueAsNewSuggestedReason continue_as_new_suggested_reason = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be too sneaky, but it would be wire-compatible to actually just change the existing bool field to this enum: previous clients would read any non-zero value as true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
innteresting.. I'm curious on @Sushisource thoughts on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with that. Fewer redundant fields the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR was updated to make this now a repeated and therefore may not be wire compatible (unsure, didn't dig into internals)
temporal/api/enums/v1/workflow.proto
Outdated
| VERSIONING_BEHAVIOR_AUTO_UPGRADE = 2; | ||
| // The same semantics as PINNED, with the exception of Continue-As-New. | ||
| // When a PINNED_UNTIL_CONTINUE_AS_NEW workflow continues-as-new, the new execution | ||
| // will start on the workflow's Target Version, just like an AUTO_UPGRADE workflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to rephrase this to say the new run will start with AU behavior, and then takes the behavior specified on the Target Version (likely PUCAN again). Because that's what happens really.
…nue-as-new-signal
…nue-as-new-signal
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from a "suggest continue as new reason", but will defer to others on the versioning aspect
| // - Workflow's Target Version is different from its Pinned Version, and the Versioning Behavior is PinnedUntilContinueAsNew | ||
| bool suggest_continue_as_new = 4; | ||
| // The reason(s) that suggest_continue_as_new is true, if it is. | ||
| // Nil if suggest_continue_as_new is false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic, but this can't be "nil" in the traditional sense, more like "unset"/"empty"
READ BEFORE MERGING: All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted.
Add
PINNED_UNTIL_CONTINUE_AS_NEWbehavior andCONTINUE_AS_NEW_SUGGESTED_REASONSo that long running workflows can take advantage of workflow pinning, for the small price of doing Continue-As-New
No, all additive
This does not break the server, since it only adds enums, so it can be merged whenever.