-
Notifications
You must be signed in to change notification settings - Fork 79
Fix/221 retrypolicy #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/221 retrypolicy #314
Conversation
Signed-off-by: Casper Nielsen <casper@diagrid.io>
…olicy Signed-off-by: Casper Nielsen <casper@diagrid.io>
Signed-off-by: Casper Nielsen <casper@diagrid.io>
…d each activity gets applied retry policy Signed-off-by: Casper Nielsen <casper@diagrid.io>
Signed-off-by: Casper Nielsen <casper@diagrid.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Casper, thank you for working on this feature. I love this feature!
I think we should make sure that these attributes that are all related to each other in the context of a retry should be somehow grouped together.
I was thinking that we could add this within the agent execution config, such as this
execution=AgentExecutionConfig(
max_iterations=5
retry_max_attempts=10,
retry_initial_backoff=5,
retry_max_backoff=45,
retry_backoff_multiplier=1.5,
),but this does not apply to regular agents. As a result, we cannot include it there.
This is the best I can come up with.
retry=DurableRetryConfig(
max_attempts=10,
initial_backoff=5,
max_backoff=45,
backoff_multiplier=1.5,
),This has few benefits:
- It shows the fields that are related to each other.
- They are about retry behaviour
- It makes clear that this is a durable retry different than the in-memory retries,
- This config can be set to durable agents only
HTH
Signed-off-by: Casper Nielsen <casper@diagrid.io>
Signed-off-by: Casper Nielsen <casper@diagrid.io>
|
@bibryam have a look at the updates |
sicoyle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments so far 🙏
dapr_agents/agents/configs.py
Outdated
|
|
||
|
|
||
| @dataclass | ||
| class DurableRetryConfig: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be more workflow-specific. Something like WorkflowRetryConfig might be clearer, since this behavior only applies to the durable agent anyway, which makes “durable” feel a bit redundant here.
On retries more broadly, should we consider introducing a resiliency policy in our quickstarts—or at least documenting it—so users know how to apply retry policies for state operations and pub/sub? For example, we could reference the existing Dapr resiliency quickstart or general docs on this:
https://docs.dapr.io/getting-started/quickstarts/resiliency/resiliency-state-quickstart/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll relabel this WorkflowRetryPolicy to be explicit on it
| max_attempts: Optional[int] = 1 | ||
| initial_backoff_seconds: Optional[int] = 5 | ||
| max_backoff_seconds: Optional[int] = 30 | ||
| backoff_multiplier: Optional[float] = 1.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason there are four fields here yet docs show 5 options we can set?
https://docs.dapr.io/developing-applications/building-blocks/workflow/workflow-features-concepts/#retry-policies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an oversight on my part. I've added the 5th as well!
| raise ( | ||
| ValueError("max_attempts or DAPR_API_MAX_RETRIES must be at least 1.") | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should consider if we want to deviate here. The default that i see in docs is actually a 0 value for this... which I think for durable agent durability we want retries enabled, just want confirmation on that :)
https://docs.dapr.io/developing-applications/sdks/python/python-client/#initialising-the-client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to check on 1, else it might fail upstream
Signed-off-by: Casper Nielsen <casper@diagrid.io>
Signed-off-by: Casper Nielsen <casper@diagrid.io>
Signed-off-by: Casper Nielsen <casper@diagrid.io>
| backoff_multiplier: Multiplier for exponential backoff. | ||
| """ | ||
|
|
||
| max_attempts: Optional[int] = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is asking for infinite retries by default. In dapr we have -1 in places to signal infinity. Would it make sense to allow for infinite retries here in that same manner as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that's not how it is in the upstream sdk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so as-is, there is no way to specify retries indefinitely? maybe we should tweak the upstream sdk eventually to support this :)
this PR is a good start!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cicoyle yes I concur. This should be fixed upstream in the python-sdk then carried down here. I'll put it on our todo list!
Signed-off-by: Casper Nielsen <casper@diagrid.io>
|
@sicoyle, @cicoyle I've added dapr/python-sdk#870 for upstream fixing |
sicoyle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thank you!!
Description
Implement default retry policy and allow overwriting during agent instantiation.
Issue reference
We strive to have all PR being opened based on an issue, where the problem or feature have been discussed prior to implementation.
Please reference the issue this PR closes: #221
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list:
Note: We expect contributors to open a corresponding documentation PR in the dapr/docs repository. As the implementer, you are the best person to document your work! Implementation PRs will not be merged until the documentation PR is opened and ready for review.