Extend ProbingRequestFactory to support Models and Async calls #2985
…uestFactory and updated interface, make flow async for futureproofing and maximum extensibility
@dotnet-policy-service agree
/// <param name="cluster">The cluster being probed.</param>
/// <param name="destination">The destination being probed.</param>
/// <returns>Probing <see cref="HttpRequestMessage"/>.</returns>
ValueTask<HttpRequestMessage> CreateRequestAsync(ClusterState cluster, DestinationState destination) =>
We might as well include a CancellationToken argument while we're at it
Adding CancellationToken means we should probably also support explicit cancelling in the probe creation flow, which should be treated as a separate case from ActiveHealthProbeConstructionFailedOnCluster, i.e.:
catch (OperationCanceledException) when (!cts.IsCancellationRequested)
{
Log.ActiveHealthProbingSkippedOnDestination(_logger, destination.DestinationId, cluster.ClusterId);
return null;
}
What are your thoughts on that?
Are you referring to just decreasing the log severity in case of cancellations, or also not reporting the exception as part of the probe result?
If it's the latter, IMO we should be including the exception and leave it up to the policy to decide how to handle it.
Both (decreasing log severity and not reporting exception)
I was considering this approach because I was thinking that OperationCanceledException via an external token signals an intentional skip rather than an error - consistent with how cancellation tokens are typically used for cooperative cancellation.
If the factory wanted the policy to treat it as a failure, it could throw a different exception type. Using OperationCanceledException signals "skip quietly" - avoiding inflated error metrics for expected behavior (e.g., circuit breaker patterns, rate limiting, conditional probing).
Main pros of this approach:
- The edge case only applies when a user explicitly implements a custom IProbingRequestFactory that uses cancellation - the default factory doesn't do this, so built-in policies don't need to handle it
- The factory author controls the behavior via exception type choice
- It aligns with cancellation semantics: "don't do this, and don't worry about it"
That said, I see the value in giving the policy visibility into skipped probes.
Maybe including cancelled probes in results (with the exception) and lowering to Debug-level logging is a good middle-ground.
Another similar suggestion would be to add a flag on Probing Results which would mark explicit cancellation
I think lowering the log severity in such cases is okay. We have precedent there with how we're already doing so for forwarding errors in cases of cancellation.
I don't expect many users in practice will need the CreateRequest to be async. The async part is more of a "might as well" since we're already changing the API. That also means it's unlikely for these to ever be cancelled in practice.
"don't do this, and don't worry about it"
I think the "don't worry about it" part should be up to the policy. If someone does want custom behavior here, they can also implement the policy. I'd consider "add a flag on Probing Results" to be in the same boat.
Or do you believe that we're making it unnecessarily difficult for users to implement custom policies?
I don't think we're making it difficult in terms of functionality. The main scenario I'm thinking of is a user trying to implement probe-skipping logic using the existing support: they will have to throw exceptions by design, drowning out real exceptions (especially problematic in large clusters).
Let's simplify the PR:
- Pass all OperationCanceledExceptions to the policy (both timeout and external) - let the policy decide
- Keep Debug-level logging for cancellation - addresses the noisy logs concern (but not the exceptions in the dotnet metrics)
For exception-free probe skipping, we could add a dedicated mechanism (maybe return a wrapper type). I will open a dedicated issue to discuss it.
That would be a cleaner API than relying on exception semantics, and it keeps this PR scoped to the API support requested in the original issue.
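To make the wrapper-type idea a bit more concrete, here is a rough sketch of what an exception-free skip could look like. Nothing here is part of YARP or of this PR: `ProbeRequest`, `Skipped`, `For`, and `ProbeDemo` are hypothetical names invented for illustration, and the shape of the eventual API is left to the follow-up issue.

```csharp
#nullable enable
using System;
using System.Net.Http;

// Hypothetical wrapper type (not a YARP API): carries either a request to send
// or a "skip this probe" signal, so a factory never has to throw to skip.
public readonly struct ProbeRequest
{
    private ProbeRequest(HttpRequestMessage? message) => Message = message;

    // Null when the factory decided to skip the probe.
    public HttpRequestMessage? Message { get; }

    public bool IsSkipped => Message is null;

    // The default value represents a skipped probe.
    public static ProbeRequest Skipped => default;

    public static ProbeRequest For(HttpRequestMessage message) =>
        new ProbeRequest(message ?? throw new ArgumentNullException(nameof(message)));
}

public static class ProbeDemo
{
    // Hypothetical factory logic: skip destinations that are under maintenance.
    public static ProbeRequest Create(string address, bool underMaintenance) =>
        underMaintenance
            ? ProbeRequest.Skipped
            : ProbeRequest.For(new HttpRequestMessage(HttpMethod.Get, address));
}
```

With a shape like this, the monitor could check `IsSkipped` and move on without an exception ever being thrown, so skipped probes would not show up in exception metrics.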
…over new code paths.
This change updates the IProbingRequestFactory's CreateRequest API to use DestinationState and ClusterState instead of DestinationModel, ClusterModel and makes the method async.
Simplifies scenarios where consumers need to access or update health state without relying on internal models.
Allows access to the ID and allows overloads to rely on ConditionalWeakTable mappings on the objects.
This change originated from a discussion with @MihaZupan, and subsequent filing of bug #2890
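As a rough illustration of the ConditionalWeakTable scenario described above, the sketch below keys per-destination data on the state object itself. It is self-contained, so `ClusterState` and `DestinationState` here are simplified stand-ins rather than the real YARP types. `CountingProbeFactory`, `ProbeCounter`, the `/health` path, and the `X-Probe-Attempt` header are assumptions made up for the example, and the `CancellationToken` parameter reflects the review suggestion rather than a confirmed final signature.

```csharp
using System;
using System.Net.Http;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-ins for the YARP state types so the sketch compiles on its own.
public sealed class ClusterState
{
    public ClusterState(string clusterId) => ClusterId = clusterId;
    public string ClusterId { get; }
}

public sealed class DestinationState
{
    public DestinationState(string destinationId, string address)
    {
        DestinationId = destinationId;
        Address = address;
    }

    public string DestinationId { get; }
    public string Address { get; }
}

// Hypothetical per-destination data that a custom factory might track.
public sealed class ProbeCounter
{
    public int Count;
}

public sealed class CountingProbeFactory
{
    // Entries live only as long as their DestinationState key is reachable,
    // so removed destinations do not leak counters.
    private readonly ConditionalWeakTable<DestinationState, ProbeCounter> _counters =
        new ConditionalWeakTable<DestinationState, ProbeCounter>();

    public ValueTask<HttpRequestMessage> CreateRequestAsync(
        ClusterState cluster,
        DestinationState destination,
        CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();

        // Look up (or lazily create) the counter attached to this destination object.
        var counter = _counters.GetOrCreateValue(destination);
        var attempt = Interlocked.Increment(ref counter.Count);

        var probeUri = new Uri(new Uri(destination.Address), "/health");
        var request = new HttpRequestMessage(HttpMethod.Get, probeUri);
        request.Headers.Add("X-Probe-Attempt", attempt.ToString());

        return new ValueTask<HttpRequestMessage>(request);
    }
}
```

Because the table is keyed on the state object rather than on a model snapshot, the counter follows the destination across configuration reloads for as long as the same `DestinationState` instance is reused.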