Conversation

anarthal
Collaborator

@anarthal anarthal commented Oct 3, 2025

  • Adds support for terminal and partial cancellation to async_run.
  • Makes basic_connection::cancel use per-operation cancellation under the hood.
  • Fixes a number of race conditions during cancellation which could cause the cancellation to be ignored. This could happen if the cancellation is delivered while an async handler is pending execution.
  • Deprecates operation::{resolve, connect, ssl_handshake, reconnection, health_check} in favor of operation::run. Calling basic_connection::cancel with any of these values except reconnection is now equivalent to using operation::run.
  • Fixes a problem in the health checker that caused ping timeouts to be reported as cancellations.
  • Sanitizes how the parallel group computes its final error code.
  • Simplifies the reader, writer and health checker to not care about connection cancellation. This is now the responsibility of the parallel group.
  • Removes an unnecessary setup_cancellation action in the reader FSM.
  • Adds documentation regarding per-operation cancellation to async_receive.
  • Adds additional health checker tests.
  • Adds async_run per-operation cancellation tests.
  • Adds reader FSM cancellation tests.
  • Makes test_conn_exec_retry tests more resilient.
  • Removes leftovers in the UNIX and TLS reconnection tests. These were required due to race conditions that have already been fixed.

close #318
close #319

@anarthal anarthal requested a review from mzimbres October 3, 2025 10:05
@anarthal
Collaborator Author

anarthal commented Oct 3, 2025

This is a big rework of how cancellation is handled in the connection. It has the side effect of implementing per-operation cancellation for async_run, but the main point is to fix race conditions and to free the reader and writer from having to care about connection cancellation.

I prefer merging this before #320, which fixes the last race conditions regarding cancellation. Once this is done, I'll submit a PR implementing support for cancel_after in connection.

self.complete(system::error_code{});

// Wait until we're cancelled. This simplifies parallel group handling a lot
conn_->ping_timer_.expires_at((std::chrono::steady_clock::time_point::max)());
Collaborator

This is a good idea. But I guess it would have been even better to let users disable health checks by setting the interval to max() instead of zero(); then we would not need this branch at all. We would only have to reformulate the loop below to wait first and then ping, not the other way around. That would be a breaking change that is hard to get around, though: we would have to rename health_check_interval to something like health_check_interval2 to intentionally break user code.

Collaborator Author

Hm, that wouldn't work because the interval is a duration, not a time_point. I actually had a bug at some point where I used duration::max rather than time_point::max and it caused an overflow error, with the operation finishing immediately.
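
To illustrate the duration-versus-time_point point, here is a minimal sketch using only std::chrono. The helpers deadline_would_overflow and never are hypothetical names made up for this example; they are not part of Asio or Boost.Redis. The sketch assumes the overflow comes from adding a huge duration to the current time, which is not representable in the clock's signed tick count.

```cpp
#include <cassert>
#include <chrono>

using steady = std::chrono::steady_clock;

// Hypothetical helper (not part of Asio or Boost.Redis): returns true when
// `now + d` is not representable as a steady_clock::time_point, i.e. when the
// addition would overflow the clock's signed tick count.
// Only positive overflow is considered in this sketch.
constexpr bool deadline_would_overflow(steady::time_point now, steady::duration d)
{
    const steady::duration elapsed = now.time_since_epoch();
    if (elapsed <= steady::duration::zero() || d <= steady::duration::zero())
        return false;
    // Compare against the remaining headroom instead of performing the
    // addition, which would itself be undefined behavior on overflow.
    return d > steady::duration::max() - elapsed;
}

// A "never expires" deadline expressed directly as a time_point needs no
// arithmetic, so it cannot overflow.
constexpr steady::time_point never()
{
    return (steady::time_point::max)();
}
```

With any positive current time, deadline_would_overflow(now, steady::duration::max()) is true, while never() is always a valid absolute deadline. This is consistent with the timer above being armed with time_point::max via expires_at.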

timer_type reconnect_timer_; // to wait the reconnection period
timer_type ping_timer_; // to wait between pings
receive_channel_type receive_channel_;
asio::cancellation_signal run_signal_;
Collaborator

Would this still be needed if we removed the cancel function and worked entirely with user-provided cancellation_signals?

Collaborator Author

No, it wouldn't. It only exists to keep compatibility with cancel(). But I feel removing cancel() would break too much code (essentially, everyone's).

@mzimbres
Collaborator

mzimbres commented Oct 3, 2025

So many nice things. The simplifications to the reader_fsm are a relief. Good job, thanks.

@anarthal anarthal merged commit 5771128 into boostorg:develop Oct 3, 2025
17 checks passed
@anarthal anarthal deleted the feature/async-run-cancellation branch October 3, 2025 16:48
Successfully merging this pull request may close these issues.

Support per-operation cancellation in async_run
Race conditions in connection::cancel()