task: add task_context propagation support (v2)#271
task: add task_context propagation support (v2)#271ballard26 wants to merge 1 commit intoredpanda-data:v26.1.xfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an optional per-task task_context that propagates through Seastar task chains (behind SEASTAR_TASK_CONTEXT), plus a with_context() helper and unit tests to validate propagation and non-propagation across shards.
Changes:
- Introduce
seastar::task_context, task-level storage (task::_context), TLS accessors, and inheritance on task construction (gated bySEASTAR_TASK_CONTEXT). - Keep TLS in sync while running tasks in the reactor and in
set_current_task(), and prevent cross-shard inheritance viano_context_tagforsmpwork items. - Add
with_context()API plus comprehensive unit tests and build wiring; add CMake/configure flags.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
include/seastar/core/task.hh |
Defines task_context, TLS accessors, and task context storage/inheritance. |
src/core/reactor.cc |
Sets/clears TLS around task execution; syncs TLS in set_current_task(). |
include/seastar/core/with_context.hh |
Adds with_context(ctx, func, ...) helper to scope TLS during synchronous execution. |
include/seastar/core/smp.hh |
Ensures cross-shard work_item tasks don’t inherit context. |
tests/unit/task_context_test.cc |
Adds unit tests for TLS accessors, propagation semantics, and shard boundaries. |
tests/unit/CMakeLists.txt |
Registers the new task_context_test.cc unit test. |
configure.py |
Adds --enable/disable-task-context configure option wiring. |
CMakeLists.txt |
Adds Seastar_TASK_CONTEXT option and compile definition. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| internal::current_task_context_ref() = tsk->context_ptr(); | ||
| #endif | ||
| tsk->run_and_dispose(); | ||
| r._current_task = nullptr; |
There was a problem hiding this comment.
In run_tasks(), TLS is set to tsk->context_ptr() before run_and_dispose(), but it isn't cleared until after the loop. Since run_and_dispose() destroys the task (and may drop the last lw_shared_ptr reference to the context), current_task_context_ref() can temporarily hold a dangling pointer for the remainder of the iteration (and potentially until the next task sets it). This is a concrete UAF hazard if any code between run_and_dispose() and the next assignment/clear reads current_task_context() (e.g., instrumentation, logging, exception paths).
Clear the TLS immediately after run_and_dispose() (e.g., set it to nullptr when _current_task is nulled) or switch this loop to use set_current_task(tsk) / set_current_task(nullptr) so the TLS and _current_task stay consistent throughout the loop.
| r._current_task = nullptr; | |
| r._current_task = nullptr; | |
| #ifdef SEASTAR_TASK_CONTEXT | |
| internal::current_task_context_ref() = nullptr; | |
| #endif |
| with_context(const lw_shared_ptr<task_context>& ctx, Func&& func, Args&&... args) noexcept { | ||
| using return_type = decltype(func(std::forward<Args>(args)...)); | ||
| using futurator = futurize<return_type>; | ||
| #ifdef SEASTAR_TASK_CONTEXT | ||
| auto* prev = current_task_context(); | ||
| set_current_task_context(ctx.get()); | ||
| auto restore = defer([prev] () noexcept { | ||
| set_current_task_context(prev); | ||
| }); | ||
| return futurator::invoke(std::forward<Func>(func), std::forward<Args>(args)...); |
There was a problem hiding this comment.
return_type is deduced as decltype(func(std::forward<Args>(args)...)), but the actual invocation uses futurator::invoke(std::forward<Func>(func), ...). For callables with ref-qualified overloads (e.g., operator()() && only), decltype(func(...)) (where func is an lvalue) can differ from std::invoke_result_t<Func&&, Args&&...>, causing hard-to-diagnose compilation failures.
Prefer deducing the return type using std::invoke_result_t<Func&&, Args&&...> (or reuse futurize_invoke() from future.hh) so the deduced type matches the forwarded call.
|
|
||
| option (Seastar_TASK_CONTEXT | ||
| "Enable task context propagation for tracing and request-scoped state." | ||
| ON) |
There was a problem hiding this comment.
Seastar_TASK_CONTEXT defaults to ON, but enabling it increases every seastar::task by an extra pointer-sized field (and adds per-task-construction work via inherit_task_context()). This changes baseline memory footprint/ABI for all consumers, even those not using task context.
Consider defaulting this option to OFF (similar to Seastar_TASK_BACKTRACE) and letting users opt in explicitly, unless there’s a strong compatibility/performance reason to enable by default.
| ON) | |
| OFF) |
Add an optional lw_shared_ptr<task_context> field on seastar::task
that propagates automatically through the task chain, gated by the
SEASTAR_TASK_CONTEXT compile flag. task_context is a polymorphic
base (enable_lw_shared_from_this + virtual dtor) that users subclass
to attach per-chain state such as tracing spans. The field adds
8 bytes to task when the flag is enabled and is compiled out
otherwise.
Propagation uses a thread-local raw pointer as the authority,
mirroring how current_scheduling_group() works:
- Reactor loop: sets TLS from task._context before run_and_dispose,
clears after the loop iteration.
- Task constructor: calls inherit_task_context() which reads TLS
and shared_from_this()'s it into the new task's _context.
- set_current_task(): keeps TLS in sync with the current task for
paths that bypass the reactor loop (parallel_for_each inline
resumption, thread_context::switch_in, thread.cc blocking .get()).
- smp: work_item uses a no_context_tag to construct its task
without inheriting, since context cannot cross shards (lw_shared_ptr
is not thread-safe).
The primary user-facing API is with_context(ctx, func, args...) —
a function wrapper that installs ctx as TLS for the duration of
func's synchronous execution. Inline work created by func (coroutine
promises, .then() continuations, parallel_for_each branches) inherits
ctx at construction time via TLS, and their own task._context carries
it across suspensions. The caller's task is never modified, so the
function is safe inside coroutines, long-lived driver tasks, and
anywhere else.
current_task_context() and set_current_task_context() expose the
TLS accessors publicly so users can build custom RAII or awaitable
helpers on top when ergonomics demand it.
Adds tests/unit/task_context_test.cc covering the primitive accessors,
propagation across .then() chains and coroutine suspensions, gate,
async, parallel_for_each, when_any, scheduling group switches,
as_future, try_future, maybe_yield, cross-shard non-propagation,
refcount semantics, and regression coverage for the set_current_task
and thread.cc sync points.
9611be5 to
b96f47b
Compare
Update the Seastar dependency to the task-context-2 branch (redpanda-data/seastar#271) which adds lw_shared_ptr<task_context> propagation through the task chain via the with_context primitive and public TLS accessors. Add the new with_context.hh header to the Seastar BUILD so Redpanda code can use it.
Add an optional
lw_shared_ptr<task_context>field onseastar::taskthat propagates automatically through the task chain, gated by theSEASTAR_TASK_CONTEXTcompile flag.task_contextis a polymorphic base (enable_lw_shared_from_this+ virtual dtor) that users subclass to attach per-chain state such as tracing spans. The field adds 8 bytes to task when the flag is enabled and is compiled out otherwise.Propagation uses a thread-local raw pointer as the authority, mirroring how
current_scheduling_group()works:task._contextbeforerun_and_dispose, clears after the loop iteration.inherit_task_context()which reads TLS andshared_from_this()'s it into the new task's_context.set_current_task(): keeps TLS in sync with the current task for paths that bypass the reactor loop (parallel_for_eachinline resumption,thread_context::switch_in, thread.cc blocking.get()).work_itemuses ano_context_tagto construct its task without inheriting, since context cannot cross shards (lw_shared_ptris not thread-safe).The primary user-facing API is
with_context(ctx, func, args...), a function wrapper that installs ctx as TLS for the duration of func's synchronous execution. Inline work created by func (coroutine promises,.then()continuations,parallel_for_eachbranches) inherits ctx at construction time via TLS, and their owntask._contextcarries it across suspensions. The caller's task is never modified, so the function is safe inside coroutines, long-lived driver tasks, and anywhere else.current_task_context()andset_current_task_context()expose the TLS accessors publicly so users can build custom RAII or awaitable helpers on top when ergonomics demand it.