Joe/tmp/decom #9
Draft
joe-redpanda wants to merge 11 commits into dev
Upcoming changes will need to access the cluster's config_version for correctness checks, but config_manager is a heavy dependency. This commit adds:
1. an interface for cluster configs (currently only config_version fetching)
2. an implementation for cluster_config backed by config_backend
3. a test implementation that lets the user bind a config_i to a lambda
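A minimal sketch of the interface-plus-lambda pattern this commit describes. Only `config_i` and `config_version` are names from the PR text; `get_config_version`, `lambda_config`, and the integer alias are assumptions made for illustration and may not match the actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Assumption: the version is modelled as a plain integer here.
using config_version = std::int64_t;

// Narrow interface so callers do not depend on the full config manager.
struct config_i {
    virtual ~config_i() = default;
    virtual config_version get_config_version() const = 0;
};

// Hypothetical test implementation: the version comes from a user-supplied lambda.
class lambda_config final : public config_i {
public:
    explicit lambda_config(std::function<config_version()> fn)
      : _fn(std::move(fn)) {}

    config_version get_config_version() const override { return _fn(); }

private:
    std::function<config_version()> _fn;
};
```

In a test, the lambda can capture a local variable by reference so the reported version can be changed between assertions.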
Adds partition_autobalancing_node_autodecommission_time, the time in seconds after which the partition balancer planner should begin decommissioning an unresponsive node.
Wires partition_autobalancing_node_autodecommission_time into the partition balancer. This commit adds the most basic implementation of auto decommissioning, based on when each node was last seen from the perspective of the current controller broker. This implementation will run into problems when controller leadership changes. Future commits will replace it with a coordinated approach in which partition_balancer_planner uses the cluster health report to seek the consent of a quorum of nodes before decommissioning a broker.
Adds config_i and node_status to the health monitor backend. Future commits will use them to create an auto decom status report.
Adds a new struct to health_monitor_types: auto_decommission_status. This struct reports when a node hasn't been heard from for long enough to be considered for automatic decommission. Adds this struct to node_health_report and node_health_report_serde, along with the build fixes those changes require.
Implements calculation of nodes past the auto decom time in the health monitor backend. All nodes that are past the automatic decommission time are added to the auto_decommission_status field in the health report. A node is added if it was last seen before (now - decom time) or, if it has never been seen, if this node itself started before (now - decom time).
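The rule above can be sketched as follows. This is an illustration under stated assumptions, not the health monitor backend's code: `nodes_past_decom_time`, the `node_id` alias, and the use of `std::chrono::steady_clock` are all hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <optional>
#include <vector>

using clock_type = std::chrono::steady_clock; // assumption: any monotonic clock
using node_id = int;                          // assumption: simplified node id

// Returns the nodes whose last-seen time is older than (now - decom_time).
// A node that was never seen falls back to this node's own boot time.
std::vector<node_id> nodes_past_decom_time(
  const std::map<node_id, std::optional<clock_type::time_point>>& last_seen,
  clock_type::time_point boot_time,
  clock_type::time_point now,
  clock_type::duration decom_time) {
    const auto cutoff = now - decom_time;
    std::vector<node_id> out;
    for (const auto& [id, seen] : last_seen) {
        if (seen.value_or(boot_time) < cutoff) {
            out.push_back(id);
        }
    }
    return out;
}
```

Falling back to the local boot time means a freshly restarted reporter does not immediately flag peers it simply has not had time to hear from.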
Adds config_i to partition_balancer_state and wires in the sharded-service-based implementation in the controller.
Adds a pass-through method to get the config_version from config_i.
Adds unit tests for the partition_balancer_planner's logic for determining when to auto decommission a node. The checks are:
1. check that our calculation of quorum is correct
2. don't double decommission nodes
3. don't decommission nodes in maintenance mode
4. ignore votes with a different config_version than the leader
5. ignore votes from nodes which aren't cluster members
6. support multiple decoms so long as quorum is alive
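A rough sketch of the kind of vote filtering these checks exercise. The PR text does not show the planner's code, so everything here is an assumption: the `vote` struct, `quorum_size` as a strict majority of members, and the `excluded` set standing in for nodes that are already decommissioning or in maintenance mode. It does not model check 6.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using node_id = int;
using config_version = std::int64_t;

// One node's report: which peers it considers past the auto decom timeout.
struct vote {
    node_id voter;
    config_version version;
    std::set<node_id> timed_out;
};

// Assumption: quorum means a strict majority of cluster members.
std::size_t quorum_size(std::size_t members) { return members / 2 + 1; }

std::set<node_id> decommission_candidates(
  const std::set<node_id>& members,
  const std::set<node_id>& excluded,
  config_version leader_version,
  const std::vector<vote>& votes) {
    // target node -> distinct voters that reported it as timed out
    std::map<node_id, std::set<node_id>> voters_for;
    for (const auto& v : votes) {
        if (members.count(v.voter) == 0) continue;  // non-member vote
        if (v.version != leader_version) continue;  // stale config
        for (node_id target : v.timed_out) {
            if (members.count(target) == 0) continue;
            if (excluded.count(target) != 0) continue;
            voters_for[target].insert(v.voter);
        }
    }
    std::set<node_id> result;
    for (const auto& [target, voters] : voters_for) {
        if (voters.size() >= quorum_size(members.size())) {
            result.insert(target);
        }
    }
    return result;
}
```

Counting distinct voters per target, rather than raw votes, keeps a node that reports twice from being counted twice.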
Adds unit tests on the generation of auto decommission status reports within health_monitor_backend. Tests the following:
1. don't report when nothing is timed out
2. report when a node is past timeout and uptime is past timeout
3. default last seen to timepoint of bootup
4. config is passed through
5. ignore non-members in reporting
6. multiple nodes can be included in reports
Adds two tests:
1. smoke test: check that we can auto decom a node if it exceeds the auto decom timeout
2. reset test: check that node restarts DO reset the timer on auto decommissioning
joe-redpanda pushed a commit that referenced this pull request on Jan 8, 2026
Seems like the `finally()` clause here was being invoked long after
the owning semaphore was destructed. Add a `_gate` holder to prevent
any heap-use-after-free.
```
#0 0x55f4110f85a7 in seastar::basic_semaphore<ssx::basic_checkpoint_mutex<seastar::lowres_clock>::checkpoint_mutex_exception_factory, seastar::lowres_clock>::available_units() const external/+non_module_dependencies+seastar/include/seastar/core/semaphore.hh:462:55
#1 0x55f4110f85a7 in ssx::basic_checkpoint_mutex<seastar::lowres_clock>::release() bazel-out/k8-dbg/bin/src/v/ssx/_virtual_includes/checkpoint_mutex/ssx/checkpoint_mutex.h:221:9
#2 0x55f4110f85a7 in ssx::basic_mutex_units<seastar::lowres_clock>::release() bazel-out/k8-dbg/bin/src/v/ssx/_virtual_includes/checkpoint_mutex/ssx/checkpoint_mutex.h:271:19
#3 0x55f4110d7cba in ssx::basic_mutex_units<seastar::lowres_clock>::~basic_mutex_units() bazel-out/k8-dbg/bin/src/v/ssx/_virtual_includes/checkpoint_mutex/ssx/checkpoint_mutex.h:247:9
#4 0x55f4110d7cba in std::__1::__optional_destruct_base<ssx::basic_mutex_units<seastar::lowres_clock>, false>::~__optional_destruct_base[abi:se200100]() external/toolchains_llvm++llvm+current_llvm_toolchain/bin/../../toolchains_llvm++llvm+current_llvm_toolchain_llvm/bin/../include/c++/v1/optional:300:15
#5 0x55f4110d7cba in cloud_topics::l0::read_debounce<seastar::lowres_clock>::process_single_request(cloud_topics::l0::read_request<seastar::lowres_clock>*)::'lambda'()::~() src/v/cloud_topics/level_zero/read_debounce/read_debounce.cc:151:20
#6 0x55f4110fdf23 in seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>::~noncopyable_function() external/+non_module_dependencies+seastar/include/seastar/util/noncopyable_function.hh:200:9
#7 0x55f4110fdf23 in seastar::continuation<seastar::internal::promise_base_with_type<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>, seastar::futurize<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>>::type seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>::then_wrapped_nrvo<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>>(seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>&&)::'lambda'(seastar::internal::promise_base_with_type<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>&&, seastar::future_state<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&), std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>::~continuation() external/+non_module_dependencies+seastar/include/seastar/core/future.hh:724:8
#8 0x55f4110fdc82 in seastar::continuation<seastar::internal::promise_base_with_type<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>, seastar::futurize<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>>::type seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>::then_wrapped_nrvo<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>>(seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>&&)::'lambda'(seastar::internal::promise_base_with_type<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&, seastar::noncopyable_function<seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>> (seastar::future<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&)>&&, seastar::future_state<std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>&&), std::__1::expected<cloud_topics::l0::dataplane_query_result, cloud_topics::errc>>::run_and_dispose() external/+non_module_dependencies+seastar/include/seastar/core/future.hh:745:9
#9 0x55f42046b967 in seastar::reactor::task_queue::run_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:2747:14
#10 0x55f42047256c in seastar::reactor::task_queue_group::run_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:3251:27
#11 0x55f420471f91 in seastar::reactor::task_queue_group::run_some_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:3235:5
#12 0x55f4204741e6 in seastar::reactor::do_run() external/+non_module_dependencies+seastar/src/core/reactor.cc:3418:20
#13 0x55f4204db2e7 in seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0::operator()() const external/+non_module_dependencies+seastar/src/core/reactor.cc:4732:22
#14 0x55f4204db2e7 in decltype(std::declval<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>()()) std::__1::__invoke[abi:se200100]<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&) external/toolchains_llvm++llvm+current_llvm_toolchain/bin/../../toolchains_llvm++llvm+current_llvm_toolchain_llvm/bin/../include/c++/v1/__type_traits/invoke.h:179:25
```
joe-redpanda pushed a commit that referenced this pull request on Jan 15, 2026
This test replicates known issues with the `<=>` operator and hopefully
will prevent other issues from going unnoticed.
An example of the new test reproducing known issues:
```
$ bazel test //src/v/bytes/tests:iobuf_fuzz --config=sanitizer --runs_per_test=100
INFO: Invocation ID: ed80b7dc-a90c-495f-a647-22d11fc5c38c
INFO: Analyzed target //src/v/bytes/tests:iobuf_fuzz (0 packages loaded, 0 targets configured).
FAIL: //src/v/bytes/tests:iobuf_fuzz (run 18 of 100) (Exit 1) (see /home/brandonallard/data/.cache/bazel/_bazel_brandonallard/50acd7c106e570e66c6351a7103182f1/execroot/_main/bazel-out/k8-dbg/testlogs/src/v/bytes/tests/iobuf_fuzz/run_18_of_100/test.log)
INFO: From Testing //src/v/bytes/tests:iobuf_fuzz (run 18 of 100):
==================== Test output for //src/v/bytes/tests:iobuf_fuzz (run 18 of 100):
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2773365366
INFO: A corpus is not provided, starting from an empty corpus
WARNING: no interesting inputs were found so far. Is the code instrumented for coverage?
This may also happen if the target rejected all inputs we tried so far
src/v/bytes/iobuf.cc:192:52: runtime error: null pointer passed as argument 2, which is declared to never be null
external/+non_module_dependencies+x86_64_sysroot/usr/include/string.h:65:33: note: nonnull attribute specified here
#0 0x7ff296ec461d in iobuf::operator<=>(iobuf const&) const src/v/bytes/iobuf.cc:192:28
#1 0x55d29d4a27b4 in iobuf_ops::compare_iobufs(std::__1::basic_string_view<char, std::__1::char_traits<char>>, bool) src/v/bytes/tests/iobuf_fuzz.cc:281:31
#2 0x55d29d49e171 in driver::handle_op(driver::op_spec) src/v/bytes/tests/iobuf_fuzz.cc:576:16
#3 0x55d29d49abbb in driver::operator()() src/v/bytes/tests/iobuf_fuzz.cc:513:13
#4 0x55d29d49abbb in LLVMFuzzerTestOneInput src/v/bytes/tests/iobuf_fuzz.cc:719:16
#5 0x55d29d3385db in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:619:13
#6 0x55d29d337c05 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:516:7
#7 0x55d29d339935 in fuzzer::Fuzzer::MutateAndTestOne() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:765:19
#8 0x55d29d33a595 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:910:5
#9 0x55d29d327ff5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:915:6
#10 0x55d29d354bb2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#11 0x7ff28d010247 in __libc_start_call_main (/lib64/libc.so.6+0x3247) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#12 0x7ff28d01030a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x330a) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#13 0x55d29d31a704 in _start (/home/brandonallard/data/.cache/bazel/_bazel_brandonallard/50acd7c106e570e66c6351a7103182f1/execroot/_main/bazel-out/k8-dbg/bin/src/v/bytes/tests/iobuf_fuzz+0x2f0704) (BuildId: 4de8a27cc512a4a32db2efcd62d94228)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/v/bytes/iobuf.cc:192:52
MS: 5 CMP-CrossOver-InsertRepeatedBytes-ChangeBit-InsertByte- DE: "\000\000\000\000"-; base unit: adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
0x23,0xa,0x0,0x0,0x0,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0xa,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8
,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0xa,0x0,
0\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\010\012\000
artifact_prefix='./'; Test unit written to ./crash-87234361ff2513f61e8eba73f82676d022d60e42
Base64: IwoAAAAICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKCAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKAA==
================================================================================
FAIL: //src/v/bytes/tests:iobuf_fuzz (run 76 of 100) (Exit 77) (see /home/brandonallard/data/.cache/bazel/_bazel_brandonallard/50acd7c106e570e66c6351a7103182f1/execroot/_main/bazel-out/k8-dbg/testlogs/src/v/bytes/tests/iobuf_fuzz/run_76_of_100/test.log)
INFO: From Testing //src/v/bytes/tests:iobuf_fuzz (run 76 of 100):
==================== Test output for //src/v/bytes/tests:iobuf_fuzz (run 76 of 100):
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2867371789
INFO: A corpus is not provided, starting from an empty corpus
WARNING: no interesting inputs were found so far. Is the code instrumented for coverage?
This may also happen if the target rejected all inputs we tried so far
libc++abi: terminating due to uncaught exception of type std::runtime_error: (buf <=> o_buf) != (ref <=> o_ref)
==12== ERROR: libFuzzer: deadly signal
#0 0x55a6e61329f1 in __sanitizer_print_stack_trace /src/llvm-project/compiler-rt/lib/asan/asan_stack.cpp:87:3
#1 0x55a6e6027228 in fuzzer::PrintStackTrace() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:210:5
#2 0x55a6e6009e73 in fuzzer::Fuzzer::CrashCallback() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:231:3
#3 0x7f980b02704f (/lib64/libc.so.6+0x1a04f) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#4 0x7f980b080113 in __pthread_kill_implementation (/lib64/libc.so.6+0x73113) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#5 0x7f980b026f9d in gsignal (/lib64/libc.so.6+0x19f9d) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#6 0x7f980b00e941 in abort (/lib64/libc.so.6+0x1941) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#7 0x55a6e616d985 in __abort_message abort_message.cpp
#8 0x55a6e61d3288 in demangling_terminate_handler() cxa_default_handlers.cpp
#9 0x55a6e61d3172 in std::__terminate(void (*)()) cxa_handlers.cpp
#10 0x55a6e61d1d98 in __cxa_rethrow (/home/brandonallard/data/.cache/bazel/_bazel_brandonallard/50acd7c106e570e66c6351a7103182f1/execroot/_main/bazel-out/k8-dbg/bin/src/v/bytes/tests/iobuf_fuzz+0x4d4d98) (BuildId: 4de8a27cc512a4a32db2efcd62d94228)
#11 0x55a6e616dd28 in LLVMFuzzerTestOneInput src/v/bytes/tests/iobuf_fuzz.cc:724:9
#12 0x55a6e600b5db in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:619:13
#13 0x55a6e600ac05 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:516:7
#14 0x55a6e600c935 in fuzzer::Fuzzer::MutateAndTestOne() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:765:19
#15 0x55a6e600d595 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:910:5
#16 0x55a6e5ffaff5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:915:6
#17 0x55a6e6027bb2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#18 0x7f980b010247 in __libc_start_call_main (/lib64/libc.so.6+0x3247) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#19 0x7f980b01030a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x330a) (BuildId: 515c33a35f41020661fea8ac4eb995e26ccd6b00)
#20 0x55a6e5fed704 in _start (/home/brandonallard/data/.cache/bazel/_bazel_brandonallard/50acd7c106e570e66c6351a7103182f1/execroot/_main/bazel-out/k8-dbg/bin/src/v/bytes/tests/iobuf_fuzz+0x2f0704) (BuildId: 4de8a27cc512a4a32db2efcd62d94228)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 4 CopyPart-ChangeBit-CMP-CMP- DE: "\000\000\000\000\000\000\000\000"-"\377\377\377\377\377\377\377\377\377\377?\377\377\377\377\377\377\377\377\377\377\377\012\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\000\000\000\000\000\000\000\011\377\377\377\377\377\377\377\377\377"-; base unit: adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
0x4a,0xa,0x0,0x0,0x0,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0x3f,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xa,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x9,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0x0,0x0,0x0,0x0,0x0,
J\012\000\000\000\377\377\377\377\377\377\377\377\377\377?\377\377\377\377\377\377\377\377\377\377\377\012\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\000\000\000\000\000\000\000\011\377\377\377\377\377\377\377\377\377\000\000\000\000\000
artifact_prefix='./'; Test unit written to ./crash-b93c506ce3bec14322fc8cabc8898af768226160
Base64: SgoAAAD/////////////P///////////////Cv///////////////////////////////wAAAAAAAAAJ////////////AAAAAAA=
================================================================================
```
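The second failure in the log, `(buf <=> o_buf) != (ref <=> o_ref)`, comes from checking the buffer's comparison against a reference comparison. A minimal sketch of that differential pattern follows. It is not the `iobuf` implementation or the fuzz driver: `fragments`, `compare_fragmented`, and `agrees_with_reference` are hypothetical stand-ins, and an `int` sign is used in place of `<=>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a fragmented buffer: bytes split across chunks.
using fragments = std::vector<std::string>;

static int sign(int v) { return (v > 0) - (v < 0); }

// Byte-wise comparison of two fragmented buffers without flattening them.
// Returns -1, 0, or 1.
int compare_fragmented(const fragments& a, const fragments& b) {
    std::size_t ai = 0, ao = 0, bi = 0, bo = 0;
    while (true) {
        // Skip exhausted or empty fragments so memcmp always gets n > 0.
        while (ai < a.size() && ao == a[ai].size()) { ++ai; ao = 0; }
        while (bi < b.size() && bo == b[bi].size()) { ++bi; bo = 0; }
        const bool a_end = ai == a.size();
        const bool b_end = bi == b.size();
        if (a_end || b_end) {
            // The buffer that ran out first is the smaller one.
            return int(b_end) - int(a_end);
        }
        const std::size_t n
          = std::min(a[ai].size() - ao, b[bi].size() - bo);
        const int r = std::memcmp(a[ai].data() + ao, b[bi].data() + bo, n);
        if (r != 0) return sign(r);
        ao += n;
        bo += n;
    }
}

// Reference model: flatten both buffers and compare the contiguous strings.
bool agrees_with_reference(const fragments& a, const fragments& b) {
    std::string fa, fb;
    for (const auto& f : a) fa += f;
    for (const auto& f : b) fb += f;
    return compare_fragmented(a, b) == sign(fa.compare(fb));
}
```

A fuzzer would build `a` and `b` from its input bytes and abort whenever `agrees_with_reference` returns false.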
joe-redpanda pushed a commit that referenced this pull request on Apr 7, 2026
Taking an r-value reference to `req` in `check_ntp_states_locally()` could
result in a heap-use-after-free [1] because `req` is not `std::move()`'d
within the function and `check_ntp_states_locally()` is called via a
continuation in `backend::send_rpc()`.
Change `check_ntp_states_locally()` to take `req` by value instead.
[1]:
```
#0 0xaaaad89d8a18 in chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false>::check_generation() const bazel-out/aarch64-dbg/bin/src/v/container/_virtual_includes/chunked_vector/container/chunked_vector.h:474:13
#1 0xaaaad89d8a18 in chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false>::operator==(chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false> const&) const bazel-out/aarch64-dbg/bin/src/v/container/_virtual_includes/chunked_vector/container/chunked_vector.h:439:13
#2 0xaaaad88ff614 in seastar::future<void> ssx::detail::async_for_each_coro<ssx::async_algo_traits, ssx::detail::internal_counter, cluster::data_migrations::backend::check_ntp_states_locally(cluster::data_migrations::check_ntp_states_request&&)::$_0, chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false>, chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false>>(ssx::detail::internal_counter, T2, chunked_vector<cluster::data_migrations::data_migration_ntp_state>::iter<false>, cluster::data_migrations::backend::check_ntp_states_locally(cluster::data_migrations::check_ntp_states_request&&)::$_0) (.resume) bazel-out/aarch64-dbg/bin/src/v/ssx/_virtual_includes/async_algorithm/ssx/async_algorithm.h:148:20
#3 0xaaaace22a4dc in std::__1::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume[abi:se200100]() const external/toolchains_llvm++llvm+current_llvm_toolchain/bin/../../toolchains_llvm++llvm+current_llvm_toolchain_llvm/bin/../include/c++/v1/__coroutine/coroutine_handle.h:144:5
#4 0xaaaace22a4dc in seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose() external/+non_module_dependencies+seastar/include/seastar/core/coroutine.hh:171:20
#5 0xaaaadde45c2c in seastar::reactor::task_queue::run_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:2747:14
#6 0xaaaadde4c594 in seastar::reactor::task_queue_group::run_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:3251:27
#7 0xaaaadde4c034 in seastar::reactor::task_queue_group::run_some_tasks() external/+non_module_dependencies+seastar/src/core/reactor.cc:3235:5
#8 0xaaaadde4dcac in seastar::reactor::do_run() external/+non_module_dependencies+seastar/src/core/reactor.cc:3418:20
#9 0xaaaadde4cc44 in seastar::reactor::run() external/+non_module_dependencies+seastar/src/core/reactor.cc:3295:16
#10 0xaaaaddc381d4 in seastar::app_template::run_deprecated(int, char**, std::__1::function<void ()>&&) external/+non_module_dependencies+seastar/src/core/app-template.cc:266:31
#11 0xaaaaddc36a4c in seastar::app_template::run(int, char**, std::__1::function<seastar::future<int> ()>&&) external/+non_module_dependencies+seastar/src/core/app-template.cc:160:12
#12 0xaaaace07c3b8 in application::run(int, char**) src/v/redpanda/application.cc:312:16
#13 0xaaaace034788 in main src/v/redpanda/main.cc:22:16
#14 0xffffaf0073f8 (/opt/redpanda_installs/ci/lib/libc.so.6+0x273f8) (BuildId: 2a450fe74d1b79a321cc1b12337fc31a2c3fb834)
#15 0xffffaf0074c8 in __libc_start_main (/opt/redpanda_installs/ci/lib/libc.so.6+0x274c8) (BuildId: 2a450fe74d1b79a321cc1b12337fc31a2c3fb834)
#16 0xaaaacdf503ec in _start (/opt/redpanda_installs/ci/libexec/redpanda+0x1a3e03ec) (BuildId: 52528f0683dceb3bfb7940a4869a87f6)
```
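The by-value fix can be illustrated without Seastar. The sketch below is an analogy, not the actual `backend` code: `request`, `deferred_sum`, `check_states_by_value`, and `make_task` are hypothetical names, and a stored closure stands in for work that resumes after the caller's frame is gone. Taking the parameter by value gives the function its own object, which it can then move into the deferred work.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for the request type named in the commit message.
struct request {
    std::vector<int> states;
};

// Stand-in for work that runs later, after the caller has returned.
using deferred_sum = std::function<int()>;

// By-value parameter: this function owns `req` and moves it into the closure,
// so the deferred work never refers to storage owned by the caller.
deferred_sum check_states_by_value(request req) {
    return [req = std::move(req)] {
        return std::accumulate(req.states.begin(), req.states.end(), 0);
    };
}

// The caller's local request is destroyed when this function returns;
// the returned task still works because it holds its own copy of the data.
deferred_sum make_task() {
    request tmp{{1, 2, 3}};
    return check_states_by_value(std::move(tmp));
}
```

Had the deferred work kept only a reference to `tmp`, running it after `make_task()` returned would read a destroyed object, which is the same class of bug the sanitizer trace above reports.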
joe-redpanda pushed a commit that referenced this pull request on Apr 20, 2026
Backports Required
Release Notes