fix: restore _current_span in scheduler to fix task failures (issue #…#380
Open
NexionisJake wants to merge 1 commit into GetBindu:main
…etBindu#353)

Commit 1cc2a61 changed _TaskOperation to carry trace_id/span_id strings, but the in-memory path never restored the span in the worker, silently dropping trace context and causing every task to fail when the worker tried to access _current_span.

Changes:
- base.py: revert _TaskOperation fields from trace_id/span_id back to _current_span: Span; import Span from opentelemetry.trace; update docstring to clarify Redis divergence
- memory_scheduler.py: update _send_operation() to pass _current_span=get_current_span() instead of trace_id/span_id; remove unused math import; replace math.inf buffer with 100 to restore backpressure; drop get_trace_context import

workers/base.py already handles the None-span case via nullcontext(), so no change is needed there. The Redis scheduler correctly serialises trace_id/span_id strings for cross-process transport (CASE 2).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- 1cc2a61 refactored _TaskOperation to use serialisable trace_id/span_id strings, but never updated InMemoryScheduler to pass a live span, so _current_span was never set in the task dict, the worker received None, and every task was silently marked failed
- The broad except block in the worker swallowed the error with no visible signal
- _TaskOperation carries _current_span: Span again; _send_operation() in InMemoryScheduler now calls get_current_span() directly; math.inf buffer replaced with 100 to prevent unbounded memory growth under backpressure
- RedisScheduler is untouched: it legitimately needs trace_id/span_id strings because live Span objects cannot be JSON-serialised across processes; workers/base.py is untouched: it already guards against a missing span with nullcontext()
Change Type (select all that apply)
Scope (select all touched areas)
Linked Issue/PR
User-Visible / Behavior Changes
Tasks submitted via the A2A message/send endpoint now complete successfully and return artifacts. Previously every task immediately transitioned to the "failed" state with no artifacts.
Security Impact (required)
- No
- No
- No
- No
- No
- Yes, explain risk + mitigation: N/A
Verification
Environment
Steps to Test
- python examples/beginner/echo_simple_agent.py
- curl -X POST http://localhost:3773/ -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":"<uuid>","method":"message/send","params":{"configuration":{"acceptedOutputModes":["text"]},"message":{"messageId":"<uuid>","contextId":"<uuid>","taskId":"<uuid>","kind":"message","role":"user","parts":[{"kind":"text","text":"Hello"}]}}}'
- curl -X POST http://localhost:3773/ -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":"<uuid>","method":"tasks/get","params":{"taskId":"<task-uuid>"}}'
Expected Behavior
- Task status transitions submitted → working → completed
- Response contains an artifacts array
Actual Behavior
- Task status is failed
- No artifacts field in response
- KeyError on _current_span swallowed by the broad except block
Evidence (attach at least one)
Full suite after fix: 666 passed, 18 skipped, 0 failures
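For repeat checks, the manual curl commands under Steps to Test could be scripted along these lines. This is a sketch under assumptions: the URL and payload fields are copied from those commands, the helper names (`build_send_request`, `build_get_request`, `post`, `run_manual_check`) are hypothetical, and the shape of the server's response is not specified in this PR, so the script only returns it rather than parsing it.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:3773/"  # taken from the curl commands in Steps to Test


def build_send_request(text: str) -> dict:
    """Build a message/send JSON-RPC payload with freshly generated UUIDs."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "configuration": {"acceptedOutputModes": ["text"]},
            "message": {
                "messageId": str(uuid.uuid4()),
                "contextId": str(uuid.uuid4()),
                "taskId": str(uuid.uuid4()),
                "kind": "message",
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            },
        },
    }


def build_get_request(task_id: str) -> dict:
    """Build a tasks/get JSON-RPC payload for an existing task id."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/get",
        "params": {"taskId": task_id},
    }


def post(payload: dict, url: str = BASE_URL) -> dict:
    """POST a JSON payload and decode the JSON response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def run_manual_check() -> tuple[dict, dict]:
    """Send one message, then fetch the task; requires the example agent to be running."""
    send = build_send_request("Hello")
    sent = post(send)
    fetched = post(build_get_request(send["params"]["message"]["taskId"]))
    return sent, fetched
```

Calling `run_manual_check()` needs the echo agent from the first step to be listening locally. Since the task runs asynchronously, a single tasks/get call may still report an intermediate state, so polling may be necessary.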