feat: add distributed tracing to tracker and worker #292
Conversation
# Conflicts:
#	services/tracker/src/tracker/utils.py
2b2652b was a broad ruff format pass that touched 21 files. Keeping the formatting applied to logfire-related code (utils.py, config.py, tracker_stack.py, worker_stack.py) but dropping the unrelated churn in migrations, test files, cli/main.py, and logging/config.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```python
    additional_span_processors=[_ContextVarSpanProcessor()],
)

logfire.instrument_httpx()
```
🚩 logfire.instrument_httpx() called globally in configure_logfire
logfire.instrument_httpx() at services/tracker/src/tracker/tracing.py:49 monkey-patches httpx.Client and httpx.AsyncClient globally. This means ALL httpx clients created after this call are instrumented, including the Slack webhook client in notifications.py:138 and any BenchmarkServiceClient httpx usage. For the worker, configure_logfire is called during WORKER_STARTUP event (config.py:76), which runs before any tasks are processed, so all benchmark service clients created per-task will be instrumented. This is a broad-scope side effect worth being aware of — if any httpx client handles sensitive data in headers, those would appear in traces.
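One way to narrow the blast radius, sketched below on the assumption that `logfire.instrument_httpx` is passed an explicit client rather than patching globally (client names here are hypothetical):

```python
# Sketch: scope httpx instrumentation to one client instead of the global
# monkey-patch, so sensitive clients (e.g. the Slack webhook client) stay
# uninstrumented and their headers never reach traces.
import httpx
import logfire

logfire.configure(send_to_logfire=False)

traced_client = httpx.Client()
logfire.instrument_httpx(traced_client)  # instruments only this instance

untraced_client = httpx.Client()  # untouched by instrumentation
```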
JarettForzano left a comment
Instead of using context managers, I think we should make decorators that wrap around the functions. It's easier to maintain and helps clean up the code by enforcing that we divide it into functions.
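For illustration, a sketch of that shape using logfire's built-in `@logfire.instrument` decorator (the function name is hypothetical):

```python
# Sketch: the decorator opens a span around the whole function, replacing an
# inline `with logfire.span(...)` block inside a larger body.
import logfire

@logfire.instrument("setup_task")  # span named after the step
def setup_task(task_id: str) -> None:
    ...  # the entire function body runs inside the span
```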
```python
task.status = TaskStatus.BUILDING
task_session.commit()

environment = os.environ.get("ENVIRONMENT", "development")
```
I think this should live inside the config file.
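A sketch of that move, assuming a module-level constant in the tracker's `config.py`:

```python
# config.py (sketch): read ENVIRONMENT once so every consumer shares one value.
import os

ENVIRONMENT: str = os.environ.get("ENVIRONMENT", "development")
```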
```python
try:
    sentry_sdk.init(
        dsn=dsn,
        environment=os.environ.get("ENVIRONMENT", "development"),
```
Pull from config, just since it's used in more than one location.
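A sketch of pulling it from config here too (the import path is an assumption):

```python
# Sketch: sentry setup imports the shared constant instead of re-reading
# os.environ at each call site.
import sentry_sdk
from tracker.config import ENVIRONMENT  # hypothetical import path

def init_sentry(dsn: str) -> None:
    sentry_sdk.init(dsn=dsn, environment=ENVIRONMENT)
```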
```python
# Sentry's LoggingIntegration (enabled in init_sentry with sentry_logs_level=INFO)
# attaches to the root logger and ships records to Sentry Logs without needing a
# dictConfig handler entry.
```
It just uses the logs that we have built in?
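Roughly, yes. A sketch of the wiring the comment describes, using the `LoggingIntegration` settings from this PR (the DSN is a placeholder):

```python
# Sketch: LoggingIntegration hooks the stdlib root logger, so existing
# logger.info(...) calls ship as Sentry Logs with no dictConfig changes.
import logging

import sentry_sdk
from sentry_sdk.integrations.logging import LoggingIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder
    enable_logs=True,
    integrations=[
        LoggingIntegration(
            level=None,        # no breadcrumbs from log records
            event_level=None,  # no auto error events from log records
            sentry_logs_level=logging.INFO,  # ship INFO+ to Sentry Logs
        )
    ],
)

logging.getLogger(__name__).info("shows up in Sentry Logs")
```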
I think we may want to instrument these with logfire so that we can capture the sandbox id. We should also use logfire.exception on the warnings so that we make sure we are capturing those.
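A sketch of what that could look like (the function and attribute names are assumptions):

```python
# Sketch: carry sandbox_id as a span attribute and record caught failures via
# logfire.exception(), which logs at error level with the active traceback.
import logfire

def kill_pty_session(sandbox_id: str) -> None:
    with logfire.span("kill_pty_session", sandbox_id=sandbox_id):
        try:
            ...  # actual PTY teardown call
        except Exception:
            logfire.exception("failed to kill pty session", sandbox_id=sandbox_id)
```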
JarettForzano left a comment
Can merge after requested changes are made / responded to
Summary
Instruments the tracker API and Taskiq worker with OpenTelemetry, giving end-to-end visibility into benchmark execution from HTTP request through worker task completion. Trace context propagates across the Redis/Taskiq boundary so a single trace covers the full benchmark lifecycle. Traces and logs ship to Sentry.
Architecture
- `logfire` package (spans + FastAPI/httpx auto-instrumentation). `send_to_logfire=False` — we don't ship to Logfire cloud; logfire is only used as an ergonomic OTel wrapper.
- `SentrySpanProcessor`, with `SPAN_MAX_TIME_OPEN_MINUTES` bumped to 240 so long-running parents like `process_benchmark`/`process_task` survive (the default 10m drops them).
- `LoggingIntegration(sentry_logs_level=INFO)`. Stdlib `logger.info`/`warning`/`error` appear alongside spans in the Sentry UI.
- `traceparent`/`tracestate` + Baggage + Sentry's `sentry-trace`/`baggage`, so both non-Sentry peers (Daytona, benchmark_service) and Sentry peers see the trace.
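A condensed sketch of that wiring, assuming sentry-sdk's OTel integration points; the exact setup in `tracing.py` may differ:

```python
# Sketch: logfire as the OTel ergonomics layer, Sentry as the backend.
import logfire
import sentry_sdk
from opentelemetry import propagate
from opentelemetry.baggage.propagation import W3CBaggagePropagator
from opentelemetry.propagators.composite import CompositePropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from sentry_sdk.integrations.opentelemetry import SentryPropagator, SentrySpanProcessor

sentry_sdk.init(dsn="https://example@o0.ingest.sentry.io/0", instrumenter="otel")

logfire.configure(
    send_to_logfire=False,  # no Logfire cloud export; spans flow to Sentry
    additional_span_processors=[SentrySpanProcessor()],
)

# Composite propagator: W3C headers for non-Sentry peers, sentry-trace/baggage
# for Sentry peers, both emitted on every outbound call.
propagate.set_global_textmap(
    CompositePropagator(
        [TraceContextTextMapPropagator(), W3CBaggagePropagator(), SentryPropagator()]
    )
)
```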
Changes
Tracing config
- `tracing.py` with `configure_tracing(service_name)` — wires up the OTel SDK, Sentry span processor, composite propagator, and a `_ContextVarSpanProcessor` that attaches `request_id`/`benchmark_id`/`task_id` contextvars to every span (mirrors the existing `sentry._before_send` pattern for events).
- `/health` excluded from request tracing.
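A sketch of the span-processor idea; attribute names come from the description, the implementation details are assumed:

```python
# Sketch: copy request-scoped contextvars onto every span at start time
# (shown for benchmark_id; request_id and task_id follow the same pattern).
from contextvars import ContextVar
from typing import Optional

from opentelemetry.context import Context
from opentelemetry.sdk.trace import Span, SpanProcessor

benchmark_id_var: ContextVar[Optional[str]] = ContextVar("benchmark_id", default=None)

class _ContextVarSpanProcessor(SpanProcessor):
    def on_start(self, span: Span, parent_context: Optional[Context] = None) -> None:
        benchmark_id = benchmark_id_var.get()
        if benchmark_id is not None:
            span.set_attribute("benchmark_id", benchmark_id)
```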
Cross-boundary propagation
- `/start-benchmark` and `/retry-or-resume-benchmark` `inject()` the current trace context into Taskiq labels before kicking.
- `TracingContextMiddleware` (Taskiq) extracts trace context from labels on the worker side, so the kicked `process_benchmark` run shows up as a child of the originating HTTP request span.
- `DAYTONA_SANDBOX_OTEL_EXTRA_LABELS` env var so the sandbox's internal OTel telemetry is filterable by `benchmark_id`/`task_id`/`environment` (Daytona's OTLP export is account-level, so environment tags matter for separating dev/staging/prod).
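A sketch of the label-based handoff; the Taskiq API shapes here are assumptions:

```python
# Sketch: serialize trace context into Taskiq labels on the tracker side and
# restore it on the worker side before the task body runs.
from opentelemetry import context as otel_context
from opentelemetry.propagate import extract, inject

# Tracker side, just before kicking the task:
labels: dict[str, str] = {}
inject(labels)  # writes traceparent/tracestate (plus baggage, sentry-trace)
# await process_benchmark.kicker().with_labels(**labels).kiq(benchmark_id)

# Worker side, inside the middleware, before task execution:
def restore_context(labels: dict[str, str]):
    token = otel_context.attach(extract(labels))
    return token  # detach with otel_context.detach(token) after the task
```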
Manual spans
- `process_benchmark` and `process_task` get top-level spans.
- Spans around `upload_agent`, `setup_task`, `run_agent`, `evaluate_instance`, `final_score`, `upload_results`, `create_log_group`, `upload_to_s3`, `send_notification`, `pty_disconnect`, `kill_pty_session`.
- `logfire.exception()` at the three `sentry_sdk.capture_exception()` sites (`process_benchmark`, `process_task`, `TrackedTask.run`).
Sentry config
- `init_sentry` switched to `INSTRUMENTER.OTEL` and `enable_logs=True`.
- `LoggingIntegration(level=None, event_level=None, sentry_logs_level=INFO)` — breadcrumb capture and auto-error events disabled (spans carry context; we `capture_exception` explicitly).
Infra
- `LOGFIRE_TOKEN` secret removed from `TrackerStack`/`WorkerStack` (no longer needed).
- `SENTRY_DSN`/`SENTRY_RELEASE`/`ENVIRONMENT` passed through from the host env for local testing.
Follow-ups
- Instrument `benchmark_service` with OTel + Sentry if we want visibility into its internals — right now it appears only as outbound httpx spans from the tracker side, so you can see how long a call took but not what it was doing.
Type of Change
Testing
Checklist