feat: add support for serializing WorkflowSpan input and output#472

Open
calebe94 wants to merge 4 commits into main from feature/54706-support-workflow-spans

@calebe94 calebe94 commented Feb 10, 2026

Shortcut:

Description:

This change adds support for properly serializing the input and output of a WorkflowSpan in the OpenTelemetry instrumentation. It handles both string and Message object formats, as well as sequences of Documents in the output.

The changes ensure that the full context of the workflow is captured in the OpenTelemetry spans, which is crucial for understanding the flow and debugging issues.

Additionally, the start_galileo_span context manager was updated to correctly handle WorkflowSpan instances.

Tests:

  • Unit Tests Added
  • E2E Test Added (if it's a user-facing feature, or fixing a bug)

Generated description

Below is a concise technical summary of changes proposed in this PR:

This pull request adds support for WorkflowSpan OpenTelemetry attribute mapping in the start_galileo_span() function. Previously, when a WorkflowSpan was passed to start_galileo_span(), no type-specific attributes were set, resulting in incomplete telemetry data. The implementation follows the existing pattern established by _set_retriever_span_attributes() and properly handles the polymorphic input and output types defined for WorkflowSpan.

Topic: WorkflowSpan OTel Attribute Mapping

Details: Adds OpenTelemetry attribute mapping for WorkflowSpan instances in start_galileo_span(). The implementation includes a new helper function _set_workflow_span_attributes() that correctly serializes polymorphic input types (Union[str, Sequence[Message]]) and output types (Union[str, Message, Sequence[Document], None]) into OpenTelemetry's gen_ai.input.messages and gen_ai.output.messages attributes. String values are wrapped in appropriate message objects with role metadata, Message objects are serialized using model_dump(exclude_none=True), and Document sequences are nested within an assistant message. None values for output are handled gracefully by skipping the attribute.

Modified files (2)
  • src/galileo/otel.py
  • tests/test_otel.py
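
To make the mapping described above concrete, here is a minimal sketch of the serialization rules, not the code from src/galileo/otel.py. The `Message` and `Document` classes are hypothetical stand-ins for the SDK models (plain dataclasses that imitate `model_dump(exclude_none=True)`), the function name `workflow_span_attributes` is made up for illustration, and JSON-encoding the attribute values is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional, Sequence, Union


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's Message model."""
    content: str
    role: str = "user"
    tool_call_id: Optional[str] = None

    def model_dump(self, exclude_none: bool = False) -> dict[str, Any]:
        data = asdict(self)
        return {k: v for k, v in data.items() if v is not None} if exclude_none else data


@dataclass
class Document:
    """Hypothetical stand-in for the SDK's Document model."""
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)

    def model_dump(self, exclude_none: bool = False) -> dict[str, Any]:
        data = asdict(self)
        return {k: v for k, v in data.items() if v is not None} if exclude_none else data


def workflow_span_attributes(
    span_input: Union[str, Sequence[Message]],
    span_output: Union[str, Message, Sequence[Document], None] = None,
) -> dict[str, str]:
    """Build the gen_ai.* attributes for a workflow span (illustrative only)."""
    # Plain strings are wrapped in a single user message.
    if isinstance(span_input, str):
        input_messages = [{"role": "user", "content": span_input}]
    else:
        input_messages = [m.model_dump(exclude_none=True) for m in span_input]
    # Assumption: attribute values are stored as JSON strings.
    attributes = {"gen_ai.input.messages": json.dumps(input_messages)}

    # A None output means the output attribute is skipped entirely.
    if span_output is None:
        return attributes
    if isinstance(span_output, str):
        output_messages = [{"role": "assistant", "content": span_output}]
    elif isinstance(span_output, Message):
        output_messages = [span_output.model_dump(exclude_none=True)]
    else:
        # A sequence of Documents is nested inside one assistant message.
        documents = [d.model_dump(exclude_none=True) for d in span_output]
        output_messages = [{"role": "assistant", "content": {"documents": documents}}]
    attributes["gen_ai.output.messages"] = json.dumps(output_messages)
    return attributes
```

Calling `workflow_span_attributes("question")` with no output returns only the `gen_ai.input.messages` key, which mirrors the None-output behavior described in the summary.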

Evidences

I created a demo script called workflow_span_demo.py to test these changes. To run it, we need to export the following credentials and settings:

export GALILEO_API_KEY="your-api-key-here"
export GALILEO_CONSOLE_URL="https://your-galileo-console-url.com"
export GALILEO_PROJECT="your-project-name"
export GALILEO_LOG_STREAM="your-log-stream-name"
export OPENAI_API_KEY="sk-openai-key"
export OPENAI_MODEL_NAME="gpt-4"
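
As a quick preflight before running the demo, a small check like the one below can report which of the variables above are still unset. This is a sketch; the helper name `missing_env_vars` is hypothetical, and treating all six variables as required is an assumption based on the list above.

```python
import os
from typing import Mapping

# Variable names taken from the export list above.
REQUIRED_ENV_VARS = (
    "GALILEO_API_KEY",
    "GALILEO_CONSOLE_URL",
    "GALILEO_PROJECT",
    "GALILEO_LOG_STREAM",
    "OPENAI_API_KEY",
    "OPENAI_MODEL_NAME",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```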

Inside the galileo-python directory, I created a new virtualenv so the project can be installed locally:

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# Or using pyenv
pyenv local 3.12.12
pyenv shell

Then I installed the dependencies:

# Install galileo-python package in editable mode
pip install -e .

# Or using poetry
poetry install --all-extras

# Install OpenTelemetry dependencies separately if needed
pip install opentelemetry-sdk opentelemetry-api opentelemetry-exporter-otlp galileo-sdk galileo

After all this, I ran the workflow_span_demo.py Python script and got the following output:

$ python examples/workflow_span_demo.py
2026-02-12 17:12:05 - __main__ - INFO - 
================================================================================
2026-02-12 17:12:05 - __main__ - INFO - WorkflowSpan OpenTelemetry Attribute Mapping Demo
2026-02-12 17:12:05 - __main__ - INFO - Feature: 54706 - Support Workflow Spans
2026-02-12 17:12:05 - __main__ - INFO - ================================================================================

2026-02-12 17:12:05 - __main__ - INFO - Setting up Galileo OpenTelemetry integration
2026-02-12 17:12:05 - httpx - INFO - HTTP Request: GET https://api-edimar-calebe-test.gcp-dev.galileo.ai/healthcheck "HTTP/1.1 200 OK"
2026-02-12 17:12:06 - httpx - INFO - HTTP Request: POST https://api-edimar-calebe-test.gcp-dev.galileo.ai/login/api_key "HTTP/1.1 200 OK"
2026-02-12 17:12:06 - httpx - INFO - HTTP Request: GET https://api-edimar-calebe-test.gcp-dev.galileo.ai/current_user "HTTP/1.1 200 OK"
2026-02-12 17:12:06 - __main__ - INFO - Galileo OpenTelemetry integration configured
2026-02-12 17:12:06 - __main__ - INFO - Galileo OpenTelemetry integration initialized
2026-02-12 17:12:06 - __main__ - INFO -   Console URL: https://api-edimar-calebe-test.gcp-dev.galileo.ai
2026-02-12 17:12:06 - __main__ - INFO -   Project: test
2026-02-12 17:12:06 - __main__ - INFO -   Log Stream: default
2026-02-12 17:12:06 - __main__ - INFO - 
2026-02-12 17:12:06 - __main__ - INFO - Running Demo 1: String Input/Output
2026-02-12 17:12:06 - __main__ - INFO - ✓ String input/output demo completed
2026-02-12 17:12:06 - __main__ - INFO - Running Demo 2: Message Input/Output
2026-02-12 17:12:06 - __main__ - INFO - ✓ Message input/output demo completed
2026-02-12 17:12:06 - __main__ - INFO - Running Demo 3: Document Sequence Output
2026-02-12 17:12:06 - __main__ - INFO - ✓ Document sequence output demo completed
2026-02-12 17:12:06 - __main__ - INFO - Running Demo 4: None Output (Edge Case)
2026-02-12 17:12:06 - __main__ - INFO - ✓ None output demo completed
2026-02-12 17:12:06 - __main__ - INFO - Running Demo 5: Nested Workflow Spans (Real-world RAG)
2026-02-12 17:12:06 - __main__ - INFO -   Parent span: rag-workflow started
2026-02-12 17:12:06 - __main__ - INFO -     Child span: retrieval started
2026-02-12 17:12:06 - __main__ - INFO -     Child span: retrieval completed
2026-02-12 17:12:06 - __main__ - INFO -     Child span: generation started
2026-02-12 17:12:06 - __main__ - INFO -     Child span: generation completed
2026-02-12 17:12:06 - __main__ - INFO - ✓ Nested workflow spans demo completed
2026-02-12 17:12:06 - __main__ - INFO - 
================================================================================
2026-02-12 17:12:06 - __main__ - INFO - All demos completed successfully!
2026-02-12 17:12:06 - __main__ - INFO - ================================================================================
2026-02-12 17:12:06 - __main__ - INFO - 
To use in production:
2026-02-12 17:12:06 - __main__ - INFO -   1. Set GALILEO_* environment variables
2026-02-12 17:12:06 - __main__ - INFO -   2. Initialize your OpenTelemetry provider
2026-02-12 17:12:06 - __main__ - INFO -   3. Use start_galileo_span() with WorkflowSpan instances
2026-02-12 17:12:06 - __main__ - INFO -   4. Spans are automatically exported to Galileo
2026-02-12 17:12:06 - __main__ - INFO - 
See src/galileo/otel.py for the implementation:
2026-02-12 17:12:06 - __main__ - INFO -   - _set_workflow_span_attributes() handles serialization
2026-02-12 17:12:06 - __main__ - INFO -   - start_galileo_span() dispatches to the correct handler
2026-02-12 17:12:06 - __main__ - INFO - 
🔄 Flushing spans to Galileo Console...
2026-02-12 17:12:07 - __main__ - INFO - ✅ Spans flushed successfully
2026-02-12 17:12:07 - __main__ - INFO - 
📊 Check your Galileo Console to view the exported traces:
2026-02-12 17:12:07 - __main__ - INFO -    URL: https://api-edimar-calebe-test.gcp-dev.galileo.ai
2026-02-12 17:12:07 - __main__ - INFO -    Project: test
2026-02-12 17:12:07 - __main__ - INFO -    Log Stream: default

What it demonstrates:

The demo script tests all polymorphic input/output types for WorkflowSpan:

  1. Demo 1: String Input/Output

    • Tests simplest case: both input and output are plain strings
    • Serialization wraps them in message format with role metadata
    • Input: "What is the capital of France?" → {"role": "user", "content": "..."}
    • Output: "The capital of France is Paris." → {"role": "assistant", "content": "..."}
  2. Demo 2: Message Input/Output

    • Tests using Message objects for input and output
    • Messages are serialized using model_dump(exclude_none=True) to preserve all properties
    • Includes role (system, user, assistant), content, and optional fields
    • Multiple messages can be in input sequence
  3. Demo 3: Document Sequence Output

    • Tests Document objects with metadata as output
    • Documents are wrapped in an assistant message with nested structure
    • Output format: {"role": "assistant", "content": {"documents": [...]}}
    • Documents include content and metadata (e.g., source, page)
  4. Demo 4: None Output (Edge Case)

    • Tests when output is None
    • Only gen_ai.input.messages attribute is set
    • gen_ai.output.messages attribute is skipped entirely
  5. Demo 5: Nested Workflow Spans (Real-world RAG)

    • Demonstrates practical scenario: RAG (Retrieval-Augmented Generation)
    • Parent span: Main workflow
    • Child span 1: Document retrieval
    • Child span 2: Response generation
    • OpenTelemetry automatically creates parent-child relationship
    • All spans properly linked in a single trace

@calebe94 calebe94 requested a review from a team as a code owner February 10, 2026 19:08
@calebe94 calebe94 self-assigned this Feb 10, 2026
@calebe94 calebe94 force-pushed the feature/54706-support-workflow-spans branch from e89351e to 45d9ed1 Compare February 10, 2026 19:41

codecov bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 95.45455% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.74%. Comparing base (114e24d) to head (49c732e).

Files with missing lines | Patch % | Lines
src/galileo/otel.py | 95.45% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #472      +/-   ##
==========================================
+ Coverage   82.61%   82.74%   +0.12%     
==========================================
  Files          96       96              
  Lines        9147     9168      +21     
==========================================
+ Hits         7557     7586      +29     
+ Misses       1590     1582       -8     

Additionally, the start_galileo_span context manager was updated to correctly handle WorkflowSpan instances.
- Uses the `model_construct` method to bypass a validator bug in `galileo_core`
- Leverages the `document_adapter` utility for consistent serialization of the output documents
@calebe94 calebe94 force-pushed the feature/54706-support-workflow-spans branch from 1c994c8 to d56204e Compare February 17, 2026 18:18
@calebe94 calebe94 requested a review from savula15 February 18, 2026 16:03