Bug: OpenTelemetry traces are not flushed on shutdown, dropping spans on SIGTERM #3095

@miparnisari

Description

SpiceDB never calls Shutdown() or ForceFlush() on the OpenTelemetry TracerProvider during process termination. As a result, spans buffered in the BatchSpanProcessor at the moment of SIGTERM/SIGINT are dropped. This is visible on slow traces, where the root span arrives at the collector but many of its inner spans are missing.

Likely root cause

The TracerProvider is constructed inside cobraotel.RunE() (cobraotel.go#L202-L206):

otel.SetTracerProvider(trace.NewTracerProvider(
    trace.WithSampler(trace.ParentBased(trace.TraceIDRatioBased(sampleRatio))),
    trace.WithBatcher(exporter),
    trace.WithResource(res),
))

The provider is registered globally and the returned reference is discarded, so SpiceDB holds no handle on which to call Shutdown() or ForceFlush(). When main returns, the BatchSpanProcessor goroutine is likely killed abruptly, and any spans it has not yet exported are lost.

Proposed fix

Drop the cobraotel dependency; have SpiceDB build, own and terminate the TracerProvider object.

Labels: kind/bug (Something is broken or regressed)