feat(tracing): integrate langsmith-go for LLM request tracing #612

Open
anhle128 wants to merge 7 commits into nextlevelbuilder:main from anhle128:feature/add-langsmith

@anhle128 commented Mar 31, 2026

Summary

Closes #611

Integrate langsmith-go SDK to export LLM call traces to LangSmith for AI-specific observability.

  • Add langsmithexport.Exporter implementing SpanExporter + SpanUpdateExporter interfaces
  • Map GoClaw span types to LangSmith run types (llm, tool, chain) with proper dotted_order hierarchy
  • Buffer child "running" spans for single-shot POST (avoids LangSmith RunUpdate validation failure on missing parent_run_id)
  • Remap root span IDs to TraceID to satisfy LangSmith run_id == trace_id constraint
  • Add LangSmithConfig with env var support (LANGSMITH_API_KEY, LANGSMITH_PROJECT, LANGSMITH_ENDPOINT)
  • Build-tag gated (-tags langsmith) — zero dependency footprint on default builds
  • Background pruning of orphaned pending/buffered entries (5min TTL)
  • Forward token usage, cost, model, provider, and metadata to LangSmith runs
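For reviewers unfamiliar with the run-type mapping, here is a minimal sketch of the idea. The span type constants are hypothetical stand-ins (the real GoClaw names live in the tracing package), and the fallback to `chain` for unknown types is an assumption, not necessarily what mapping.go does.

```go
package main

import "fmt"

// Hypothetical span type names; the actual GoClaw constants may differ.
const (
	SpanLLMCall  = "llm_call"
	SpanToolCall = "tool_call"
	SpanAgent    = "agent"
)

// runTypeFor maps a span type to one of the LangSmith run types named in
// this PR (llm, tool, chain). Unknown types fall back to "chain", which is
// an assumption made for this sketch.
func runTypeFor(spanType string) string {
	switch spanType {
	case SpanLLMCall:
		return "llm"
	case SpanToolCall:
		return "tool"
	default:
		return "chain"
	}
}

func main() {
	for _, st := range []string{SpanLLMCall, SpanToolCall, SpanAgent} {
		fmt.Printf("%s -> %s\n", st, runTypeFor(st))
	}
}
```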

Files changed

| Area | Files |
| --- | --- |
| Exporter | internal/tracing/langsmithexport/exporter.go, mapping.go |
| Config | internal/config/config.go, config_load.go, config_secrets.go |
| Build tags | cmd/gateway_langsmith.go, cmd/gateway_langsmith_noop.go |
| Wiring | cmd/gateway.go, internal/tracing/collector.go |

Configuration

export LANGSMITH_API_KEY="lsv2_..."
export LANGSMITH_PROJECT="my-project"       # optional, default: "default"
export LANGSMITH_ENDPOINT="https://..."     # optional, default: LangSmith cloud
go build -tags langsmith -o goclaw .
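As a rough illustration of how these variables could be resolved, the sketch below applies the documented defaults and masks the key for display. `loadLangSmithConfig`, `maskKey`, and the struct layout are hypothetical names for this example, and the default endpoint URL is an assumption standing in for "LangSmith cloud".

```go
package main

import (
	"fmt"
	"os"
)

// LangSmithConfig is a simplified stand-in for the config struct in this PR.
type LangSmithConfig struct {
	APIKey   string
	Project  string
	Endpoint string
}

// defaultEndpoint is assumed here to be the LangSmith cloud API URL.
const defaultEndpoint = "https://api.smith.langchain.com"

// loadLangSmithConfig reads the LANGSMITH_* variables through getenv and
// fills in defaults. The bool reports whether an API key was provided.
func loadLangSmithConfig(getenv func(string) string) (LangSmithConfig, bool) {
	pick := func(key, fallback string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return fallback
	}
	cfg := LangSmithConfig{
		APIKey:   getenv("LANGSMITH_API_KEY"),
		Project:  pick("LANGSMITH_PROJECT", "default"),
		Endpoint: pick("LANGSMITH_ENDPOINT", defaultEndpoint),
	}
	return cfg, cfg.APIKey != ""
}

// maskKey keeps a short prefix and hides the rest for config display.
func maskKey(key string) string {
	if len(key) <= 4 {
		return "***"
	}
	return key[:4] + "***"
}

func main() {
	cfg, enabled := loadLangSmithConfig(os.Getenv)
	fmt.Printf("enabled=%v project=%s endpoint=%s key=%s\n",
		enabled, cfg.Project, cfg.Endpoint, maskKey(cfg.APIKey))
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader easy to test without touching the process environment.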

Test plan

  • Build without tag: go build ./... — compiles, no langsmith-go dependency
  • Build with tag: go build -tags langsmith ./... — compiles and links
  • Set LANGSMITH_API_KEY and verify runs appear in LangSmith UI
  • Verify LLM calls → llm runs with model, tokens, cost
  • Verify tool calls → tool runs with tool name and input/output
  • Verify agent spans → chain runs wrapping child runs
  • Verify run tree hierarchy renders correctly (parent-child dotted_order)
  • Verify two-phase spans complete without orphaned entries
  • Verify graceful shutdown flushes pending runs
  • Verify API key masked in config display

Implement LangSmith trace export integration:
- exporter.go: Core LangSmith API client with batch run submission
- mapping.go: Token count mapping (input/output → prompt/completion)
Supports API URL configuration and per-project trace isolation.

Implement build-tag gated initialization:
- gateway_langsmith.go: Initialize and wire LangSmith exporter (with -tags langsmith)
- gateway_langsmith_noop.go: No-op stub for default builds
Allows optional LangSmith integration without coupling to core gateway.
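A minimal sketch of the build-tag pattern, showing what the no-op side could look like. The function name `newLangSmithExporter` and the `SpanExporter` method set are hypothetical; the tagged counterpart file would carry `//go:build langsmith` and define the same function with a real exporter behind it.

```go
//go:build !langsmith

// Sketch of cmd/gateway_langsmith_noop.go (compiled in default builds).
// The counterpart file, built only with -tags langsmith, would declare the
// same function and import the langsmith-go SDK.
package main

import "fmt"

// SpanExporter is a hypothetical stand-in for the tracing interface.
type SpanExporter interface {
	Shutdown() error
}

// newLangSmithExporter returns nil in default builds, so the gateway can
// call it unconditionally and skip registration when nothing comes back.
func newLangSmithExporter(apiKey, project string) SpanExporter {
	return nil
}

func main() {
	if exp := newLangSmithExporter("lsv2_...", "default"); exp == nil {
		fmt.Println("langsmith exporter disabled (built without -tags langsmith)")
	}
}
```

Because only the tagged file imports the SDK, the default build never resolves that dependency.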
Config changes:
- Add LangSmith struct with api_key, project, api_url fields
- Load from LANGSMITH_* env vars with config file overlay
- Mask API key in sensitive output

Collector changes:
- Add exporter management via AddExporter interface
- Graceful shutdown of all exporters
- Pre-initialized exporter list

Initializes LangSmith exporter on gateway startup per config.

Buffer child "running" spans for deferred single-shot POST instead of
two-phase POST+PATCH which fails LangSmith validation (RunUpdate lacks
ParentRunID). Track run ID mappings via runMap for proper dotted_order
hierarchy across batches. Remap root span IDs to TraceID to satisfy
LangSmith's run_id == trace_id constraint.
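
The flow described above can be sketched roughly as follows. All types here (`Span`, `RunCreate`, `Exporter`) are simplified, hypothetical stand-ins and not the langsmith-go API; `post` represents the single POST call.

```go
package main

import "fmt"

// Span is a simplified, hypothetical view of a GoClaw span.
type Span struct {
	ID, ParentID, TraceID string
	Status                string // "running" or "completed"
	Output                string
}

// RunCreate is a simplified stand-in for the LangSmith create payload.
type RunCreate struct {
	ID, ParentRunID, TraceID string
	Output                   string
}

type Exporter struct {
	buffered map[string]*RunCreate // child runs held until completion
	runMap   map[string]string     // span ID -> LangSmith run ID
	post     func(RunCreate)       // stands in for the single POST
}

func newExporter(post func(RunCreate)) *Exporter {
	return &Exporter{
		buffered: map[string]*RunCreate{},
		runMap:   map[string]string{},
		post:     post,
	}
}

// toRun remaps root span IDs to the trace ID (run_id == trace_id) and
// rewrites the parent reference through runMap.
func (e *Exporter) toRun(s Span) RunCreate {
	runID := s.ID
	if s.ParentID == "" {
		runID = s.TraceID
	}
	e.runMap[s.ID] = runID
	parent := s.ParentID
	if mapped, ok := e.runMap[s.ParentID]; ok {
		parent = mapped
	}
	return RunCreate{ID: runID, ParentRunID: parent, TraceID: s.TraceID, Output: s.Output}
}

// ExportSpan buffers running child spans and posts everything else.
func (e *Exporter) ExportSpan(s Span) {
	run := e.toRun(s)
	if s.Status == "running" && s.ParentID != "" {
		e.buffered[s.ID] = &run
		return
	}
	e.post(run)
}

// UpdateSpan completes a buffered child and sends it in one POST.
// Root spans would take the PATCH path, which this sketch omits.
func (e *Exporter) UpdateSpan(spanID, output string) {
	run, ok := e.buffered[spanID]
	if !ok {
		return
	}
	run.Output = output
	delete(e.buffered, spanID)
	e.post(*run)
}

func main() {
	e := newExporter(func(r RunCreate) {
		fmt.Printf("POST run=%s parent=%q output=%q\n", r.ID, r.ParentRunID, r.Output)
	})
	e.ExportSpan(Span{ID: "s1", TraceID: "t1", Status: "running"})
	e.ExportSpan(Span{ID: "s2", ParentID: "s1", TraceID: "t1", Status: "running"})
	e.UpdateSpan("s2", "done")
}
```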

Closes nextlevelbuilder#611
@mrgoonie

🔍 Code Review — PR #612

Nice work @anhle128! LangSmith integration for LLM observability is a feature the team has needed for a long time.

✅ What's Good

1. Clean architecture

  • Build-tag gated (-tags langsmith) → zero dependency footprint when unused. This pattern is identical to the OTel exporter, so it is easy to maintain.
  • The SpanUpdateExporter interface extension is smart: it allows two-phase export (POST + PATCH) without breaking backward compatibility.
  • Multiple-exporter support ([]SpanExporter) → fans the same spans out to OTel and LangSmith at the same time.
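
For context, a minimal sketch of the fan-out idea as I understand it. The `ExportSpans` method and the `Collector` layout are hypothetical simplifications; only `AddExporter` and the `[]SpanExporter` slice are taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

type Span struct{ Name string }

// SpanExporter is a hypothetical, simplified version of the interface.
type SpanExporter interface {
	ExportSpans(spans []Span) error
}

type Collector struct {
	exporters []SpanExporter
}

func (c *Collector) AddExporter(e SpanExporter) {
	c.exporters = append(c.exporters, e)
}

// export sends the same batch to every exporter. A failure in one exporter
// does not stop the others; all errors are joined and returned together.
func (c *Collector) export(spans []Span) error {
	var errs []error
	for _, e := range c.exporters {
		if err := e.ExportSpans(spans); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// countingExporter is a test double that counts spans or fails on demand.
type countingExporter struct {
	name string
	n    int
	fail bool
}

func (c *countingExporter) ExportSpans(spans []Span) error {
	if c.fail {
		return fmt.Errorf("%s: export failed", c.name)
	}
	c.n += len(spans)
	return nil
}

func main() {
	otel := &countingExporter{name: "otel"}
	ls := &countingExporter{name: "langsmith", fail: true}
	c := &Collector{}
	c.AddExporter(otel)
	c.AddExporter(ls)
	err := c.export([]Span{{Name: "llm"}, {Name: "tool"}})
	fmt.Println("otel spans:", otel.n, "error:", err)
}
```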

2. LangSmith-specific constraints handled correctly

  • Root span ID remapping: run_id == trace_id, an important constraint that LangSmith requires.
  • dotted_order format matches the spec: YYYYMMDDTHHMMSSffffffZ<uuid>, which ensures the hierarchy renders correctly in the UI.
  • Child running spans are buffered → single-shot POST instead of separate POST + PATCH (avoids the validation failure caused by a missing parent_run_id).
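
To make the dotted_order format concrete, here is a small sketch of how one segment and a parent/child chain could be built. The helper names are hypothetical, and joining segments with "." is my understanding of the LangSmith convention rather than something copied from this PR's mapping.go.

```go
package main

import (
	"fmt"
	"time"
)

// exampleStart is a fixed timestamp used only for the demonstration.
var exampleStart = time.Date(2026, 3, 31, 12, 0, 0, 123456000, time.UTC)

// dottedSegment formats one segment as YYYYMMDDTHHMMSSffffffZ<uuid>,
// where ffffff is the microsecond part of the start time in UTC.
func dottedSegment(start time.Time, runID string) string {
	u := start.UTC()
	return fmt.Sprintf("%s%06dZ%s", u.Format("20060102T150405"), u.Nanosecond()/1000, runID)
}

// dottedOrder appends this run's segment to the parent's dotted_order.
// A root run (empty parent) consists of its own segment only.
func dottedOrder(parent string, start time.Time, runID string) string {
	seg := dottedSegment(start, runID)
	if parent == "" {
		return seg
	}
	return parent + "." + seg
}

func main() {
	root := dottedOrder("", exampleStart, "11111111-1111-1111-1111-111111111111")
	child := dottedOrder(root, exampleStart.Add(time.Second), "22222222-2222-2222-2222-222222222222")
	fmt.Println(root)
	fmt.Println(child)
}
```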

3. Background pruning

  • 5min TTL for orphaned pending/buffered entries → avoids memory leaks.
  • Orphaned child buffers are flushed as incomplete runs → better than losing them.
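
A rough sketch of the pruning step as I read it, for anyone following along. `pendingEntry` and the `pruneStale` signature are hypothetical simplifications; the real function presumably runs on a ticker under the exporter's lock.

```go
package main

import (
	"fmt"
	"time"
)

// pendingTTL matches the 5min TTL mentioned in the PR description.
const pendingTTL = 5 * time.Minute

// pendingEntry is a hypothetical buffered or pending run.
type pendingEntry struct {
	runID     string
	createdAt time.Time
}

// pruneStale removes entries older than ttl and returns them, so the
// caller can flush them as incomplete runs instead of dropping them.
func pruneStale(pending map[string]pendingEntry, now time.Time, ttl time.Duration) []pendingEntry {
	var stale []pendingEntry
	for id, e := range pending {
		if now.Sub(e.createdAt) > ttl {
			stale = append(stale, e)
			delete(pending, id) // deleting during range is safe in Go
		}
	}
	return stale
}

func main() {
	now := time.Now()
	pending := map[string]pendingEntry{
		"old":   {runID: "old", createdAt: now.Add(-10 * time.Minute)},
		"fresh": {runID: "fresh", createdAt: now.Add(-1 * time.Minute)},
	}
	stale := pruneStale(pending, now, pendingTTL)
	fmt.Println("flushed as incomplete:", len(stale), "still pending:", len(pending))
}
```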

4. Config + env var support

  • LANGSMITH_API_KEY, LANGSMITH_PROJECT, LANGSMITH_ENDPOINT: follows the LangSmith convention.
  • API key masked in config display → good security practice.

5. Detailed test plan

  • Covers the essentials: building with/without the tag, hierarchy verification, token usage forwarding, graceful shutdown.

⚠️ Suggestions (Nice to Have)

1. pendingPruneFreq / pendingTTL: consider making configurable

The hard-coded 5min TTL + 2min prune frequency may not suit every deployment. Consider adding config options:

type LangSmithConfig struct {
    APIKey       string
    Project      string
    APIUrl       string
    PendingTTL   time.Duration // default: 5m
    PruneFreq    time.Duration // default: 2m
}

2. Error retry logic

Currently a failed export only logs a warning:

if err := e.client.CreateRun(rc); err != nil {
    slog.Warn("langsmith: failed to create run", ...)
}

Consider adding retry with exponential backoff for transient errors (network timeout, 5xx). The LangSmith SDK may already have built-in retry; check whether it can be enabled.
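
If the SDK does not already cover this, something along these lines might be enough. This is only a sketch: `retry`, `errTransient`, and `errPermanent` are hypothetical names, and real code would classify errors from the HTTP response rather than use sentinel values.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors standing in for real error classification
// (network timeouts and 5xx would count as transient).
var (
	errTransient = errors.New("transient failure")
	errPermanent = errors.New("permanent failure")
)

// retry calls fn up to attempts times. It retries only transient errors
// and doubles the delay after each failed attempt.
func retry(attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := base
	var err error
	for i := 0; i < attempts; i++ {
		err = fn()
		if err == nil || !errors.Is(err, errTransient) {
			return err // success, or an error not worth retrying
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Since export already happens off the request path, a small attempt count with a capped delay is probably sufficient; adding jitter would be a reasonable refinement.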

3. Metrics/telemetry cho exporter

Add counters for:

  • Spans exported success/failed
  • Pruned entries count
  • Buffered child spans count

This helps with debugging when issues occur in production.
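
A lightweight way to do this without pulling in a metrics library could look like the sketch below. `exporterStats` and its fields are hypothetical names; how the values get surfaced (logs, an endpoint, OTel metrics) is left open.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// exporterStats holds lock-free counters that the exporter could bump
// from its export, prune, and buffer code paths.
type exporterStats struct {
	exported atomic.Int64
	failed   atomic.Int64
	pruned   atomic.Int64
	buffered atomic.Int64 // current number of buffered child spans
}

// statsSnapshot is a plain copy that is safe to log or serialize.
type statsSnapshot struct {
	Exported, Failed, Pruned, Buffered int64
}

func (s *exporterStats) snapshot() statsSnapshot {
	return statsSnapshot{
		Exported: s.exported.Load(),
		Failed:   s.failed.Load(),
		Pruned:   s.pruned.Load(),
		Buffered: s.buffered.Load(),
	}
}

func main() {
	var s exporterStats
	var wg sync.WaitGroup
	// Simulate 100 concurrent exports where every tenth one fails.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if i%10 == 0 {
				s.failed.Add(1)
			} else {
				s.exported.Add(1)
			}
		}(i)
	}
	wg.Wait()
	fmt.Printf("%+v\n", s.snapshot())
}
```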

4. Comment on runMap cleanup

runMap has no TTL cleanup in pruneStale; entries are only removed when createdAt is too old. But if many spans arrive in a short time, the map can grow large. Consider:

  • Adding a max size limit
  • Or more aggressive cleanup (e.g., only keep the last N hours)
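
For the max size option, a simple insertion-order eviction might be enough. The sketch below is hypothetical (`boundedRunMap` is not in the PR) and is not safe for concurrent use on its own; it would need to sit behind the exporter's existing mutex.

```go
package main

import "fmt"

// boundedRunMap maps span IDs to run IDs and evicts the oldest entries
// once max is exceeded (FIFO by insertion order).
type boundedRunMap struct {
	max   int
	order []string          // span IDs, oldest first
	ids   map[string]string // span ID -> run ID
}

func newBoundedRunMap(max int) *boundedRunMap {
	if max < 1 {
		max = 1
	}
	return &boundedRunMap{max: max, ids: make(map[string]string)}
}

// Put stores the mapping and evicts the oldest entries beyond max.
// Updating an existing key does not change its position in the order.
func (m *boundedRunMap) Put(spanID, runID string) {
	if _, exists := m.ids[spanID]; !exists {
		m.order = append(m.order, spanID)
	}
	m.ids[spanID] = runID
	for len(m.order) > m.max {
		oldest := m.order[0]
		m.order = m.order[1:]
		delete(m.ids, oldest)
	}
}

func (m *boundedRunMap) Get(spanID string) (string, bool) {
	runID, ok := m.ids[spanID]
	return runID, ok
}

func (m *boundedRunMap) Len() int { return len(m.ids) }

func main() {
	m := newBoundedRunMap(2)
	m.Put("s1", "r1")
	m.Put("s2", "r2")
	m.Put("s3", "r3") // evicts s1
	_, ok := m.Get("s1")
	fmt.Println("s1 present:", ok, "size:", m.Len())
}
```

The trade-off is that a very long-running trace could lose its parent mapping if it is evicted before late children arrive, so the limit should be generous.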

5. Test case for applySpanUpdate

A unit test should be added to verify that applySpanUpdate merges updates into the buffered RunCreate correctly without overwriting existing fields.
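
To show the behavior such a test would pin down, here is a self-contained sketch. The `RunCreate` and `SpanUpdate` shapes and this `applySpanUpdate` body are hypothetical simplifications, not the code in this PR; the point is the "fill only what is empty" semantics.

```go
package main

import "fmt"

// RunCreate and SpanUpdate are simplified, hypothetical shapes.
type RunCreate struct {
	Name    string
	EndTime string
	Error   string
	Outputs map[string]string
}

type SpanUpdate struct {
	EndTime string
	Error   string
	Outputs map[string]string
}

// applySpanUpdate copies values from upd into run only where run has no
// value yet, so fields set at buffer time are never overwritten.
func applySpanUpdate(run *RunCreate, upd SpanUpdate) {
	if run.EndTime == "" {
		run.EndTime = upd.EndTime
	}
	if run.Error == "" {
		run.Error = upd.Error
	}
	if len(upd.Outputs) > 0 && run.Outputs == nil {
		run.Outputs = make(map[string]string, len(upd.Outputs))
	}
	for k, v := range upd.Outputs {
		if _, exists := run.Outputs[k]; !exists {
			run.Outputs[k] = v
		}
	}
}

func main() {
	run := RunCreate{Name: "tool:search", Outputs: map[string]string{"partial": "kept"}}
	applySpanUpdate(&run, SpanUpdate{
		EndTime: "2026-03-31T12:00:01Z",
		Outputs: map[string]string{"partial": "ignored", "result": "ok"},
	})
	fmt.Printf("%+v\n", run)
}
```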

🎯 Verdict

Approve. The implementation is solid, the architecture is clean, and it handles LangSmith-specific constraints correctly. Ready to merge now; the suggestions above can be done in a follow-up PR.

Link: #612

Development

Successfully merging this pull request may close these issues.

feat: integration with langsmith-go for tracing request LLM

2 participants