feat(observability): opt-in LangFuse callback for litellm calls #198
taxfree-python wants to merge 1 commit into huggingface:main
Adds an env-gated hook that registers litellm's LangFuse OTEL callback when the host and both keys are set. With any var unset the integration is a no-op and behavior is unchanged.

The host is mandatory by design so the destination is always an explicit choice (self-hosted or SaaS), with no silent fallback to litellm's default endpoint.

Either env-var name is accepted: Langfuse SDK v4's docs issue credentials as `LANGFUSE_BASE_URL` while litellm reads `LANGFUSE_HOST`, so we mirror `BASE_URL` into `HOST` when only the former is set.

Uses litellm's `langfuse_otel` callback rather than the legacy `langfuse` one, which breaks against Langfuse SDK v4 with `module 'langfuse' has no attribute 'version'`; the OTEL path works against both v3 and v4.

The HF-Dataset-based primary telemetry pipeline is untouched; this is an additional side channel.

Refs huggingface#196.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
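The gate, the `BASE_URL` to `HOST` mirroring, and the idempotent registration described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `setup_langfuse()`: the signature, the `callbacks` list argument (a stand-in for litellm's callback list so the sketch runs without litellm installed), and the choice to treat empty strings as unset are all assumptions.

```python
import os
from typing import List, MutableMapping, Optional

REQUIRED_VARS = ("LANGFUSE_HOST", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
CALLBACK_NAME = "langfuse_otel"


def setup_langfuse(
    callbacks: List[str],
    env: Optional[MutableMapping[str, str]] = None,
) -> bool:
    """Register the callback if the gate passes; return True when active.

    `callbacks` is a hypothetical stand-in for litellm's callback list.
    """
    env = os.environ if env is None else env

    # Mirror LANGFUSE_BASE_URL into LANGFUSE_HOST when only the former is set.
    if not env.get("LANGFUSE_HOST") and env.get("LANGFUSE_BASE_URL"):
        env["LANGFUSE_HOST"] = env["LANGFUSE_BASE_URL"]

    # Gate: host and both keys must be present and non-empty (assumption:
    # an empty string counts as unset).
    if not all(env.get(name) for name in REQUIRED_VARS):
        return False  # no-op, nothing is registered

    # Idempotent: a repeated call must not add a duplicate entry.
    if CALLBACK_NAME not in callbacks:
        callbacks.append(CALLBACK_NAME)
    return True
```

Called with only the two keys set, the function returns `False` and leaves `callbacks` untouched, which is the "no silent fallback" property the commit message argues for.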
Force-pushed from 226f3d5 to 63598bb.
Force-pushed an update with two changes after end-to-end testing against a self-hosted Langfuse instance:
Verified end-to-end with
Closed per maintainer request.
Summary
Refs #196.
Adds a small, env-gated hook that registers litellm's native LangFuse callback when `LANGFUSE_HOST` + `LANGFUSE_PUBLIC_KEY` + `LANGFUSE_SECRET_KEY` are all set. With any of the three unset, the integration is a no-op and runtime behavior is identical to today.

The primary HF-Dataset-based telemetry pipeline (`agent/core/telemetry.py`) is unchanged; this is purely an additional side channel for operators who want LLM traces in their own LangFuse instance.

Why mandatory `LANGFUSE_HOST`

litellm's default LangFuse host is `cloud.langfuse.com`. If the gate were just `PUBLIC_KEY + SECRET_KEY`, an operator who only meant to set keys for a self-hosted instance, but forgot `LANGFUSE_HOST`, would silently exfiltrate prompts to a third-party SaaS. Requiring `LANGFUSE_HOST` makes "where does the data go" a conscious decision. Both deployment shapes are supported:

- `LANGFUSE_HOST=https://langfuse.internal.example.com`
- `LANGFUSE_HOST=https://cloud.langfuse.com`

Scope (intentionally minimal)

- `agent/core/observability.py` with `setup_langfuse()` (env-gated, idempotent)
- `agent/config.py:load_config()`: covers both CLI (`agent/main.py`) and backend (`backend/session_manager.py` module-init) without a separate hook
- `pyproject.toml`: new `[observability]` optional-dep group; `langfuse` is not a hard dep
- `agent/README.md`: new "Observability (optional)" section
- `tests/unit/test_observability.py`: gate-passes / gate-blocks / idempotent

Out of scope (deferred to a follow-up if there's interest): forwarding the existing `kind` tag from PR #179, plus `session.user_id` / `session.session_id`, as `metadata={...}` to each litellm call. That's a mechanical change touching the 7 `acompletion` call sites and is much easier to review on its own.

Test plan

- `pytest tests/unit/test_observability.py -v`: 5 cases pass (1 gate-passes, 3 parametrized gate-blocks, 1 idempotent)
- `pytest tests/unit/test_config.py`: existing config tests still pass (no `load_config` semantics change when env vars unset)
- `LANGFUSE_*` unset, run the agent: no log lines about LangFuse, no errors

Note on CI

The `claude-code-action` review workflow fails on every external-fork PR (`pull_request` triggers don't grant write tokens to fork actors; confirmed across recent fork PRs). The red `review` check is expected and not contributor-fixable; the change itself does not alter any workflow.
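For reviewers who want a feel for the three case families named in the test plan (gate-passes, gate-blocks, idempotent), here is a rough sketch of how such tests could be shaped. It is not the content of `tests/unit/test_observability.py`: `register()` is a hypothetical stand-in for `setup_langfuse()`, the key values are placeholders, and stdlib `unittest` with `subTest` is used in place of the PR's parametrized pytest cases so the sketch runs without extra dependencies.

```python
import unittest
from typing import Dict, List

FULL_ENV = {
    "LANGFUSE_HOST": "https://langfuse.internal.example.com",
    "LANGFUSE_PUBLIC_KEY": "pk-example",  # placeholder value
    "LANGFUSE_SECRET_KEY": "sk-example",  # placeholder value
}


def register(callbacks: List[str], env: Dict[str, str]) -> bool:
    """Hypothetical stand-in for setup_langfuse(); not the PR's code."""
    if not all(env.get(name) for name in FULL_ENV):
        return False
    if "langfuse_otel" not in callbacks:
        callbacks.append("langfuse_otel")
    return True


class GateTests(unittest.TestCase):
    def test_gate_passes(self):
        callbacks: List[str] = []
        self.assertTrue(register(callbacks, dict(FULL_ENV)))
        self.assertEqual(callbacks, ["langfuse_otel"])

    def test_gate_blocks_when_any_var_missing(self):
        # One sub-case per required variable, mirroring the three
        # parametrized gate-blocks cases.
        for missing in FULL_ENV:
            with self.subTest(missing=missing):
                env = {k: v for k, v in FULL_ENV.items() if k != missing}
                callbacks: List[str] = []
                self.assertFalse(register(callbacks, env))
                self.assertEqual(callbacks, [])

    def test_idempotent(self):
        callbacks: List[str] = []
        env = dict(FULL_ENV)
        register(callbacks, env)
        register(callbacks, env)
        self.assertEqual(callbacks, ["langfuse_otel"])
```

Saved as a module, the cases can be run with `python -m unittest`.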