
[codex] Add cost fields to LLM usage analytics #721

Merged
rahulkarajgikar merged 2 commits into truffle-ai:main from rahulkarajgikar:cost-metrics
Apr 14, 2026

Conversation


rahulkarajgikar commented Apr 14, 2026

Summary

  • add estimated USD cost fields and flattened cost-breakdown fields to the shared dexto_llm_tokens_consumed analytics payload
  • propagate costBreakdown through core llm:response pricing metadata so TUI/WebUI analytics can forward cost data without duplicating pricing logic
  • preserve downstream usage delivery by teaching the server subscribers to forward or reuse the emitted cost breakdown
  • add focused regression coverage for core, TUI, and WebUI analytics emission paths
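The flattening described in the first two bullets can be sketched as a small mapping from the core cost breakdown onto the analytics payload fields. This is a hypothetical sketch: the breakdown keys (inputUsd, outputUsd, …) are taken from the review comments below, and the real types in packages/analytics/src/events.ts may differ.

```typescript
// Hypothetical shapes; the real types live in the dexto packages and may differ.
interface TokenUsageCostBreakdown {
  inputUsd?: number;
  outputUsd?: number;
  reasoningUsd?: number;
  cacheReadUsd?: number;
  cacheWriteUsd?: number;
}

// Flatten the per-category breakdown into the optional USD fields added
// to the dexto_llm_tokens_consumed analytics payload.
function flattenCostFields(
  estimatedCostUsd: number | undefined,
  breakdown: TokenUsageCostBreakdown | undefined
) {
  return {
    estimatedCostUsd,
    inputCostUsd: breakdown?.inputUsd,
    outputCostUsd: breakdown?.outputUsd,
    reasoningCostUsd: breakdown?.reasoningUsd,
    cacheReadCostUsd: breakdown?.cacheReadUsd,
    cacheWriteCostUsd: breakdown?.cacheWriteUsd,
  };
}
```

Because every field is optional, consumers that predate this change see `undefined` cost fields rather than a breaking shape change.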

Why

Core already computes per-response LLM pricing, but the cross-platform usage analytics event only carried token counts. That left PostHog-style usage metrics blind to cost even though the information already existed in the runtime.

Impact

  • LLM usage analytics now include total estimated USD cost plus per-bucket input/output/reasoning/cache cost fields when pricing is available
  • priced responses with non-input/output token buckets are no longer dropped from analytics just because inputTokens/outputTokens are zero
  • the generated OpenAPI document is synced so repo checks no longer fail on stale version metadata

Validation

  • pnpm exec vitest run packages/core/src/llm/executor/stream-processor.test.ts packages/webui/lib/events/handlers.test.ts packages/tui/src/services/processStream.test.ts
  • bash scripts/quality-checks.sh lint
  • bash scripts/quality-checks.sh typecheck
  • bash scripts/quality-checks.sh hono-inference
  • ./scripts/quality-checks.sh reaches the build and OpenAPI steps successfully, but the full test step is currently blocked by unrelated CLI test instability/timeouts in packages/cli/src/cli/utils/config-validation.test.ts and packages/cli/src/cli/modes/cli.test.ts
  • pnpm exec vitest run packages/cli/src/cli/modes/cli.test.ts packages/cli/src/cli/utils/config-validation.test.ts passed when rerun in isolation

Summary by CodeRabbit

  • New Features
    • LLM usage analytics now display estimated USD costs
    • Cost breakdown by category: input tokens, output tokens, reasoning operations, and cache read/write operations
    • Cost metrics available across CLI and WebUI interfaces


coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

This PR extends LLM usage analytics across the dexto monorepo by adding cost breakdown fields to event types and propagating them through core events, server handlers, and CLI/WebUI analytics capture pipelines. USD cost estimation now includes per-category breakdowns (input, output, reasoning, cache operations) alongside total estimated costs.

Changes

  • Release & Metadata (.changeset/few-apes-judge.md, docs/static/openapi/openapi.json): Changeset documenting patch releases with cost field additions; OpenAPI spec version bumped to 1.6.22.
  • Analytics Event Types (packages/analytics/src/events.ts): Extended LLMTokensConsumedEvent with optional USD cost fields: estimatedCostUsd, inputCostUsd, outputCostUsd, reasoningCostUsd, cacheReadCostUsd, cacheWriteCostUsd.
  • Core Event Infrastructure (packages/core/src/events/index.ts, packages/core/src/llm/executor/stream-processor.ts, packages/core/src/llm/executor/stream-processor.test.ts): Added a costBreakdown?: TokenUsageCostBreakdown field to the llm:response payload in AgentEventMap and SessionEventMap; updated StreamProcessor.emitLLMResponse() to conditionally emit the cost breakdown; tests updated to expect a cost breakdown object.
  • LLM Metadata & Cost Calculation (packages/core/src/llm/usage-metadata.ts): Replaced calculateCost() with calculateCostBreakdown() and extended LLMUsagePricingMetadata with an optional costBreakdown field containing per-category USD costs.
  • Server-Side Event Handling (packages/server/src/events/a2a-sse-subscriber.ts, packages/server/src/events/usage-event-subscriber.ts): Updated the SSE subscriber to stream costBreakdown in task messages; modified the usage event subscriber to prefer an explicit payload.costBreakdown over a derived breakdown and to support cost data propagation.
  • CLI Analytics Capture (packages/tui/src/services/processStream.ts, packages/tui/src/services/processStream.test.ts): Introduced a hasMeaningfulTokenUsageForAnalytics() helper to gate analytics on either estimatedCost presence or non-zero tokens; expanded the dexto_llm_tokens_consumed payload with USD cost fields; updated token usage field access to use optional chaining.
  • WebUI Analytics Capture (packages/webui/lib/events/handlers.ts, packages/webui/lib/events/handlers.test.ts): Introduced a hasMeaningfulTokenUsageForAnalytics() helper; extended handleLLMResponse() to capture and map estimatedCost and costBreakdown fields to USD cost analytics; updated token usage field access with optional chaining; added test coverage for cost field mapping.
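The "cost OR token" gating introduced for the CLI and WebUI capture paths can be sketched roughly as follows. This is a hypothetical sketch based on the behavior described here and in the review comments below; the actual helpers in processStream.ts and handlers.ts may differ in signature and null handling.

```typescript
// Hypothetical token usage shape; real types come from the core event payloads.
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
  reasoningTokens?: number;
  totalTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Emit analytics when a cost estimate exists, even if every token bucket
// is zero, or when any token bucket is non-zero. This is what keeps priced
// responses with only cache/reasoning activity from being dropped.
function hasMeaningfulTokenUsageForAnalytics(
  tokenUsage: TokenUsage | undefined,
  estimatedCost: number | undefined
): boolean {
  if (estimatedCost !== undefined) return true;
  if (!tokenUsage) return false;
  return Object.values(tokenUsage).some((v) => typeof v === "number" && v > 0);
}
```

A helper like this makes the behavioral difference from a token-only hasMeaningfulTokenUsage check explicit: cost presence alone is now sufficient to emit.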

Sequence Diagram

sequenceDiagram
    participant LLM as LLM Provider
    participant Core as Core Executor
    participant Meta as Usage Metadata
    participant Events as Event Emitters
    participant Server as Server Handlers
    participant Analytics as Analytics Capture

    LLM->>Core: Response with tokens
    Core->>Meta: Calculate cost breakdown
    Meta->>Meta: calculateCostBreakdown()<br/>(per-category USD costs)
    Core->>Events: emitLLMResponse(event, config)
    Events->>Events: Include costBreakdown<br/>in llm:response payload
    
    Events->>Server: llm:response event
    Server->>Server: Extract costBreakdown<br/>from payload
    
    Server->>Analytics: SSE/Usage events
    Analytics->>Analytics: Map costBreakdown<br/>to USD fields
    Analytics->>Analytics: Emit dexto_llm_tokens_consumed<br/>with cost data
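The calculateCostBreakdown() step in the diagram can be sketched as below. This is a hypothetical sketch assuming per-million-token USD rates and reasoning tokens billed at the output rate; the real implementation in packages/core/src/llm/usage-metadata.ts may use different pricing units and rules.

```typescript
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
  reasoningTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Hypothetical pricing shape: USD per million tokens for each bucket.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
  cacheReadPerMTok?: number;
  cacheWritePerMTok?: number;
}

// Per-category USD costs plus a total. Reasoning is assumed to bill at
// the output rate here, which may not match the real code.
function calculateCostBreakdown(usage: TokenUsage, pricing: ModelPricing) {
  const per = (tokens = 0, ratePerMTok = 0) => (tokens / 1_000_000) * ratePerMTok;
  const inputUsd = per(usage.inputTokens, pricing.inputPerMTok);
  const outputUsd = per(usage.outputTokens, pricing.outputPerMTok);
  const reasoningUsd = per(usage.reasoningTokens, pricing.outputPerMTok);
  const cacheReadUsd = per(usage.cacheReadTokens, pricing.cacheReadPerMTok);
  const cacheWriteUsd = per(usage.cacheWriteTokens, pricing.cacheWritePerMTok);
  return {
    inputUsd,
    outputUsd,
    reasoningUsd,
    cacheReadUsd,
    cacheWriteUsd,
    totalUsd: inputUsd + outputUsd + reasoningUsd + cacheReadUsd + cacheWriteUsd,
  };
}
```

Computing the breakdown once in core and forwarding it on llm:response is what lets the server and UI layers avoid duplicating this pricing logic.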

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Coins and costs now clearly flow,
From tokens counted, fast and slow,
Each breakdown tracked with USD care,
Through CLI, web, and server's share! ✨💰

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 25.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title clearly and concisely describes the main change, adding cost fields to LLM usage analytics across the monorepo.




vercel bot commented Apr 14, 2026

@rahulkarajgikar is attempting to deploy a commit to the Shaunak's projects Team on Vercel.

A member of the Team first needs to authorize it.

rahulkarajgikar marked this pull request as ready for review April 14, 2026 10:23
coderabbitai bot left a comment

🧹 Nitpick comments (4)
packages/server/src/events/usage-event-subscriber.ts (1)

157-158: Extract cost-breakdown resolution into a dedicated resolver.

Line 157 introduces a multi-source fallback chain inline. Please move this into a small resolver function to keep one explicit selection path and improve maintainability.

♻️ Suggested refactor
+    private resolveCostBreakdown(
+        payload: AgentEventMap['llm:response']
+    ): AgentEventMap['llm:response']['costBreakdown'] | undefined {
+        if (payload.costBreakdown) {
+            return payload.costBreakdown;
+        }
+        if (!payload.provider || !payload.model || !payload.tokenUsage) {
+            return undefined;
+        }
+        const pricing = getModelPricing(payload.provider, payload.model);
+        return pricing ? calculateCostBreakdown(payload.tokenUsage, pricing) : undefined;
+    }
+
     private buildUsageEvent(payload: AgentEventMap['llm:response']): UsageEvent | null {
@@
-        const resolvedCostBreakdown =
-            payload.costBreakdown ??
-            (payload.provider && payload.model
-                ? (() => {
-                      const pricing = getModelPricing(payload.provider, payload.model);
-                      if (!pricing) {
-                          return undefined;
-                      }
-
-                      return calculateCostBreakdown(payload.tokenUsage, pricing);
-                  })()
-                : undefined);
+        const resolvedCostBreakdown = this.resolveCostBreakdown(payload);

As per coding guidelines: "Avoid multi-source values encoded as optional + fallback + fallback chains (a ?? b ?? c); prefer a single source of truth with explicit resolver function."

Also applies to: 167-167

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server/src/events/usage-event-subscriber.ts` around lines 157 - 158,
Extract the inline multi-source fallback used to compute costBreakdown into a
dedicated resolver function (e.g., resolveCostBreakdown) and replace the inline
expression (the payload.costBreakdown ?? (payload.provider && payload.model ...
) and the similar occurrence around line 167) with a single call to that
resolver; the resolver should accept the payload (or provider, model, and any
rawCost fields) and implement the explicit selection logic (check
payload.costBreakdown first, then provider+model-derived value, etc.), returning
the resolved costBreakdown value so the main flow in usage-event-subscriber.ts
remains a single clear selection point.
packages/webui/lib/events/handlers.ts (1)

242-272: Consider extracting the duplicated captureTokenUsage call into a helper.

The same analytics capture logic appears twice in handleLLMResponse (lines 242-272 for the streaming path and lines 320-350 for the non-streaming path). This internal duplication could lead to drift if one path is updated but not the other.

♻️ Suggested extraction
+function captureTokenUsageAnalytics(
+    sessionId: string,
+    provider: string | undefined,
+    model: string | undefined,
+    tokenUsage: EventByName<'llm:response'>['tokenUsage'],
+    estimatedCost: number | undefined,
+    costBreakdown: EventByName<'llm:response'>['costBreakdown'],
+    estimatedInputTokens: number | undefined,
+    reasoningVariant: string | undefined,
+    reasoningBudgetTokens: number | undefined
+): void {
+    if (!hasMeaningfulTokenUsageForAnalytics(tokenUsage, estimatedCost)) {
+        return;
+    }
+
+    let estimateAccuracyPercent: number | undefined;
+    const actualInputTokens = tokenUsage?.inputTokens;
+    if (estimatedInputTokens !== undefined && actualInputTokens) {
+        const diff = estimatedInputTokens - actualInputTokens;
+        estimateAccuracyPercent = Math.round((diff / actualInputTokens) * 100);
+    }
+
+    captureTokenUsage({
+        sessionId,
+        provider,
+        model,
+        reasoningVariant,
+        reasoningBudgetTokens,
+        inputTokens: tokenUsage?.inputTokens,
+        outputTokens: tokenUsage?.outputTokens,
+        reasoningTokens: tokenUsage?.reasoningTokens,
+        totalTokens: tokenUsage?.totalTokens,
+        cacheReadTokens: tokenUsage?.cacheReadTokens,
+        cacheWriteTokens: tokenUsage?.cacheWriteTokens,
+        estimatedCostUsd: estimatedCost,
+        inputCostUsd: costBreakdown?.inputUsd,
+        outputCostUsd: costBreakdown?.outputUsd,
+        reasoningCostUsd: costBreakdown?.reasoningUsd,
+        cacheReadCostUsd: costBreakdown?.cacheReadUsd,
+        cacheWriteCostUsd: costBreakdown?.cacheWriteUsd,
+        estimatedInputTokens,
+        estimateAccuracyPercent,
+    });
+}

Then call captureTokenUsageAnalytics(sessionId, provider, model, tokenUsage, estimatedCost, costBreakdown, estimatedInputTokens, reasoningVariant, reasoningBudgetTokens) in both code paths.

Also applies to: 320-350

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/webui/lib/events/handlers.ts` around lines 242 - 272, There is
duplicated analytics logic in handleLLMResponse: the captureTokenUsage call
(using sessionId, provider, model, reasoningVariant, reasoningBudgetTokens,
tokenUsage, estimatedInputTokens, estimatedCost, costBreakdown, etc.) appears in
both the streaming and non-streaming paths; extract this into a new helper
function (e.g., captureTokenUsageAnalytics) that accepts the necessary symbols
(sessionId, provider, model, tokenUsage, estimatedCost, costBreakdown,
estimatedInputTokens, reasoningVariant, reasoningBudgetTokens) and performs the
estimateAccuracyPercent calculation (using estimatedInputTokens and
tokenUsage.inputTokens) and then calls captureTokenUsage with the consolidated
payload; replace the duplicated captureTokenUsage blocks in both paths with a
call to this new helper to ensure single-source logic.
packages/tui/src/services/processStream.ts (1)

119-139: Code duplication: hasMeaningfulTokenUsageForAnalytics is duplicated with WebUI.

This helper is identical to the one in packages/webui/lib/events/handlers.ts (lines 124-144). Consider extracting it to a shared location to avoid divergence.

Options:

  1. Add to @dexto/core alongside hasMeaningfulTokenUsage in packages/core/src/llm/usage-metadata.ts
  2. Add to a shared analytics utilities module

This would also make the behavioral difference between hasMeaningfulTokenUsage (token-only gating) and hasMeaningfulTokenUsageForAnalytics (cost OR token gating) explicit and documented in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tui/src/services/processStream.ts` around lines 119 - 139, Extract
the duplicated hasMeaningfulTokenUsageForAnalytics helper into a shared location
and import it where needed: create/export a function named
hasMeaningfulTokenUsageForAnalytics (with the same signature and logic)
alongside hasMeaningfulTokenUsage in the core usage-metadata module (e.g., add
to the module that defines hasMeaningfulTokenUsage), then replace the local
implementations in processStream.ts and WebUI handlers.ts with imports from that
shared module; ensure the exported symbol is typed correctly (matching
Extract<StreamingEvent, { name: 'llm:response' }>['tokenUsage']) and update any
imports/usages accordingly.
packages/webui/lib/events/handlers.test.ts (1)

202-245: Good test coverage for the happy path; consider adding edge case coverage.

The test correctly validates that cost breakdown fields are mapped to the analytics payload. However, there are two edge cases worth covering:

  1. When costBreakdown is undefined but estimatedCost is present (the new gating logic should still emit analytics)
  2. When tokenUsage has zero tokens but estimatedCost is defined (verifies the new hasMeaningfulTokenUsageForAnalytics behavior change)

These would ensure the behavioral change (emitting analytics when only estimatedCost is present) is regression-tested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/webui/lib/events/handlers.test.ts` around lines 202 - 245, Add two
new unit tests in handlers.test.ts that call handleLLMResponse: (1) a case where
event.costBreakdown is undefined but event.estimatedCost is set—set up the chat
message via useChatStore.getState().setStreamingMessage as other tests do and
assert captureTokenUsageMock is called with estimatedCostUsd equal to the
event.estimatedCost; (2) a case where event.tokenUsage has zeros
(inputTokens/outputTokens/totalTokens = 0) but event.estimatedCost is set—again
assert captureTokenUsageMock is invoked and includes estimatedCostUsd,
confirming hasMeaningfulTokenUsageForAnalytics no longer blocks analytics when
estimatedCost is present; reuse TEST_SESSION_ID, provider/model fields and the
same expect.objectContaining pattern used in the existing test to verify the
mapped cost fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f7c5771b-16a5-4b02-8359-539f76ab3c3d

📥 Commits

Reviewing files that changed from the base of the PR, between 23ab076 and d0f4ea0.

📒 Files selected for processing (13)
  • .changeset/few-apes-judge.md
  • docs/static/openapi/openapi.json
  • packages/analytics/src/events.ts
  • packages/core/src/events/index.ts
  • packages/core/src/llm/executor/stream-processor.test.ts
  • packages/core/src/llm/executor/stream-processor.ts
  • packages/core/src/llm/usage-metadata.ts
  • packages/server/src/events/a2a-sse-subscriber.ts
  • packages/server/src/events/usage-event-subscriber.ts
  • packages/tui/src/services/processStream.test.ts
  • packages/tui/src/services/processStream.ts
  • packages/webui/lib/events/handlers.test.ts
  • packages/webui/lib/events/handlers.ts

@rahulkarajgikar rahulkarajgikar merged commit 8f6330b into truffle-ai:main Apr 14, 2026
12 of 13 checks passed