-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Objective
Add cost computation for providers that don't natively report cost (OpenAI, OpenRouter via AI SDK, Azure, Gemini). Currently only the Claude CLI provider reports costUsd (from its result event). All AI SDK-based providers return costUsd: undefined.
Approach
Use a model pricing lookup table to compute cost from token usage:
cost = (input_tokens * input_price_per_token) + (output_tokens * output_price_per_token) + (cached_tokens * cached_price_per_token) + (reasoning_tokens * reasoning_price_per_token)
Design
-
Pricing table — a static JSON/TS map of
model_id → { input, output, cached?, reasoning? }per-million-token prices. Covers common models from OpenAI, Anthropic, Google, and popular OpenRouter models. -
Fallback hierarchy:
- Provider-reported
costUsd(Claude CLI) takes precedence - If
costUsdis undefined andtokenUsageis available, compute from pricing table - If model not in table,
costUsdremains undefined (no guessing)
- Provider-reported
-
Extension point — allow users to override/extend the pricing table via config (e.g.,
.agentv/pricing.jsonor inline in eval YAML target config) for custom/fine-tuned models or updated prices. -
Staleness — model prices change frequently. The built-in table should be easy to update (single file, sorted by provider). Consider a
--update-pricingCLI command or a note in docs about keeping it current.
Design Latitude
- Location of the pricing table (standalone JSON file vs inline TS object)
- Whether to support per-provider pricing APIs (e.g., OpenRouter
/api/v1/modelsendpoint returns pricing) - Whether
evalRun.costUsdshould aggregate candidate + grader costs
Acceptance Signals
costUsdis computed for OpenAI and Anthropic AI SDK providers when token usage is available- Pricing table covers at least: gpt-4o, gpt-4o-mini, gpt-5, gpt-5-mini, o3, o4-mini, claude-sonnet-4, claude-opus-4, gemini-2.5-pro, gemini-2.5-flash
- Provider-reported cost (Claude CLI) is never overridden
- Unknown models gracefully return
costUsd: undefined - User-defined pricing overrides work via config
Non-Goals
- Real-time pricing API integration (can be a follow-up)
- Billing or budget enforcement (existing
totalBudgetUsdhandles this for known costs) - Historical pricing or price-at-time-of-run tracking
Related
- bug: show duration_ms for agent being evaluated #633 — reasoning tokens and evalRun aggregate metrics (merged)