Skip to content

feat: compute costUsd from token usage via model pricing table #635

@christso

Description

@christso

Objective

Add cost computation for providers that don't natively report cost (OpenAI, OpenRouter via AI SDK, Azure, Gemini). Currently only the Claude CLI provider reports costUsd (from its result event). All AI SDK-based providers return costUsd: undefined.

Approach

Use a model pricing lookup table to compute cost from token usage:

cost = (input_tokens * input_price_per_token) + (output_tokens * output_price_per_token) + (cached_tokens * cached_price_per_token) + (reasoning_tokens * reasoning_price_per_token)

Design

  1. Pricing table — a static JSON/TS map of model_id → { input, output, cached?, reasoning? } per-million-token prices. Covers common models from OpenAI, Anthropic, Google, and popular OpenRouter models.

  2. Fallback hierarchy:

    • Provider-reported costUsd (Claude CLI) takes precedence
    • If costUsd is undefined and tokenUsage is available, compute from pricing table
    • If model not in table, costUsd remains undefined (no guessing)
  3. Extension point — allow users to override/extend the pricing table via config (e.g., .agentv/pricing.json or inline in eval YAML target config) for custom/fine-tuned models or updated prices.

  4. Staleness — model prices change frequently. The built-in table should be easy to update (single file, sorted by provider). Consider a --update-pricing CLI command or a note in docs about keeping it current.

Design Latitude

  • Location of the pricing table (standalone JSON file vs inline TS object)
  • Whether to support per-provider pricing APIs (e.g., OpenRouter /api/v1/models endpoint returns pricing)
  • Whether evalRun.costUsd should aggregate candidate + grader costs

Acceptance Signals

  • costUsd is computed for OpenAI and Anthropic AI SDK providers when token usage is available
  • Pricing table covers at least: gpt-4o, gpt-4o-mini, gpt-5, gpt-5-mini, o3, o4-mini, claude-sonnet-4, claude-opus-4, gemini-2.5-pro, gemini-2.5-flash
  • Provider-reported cost (Claude CLI) is never overridden
  • Unknown models gracefully return costUsd: undefined
  • User-defined pricing overrides work via config

Non-Goals

  • Real-time pricing API integration (can be a follow-up)
  • Billing or budget enforcement (existing totalBudgetUsd handles this for known costs)
  • Historical pricing or price-at-time-of-run tracking

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions