Debug: Anthropic cache token cost investigation#44

Open
andrewm4894 wants to merge 2 commits into main from andy/debug-anthropic-cache-cost
Conversation

@andrewm4894
Member

Summary

  • Debug script to reproduce and investigate the Anthropic cache token cost overcharge issue
  • Makes live API calls to verify Anthropic returns exclusive input_tokens, confirms the PostHog SDK passes them through correctly, and demonstrates how posthog_properties overrides can cause the bug
  • Results written to output.md for easy sharing

Context

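A customer reported a ~7x cost overcharge on Anthropic cached calls. The investigation found that the SDK is correct, but posthog_properties can override the SDK's exclusive token values with inclusive ones (input + cache_read), so cached tokens get billed a second time at the full input rate.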

See python/scripts/debug-anthropic-cache-cost/output.md for full results.
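
To make the overcharge mechanism concrete, here is a minimal sketch of the arithmetic. The per-token prices, the token counts, and the `estimate_cost` helper are all hypothetical and chosen only for illustration; they are not PostHog's actual pricing logic or the customer's numbers.

```python
# Hypothetical per-token prices (USD), for illustration only.
INPUT_PRICE = 3.00 / 1_000_000
CACHE_READ_PRICE = 0.30 / 1_000_000
OUTPUT_PRICE = 15.00 / 1_000_000


def estimate_cost(input_tokens: int, cache_read_tokens: int, output_tokens: int) -> float:
    """Price each token bucket separately; assumes input_tokens excludes cache reads."""
    return (
        input_tokens * INPUT_PRICE
        + cache_read_tokens * CACHE_READ_PRICE
        + output_tokens * OUTPUT_PRICE
    )


# Made-up usage for a heavily cached call.
api_input = 100        # exclusive input tokens, as the API reports them
cache_read = 10_000    # tokens served from the prompt cache
output = 50

# Correct: the exclusive value is passed through unchanged.
correct = estimate_cost(api_input, cache_read, output)

# Buggy: an override replaces the exclusive value with an inclusive one,
# so cache-read tokens are priced once as input and once as cache reads.
inclusive_input = api_input + cache_read
overcharged = estimate_cost(inclusive_input, cache_read, output)

print(f"correct:     ${correct:.6f}")
print(f"overcharged: ${overcharged:.6f}")
print(f"factor:      {overcharged / correct:.1f}x")
```

The size of the factor depends on the ratio of cached to uncached tokens, so the more of a prompt is served from cache, the larger the overcharge.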

Test plan

  • Script runs and produces correct output
  • output.md generated with formatted results
  • No API keys or sensitive data in committed files

Reproduction script investigating why $ai_input_tokens can be
inclusive (input + cache_read) instead of exclusive, causing cost
overcharges. Tests raw API, SDK wrapper, and posthog_properties
override behavior. Writes results to output.md for sharing.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 711f5cda6e

Comment on lines +332 to +336
prev_api_input = response.usage.input_tokens
prev_cache_read = getattr(response.usage, "cache_read_input_tokens", 0) or 0
prev_output = response.usage.output_tokens
inclusive_input = prev_api_input + prev_cache_read
computed_total = inclusive_input + prev_output

P2: Compute override costs from the same API response

Step 3b derives inclusive_input and computed_total from response (the previous 3a call) and then applies them to a new messages.create() request with a different prompt; if token usage changes between calls, the reported “correct vs overcharged” comparison is no longer apples-to-apples and the overcharge factor becomes inaccurate. This can mislead the investigation output, so both exclusive and overridden values should be tied to the same request context.
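
One way to address this could look like the sketch below: derive the exclusive and the inclusive values from a single usage object, so both sides of the comparison describe the same request. The `token_views` helper and the `SimpleNamespace` stand-in for `response.usage` are hypothetical; in the script, the usage object would come from the same `messages.create()` call whose cost is being reported.

```python
from types import SimpleNamespace


def token_views(usage):
    """Return exclusive and inclusive token counts derived from ONE usage object."""
    exclusive_input = usage.input_tokens
    # cache_read_input_tokens may be missing or None on uncached calls.
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    output = usage.output_tokens
    inclusive_input = exclusive_input + cache_read
    return {
        "exclusive_input": exclusive_input,
        "cache_read": cache_read,
        "output": output,
        "inclusive_input": inclusive_input,
        "computed_total": inclusive_input + output,
    }


# Stand-in for the usage of the step 3b response itself (made-up numbers).
usage_3b = SimpleNamespace(
    input_tokens=12,
    cache_read_input_tokens=2048,
    output_tokens=40,
)

views = token_views(usage_3b)
print(views)
```

Because the override values have to be supplied before the request is sent, the script may need to report the comparison after the fact from the 3b response, rather than reuse numbers from 3a. That is a design choice for the author to confirm.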
