
feat: end-to-end cache token tracking + multi-provider coverage #1

Draft

Simon-Free wants to merge 5 commits into main from pr11-cache-token-tracking

Conversation


Simon-Free (Owner) commented Apr 18, 2026

Summary

This PR adds end-to-end tracking of prompt-cache token usage, from the provider usage response, through AgentState, and into checkpoint snapshots. Two small new helpers in providers.py (_anthropic_cache_tokens, _openai_cached_read_tokens) give each provider family one obvious extraction point instead of three sprinkled getattr chains.
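Roughly, the helpers look like this (a minimal sketch: the attribute walks follow the provider schemas in the table below, not the PR's verbatim code):

```python
# Minimal sketch of the two helpers in providers.py. Attribute names
# follow the Anthropic / OpenAI usage schemas; missing or None fields
# are coerced to 0, matching the tests referenced below.

def _anthropic_cache_tokens(usage) -> tuple[int, int]:
    """Return (cache_read, cache_write) from an Anthropic usage object."""
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    write = getattr(usage, "cache_creation_input_tokens", 0) or 0
    return read, write

def _openai_cached_read_tokens(usage) -> int:
    """Return cached prompt tokens from an OpenAI-schema usage object.

    OpenAI exposes no separate cache-creation counter, so the write
    side stays 0 for this provider family.
    """
    details = getattr(usage, "prompt_tokens_details", None)
    return (getattr(details, "cached_tokens", 0) or 0) if details else 0
```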

Provider compatibility

| Provider | Cache read | Cache write | Mechanism |
| --- | --- | --- | --- |
| Anthropic (native stream_anthropic) | ✅ usage.cache_read_input_tokens | ✅ usage.cache_creation_input_tokens | Both fields live on final.usage; returned by Anthropic when the prompt-caching beta is active |
| OpenAI / OpenAI-compatible (stream_openai_compat; covers OpenAI, Gemini, Groq, xAI, any OpenAI-schema provider) | ✅ usage.prompt_tokens_details.cached_tokens | ❌ always 0 | OpenAI's schema has no separate "cache creation" counter; caching is implicit on their side |
| Ollama (stream_ollama) | ❌ always 0 | ❌ always 0 | No prompt caching in Ollama today |
| Any future provider (custom stream_xxx, downstream fork) | defaults to 0 | defaults to 0 | AssistantTurn defaults + getattr(event, 'cache_read_tokens', 0) in agent.run + getattr(state, 'total_cache_read_tokens', 0) in make_snapshot give a no-op fallback if the provider never sets the fields |
| Bedrock via litellm (downstream forks, e.g. bouzecode) | handled gracefully | handled gracefully | When the wrapper forwards Anthropic-shaped usage, _anthropic_cache_tokens catches the fields; when it reshapes to OpenAI, _openai_cached_read_tokens catches the read side; when neither, getattr defaults to 0, so there is no exception path |

Missing / None usage fields are coerced to 0 throughout; see TestAnthropicCacheExtraction::test_missing_fields_default_to_zero / test_none_fields_coerced_to_zero and the OpenAI equivalents.
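The no-op fallback described in the last two table rows, sketched as a hypothetical excerpt of the accumulation in agent.run (the surrounding turn loop is assumed):

```python
# Hypothetical excerpt of the per-turn accumulation in agent.run.
# A getattr default of 0 means a provider (or a pre-PR AssistantTurn)
# that never sets the cache fields simply contributes nothing.
def accumulate_cache_tokens(state, event) -> None:  # names illustrative
    state.total_cache_read_tokens += getattr(event, "cache_read_tokens", 0)
    state.total_cache_write_tokens += getattr(event, "cache_write_tokens", 0)
```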

Changes

| File | +/− | What |
| --- | --- | --- |
| providers.py | +31 | _anthropic_cache_tokens helper, _openai_cached_read_tokens helper, cache-write field on AssistantTurn, 2 call-sites in stream_anthropic, 2 in stream_openai_compat; stream_ollama now explicitly passes 0/0 |
| agent.py | +4 | AgentState.total_cache_read_tokens / total_cache_write_tokens; accumulated from assistant_turn on every turn via getattr(..., 0) so providers that don't set the fields still work |
| checkpoint/store.py | +2 | token_snapshot["cache_read"] / ["cache_write"] persisted via getattr(state, ..., 0) |
| tests/test_cache_tokens.py | rewritten (~170 lines) | 5 layers of coverage; see below |
| tests/test_checkpoint.py | −52 | Cache cases moved out to test_cache_tokens.py so checkpoint tests stay focused on snapshots |
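The checkpoint side, sketched under the assumption that make_snapshot assembles a token_snapshot dict; the wrapper function here is hypothetical, only the key names and getattr defaults are from this PR:

```python
def build_token_snapshot(state) -> dict:
    """Sketch of the cache keys persisted by make_snapshot in checkpoint/store.py.

    getattr with a default keeps snapshots valid for AgentState instances
    rehydrated from pre-PR session files that lack the new attributes.
    """
    return {
        "cache_read": getattr(state, "total_cache_read_tokens", 0),
        "cache_write": getattr(state, "total_cache_write_tokens", 0),
    }
```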

Test layers

  1. AssistantTurn / AgentState - constructor defaults, explicit values, accumulation across increments.
  2. Checkpoint persistence - real make_snapshot call against tmp_path, asserts token_snapshot keys.
  3. Provider extraction helpers - 3 cases each for Anthropic and OpenAI (populated / missing / None); Ollama shape check. The Anthropic cases are sketched after this list.
  4. E2E one-turn - agent.run drains a mocked providers.stream that emits an AssistantTurn with cache tokens; asserts state.total_cache_* and snapshot values.
  5. E2E multi-turn - two consecutive agent.run calls with distinct cache values; asserts running totals.
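For layer 3, the extraction cases might look roughly like this (SimpleNamespace stands in for the SDK usage objects; the import path is an assumption):

```python
from types import SimpleNamespace

from providers import _anthropic_cache_tokens  # import path assumed


class TestAnthropicCacheExtraction:
    def test_populated_fields(self):
        usage = SimpleNamespace(cache_read_input_tokens=120,
                                cache_creation_input_tokens=45)
        assert _anthropic_cache_tokens(usage) == (120, 45)

    def test_missing_fields_default_to_zero(self):
        # Older SDKs / Bedrock-via-litellm: attributes absent entirely.
        assert _anthropic_cache_tokens(SimpleNamespace()) == (0, 0)

    def test_none_fields_coerced_to_zero(self):
        # Anthropic occasionally emits JSON null for these fields.
        usage = SimpleNamespace(cache_read_input_tokens=None,
                                cache_creation_input_tokens=None)
        assert _anthropic_cache_tokens(usage) == (0, 0)
```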

Backwards compatibility

  • Adding fields with a default of 0 to AssistantTurn and AgentState is non-breaking for every caller that constructs them positionally or by keyword (sketched below).
  • make_snapshot uses getattr(state, "total_cache_read_tokens", 0) so old AgentState instances rehydrated from pre-PR session files still produce valid snapshots.
  • No config flag - the feature is free: if a provider never sets the cache fields, everything records as 0.
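In dataclass terms (the shape is assumed; only the two cache fields are from this PR):

```python
from dataclasses import dataclass

@dataclass
class AssistantTurn:
    # ...existing fields elided...
    cache_read_tokens: int = 0   # new; default keeps pre-PR call-sites working
    cache_write_tokens: int = 0  # stays 0 for the OpenAI family and Ollama

# A caller that predates this PR and never mentions the new fields
# still constructs a valid turn:
assert AssistantTurn().cache_read_tokens == 0
```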

Simon FREYBURGER and others added 5 commits April 18, 2026 09:52
…ider tests

Two new small helpers in providers.py give each provider family one
obvious extraction point instead of three sprinkled getattr chains:

- _anthropic_cache_tokens(usage) -> (read, write)
  Reads cache_read_input_tokens / cache_creation_input_tokens. Returns
  (0, 0) if the fields are missing (older SDKs, Bedrock-via-litellm,
  non-cached calls) or None (Anthropic occasionally emits JSON null).
- _openai_cached_read_tokens(usage) -> int
  Walks usage.prompt_tokens_details.cached_tokens. OpenAI's schema has
  no separate cache-creation counter (caching is implicit), so the
  write-side stays 0 for this entire provider family.

stream_anthropic, stream_openai_compat now call these helpers instead of
inlining the getattr dance. stream_ollama was already 0/0; behaviour
unchanged. Any new provider that builds an AssistantTurn without passing
cache_read_tokens / cache_write_tokens inherits the dataclass defaults
and agent.run's getattr(... , 0) fallbacks, so downstream totals and
snapshots stay consistent.

Tests (tests/test_cache_tokens.py, rewritten):
- AssistantTurn + AgentState defaults and accumulation.
- Checkpoint snapshot persists cache_read + cache_write via real
  make_snapshot against a tmp_path.
- TestAnthropicCacheExtraction (3 cases) + TestOpenAICacheExtraction
  (3 cases) covering populated / missing / None usage objects.
- Ollama shape check (no-cache path).
- test_agent_run_propagates_cache_tokens_from_mocked_stream: one turn
  through agent.run with a scripted stream; asserts state totals AND the
  produced snapshot.
- test_agent_run_accumulates_cache_across_multi_turn: two consecutive
  runs with distinct cache values; asserts running totals.

Cleanup:
- The three duplicate cache-token cases previously appended to
  tests/test_checkpoint.py are removed; test_cache_tokens.py is the
  single home for this feature now.
- Fix the stale make_snapshot(state, session_id, prompt) call in
  test_cache_tokens that survived from the earlier signature mismatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Simon-Free changed the title from "feat: end-to-end cache token tracking" to "feat: end-to-end cache token tracking + multi-provider coverage" on Apr 20, 2026