feat: end-to-end cache token tracking + multi-provider coverage #1
Draft
Simon-Free wants to merge 5 commits into main from
Conversation
…ider tests

Two new small helpers in providers.py give each provider family one obvious extraction point instead of three sprinkled getattr chains:

- _anthropic_cache_tokens(usage) -> (read, write)
  Reads cache_read_input_tokens / cache_creation_input_tokens. Returns (0, 0) if the fields are missing (older SDKs, Bedrock-via-litellm, non-cached calls) or None (Anthropic occasionally emits JSON null).
- _openai_cached_read_tokens(usage) -> int
  Walks usage.prompt_tokens_details.cached_tokens. OpenAI's schema has no separate cache-creation counter (caching is implicit), so the write side stays 0 for this entire provider family.

stream_anthropic and stream_openai_compat now call these helpers instead of inlining the getattr dance. stream_ollama was already 0/0; behaviour unchanged. Any new provider that builds an AssistantTurn without passing cache_read_tokens / cache_write_tokens inherits the dataclass defaults and agent.run's getattr(..., 0) fallbacks, so downstream totals and snapshots stay consistent.

Tests (tests/test_cache_tokens.py, rewritten):

- AssistantTurn + AgentState defaults and accumulation.
- Checkpoint snapshot persists cache_read + cache_write via a real make_snapshot against a tmp_path.
- TestAnthropicCacheExtraction (3 cases) + TestOpenAICacheExtraction (3 cases) covering populated / missing / None usage objects.
- Ollama shape check (no-cache path).
- test_agent_run_propagates_cache_tokens_from_mocked_stream: one turn through agent.run with a scripted stream; asserts state totals AND the produced snapshot.
- test_agent_run_accumulates_cache_across_multi_turn: two consecutive runs with distinct cache values; asserts running totals.

Cleanup:

- The three duplicate cache-token cases previously appended to tests/test_checkpoint.py are removed; test_cache_tokens.py is the single home for this feature now.
- Fix the stale make_snapshot(state, session_id, prompt) call in test_cache_tokens that survived from the earlier signature mismatch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
End-to-end tracking of prompt-cache token usage, from the provider usage response through AgentState into checkpoint snapshots. Two new small helpers in providers.py (_anthropic_cache_tokens, _openai_cached_read_tokens) give each provider family one obvious extraction point instead of three sprinkled getattr chains.
Provider compatibility
- Anthropic (stream_anthropic): usage.cache_read_input_tokens / usage.cache_creation_input_tokens on final.usage; returned by Anthropic when the prompt-caching beta is active.
- OpenAI-compatible (stream_openai_compat; covers OpenAI, Gemini, Groq, xAI, any OpenAI-schema provider): usage.prompt_tokens_details.cached_tokens (read side only; the schema has no cache-creation counter).
- Ollama (stream_ollama): no cache fields; explicitly passes 0/0.
- Any new provider (stream_xxx, downstream fork): AssistantTurn defaults + getattr(event, 'cache_read_tokens', 0) in agent.run + getattr(state, 'total_cache_read_tokens', 0) in make_snapshot = no-op fallback if the provider never sets the fields.
- Bedrock via litellm: when litellm shapes usage to the Anthropic schema, _anthropic_cache_tokens catches the fields; when it shapes to OpenAI, _openai_cached_read_tokens catches the read side. When neither, the getattr defaults to 0; no exception path.

Missing / None usage fields are coerced to 0 throughout; see TestAnthropicCacheExtraction::test_missing_fields_default_to_zero / test_none_fields_coerced_to_zero and the OpenAI equivalents.
Changes
- providers.py: _anthropic_cache_tokens helper, _openai_cached_read_tokens helper, cache-write field on AssistantTurn, 2 call-sites in stream_anthropic, 2 in stream_openai_compat; stream_ollama now explicitly passes 0/0.
- agent.py: AgentState.total_cache_read_tokens / total_cache_write_tokens; accumulated from assistant_turn on every turn via getattr(..., 0) so providers that don't set the fields still work.
- checkpoint/store.py: token_snapshot["cache_read"] / ["cache_write"] persisted via getattr(state, ..., 0).
- tests/test_cache_tokens.py: rewritten; single home for this feature's tests.
- tests/test_checkpoint.py: duplicate cache-token cases removed.
Test layers
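The accumulation pattern those changes describe can be sketched with hypothetical dataclasses (the real AssistantTurn and AgentState carry more fields; only the cache-token plumbing is shown):

```python
from dataclasses import dataclass


@dataclass
class AssistantTurn:
    # Defaulted fields: providers that never set them stay constructible.
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


@dataclass
class AgentState:
    total_cache_read_tokens: int = 0
    total_cache_write_tokens: int = 0

    def accumulate(self, turn: AssistantTurn) -> None:
        # getattr with a 0 default mirrors agent.run's fallback: a turn
        # object from a fork that predates the fields accumulates as 0
        # instead of raising AttributeError.
        self.total_cache_read_tokens += getattr(turn, "cache_read_tokens", 0)
        self.total_cache_write_tokens += getattr(turn, "cache_write_tokens", 0)


state = AgentState()
state.accumulate(AssistantTurn(cache_read_tokens=100, cache_write_tokens=40))
state.accumulate(AssistantTurn())  # provider that sets nothing: a no-op
print(state.total_cache_read_tokens, state.total_cache_write_tokens)  # 100 40
```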
- A real make_snapshot call against tmp_path; asserts the token_snapshot keys.
- agent.run drains a mocked providers.stream that emits an AssistantTurn with cache tokens; asserts state.total_cache_* and the snapshot values.
- Two consecutive agent.run calls with distinct cache values; asserts running totals.
Backwards compatibility
Adding defaulted fields to AssistantTurn and AgentState is non-breaking for every caller that constructs them positionally or by keyword. make_snapshot uses getattr(state, "total_cache_read_tokens", 0), so old AgentState instances rehydrated from pre-PR session files still produce valid snapshots.
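That rehydration fallback might look like this sketch. The token_snapshot helper and the plain classes are hypothetical stand-ins; make_snapshot's real signature and the rest of the snapshot layout are not shown:

```python
from typing import Any, Dict


def token_snapshot(state: Any) -> Dict[str, int]:
    # An AgentState rehydrated from a pre-PR session file lacks the
    # total_cache_* attributes entirely; getattr's default keeps the
    # snapshot valid instead of raising AttributeError.
    return {
        "cache_read": getattr(state, "total_cache_read_tokens", 0),
        "cache_write": getattr(state, "total_cache_write_tokens", 0),
    }


class OldState:  # pre-PR shape: no cache fields at all
    pass


class NewState:  # post-PR shape
    total_cache_read_tokens = 128
    total_cache_write_tokens = 64


print(token_snapshot(OldState()))  # {'cache_read': 0, 'cache_write': 0}
print(token_snapshot(NewState()))  # {'cache_read': 128, 'cache_write': 64}
```

Defaulting at the read site rather than migrating old session files keeps rehydration a pure load with no upgrade step.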