@KosBeg KosBeg commented Jan 25, 2026

Summary

Codex session logs can emit duplicate token_count events in which the cumulative total_token_usage does not advance. The loader previously trusted last_token_usage unconditionally, so each duplicate delta was added again and all token totals were inflated (often to nearly 2x). This change ignores last_token_usage when the totals are unchanged, and derives a delta from the totals only when last_token_usage is missing.

Why this happens

Codex logs contain both:

  • info.last_token_usage: per-event delta
  • info.total_token_usage: cumulative totals

In practice, some sessions emit the same token_count payload twice without advancing the cumulative totals. When we sum last_token_usage for every event, those duplicates are counted again even though total_token_usage shows no progress.

Example from real logs (simplified)

Event A (first emission):

  • last_token_usage: input=1000, cached=250, output=200, total=1200
  • total_token_usage: input=1000, cached=250, output=200, total=1200

Event B (duplicate emission, totals unchanged):

  • last_token_usage: input=1000, cached=250, output=200, total=1200
  • total_token_usage: input=1000, cached=250, output=200, total=1200

Old behavior:

  • Adds Event A delta + Event B delta = 2400 total tokens (double counted)

New behavior:

  • Detects that totals did not change for Event B
  • Skips Event B and keeps 1200 total tokens (correct)
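
The arithmetic above can be reproduced with a small sketch. The `Usage` and `TokenCountEvent` shapes are hypothetical and mirror the simplified example, not necessarily the real Codex log schema:

```typescript
// Hypothetical shapes for illustration; field names follow the simplified
// example above, not necessarily the real Codex log schema.
type Usage = { input: number; cached: number; output: number; total: number };
type TokenCountEvent = { last: Usage; total: Usage };

const usageA: Usage = { input: 1000, cached: 250, output: 200, total: 1200 };

// Event A and its duplicate, Event B: identical payloads, totals unchanged.
const events: TokenCountEvent[] = [
  { last: { ...usageA }, total: { ...usageA } },
  { last: { ...usageA }, total: { ...usageA } },
];

// Old behavior: trust every per-event delta and add them all up.
const naiveTotal = events.reduce((sum, e) => sum + e.last.total, 0);

// Reference value: the cumulative total reported by the final event.
const reportedTotal = events[events.length - 1].total.total;

console.log(naiveTotal);    // 2400
console.log(reportedTotal); // 1200
```

The gap between the two numbers is exactly the duplicated Event B delta.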

How the fix works

  • Track the previous total_token_usage per session
  • If the current total_token_usage equals the previous total, treat the
    event as a duplicate and ignore last_token_usage
  • If last_token_usage is missing but totals exist, derive a delta from the
    totals (existing behavior)
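
The three steps above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in the PR: `sameUsage`, `subtractUsage`, `dedupeDeltas`, and the `TokenCountInfo` shape are hypothetical names, and the real loader in apps/codex/src/data-loader.ts may be structured differently.

```typescript
// Hypothetical types and names for illustration only.
type RawUsage = { input: number; cached: number; output: number; total: number };
type TokenCountInfo = { last?: RawUsage; total?: RawUsage };

// Field-by-field equality check on two usage snapshots.
function sameUsage(a: RawUsage, b: RawUsage): boolean {
  return (
    a.input === b.input &&
    a.cached === b.cached &&
    a.output === b.output &&
    a.total === b.total
  );
}

// Derive a per-event delta from two cumulative snapshots
// (a missing previous snapshot is treated as all zeros).
function subtractUsage(curr: RawUsage, prev: RawUsage | null): RawUsage {
  return {
    input: curr.input - (prev?.input ?? 0),
    cached: curr.cached - (prev?.cached ?? 0),
    output: curr.output - (prev?.output ?? 0),
    total: curr.total - (prev?.total ?? 0),
  };
}

// Returns the deltas that should be counted for one session's events.
function dedupeDeltas(infos: TokenCountInfo[]): RawUsage[] {
  const deltas: RawUsage[] = [];
  let previousTotal: RawUsage | null = null;
  for (const { last, total } of infos) {
    // Duplicate: cumulative totals did not advance, so ignore this event.
    if (total && previousTotal && sameUsage(total, previousTotal)) continue;
    // Prefer the logged delta; fall back to deriving one from the totals.
    const delta = last ?? (total ? subtractUsage(total, previousTotal) : null);
    if (total) previousTotal = total;
    if (delta) deltas.push(delta);
  }
  return deltas;
}

const a: RawUsage = { input: 1000, cached: 250, output: 200, total: 1200 };
const b: RawUsage = { input: 1500, cached: 250, output: 300, total: 1800 };
const counted = dedupeDeltas([
  { last: a, total: a }, // Event A
  { last: a, total: a }, // Event B: duplicate, skipped
  { total: b },          // no last_token_usage: delta derived from totals
]);
const grandTotal = counted.reduce((sum, d) => sum + d.total, 0);
console.log(counted.length, grandTotal); // 2 1800
```

Tracking the previous total per session (here, per call to `dedupeDeltas`) keeps the comparison local, so identical totals in two unrelated sessions would not be mistaken for duplicates.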

What changed

  • Added a small helper to compare totals for equality
  • Updated the event processing logic to skip duplicated totals
  • Added an in-source test to reproduce the duplicate pattern

Summary by CodeRabbit

  • Bug Fixes
    • Improved detection and handling of duplicate token usage events to prevent redundant processing and ensure accurate token count tracking.

coderabbitai bot commented Jan 25, 2026

Walkthrough

Modified the data loader to detect and handle duplicate token usage events. Introduces a new isSameRawUsage helper function to compare RawUsage objects for equality. Refactors per-event raw usage selection logic to skip processing when total usage remains unchanged, and adds unit tests validating this deduplication behavior.

Changes

Cohort / File(s): Duplicate detection logic (apps/codex/src/data-loader.ts)

Change Summary: Added isSameRawUsage helper to compare RawUsage objects. Modified loadTokenUsageEvents to detect duplicate totals and conditionally select raw usage (prioritizing lastUsage when not a duplicate, falling back to computed delta from totalUsage). Reworked conditional logic to skip zero deltas and handle null cases. Added unit tests verifying that duplicate token_count events with unchanged totals produce a single output event.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A rabbit hops through tokens with care,
Spotting twins where duplicates dare!
Same totals caught, no delta needed—
One event clean, the noise defeated! 🌟
Data flows pure, through dedup's grace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The title 'fix(codex): ignore duplicate token_count totals' directly and concisely describes the main change: detecting and ignoring duplicate token_count events when totals are unchanged.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
