fix(ccusage): improve deduplication to keep records with most tokens #826
base: main
- Modified createUniqueHash to fall back to messageId when requestId is missing
- Added getTotalTokensFromUsageData function for complete token calculation
- Changed deduplication logic to keep records with highest token count
- Updated loadDailyUsageData, loadSessionData, and loadSessionBlockData
- Added comprehensive unit tests for new deduplication behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 Walkthrough
The data-loader.ts module's deduplication logic is refactored to keep the entry with the higher token count instead of simply skipping duplicates. A new exported utility function, getTotalTokensFromUsageData, computes the total token count used for that comparison.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/ccusage/src/data-loader.ts`:
- Around line 529-543: The function getTotalTokensFromUsageData is exported but
not used externally; remove the export to make it module-private by changing its
declaration to a non-exported function (keep the name
getTotalTokensFromUsageData and signature using UsageData so callers inside this
file still work), then run a quick grep across the repo to confirm no external
imports reference it and update any internal callers if needed.
🧹 Nitpick comments (1)
apps/ccusage/src/data-loader.ts (1)
490-496: Avoid a test-only helper drifting from production logic.
shouldReplaceExisting is only exercised in the test block, while the dedup paths inline the comparison. Consider using this helper in those paths (or removing it) to keep a single source of truth.
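One way to act on this nitpick is to route every dedup path through the helper. The sketch below is only an illustration: the signature of shouldReplaceExisting, the strict "greater than" rule, and the pickWinner wrapper are assumptions, not code taken from the PR.

```typescript
// Assumed signature: the PR's actual helper may take different arguments.
// Assumed rule: replace only when the incoming record has strictly more tokens,
// so ties keep the record that was seen first.
function shouldReplaceExisting(newTokens: number, existingTokens: number): boolean {
  return newTokens > existingTokens;
}

// Hypothetical wrapper that a dedup path could call instead of inlining
// the comparison, so tests and production code share one rule.
function pickWinner<T extends { tokens: number }>(existing: T, incoming: T): T {
  return shouldReplaceExisting(incoming.tokens, existing.tokens) ? incoming : existing;
}
```

With this shape, a unit test of shouldReplaceExisting exercises the same comparison that the loaders use.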
/**
 * Calculate total tokens from UsageData for deduplication comparison
 * Uses complete formula: input + output + cache_creation + cache_read
 * @param data - UsageData object
 * @returns Total number of tokens
 */
export function getTotalTokensFromUsageData(data: UsageData): number {
	const usage = data.message.usage;
	return (
		usage.input_tokens +
		usage.output_tokens +
		(usage.cache_creation_input_tokens ?? 0) +
		(usage.cache_read_input_tokens ?? 0)
	);
}
🧩 Analysis chain
🏁 Script executed:
rg -n "getTotalTokensFromUsageData" -g "!apps/ccusage/src/data-loader.ts"
Repository: ryoppippi/ccusage
Length of output: 43
Remove the export keyword from getTotalTokensFromUsageData. This function is not used outside this module and violates the guideline to export only symbols actually used by other modules.
Summary
This PR improves the deduplication logic in ccusage to handle an edge case where API proxies omit the requestId field, and ensures that the record with the highest token count is preserved during deduplication.

Changes
1. Modified createUniqueHash
- When requestId is missing, fall back to using only messageId for deduplication
- Previously returned null when requestId was missing, causing records to bypass deduplication

2. Added getTotalTokensFromUsageData
- Total tokens: input_tokens + output_tokens + cache_creation_input_tokens + cache_read_input_tokens

3. Updated deduplication logic
- Changed from Set<string> to Map<string, {index, tokens}> for tracking processed entries

4. Updated functions
- loadDailyUsageData
- loadSessionData
- loadSessionBlockData

Testing
Unit tests cover:
- createUniqueHash (including fallback behavior)
- getTotalTokensFromUsageData
- shouldReplaceExisting
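The changes described in this PR can be pictured with a minimal, self-contained sketch. It is not the code from data-loader.ts: the UsageRecord type is a simplified stand-in for UsageData, the "messageId:requestId" key format is an assumption, returning null when there is no message id is an assumption, and totalTokens and dedupeKeepingMostTokens are hypothetical names.

```typescript
// Simplified, hypothetical record shape; the real UsageData has more fields.
type UsageRecord = {
  requestId?: string;
  message: {
    id?: string;
    usage: {
      input_tokens: number;
      output_tokens: number;
      cache_creation_input_tokens?: number;
      cache_read_input_tokens?: number;
    };
  };
};

// Assumed key format: "messageId:requestId", or messageId alone when
// requestId is missing (the fallback this PR describes).
function createUniqueHash(data: UsageRecord): string | null {
  const messageId = data.message.id;
  if (messageId == null) return null; // assumption: nothing to key on
  return data.requestId != null ? `${messageId}:${data.requestId}` : messageId;
}

// Same formula as the PR states: input + output + cache creation + cache read.
function totalTokens(data: UsageRecord): number {
  const usage = data.message.usage;
  return (
    usage.input_tokens +
    usage.output_tokens +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0)
  );
}

// Track each key's position and token count so a later, larger duplicate
// can overwrite the earlier record in place.
function dedupeKeepingMostTokens(records: UsageRecord[]): UsageRecord[] {
  const processed = new Map<string, { index: number; tokens: number }>();
  const result: UsageRecord[] = [];
  for (const record of records) {
    const hash = createUniqueHash(record);
    if (hash == null) {
      result.push(record); // assumption: records without a key are kept as-is
      continue;
    }
    const tokens = totalTokens(record);
    const existing = processed.get(hash);
    if (existing == null) {
      processed.set(hash, { index: result.length, tokens });
      result.push(record);
    } else if (tokens > existing.tokens) {
      result[existing.index] = record;
      processed.set(hash, { index: existing.index, tokens });
    }
  }
  return result;
}
```

Storing the index alongside the token count is what a plain Set<string> cannot do: a Set can only say "seen before", while the Map also says where the earlier record sits and whether the new one should replace it.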