Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
138 changes: 138 additions & 0 deletions PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
# Plan: Add Cache Token Stats to `/stats` Models Tab

## Problem

The `/stats` Models tab (`ODK` component) displays per-model token usage but omits
prompt cache data. The `usage` object passed to `ODK` already contains
`cacheReadInputTokens` and `cacheCreationInputTokens` — they are collected for all
users (API and subscription) via `pd6()` and facet aggregation — but the component
only renders `inputTokens` and `outputTokens`.

### Current output

```
• Sonnet 4 (85.2%)
In: 1.2M · Out: 45K
```

### Proposed output

```
• Sonnet 4 (85.2%)
In: 1.2M · Out: 45K · Cache R: 890K · Cache W: 312K
Cache hit: 67%
```

## Scope

One component change: `ODK` (the per-model usage card in the `/stats` Models tab).

No changes to `/cost`, no changes to pricing display, no new commands.

## Implementation

### Location

The `ODK` component in the source (minified name in `cli.js`). In the original
TypeScript source, this is the component rendered for each model entry in the
Models tab of the `/stats` command.

### Current component (decompiled from minified)

```tsx
function ODK({ model, usage, totalTokens }) {
const pct = ((usage.inputTokens + usage.outputTokens) / totalTokens * 100).toFixed(1);
const modelName = sj(model);

return (
<Box flexDirection="column">
<Text>
{bullet} <Text bold>{modelName}</Text> <Text color="subtle">({pct}%)</Text>
</Text>
<Text color="subtle">
{" "}In: {M3(usage.inputTokens)} · Out: {M3(usage.outputTokens)}
</Text>
</Box>
);
}
```

### Proposed change

```tsx
function ODK({ model, usage, totalTokens }) {
const pct = ((usage.inputTokens + usage.outputTokens) / totalTokens * 100).toFixed(1);
const modelName = sj(model);

const hasCacheData = usage.cacheReadInputTokens > 0 || usage.cacheCreationInputTokens > 0;

const totalInput = usage.inputTokens + usage.cacheReadInputTokens + usage.cacheCreationInputTokens;
const cacheHitPct = totalInput > 0
? (usage.cacheReadInputTokens / totalInput * 100).toFixed(0)
: null;

return (
<Box flexDirection="column">
<Text>
{bullet} <Text bold>{modelName}</Text> <Text color="subtle">({pct}%)</Text>
</Text>
<Text color="subtle">
{" "}In: {M3(usage.inputTokens)} · Out: {M3(usage.outputTokens)}
{hasCacheData && ` · Cache R: ${M3(usage.cacheReadInputTokens)} · Cache W: ${M3(usage.cacheCreationInputTokens)}`}
</Text>
{cacheHitPct !== null && (
<Text color="subtle">
{" "}Cache hit: {cacheHitPct}%
</Text>
)}
</Box>
);
}
```

### What changes

1. **Add `hasCacheData` guard** — only show cache tokens when at least one is > 0.
Avoids noise for models or sessions with no caching.

2. **Append cache read/write to the existing token line** — uses the same `M3()`
compact number formatter (e.g. "890K") and same `· ` separator style.

3. **Add a cache hit rate line** — `cacheRead / (input + cacheRead + cacheCreation) * 100`.
This matches the formula already used in telemetry (`cacheHitRate` in
`tengu_fork_agent_query`). Only shown when `totalInput > 0` to avoid division
by zero.

### What does NOT change

- **Data collection** — `pd6()` already accumulates `cache_read_input_tokens` and
`cache_creation_input_tokens` for all users. No change needed.
- **Facet storage** — `v16()` already persists cache token counts to disk. No change.
- **Facet aggregation** — The stats aggregation code already sums cache tokens per
model across sessions. No change.
- **`/cost` command** — Remains hidden for subscription users. Out of scope.
- **Dollar amounts** — Not shown. Out of scope.
- **`/stats` Overview tab** — Not modified. Could be a follow-up.

## Edge cases

| Case | Behavior |
|------|----------|
| Caching disabled (`DISABLE_PROMPT_CACHING=1`) | Both cache fields are 0, `hasCacheData` is false, cache line hidden |
| Model with zero cache tokens | Cache line suppressed, only `In / Out` shown |
| All tokens from cache (100% hit) | Shows `Cache hit: 100%` |
| Only cache writes, no reads | Shows `Cache R: 0 · Cache W: 312K` and `Cache hit: 0%` |
| Division by zero (no input at all) | `cacheHitPct` is `null`, line not rendered |
| Historical sessions without cache fields | Defensive `> 0` check handles `undefined` (falsy) |

## Dependencies

None. The `usage` prop already carries the required fields. The `M3()` formatter
is already in scope within the component's module.

## Testing

- Verify `ODK` renders cache line only when cache data > 0
- Verify cache hit rate math: `890000 / (1200000 + 890000 + 312000) * 100 ≈ 37%`
- Verify zero-data case renders only `In / Out`
- Verify both API and subscription users see cache data in `/stats`
97 changes: 97 additions & 0 deletions scripts/patch-stats-cache.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
#!/usr/bin/env node
/**
* Patches the compiled Claude Code cli.js to add cache token stats
* (cache read, cache write, cache hit rate) to the /stats Models tab.
*
* Usage: node scripts/patch-stats-cache.js [path-to-cli.js]
*
* Default path: /opt/node22/lib/node_modules/@anthropic-ai/claude-code/cli.js
*/

var fs = require("fs");
var path = require("path");

var cliPath = process.argv[2] ||
"/opt/node22/lib/node_modules/@anthropic-ai/claude-code/cli.js";

if (!fs.existsSync(cliPath)) {
console.error("cli.js not found at:", cliPath);
process.exit(1);
}

var src = fs.readFileSync(cliPath, "utf8");

// Locate the ODK function (per-model usage card in /stats Models tab)
var idx = src.indexOf("function ODK(A)");
if (idx === -1) {
console.error("ERROR: function ODK(A) not found in cli.js. Is this the right file?");
process.exit(1);
}

// Extract the full function by brace-matching
var d = 0, i = idx, end = -1;
while (i < src.length && i < idx + 10000) {
if (src[i] === "{") d++;
else if (src[i] === "}") { d--; if (d === 0) { end = i; break; } }
i++;
}
if (end === -1) {
console.error("ERROR: Could not find end of ODK function");
process.exit(1);
}

var oldFn = src.slice(idx, end + 1);

// Check if already patched
if (oldFn.indexOf("cacheReadInputTokens") >= 0) {
console.log("Already patched. Nothing to do.");
process.exit(0);
}

// Find the cut point: the final assembly of the Box element (slots 18-20)
var cutMarker = "let j;if(K[18]";
var cutPoint = oldFn.indexOf(cutMarker);
if (cutPoint === -1) {
console.error("ERROR: Could not find final assembly marker in ODK function.");
console.error("The function may have changed. Manual inspection required.");
process.exit(1);
}

var keepPart = oldFn.slice(0, cutPoint);

// New tail: compute cache stats and add them as extra children in the Box
var newTail = [
"var _cr=Y.cacheReadInputTokens||0,_cw=Y.cacheCreationInputTokens||0,",
"_hc=_cr>0||_cw>0,",
"_ti=Y.inputTokens+_cr+_cw,",
'_hp=_ti>0?(_cr/_ti*100).toFixed(0):null,',
"_cEl=null,_hEl=null;",
'if(_hc)_cEl=D4.default.createElement(f,{color:"subtle"},',
'" Cache R: ",M3(_cr)," \\u00b7 Cache W: ",M3(_cw));',
'if(_hc&&_hp!==null)_hEl=D4.default.createElement(f,{color:"subtle"},',
'" Cache hit: ",_hp,"%");',
'var j=D4.default.createElement(I,{flexDirection:"column"},Z,D,_cEl,_hEl);',
"return j}"
].join("");

var newFn = keepPart + newTail;

// Apply patch
var backupPath = cliPath + ".bak";
if (!fs.existsSync(backupPath)) {
fs.copyFileSync(cliPath, backupPath);
console.log("Backup saved to:", backupPath);
}

var newSrc = src.replace(oldFn, newFn);
fs.writeFileSync(cliPath, newSrc);

console.log("Patch applied successfully.");
console.log(" Old function: " + oldFn.length + " chars");
console.log(" New function: " + newFn.length + " chars");
console.log(" Added: " + (newFn.length - oldFn.length) + " chars");
console.log("");
console.log("The /stats Models tab will now show:");
console.log(" • Model Name (XX.X%)");
console.log(" In: XXK · Out: XXK · Cache R: XXK · Cache W: XXK");
console.log(" Cache hit: XX%");
126 changes: 126 additions & 0 deletions scripts/test-patch-stats-cache.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
#!/usr/bin/env node
/**
* Tests the patched ODK function from cli.js by extracting it
* and running it with mocked React/Ink dependencies.
*
* Usage: node scripts/test-patch-stats-cache.js [path-to-cli.js]
*/

var fs = require("fs");

var cliPath = process.argv[2] ||
"/opt/node22/lib/node_modules/@anthropic-ai/claude-code/cli.js";

// Mock dependencies
var React = {
createElement: function(type, props) {
var children = Array.prototype.slice.call(arguments, 2);
return { type: typeof type === "string" ? type : "component", props: props, children: children };
}
};
var D4 = { default: React };
var f = "Text";
var I = "Box";
var aA = { bullet: "\u2022" };
function a(n) { return new Array(n).fill(Symbol.for("react.memo_cache_sentinel")); }
function sj(m) { return m; }
function M3(n) {
if (n >= 1000000) return (n / 1000000).toFixed(1) + "M";
if (n >= 1000) return (n / 1000).toFixed(1) + "K";
return String(n);
}

// Extract ODK from cli.js
var src = fs.readFileSync(cliPath, "utf8");
var idx = src.indexOf("function ODK(A)");
if (idx === -1) { console.error("ODK not found"); process.exit(1); }
var d = 0, i = idx, end = -1;
while (i < src.length && i < idx + 10000) {
if (src[i] === "{") d++;
else if (src[i] === "}") { d--; if (d === 0) { end = i; break; } }
i++;
}
var fnSrc = src.slice(idx, end + 1);
var ODK = new Function("a", "sj", "M3", "D4", "f", "I", "aA",
"return " + fnSrc)(a, sj, M3, D4, f, I, aA);

// Helpers
function render(el) {
if (el === null || el === undefined) return "";
if (typeof el === "string" || typeof el === "number") return String(el);
var out = "";
if (el.children) el.children.forEach(function(c) { out += render(c); });
return out;
}

function getLines(result) {
var lines = [];
result.children.forEach(function(c) { if (c) lines.push(render(c)); });
return lines;
}

var passed = 0, failed = 0;
function assert(cond, msg) {
if (cond) { passed++; console.log(" \u2713 " + msg); }
else { failed++; console.log(" \u2717 FAIL: " + msg); }
}

// Test 1: With cache data
console.log("\nTest 1: With cache data");
var r1 = getLines(ODK({
model: "claude-sonnet-4-20250514",
usage: { inputTokens: 1200000, outputTokens: 45000, cacheReadInputTokens: 890000, cacheCreationInputTokens: 312000 },
totalTokens: 2447000
}));
r1.forEach(function(l) { console.log(" " + l); });
assert(r1.length === 4, "Should have 4 lines (header, in/out, cache, hit rate)");
assert(r1[2].indexOf("Cache R:") >= 0, "Line 3 contains 'Cache R:'");
assert(r1[2].indexOf("Cache W:") >= 0, "Line 3 contains 'Cache W:'");
assert(r1[3].indexOf("Cache hit:") >= 0, "Line 4 contains 'Cache hit:'");
assert(r1[3].indexOf("37%") >= 0, "Hit rate is 37%");

// Test 2: No cache data
console.log("\nTest 2: No cache data (zero values)");
var r2 = getLines(ODK({
model: "claude-sonnet-4-20250514",
usage: { inputTokens: 500000, outputTokens: 20000, cacheReadInputTokens: 0, cacheCreationInputTokens: 0 },
totalTokens: 520000
}));
r2.forEach(function(l) { console.log(" " + l); });
assert(r2.length === 2, "Should have 2 lines (no cache lines)");

// Test 3: Cache writes only
console.log("\nTest 3: Cache writes only (0% hit rate)");
var r3 = getLines(ODK({
model: "claude-haiku-3.5",
usage: { inputTokens: 100000, outputTokens: 5000, cacheReadInputTokens: 0, cacheCreationInputTokens: 50000 },
totalTokens: 155000
}));
r3.forEach(function(l) { console.log(" " + l); });
assert(r3.length === 4, "Should have 4 lines");
assert(r3[3].indexOf("0%") >= 0, "Hit rate is 0%");

// Test 4: Undefined cache fields (backward compat)
console.log("\nTest 4: Undefined cache fields");
var r4 = getLines(ODK({
model: "claude-opus-4-20250514",
usage: { inputTokens: 200000, outputTokens: 10000 },
totalTokens: 210000
}));
r4.forEach(function(l) { console.log(" " + l); });
assert(r4.length === 2, "Should have 2 lines (undefined = no cache)");

// Test 5: 100% cache hit
console.log("\nTest 5: 100% cache hit rate");
var r5 = getLines(ODK({
model: "claude-sonnet-4-20250514",
usage: { inputTokens: 0, outputTokens: 1000, cacheReadInputTokens: 500000, cacheCreationInputTokens: 0 },
totalTokens: 501000
}));
r5.forEach(function(l) { console.log(" " + l); });
assert(r5.length === 4, "Should have 4 lines");
assert(r5[3].indexOf("100%") >= 0, "Hit rate is 100%");

// Summary
console.log("\n" + passed + " passed, " + failed + " failed");
process.exit(failed > 0 ? 1 : 0);