@ScriptedAlchemy

Goal: improve performance without changing outputs or data shapes.

Changes (intentionally minimal / compatible):

  • getCodexFirstPromptTimestamp now streams ~/.codex/history.jsonl instead of readFile().
  • Adds a fast pre-filter to skip JSON.parse for irrelevant lines.
  • Adds bounded concurrency across session files (min(cpuCount, 8)).
  • Preserves existing CodexUsageData output (including events + sorting).
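For readers without the diff open, here is a minimal sketch of the first two changes: reading a JSONL file line by line and running a cheap substring check before JSON.parse. It is not the code in src/collector.ts. The `HistoryEntry` shape, the `"ts"` marker, and the names `parseIfRelevant` and `firstTimestamp` are assumptions made for illustration, and the sketch assumes the file exists.

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

// Hypothetical shape of one history line; the real fields may differ.
export interface HistoryEntry {
  ts: number;
  text: string;
}

// Cheap pre-filter: a line that does not contain the marker cannot be
// relevant, so it never reaches JSON.parse.
export function parseIfRelevant(line: string, marker: string): HistoryEntry | null {
  if (!line.includes(marker)) return null;
  try {
    const parsed: unknown = JSON.parse(line);
    if (typeof parsed === "object" && parsed !== null) {
      const entry = parsed as Partial<HistoryEntry>;
      if (typeof entry.ts === "number" && typeof entry.text === "string") {
        return { ts: entry.ts, text: entry.text };
      }
    }
  } catch {
    // Malformed line: treat it as irrelevant instead of failing the scan.
  }
  return null;
}

// Streams the file one line at a time, so memory use stays flat
// regardless of file size. Returns the smallest `ts` seen, or null.
export async function firstTimestamp(
  path: string,
  marker = '"ts"',
): Promise<number | null> {
  const rl = createInterface({
    input: createReadStream(path, { encoding: "utf8" }),
    crlfDelay: Infinity,
  });
  let earliest: number | null = null;
  try {
    for await (const line of rl) {
      const entry = parseIfRelevant(line, marker);
      if (entry !== null && (earliest === null || entry.ts < earliest)) {
        earliest = entry.ts;
      }
    }
  } finally {
    rl.close();
  }
  return earliest;
}
```

The substring check can produce false positives (the marker may appear inside a string value), which is why the parsed result is still validated before use.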

Perf on this machine (sessions from 2025: ~1.1 GB across 3706 files):

  • collectCodexUsageData(2025) ~3.0s (was ~6.7s).

@ScriptedAlchemy

Reviewed.

  • Scope looks minimal: only src/collector.ts changes; CodexUsageData shape + events sorting preserved; stats.ts untouched.
  • Perf wins come from (1) streaming history.jsonl instead of readFile(), (2) skipping JSON.parse for irrelevant lines, and (3) bounded file-level concurrency.
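For context on point (3), bounded file-level concurrency can be sketched as a small worker pool. This is an illustration under assumptions, not the PR's implementation: `mapWithConcurrency` and `defaultLimit` are hypothetical names. Results are written by input index, which is one way to keep output order independent of completion order.

```typescript
import { cpus } from "node:os";

// Runs `worker` over `items` with at most `limit` calls in flight.
// Each result is stored at its input index, so the returned array
// matches input order even when later items finish first.
export async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor; safe because JS runs one callback at a time

  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  };

  const lanes = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: lanes }, () => runner()));
  return results;
}

// Mirrors the min(cpuCount, 8) cap from the description; the lower bound
// of 1 covers platforms where cpus() reports an empty list.
export const defaultLimit = Math.max(1, Math.min(cpus().length, 8));
```

A call such as `mapWithConcurrency(files, defaultLimit, parseOneFile)` would process every file while never holding more than `defaultLimit` open at once; `files` and `parseOneFile` are placeholders here.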

Notes:

  • Concurrency is the only meaningful behavioral change (IO pattern). min(cpuCount, 8) seems reasonable; if we want to be extra conservative for slower disks, we could cap at 4.
  • getTopLevelType plus the "payload" guard is a good safeguard against matching nested "type" fields.
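To illustrate why the "payload" guard matters, here is one way such a check might work without parsing the line. This is a guess at the idea, not the PR's getTopLevelType: `sketchTopLevelType` is a hypothetical name, and the sketch assumes compact JSON (no space after the colon), a top-level "type" that appears before "payload", and type values with no escaped quotes. The event names in the usage note are illustrative too.

```typescript
// Returns the value of a top-level "type" field when it can be read
// cheaply, or null when the line should be skipped or fully parsed.
export function sketchTopLevelType(line: string): string | null {
  const key = '"type":"';
  const typeAt = line.indexOf(key);
  if (typeAt === -1) return null;

  // If "payload" starts before the first "type", that "type" may be a
  // nested field inside the payload, so refuse to trust it.
  const payloadAt = line.indexOf('"payload"');
  if (payloadAt !== -1 && payloadAt < typeAt) return null;

  const start = typeAt + key.length;
  const end = line.indexOf('"', start);
  return end === -1 ? null : line.slice(start, end);
}
```

Under these assumptions, a line like `{"type":"event_msg","payload":{"type":"token_count"}}` yields `event_msg`, while `{"payload":{"type":"token_count"}}` yields null instead of wrongly reporting the nested value.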

Local perf on this machine (sessions from 2025: ~1.1 GB across 3706 files): collectCodexUsageData(2025) ~3.0s (was ~6.7s).
