Summary
A Pi extension that tracks token usage, cost, and context window consumption per session, providing real-time analytics and optimization recommendations. Inspired by ECC's token-budget-advisor, strategic-compact, and model-route features. No Pi extension currently provides cost/context visibility.
Motivation
Developers using coding agents have no visibility into how much a session costs, which tools consume the most tokens, or when they're approaching context limits. This leads to unexpectedly expensive sessions and degraded output quality as context fills up. Basic analytics and compaction suggestions can save significant cost with minimal effort.
Proposed Features
1. Real-Time Token Tracking
- Hook into every turn to track input/output token counts
- Track per-tool token consumption (which tools are most expensive?)
- Track per-turn token consumption (which turns blew the budget?)
- Cumulative session cost estimate (using model pricing tables)
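The tracking described above amounts to a small accumulator. The following is a minimal sketch, assuming a hypothetical `TurnUsage` record that the extension fills in at the end of each turn; the field names are illustrative and not taken from the Pi API.

```typescript
// Hypothetical record shape; the real Pi hook payloads may differ.
interface TurnUsage {
  turn: number;
  inputTokens: number;
  outputTokens: number;
  // Tokens attributed to each tool used during the turn.
  toolTokens: Record<string, number>;
}

class TokenTracker {
  private turns: TurnUsage[] = [];

  recordTurn(usage: TurnUsage): void {
    this.turns.push(usage);
  }

  // Cumulative input/output totals for the session.
  totals(): { input: number; output: number } {
    let input = 0;
    let output = 0;
    for (const t of this.turns) {
      input += t.inputTokens;
      output += t.outputTokens;
    }
    return { input, output };
  }

  // Per-tool totals as [tool, tokens] pairs, most expensive first.
  byTool(): Array<[string, number]> {
    const sums: Record<string, number> = {};
    for (const t of this.turns) {
      for (const tool of Object.keys(t.toolTokens)) {
        sums[tool] = (sums[tool] ?? 0) + (t.toolTokens[tool] ?? 0);
      }
    }
    return Object.keys(sums)
      .map((tool): [string, number] => [tool, sums[tool] ?? 0])
      .sort((a, b) => b[1] - a[1]);
  }

  // The turn with the highest combined token count, if any.
  heaviestTurn(): TurnUsage | undefined {
    let best: TurnUsage | undefined;
    for (const t of this.turns) {
      const size = t.inputTokens + t.outputTokens;
      if (!best || size > best.inputTokens + best.outputTokens) best = t;
    }
    return best;
  }
}
```

`byTool()` answers "which tools are most expensive?" and `heaviestTurn()` answers "which turn blew the budget?".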
2. Session Stats Command (/session-stats)
- Total tokens used (input/output breakdown)
- Estimated session cost in USD
- Token consumption by tool (sorted descending)
- Context window utilization percentage
- Estimated turns remaining before context pressure
- Session duration and turn count
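As an illustration of how the utilization and turns-remaining figures could be derived, here is a rough sketch. The `SessionSnapshot` shape is hypothetical, and the turns-remaining estimate assumes context grows linearly at the session's average tokens per turn, which is only an approximation.

```typescript
// Hypothetical snapshot of the current session state.
interface SessionSnapshot {
  contextTokens: number; // tokens currently in the context window
  contextLimit: number; // size of the model's context window
  turns: number; // completed turns
  pressureThreshold: number; // fraction where pressure starts, e.g. 0.8
}

function contextUtilization(s: SessionSnapshot): number {
  return s.contextLimit > 0 ? s.contextTokens / s.contextLimit : 0;
}

// Linear extrapolation: how many average-sized turns fit before the threshold.
function turnsRemaining(s: SessionSnapshot): number | null {
  if (s.turns === 0 || s.contextTokens === 0) return null; // not enough data yet
  const avgPerTurn = s.contextTokens / s.turns;
  const budget = s.contextLimit * s.pressureThreshold - s.contextTokens;
  return Math.max(0, Math.floor(budget / avgPerTurn));
}

function formatStats(s: SessionSnapshot): string {
  const pct = (contextUtilization(s) * 100).toFixed(1);
  const left = turnsRemaining(s);
  return [
    `Turns: ${s.turns}`,
    `Context: ${s.contextTokens}/${s.contextLimit} tokens (${pct}%)`,
    `Turns before pressure: ${left === null ? "n/a" : left}`,
  ].join("\n");
}
```

A real /session-stats handler would add the cost and per-tool lines to the same output.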
3. Context Pressure Alerts
- Monitor context window fill level via turn metadata
- At 60%: informational note ("context at 60%, consider wrapping up complex tasks")
- At 80%: warning injected into system prompt ("approaching context limit, prioritize completing current task")
- At 90%: strong recommendation to compact or start new session
- Configurable thresholds
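The threshold logic above could look roughly like the sketch below. The type and function names are illustrative; the default values mirror the 60/80/90% levels listed above and can be overridden per call.

```typescript
type PressureLevel = "ok" | "info" | "warning" | "critical";

interface PressureThresholds {
  info: number;
  warning: number;
  critical: number;
}

// Defaults taken from the proposal: 60%, 80%, 90%.
const DEFAULT_THRESHOLDS: PressureThresholds = { info: 0.6, warning: 0.8, critical: 0.9 };

// Map a utilization fraction (0..1) to the highest threshold it has reached.
function pressureLevel(
  utilization: number,
  t: PressureThresholds = DEFAULT_THRESHOLDS,
): PressureLevel {
  if (utilization >= t.critical) return "critical";
  if (utilization >= t.warning) return "warning";
  if (utilization >= t.info) return "info";
  return "ok";
}

// Text for the alert, or null when no alert is due.
function pressureMessage(
  utilization: number,
  t: PressureThresholds = DEFAULT_THRESHOLDS,
): string | null {
  const pct = Math.round(utilization * 100);
  switch (pressureLevel(utilization, t)) {
    case "info":
      return `context at ${pct}%, consider wrapping up complex tasks`;
    case "warning":
      return `context at ${pct}%: approaching context limit, prioritize completing current task`;
    case "critical":
      return `context at ${pct}%: compact or start a new session`;
    default:
      return null;
  }
}
```

In practice the extension would probably emit a message only when the level rises, so the same alert is not repeated every turn.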
4. Model Routing Suggestions
- Detect task complexity heuristics (file count, diff size, error count)
- Suggest Haiku for simple tasks (single-file edits, formatting, simple lookups)
- Suggest Sonnet for complex tasks (multi-file refactors, architecture decisions)
- /model-suggest command for on-demand recommendation
- Track cost savings from model switches
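One way to express the routing heuristic is sketched below. The `TaskSignals` shape and every cutoff value are placeholders chosen for illustration; real thresholds would need tuning against actual sessions.

```typescript
// Hypothetical complexity signals gathered from the current task.
interface TaskSignals {
  fileCount: number; // files touched or referenced
  diffLines: number; // size of the pending diff
  errorCount: number; // errors seen so far in the session
}

type ModelTier = "haiku" | "sonnet";

// Placeholder cutoffs, not derived from measurements.
const MAX_SIMPLE_FILES = 1;
const MAX_SIMPLE_DIFF_LINES = 50;
const MAX_SIMPLE_ERRORS = 0;

function suggestModel(s: TaskSignals): { tier: ModelTier; reason: string } {
  if (s.fileCount > MAX_SIMPLE_FILES) {
    return { tier: "sonnet", reason: `multi-file task (${s.fileCount} files)` };
  }
  if (s.diffLines > MAX_SIMPLE_DIFF_LINES) {
    return { tier: "sonnet", reason: `large diff (${s.diffLines} lines)` };
  }
  if (s.errorCount > MAX_SIMPLE_ERRORS) {
    return { tier: "sonnet", reason: `${s.errorCount} error(s) suggest a harder task` };
  }
  return { tier: "haiku", reason: "single-file, small, error-free task" };
}
```

Returning a reason alongside the tier lets /model-suggest explain its recommendation rather than just name a model.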
5. Compaction Suggestions
- Detect natural breakpoints (feature complete, tests passing, PR created)
- Suggest /compact at these points to reclaim context
- Track what was lost in compaction for session continuity
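A possible shape for the breakpoint check and the compaction record is sketched here. The event kinds, the `minUtilization` floor, and the `CompactionRecord` fields are all assumptions made for illustration; how breakpoints are actually detected is left open.

```typescript
// Hypothetical event kinds the extension might derive from session activity.
type EventKind = "feature_complete" | "tests_passed" | "pr_created" | "other";

const BREAKPOINTS: EventKind[] = ["feature_complete", "tests_passed", "pr_created"];

// Suggest /compact only at a natural breakpoint, and only once the context
// is full enough for compaction to be worth it (floor is a placeholder).
function shouldSuggestCompact(
  lastEvent: EventKind,
  utilization: number,
  minUtilization: number = 0.4,
): boolean {
  return BREAKPOINTS.indexOf(lastEvent) !== -1 && utilization >= minUtilization;
}

// What gets recorded when a compaction happens, for session continuity.
interface CompactionRecord {
  tokensBefore: number;
  tokensAfter: number;
  summary: string; // short note on what was dropped
}

function tokensReclaimed(r: CompactionRecord): number {
  return Math.max(0, r.tokensBefore - r.tokensAfter);
}
```

Gating on utilization avoids nagging the user to compact a nearly empty context just because tests passed.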
Pi Extension API Integration
| API Surface | Usage |
| --- | --- |
| turn_start / turn_end hooks | Track token usage per turn |
| tool_execution_start / tool_execution_end hooks | Track per-tool token consumption |
| model_select hook | Track model switches and cost implications |
| session_compact hook | Record compaction events |
| before_agent_start hook | Inject context pressure warnings |
| pi.registerCommand() | /session-stats, /model-suggest |
| pi.registerTool() | session_stats tool for LLM self-awareness |
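To show how these pieces might be wired together, the sketch below uses a deliberately tiny stand-in interface, `MinimalPi`. It is not the real Pi extension API: the `on` and `registerCommand` signatures and the payload fields (`inputTokens`, `outputTokens`, `toolName`, `tokens`) are assumptions, and only the hook and command names come from the table above.

```typescript
// Hypothetical stand-in for the Pi extension API; real signatures may differ.
type Handler = (payload: Record<string, unknown>) => void;

interface MinimalPi {
  on(event: string, handler: Handler): void;
  registerCommand(name: string, run: () => string): void;
}

// Read a numeric field defensively, since payload shapes are assumed.
function numberOr(value: unknown, fallback: number): number {
  return typeof value === "number" ? value : fallback;
}

function registerSessionAnalytics(pi: MinimalPi): void {
  let input = 0;
  let output = 0;
  const perTool: Record<string, number> = {};

  // Accumulate per-turn token counts.
  pi.on("turn_end", (p) => {
    input += numberOr(p["inputTokens"], 0);
    output += numberOr(p["outputTokens"], 0);
  });

  // Attribute tokens to the tool that just finished.
  pi.on("tool_execution_end", (p) => {
    const name = p["toolName"];
    const tool = typeof name === "string" ? name : "unknown";
    perTool[tool] = (perTool[tool] ?? 0) + numberOr(p["tokens"], 0);
  });

  // Expose the running totals on demand.
  pi.registerCommand("session-stats", () =>
    `input=${input} output=${output} tools=${JSON.stringify(perTool)}`,
  );
}
```

Keeping all state in the closure means the extension needs no LLM calls and no global variables; persistence can be layered on separately.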
Implementation Notes
- Token counts may need to be estimated from character counts if the Pi API does not expose them directly
- Cost tables: maintain a simple JSON mapping of model -> price per 1K tokens (input/output)
- Store session analytics in ~/.pi/session-analytics/sessions/<session-id>.json
- Aggregate analytics across sessions for trend reporting
- Lightweight: no LLM calls required, pure bookkeeping
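The bookkeeping in these notes can be sketched in a few functions. Everything here is illustrative: the 4-characters-per-token ratio is a common rough heuristic rather than a tokenizer, the model names and prices in `PRICES` are placeholders and not real pricing, and `sessionFilePath` simply follows the storage layout given above.

```typescript
// Rough heuristic only; real tokenizers vary by model and content.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// USD per 1K tokens.
interface ModelPrice {
  inputPer1K: number;
  outputPer1K: number;
}

// Placeholder models and prices for illustration; not real pricing.
const PRICES: Record<string, ModelPrice> = {
  "example-small": { inputPer1K: 0.001, outputPer1K: 0.005 },
  "example-large": { inputPer1K: 0.003, outputPer1K: 0.015 },
};

// Returns null for models missing from the table instead of guessing.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | null {
  const price = PRICES[model];
  if (!price) return null;
  return (inputTokens / 1000) * price.inputPer1K + (outputTokens / 1000) * price.outputPer1K;
}

// Builds the per-session analytics path under the given home directory.
function sessionFilePath(homeDir: string, sessionId: string): string {
  return `${homeDir}/.pi/session-analytics/sessions/${sessionId}.json`;
}
```

Returning `null` for an unknown model keeps the stats output honest: a missing price shows up as "unknown" rather than as a silently wrong dollar figure.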
Prior Art
- ECC token-budget-advisor: token budget optimization
- ECC strategic-compact: suggests compaction at logical intervals
- ECC model-route: model selection routing
- ECC ecc-tools-cost-audit: cost analysis for ECC tools
- No existing Pi extension provides token/cost analytics
Effort Estimate
Low. Pure bookkeeping with hooks, no LLM calls. The hardest part is getting accurate token counts from the Pi API (may require estimation if not directly exposed).