From 2277c06afa1aac97bab7198669a16b48a4dc0d67 Mon Sep 17 00:00:00 2001 From: emireaksay-8867 Date: Mon, 27 Apr 2026 21:07:53 +0200 Subject: [PATCH] docs: changelog and cleanup Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 30 +- PLAN.md | 286 ----------------- docs/facts-first-design.md | 643 ------------------------------------- docs/ui-scenarios.md | 524 ------------------------------ 4 files changed, 18 insertions(+), 1465 deletions(-) delete mode 100644 PLAN.md delete mode 100644 docs/facts-first-design.md delete mode 100644 docs/ui-scenarios.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 5e77993..6940151 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,20 +1,26 @@ # Changelog +All notable changes to Keddy are listed here. + ## [0.1.2] - 2026-04-27 ### Changed -- README copy updates +- Restructured the README around what each surface gives the user: exchanges, plans, notes, daily notes, activity analysis, dashboard, and the MCP tools. +- Embedded the hero demo video natively via GitHub attachments. + +### Fixed +- Pre-launch capture, parsing, and dashboard fixes. -## [0.1.0] - 2024-03-21 +## [0.1.1] - 2026-04-23 + +Initial public release. ### Added -- JSONL transcript parser with support for tool calls, plan mode, compaction, interrupts -- SQLite database with FTS5 full-text search -- Programmatic analysis: segments, plans, milestones -- 4 Claude Code hooks (SessionStart, Stop, PostCompact, SessionEnd) -- CLI: init, open, status, config, import -- MCP server with 4 session intelligence tools -- Dashboard API (Hono) with session, plan, stats, config routes -- React dashboard with session list, timeline, plan viewer, settings -- Optional AI analysis layer (titles, summaries, decisions) -- Comprehensive test suite with integration tests +- Capture pipeline running off four Claude Code hooks, writing to a local SQLite database with FTS5 full-text search. 
+- 11 MCP tools, all exposed by default, covering session search, session reading at varying depth, plan version history, file lookups, project status, recent activity, and saved session and daily notes. +- Local dashboard at `localhost:3737` with Activity, Plans, and Notes tabs. +- Optional AI layer (session notes, daily notes, activity analysis) using your own Anthropic API key. Off by default. +- CLI: `init`, `open`, `status`, `import`, `reimport`, `backfill`, `config`, `version`, `help`. + +### Security +- Dashboard hardened against CSRF, SQL injection, and SVG-based XSS before public release. diff --git a/PLAN.md b/PLAN.md deleted file mode 100644 index c1ff49f..0000000 --- a/PLAN.md +++ /dev/null @@ -1,286 +0,0 @@ -# Keddy — Full Implementation Plan - -## Context - -Keddy is a session intelligence tool for Claude Code. It captures every coding session, organizes transcripts into navigable timelines with plan version tracking, and provides MCP tools for Claude to search past sessions. It is NOT a memory injection layer — it's a session organizer with search. - -**Problem:** Claude Code sessions generate rich decision-making context (plans, revisions, architectural discussions, direction changes) that vanishes after the session ends. JSONL transcripts are unreadable. Compaction loses context. - -**Solution:** Auto-capture sessions via hooks, extract structure programmatically (plans, segments, milestones), store in SQLite, surface via dashboard + MCP tools. - -**Repository:** `emireaksay-8867/keddy` at `~/Documents/GitHub/keddy` -**Git identity:** Emir Enes Aksay / emire.aksay@gmail.com (per-repo config) -**GitHub auth:** emireaksay-8867 (active via `gh auth`) - ---- - -## Phase 1: Repository Scaffold - -Create `~/Documents/GitHub/keddy`, init git with per-repo config, create GitHub repo. 
- -**Files:** -- `package.json` — name: "keddy", version: "0.1.0", license: Apache-2.0, bin: keddy → dist/cli/index.js - - deps: better-sqlite3, @modelcontextprotocol/sdk, hono, @hono/node-server, zod, open - - devDeps: typescript, tsup, vite, vitest, @types/better-sqlite3, @types/node, tailwindcss, @tailwindcss/vite, react, react-dom, react-router, @types/react, @types/react-dom - - optionalDeps: @anthropic-ai/sdk -- `tsconfig.json` — target ES2022, module NodeNext, strict, outDir dist -- `tsup.config.ts` — entries: cli/index, capture/handler, mcp/server; format cjs; external better-sqlite3 -- `.gitignore`, `.nvmrc` (22), `.editorconfig` - -**Commit:** "chore: initialize keddy repository" - -## Phase 2: Shared Types - -- `src/types.ts` — All TypeScript interfaces: Session, Exchange, ToolCall, Plan, PlanStatus, Segment, SegmentType, Milestone, MilestoneType, Decision, CompactionEvent, SessionLink, ParsedExchange, KeddyConfig, AnalysisConfig, AnalysisFeature - -**Commit:** "feat: add shared type definitions" - -## Phase 3: Database Layer - -- `src/db/index.ts` — initDb(dbPath?), getDb(), closeDb(). Path: KEDDY_DB env || ~/.keddy/keddy.db. WAL mode, foreign_keys ON, busy_timeout 5000 -- `src/db/schema.ts` — 9 tables: sessions, exchanges, tool_calls, plans, segments, milestones, decisions, compaction_events, session_links. FTS5 virtual table on exchanges(user_prompt). Indexes on all foreign keys -- `src/db/queries.ts` — Insert/update/query prepared statements for all tables. Key queries: insertSession, upsertSession, insertExchange, insertToolCall, insertPlan, insertSegment, insertMilestone, insertDecision, insertCompactionEvent, getSession, getSessionExchanges, getSessionPlans, getSessionSegments, getSessionMilestones, searchSessions (FTS5), getRecentSessions, getStats, getConfig, setConfig - -**Commit:** "feat: add database layer with schema and queries" - -## Phase 4: JSONL Parser - -- `src/capture/parser.ts` — Core parser. Two modes: - 1. 
`parseTranscript(filePath)` — Full parse, returns all exchanges with tool calls, metadata - 2. `parseLatestExchanges(filePath, since?)` — Efficient tail parse for Stop hook - - **Detection rules (all deterministic):** - - User messages: `type === "user"` AND NOT `isCompactSummary` - - Assistant messages: `type === "assistant"` - - Tool uses: content blocks with `type === "tool_use"` (extract name, input, id) - - Tool results: content blocks with `type === "tool_result"` (match by tool_use_id) - - Plan mode enter: `tool_use.name === "EnterPlanMode"` - - Plan mode exit: `tool_use.name === "ExitPlanMode"` (input.plan has full text) - - Plan approved: tool_result contains `"User has approved your plan"` - - Plan rejected: tool_result contains `"doesn't want to proceed"` - - User feedback: parse text after `"the user said:\n"` in rejection tool_result - - User interrupt: text content `=== "[Request interrupted by user]"` or `"[Request interrupted by user for tool use]"` - - Compaction boundary: `type === "system" && subtype === "compact_boundary"` with `compactMetadata` - - Compact summary: `isCompactSummary === true` on subsequent user entry - - Session continuation: `forkedFrom` field on first entries - - SKIP: `type === "progress"`, `type === "queue-operation"`, `type === "file-history-snapshot"` - - Metadata: sessionId, cwd, gitBranch, version, slug, timestamp from entries - -**Commit:** "feat: add JSONL transcript parser" - -## Phase 5: Programmatic Analyzer - -- `src/capture/plans.ts` — Walk exchanges for EnterPlanMode/ExitPlanMode pairs. Extract plan text, detect approval/rejection, extract user feedback, assign version numbers. 
Return Plan[] -- `src/capture/segments.ts` — Sliding window (3 exchanges) segment detection: - - "planning": EnterPlanMode active - - "implementing": >=50% Edit/Write tool calls - - "testing": Bash with test/jest/vitest/pytest - - "debugging": tool errors + subsequent edits/discussion - - "exploring": mostly Read/Grep/Glob, no edits - - "discussion": no tool calls - - "pivot": user interrupt + direction change - - "deploying": git push/deploy commands - - Merge adjacent same-type. Min 2 exchanges per segment. Track files + tool counts per segment -- `src/capture/milestones.ts` — Regex on Bash tool inputs: - - `git commit -m` → commit + message - - `git push` → push + remote/branch - - `gh pr create` → PR - - `git checkout -b` → branch - - test commands → test_pass/test_fail based on error status -- `src/capture/github.ts` — Parse git remote URL → owner/repo. Construct commit/branch/file URLs. Optional: shell to `gh pr view --json` for PR enrichment if gh available - -**Commit:** "feat: add programmatic analyzer (segments, plans, milestones, github)" - -## Phase 6: Capture Handler (Hooks) - -- `src/capture/handler.ts` — Main hook entry, reads stdin JSON, routes by event: - - **SessionStart** (sync): upsert session, count previous sessions, write stdout with additionalContext nudge - - **Stop** (async): parse latest exchange from JSONL, store exchange + tool calls, detect new compaction boundaries - - **PostCompact** (async): store compaction event with summary from stdin `compact_summary` - - **SessionEnd** (async): mark session ended, run full transcript parse, run programmatic analysis (segments/plans/milestones), store all results, detect session links (shared files with recent sessions), optionally trigger AI analysis - -**Hook registration (4 hooks):** -```json -{ - "SessionStart": [{ "hooks": [{ "type": "command", "command": "node /path/dist/capture/handler.js SessionStart" }] }], - "Stop": [{ "hooks": [{ "type": "command", "command": "node 
/path/dist/capture/handler.js Stop", "async": true }] }], - "PostCompact": [{ "matcher": ".*", "hooks": [{ "type": "command", "command": "node /path/dist/capture/handler.js PostCompact", "async": true }] }], - "SessionEnd": [{ "matcher": ".*", "hooks": [{ "type": "command", "command": "node /path/dist/capture/handler.js SessionEnd", "async": true }] }] -} -``` - -**Reference:** `/Users/emiraksay/Documents/GitHub/KEDDY/cli/lib/claude.js` for hook install/remove pattern -**Reference:** `/Users/emiraksay/Documents/GitHub/KEDDY/hooks/capture.js` for stdin parsing and stdout injection - -**Commit:** "feat: add capture handler with 4 Claude Code hooks" - -## Phase 7: CLI - -- `src/cli/index.ts` — Entry point with shebang, command router -- `src/cli/init.ts` — Check ~/.claude exists, create ~/.keddy/, init DB, install hooks into ~/.claude/settings.json (reuse pattern from mano's claude.js), register MCP in project .mcp.json, offer historical import -- `src/cli/open.ts` — Start dashboard server, open browser via `open` package -- `src/cli/status.ts` — Show hook status, session count, DB size, MCP registration -- `src/cli/config.ts` — Read/write ~/.keddy/config.json. `keddy config set analysis.apiKey sk-ant-...` -- `src/cli/import.ts` — Scan ~/.claude/projects/ for JSONL files, parse each, store sessions. Show progress. Handle duplicates (skip by session_id) - -**Commit:** "feat: add CLI (init, open, status, config, import)" - -## Phase 8: MCP Server - -- `src/mcp/server.ts` — McpServer with StdioServerTransport, 4 tools: - 1. `keddy_search_sessions(query, project?, days?, limit?)` — FTS5 search on exchanges + plan text + session titles - 2. `keddy_get_session(sessionId)` — Full session with segments, plans, milestones, decisions, compaction events - 3. `keddy_get_plans(sessionId?)` — Plan versions with text, feedback, status. Without sessionId: recent plans - 4. 
`keddy_recent_activity(days?)` — Session list with outcomes, default 7 days - -**Reference:** `/Users/emiraksay/Documents/GitHub/KEDDY/mcp/server.mjs` for McpServer pattern, zod schemas, textResult helper - -**Commit:** "feat: add MCP server with 4 session intelligence tools" - -## Phase 9: Dashboard API - -- `src/dashboard/server.ts` — Hono app with @hono/node-server, port 3737, CORS, static file serving -- `src/dashboard/routes/sessions.ts` — GET /api/sessions (list, search, paginate), GET /api/sessions/:id (detail), GET /api/sessions/:id/exchanges, POST /api/sessions/:id/title (rename), POST /api/sessions/:id/analyze -- `src/dashboard/routes/plans.ts` — GET /api/sessions/:id/plans -- `src/dashboard/routes/stats.ts` — GET /api/stats (overview numbers) -- `src/dashboard/routes/config.ts` — GET/PUT /api/config (settings page) - -**Commit:** "feat: add Hono dashboard API" - -## Phase 10: Dashboard Frontend - -**Tech:** React 19, Vite, Tailwind v4, shadcn/ui components, React Router v7 - -- `src/dashboard/app/main.tsx` — React entry -- `src/dashboard/app/App.tsx` — Router: /, /sessions/:id, /sessions/:id/plans, /settings -- `src/dashboard/app/lib/api.ts` — Fetch wrappers -- `src/dashboard/app/lib/types.ts` — Frontend types -- `src/dashboard/app/lib/constants.ts` — Segment colors, tool icons, type labels -- `index.html` — Vite entry HTML -- `vite.config.ts` — Proxy /api to localhost:3737, build output to dist/dashboard/public -- `src/dashboard/app/pages/Sessions.tsx` — Session list with search, project filter, segment mini-bars -- `src/dashboard/app/pages/SessionDetail.tsx` — Vertical timeline with segment cards, inline plans, milestones, compaction markers. 
Click-to-expand exchanges -- `src/dashboard/app/pages/PlanViewer.tsx` — Plan versions tabbed/listed, full text, feedback, changes -- `src/dashboard/app/pages/Settings.tsx` — Config GUI: AI toggles per-feature/per-model, MCP status, data management -- `src/dashboard/app/components/Timeline.tsx` — Vertical timeline layout -- `src/dashboard/app/components/SegmentCard.tsx` — Type badge, exchange range, files, tool counts -- `src/dashboard/app/components/PlanCard.tsx` — Plan version with status, steps, feedback -- `src/dashboard/app/components/SessionCard.tsx` — List card with mini-bar, badges, metadata -- `src/dashboard/app/components/ExchangeView.tsx` — Expandable exchange with prompt, response, tool calls -- `src/dashboard/app/components/SearchBar.tsx` — Search + filters - -**Commit:** "feat: add React dashboard with sessions, timeline, plans, settings" - -## Phase 11: AI Analysis Layer (Optional) - -- `src/analysis/index.ts` — Orchestrator: check config, run enabled features, store results -- `src/analysis/providers.ts` — Provider abstraction (Anthropic via SDK, OpenAI-compatible for ollama) -- `src/analysis/titles.ts` — Generate session title from first/last exchanges -- `src/analysis/summaries.ts` — Generate segment summaries -- `src/analysis/decisions.ts` — Extract decision points with context - -**Config structure:** -```json -{ - "analysis": { - "enabled": false, - "provider": "anthropic", - "apiKey": "", - "features": { - "sessionTitles": { "enabled": true, "model": "claude-haiku-4-5-20251001" }, - "segmentSummaries": { "enabled": true, "model": "claude-haiku-4-5-20251001" }, - "decisionExtraction": { "enabled": false, "model": "claude-haiku-4-5-20251001" }, - "planDiffAnalysis": { "enabled": false, "model": "claude-sonnet-4-6" }, - "sessionNotes": { "enabled": false, "model": "claude-sonnet-4-6" } - } - } -} -``` - -**Commit:** "feat: add optional AI analysis with configurable providers and features" - -## Phase 12: Tests - -Using vitest. 
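One of the deterministic behaviors these tests pin down is the Phase 5 milestone regex on Bash inputs. A self-contained sketch of that kind of check (the helper name and regex here are illustrative, not the actual `src/capture/milestones.ts` API):

```typescript
// Illustrative only: a commit-message extractor of the sort
// tests/milestones.test.ts would exercise. The real matcher in
// src/capture/milestones.ts may differ.
function extractCommitMessage(command: string): string | null {
  const match = command.match(/git commit\s+-m\s+["']([^"']+)["']/);
  return match ? match[1] : null;
}

console.log(extractCommitMessage('git commit -m "fix: parser token counting"'));
// → fix: parser token counting
console.log(extractCommitMessage("npm test")); // → null
```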
- -**Fixtures:** -- `tests/fixtures/sample-session.jsonl` — 5 exchanges, Read/Edit/Bash tools, git commit -- `tests/fixtures/sample-with-plans.jsonl` — EnterPlanMode/ExitPlanMode, approval, rejection with feedback, 2 plan versions -- `tests/fixtures/sample-with-compaction.jsonl` — compact_boundary + isCompactSummary entries -- `tests/fixtures/sample-interrupt.jsonl` — [Request interrupted by user] + direction change - -**Test files:** -- `tests/parser.test.ts` — Exchange extraction, tool call detection, plan mode detection, compaction detection, interrupt detection, metadata extraction, skip types -- `tests/plans.test.ts` — Plan text extraction, approval/rejection, user feedback parsing, version numbering -- `tests/segments.test.ts` — Implementing sequences, exploring, debugging, pivots, minimum segment size, window merging -- `tests/milestones.test.ts` — Git commit/push/PR/branch regex, test command detection -- `tests/github.test.ts` — SSH and HTTPS remote URL parsing, URL construction -- `tests/db.test.ts` — Insert session → exchanges → query back, FTS5 search, stats -- `tests/mcp.test.ts` — Tool handler responses with mock DB data - -**vitest.config.ts** at project root. 
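A minimal `vitest.config.ts` matching this layout might look like the following (a sketch; the option values are assumptions, not settled choices):

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["tests/**/*.test.ts"], // parser, plans, segments, milestones, github, db, mcp
    environment: "node",             // better-sqlite3 requires a Node environment
  },
});
```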
- -**Commit:** "test: add comprehensive test suite with fixtures" - -## Phase 13: Documentation + Open Source Files - -- `CLAUDE.md` — Project instructions: architecture, conventions, DB schema, how hooks work, testing, what NOT to do -- `README.md` — Professional README: description, badges, screenshot placeholder, quick start (npx keddy init), feature list, architecture diagram, configuration, MCP tools docs -- `docs/DECISIONS.md` — All decisions from this planning session (why session intelligence not memory, why programmatic-first, why local-first, why Apache 2.0, pricing tiers, competitor analysis) -- `docs/PRODUCT.md` — Free/Pro/Team tier definitions with features -- `docs/ARCHITECTURE.md` — Technical design document -- `docs/COMPETITORS.md` — Mem0, Zep, Claude-Mem comparison -- `LICENSE` — Apache 2.0 full text -- `CONTRIBUTING.md` — Dev setup, PR process, coding standards -- `CODE_OF_CONDUCT.md` — Contributor Covenant v2.1 -- `SECURITY.md` — Vulnerability reporting -- `CHANGELOG.md` — v0.1.0 notes -- `.env.example` — KEDDY_DB, ANTHROPIC_API_KEY (optional) -- `.github/workflows/ci.yml` — lint + typecheck + test + build on push/PR -- `.github/workflows/release.yml` — npm publish on tag -- `.github/ISSUE_TEMPLATE/bug_report.yml` -- `.github/ISSUE_TEMPLATE/feature_request.yml` -- `.github/ISSUE_TEMPLATE/config.yml` -- `.github/PULL_REQUEST_TEMPLATE.md` -- `.github/CODEOWNERS` - -**Commit:** "docs: add documentation, LICENSE, and GitHub templates" - -## Phase 14: Plugin + Package Prep - -- `.claude-plugin/plugin.json` — `{ "name": "keddy", "version": "0.1.0", "description": "Session intelligence for Claude Code" }` -- `hooks/hooks.json` — Hook definitions in plugin format (alternative to settings.json registration) -- Final `package.json` adjustments: files field, prepublishOnly script, engines - -**Commit:** "chore: prepare npm package and Claude Code plugin" - -## Phase 15: Create GitHub Repo + Push - -- `gh repo create emireaksay-8867/keddy --public 
--description "Session intelligence for Claude Code"` -- `git remote add origin` -- `git push -u origin main` - -## Phase 16: End-to-End Testing + Iteration - -After all code is written: -1. Build: `npm run build` -2. Run tests: `npm test` — iterate until all pass -3. Test `keddy init` manually — verify hooks in ~/.claude/settings.json -4. Start a Claude Code session — verify Stop hook captures exchanges -5. Run `keddy open` — verify dashboard renders sessions -6. Test MCP tools — verify keddy_search_sessions returns results -7. Test historical import — verify existing sessions are imported -8. Fix any issues found, re-run tests -9. Iterate until everything works cleanly - ---- - -## Key Technical Decisions - -1. **3 hooks (not 4):** SessionStart (sync), Stop (async), SessionEnd (async). PostCompact may or may not work reliably — detect compaction from JSONL parsing instead (always reliable since the data is in the file). If PostCompact works, add it as a bonus. -2. **No bridge.cjs pattern:** Unlike Mano's compiled bridge, Keddy uses tsup to compile directly. No intermediate CJS bundle needed. -3. **Single database:** ~/.keddy/keddy.db stores all projects. Project isolation via `project_path` column. Better for cross-project search. -4. **Programmatic-first:** All core features work without AI. AI is enhancement layer, never required. -5. **FTS5 for search:** Full-text search on exchange prompts and plan text. No embeddings needed for MVP. -6. **Dashboard port 3737:** Avoids conflicts with common ports. -7. **Session title:** First user prompt truncated to 80 chars (programmatic). AI-generated title when analysis enabled. diff --git a/docs/facts-first-design.md b/docs/facts-first-design.md deleted file mode 100644 index bb95870..0000000 --- a/docs/facts-first-design.md +++ /dev/null @@ -1,643 +0,0 @@ -# Facts-First Data Foundation — Full Design - -## Philosophy - -Every piece of data Keddy stores falls into one of two categories: - -1. 
**Facts** — directly observable from the JSONL. No interpretation. A tool was called, a model was used, N tokens were consumed, the user interrupted. These are always correct. - -2. **Interpretations** — what the user was *trying to do*. Debugging, exploring, implementing. These require judgment and belong exclusively to the AI layer. - -The current system mixes these. The proposed system separates them completely. - ---- - -## All Definitive Signals from Claude Code - -### Per-Exchange Signals (from JSONL) - -| Signal | Source Field | What It Tells You | Currently Captured? | -|--------|-------------|-------------------|-------------------| -| **Model** | `message.model` | Which Claude model responded | NO | -| **Input tokens** | `message.usage.input_tokens` | Context size | NO | -| **Output tokens** | `message.usage.output_tokens` | Response size | NO | -| **Cache read tokens** | `message.usage.cache_read_input_tokens` | Prompt cache hits | NO | -| **Cache write tokens** | `message.usage.cache_creation_input_tokens` | New cache entries | NO | -| **Stop reason** | `message.stop_reason` | "end_turn", "tool_use", null (interrupted) | NO | -| **Permission mode** | `permissionMode` | "default", "acceptEdits", "bypassPermissions" | NO | -| **Is sidechain** | `isSidechain` | Inside a subagent? 
| NO | -| **Entrypoint** | `entrypoint` | "cli", "claude-vscode", "cursor" | NO (session only) | -| **CWD** | `cwd` | Working directory (can change mid-session) | NO (session only) | -| **Git branch** | `gitBranch` | Branch (can change mid-session) | NO (session only) | -| **Tool calls** | `content[].tool_use` | Exact tools used | YES (names + input) | -| **Tool errors** | `tool_result.is_error` | Which tools failed | YES | -| **Is interrupt** | Message text match | User pressed Escape | YES | -| **Is compact summary** | `isCompactSummary` flag | Post-compaction exchange | YES | -| **Timestamp** | `timestamp` | When it happened | YES | -| **Has images** | `content[].type === "image"` | Screenshots/images attached | Partial (count only) | -| **Has thinking** | `content[].type === "thinking"` | Extended thinking used | NO (filtered out) | - -### Per-Tool-Call Signals - -| Signal | Source | What It Tells You | Currently Captured? | -|--------|--------|-------------------|-------------------| -| **Skill invocation** | `Skill` tool, input.skill | User ran /commit, /review-pr, etc. 
| NO | -| **Subagent spawn** | `Agent` tool, input.subagent_type | "Explore", "Plan", "general-purpose" | NO (just counted) | -| **Subagent description** | `Agent` tool, input.description | What the subagent was tasked with | NO | -| **Plan mode enter** | `EnterPlanMode` tool | Entered plan mode | YES | -| **Plan mode exit** | `ExitPlanMode` tool + result | Left plan mode + approval/rejection | YES | -| **Task created** | `TaskCreate` tool | New task tracked | YES | -| **Task updated** | `TaskUpdate` tool | Task status changed | YES | -| **Web search** | `WebSearch` tool, input.query | Research query | NO (just tool name) | -| **Web fetch** | `WebFetch` tool, input.url | URL fetched for research | NO (just tool name) | -| **MCP tool** | `mcp__*` tool name | External tool/service used | Partial (counted) | -| **File read** | `Read` tool, input.file_path | Which file was read | Partial (in segments) | -| **File edit** | `Edit` tool, input.file_path | Which file was modified | Partial (in segments) | -| **File write** | `Write` tool, input.file_path | Which file was created/rewritten | Partial (in segments) | -| **File search** | `Glob`/`Grep` tool | What patterns were searched | NO (just tool name) | -| **Bash command** | `Bash` tool, input.command | Exact command run | Partial (for milestones) | -| **Bash description** | `Bash` tool, input.description | What the command does | NO | -| **Git operations** | Bash regex | commit, push, PR, branch | YES (milestones) | -| **Test runs** | Bash regex | test command + pass/fail | YES (milestones) | - -### Per-Session Signals - -| Signal | Source | Currently Captured? 
| -|--------|--------|-------------------| -| **Session ID** | `sessionId` | YES | -| **Slug** | `slug` | YES | -| **Claude version** | `version` | YES | -| **JSONL path** | Hook stdin | YES | -| **Project path** | Hook stdin / `cwd` | YES | -| **Forked from** | `forkedFrom` field | YES | -| **Custom title** | `custom-title` entry | YES | -| **Entrypoint** | First entry's `entrypoint` | Partial (not tracked if changes) | - -### Session-Level Events (from system messages) - -| Signal | Source | Currently Captured? | -|--------|--------|-------------------| -| **Compaction** | `compact_boundary` subtype | YES | -| **Pre-compaction tokens** | `compactMetadata.preTokens` | YES | -| **Exchanges before/after** | `compactMetadata` | YES | -| **Turn duration** | `turn_duration` subtype | NO | -| **Hook errors** | `hookErrors` array | NO | -| **Queue enqueue** | `queue-operation` type | NO | -| **Queue dequeue** | `queue-operation` type | NO | - ---- - -## What to Capture (New) - -### Priority 1: High-value, easy to extract - -These are single fields we're already parsing past but not storing: - -``` -Per exchange: - model → new column on exchanges - input_tokens → new column on exchanges - output_tokens → new column on exchanges - cache_read_tokens → new column on exchanges - cache_write_tokens → new column on exchanges - stop_reason → new column on exchanges - has_thinking → new column on exchanges (boolean) - permission_mode → new column on exchanges -``` - -### Priority 2: Tool-level enrichment - -Extract specific high-value fields from tool inputs: - -``` -Per tool call: - skill_name → extracted when tool_name === "Skill" (input.skill) - subagent_type → extracted when tool_name === "Agent" (input.subagent_type) - subagent_desc → extracted when tool_name === "Agent" (input.description) - web_query → extracted when tool_name === "WebSearch" (input.query) - web_url → extracted when tool_name === "WebFetch" (input.url) - file_path → extracted from Read/Edit/Write/Glob/Grep 
(input.file_path or input.path) - bash_command → extracted when tool_name === "Bash" (input.command) - bash_description → extracted when tool_name === "Bash" (input.description) -``` - -These don't need new tables — they're structured fields on tool_calls, or a new lightweight `exchange_facts` view. - -### Priority 3: Session-level events - -``` - turn_durations → extract from system messages with subtype "turn_duration" - queue_operations → count enqueues per exchange gap (user thinking time) - branch_changes → detect when gitBranch changes mid-session - cwd_changes → detect when cwd changes mid-session - entrypoint → per-exchange (detect IDE switches) -``` - ---- - -## Grouping: Boundaries Not Labels - -### Definitive Boundaries (always split) - -These are observable events that naturally divide a session: - -| Boundary | Signal | Confidence | -|----------|--------|-----------| -| **Plan mode** | EnterPlanMode / ExitPlanMode | 100% — tool call | -| **Compaction** | compact_boundary system message | 100% — system event | -| **User interrupt** | is_interrupt flag | 100% — user action | -| **Milestone** | Git commit/push/PR/branch, test pass/fail | 95% — regex on bash | -| **Skill invocation** | Skill tool call | 100% — tool call | -| **Branch change** | gitBranch field differs | 100% — observable | -| **Model switch** | model field differs | 100% — observable | - -### Soft Boundaries (configurable, suggest split) - -| Boundary | Signal | Heuristic | -|----------|--------|-----------| -| **File focus shift** | files_written set changes completely | Medium — could be same task | -| **Long pause** | >10 min gap between exchanges | Medium — user might have been reading | -| **Tool pattern shift** | Went from all-reads to all-edits | Low — still heuristic | - -### What Each Group Contains (facts only) - -```typescript -interface ActivityGroup { - // Identity - exchange_start: number; - exchange_end: number; - started_at: string; - ended_at: string; - - // What happened 
(counts) - exchange_count: number; - tool_counts: Record; // { Read: 5, Edit: 3, Bash: 2 } - error_count: number; - - // What was touched - files_read: string[]; - files_written: string[]; - - // Cost / effort - total_input_tokens: number; - total_output_tokens: number; - total_cache_read_tokens: number; - duration_ms: number; - - // Definitive markers present in this group - markers: GroupMarker[]; - // e.g. { type: "plan_mode", at: 3 } - // e.g. { type: "skill", name: "commit", at: 7 } - // e.g. { type: "milestone", kind: "test_pass", at: 8 } - // e.g. { type: "web_research", queries: ["react server components"], at: 5 } - // e.g. { type: "subagent", subagent_type: "Explore", at: 4 } - - // What split this group from the next - boundary: BoundaryType; - - // AI layer (optional, always null without AI) - ai_summary: string | null; - ai_label: string | null; -} -``` - ---- - -## User Scenarios: Dashboard - -### Scenario 1: "What did I do today?" - -**Current**: List of sessions with guessed segment chips -``` -[discuss] › [plan] › [build] › [debug] › [build] -``` -You can't tell sessions apart. Every session looks like plan→build→debug. - -**Proposed without AI**: -``` -Session #142 · keddy · main · 45m · 12 exchanges -██░░░░░░░░░░░█████████████░░████████░░██████████ -↑plan ● commit ✓ test -opus 4.6 · 180k tokens · 8 files - -Session #141 · keddy · feat/settings · 1h 20m · 28 exchanges -░░░░██░░░░░░░░░░░██████████████████░░░░████░░░░░ - ↑plan /commit ✗ test /review-pr -opus 4.6 · 340k tokens · 14 files · ⑂ PR created -``` - -Now you can immediately see: session 142 was a quick focused fix (one commit, tests passed). Session 141 was longer, tests failed, and ended with a PR. The activity strip shows the *shape* of work — heavy editing in the middle, reading at the start. - -**Proposed with AI** — same strips, plus: -``` -Fix parser token counting 2h ago -...strip... 
-✦ Fixed token extraction from JSONL, added cache field support - -Add dashboard settings page 5h ago -...strip... -✦ Built settings UI with API key config, hit test failures on validation -``` - -### Scenario 2: "How much did this session cost me?" - -**Current**: Not possible. Tokens aren't stored. - -**Proposed** — Stats tab on session detail: -``` -Token Usage - Input: 142,000 tokens - Output: 38,000 tokens - Cache: 98,000 read (69% hit rate) · 44,000 created - - Estimated cost: ~$2.40 (opus 4.6 pricing) - - Token flow: - ▁▂▃▅▇█▇▅▃▁ ← spikes at heavy exchanges - 3:15 4:00 -``` - -This is **huge** for users who care about costs. They can see which sessions burn tokens and which are cache-efficient. - -### Scenario 3: "Where did my time go?" - -**Current**: You see segment labels but no timing or effort data. - -**Proposed** — Each group shows duration + tokens: -``` -○ 2 exchanges · 3 min · 12k tokens 3:15 PM - Tools: — - -◈ 1 exchange · 2 min · 8k tokens ↑plan 3:18 PM - Tools: EnterPlanMode, ExitPlanMode - -◉ 4 exchanges · 15 min · 62k tokens 3:20 PM - Tools: Read ×3, Grep ×2, Edit ×4, Write ×1, Bash ×2 - Files: parser.ts (4 edits), types.ts (2 edits) -``` - -The 15-minute, 62k-token group is where the real work happened. You can see it immediately. No label needed. - -### Scenario 4: "I used /commit but the timeline shows 'discussion'" - -**Current**: Skill invocations aren't detected. A `/commit` with no Edit tools = "discussion". - -**Proposed**: Skills are definitive markers: -``` -◈ 1 exchange · 1 min · 5k tokens /commit 3:50 PM - Tools: Skill(commit), Bash(git) - Milestone: ● commit "fix: parser token counting" -``` - -The `/commit` skill is shown as a marker on the group. Same for `/review-pr`, `/plan`, `/simplify`, etc. These are **user-initiated workflows** — the most definitive signal of intent possible. - -### Scenario 5: "Show me all the research Claude did" - -**Current**: WebSearch/WebFetch are just tool names, no detail. 
- -**Proposed**: Web research queries are extracted: -``` -◉ 3 exchanges · 8 min · 45k tokens 2:30 PM - Tools: WebSearch ×2, WebFetch ×3, Read ×1 - 🔍 "react server components streaming" - 🔍 "next.js app router data fetching patterns" - 🌐 https://nextjs.org/docs/app/building-your-application/data-fetching - 🌐 https://react.dev/reference/rsc/server-components - 🌐 https://vercel.com/blog/understanding-react-server-components -``` - -You can see exactly what was researched. Not "querying" — the actual queries and URLs. - -### Scenario 6: "Claude spawned a bunch of subagents, what were they doing?" - -**Current**: Agent tool calls are just counted. Subagent type/description lost. - -**Proposed**: Subagent spawns are markers with detail: -``` -◉ 2 exchanges · 12 min · 95k tokens 4:10 PM - Tools: Agent ×3, Read ×2, Edit ×5 - 🔀 Explore: "Find all authentication middleware" - 🔀 Explore: "Analyze database migration patterns" - 🔀 general-purpose: "Run test suite and fix failures" - Files: auth.ts (3 edits), middleware.ts (2 edits) -``` - -### Scenario 7: "Which model was used?" - -**Current**: Only claude_version at session level. No per-exchange model. - -**Proposed**: Model shown per group, highlighted when it changes: -``` -◉ 4 exchanges · 15 min · 62k tokens opus 4.6 3:20 PM - ... - -◉ 2 exchanges · 3 min · 8k tokens haiku 4.5 3:35 PM ← model switch! - ... - -◉ 3 exchanges · 10 min · 45k tokens opus 4.6 3:38 PM - ... -``` - -Users using `/fast` mode or mixed models can see exactly where model switches happen and correlate with output quality. - -### Scenario 8: "How cache-efficient was this session?" - -**Current**: Not possible. 
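The rate shown in such a stats view is plain arithmetic over the per-exchange usage columns proposed in Priority 1. A sketch, assuming those column names and one possible definition of "hit rate" (cache reads over total input; the exact definition is itself an open choice):

```typescript
// Sketch: cache hit rate from the per-exchange usage columns proposed in
// Priority 1 (input_tokens, cache_read_tokens, cache_write_tokens).
// Field names and the rate definition are assumptions.
interface UsageRow {
  input_tokens: number;       // cold (uncached) input
  cache_read_tokens: number;  // read from the prompt cache
  cache_write_tokens: number; // newly written to the cache
}

function cacheHitRate(rows: UsageRow[]): number {
  const read = rows.reduce((sum, r) => sum + r.cache_read_tokens, 0);
  const total = rows.reduce(
    (sum, r) => sum + r.cache_read_tokens + r.cache_write_tokens + r.input_tokens,
    0,
  );
  return total === 0 ? 0 : read / total;
}

// Using the token figures from the stats mock-up in this scenario:
const rate = cacheHitRate([
  { input_tokens: 42_000, cache_read_tokens: 312_000, cache_write_tokens: 68_000 },
]);
console.log(`${Math.round(rate * 100)}% of input served from cache`);
```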
- -**Proposed** — Stats tab: -``` -Cache Efficiency - ████████████████░░░░ 82% cache hit rate - - Read from cache: 312,000 tokens (saved ~$3.12) - Created cache: 68,000 tokens - Cold input: 42,000 tokens - - Cache efficiency over time: - ░▓▓█████████████████ ← cold start, then high cache reuse - exchange 1 exchange 12 -``` - -### Scenario 9: "The session got compacted — did I lose context?" - -**Current**: Shows compaction marker with token count. - -**Proposed**: Compaction is a definitive boundary with before/after: -``` -━━━ Compaction ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - Tokens: 180,000 → 45,000 (75% reduction) - Exchanges: 12 → 4 (8 compressed) - - After compaction: - ◉ 3 exchanges · 8 min · 32k tokens 4:15 PM - Tools: Read ×4, Edit ×2, Bash ×1 (1 error) - ↑ error rate increased after compaction -``` - -The "error rate increased after compaction" is an AI insight. Without AI, you just see the factual error count — but the data is there for the user to notice it themselves. - -### Scenario 10: "I want to see all the files I touched" - -**Current**: Files shown per segment, but segment grouping is wrong so file associations are wrong. - -**Proposed** — Files tab or section: -``` -Files Touched (8 files) - - parser.ts ████████████ 7 edits, 5 reads - types.ts ████░░░░░░░░ 2 edits, 1 read - schema.ts ████░░░░░░░░ 2 edits, 1 read - queries.ts ██░░░░░░░░░░ 1 edit, 1 read - handler.ts ░░░░░░░░░░░░ 0 edits, 3 reads (read-only) - server.ts ░░░░░░░░░░░░ 0 edits, 2 reads (read-only) - package.json ░░░░░░░░░░░░ 0 edits, 1 read (read-only) - tsconfig.json ░░░░░░░░░░░░ 0 edits, 1 read (read-only) -``` - -This is purely factual — extracted from tool inputs. No guessing needed. - -### Scenario 11: "What skills did I use in this project?" - -**Current**: Not tracked at all. 
- -**Proposed** — project-level aggregation: -``` -Skills Used (last 30 days) - /commit ████████████ 23 times across 15 sessions - /review-pr ████░░░░░░░░ 8 times across 6 sessions - /plan ███░░░░░░░░░ 5 times across 5 sessions - /simplify █░░░░░░░░░░░ 2 times across 2 sessions -``` - ---- - -## User Scenarios: MCP Layer - -The MCP layer serves **Claude itself** — giving it context about past work. Facts-first makes this dramatically better. - -### Scenario M1: "Continue where I left off" - -**Current `keddy_project_status`**: Returns active plan text + pending tasks + last milestone. - -**Proposed**: Same tool, richer response: -```json -{ - "last_session": { - "id": "abc123", - "ended_at": "2025-03-24T14:30:00Z", - "duration_min": 45, - "exchange_count": 12, - "model": "claude-opus-4-6", - "last_group": { - "tools": { "Edit": 3, "Bash": 1 }, - "files_written": ["schema.ts", "queries.ts"], - "last_milestone": { "type": "test_pass", "at": "14:28" }, - "stop_reason": "end_turn" - } - }, - "active_plan": { ... }, - "pending_tasks": [ ... ], - "recent_milestones": [ ... ] -} -``` - -Claude can now say: "Last session ended after passing tests on schema.ts and queries.ts. The plan was approved and implemented. Picking up from there..." - -### Scenario M2: "What was tried before that didn't work?" - -**Current**: Search transcripts for keywords. Tool errors are stored but not queryable. - -**Proposed new tool — `keddy_file_history`**: -``` -Input: { file: "src/parser.ts", days: 7 } - -Output: - Session #142 (2h ago): 7 edits, 5 reads. Last milestone: ✓ tests passed. - Session #138 (yesterday): 4 edits, 2 reads, 2 bash errors. No test run. - Session #135 (3 days ago): 1 edit, 8 reads. Exploring only. 
- - Tool errors on this file: - Session #138, exchange 8: Bash error — "TypeError: Cannot read property 'tokens' of undefined" - Session #138, exchange 9: Bash error — "Test failed: expected 42, got undefined" -``` - -Claude now knows: "parser.ts had errors in session #138 that weren't resolved with tests. Session #142 fixed it. Let me check what changed between them." - -### Scenario M3: "How should I approach this file?" - -**Proposed new tool — `keddy_tool_patterns`**: -``` -Input: { file: "src/db/schema.ts" } - -Output: - Typical workflow for this file: - Read (5 sessions) → Edit (4 sessions) → Bash: npm test (3 sessions) - - Co-edited files: queries.ts (4/5 sessions), types.ts (3/5 sessions) - - Common tools: Edit (18 calls), Read (12 calls), Bash (8 calls) - Test runs after edits: 3/4 sessions (75%) - Test pass rate: 2/3 (67%) -``` - -Claude learns the patterns: "When I edit schema.ts, I usually need to update queries.ts and types.ts too, and I should run tests after." - -### Scenario M4: "Am I on the right track?" - -**Proposed enhancement to `keddy_project_status`**: -```json -{ - "session_so_far": { - "exchange_count": 8, - "total_tokens": 120000, - "error_count": 3, - "compaction_count": 0, - "files_written": ["parser.ts"], - "files_read": ["parser.ts", "types.ts", "handler.ts", "schema.ts"], - "interrupts": 1, - "skills_used": [], - "plan_status": "approved", - "tasks_completed": 2, - "tasks_pending": 3 - } -} -``` - -Claude can self-assess: "I've hit 3 errors and the user interrupted once. I should re-read the plan and check if I'm still aligned." - -### Scenario M5: "What research was done?" 
- -**Proposed new tool — `keddy_search_research`**: -``` -Input: { query: "server components", days: 30 } - -Output: - Session #130 (5 days ago): - 🔍 "react server components streaming" - 🔍 "next.js app router data fetching patterns" - 🌐 https://nextjs.org/docs/app/building-your-application/data-fetching - 🌐 https://react.dev/reference/rsc/server-components - Context: Working on API route migration - - Session #125 (12 days ago): - 🔍 "react server components vs client components performance" - 🌐 https://vercel.com/blog/understanding-react-server-components - Context: Initial architecture planning -``` - -Claude doesn't re-research things it already found. It can reference past URLs and findings. - -### Scenario M6: "What decisions led to this architecture?" - -**Proposed new tool — `keddy_decision_trail`**: -``` -Input: { file: "src/db/schema.ts", include_plans: true } - -Output: - Plan v1 (Session #120): "Use SQLite with WAL mode, single file at ~/.keddy/keddy.db" - Status: approved → implemented - - Plan v3 (Session #128): "Add FTS5 for full-text search on exchanges" - Status: approved → implemented - User feedback: "don't add trigram, FTS5 is enough" - - Plan v5 (Session #135): "Add exchange_facts columns for token tracking" - Status: drafted (current) - - Related decisions (AI-extracted): - "Chose better-sqlite3 over sql.js for performance" - "WAL mode for concurrent read/write from hooks" -``` - -### Scenario M7: SessionStart hook context injection - -**Current**: Returns text blob with plan excerpt + task list. - -**Proposed**: Returns structured facts: -``` -You're continuing work on keddy (main branch). - -Last session (#142, 2h ago): - - 12 exchanges, 45 min, opus 4.6 - - Ended after: ✓ tests passed, ● committed "fix: parser token counting" - - Files modified: parser.ts, types.ts, schema.ts, queries.ts - - Stop reason: end_turn (completed normally) - -Active plan (v3, approved): - "Add exchange_facts columns for token/model/duration tracking..." 
-
-Pending tasks:
-  ○ Add model column to exchanges table
-  ○ Extract token counts from JSONL usage field
-  ◐ Update parser to capture stop_reason (in progress)
-
-Recent errors (last 3 sessions):
-  None — clean run streak
-
-Files last modified: schema.ts (2h ago), queries.ts (2h ago)
-```
-
-Claude starts with complete context. No guessing, no stale assumptions.
-
----
-
-## What Gets Removed
-
-| Current Feature | What Happens | Why |
-|----------------|-------------|-----|
-| `classifyExchange()` | **Deleted** | Heuristic guessing — the core problem |
-| `SegmentType` enum (10 types) | **Deleted** | "discussion", "implementing", etc. are interpretations |
-| Segment merging logic | **Replaced** | Boundary-based splitting instead |
-| Singleton segment absorption | **Deleted** | Artifacts of bad classification |
-| `segment_type` column | **Kept but nullable** | Only populated by AI layer |
-| Tool proportion thresholds | **Deleted** | "≥40% reads = exploring" is a guess |
-
-## What Gets Added
-
-| New Feature | Type | Purpose |
-|------------|------|---------|
-| `model` on exchanges | Column | Per-exchange model tracking |
-| `input_tokens`, `output_tokens` | Columns | Token usage |
-| `cache_read_tokens`, `cache_write_tokens` | Columns | Cache efficiency |
-| `stop_reason` | Column | How the turn ended |
-| `has_thinking` | Column | Extended thinking used |
-| `permission_mode` | Column | User's permission stance |
-| `skill_name` on tool_calls | Column | Extracted skill name |
-| `subagent_type` on tool_calls | Column | Extracted subagent type |
-| `file_path` on tool_calls | Column | Extracted file path |
-| `bash_command` on tool_calls | Column | Extracted bash command |
-| Activity groups (boundary-based) | Logic | Replace heuristic segments |
-| `boundary_type` on segments | Column | What caused the split |
-| `ai_label` on segments | Column | AI-only classification |
-| Token aggregation queries | Queries | Cost analysis |
-| File operation 
queries | Queries | File-centric views | -| Skill/subagent queries | Queries | Workflow tracking | - -## What Stays the Same - -| Feature | Why It Stays | -|---------|-------------| -| **Plans** | Based on definitive EnterPlanMode/ExitPlanMode signals | -| **Plan status** | Based on exact string matches + implementation tracking | -| **Milestones** | Based on git/test command regex (high confidence) | -| **Tasks** | Based on TaskCreate/TaskUpdate tool calls | -| **Compaction events** | Based on compact_boundary system messages | -| **FTS search** | Searches actual content | -| **Session links** | Based on forkedFrom field | -| **Exchange content** | Direct from JSONL | - ---- - -## Migration Path - -This is an additive change. Nothing breaks: - -1. **Add new columns** to exchanges and tool_calls (nullable, backfill later) -2. **Update parser** to extract new fields from JSONL -3. **Add boundary-based grouping** alongside existing segments -4. **Update dashboard** to use new data (activity strips, tool breakdowns, stats) -5. **Update MCP tools** to return richer data -6. **Deprecate** classifyExchange but keep segment_type column for AI -7. **Backfill** existing sessions by re-parsing their JSONL files (import command already exists) - -No data loss. No schema breaks. Existing sessions get richer when re-parsed. diff --git a/docs/ui-scenarios.md b/docs/ui-scenarios.md deleted file mode 100644 index 4ecacc0..0000000 --- a/docs/ui-scenarios.md +++ /dev/null @@ -1,524 +0,0 @@ -# UI Scenarios: Facts-First Data Foundation - -## The Core Shift - -**Current**: Timeline organized by guessed labels → "Discussion → Implementing → Debugging" -**Proposed**: Timeline organized by what actually happened → tools, files, tokens, timing — with AI labels as an optional overlay - ---- - -## 1. 
Sessions List - -### Current -``` -┌─────────────────────────────────────────────────────────────────┐ -│ keddy-project · 34 sessions Last: 2h ago│ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Filter sessions... │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ Today │ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Fix parser token counting 2h ago │ │ -│ │ keddy · main · 45m · 12 exchanges │ │ -│ │ [plan] › [build] › [test] › [debug] › [build] │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Add dashboard settings page 5h ago │ │ -│ │ keddy · feat/settings · 1h 20m · 28 exchanges │ │ -│ │ [discuss] › [plan] › [build] › [build] › [test] +2 │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ These segment labels are GUESSES. "discuss" just means │ -│ no tools were used. "build" just means an Edit tool was used. │ -│ A skill invocation with no tool calls = "discuss". Wrong. │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Proposed — Without AI -``` -┌─────────────────────────────────────────────────────────────────┐ -│ keddy-project · 34 sessions Last: 2h ago│ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Filter sessions... │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ Today │ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Session #142 2h ago │ │ -│ │ keddy · main · 45m · 12 exchanges │ │ -│ │ │ │ -│ │ ██░░░░░░░░░░░░░█████████████████░░████████░░██████████ │ │ -│ │ ↑plan read edit edit edit edit bash edit edit │ │ -│ │ ✓test │ │ -│ │ │ │ -│ │ opus 4.6 · 180k tokens · 8 files · ● commit ✓ tests │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ The bar is a FACTUAL activity strip. 
Each segment is sized │ -│ proportional to exchange count. Colors represent tool types, │ -│ not guessed intent. Milestones sit on the bar where they │ -│ happened. No interpretation — just what occurred. │ -│ │ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Session #141 5h ago │ │ -│ │ keddy · feat/settings · 1h 20m · 28 exchanges │ │ -│ │ │ │ -│ │ ░░░░██░░░░░░░░░░░░░░░░░██████████████████████░░░░████░░ │ │ -│ │ chat plan read read read edit edit edit edit bash bash │ │ -│ │ ✗test │ │ -│ │ │ │ -│ │ opus 4.6 · 340k tokens · 14 files · ⑂ branch ✗ tests │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ Title is generic (session number) without AI. But you see │ -│ everything that matters: what tools ran, how many tokens, │ │ -│ what milestones happened, how many files were touched. │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Proposed — With AI -``` -┌─────────────────────────────────────────────────────────────────┐ -│ Today │ -│ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Fix parser token counting 2h ago │ │ -│ │ keddy · main · 45m · 12 exchanges │ │ -│ │ │ │ -│ │ ██░░░░░░░░░░░░░█████████████████░░████████░░██████████ │ │ -│ │ ↑plan read edit edit edit edit bash edit edit │ │ -│ │ ✓test │ │ -│ │ │ │ -│ │ opus 4.6 · 180k tokens · 8 files · ● commit ✓ tests │ │ -│ │ │ │ -│ │ AI: planned → implemented fix → tested → fixed edge case │ │ -│ └─────────────────────────────────────────────────────────────┘ │ -│ │ -│ Same factual strip. AI adds the title and a one-line narrative │ -│ below. The narrative is clearly marked as AI-generated. │ -│ Remove it and the session card still makes complete sense. │ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 2. 
Session Detail — Timeline View - -### Current -``` -┌─────────────────────────────────────────────────────────────────┐ -│ ← Fix parser token counting │ -│ keddy · main · Started 3:15 PM · 45 min · 12 exchanges │ -│ Milestones: 2 · Plans: 1 │ -│ │ -│ [Timeline] [Full Transcript] [↑ Newest] │ -│ │ -│ ── Plans ────────────────────────────────────────────────── │ -│ │ v1 [approved] Fix token counting in parser by... │ -│ │ -│ ── Timeline ─────────────────────────────────────────────── │ -│ │ -│ ● [Discussion] 2 exchanges · 3 min 3:15 PM │ -│ │ You: I'm seeing wrong token counts in the dashboard... │ -│ │ Claude: Let me look at the parser to understand... │ -│ │ Files: — · Tools: 0 │ -│ │ │ -│ ● [Planning] 1 exchange · 2 min 3:18 PM │ -│ │ You: Let's plan this out │ -│ │ Claude: I'll fix the token extraction in parser.ts... │ -│ │ Files: — · Tools: 2 │ -│ │ │ -│ ● [Implementing] 4 exchanges · 15 min 3:20 PM │ -│ │ AI: Modified parser to extract usage fields from... │ -│ │ You: Also handle the cache tokens │ -│ │ Claude: I'll add cache_read and cache_creation... │ -│ │ +2 more exchanges │ -│ │ Files: parser.ts, types.ts · Tools: 12 │ -│ │ │ -│ ◆ ● commit: "fix: token counting in parser" │ -│ ◆ ✓ tests passed (14 passed) │ -│ │ │ -│ ● [Debugging] 3 exchanges · 12 min 3:35 PM │ -│ │ You: The cache tokens are still wrong │ -│ │ Claude: I see the issue — the field name is... │ -│ │ +1 more exchanges │ -│ │ Files: parser.ts · Tools: 8 │ -│ │ │ -│ ● [Implementing] 2 exchanges · 8 min 3:47 PM │ -│ │ You: ok now also update the schema │ -│ │ Claude: I'll add the columns to the exchanges table... │ -│ │ Files: schema.ts, queries.ts · Tools: 6 │ -│ │ -│ Problem: "Discussion" vs "Implementing" vs "Debugging" are │ -│ guesses. The "Debugging" segment is just "edits + errors". │ -│ Could have been iterating on a new feature. 
│ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Proposed — Without AI - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ ← Session #142 │ -│ keddy · main · Started 3:15 PM · 45 min · 12 exchanges │ -│ opus 4.6 · 180k tokens · 8 files │ -│ │ -│ [Timeline] [Full Transcript] [Stats] [↑ Newest] │ -│ │ -│ ── Session Bar ──────────────────────────────────────────── │ -│ │ -│ ██░░░░░░░░░░░░░░░█████████████████░░████████░░██████████ │ -│ 3:15 3:18 3:20 3:35 3:47 4:00 │ -│ ↑plan ● commit ✓ test │ -│ │ -│ ── Plans ────────────────────────────────────────────────── │ -│ │ v1 [approved] Fix token counting in parser by... │ -│ │ -│ ── Activity ─────────────────────────────────────────────── │ -│ │ -│ ○ 2 exchanges · 3 min · 12k tokens 3:15 PM │ -│ │ You: I'm seeing wrong token counts in the dashboard... │ -│ │ Claude: Let me look at the parser to understand... │ -│ │ Tools: — │ -│ │ Files: — │ -│ │ │ -│ ◈ 1 exchange · 2 min · 8k tokens ↑plan 3:18 PM │ -│ │ You: Let's plan this out │ -│ │ Claude: I'll fix the token extraction in parser.ts... │ -│ │ Tools: EnterPlanMode, ExitPlanMode │ -│ │ Mode: plan │ -│ │ │ -│ ◉ 4 exchanges · 15 min · 62k tokens 3:20 PM │ -│ │ You: Also handle the cache tokens │ -│ │ Claude: I'll add cache_read and cache_creation... │ -│ │ +2 more exchanges │ -│ │ Tools: Read ×3, Grep ×2, Edit ×4, Write ×1, Bash ×2 │ -│ │ Files: parser.ts (4 edits), types.ts (2 edits) │ -│ │ │ -│ ── ● commit: "fix: token counting in parser" ───────────── │ -│ ── ✓ tests: 14 passed ──────────────────────────────────── │ -│ │ │ -│ ◉ 3 exchanges · 12 min · 45k tokens 3:35 PM │ -│ │ You: The cache tokens are still wrong │ -│ │ Claude: I see the issue — the field name is... 
│ -│ │ +1 more exchanges │ -│ │ Tools: Read ×2, Grep ×1, Edit ×3, Bash ×2 (1 error) │ -│ │ Files: parser.ts (3 edits) │ -│ │ │ -│ ◉ 2 exchanges · 8 min · 38k tokens 3:47 PM │ -│ │ You: ok now also update the schema │ -│ │ Claude: I'll add the columns to the exchanges table... │ -│ │ Tools: Read ×2, Edit ×3, Bash ×1 │ -│ │ Files: schema.ts (2 edits), queries.ts (1 edit) │ -│ │ -│ KEY DIFFERENCES: │ -│ • No guessed labels like "Discussion" or "Debugging" │ -│ • You see actual tool breakdown per group │ -│ • Token counts show effort per group │ -│ • File edits show exactly what was touched and how many times │ -│ • Bash errors shown as fact ("1 error"), not as "debugging" │ -│ • Groups are split by natural boundaries: │ -│ - plan mode entry/exit │ -│ - milestones (commits, test runs) │ -│ - file-focus shifts (working on different files) │ -│ - interrupts │ -│ - compaction events │ -│ • The dot style hints at density: │ -│ ○ = no tools (conversation only) │ -│ ◈ = special mode (plan, etc.) │ -│ ◉ = tools used (sized by count? or just filled) │ -└─────────────────────────────────────────────────────────────────┘ -``` - -### Proposed — With AI - -``` -┌─────────────────────────────────────────────────────────────────┐ -│ ← Fix parser token counting ✦ AI Analyzed │ -│ keddy · main · Started 3:15 PM · 45 min · 12 exchanges │ -│ opus 4.6 · 180k tokens · 8 files │ -│ │ -│ [Timeline] [Full Transcript] [Stats] [↑ Newest] │ -│ │ -│ ── Session Bar ──────────────────────────────────────────── │ -│ │ -│ ██░░░░░░░░░░░░░░░█████████████████░░████████░░██████████ │ -│ 3:15 3:18 3:20 3:35 3:47 4:00 │ -│ ↑plan ● commit ✓ test │ -│ │ -│ ── Plans ────────────────────────────────────────────────── │ -│ │ v1 [approved] Fix token counting in parser by... 
│ -│ │ -│ ── Activity ─────────────────────────────────────────────── │ -│ │ -│ ○ 2 exchanges · 3 min · 12k tokens 3:15 PM │ -│ │ ┌ ✦ Problem scoping — identified token count mismatch ┐ │ -│ │ └─────────────────────────────────────────────────────-┘ │ -│ │ You: I'm seeing wrong token counts in the dashboard... │ -│ │ Claude: Let me look at the parser to understand... │ -│ │ Tools: — │ -│ │ Files: — │ -│ │ │ -│ ◈ 1 exchange · 2 min · 8k tokens ↑plan 3:18 PM │ -│ │ ┌ ✦ Planned approach: extract usage from JSONL fields ┐ │ -│ │ └────────────────────────────────────────────────────--┘ │ -│ │ You: Let's plan this out │ -│ │ Claude: I'll fix the token extraction in parser.ts... │ -│ │ Tools: EnterPlanMode, ExitPlanMode │ -│ │ Mode: plan │ -│ │ │ -│ ◉ 4 exchanges · 15 min · 62k tokens 3:20 PM │ -│ │ ┌ ✦ Implemented token extraction + cache fields ──────┐ │ -│ │ └────────────────────────────────────────────────────--┘ │ -│ │ You: Also handle the cache tokens │ -│ │ Claude: I'll add cache_read and cache_creation... │ -│ │ +2 more exchanges │ -│ │ Tools: Read ×3, Grep ×2, Edit ×4, Write ×1, Bash ×2 │ -│ │ Files: parser.ts (4 edits), types.ts (2 edits) │ -│ │ │ -│ ── ● commit: "fix: token counting in parser" ───────────── │ -│ ── ✓ tests: 14 passed ──────────────────────────────────── │ -│ │ │ -│ ◉ 3 exchanges · 12 min · 45k tokens 3:35 PM │ -│ │ ┌ ✦ Fixed cache field name mismatch ─────────────────┐ │ -│ │ └────────────────────────────────────────────────────-┘ │ -│ │ You: The cache tokens are still wrong │ -│ │ Claude: I see the issue — the field name is... │ -│ │ +1 more exchanges │ -│ │ Tools: Read ×2, Grep ×1, Edit ×3, Bash ×2 (1 error) │ -│ │ Files: parser.ts (3 edits) │ -│ │ │ -│ ◉ 2 exchanges · 8 min · 38k tokens 3:47 PM │ -│ │ ┌ ✦ Extended schema to store token data ─────────────┐ │ -│ │ └────────────────────────────────────────────────────-┘ │ -│ │ You: ok now also update the schema │ -│ │ Claude: I'll add the columns to the exchanges table... 
│ -│ │ Tools: Read ×2, Edit ×3, Bash ×1 │ -│ │ Files: schema.ts (2 edits), queries.ts (1 edit) │ -│ │ -│ KEY DIFFERENCES FROM WITHOUT-AI: │ -│ • Session title is AI-generated (not "Session #142") │ -│ • Each group has an AI summary card (✦ marked) │ -│ • The AI summary IS the label — not a fixed category like │ -│ "Debugging" but a specific description of what happened │ -│ • All factual data remains identical underneath │ -│ • Remove AI and nothing breaks — you just lose the narrative │ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 3. Stats Tab (NEW — only possible with facts-first) - -### Without AI -``` -┌─────────────────────────────────────────────────────────────────┐ -│ [Timeline] [Full Transcript] [Stats] │ -│ │ -│ ── Token Usage ──────────────────────────────────────────── │ -│ │ -│ Input: 142,000 tokens │ -│ Output: 38,000 tokens │ -│ Cache read: 98,000 tokens (69% cache hit rate) │ -│ Cache created: 44,000 tokens │ -│ Total: 180,000 tokens │ -│ │ -│ Token flow over time: │ -│ ▁▂▃▅▇█▇▅▃▁ ←— spikes at heavy edit exchanges │ -│ 3:15 3:35 4:00 │ -│ │ -│ ── Tool Usage ───────────────────────────────────────────── │ -│ │ -│ Edit ████████████░░░ 10 │ -│ Read ███████░░░░░░░░ 7 │ -│ Bash █████░░░░░░░░░░ 5 │ -│ Grep ███░░░░░░░░░░░░ 3 │ -│ Write █░░░░░░░░░░░░░░ 1 │ -│ │ -│ Tool errors: 1 (Bash) │ -│ │ -│ ── Files ────────────────────────────────────────────────── │ -│ │ -│ parser.ts 7 edits, 5 reads │ -│ types.ts 2 edits, 1 read │ -│ schema.ts 2 edits, 1 read │ -│ queries.ts 1 edit, 1 read │ -│ │ -│ ── Model ────────────────────────────────────────────────── │ -│ │ -│ claude-opus-4-6 ████████████████ 12/12 exchanges │ -│ │ -│ ── Timing ───────────────────────────────────────────────── │ -│ │ -│ Total duration: 45 min │ -│ Avg turn time: ~3.8 min │ -│ Longest turn: 6.2 min (exchange #7) │ -│ │ -│ All of this is FACTUAL. No interpretation needed. 
│ -│ Not possible with current system because tokens, model, │ -│ and timing data are thrown away. │ -└─────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 4. How Groups Are Split (Without AI) - -Current system: split by guessed "type" changes. -Proposed: split by **observable boundaries**. - -``` -DEFINITIVE BOUNDARIES (always split): -────────────────────────────────────── - ↑plan Plan mode entered ← EnterPlanMode tool - ↓plan Plan mode exited ← ExitPlanMode tool - ⚡ User interrupt ← is_interrupt flag - ━━━━━ Compaction ← compact_boundary entry - ● ✓ ✗ Milestone (commit/test/etc) ← regex on Bash commands - -SOFT BOUNDARIES (suggest split, can be tuned): -────────────────────────────────────── - 📁 File focus shift ← files_written changes completely - 🔄 Model change ← model field differs - ⏸ Long gap ← >10 min between exchanges - 🔀 Tool pattern shift ← went from all-reads to all-edits - -Example of how a session gets split: - - Exchange 1: chat (no tools) ─┐ - Exchange 2: chat (no tools) ─┘ Group A: 2 exchanges - (no tools, conversation) - ─── ↑plan ─── boundary ─── - Exchange 3: EnterPlanMode ─┐ - Exchange 4: ExitPlanMode ─┘ Group B: 2 exchanges - (plan mode active) - ─── ↓plan ─── boundary ─── - Exchange 5: Read, Grep ─┐ - Exchange 6: Read, Read, Edit ─┤ - Exchange 7: Edit, Edit, Bash ─┤ Group C: 4 exchanges - Exchange 8: Edit, Bash ─┘ (mixed tools, same files) - - ─── ● commit ─── boundary ─── - ─── ✓ tests ─── boundary ─── - Exchange 9: Read, Grep ─┐ - Exchange 10: Edit, Edit, Bash(err) ─┤ Group D: 3 exchanges - Exchange 11: Edit, Bash ─┘ (same file focus) - - ─── file focus shift ─── boundary ─── - Exchange 12: Read, Edit, Edit, Bash ─┐ Group E: 1 exchange - ─┘ (new files: schema, queries) - - No labels. Just groups with their factual tool/file breakdown. - AI can label them if enabled. Without AI, the data speaks. -``` - ---- - -## 5. Session List — Activity Strip Detail - -The activity strip replaces the segment flow chips. 
Here's how it encodes information:
-
-```
-CURRENT SEGMENT FLOW:
-  [discuss] › [plan] › [build] › [debug] › [build]
-  ↑ guessed labels, no size indication, no detail
-
-PROPOSED ACTIVITY STRIP:
-
-  ░░██░░░░░░░░░░░████████████░░░████░░██████
-  │  │  │             │          │
-  │  │  │             │          └─ edits (files: schema, queries)
-  │  │  │             └─ reads + edits + bash errors (file: parser)
-  │  │  └─ reads + edits + bash (files: parser, types)
-  │  └─ plan mode
-  └─ no tools (conversation)
-
-  Color legend (tool-type based, not intent-based; the real UI
-  uses distinct colors, which these ASCII shades collapse):
-  ░ = no tools / conversation, or reads only (Grep, Read, Glob)
-  █ = plan mode (always distinct), or edits (Edit, Write)
-  ▓ = bash / commands
-  ▒ = mixed
-
-  Markers on the strip:
-  ↑plan = plan entered
-  ● = commit
-  ✓ = test pass
-  ✗ = test fail
-  ↑ = push
-  ⑂ = PR created
-
-  Width of each section = proportional to exchange count
-```
-
----
-
-## 6. Full Transcript View
-
-### Current
-```
- ── Implementing (exchanges 5-8) ─────────── purple line
-
- You: Also handle the cache tokens
- Claude: I'll add cache_read and cache_creation...
- Edit ×4 · Read ×3 · +2 tools
-```
-
-### Proposed — Without AI
-```
- ── Group C (exchanges 5-8) · 15 min · 62k tokens ── ░░███
-
- You: Also handle the cache tokens
- Claude: I'll add cache_read and cache_creation...
- Read ×3 · Grep ×2 · Edit ×4 · Write ×1 · Bash ×2
- Files: parser.ts, types.ts
-```
-
-### Proposed — With AI
-```
- ── Group C (exchanges 5-8) · 15 min · 62k tokens ── ░░███
- ✦ Implemented token extraction + cache fields
-
- You: Also handle the cache tokens
- Claude: I'll add cache_read and cache_creation...
- Read ×3 · Grep ×2 · Edit ×4 · Write ×1 · Bash ×2
- Files: parser.ts, types.ts
-```
-
----
-
-## 7. 
What Stays the Same - -These parts of the UI don't change because they're already based on solid signals: - -| Feature | Why it stays | -|---------|-------------| -| **Plans section** | Based on definitive EnterPlanMode/ExitPlanMode tools | -| **Plan status badges** | Based on exact string matches in tool results | -| **Milestones** | Based on git command regex (high confidence) | -| **Tasks section** | Based on TaskCreate/TaskUpdate tool calls | -| **Compaction events** | Based on compact_boundary JSONL entries | -| **Full transcript** | Direct exchange content, no interpretation | -| **Search (FTS)** | Searches actual content, no classification needed | -| **Sidebar/navigation** | Project/session structure, no classification | - ---- - -## 8. Summary: What Changes - -| Component | Current | Without AI | With AI | -|-----------|---------|-----------|---------| -| **Session title** | AI or generic | Session #{id} | AI-generated title | -| **Segment flow** | Guessed label chips | Activity strip (tool-colored, proportional) | Same strip + AI narrative line | -| **Session meta** | Duration, exchanges | + tokens, model, files count | Same | -| **Timeline groups** | Labeled segments (Discussion, Implementing...) | Unlabeled groups split by boundaries | Groups + AI summary card each | -| **Group detail** | Label + file count + tool count | Tool breakdown + file edits + tokens + timing | Same + AI label | -| **Stats tab** | Doesn't exist | Token usage, tool usage, file heatmap, timing | Same | -| **Segment types** | 10 heuristic categories | None (boundaries only) | AI-generated (free-form, specific) | - -**The fundamental change**: Instead of seeing "Debugging (3 exchanges)" you see -"3 exchanges · 12 min · 45k tokens · Read ×2, Edit ×3, Bash ×2 (1 error) · parser.ts" -— and if AI is on, it adds "✦ Fixed cache field name mismatch" on top. - -The facts are always there. The story is optional.
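
The boundary rules in section 4 are mechanical enough to sketch in code. Below is a minimal TypeScript sketch of the splitting pass, using illustrative `Exchange`/`Group` shapes rather than Keddy's actual schema (field names like `filesWritten` and `endsWithMilestone` are hypothetical, and the file-focus rule is just one possible soft boundary):

```typescript
// Illustrative shapes only; not Keddy's real schema.
type Exchange = {
  index: number;
  tools: string[];            // tool names used in the exchange
  filesWritten: string[];     // targets of Edit/Write calls
  isInterrupt: boolean;
  endsWithMilestone: boolean; // commit/test regex matched a Bash call
};

type Group = {
  exchanges: Exchange[];
  boundary: string | null;    // what opened this group (null = session start)
};

function splitIntoGroups(session: Exchange[]): Group[] {
  const groups: Group[] = [];
  let current: Exchange[] = [];
  let openedBy: string | null = null;

  // Close the current group (if any) and record what opens the next one.
  const split = (boundary: string | null): void => {
    if (current.length > 0) {
      groups.push({ exchanges: current, boundary: openedBy });
      current = [];
    }
    openedBy = boundary;
  };

  for (const ex of session) {
    // Definitive boundaries that split BEFORE the exchange.
    if (ex.tools.includes("EnterPlanMode")) split("plan_enter");
    if (ex.isInterrupt) split("interrupt");

    // Soft boundary: file focus shifted completely (tunable rule).
    const prev = current[current.length - 1];
    if (
      prev &&
      prev.filesWritten.length > 0 &&
      ex.filesWritten.length > 0 &&
      !ex.filesWritten.some((f) => prev.filesWritten.includes(f))
    ) {
      split("file_focus_shift");
    }

    current.push(ex);

    // Definitive boundaries that split AFTER the exchange that caused them.
    if (ex.tools.includes("ExitPlanMode")) split("plan_exit");
    if (ex.endsWithMilestone) split("milestone");
  }
  split(null);
  return groups;
}
```

Fed the 12-exchange session from section 4, this yields the same five groups A–E (2, 2, 4, 3, and 1 exchanges), each carrying the boundary that opened it instead of a guessed label — which is the whole point: the split is reproducible from facts alone, and an AI label can be layered on per group later.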