diff --git a/Plans/ai-genius-system-design.md b/Plans/ai-genius-system-design.md new file mode 100644 index 000000000..c84c5b2d4 --- /dev/null +++ b/Plans/ai-genius-system-design.md @@ -0,0 +1,540 @@ +# AI Genius: Transformation Plan + +## From Current LifeOS Setup → PAI-Modular Smart Architecture + +--- + +## 1. VISION (Revised) + +**Goal**: Maximize your brainpower for synthesis, connection-making, and strategy by offloading all operational work to AI. Not a corporate hierarchy — a smart, modular system that loads the right context for every task, learns from every interaction, and gets better over time. + +**What changes**: The current monolithic CLAUDE.md + time-based automation evolves into a modular skill/context/memory architecture following PAI patterns. Everything that works today keeps working. Nothing breaks during transformation. + +**What stays the same**: Obsidian vault as source of truth, Claude Code as compute, eventkit-cli and mail-cli as service connectors, Discord/Telegram as remote interfaces, launchd scripts for scheduling, Git for safety. + +--- + +## 2. 
CURRENT STATE → TARGET STATE MAPPING

### 2.1 What Exists and What It Becomes

| Current | Target | Migration Risk |
|---|---|---|
| **CLAUDE.md** (58KB monolith) | Modular skill files + AI/CORE.md | Medium — must preserve all 18 modes |
| **18 modes in CLAUDE.md** | 18 skill files in `AI/skills/` | Low — extract, don't rewrite |
| **AI/AI Context Log.md** | `AI/memory/context-log.md` + structured state files | Low — add structure around existing |
| **No cross-session memory** | `AI/memory/` with learnings, signals, work tracking | New — additive only |
| **No feedback capture** | Rating capture + mistake tracking in `AI/memory/` | New — additive only |
| **5 launchd scripts** | Same scripts + hook system on top | Low — extend, don't replace |
| **afk-code (Discord/Telegram relay)** | Same — it already works | None |
| **eventkit-cli, mail-cli** | Same — they're solid tools | None |
| **6 MCP servers** | Same — keep all | None |
| **AI/Prompts/ (18 templates)** | Move into skill workflows | Low |
| **No context budgeting** | Context maps per domain + selective loading | New — additive |
| **No event-driven hooks** | PAI-style hook system | New — additive |
| **24 .base databases** | Unchanged — vault structure stays | None |

### 2.2 What Does NOT Change

These are working and should not be touched during transformation:

- Vault folder structure (Areas/, Projects/, Notes/, Content/, etc.)
- All 24 .base database files and their frontmatter schemas
- Daily Notes, Meeting Notes, Journal structure
- eventkit-cli and mail-cli source code and functionality
- afk-code remote control system
- Obsidian plugins (obsidian-git, dataview, readwise, etc.)
- Git workflow and .gitignore
- Discord/Telegram webhook notifications
- Apple Reminders as primary task system
- GPR project tracking

---

## 3. 
TARGET ARCHITECTURE + +``` +LifeOS/ (your existing Obsidian vault — unchanged top-level) +│ +├── AI/ # Existing — restructured internals +│ ├── CORE.md # NEW: replaces monolithic CLAUDE.md +│ │ # Contains ONLY: identity, universal +│ │ # rules, routing table, vault overview +│ │ # Target: ~4,000 tokens (vs 14,500) +│ │ +│ ├── skills/ # NEW: extracted from CLAUDE.md modes +│ │ ├── _SKILL-TEMPLATE.md # Template for new skills +│ │ ├── ai-equilibrium-editor.md # From mode 6.1 +│ │ ├── translator.md # From mode 6.2 +│ │ ├── editing-rewriting.md # From mode 6.2b +│ │ ├── zen-jaskiniowca.md # From mode 6.3 +│ │ ├── business-advisor.md # From mode 6.4 +│ │ ├── strategy-advisor.md # From mode 6.5 +│ │ ├── productivity-advisor.md # From mode 6.6 +│ │ ├── vault-janitor.md # From mode 6.7 +│ │ ├── health-fitness.md # From mode 6.8 +│ │ ├── communication-writing.md # From mode 6.9 +│ │ ├── network-management.md # From mode 6.10 +│ │ ├── learning-knowledge.md # From mode 6.11 +│ │ ├── technical-architecture.md# From mode 6.12 +│ │ ├── legal-advisor.md # From mode 6.13 +│ │ ├── financial-advisor.md # From mode 6.14 +│ │ ├── general-advisor.md # From mode 6.15 +│ │ ├── weekly-review.md # From mode 6.16 +│ │ ├── daily-shutdown.md # From mode 6.17 +│ │ └── meeting-processing.md # From mode 6.18 +│ │ +│ ├── context/ # NEW: domain-specific context maps +│ │ ├── newsletter.md # What to load for newsletter work +│ │ ├── business.md # What to load for business work +│ │ ├── health.md # What to load for health work +│ │ ├── content.md # What to load for content creation +│ │ ├── investing.md # What to load for investing +│ │ ├── network.md # What to load for people/network +│ │ └── personal.md # What to load for personal +│ │ +│ ├── policies/ # NEW: extracted from CLAUDE.md +│ │ ├── provocation-protocol.md # Section 4 of current CLAUDE.md +│ │ ├── council-of-experts.md # Section 7 of current CLAUDE.md +│ │ ├── proactivity-protocol.md # Current proactivity rules +│ │ ├── linking-rules.md 
# Note linking & vault graph rules +│ │ ├── security-boundaries.md # Tool restrictions, permissions +│ │ ├── emergency-protocols.md # Anxiety, victim mode responses +│ │ └── formatting-rules.md # Language, style, anti-slop +│ │ +│ ├── memory/ # NEW: cross-session persistence +│ │ ├── learnings/ # What the system has learned +│ │ │ ├── execution.md # Task approach improvements +│ │ │ ├── preferences.md # Your style, tone, habits +│ │ │ └── mistakes.md # Errors to avoid +│ │ ├── signals/ # Your feedback +│ │ │ └── ratings.jsonl # Timestamped quality signals +│ │ ├── work/ # Work-in-progress tracking +│ │ │ └── current.md # What's active, what stage, next step +│ │ └── context-log.md # Existing AI Context Log (moved) +│ │ +│ ├── hooks/ # NEW: event-driven automation +│ │ ├── on-session-start.ts # Load context, check WIP, greet +│ │ ├── on-session-end.ts # Capture learnings, update WIP +│ │ ├── on-feedback.ts # Capture ratings and corrections +│ │ └── lib/ # Shared utilities +│ │ +│ ├── My AgentOS/ # EXISTING — unchanged +│ │ ├── CV Krzysztof Goworek.md +│ │ ├── Deep Profile & Operating Manual.md +│ │ ├── Personal Information.md +│ │ └── ... 
+│ │ +│ ├── Prompts/ # EXISTING — gradually migrate into skills +│ │ └── (18 files) +│ │ +│ └── scripts/ # EXISTING — unchanged, extended +│ ├── daily-brief.sh +│ ├── deep-work-block.sh +│ ├── weekly-review.sh +│ ├── vault-cleanup.sh +│ ├── daily-close.sh +│ └── notify-utils.sh +│ +├── CLAUDE.md # TRANSITIONAL: slim router that +│ # points to AI/CORE.md + AI/skills/ +│ # Eventually becomes just a pointer +│ +└── [rest of vault unchanged] +``` + +### 3.1 How Context Loading Changes + +**Current** (every session): +``` +Load CLAUDE.md (14,500 tokens) → all 18 modes, all rules, everything +Maybe read AI Context Log (+2,000 tokens) +Maybe read Deep Profile (+3,000 tokens) += 15,000–20,000 tokens before any work starts +``` + +**Target** (per session): +``` +Load CLAUDE.md slim router (1,000 tokens) → identity, routing table +Auto-load via on-session-start hook: + → AI/memory/work/current.md (WIP state, 200 tokens) + → AI/memory/context-log.md (current situation, 2,000 tokens) + → Relevant skill file ONLY (500–2,000 tokens per skill) + → Relevant context map ONLY (300 tokens) + → Relevant policies ONLY (loaded by skill reference) += 4,000–6,000 tokens, precisely targeted +``` + +**The key mechanism**: Each skill file declares what it needs: + +```yaml +--- +name: AI Equilibrium Editor +triggers: ["newsletter", "AI Equilibrium", "content", "AIEQ"] +context_files: + - AI/context/newsletter.md + - Content/AI Equilibrium/ +policies: + - provocation-protocol + - formatting-rules +voice: "British English, authoritative, insightful, direct, pragmatic" +--- + +# AI Equilibrium Editor + +[Mode-specific instructions extracted from CLAUDE.md 6.1] +``` + +And each context map declares what vault files matter for that domain: + +```yaml +# AI/context/newsletter.md +--- +domain: newsletter +--- + +## Primary Files +- Content/AI Equilibrium/ (all issues, ideas, pipeline) +- AI/Prompts/AIEQ*.md (pipeline prompts) +- Content/Newsletter Ideas.md + +## Secondary Files (load if relevant) 
+- Areas/Companies/ (for business angles) +- Notes/Meeting Notes/ (last 7 days, for recency) + +## Current State +- Read from: AI/memory/context-log.md → newsletter section +- Active WIP: AI/memory/work/current.md → newsletter items +``` + +### 3.2 How Memory Works + +**Cross-session memory** (the biggest gap in the audit): + +``` +Session ends → + on-session-end hook fires → + 1. Updates AI/memory/work/current.md + (what was worked on, what stage, next step) + 2. Appends to AI/memory/learnings/ if corrections occurred + 3. Captures rating if given (to signals/ratings.jsonl) + 4. Updates AI/memory/context-log.md if significant changes + +Next session starts → + on-session-start hook fires → + 1. Reads AI/memory/work/current.md + → "Last session you were drafting newsletter #48, + you completed the outline, next step is writing + the intro section" + 2. Reads recent learnings + 3. Reads context-log.md + → Provides continuity without user re-explaining +``` + +**Learnings structure**: + +```markdown +# AI/memory/learnings/preferences.md + +## Writing Style +- Newsletter tone: conversational expert, like explaining to a smart friend +- Anti-patterns: never use "furthermore", "it is worth noting", "in conclusion" +- [Added 2026-02-04 after user rewrote 60% of newsletter #47 draft] + +## Communication +- LinkedIn posts: max 1300 chars, hook in first line, no hashtags in body +- Email: direct, no pleasantries beyond first line +- [Added 2026-01-20] + +## Workflow +- Don't suggest reorganizing vault structure proactively — user said no 4 times +- When doing weekly review, always check Reminders BEFORE calendar +- [Added 2026-01-26] +``` + +### 3.3 How Feedback Capture Works + +**Explicit** (user rates or corrects): +``` +User: "That draft was a 3/10, way too formal" +→ on-feedback hook captures: + {timestamp, task: "newsletter-draft", rating: 3, feedback: "too formal"} +→ Appends to signals/ratings.jsonl +→ If rating < 6: auto-creates learning entry in mistakes.md +``` + 
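The explicit path above maps cleanly to a small hook. The sketch below is illustrative, not the actual `on-feedback` implementation: the function names, the `N/10` regex, and the exact record shape are assumptions based on the flow described here.

```typescript
// Illustrative sketch of the on-feedback hook's core logic.
// Detects an explicit "N/10" rating in a user message and shapes
// the JSONL record described above. Names are hypothetical.
import { appendFileSync } from "node:fs";

interface RatingSignal {
  timestamp: string;
  task: string;
  rating: number;
  feedback: string;
}

// Matches "3/10", "10/10", "7 / 10", etc.
// Returns null when no explicit rating is present.
export function parseRating(message: string): number | null {
  const m = message.match(/\b(10|[1-9])\s*\/\s*10\b/);
  return m ? parseInt(m[1], 10) : null;
}

export function buildSignal(task: string, message: string): RatingSignal | null {
  const rating = parseRating(message);
  if (rating === null) return null;
  return {
    timestamp: new Date().toISOString(),
    task,
    rating,
    feedback: message,
  };
}

// Append one JSON object per line (JSONL). A rating below 6 would
// additionally trigger a learning entry in mistakes.md (not shown).
export function appendSignal(path: string, signal: RatingSignal): void {
  appendFileSync(path, JSON.stringify(signal) + "\n");
}
```

The pure parsing functions are separated from the file append so the hook's detection logic can be tested without touching `signals/ratings.jsonl`.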
+**Implicit** (user rewrites, corrects, or re-asks): +``` +User significantly edits AI output → detected by diff +→ Learning: "User changed [what] to [what], in context [task]" +→ Stored in learnings/preferences.md +``` + +**Proactivity learning** (from current CLAUDE.md, but now persistent): +``` +User rejects suggestion type 3+ times → +→ Stored in learnings/preferences.md as negative preference +→ Persists across sessions (currently lost at session end) +``` + +--- + +## 4. SAFE TRANSFORMATION PROGRAM + +### Guiding Principles + +1. **Never break what works**. Every step must leave the system functional. +2. **Additive before subtractive**. Create new structure first, then migrate. +3. **Git checkpoint after every step**. Commit with descriptive messages. Rollback = `git revert`. +4. **Test after each step**. Run a representative task to verify nothing broke. +5. **CLAUDE.md is the last thing to slim down**. It stays as the authority until all skills are verified. + +### Step 0: Prepare (no vault changes) + +``` +□ Git status clean, all committed +□ Create a git branch: ai-genius-transformation +□ Document current CLAUDE.md section boundaries (line numbers for each mode) +□ Verify all 5 launchd scripts are working +□ Verify eventkit-cli and mail-cli are working +□ Baseline: run one session with current setup, note behavior +``` + +### Step 1: Create the New Directory Structure + +**Risk: ZERO** — only creates new empty folders and template files. + +``` +□ Create AI/skills/ directory +□ Create AI/context/ directory +□ Create AI/policies/ directory +□ Create AI/memory/ directory structure + └── learnings/, signals/, work/ +□ Create AI/hooks/ directory +□ Create _SKILL-TEMPLATE.md with standard skill format +□ Git commit: "scaffold: create modular AI architecture directories" +``` + +### Step 2: Extract Policies from CLAUDE.md + +**Risk: LOW** — creates new files, does NOT modify CLAUDE.md yet. 
+ +``` +□ Extract Provocation Protocol (Section 4) → AI/policies/provocation-protocol.md +□ Extract Council of Experts (Section 7) → AI/policies/council-of-experts.md +□ Extract proactivity rules → AI/policies/proactivity-protocol.md +□ Extract note linking rules → AI/policies/linking-rules.md +□ Extract security/boundary rules → AI/policies/security-boundaries.md +□ Extract emergency protocols → AI/policies/emergency-protocols.md +□ Extract formatting/style rules → AI/policies/formatting-rules.md +□ Each policy file is a clean standalone document +□ Verify: read each policy file, confirm it matches CLAUDE.md source +□ Git commit: "extract: policies from CLAUDE.md into modular files" +``` + +### Step 3: Extract Skills from CLAUDE.md + +**Risk: LOW** — creates new files, does NOT modify CLAUDE.md yet. + +``` +□ For each of the 18 modes (6.1–6.18): + □ Create AI/skills/{skill-name}.md + □ Copy the full mode definition + □ Add YAML frontmatter: name, triggers, context_files, policies, voice + □ Add references to which policies this skill uses + □ Add references to which context maps this skill needs +□ Verify: each skill file is self-contained and readable +□ Verify: no mode content was lost in extraction +□ Git commit: "extract: 18 skill modes from CLAUDE.md into skill files" +``` + +### Step 4: Create Context Maps + +**Risk: LOW** — new files only, additive. 
+ +``` +□ Create AI/context/newsletter.md + → Map: Content/AI Equilibrium/, AI/Prompts/AIEQ*, Newsletter Ideas +□ Create AI/context/business.md + → Map: Projects/GPR/, Areas/Companies/, Projects/Deals/ +□ Create AI/context/health.md + → Map: Areas/Health/, Health OS +□ Create AI/context/content.md + → Map: Content/, short-form video ideas +□ Create AI/context/investing.md + → Map: Areas/Investing/ +□ Create AI/context/network.md + → Map: Areas/People/ +□ Create AI/context/personal.md + → Map: light references only, advisory-only flag +□ Each context map lists: primary files, secondary files, related domains +□ Git commit: "create: domain context maps for selective loading" +``` + +### Step 5: Bootstrap Memory System + +**Risk: LOW** — new files, additive. Moves AI Context Log (creates symlink or updates references). + +``` +□ Create AI/memory/context-log.md + → Copy content from AI/AI Context Log.md + → Add symlink or update CLAUDE.md reference to point to new location + → Keep old file with a redirect note (don't break existing scripts) +□ Create AI/memory/work/current.md + → Start with empty template: "No active WIP from previous session" +□ Create AI/memory/learnings/execution.md (empty template) +□ Create AI/memory/learnings/preferences.md + → Seed with known preferences from CLAUDE.md (language rules, style, etc.) +□ Create AI/memory/learnings/mistakes.md (empty template) +□ Create AI/memory/signals/ratings.jsonl (empty) +□ Git commit: "bootstrap: memory system with initial state" +``` + +### Step 6: Slim Down CLAUDE.md (The Critical Step) + +**Risk: MEDIUM** — this changes the primary instruction file. + +**Strategy**: Don't delete anything from CLAUDE.md yet. Instead, create a NEW slim CLAUDE.md that references the modular files, and rename the old one as CLAUDE-legacy.md for safety. + +``` +□ Rename CLAUDE.md → CLAUDE-legacy.md +□ Create new CLAUDE.md (~4,000 tokens) containing ONLY: + 1. 
Identity & voice (Section 2 essentials — who you are, who user is) + 2. Universal rules (formatting, language, anti-slop — the ones that apply to ALL modes) + 3. Vault structure overview (condensed) + 4. Tool reference (eventkit-cli, mail-cli — just the essentials) + 5. Skill routing table: + "Based on the user's request, identify the relevant skill from AI/skills/. + Read the skill file. Follow its instructions. + If no skill matches, use AI/skills/general-advisor.md." + 6. Policy loading instruction: + "Each skill references policies. Read the referenced policies from AI/policies/." + 7. Context loading instruction: + "Each skill references context maps. Read the context map from AI/context/. + Load the primary files listed. Load secondary files only if relevant." + 8. Memory instruction: + "At session start, read AI/memory/work/current.md and AI/memory/context-log.md. + At session end, update these files." +□ Test: run a session with the new CLAUDE.md + → Verify skill routing works (try triggering 3-4 different skills) + → Verify context loading works (are the right files being read?) + → Verify formatting/style is preserved + → Verify provocation protocol still fires +□ If test fails: git checkout CLAUDE.md (instant rollback) +□ If test passes: git commit "refactor: slim CLAUDE.md with modular skill routing" +□ Keep CLAUDE-legacy.md for 2 weeks, then remove +``` + +### Step 7: Add Hook System + +**Risk: LOW** — additive automation on top of working system. 
+ +``` +□ Create AI/hooks/on-session-start.ts (Bun script) + → Read AI/memory/work/current.md + → Read AI/memory/context-log.md (last 50 lines) + → Read AI/memory/learnings/ (recent entries) + → Output: context summary injected into session +□ Create AI/hooks/on-session-end.ts + → Prompt: update work/current.md with session summary + → Prompt: capture any learnings from this session + → Prompt: update context-log.md if significant changes occurred +□ Create AI/hooks/on-feedback.ts + → Detect explicit ratings (1-10 pattern) + → Append to signals/ratings.jsonl + → If rating < 6: trigger learning capture +□ Register hooks in Claude Code settings.json +□ Test: run a session, verify hooks fire correctly +□ Git commit: "add: PAI-style hook system for memory persistence" +``` + +### Step 8: Enhance Existing Automation + +**Risk: LOW** — modifies existing scripts minimally. + +``` +□ Update daily-brief.sh: + → Add: read AI/memory/work/current.md for WIP items in briefing + → Add: read AI/memory/learnings/ for recent learnings in briefing +□ Update daily-close.sh: + → Add: update AI/memory/work/current.md at end of day + → Add: update AI/memory/context-log.md +□ Update weekly-review.sh: + → Add: review AI/memory/learnings/ — synthesize weekly patterns + → Add: review AI/memory/signals/ — analyze quality trends + → Add: propose CLAUDE.md or skill file improvements based on data +□ Test each script after modification +□ Git commit: "enhance: existing launchd scripts with memory awareness" +``` + +### Step 9: Add Challenger Behavior as Policy + +**Risk: LOW** — new policy file + CORE.md reference. + +``` +□ Create AI/policies/challenger-protocol.md + → Rules: always consider if the task can be improved before executing + → Rules: connect to other domains when relevant + → Rules: if user asks for something suboptimal, suggest the better path + → Guardrails: max 1 challenge per interaction, accept "just do it" gracefully + → Track: when challenges are accepted vs. 
overridden (in learnings) +□ Reference from CLAUDE.md universal rules +□ Git commit: "add: challenger protocol for proactive quality improvement" +``` + +### Step 10: Verify and Clean Up + +``` +□ Run full test suite: + → Trigger each of the 18 skills explicitly + → Test cross-domain context loading (newsletter + business) + → Test memory persistence (end session, start new, verify WIP handoff) + → Test feedback capture (give rating, verify it's stored) + → Test challenger mode (give a suboptimal request, verify pushback) + → Test all 5 launchd scripts + → Test afk-code remote control +□ If all pass: + □ Remove CLAUDE-legacy.md + □ Update AI/My AgentOS/ references if needed + □ Git commit: "complete: AI Genius transformation verified" +□ If any fail: + □ Fix individually, test again + □ Or rollback specific steps: git revert [commit] +``` + +--- + +## 5. WHAT THIS ENABLES (Future Expansion) + +Once the modular architecture is in place, these become straightforward additions: + +| Capability | How to add it | Effort | +|---|---|---| +| **New skill** | Create one .md file in AI/skills/ | 15 minutes | +| **New domain** | Create context map + optionally new skill | 30 minutes | +| **Voice interface** | New interface that reads skills/context maps, pipes to Claude | Separate project | +| **More hooks** | Add .ts files to AI/hooks/, register in settings.json | 1 hour each | +| **Proactive scanning** | New launchd script that reads context maps, runs analysis | Half day | +| **New service connector** | New CLI tool or MCP server, reference in skills | Per service | +| **A/B testing prompts** | Version skill files (v1, v2), track ratings by version | 2 hours | +| **Cross-domain intelligence** | Skill reads multiple context maps, pulls from several domains | Built into architecture | + +The architecture is designed so that Claude Code — running on your Mac Mini or anywhere — can itself propose and create new skills, policies, and context maps. 
The system improves by adding files, not by editing a monolith. + +--- + +## 6. WHAT I IMPROVED ON YOUR THINKING + +1. **No management hierarchy needed.** The modular file structure IS the management structure. Skills are your "employees." Context maps are their "briefings." Policies are the "company handbook." Memory is the "institutional knowledge." You don't need CoS/Directors — you need good file organization and smart context loading. + +2. **Context maps are more valuable than a dispatcher daemon.** A daemon is complex software to build and maintain. Context maps are markdown files that tell Claude what to read. Same result, 100x simpler. + +3. **The hook system does what a dispatcher does, but lighter.** PAI's hook pattern (on-session-start, on-session-end, on-feedback) gives you event-driven behavior without building a custom server. + +4. **Keep the existing CLAUDE.md as the routing layer.** Claude Code already loads this automatically. Don't fight that — use it as the thin router that points to the modular system. No need for a separate dispatcher. + +5. **Memory solves your #1 problem without infrastructure.** Cross-session persistence via markdown files in AI/memory/ is simple, inspectable, version-controlled, and works immediately. No database, no daemon, no server. + +6. **Your launchd scripts are already the "proactive pipeline."** They run daily/weekly on schedule. Enhancing them with memory awareness (read learnings, update WIP) gives you proactive behavior without building new infrastructure. + +7. **The transformation is ~10 steps, each independently rollbackable.** No big bang migration. No risk of losing your current working system. 
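Each skill file's YAML frontmatter (the format shown in section 3.1) is machine-readable, which is what keeps the routing layer light. As an illustration of how thin the "dispatcher" can be, here is a minimal frontmatter reader in TypeScript. It is a hand-rolled sketch, not a deliverable of this plan, and not a full YAML parser: it handles only the flat keys and simple lists the skill template uses.

```typescript
// Reads the YAML frontmatter block of a skill file and returns its
// fields, so a hook or script can resolve context_files and policies.
// Only supports "key: value" pairs and "key:" followed by "- item"
// lists, which is all the skill template format needs.
export function parseFrontmatter(
  markdown: string
): Record<string, string | string[]> {
  const match = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return {};
  const fields: Record<string, string | string[]> = {};
  let currentList: string[] | null = null;
  for (const line of match[1].split("\n")) {
    // List item under the most recent "key:" line.
    const item = line.match(/^\s+-\s+(.+)$/);
    if (item && currentList) {
      currentList.push(item[1].trim());
      continue;
    }
    const kv = line.match(/^(\w+):\s*(.*)$/);
    if (!kv) continue;
    if (kv[2] === "") {
      // "context_files:" with no inline value starts a list.
      currentList = [];
      fields[kv[1]] = currentList;
    } else {
      currentList = null;
      fields[kv[1]] = kv[2].replace(/^"|"$/g, "");
    }
  }
  return fields;
}
```

Inline lists such as `triggers: [...]` are kept as raw strings here; a real implementation would lean on a proper YAML library rather than this sketch.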
diff --git a/Plans/instance-briefing.md b/Plans/instance-briefing.md new file mode 100644 index 000000000..48fea079a --- /dev/null +++ b/Plans/instance-briefing.md @@ -0,0 +1,229 @@

# LifeOS Architecture Changes — Briefing for All Claude Instances

**Date**: February 2026
**Context**: The LifeOS vault underwent a major architectural transformation (6 phases). If you previously worked in this vault, the structure has changed significantly. This document tells you what changed, what moved, and what you need to know.

---

## What Happened

The vault's AI system was restructured from a monolithic 58KB `CLAUDE.md` file into a modular architecture inspired by Daniel Miessler's PAI (Personal AI Infrastructure). The transformation was executed across 6 phases over multiple sessions.

**Nothing outside `AI/` and `CLAUDE.md` changed.** The vault folder structure (Areas/, Projects/, Notes/, Content/, Tracking/, _System/, Tools/) is identical. All .base database files, frontmatter schemas, Obsidian plugins, eventkit-cli, mail-cli, afk-code, and MCP servers are untouched.

---

## CLAUDE.md — What Changed

**Before**: 58,119 bytes (~14,500 tokens). Contained everything: identity, 18 modes, all policies, tool docs, formatting rules, protocols.

**After**: ~1,700 tokens. Contains only:
1. Identity (who you are, who the user is)
2. Universal rules (memory loading, skill routing)
3. Skill routing instruction (read the matching skill file from `AI/skills/`)
4. Effort classification (route to The Algorithm for STANDARD+ tasks)
5. Condensed vault structure overview
6. Tool reference (eventkit-cli, mail-cli essentials)

**All mode content, policies, and protocols were extracted — not deleted.** They live in modular files now. 
+ +--- + +## New Directory Structure Under AI/ + +``` +AI/ +├── skills/ # 28+ skill files (were the 18 modes + new skills) +│ ├── ai-equilibrium-editor.md +│ ├── translator.md +│ ├── business-advisor.md +│ ├── research.md # NEW — multi-source research +│ ├── council.md # NEW — multi-agent debate +│ ├── create-skill.md # NEW — self-extending +│ ├── telos.md # NEW — life OS / goals +│ ├── algorithm.md # NEW — structured execution engine +│ ├── review-proposals.md # NEW — improvement proposal review +│ └── ... (23 original + 5 new) +│ +├── context/ # Domain context maps (which vault files matter per domain) +│ ├── newsletter.md +│ ├── business.md +│ ├── health.md +│ ├── content.md +│ ├── investing.md +│ ├── network.md +│ ├── personal.md +│ ├── research.md +│ └── goals.md +│ +├── policies/ # Extracted policy files (were inline in CLAUDE.md) +│ ├── provocation-protocol.md +│ ├── council-of-experts.md +│ ├── proactivity-protocol.md +│ ├── linking-rules.md +│ ├── security-boundaries.md +│ ├── emergency-protocols.md +│ ├── formatting-rules.md +│ ├── challenger-protocol.md +│ ├── effort-classification.md +│ ├── algorithm-protocol.md +│ └── security-patterns.yaml +│ +├── telos/ # Personal knowledge (beliefs, lessons, wisdom, predictions) +│ ├── beliefs.md +│ ├── lessons.md +│ ├── wisdom.md +│ └── predictions.md +│ +├── memory/ +│ ├── work/ +│ │ ├── current.md # WIP state (auto-updated by hooks) +│ │ ├── state.json # Active session pointer +│ │ └── YYYY-MM-DD_slug/ # Per-session work directories +│ │ ├── META.yaml # Status, effort, prompt count, timestamps +│ │ ├── ISC.md # Algorithm ISC table (if used) +│ │ └── activity-report.md # Session activity summary +│ ├── learnings/ +│ │ ├── preferences.md # Structured table with confidence scores +│ │ ├── mistakes.md # Structured table with occurrence counts +│ │ ├── execution.md # Task approach learnings +│ │ └── synthesis/ # Weekly synthesis reports (YYYY-WW.md) +│ ├── proposals/ +│ │ ├── pending/ # Improvement proposals awaiting 
review +│ │ ├── approved/ # Applied proposals +│ │ └── rejected/ # Rejected proposals + blocklist.md +│ ├── signals/ +│ │ └── ratings.jsonl # Explicit + implicit quality ratings +│ ├── research/ # Research skill outputs +│ ├── security/ # Security validator audit trail +│ ├── events/ # Observability event logs (YYYY-MM-DD.jsonl) +│ └── context-log.md # Was: AI/AI Context Log.md (MOVED here) +│ +├── hooks/ # TypeScript hooks (Bun runtime) +│ ├── on-session-start.ts # Context injection (first prompt only) +│ ├── on-session-end.ts # Work completion + activity report +│ ├── on-feedback.ts # Explicit rating capture (1-10) +│ ├── security-validator.ts # PreToolUse security checks +│ ├── format-enforcer.ts # Format reminder injection +│ ├── auto-work-creation.ts # Automatic session tracking +│ ├── implicit-sentiment.ts # Passive frustration/satisfaction detection +│ ├── event-capture.ts # PostToolUse event logging +│ └── lib/ +│ ├── paths.ts # Shared vault path utilities +│ ├── event-logger.ts # JSONL event logging utility +│ ├── preference-tracker.ts # Preference recording with confidence +│ └── mistake-tracker.ts # Mistake recording with occurrence counts +│ +├── observability/ # Optional real-time dashboard +│ ├── server.ts +│ ├── dashboard.html +│ └── start.sh +│ +├── scripts/ # Enhanced launchd scripts (unchanged location) +│ ├── daily-brief.sh # Now includes WIP + proposals reminder +│ ├── daily-close.sh # Now updates memory +│ ├── weekly-review.sh # Now includes AI System Health section +│ ├── learning-synthesis.sh # NEW — weekly signal analysis +│ ├── deep-work-block.sh # Unchanged +│ └── vault-cleanup.sh # Unchanged +│ +├── My AgentOS/ # Unchanged +├── Prompts/ # Unchanged +└── GPR to Reminders Mapping.md # Unchanged +``` + +--- + +## What You Need to Know When Working Here + +### 1. Skill Routing + +CLAUDE.md no longer contains mode instructions. 
When you need to act in a specific mode: +- Read the matching skill file from `AI/skills/` +- The skill's YAML frontmatter tells you which context maps and policies to load +- Follow the skill's instructions + +Example: if the user asks about their newsletter, read `AI/skills/ai-equilibrium-editor.md`, then load `AI/context/newsletter.md` for relevant vault files, and `AI/policies/formatting-rules.md` for style rules. + +### 2. Context Maps + +Instead of loading the entire vault structure, read the relevant context map from `AI/context/`. Each map lists which vault files and folders matter for that domain. This keeps context loading efficient. + +### 3. Memory System + +**Read at session start:** +- `AI/memory/work/current.md` — what was being worked on last session +- `AI/memory/context-log.md` — current priorities and situation + +**Update at session end:** +- `AI/memory/work/current.md` — what you worked on, where you left off + +**Note:** If hooks are running (interactive sessions), this is handled automatically. For `claude -p` invocations in scripts, you may need to read/update manually. + +### 4. Moved Files + +| Old Location | New Location | +|---|---| +| `AI/AI Context Log.md` | `AI/memory/context-log.md` | +| Policies (inline in CLAUDE.md) | `AI/policies/*.md` | +| 18 modes (inline in CLAUDE.md) | `AI/skills/*.md` | + +The old `AI/AI Context Log.md` may contain a redirect notice pointing to the new location. + +### 5. Hooks + +8 hooks are registered in `.claude/settings.json` (gitignored, lives only locally). They fire automatically during interactive sessions: +- **UserPromptSubmit**: context injection, feedback capture, format enforcer, work tracking, sentiment detection +- **PreToolUse**: security validator (blocks dangerous commands) +- **PostToolUse**: event logging +- **Stop**: session end processing + +If you're modifying hook files or adding new hooks, register them in `.claude/settings.json`. 
Format: +```json +{ + "matcher": "", + "hooks": [{"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/your-hook.ts"}] +} +``` +`matcher` must be a string (empty string for all events, pipe-delimited tool names like `"Bash|Edit|Write|Read"` for PreToolUse). + +### 6. The Algorithm + +For STANDARD+ effort tasks, The Algorithm creates an ISC (Ideal State Criteria) table in Markdown. This is stored at `AI/memory/work/{session}/ISC.md`. If you see an ISC file in a work directory, that session used structured execution. + +### 7. Security Validator + +The PreToolUse security hook checks every Bash command, file edit, write, and read against patterns in `AI/policies/security-patterns.yaml`. If you're adding automation or scripts that run with `bypassPermissions`, be aware that the security validator still fires and may block certain operations. Check the patterns file if you hit unexpected blocks. + +### 8. Ratings and Learning + +If the user gives a numeric rating (1-10), the `on-feedback.ts` hook captures it to `AI/memory/signals/ratings.jsonl`. If the user expresses frustration or satisfaction without an explicit rating, `implicit-sentiment.ts` captures it. Both feed into the weekly synthesis that proposes skill improvements. + +### 9. Scripts Changes + +`daily-brief.sh`, `daily-close.sh`, and `weekly-review.sh` were enhanced to read from the memory system. If you modify these scripts, preserve the memory-reading sections. `learning-synthesis.sh` is new — runs Sunday 9:00 AM, analyses the week's signals. + +### 10. Git Workflow + +All AI/ changes are committed to git with descriptive messages. The `.claude/` directory is gitignored (hook registration is local-only). `CLAUDE-legacy.md` is a backup of the original monolithic file — safe to delete after the transition period. + +--- + +## What NOT to Do + +- **Don't recreate content in CLAUDE.md.** It's intentionally slim. Content lives in skills and policies. 
+- **Don't write to `AI/AI Context Log.md`** — it moved to `AI/memory/context-log.md`. +- **Don't hardcode mode instructions.** Read the skill file dynamically. +- **Don't ignore context maps.** They exist to prevent loading the entire vault into context. +- **Don't modify hooks without testing.** A broken hook (non-zero exit) can block Claude Code. +- **Don't delete `AI/memory/` contents.** Ratings, learnings, and work history are used by the weekly synthesis. + +--- + +## Questions? + +If something is unclear about the new architecture, the full design documents are in the PAI reference repo at `~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Plans/`: +- `ai-genius-system-design.md` — overall architecture +- `transformation-briefing.md` — Phase 1 detailed plan +- `phase-2-briefing.md` through `phase-6-briefing.md` — subsequent phases +- `user-manual.md` — user-facing guide diff --git a/Plans/phase-2-briefing.md b/Plans/phase-2-briefing.md new file mode 100644 index 000000000..3e5e079e2 --- /dev/null +++ b/Plans/phase-2-briefing.md @@ -0,0 +1,690 @@ +# Phase 2: Full Hook System — Briefing for Claude Code + +## How to Use This File + +You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for **Phase 2** of the AI Genius transformation. Phase 1 is complete and merged to main. + +**Reference repo**: `~/PAI-reference/` — clone of Daniel Messler's PAI. Read files there for implementation patterns. Adapt, don't copy. + +**Safety rule**: After EVERY sub-step (2A through 2F), commit to git with a descriptive message. If anything breaks, `git revert` that commit. + +--- + +## PART 1: PHASE 1 COMPLETION STATE + +### What Exists After Phase 1 + +Phase 1 created the modular architecture. 
Here's what's in place: + +``` +AI/ +├── skills/ # 23 skill files with YAML frontmatter +├── context/ # 7 domain context maps +├── policies/ # 8 policy files +├── memory/ +│ ├── work/current.md # WIP tracking +│ ├── learnings/ +│ │ ├── preferences.md # User style preferences +│ │ ├── mistakes.md # Errors to avoid +│ │ └── execution.md # Task approach learnings +│ ├── signals/ratings.jsonl # Quality ratings (explicit) +│ └── context-log.md # Migrated from AI/AI Context Log.md +├── hooks/ +│ ├── on-session-start.ts # Context injection (WIP, context log, preferences) +│ ├── on-session-end.ts # Session end processing +│ ├── on-feedback.ts # Explicit rating capture (1-10) +│ └── lib/paths.ts # Shared path utilities +└── scripts/ # 5 launchd scripts (enhanced with memory awareness) +``` + +- `CLAUDE.md` — slim routing file (~1,656 tokens, down from 14,500) +- `CLAUDE-legacy.md` — safety backup (remove after 2 weeks from Phase 1 merge date) +- All 62 automated tests pass + +### Phase 1 Lessons Learned (CRITICAL — Read These) + +These were discovered during manual testing and will save you from repeating mistakes: + +#### 1. Hook settings.json Format + +The `.claude/settings.json` hook registration format must be: + +```json +{ + "hooks": { + "EventType": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /path/to/hook.ts"} + ] + } + ] + } +} +``` + +**Critical details**: +- `"matcher"` MUST be a **string**, not an object. Use `""` (empty string) for hooks that fire on all events. +- For PreToolUse hooks that target specific tools, use pipe-delimited tool names: `"matcher": "Bash|Edit|Write|Read"` +- Each hook entry is `{"type": "command", "command": "..."}` — the `type` field is required. + +#### 2. SessionStart Hooks Do NOT Inject Content + +- SessionStart hooks run but their **stdout is NOT passed to Claude's conversation context**. 
+- To inject content (like WIP state, preferences, reminders) into Claude's context, use **UserPromptSubmit** hooks instead.
+- UserPromptSubmit hook stdout is injected into the conversation as context blocks.
+- This is why `on-session-start.ts` is registered under UserPromptSubmit, not SessionStart.
+
+#### 3. Stop Hooks Do NOT Inject Content Either
+
+- Stop hooks run on `/exit` but their stdout is not displayed to the user.
+- The `on-session-end.ts` hook fires but its reminder output is not visible.
+- For end-of-day processing, rely on the `daily-close.sh` launchd script instead.
+- Stop hooks ARE useful for fire-and-forget side effects (writing files, logging).
+
+#### 4. Context Injection Fires Every Prompt
+
+- The `on-session-start.ts` hook currently fires on EVERY UserPromptSubmit, not just the first.
+- This is wasteful but not harmful — it re-injects WIP/preferences on every prompt.
+- **Phase 2 must fix this** with a first-prompt-only guard (see Step 2F below).
+
+#### 5. Hook Runtime
+
+- All hooks use `bun run` as the TypeScript runtime.
+- Hooks read JSON from stdin (event data) and write to stdout (content injection or decisions).
+- Hooks MUST exit 0 always — a non-zero exit can block Claude Code.
+- Exception: PreToolUse hooks can exit 2 to BLOCK a tool call.
+
+#### 6. 
Current Working settings.json + +This is the working `.claude/settings.json` from Phase 1: + +```json +{ + "hooks": { + "UserPromptSubmit": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-session-start.ts"}, + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-feedback.ts"} + ] + } + ], + "Stop": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-session-end.ts"} + ] + } + ] + } +} +``` + +--- + +## PART 2: PAI HOOK REFERENCE FILES + +Read these PAI files for implementation patterns before writing each hook: + +### Security Validator +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/SecurityValidator.hook.ts +``` +- PreToolUse hook — intercepts Bash, Edit, Write, Read +- Pattern categories: blocked (exit 2), confirm (ask user), alert (log + allow) +- Path categories: zeroAccess, readOnly, confirmWrite, noDelete +- Logs to MEMORY/SECURITY/ directory +- Exit codes: 0 = allow, 2 = hard block + +### Format Enforcer +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/FormatEnforcer.hook.ts +``` +- UserPromptSubmit hook — injects format rules on every prompt +- Self-healing: prevents format drift in long conversations +- Reads format spec from policy files +- Fast — just file read + stdout, no inference + +### Auto-Work Creation +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/AutoWorkCreation.hook.ts +``` +- UserPromptSubmit hook — creates work entries automatically +- Classifies prompts: work | question | conversational +- Tracks effort level: trivial | quick | standard | thorough +- Creates work directories with META.yaml + +### Implicit Sentiment Capture +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/ImplicitSentimentCapture.hook.ts +``` +- Detects frustration/satisfaction without explicit ratings +- Uses Haiku inference call (fast, cheap) +- Writes to ratings.jsonl with type: 
"implicit" +- Fire-and-forget — never blocks + +### Stop Orchestrator +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/StopOrchestrator.hook.ts +``` +- Single entry point for all Stop event handling +- Reads transcript ONCE, distributes to handlers +- Handlers: capture (work state), tab-state, system-integrity + +### Shared Libraries +``` +~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/paths.ts +~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/observability.ts +~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/learning-utils.ts +~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/metadata-extraction.ts +``` + +--- + +## PART 3: IMPLEMENTATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Vault folder structure (Areas/, Projects/, Notes/, Content/, etc.) +- All 24 .base database files +- eventkit-cli, mail-cli, afk-code +- Obsidian plugins +- Existing working hooks from Phase 1 (unless enhancing them) +- CLAUDE.md (unless adding a reference to a new hook or policy) + +**ALWAYS**: +- Git commit after every sub-step (2A through 2F) +- Test each hook individually before moving to the next +- Keep hooks fast — no inference calls except where explicitly specified (2D only) +- Exit 0 from every hook — never block Claude Code +- Use `bun run` as the TypeScript runtime (already installed on the system) + +--- + +### Step 2A: Security Validator Hook (PreToolUse) + +**What it does**: Intercepts every Bash, Edit, Write, and Read tool call before execution. Pattern-matches against security rules. Blocks dangerous commands, requests confirmation for risky ones, logs everything. + +**Why critical**: The launchd scripts run with `--permission-mode bypassPermissions`. One bad prompt could destroy the vault. This prevents that. 
+ +**Implementation**: + +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/SecurityValidator.hook.ts +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/paths.ts + +□ Create AI/policies/security-patterns.yaml: + bash: + blocked: + - "rm -rf /" + - "rm -rf ~" + - "rm -rf $HOME" + - "git push --force origin main" + - "git push --force origin master" + - "git reset --hard" + - "format" + - "mkfs" + - "> /dev/" + confirm: + - "rm -rf" + - "rm -r" + - "git push --force" + - "git reset" + - "chmod -R" + - "chown -R" + - "mv ~/LifeOS" + alert: + - "curl.*| sh" + - "curl.*| bash" + - "wget.*| sh" + - "eval" + - "sudo" + paths: + zeroAccess: + - "~/.ssh" + - "~/.aws" + - "~/.gnupg" + - "credentials" + - ".env" + - "*.key" + - "*.pem" + readOnly: + - "/etc" + - "/System" + - "/usr" + confirmWrite: + - "CLAUDE.md" + - ".claude/settings.json" + - "AI/policies/" + noDelete: + - "_System/Databases/" + - "AI/memory/" + - "AI/skills/" + - "AI/policies/" + - "AI/hooks/" + +□ Create AI/hooks/security-validator.ts: + → PreToolUse hook + → Reads JSON from stdin: { tool_name, tool_input } + → For Bash: check tool_input.command against bash patterns + → For Edit/Write: check tool_input.file_path against path rules + → For Read: check tool_input.file_path against zeroAccess paths + → Exit codes: + - 0: allowed (or alert-only — log but allow) + - 2: blocked (hard block, Claude sees rejection) + → For "confirm" patterns: output {"decision": "ask", "message": "⚠️ This command matches a security rule: [pattern]. 
Proceed?"} + → Log all decisions to AI/memory/security/ directory + → MUST: never crash, wrap everything in try/catch, exit 0 on any error + +□ Create AI/memory/security/ directory + +□ Register in .claude/settings.json: + Add to the existing config: + "PreToolUse": [ + { + "matcher": "Bash|Edit|Write|Read", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/security-validator.ts"} + ] + } + ] + +□ Test: + → Try: a normal command (ls, git status) — should pass silently + → Try: "rm -rf /" — should be hard-blocked (exit 2) + → Try: "rm -rf some-dir" — should trigger confirmation prompt + → Try: reading a file with "credentials" in path — should be blocked + → Verify: AI/memory/security/ contains log entries + +□ Git commit: "add: security validator hook with pattern-based rules" +``` + +--- + +### Step 2B: Format Enforcer Hook (UserPromptSubmit) + +**What it does**: Injects a condensed format reminder into every prompt as a ``. Prevents format drift in long conversations. + +**Why needed**: CLAUDE.md has formatting rules (British English, no slop, structured output, no emojis). In long sessions these fall out of the context window. This hook re-injects them. + +**Implementation**: + +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/FormatEnforcer.hook.ts + +□ Create AI/hooks/format-enforcer.ts: + → UserPromptSubmit hook + → Reads AI/policies/formatting-rules.md (already exists from Phase 1) + → Outputs a CONDENSED version (~200 tokens max) as stdout + → Format of output: + "FORMAT REMINDER: British English. No emojis. No corporate language. + Match user's language (Polish/English). Structured output with headers. + Factual, laconic, no filler. ADHD-aware: clear sections, bullet points." 
+  → Does NOT re-read the full policy file every time — cache the condensed version
+    in the hook itself or in a small separate file
+  → Fast execution — pure file read + stdout, no inference
+  → MUST: exit 0 always
+
+□ Register in .claude/settings.json:
+  Add to the existing UserPromptSubmit hooks array:
+  {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/format-enforcer.ts"}
+
+□ Test:
+  → Start a session, send 20+ messages
+  → Verify formatting stays consistent (British English, structured output)
+  → Verify the format reminder appears in the injected context blocks
+
+□ Git commit: "add: format enforcer hook for consistent output quality"
+```
+
+---
+
+### Step 2C: Auto-Work Creation Hook (UserPromptSubmit)
+
+**What it does**: Automatically creates a work tracking entry for each session. Classifies prompts. Tracks what you're working on without manual effort.
+
+**Why needed**: Phase 1's `work/current.md` is updated manually at session end. This makes tracking automatic and richer.
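+The two pure pieces of this hook — slug generation and effort classification — can be sketched up front. A minimal sketch: the thresholds follow the classification table in the implementation checklist, while the function names are illustrative.

```typescript
// Sketch of slug generation and effort classification for auto-work
// tracking. Thresholds mirror the implementation checklist
// (1-3 trivial, 4-10 quick, 11-25 standard, 26+ thorough);
// function names are illustrative.
function slugify(prompt: string, maxWords = 4): string {
  return prompt
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .trim()
    .split(/\s+/)
    .slice(0, maxWords) // keep the first meaningful words
    .join("-");
}

function classifyEffort(promptCount: number): string {
  if (promptCount <= 3) return "trivial";
  if (promptCount <= 10) return "quick";
  if (promptCount <= 25) return "standard";
  return "thorough";
}
```

+The directory name then becomes `{YYYY-MM-DD}_{slugify(firstPrompt)}`, and `classifyEffort` is re-run each time the prompt count in META.yaml is incremented.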
+ +**Implementation**: + +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/AutoWorkCreation.hook.ts + +□ Create AI/hooks/auto-work-creation.ts: + → UserPromptSubmit hook + → On FIRST prompt of session (check state file): + - Create AI/memory/work/{YYYY-MM-DD}_{slug}/ directory + - Create META.yaml inside: + status: active + created_at: ISO timestamp + effort: quick (default, can be updated) + prompts: 1 + - Update AI/memory/work/state.json with current session pointer: + {"active_work": "2025-01-15_newsletter-draft", "session_start": "ISO"} + → On SUBSEQUENT prompts: + - Increment prompt count in META.yaml + - Update effort classification based on prompt count: + 1-3 prompts → trivial + 4-10 prompts → quick + 11-25 prompts → standard + 26+ prompts → thorough + → Slug generation: take first meaningful words from prompt, lowercase, hyphenated + → MUST: exit 0 always, never block + +□ Create AI/memory/work/state.json (initial: {}) + +□ Register in .claude/settings.json: + Add to UserPromptSubmit hooks array: + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/auto-work-creation.ts"} + +□ Test: + → Start a session, send a prompt + → Verify: AI/memory/work/{date}_{slug}/ directory created with META.yaml + → Send more prompts, verify prompt count increments + → Verify state.json updated + +□ Git commit: "add: auto-work creation hook for automatic task tracking" +``` + +--- + +### Step 2D: Implicit Sentiment Capture Hook (UserPromptSubmit) + +**What it does**: Detects frustration, satisfaction, or confusion in messages WITHOUT explicit ratings. Uses a quick inference call to classify emotional state. + +**Why needed**: You won't rate every interaction 1-10. But "no, that's wrong, do it again" should be captured as negative signal. This captures those implicit signals. 
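+The heuristic-first approach can be sketched as follows; the signal lists are illustrative assumptions, not the final vocabulary.

```typescript
// Sketch of the pre-inference heuristic: cheap keyword matching that
// decides whether a prompt carries an implicit quality signal at all.
// Signal lists are illustrative, not exhaustive.
const NEGATIVE = ["wrong", "try again", "redo", "not what i", "you missed", "frustrating"];
const POSITIVE = ["perfect", "exactly", "spot on", "love it"];

function detectSignal(prompt: string): { sentiment: string; trigger: string } | null {
  const text = prompt.toLowerCase();
  for (const t of NEGATIVE) if (text.includes(t)) return { sentiment: "frustrated", trigger: t };
  for (const t of POSITIVE) if (text.includes(t)) return { sentiment: "satisfied", trigger: t };
  return null; // neutral — skip, write nothing to ratings.jsonl
}
```

+A `null` result means the hook exits 0 without touching `ratings.jsonl`; a hit gets serialized into the JSONL record shape described in the checklist.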
+ +**Implementation**: + +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/ImplicitSentimentCapture.hook.ts + +□ Create AI/hooks/implicit-sentiment.ts: + → UserPromptSubmit hook + → First: check if on-feedback.ts already detected an explicit rating for this prompt + - If yes: skip (don't double-count) + - Detection: check if prompt matches rating regex ^(10|[1-9])(?:\s*[-:]\s*|\s+)? + → Parse the user's prompt text from stdin JSON + → Quick heuristic check BEFORE inference call (save API costs): + - Negative signals: "no", "wrong", "that's not", "try again", "redo", "not what I", + "you missed", "didn't ask for", "ignore that", "sigh", "frustrating" + - Positive signals: "perfect", "exactly", "great", "thanks", "spot on", "love it" + - Neutral: no signal detected → skip inference, exit 0 + → If signal detected: classify sentiment as: + - rating: 1-10 (inferred) + - sentiment: frustrated | confused | satisfied | neutral + - confidence: low | medium | high + → Write to AI/memory/signals/ratings.jsonl: + {"timestamp": "ISO", "type": "implicit", "rating": N, "sentiment": "X", + "confidence": "Y", "trigger": "the signal word/phrase detected"} + → If inferred rating < 6 AND confidence is medium or high: + Append to AI/memory/learnings/mistakes.md with context + → MUST: fire-and-forget, never block, exit 0 always + → MUST: no stdout output (this hook should NOT inject into conversation) + +□ Register in .claude/settings.json: + Add to UserPromptSubmit hooks array: + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/implicit-sentiment.ts"} + +□ Test: + → Say "no, that's wrong" → verify implicit rating captured in ratings.jsonl + → Say "perfect, exactly what I needed" → verify positive signal captured + → Say a neutral prompt → verify NO entry added (skip) + → Give an explicit rating "8" → verify implicit capture is SKIPPED + +□ Git commit: "add: implicit sentiment capture for passive learning" +``` + +**Note on 
inference**: The PAI version uses a Haiku API call for sentiment classification. For the first version, use the heuristic approach above (keyword matching) to avoid API costs and latency. Upgrade to inference in Phase 3 if the heuristic proves too crude. + +--- + +### Step 2E: Enhanced Session End Hook (Stop) + +**What it does**: When a session ends, marks the active work as completed, updates metadata, clears session state. + +**Why needed**: Phase 1's `on-session-end.ts` is basic. This version integrates with the auto-work system from 2C. + +**Implementation**: + +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/StopOrchestrator.hook.ts +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/handlers/capture.ts + +□ Enhance AI/hooks/on-session-end.ts: + → Keep existing functionality + → Add: read AI/memory/work/state.json for active work pointer + → If active work exists: + - Update META.yaml: status → completed, completed_at → ISO timestamp + - Calculate duration from created_at to now + - Update final effort classification based on prompt count + → Clear AI/memory/work/state.json (reset to {}) + → Update AI/memory/work/current.md with: + - What was worked on (from work directory name) + - Status: completed + - Duration + - Effort level + → Remember: Stop hook stdout is NOT displayed to user — this is fire-and-forget + → MUST: exit 0 always, fast execution + +□ Test: + → Start a session (triggers auto-work creation from 2C) + → Do some work + → Exit with /exit + → Verify: META.yaml updated with completed_at + → Verify: state.json cleared + → Verify: work/current.md updated + +□ Git commit: "enhance: session end hook with auto-work completion tracking" +``` + +--- + +### Step 2F: First-Prompt-Only Guard for Context Injection + +**What it does**: Makes the `on-session-start.ts` context injection fire only on the FIRST prompt of each session, not on every prompt. 
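+The timestamp-based variant of this guard is small enough to sketch here; the 30-minute window mirrors the alternative approach in the implementation steps, and the function name is illustrative.

```typescript
// Sketch of the first-prompt guard: inject context only when the last
// injection is older than the freshness window. The 30-minute default
// and the function name are illustrative assumptions.
function shouldInject(lastInjectionIso: string | null, now: Date, windowMinutes = 30): boolean {
  if (lastInjectionIso === null) return true; // no prior injection recorded
  const elapsedMs = now.getTime() - new Date(lastInjectionIso).getTime();
  return elapsedMs > windowMinutes * 60 * 1000;
}
```

+In the hook, `lastInjectionIso` would come from the state file; on injection the current timestamp is written back, and the session-end hook clears it.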
+
+**Why needed**: Currently, WIP state + preferences + context log are re-injected on every single UserPromptSubmit. This wastes tokens and clutters the conversation with repeated context-injection blocks.
+
+**Implementation**:
+
+```
+□ Modify AI/hooks/on-session-start.ts:
+  → Add session tracking using a state file (e.g., AI/hooks/lib/.session-state.json)
+  → On each invocation:
+    1. Read .session-state.json
+    2. Check if "session_id" matches current process/time window
+       - Option A: Use the Claude session transcript path from stdin JSON
+       - Option B: Use a timestamp — if last injection was < 5 minutes ago, skip
+       - Option C (simplest): Use state.json from auto-work hook (2C) —
+         if state.json already has an active_work entry, skip injection
+    3. If this is the first prompt: output context, write state
+    4. If not first prompt: exit 0 with no output (silent)
+  → Recommended approach: Option C — piggyback on auto-work state.
+    If AI/memory/work/state.json has active_work set, context was already injected.
+    This naturally pairs with auto-work creation (2C).
+
+□ Alternative simpler approach (if Option C is too coupled):
+  → Create AI/hooks/lib/.last-injection timestamp file
+  → On invocation: read file, if timestamp is < 30 minutes old → skip
+  → On first prompt: inject context, write current timestamp
+  → On session end (2E): delete the timestamp file
+
+□ Test:
+  → Start new session → first prompt should show context injection
+  → Send second prompt → should NOT show context injection
+  → Wait 30+ minutes (or delete state file) → should re-inject
+  → Start completely new session → should inject again
+
+□ Git commit: "fix: context injection fires only on first prompt per session"
+```
+
+---
+
+### Step 2G: Update settings.json Registration
+
+**What it does**: Ensures all new hooks are properly registered in `.claude/settings.json`.
+ +**Implementation**: + +``` +□ Final .claude/settings.json should look like: + +{ + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash|Edit|Write|Read", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/security-validator.ts"} + ] + } + ], + "UserPromptSubmit": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-session-start.ts"}, + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-feedback.ts"}, + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/format-enforcer.ts"}, + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/auto-work-creation.ts"}, + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/implicit-sentiment.ts"} + ] + } + ], + "Stop": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/on-session-end.ts"} + ] + } + ] + } +} + +□ IMPORTANT: .claude/ is gitignored — this file lives only locally. + You cannot commit it. 
The user must update it manually or you must
+  write it directly to /Users/krzysztofgoworek/LifeOS/.claude/settings.json
+
+□ After updating, verify:
+  → Start new Claude Code session
+  → First prompt: should see context injection (on-session-start) + format reminder (format-enforcer)
+  → Security: try a dangerous command, verify it's blocked
+  → Feedback: give a rating, verify it's captured
+  → Work: verify work directory was created
+
+□ Git commit: "docs: document final settings.json hook registration"
+  (commit a reference copy or documentation, since .claude/ is gitignored)
+```
+
+---
+
+## PART 4: MANUAL TESTING CHECKLIST
+
+After all hooks are implemented, run through these tests:
+
+```
+□ Test 1: Security Validator
+  → Run: a normal git status → passes
+  → Run: try "rm -rf /" in a prompt → blocked
+  → Run: try editing CLAUDE.md → confirmation prompt
+  → Check: AI/memory/security/ has log entries
+
+□ Test 2: Format Enforcement
+  → Start a long conversation (15+ exchanges)
+  → Verify: formatting stays consistent (British English, structured, no emojis)
+  → Verify: the injected format reminder with format rules appears
+
+□ Test 3: Auto-Work Tracking
+  → Start session, ask about newsletter → work directory created
+  → Send 5 more prompts → prompt count = 6, effort = quick
+  → Exit session → META.yaml shows completed, work/current.md updated
+
+□ Test 4: Implicit Sentiment
+  → Say "that's wrong, try again" → check ratings.jsonl for implicit negative
+  → Say "perfect, exactly right" → check for implicit positive
+  → Say a neutral question → no new entry in ratings.jsonl
+
+□ Test 5: First-Prompt Guard
+  → Start session → context injection on first prompt
+  → Second prompt → NO context injection
+  → New session → context injection again
+
+□ Test 6: Explicit Rating (regression)
+  → Give a "7 - good response" rating
+  → Verify: ratings.jsonl has explicit entry
+  → Verify: implicit sentiment did NOT also create an entry
+
+□ Test 7: Combined Flow
+  → Start session (auto-work created, 
context injected, format reminder) + → Do some work + → Express frustration ("no, redo this") + → Give explicit rating ("8 - better") + → Exit session + → Verify: work directory complete, both ratings captured, security log clean +``` + +--- + +## PART 5: KNOWN ISSUES & EDGE CASES + +### Hook Execution Order + +UserPromptSubmit hooks fire sequentially in the order registered. The recommended order is: +1. `on-session-start.ts` — context injection (first-prompt only) +2. `on-feedback.ts` — explicit rating capture +3. `format-enforcer.ts` — format reminder injection +4. `auto-work-creation.ts` — work tracking +5. `implicit-sentiment.ts` — sentiment (runs last, checks if explicit rating already captured) + +### .claude/ is Gitignored + +The `.claude/settings.json` file cannot be committed. Options: +- Document the expected settings.json in `AI/hooks/README.md` +- Create `AI/hooks/settings.json.example` as a reference copy +- The user updates .claude/settings.json manually + +### Concurrent Hook Execution + +Multiple UserPromptSubmit hooks share state via the filesystem. Be careful with: +- `state.json` — both auto-work and session-start hooks may read/write it +- `ratings.jsonl` — both feedback and sentiment hooks append to it +- Use file locking or append-only patterns to avoid race conditions + +### Memory Directory Structure After Phase 2 + +``` +AI/memory/ +├── work/ +│ ├── current.md # Latest WIP summary +│ ├── state.json # Active session pointer +│ └── 2025-01-15_task-name/ # Per-session work directories +│ └── META.yaml # Status, timestamps, effort, prompt count +├── learnings/ +│ ├── preferences.md +│ ├── mistakes.md +│ └── execution.md +├── signals/ +│ └── ratings.jsonl # Explicit + implicit ratings +├── security/ # Security validator audit trail +│ └── YYYY-MM-DD.jsonl +└── context-log.md +``` + +--- + +## PART 6: WHAT COMES NEXT (Phase 3 Preview) + +Phase 2 produces raw signals (ratings, sentiment, work tracking, security logs). 
Phase 3 turns those signals into actionable improvements: + +- **Weekly learning synthesis** — analyzes ratings.jsonl + learnings/ to find patterns +- **Skill improvement proposals** — suggests concrete edits to skill files +- **Preference aggregation** — promotes high-confidence preferences to policy files + +Phase 2 is the foundation that makes Phase 3 possible. Get the hooks right, and the system starts learning automatically. diff --git a/Plans/phase-2-plus-roadmap.md b/Plans/phase-2-plus-roadmap.md new file mode 100644 index 000000000..b47349ee3 --- /dev/null +++ b/Plans/phase-2-plus-roadmap.md @@ -0,0 +1,409 @@ +# AI Genius: Phase 2+ Implementation Roadmap + +## Context + +Phase 1 (the current transformation-briefing.md) covers: +- Modular skill extraction from CLAUDE.md +- Basic memory system (WIP, learnings, preferences, mistakes) +- Basic hooks (session-start, session-end, feedback capture) +- Context maps and policies +- Challenger protocol + +This document plans everything that Phase 1 does NOT cover from PAI. + +--- + +## Priority Framework + +Phases are ordered by **value to you first, complexity second**: + +| Priority | Principle | +|---|---| +| 1st | Things that make the system smarter without you doing anything | +| 2nd | Things that protect you from AI mistakes | +| 3rd | Things that give you superpowers for specific tasks | +| 4th | Things that make the system observable and debuggable | +| 5th | Things that let the system grow itself | + +--- + +## PHASE 2: Full Hook System + +**Why first**: Hooks are where the "smart" behavior lives. They run automatically on every interaction — no manual invocation needed. Phase 1 gives you 3 hooks. PAI has 15. The missing 12 are where the system becomes genuinely self-aware. + +**Depends on**: Phase 1 complete (modular structure, basic hooks working) + +### 2A: Security Validator Hook (PreToolUse) + +**What it does**: Intercepts EVERY tool call (Bash, Edit, Write, Read) before execution. 
Pattern-matches against a security config. Blocks dangerous commands, prompts for confirmation on risky ones, logs everything.
+
+**Why you need it**: Your launchd scripts run with `--permission-mode bypassPermissions`. One bad prompt could `rm -rf` your vault. This prevents that.
+
+**Implementation**:
+```
+□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/PreToolUse/SecurityValidator.hook.ts
+□ Create AI/hooks/security-validator.ts
+  → PreToolUse hook (fires before any tool executes)
+  → Reads AI/policies/security-patterns.yaml for rules
+  → Exit codes: 0 = allow, 2 = block
+  → Returns {"decision": "ask"} for confirmable operations
+□ Create AI/policies/security-patterns.yaml:
+  bash:
+    blocked:
+      - pattern: "rm -rf /"
+      - pattern: "git push --force"
+      - pattern: "git reset --hard"
+    confirm:
+      - pattern: "rm -rf"
+      - pattern: "git push"
+      - pattern: "chmod -R"
+    alert:
+      - pattern: "curl.*| sh"
+      - pattern: "eval"
+  paths:
+    zeroAccess: ["~/.ssh", "~/.aws", "credentials", ".env"]
+    readOnly: ["/etc", "/System"]
+    confirmWrite: ["CLAUDE.md", ".claude/settings.json"]
+    noDelete: ["_System/Databases/", "AI/memory/"]
+□ Create AI/memory/security/ directory for audit trail
+□ Register hook in .claude/settings.json as PreToolUse matcher
+□ Test: try a blocked command, verify it's stopped
+□ Git commit: "add: security validator hook with pattern-based rules"
+```
+
+### 2B: Format Enforcer Hook (UserPromptSubmit)
+
+**What it does**: Injects a response format reminder into every prompt as a context block. Prevents format drift in long conversations — Claude tends to get sloppy after 20+ exchanges.
+
+**Why you need it**: Your CLAUDE.md has detailed formatting rules (structured output, British English, no slop). But in long sessions these get lost from the context window. This hook re-injects them every single prompt.
+
+**Implementation**:
+```
+□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/UserPromptSubmit/FormatEnforcer.hook.ts
+□ Create AI/hooks/format-enforcer.ts
+  → UserPromptSubmit hook
+  → Reads AI/policies/formatting-rules.md (already created in Phase 1)
+  → Outputs the condensed format reminder as injected context
+  → Skips for subagent contexts
+  → Fast — just file read + stdout, no inference
+□ Register in .claude/settings.json
+□ Test: long conversation, verify formatting stays consistent
+□ Git commit: "add: format enforcer hook for consistent output quality"
+```
+
+### 2C: Auto-Work Creation Hook (UserPromptSubmit)
+
+**What it does**: Automatically creates a work item in `AI/memory/work/` for every session. Classifies each prompt as work/question/conversational. Tracks what you're doing without you thinking about it.
+
+**Why you need it**: Phase 1's `work/current.md` is updated manually at session end. This makes it automatic and richer — every task gets a directory with metadata.
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/UserPromptSubmit/AutoWorkCreation.hook.ts +□ Create AI/hooks/auto-work-creation.ts + → UserPromptSubmit hook + → On first prompt of session: create AI/memory/work/{timestamp}_{slug}/ + with META.yaml (status, created_at, effort estimate) + → On subsequent prompts: add items to the work directory + → Classify prompt: work | question | conversational + → Track effort level: trivial | quick | standard | thorough + → Update AI/memory/work/state.json (current session pointer) +□ Adapt the SESSION_SUMMARY hook from Phase 1 to mark work COMPLETED on session end +□ Register in .claude/settings.json +□ Test: start session, ask something, verify work directory created +□ Git commit: "add: auto-work creation hook for automatic task tracking" +``` + +### 2D: Implicit Sentiment Capture Hook (UserPromptSubmit) + +**What it does**: Detects frustration, satisfaction, or confusion in your messages WITHOUT you giving an explicit rating. Uses a quick inference call to classify your emotional state. + +**Why you need it**: You won't rate every interaction 1-10. But if you say "no, that's wrong, do it again" the system should learn. This captures those implicit signals. 
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/UserPromptSubmit/ImplicitSentimentCapture.hook.ts +□ Create AI/hooks/implicit-sentiment.ts + → UserPromptSubmit hook + → Skips if explicit rating was already detected (by on-feedback hook) + → Uses Haiku inference call to classify sentiment (fast, cheap) + → Writes to AI/memory/signals/ratings.jsonl with type: "implicit" + → If inferred rating < 6: creates learning entry with context + → Fire-and-forget — never blocks +□ Register in .claude/settings.json +□ Test: express frustration, verify implicit signal captured +□ Git commit: "add: implicit sentiment capture for passive learning" +``` + +### 2E: Session Summary Hook (Stop) + +**What it does**: When a session ends, automatically summarizes what happened and updates the work state. Replaces the manual "update WIP at session end" instruction. + +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/Stop/SessionSummary.hook.ts +□ Enhance AI/hooks/on-session-end.ts (from Phase 1): + → Mark active work directory as COMPLETED + → Update META.yaml with completed_at timestamp + → Clear AI/memory/work/state.json + → Capture session summary in the work directory +□ Git commit: "enhance: session end hook with auto-summary and work completion" +``` + +**Phase 2 total: 5 new/enhanced hooks. After this, every interaction is automatically tracked, classified, secured, and quality-controlled.** + +--- + +## PHASE 3: Learning & Synthesis Engine + +**Why next**: Hooks capture raw signals. This phase turns raw signals into actionable improvements. The system starts actually getting smarter over time. + +**Depends on**: Phase 2 (hooks producing signals in AI/memory/) + +### 3A: Learning Pattern Synthesis + +**What it does**: Periodically analyzes all captured learnings, ratings, and mistakes. Produces synthesis reports: what's working, what's not, what patterns are emerging. 
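+The first step of the synthesis — turning raw `ratings.jsonl` lines into numbers worth reasoning about — can be sketched as follows. A minimal sketch: the record shape assumes the fields the Phase 2 feedback hooks write, and the function name is illustrative.

```typescript
// Sketch of the signal aggregation that precedes any Claude prompt in the
// weekly synthesis: parse ratings.jsonl and summarize the week's signals.
// Record shape assumes the fields written by the feedback hooks.
interface Rating {
  timestamp: string;
  type: "explicit" | "implicit";
  rating: number;
}

function summarize(jsonl: string): { count: number; average: number } {
  const ratings = jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0) // ignore trailing blank lines
    .map((line) => JSON.parse(line) as Rating);
  const total = ratings.reduce((sum, r) => sum + r.rating, 0);
  return { count: ratings.length, average: ratings.length > 0 ? total / ratings.length : 0 };
}
```

+The synthesis prompt then receives these aggregates (per skill, per type) rather than the raw JSONL, which keeps the weekly Claude call small.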
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-hook-system/src/hooks/lib/TrendingAnalysis.ts +□ Create AI/scripts/learning-synthesis.sh (launchd weekly, Sunday after weekly review) + → Reads AI/memory/signals/ratings.jsonl (last 7 days) + → Reads AI/memory/learnings/ (all categories) + → Claude prompt: "Analyze these signals. What patterns do you see? + What skills are performing well/poorly? What preferences are + emerging? What mistakes are repeating?" + → Writes synthesis to AI/memory/learnings/synthesis/YYYY-WW.md + → Proposes specific changes to skill files or policies +□ Create launchd plist for weekly schedule +□ Git commit: "add: weekly learning synthesis for self-improvement" +``` + +### 3B: Skill Auto-Improvement Proposals + +**What it does**: Based on synthesis data, the system proposes concrete edits to skill files. You approve or reject. Rejected proposals get logged so it doesn't suggest again. + +**Implementation**: +``` +□ Create AI/memory/proposals/ directory +□ Enhance learning-synthesis.sh: + → When a skill consistently rates poorly: generate a proposal + → Proposal format: {skill, current_behavior, proposed_change, evidence} + → Write to AI/memory/proposals/pending/YYYY-MM-DD_{skill}.md + → Notify via Discord webhook: "I have improvement proposals. Review?" +□ Add a review workflow: you say "show proposals" → Claude reads pending/, + presents them, you approve/reject, approved get applied, rejected get + moved to AI/memory/proposals/rejected/ +□ Git commit: "add: skill improvement proposal system" +``` + +### 3C: Preference Learning Aggregation + +**What it does**: Consolidates scattered preference observations into authoritative preference files that skills read. Instead of each skill guessing your style, they read your verified preferences. 
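The observation-count logic can be sketched as a small update function. The thresholds follow the confidence rules used in Phase 3 (1 observation low, 3-5 medium, 6+ high, with 2 treated like 1); the row shape is simplified, as the real table also tracks category and first/last seen dates:

```typescript
// Confidence grows with repeated observation: 1-2 → low, 3-5 → medium, 6+ → high.
type Confidence = "low" | "medium" | "high";

interface PreferenceRow {
  preference: string;
  observations: number;
  confidence: Confidence;
}

function confidenceFor(observations: number): Confidence {
  if (observations >= 6) return "high";
  if (observations >= 3) return "medium";
  return "low";
}

// Increment an existing preference, or add a new low-confidence row.
function recordPreference(rows: PreferenceRow[], preference: string): PreferenceRow[] {
  const existing = rows.find(r => r.preference === preference);
  if (existing) {
    existing.observations += 1;
    existing.confidence = confidenceFor(existing.observations);
    return rows;
  }
  return [...rows, { preference, observations: 1, confidence: "low" }];
}
```

Skills would then read only high-confidence rows, so a one-off remark never carries the same weight as a pattern observed ten times.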
+ +**Implementation**: +``` +□ Enhance AI/memory/learnings/preferences.md: + → Add confidence scores (observed 1x vs observed 10x) + → Add source tracking (which session, which interaction) + → Structure by domain (writing, communication, workflow, formatting) +□ Update skill files to reference preferences.md in their context_files +□ Weekly synthesis reviews preferences, promotes high-confidence ones + to the relevant policy files (e.g., new formatting preference → + update AI/policies/formatting-rules.md) +□ Git commit: "enhance: structured preference learning with confidence tracking" +``` + +**Phase 3 total: The system now analyzes its own performance weekly, proposes improvements, and learns your preferences with increasing confidence.** + +--- + +## PHASE 4: Advanced Skills from PAI + +**Why now**: The infrastructure is solid (hooks, memory, learning). Now port the skills that give you actual superpowers for specific tasks. + +**Depends on**: Phase 1 (skill structure), Phase 2 (hooks), Phase 3 (learning) + +### 4A: Research Skill + +**What it does**: Multi-source research with 3 depth levels (quick/standard/extensive). Uses your existing MCP servers (Brave Search, Firecrawl, Gemini, Perplexity) as research tools. Produces structured research reports. + +**Why valuable**: You already have 6 MCP servers. This skill orchestrates them into a coherent research workflow instead of you manually invoking each one. 
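The depth-to-source routing can be as simple as the sketch below. The server names are the four MCP servers named in this plan; which server ranks first is an assumption for illustration:

```typescript
// Quick uses one source, Standard three, Extensive all four.
type Depth = "quick" | "standard" | "extensive";

// The research-capable MCP servers from this plan; ordering is an assumption.
const RESEARCH_SOURCES = ["brave-search", "firecrawl", "perplexity", "gemini"];

const countFor: Record<Depth, number> = { quick: 1, standard: 3, extensive: 4 };

function sourcesFor(depth: Depth): string[] {
  return RESEARCH_SOURCES.slice(0, countFor[depth]);
}
```

The skill would then fan queries out to `sourcesFor(depth)` and synthesize the results into one structured report rather than returning raw search output.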
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-research-skill/ +□ Create AI/skills/research.md + → 3 depth modes: Quick (1 source), Standard (3 sources), Extensive (all sources) + → Integrates with existing MCP servers: Brave Search, Firecrawl, Perplexity, Gemini + → Output: structured synthesis, not just raw search results + → Context map: reads relevant domain context to focus research + → Stores results in AI/memory/research/ for future reference +□ Create AI/context/research.md (context map — where to store, what to reference) +□ Git commit: "add: research skill with multi-source orchestration" +``` + +### 4B: Council Skill (Multi-Agent Debate) + +**What it does**: For important decisions, spawns 3-5 specialized agents who debate the topic for 3 rounds. Each challenges the others' positions. You get a structured transcript with points of agreement and disagreement. + +**Why valuable**: You're strong at synthesis — this gives you pre-digested intellectual friction to synthesize from, instead of a single AI opinion. + +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-council-skill/ +□ Create AI/skills/council.md + → Trigger: "council:", "debate this", "get multiple perspectives" + → Round 1: Each agent states position (parallel) + → Round 2: Agents respond to each other (parallel) + → Round 3: Convergence — what do they agree on, where do they differ? + → Output: structured transcript + synthesis + recommendation + → Agents: Strategist, Devil's Advocate, Pragmatist, Domain Expert (varies) +□ Git commit: "add: council skill for multi-agent debate on decisions" +``` + +### 4C: CreateSkill Skill (Self-Extending) + +**What it does**: The system can propose and create new skills when it detects a pattern of requests that don't map to any existing skill. You approve, it scaffolds the skill file. + +**Why valuable**: The system grows organically. 
If you start asking about a new topic regularly, it creates a skill for it — complete with context maps, policies, and learned preferences. + +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-createskill-skill/ +□ Create AI/skills/create-skill.md + → Trigger: "create a new skill for...", or auto-detected by weekly synthesis + → Validates: proper frontmatter, trigger keywords, context map reference + → Scaffolds: skill file + context map + optional policy file + → Adds to routing table in CLAUDE.md (or skill registry file) +□ Git commit: "add: create-skill for self-extending capability" +``` + +### 4D: Telos Skill (Life OS / Goals) + +**What it does**: Manages your goals, beliefs, lessons learned, and life areas as structured data. Generates progress reports. Tracks what you believe, what you've learned, what you're working toward. + +**Why valuable**: You already have Tracking/Objectives/ and Tracking/Key Results/. This skill connects them into a coherent life operating system with periodic reviews and McKinsey-style progress reports. 
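The monthly roll-up could compute objective progress from key results along these lines. The `KeyResult` shape is an assumption; the real numbers would come from the frontmatter in Tracking/Key Results/:

```typescript
// Sketch of the progress computation for the monthly report. An objective's
// progress is the mean completion of its key results, each capped at 100%.
interface KeyResult {
  objective: string;
  target: number;
  current: number;
}

function objectiveProgress(krs: KeyResult[]): Map<string, number> {
  const grouped = new Map<string, number[]>();
  for (const kr of krs) {
    const pct = Math.min(1, kr.current / kr.target);
    grouped.set(kr.objective, [...(grouped.get(kr.objective) ?? []), pct]);
  }
  const result = new Map<string, number>();
  for (const [obj, pcts] of grouped) {
    result.set(obj, pcts.reduce((a, b) => a + b, 0) / pcts.length);
  }
  return result;
}
```

The quarterly review would compare these numbers across months to decide which goals to revise.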
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-telos-skill/ +□ Create AI/skills/telos.md + → Integrates with existing: Tracking/Objectives/, Tracking/Key Results/, + Areas/Life Areas/, Projects/GPR/ + → Workflows: Update goals, WriteReport, Review progress + → Stores personal operating data in AI/My AgentOS/telos/ + (BELIEFS.md, GOALS.md, LESSONS.md, WISDOM.md) + → Monthly: generates goal progress report + → Quarterly: proposes goal revisions based on actual progress +□ Create AI/context/goals.md (context map) +□ Git commit: "add: telos skill for life OS and goal management" +``` + +**Phase 4 total: Four high-value skills that leverage existing infrastructure (MCP servers, vault data, memory system) to give you research, debate, self-extension, and goal management capabilities.** + +--- + +## PHASE 5: The Algorithm (Execution Engine) + +**Why later**: This is PAI's most sophisticated feature — a universal problem-solving loop. It's powerful but complex. You need the skill system, hooks, and memory working well before adding this layer. + +**Depends on**: All previous phases + +### 5A: Ideal State Criteria (ISC) Framework + +**What it does**: For every non-trivial task, the system defines what "ideal" looks like as a table of criteria. Then it works through each criterion systematically, tracking progress. 
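The ISC table described above maps naturally onto a small data structure (the status values are an assumption; the column names follow the table in this plan):

```typescript
// Each ISC row: what ideal looks like, where the criterion came from,
// which capability addresses it, and its current status.
type IscStatus = "pending" | "in-progress" | "done";

interface IscCriterion {
  ideal: string;       // "What Ideal Looks Like"
  source: string;      // e.g. "user request", "inferred"
  capability: string;  // e.g. "research", "council"
  status: IscStatus;
}

// The VERIFY phase checks whether every criterion has been satisfied.
function iscProgress(table: IscCriterion[]) {
  const done = table.filter(c => c.status === "done").length;
  return { done, total: table.length, complete: table.length > 0 && done === table.length };
}
```

Working through the table criterion by criterion is what keeps a thorough task from silently dropping requirements.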
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-algorithm-skill/ +□ Create AI/skills/algorithm.md + → 7-phase execution: OBSERVE → THINK → PLAN → BUILD → EXECUTE → VERIFY → LEARN + → ISC table: What Ideal Looks Like | Source | Capability | Status + → Effort classification: TRIVIAL → QUICK → STANDARD → THOROUGH → DETERMINED + → Higher effort = more capabilities unlocked (parallel agents, research, debate) +□ Create AI/policies/algorithm-protocol.md + → When to invoke: non-trivial tasks (user can say "use the algorithm") + → Effort classification rules + → Which capabilities unlock at which effort level +□ Git commit: "add: the algorithm execution engine with ISC tracking" +``` + +### 5B: Effort-Based Capability Routing + +**What it does**: Classifies task complexity, then unlocks appropriate tools. A trivial task gets a direct answer. A thorough task gets research, council debate, and parallel execution. + +**Implementation**: +``` +□ Create AI/policies/effort-classification.yaml: + trivial: → direct answer, no tools + quick: → single skill, haiku model + standard: → skill + research, sonnet model, 1-3 parallel agents + thorough: → skill + research + council, opus model, 3-5 parallel agents + determined: → everything, extended thinking, red team +□ Integrate with algorithm skill: auto-classify, auto-route +□ Git commit: "add: effort-based capability routing" +``` + +**Phase 5 total: The system now has a structured problem-solving methodology that scales its effort to match task complexity.** + +--- + +## PHASE 6: Observability & Monitoring + +**Why last**: Not a user-facing feature — it's a debugging and insight tool. Valuable once the system is complex enough to need monitoring. + +**Depends on**: Phase 2 (hooks generating events to observe) + +### 6A: Event Logging + +**What it does**: All hook events, tool calls, and agent actions are logged to JSONL files. Structured, searchable, append-only. 
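A minimal sketch of the append-only log, assuming illustrative field names (the real event shape would come from the PostToolUse hook payload):

```typescript
// One JSON object per line, one file per day — append-only, so writes are
// cheap and the history is easy to grep or replay.
interface ToolEvent {
  timestamp: string;
  tool: string;       // e.g. "Bash", "Edit"
  ok: boolean;
  durationMs: number;
}

// Daily file: AI/memory/history/YYYY-MM-DD.jsonl
function logFileFor(date: Date): string {
  return `AI/memory/history/${date.toISOString().slice(0, 10)}.jsonl`;
}

function toLogLine(e: ToolEvent): string {
  return JSON.stringify(e) + "\n"; // always append, never rewrite
}
```

The weekly synthesis can then stream these files line by line instead of loading a growing monolithic log.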
+ +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-observability-server/src/ +□ Create AI/hooks/agent-output-capture.ts (PostToolUse hook) + → Logs every tool call and result to AI/memory/history/YYYY-MM-DD.jsonl + → Fire-and-forget, never blocks +□ This feeds into weekly learning synthesis (Phase 3) +□ Git commit: "add: event logging hook for observability" +``` + +### 6B: Dashboard (Optional) + +**What it does**: Real-time web dashboard showing what the system is doing. Swim lanes for parallel agents, event timeline, performance metrics. + +**Implementation**: +``` +□ Read PAI reference: ~/PAI-reference/Packs/pai-observability-server/ +□ Adapt: Bun HTTP + WebSocket server on Mac Mini + → Reads JSONL event files + → Streams to Vue dashboard + → Accessible via Tailscale from any device +□ This is optional — the event logs alone are useful without a dashboard +□ Git commit: "add: observability dashboard" +``` + +**Phase 6 total: Full visibility into what the system is doing, when, and how well.** + +--- + +## PHASE SUMMARY + +| Phase | What | Key Outcome | Depends On | +|---|---|---|---| +| **1** (current) | Modular structure, basic hooks, memory | System works modularly | Nothing | +| **2** | Full hook system (security, format, work tracking, sentiment) | Every interaction auto-tracked and protected | Phase 1 | +| **3** | Learning synthesis, improvement proposals, preference aggregation | System gets measurably smarter over time | Phase 2 | +| **4** | Research, Council, CreateSkill, Telos | Superpowers for specific tasks | Phases 1-3 | +| **5** | The Algorithm, ISC, effort routing | Structured problem-solving at any scale | Phases 1-4 | +| **6** | Event logging, dashboard | Full observability | Phase 2 | + +### Execution Notes + +- **Each phase is independently valuable**. You don't need Phase 6 for Phase 2 to work. Stop at any phase and have a better system than before. +- **Phases 2 and 3 are the highest ROI**. 
Hooks + learning give you an auto-improving system with minimal effort from you. +- **Phase 4 skills can be done in any order** or partially. Research and Council are the most valuable. CreateSkill and Telos are nice-to-have. +- **Phase 5 (Algorithm) is optional power**. It's PAI's most complex feature. Skip it unless you find yourself regularly working on thorough/determined-level tasks. +- **Phase 6 is for debugging**. Add it when the system is complex enough that you can't tell what's happening by reading log files. +- **For each phase**: create the briefing by reading the transformation plan + the relevant PAI reference files, then hand it to a Claude Code instance in your vault. Same pattern as Phase 1. diff --git a/Plans/phase-3-briefing.md b/Plans/phase-3-briefing.md new file mode 100644 index 000000000..78a0b3c6f --- /dev/null +++ b/Plans/phase-3-briefing.md @@ -0,0 +1,641 @@ +# Phase 3: Learning & Synthesis Engine — Briefing for Claude Code + +## How to Use This File + +You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for **Phase 3** of the AI Genius transformation. Phases 1 and 2 are complete and merged to main. + +**Reference repo**: `~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ` — clone of Daniel Messler's PAI. Read files there for implementation patterns. Adapt, don't copy. + +**Safety rule**: After EVERY sub-step (3A through 3E), commit to git with a descriptive message. If anything breaks, `git revert` that commit. + +--- + +## PART 1: CURRENT STATE AFTER PHASE 2 + +### What Exists + +Phase 1 created the modular architecture. Phase 2 added the full hook system. 
Here's the current state: + +``` +AI/ +├── skills/ # 23 skill files with YAML frontmatter +├── context/ # 7 domain context maps +├── policies/ # 8 policy files + security-patterns.yaml +├── memory/ +│ ├── work/ +│ │ ├── current.md # WIP tracking (updated by hooks) +│ │ ├── state.json # Active session pointer +│ │ └── YYYY-MM-DD_slug/ # Per-session work directories +│ │ └── META.yaml # Status, timestamps, effort, prompt count +│ ├── learnings/ +│ │ ├── preferences.md # User style preferences +│ │ ├── mistakes.md # Errors to avoid +│ │ └── execution.md # Task approach learnings +│ ├── signals/ +│ │ └── ratings.jsonl # Explicit + implicit quality ratings +│ ├── security/ # Security validator audit trail +│ │ └── YYYY-MM-DD.jsonl +│ └── context-log.md # Current situation, priorities, pipeline +├── hooks/ +│ ├── on-session-start.ts # Context injection (first-prompt-only guard) +│ ├── on-session-end.ts # Session end + work completion tracking +│ ├── on-feedback.ts # Explicit rating capture (1-10) +│ ├── security-validator.ts # PreToolUse security checks +│ ├── format-enforcer.ts # Format reminder injection +│ ├── auto-work-creation.ts # Automatic work directory + META.yaml +│ ├── implicit-sentiment.ts # Passive frustration/satisfaction detection +│ └── lib/ +│ └── paths.ts # Shared path utilities +└── scripts/ # 5 launchd scripts (memory-aware) +``` + +### What Phase 2 Produces (Raw Signals) + +Phase 2 hooks generate these data streams. 
Phase 3 turns them into actionable improvements:

| Signal Source | Location | Format | Contents |
|---|---|---|---|
| Explicit ratings | `signals/ratings.jsonl` | JSONL | `{timestamp, type:"explicit", rating:1-10, comment}` |
| Implicit sentiment | `signals/ratings.jsonl` | JSONL | `{timestamp, type:"implicit", rating:1-10, sentiment, confidence, trigger}` |
| Work sessions | `work/YYYY-MM-DD_slug/META.yaml` | YAML | Status, effort, prompt count, duration |
| Mistakes | `learnings/mistakes.md` | Markdown | Errors captured from low ratings |
| Execution patterns | `learnings/execution.md` | Markdown | Task approach observations |
| Preferences | `learnings/preferences.md` | Markdown | Style and behavior preferences |
| Security events | `security/YYYY-MM-DD.jsonl` | JSONL | Tool call decisions + audit trail |

### Lessons Learned from Phase 1 & 2

- **Hook format**: `"matcher": ""` (empty string, not object) for non-tool-specific hooks
- **Content injection**: Only UserPromptSubmit hooks inject into Claude's context (their stdout is added to the prompt as additional context)
- **SessionStart/Stop**: Fire-and-forget only — stdout not passed to conversation
- **Bun runtime**: All hooks use `bun run` — fast TypeScript execution
- **Exit 0 always**: Hooks must never crash or block Claude Code
- **`.claude/` is gitignored**: settings.json lives only locally, document changes in `AI/hooks/README.md`

---

## PART 2: PAI REFERENCE FILES

Read these before implementing each step:

### Learning Utilities
```
~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/learning-utils.ts
```
- Categorization logic for learnings (SYSTEM vs ALGORITHM)
- Determines if a response represents a "learning moment" via indicator patterns

### Work Completion Learning
```
~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/WorkCompletionLearning.hook.ts
```
- Bridges WORK/ to
LEARNING/ at session end +- Captures files changed, tools used, agents spawned +- Creates structured learning files with ideal state criteria +- Categorizes as SYSTEM (tooling) or ALGORITHM (execution) + +### Implicit Sentiment (for coordination patterns) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/ImplicitSentimentCapture.hook.ts +``` +- Shows how PAI uses Sonnet inference for sentiment analysis +- Useful for understanding the analysis depth possible with inference calls + +### Explicit Rating Capture (for TrendingAnalysis integration) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/ExplicitRatingCapture.hook.ts +``` +- Triggers TrendingAnalysis update on rating capture +- Shows the pattern for reactive analysis + +### Ideal State Management +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/IdealState.ts +``` +- Tracks dimensions of success (Functional, Quality, Scope, Implicit, Verification) +- Supports fidelity scoring +- Achievement tracking with evidence — useful patterns for proposal tracking + +### Algorithm Phases — Learn +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Learn.md +``` +- LEARN phase: documents capability usage and performance +- Tracks iteration decisions — useful for the synthesis analysis pattern + +### Telos (for goals integration) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Tools/UpdateTelos.ts +``` +- Integrates learning data into visual dashboards +- Pattern for synthesizing learnings into actionable views + +--- + +## PART 3: IMPLEMENTATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Vault folder structure (Areas/, Projects/, Notes/, Content/, etc.) 
+- All 24 .base database files +- eventkit-cli, mail-cli, afk-code +- Obsidian plugins +- Existing working hooks from Phase 1 & 2 (unless explicitly enhancing) +- CLAUDE.md (unless adding a reference to new capabilities) + +**ALWAYS**: +- Git commit after every sub-step (3A through 3E) +- Test each script/feature individually before moving on +- Use `bun run` for TypeScript, `bash` for shell scripts +- Write results as Markdown files (readable in Obsidian) + +--- + +### Step 3A: Learning Pattern Synthesis Script + +**What it does**: A weekly script that analyzes all captured signals (ratings, sentiment, work sessions, mistakes, execution patterns) and produces a synthesis report. Identifies what's working, what's not, and what patterns are emerging. + +**Why critical**: Without synthesis, raw signals just accumulate forever. This is the step where the system actually "thinks about its own performance." + +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/learning-utils.ts + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/WorkCompletionLearning.hook.ts + +□ Create directory: AI/memory/learnings/synthesis/ + +□ Create AI/scripts/learning-synthesis.sh: + → Scheduled: Sunday after weekly-review.sh (e.g., 11:00 AM Sunday) + → Uses: claude -p --permission-mode bypassPermissions --model sonnet + → Claude prompt includes: + + "You are the LifeOS learning synthesis engine. Analyze the following data + from the past 7 days and produce a structured synthesis report. + + DATA SOURCES TO READ: + 1. AI/memory/signals/ratings.jsonl — filter to last 7 days + Count: total ratings, explicit vs implicit, average score, trend + 2. AI/memory/learnings/mistakes.md — any new entries this week + 3. AI/memory/learnings/execution.md — any new entries this week + 4. 
AI/memory/learnings/preferences.md — any new entries this week + 5. AI/memory/work/ — scan work directories from last 7 days + Analyze: effort distribution, completion rates, session durations + 6. AI/memory/security/ — scan security logs from last 7 days + Analyze: any blocked/confirmed events, patterns + + PRODUCE: + A synthesis report saved to AI/memory/learnings/synthesis/YYYY-WW.md + with these sections: + + ## Week Summary + - Total sessions: N + - Average rating: X.X (explicit) / X.X (implicit) + - Rating trend: improving / stable / declining + - Effort distribution: N trivial, N quick, N standard, N thorough + + ## What's Working + - Skills or patterns that consistently rate well + - Preferences that seem stable (mentioned 3+ times) + + ## What's Not Working + - Skills or patterns that rate poorly + - Recurring mistakes from mistakes.md + - Any frustration patterns from implicit sentiment + + ## Emerging Patterns + - New preferences not yet in preferences.md + - Cross-domain connections observed + - Usage patterns (which skills used most/least) + + ## Recommendations + - Specific skill files that should be improved (with reasoning) + - Preferences ready to be promoted to policy files + - Mistakes that should become explicit rules in policies + + ## Proposals (for Step 3B) + - List of concrete improvement proposals, each as: + Skill: [name] + Current behavior: [what it does now] + Proposed change: [what should change] + Evidence: [which signals support this] + Confidence: low / medium / high" + + → Sends synthesis notification via Discord webhook (notify-utils.sh) + → IMPORTANT: The prompt should instruct Claude to read the actual files, + not receive them as inline content (files may be large) + +□ Create launchd plist: AI/scripts/plists/com.lifeos.learning-synthesis.plist + → Schedule: Sunday 11:00 AM (after weekly-review.sh at 10:00 AM) + → StandardOutPath: AI/scripts/logs/learning-synthesis.log + → StandardErrorPath: 
AI/scripts/logs/learning-synthesis-error.log + → WorkingDirectory: /Users/krzysztofgoworek/LifeOS + +□ Test manually: + → Run: bash AI/scripts/learning-synthesis.sh + → Verify: AI/memory/learnings/synthesis/YYYY-WW.md created with all sections + → Verify: Discord notification sent + +□ Git commit: "add: weekly learning synthesis script for pattern analysis" +``` + +--- + +### Step 3B: Skill Improvement Proposal System + +**What it does**: The synthesis report (3A) generates proposals. This step creates the infrastructure to store, review, and apply those proposals. You say "show proposals" and Claude presents pending improvements. You approve or reject. Approved changes get applied to skill/policy files. Rejected ones are logged so the system stops suggesting them. + +**Why needed**: Without a proposal workflow, synthesis reports are just documents. This closes the loop — the system proposes, you decide, and the decision feeds back into learning. + +**Implementation**: + +``` +□ Create directories: + AI/memory/proposals/ + AI/memory/proposals/pending/ + AI/memory/proposals/approved/ + AI/memory/proposals/rejected/ + +□ Enhance AI/scripts/learning-synthesis.sh (from 3A): + → After generating the synthesis report, for each recommendation with + medium or high confidence: + - Create a proposal file: AI/memory/proposals/pending/YYYY-MM-DD_{skill-or-policy}.md + - Format: + --- + skill: [skill filename] + created: [ISO timestamp] + confidence: [low/medium/high] + source_synthesis: [path to synthesis report] + --- + # Proposal: [Short title] + + ## Current Behavior + [What the skill/policy currently does] + + ## Proposed Change + [Specific edit to make — quote the exact text to change and the replacement] + + ## Evidence + [Which ratings, mistakes, or patterns support this change] + + ## Risk + [What could go wrong if this change is applied] + +□ Create AI/skills/review-proposals.md: + --- + name: Review Proposals + triggers: ["show proposals", "review proposals", 
"improvement proposals", "what should we improve"] + context_files: + - AI/memory/proposals/pending/ + policies: + - challenger-protocol + voice: "Analytical, presenting options clearly. Show evidence for each proposal." + --- + + # Review Proposals + + ## When to Activate + When user asks to see improvement proposals or says "show proposals". + + ## Instructions + 1. Read all files in AI/memory/proposals/pending/ + 2. If none exist: "No pending proposals. The weekly synthesis runs on Sunday." + 3. Present each proposal with: + - The skill/policy affected + - What would change + - Evidence supporting the change + - Confidence level + 4. For each, ask: "Approve, reject, or skip?" + 5. On APPROVE: + - Apply the proposed edit to the target skill/policy file + - Move proposal to AI/memory/proposals/approved/ + - Append to learnings/execution.md: "Applied proposal: [title] on [date]" + - Git commit: "improve: apply proposal — [short description]" + 6. On REJECT: + - Move proposal to AI/memory/proposals/rejected/ + - Add rejection reason to the file (ask user why) + - Append to learnings/execution.md: "Rejected proposal: [title] — reason: [reason]" + 7. 
On SKIP: leave in pending/ for next review + + ## Guardrails + - Never apply proposals automatically — always require user approval + - Never modify CLAUDE.md, hooks, or settings.json via proposals + - Proposals only target skill files (AI/skills/) and policy files (AI/policies/) + - If a proposal has been rejected 2+ times (similar topic), stop proposing it + and log in AI/memory/proposals/rejected/blocklist.md + +□ Test: + → Manually create a test proposal in AI/memory/proposals/pending/ + → Say "show proposals" → verify skill routes to review-proposals + → Approve one → verify skill file updated, proposal moved to approved/ + → Reject one → verify moved to rejected/ with reason + +□ Git commit: "add: skill improvement proposal system with review workflow" +``` + +--- + +### Step 3C: Preference Learning with Confidence Tracking + +**What it does**: Restructures `preferences.md` from a flat list to a structured document with confidence scores and source tracking. Preferences observed multiple times get promoted to policy files automatically (via proposals). + +**Why needed**: Currently preferences are just notes. A preference seen once ("user prefers bullet points") is treated the same as one seen 20 times. Confidence tracking separates signal from noise. + +**Implementation**: + +``` +□ Read PAI reference: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/IdealState.ts + (for the pattern of tracking confidence and evidence) + +□ Restructure AI/memory/learnings/preferences.md: + → New format: + + # Learned Preferences + + _Updated by hooks and synthesis. 
Confidence increases with repeated observation._ + + ## Writing & Communication + | Preference | Confidence | Observations | First Seen | Last Seen | + |---|---|---|---|---| + | British English spelling | high | 15 | 2025-01-01 | 2025-01-15 | + | No emojis unless asked | high | 12 | 2025-01-01 | 2025-01-14 | + | Bullet points over paragraphs | medium | 5 | 2025-01-05 | 2025-01-13 | + + ## Workflow + | Preference | Confidence | Observations | First Seen | Last Seen | + |---|---|---|---|---| + + ## Tools & Technical + | Preference | Confidence | Observations | First Seen | Last Seen | + |---|---|---|---|---| + | eventkit-cli over apple-mcp | high | seeded | 2025-01-01 | 2025-01-01 | + + ## Formatting + | Preference | Confidence | Observations | First Seen | Last Seen | + |---|---|---|---|---| + + → Migrate existing preferences from current flat format into the table + → Seed known preferences from Phase 1 (already in the file) with confidence: high + +□ Create AI/hooks/lib/preference-tracker.ts: + → Shared utility used by multiple hooks + → Function: recordPreference(category, preference, source) + - Read preferences.md + - If preference already exists: increment observations, update last_seen + - If new: add row with confidence: low, observations: 1 + - Confidence rules: + 1 observation → low + 3-5 observations → medium + 6+ observations → high + - Write updated preferences.md + → Function: getHighConfidencePreferences(category?) 
+ - Returns preferences with confidence: high + - Used by format-enforcer hook to inject verified preferences + +□ Enhance AI/hooks/implicit-sentiment.ts (from Phase 2): + → When a preference-related signal is detected (e.g., user corrects formatting, + explicitly states a preference), call recordPreference() + → Example signals: + - "use British English" → recordPreference("writing", "British English spelling", session_id) + - "no emojis" → recordPreference("formatting", "No emojis unless asked", session_id) + - User corrects a format → infer preference from the correction + +□ Enhance AI/scripts/learning-synthesis.sh (from 3A): + → Add to synthesis: scan preferences.md + → For preferences with confidence: high AND observations > 10: + - Check if already in a policy file + - If not: generate a proposal to add to the relevant policy file + → For preferences with confidence: low AND last_seen > 30 days ago: + - Mark as stale, consider removing + +□ Test: + → Manually call recordPreference() a few times + → Verify: preferences.md table updates correctly + → Run synthesis → verify high-confidence preferences generate promotion proposals + +□ Git commit: "enhance: structured preference learning with confidence tracking" +``` + +--- + +### Step 3D: Mistake Pattern Detection + +**What it does**: Scans `mistakes.md` for repeating patterns. If the same type of mistake occurs 3+ times, generates a proposal to add an explicit rule to the relevant skill or policy file. + +**Why needed**: One-off mistakes are noise. Repeated mistakes are systemic problems that need explicit rules to prevent. + +**Implementation**: + +``` +□ Restructure AI/memory/learnings/mistakes.md: + → New format: + + # Mistake Log + + _Errors to avoid. 
Repeating patterns trigger policy proposals._ + + ## Entries + | Date | Category | Description | Skill | Occurrences | + |---|---|---|---|---| + | 2025-01-10 | formatting | Used American spelling | translator | 1 | + | 2025-01-12 | tool-use | Called apple-mcp instead of eventkit-cli | general-advisor | 2 | + + ## Known Patterns (3+ occurrences) + | Pattern | Occurrences | Policy Created | Status | + |---|---|---|---| + + → Migrate existing mistakes from current format into the table + +□ Create AI/hooks/lib/mistake-tracker.ts: + → Function: recordMistake(category, description, skill) + - Read mistakes.md + - Fuzzy match: is this similar to an existing entry? (simple keyword overlap) + - If similar: increment occurrences + - If new: add row with occurrences: 1 + - If occurrences reaches 3: move to "Known Patterns" section + and generate a proposal in AI/memory/proposals/pending/ + - Write updated mistakes.md + → Used by: implicit-sentiment.ts (on negative signals with context), + on-feedback.ts (on explicit low ratings) + +□ Enhance AI/hooks/on-feedback.ts (from Phase 1): + → When rating < 6 with a comment, try to extract the mistake category + → Call recordMistake() with extracted info + +□ Enhance AI/scripts/learning-synthesis.sh: + → Add to synthesis: scan mistakes.md "Known Patterns" section + → For each pattern without a policy rule: + - Generate proposal to add rule to the relevant skill/policy file + - Example: pattern "used American spelling" 5 times → + proposal: add "ALWAYS use British English" to formatting-rules.md + +□ Test: + → Manually record the same mistake 3 times + → Verify: moves to "Known Patterns" section + → Verify: proposal generated in proposals/pending/ + +□ Git commit: "add: mistake pattern detection with automatic rule proposals" +``` + +--- + +### Step 3E: Integrate Synthesis with Weekly Review + +**What it does**: Connects the learning synthesis to the existing `weekly-review.sh` launchd script so the Sunday review includes AI 
self-improvement insights alongside the GTD review. + +**Why needed**: The weekly review already runs every Sunday at 10:00 AM. The learning synthesis (3A) runs at 11:00 AM. This step makes the weekly review reference the previous week's synthesis and ensures the two scripts work together, not in isolation. + +**Implementation**: + +``` +□ Enhance AI/scripts/weekly-review.sh: + → Add to the Claude prompt: + + "LEARNING SYNTHESIS REVIEW: + After completing the GTD review phases, also: + + 1. Read the most recent synthesis report from AI/memory/learnings/synthesis/ + If this week's doesn't exist yet (synthesis runs after this script), + read last week's. + + 2. Report on: + - Rating trends (are we getting better or worse?) + - Any pending proposals in AI/memory/proposals/pending/ + - Any known mistake patterns in AI/memory/learnings/mistakes.md + - High-confidence preferences that haven't been promoted yet + + 3. Include in the weekly review output: + ## AI System Health + - Avg rating this week: X.X + - Pending improvement proposals: N + - Known mistake patterns: N + - Preference confidence: N high, N medium, N low + + 4. If there are pending proposals, remind the user: + 'You have N improvement proposals to review. Say show proposals to review them.'" + +□ Enhance AI/scripts/daily-brief.sh: + → Add a lightweight check: + "If AI/memory/proposals/pending/ has files, add to morning brief: + 'You have N pending AI improvement proposals to review.'" + +□ Verify the schedule makes sense: + → Sunday 10:00 AM: weekly-review.sh (GTD + learning review) + → Sunday 11:00 AM: learning-synthesis.sh (new synthesis + proposals) + → The weekly review reads LAST week's synthesis. The new synthesis runs after. + → This means proposals from the new synthesis are reviewed NEXT Sunday. + → Alternatively: swap the order (synthesis at 9:00 AM, review at 10:00 AM) + so the weekly review always has fresh data. Recommend this approach. 
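  The reordered schedule is a one-key change in the synthesis plist. A sketch
  (assumes the plists from 3A use the standard launchd StartCalendarInterval
  keys, where Weekday 0 = Sunday; everything else in the file stays as-is):

  ```xml
  <!-- com.lifeos.learning-synthesis.plist: move synthesis to Sunday 9:00 AM -->
  <key>StartCalendarInterval</key>
  <dict>
      <key>Weekday</key><integer>0</integer>  <!-- Sunday -->
      <key>Hour</key><integer>9</integer>     <!-- was 11 -->
      <key>Minute</key><integer>0</integer>
  </dict>
  ```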
+ +□ If reordering: + → Update learning-synthesis plist: 9:00 AM Sunday + → Weekly review stays at 10:00 AM + → Now Sunday flow: synthesis runs → review reads fresh synthesis → user sees latest + +□ Test: + → Run learning-synthesis.sh manually → verify synthesis report created + → Run weekly-review.sh manually → verify it reads the synthesis report + → Verify daily-brief.sh mentions pending proposals (if any exist) + +□ Git commit: "integrate: learning synthesis with weekly review and daily brief" +``` + +--- + +## PART 4: MANUAL TESTING CHECKLIST + +After all steps are implemented: + +``` +□ Test 1: Synthesis Report Generation + → Ensure there are some ratings in ratings.jsonl (give a few explicit ratings first) + → Run: bash AI/scripts/learning-synthesis.sh + → Verify: AI/memory/learnings/synthesis/YYYY-WW.md exists + → Verify: report has all sections (Summary, Working, Not Working, Patterns, Recommendations, Proposals) + → Verify: Discord notification sent + +□ Test 2: Proposal Creation + → Verify: synthesis created proposals in AI/memory/proposals/pending/ + → If no proposals generated (not enough data), create a test proposal manually + → Say "show proposals" + → Verify: review-proposals skill activates + → Approve one → verify skill file changed, proposal moved to approved/ + → Reject one → verify moved to rejected/ with reason + +□ Test 3: Preference Tracking + → In a session, correct Claude's formatting twice + → Verify: preferences.md updated with new/incremented entry + → Repeat in another session → verify observation count increases + → After 6+ observations → verify confidence changes to high + +□ Test 4: Mistake Pattern Detection + → Record similar mistake 3 times (via low ratings with comments) + → Verify: mistake moves to "Known Patterns" in mistakes.md + → Verify: proposal generated in proposals/pending/ + +□ Test 5: Weekly Review Integration + → Run learning-synthesis.sh → produces report + → Run weekly-review.sh → verify it includes AI System 
Health section + → Verify it mentions pending proposals count + +□ Test 6: Daily Brief Reminder + → Create a test proposal in proposals/pending/ + → Run daily-brief.sh + → Verify: morning brief mentions pending proposals +``` + +--- + +## PART 5: DIRECTORY STRUCTURE AFTER PHASE 3 + +``` +AI/memory/ +├── work/ +│ ├── current.md +│ ├── state.json +│ └── YYYY-MM-DD_slug/ +│ └── META.yaml +├── learnings/ +│ ├── preferences.md # NOW: structured table with confidence scores +│ ├── mistakes.md # NOW: structured table with occurrence counts +│ ├── execution.md +│ └── synthesis/ +│ └── YYYY-WW.md # Weekly synthesis reports +├── proposals/ +│ ├── pending/ # Awaiting user review +│ │ └── YYYY-MM-DD_skill.md +│ ├── approved/ # Applied proposals +│ │ └── YYYY-MM-DD_skill.md +│ └── rejected/ # Rejected proposals +│ ├── YYYY-MM-DD_skill.md +│ └── blocklist.md # Topics to stop proposing +├── signals/ +│ └── ratings.jsonl +├── security/ +│ └── YYYY-MM-DD.jsonl +└── context-log.md +``` + +New/modified files: +- `AI/hooks/lib/preference-tracker.ts` — shared preference recording utility +- `AI/hooks/lib/mistake-tracker.ts` — shared mistake recording utility +- `AI/scripts/learning-synthesis.sh` — weekly synthesis script +- `AI/scripts/plists/com.lifeos.learning-synthesis.plist` — launchd schedule +- `AI/skills/review-proposals.md` — new skill for reviewing proposals +- Enhanced: `implicit-sentiment.ts`, `on-feedback.ts`, `weekly-review.sh`, `daily-brief.sh` + +--- + +## PART 6: WHAT COMES NEXT (Phase 4 Preview) + +Phase 3 closes the feedback loop — the system captures signals, synthesizes patterns, proposes improvements, and you approve. 
Phase 4 adds high-value skills that leverage this infrastructure:
+
+- **Research**: Multi-source research using existing MCP servers (Brave, Firecrawl, Perplexity, Gemini)
+- **Council**: Multi-agent debate for important decisions (3-5 agents, 3 rounds)
+- **CreateSkill**: Self-extending — system proposes and scaffolds new skills
+- **Telos**: Life OS / goals management integrated with Tracking/Objectives/
+
+Phase 3 makes these skills smarter from day one because they inherit the learning infrastructure.
diff --git a/Plans/phase-4-briefing.md b/Plans/phase-4-briefing.md
new file mode 100644
index 000000000..9bcaea8d6
--- /dev/null
+++ b/Plans/phase-4-briefing.md
@@ -0,0 +1,824 @@
+# Phase 4: Advanced Skills — Briefing for Claude Code
+
+## How to Use This File
+
+You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for **Phase 4** of the AI Genius transformation. Phases 1–3 are complete and merged to main.
+
+**Reference repo**: `~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ` — clone of Daniel Miessler's PAI. Read files there for implementation patterns. Adapt, don't copy.
+
+**Safety rule**: After EVERY sub-step (4A through 4D), commit to git with a descriptive message. If anything breaks, `git revert` that commit.
+
+---
+
+## PART 1: CURRENT STATE AFTER PHASE 3
+
+### What Exists
+
+Phases 1–3 built the full foundation: modular architecture, hook system, and learning engine. 
+ +``` +AI/ +├── skills/ # 23 skill files + review-proposals skill (Phase 3) +├── context/ # 7 domain context maps +├── policies/ # 8 policy files + security-patterns.yaml +├── memory/ +│ ├── work/ # Auto-tracked sessions (META.yaml per session) +│ ├── learnings/ +│ │ ├── preferences.md # Structured table with confidence scores +│ │ ├── mistakes.md # Structured table with occurrence counts +│ │ ├── execution.md +│ │ └── synthesis/ # Weekly YYYY-WW.md reports +│ ├── proposals/ +│ │ ├── pending/ # Awaiting review +│ │ ├── approved/ # Applied +│ │ └── rejected/ # With reasons + blocklist.md +│ ├── signals/ratings.jsonl # Explicit + implicit ratings +│ ├── security/ # Audit trail +│ └── context-log.md +├── hooks/ # 7 hooks (session-start, end, feedback, security, +│ # format-enforcer, auto-work, implicit-sentiment) +└── scripts/ # 5 original + learning-synthesis.sh +``` + +### What Phase 4 Adds + +Four high-value skills that leverage the existing infrastructure: + +| Skill | What It Does | Uses | +|---|---|---| +| **Research** | Multi-source research with 3 depth levels | MCP servers (Brave, Firecrawl, Perplexity, Gemini) | +| **Council** | Multi-agent debate for important decisions | Claude subagents (Task tool) | +| **CreateSkill** | Self-extending — creates new skills on demand | Skill file system, validation | +| **Telos** | Life OS / goals management | Tracking/, Areas/, Projects/ | + +### Infrastructure These Skills Can Use + +- **Memory**: Write to `AI/memory/` for cross-session persistence +- **Learning**: Ratings and feedback captured automatically by hooks +- **Proposals**: Can generate improvement proposals for Phase 3's review system +- **Context maps**: Load relevant vault files per domain +- **Security**: All tool calls pass through security validator + +--- + +## PART 2: PAI REFERENCE FILES + +### Research Skill +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/SKILL.md 
+~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/QuickResearch.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/StandardResearch.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/ExtensiveResearch.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/ExtractAlpha.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/UrlVerificationProtocol.md +``` + +### Council Skill +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/SKILL.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/CouncilMembers.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/RoundStructure.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/OutputFormat.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/Workflows/Debate.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/Workflows/Quick.md +``` + +### CreateSkill Skill +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/SKILL.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/CreateSkill.md 
+~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/ValidateSkill.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/CanonicalizeSkill.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/UpdateSkill.md +``` + +### Telos Skill +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/SKILL.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Tools/UpdateTelos.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/Update.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/WriteReport.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/CreateNarrativePoints.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/InterviewExtraction.md +``` + +--- + +## PART 3: IMPLEMENTATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Vault folder structure, .base files, eventkit-cli, mail-cli, afk-code, Obsidian plugins +- Existing working hooks, policies, memory structure +- CLAUDE.md (unless adding skill routing references) + +**ALWAYS**: +- Git commit after every sub-step +- Test each skill with at least 2 different prompts +- Skills must follow the existing YAML frontmatter convention from Phase 1 +- Store research outputs as Markdown (readable in Obsidian) +- New skills go in `AI/skills/`, new context maps go in `AI/context/` + +--- + +### Step 4A: Research Skill + +**What it does**: Multi-source research with 3 depth 
levels. Orchestrates existing MCP servers (Brave Search, Firecrawl, Perplexity, Gemini) into structured research workflows. Produces synthesized reports, not raw search dumps. + +**Why valuable**: You already have 6 MCP servers. Without this skill, you manually invoke each one. This skill routes your request to the right combination of sources at the right depth and produces a coherent synthesis. + +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/SKILL.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/QuickResearch.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/StandardResearch.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/Workflows/ExtensiveResearch.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-research-skill/src/skills/Research/UrlVerificationProtocol.md + +□ Create AI/skills/research.md: + --- + name: Research + triggers: ["research", "do research", "investigate", "find information", + "quick research", "extensive research", "look into", + "what do we know about", "gather info"] + context_files: + - AI/context/research.md + policies: + - security-boundaries + voice: "Analytical, source-aware. Always cite sources. Distinguish facts from inference." + --- + + # Research + + ## When to Activate + When the user asks to research a topic, find information, investigate something, + or explicitly says "research this". Also activated by other skills that need + information gathering (e.g., newsletter writing, business decisions). 
+ + ## Depth Modes + + ### Quick Research (default for simple questions) + Trigger: "quick research", "briefly look into", or simple factual questions + - Single source: use Claude's WebSearch tool OR one MCP server + - Time: ~10-15 seconds + - Output: 3-5 bullet points with source links + - Use when: factual lookups, quick verification, simple questions + + ### Standard Research (default for "do research") + Trigger: "research", "do research", "investigate" + - Two parallel agents: + Agent 1: WebSearch (Claude's built-in web search) + Agent 2: Perplexity MCP (alternative perspective) + - Cross-reference findings, note agreements and disagreements + - Time: ~15-30 seconds + - Output: structured report with sections: + ## Key Findings + ## Sources Agree On + ## Sources Disagree On / Nuances + ## Confidence Assessment + ## Sources + + ### Extensive Research (explicit request only) + Trigger: "extensive research", "deep dive", "thorough research" + - Multiple parallel agents: + Agent 1-2: WebSearch (broad + specific queries) + Agent 3: Perplexity MCP (synthesis perspective) + Agent 4: Brave Search MCP (alternative search engine) + Agent 5: Firecrawl MCP (if specific URLs need deep scraping) + - Synthesis phase: combine all findings, resolve contradictions + - Time: ~60-90 seconds + - Output: comprehensive report with: + ## Executive Summary + ## Detailed Findings (by sub-topic) + ## Source Analysis (reliability, recency, agreement) + ## Open Questions (what couldn't be verified) + ## Recommendations for Further Research + ## All Sources (with verification status) + + ## URL Verification Protocol + CRITICAL: Research agents sometimes hallucinate URLs. + For EVERY URL in the output: + 1. Verify the URL loads (HTTP 200) + 2. Verify the content matches what was claimed + 3. If verification fails: remove the URL, note "source not verified" + NEVER include unverified URLs in final output. + + ## Available MCP Servers + Check which are currently connected. 
As of last check: + - Brave Search MCP — web search + - Firecrawl MCP — web scraping, content extraction + - Perplexity MCP — AI-powered search synthesis + - Gemini MCP — Google's AI (may be offline — check status) + Note: google-workspace and gemini MCPs have been flaky. Always fall back + to WebSearch + Brave + Perplexity if Gemini is unavailable. + + ## Output Storage + - Save research output to AI/memory/research/YYYY-MM-DD_{topic-slug}.md + - Include metadata: date, depth mode, sources used, query + - This allows future sessions to reference past research + + ## Integration with Other Skills + When another skill needs research (e.g., newsletter-editor needs background on a topic): + - The calling skill can reference: "Use the Research skill at [depth] for [topic]" + - Research produces output, the calling skill continues with it + +□ Create directory: AI/memory/research/ + +□ Create AI/context/research.md: + # Research Context Map + + ## Primary Files + - AI/memory/research/ — past research outputs (check for existing work on topic) + - Content/AI Equilibrium/ — for newsletter-related research + + ## MCP Servers Available + - Brave Search: general web search + - Firecrawl: web scraping and content extraction + - Perplexity: AI-synthesised search results + - Gemini: Google AI (check availability) + + ## Research Quality Rules + - Always verify URLs before including + - Distinguish primary sources from secondary + - Note recency of information + - Flag when sources disagree + +□ Update AI/skills/ai-equilibrium-editor.md (or equivalent newsletter skill): + → Add to context_files: AI/context/research.md + → Add note: "For topic research, invoke the Research skill at Standard depth" + +□ Test: + → "quick research on latest Claude Code features" → single source, brief output + → "do research on AI agent frameworks in 2025" → two parallel agents, structured report + → "extensive research on personal knowledge management tools" → 4-5 agents, comprehensive + → Verify: 
AI/memory/research/ contains output files + → Verify: URLs in output are verified (no broken links) + +□ Git commit: "add: research skill with 3 depth modes and MCP integration" +``` + +--- + +### Step 4B: Council Skill (Multi-Agent Debate) + +**What it does**: For important decisions, spawns 3-5 specialised agents who debate a topic through 3 rounds. Each agent has a distinct perspective and challenges the others. You get a structured transcript with convergence points and remaining disagreements. + +**Why valuable**: You're strong at synthesis and connecting dots. This skill gives you pre-digested intellectual friction from multiple angles — raw material for your synthesis, not a single AI opinion. + +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/SKILL.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/CouncilMembers.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/RoundStructure.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/Workflows/Debate.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-council-skill/src/skills/Council/Workflows/Quick.md + +□ Create AI/skills/council.md: + --- + name: Council + triggers: ["council", "debate this", "get multiple perspectives", + "what would different experts say", "council:", "multi-perspective"] + context_files: [] + policies: + - challenger-protocol + voice: "Facilitator. Present each agent's voice distinctly. Synthesise at the end." + --- + + # Council — Multi-Agent Debate + + ## When to Activate + When the user wants multiple perspectives on a decision, strategy, or problem. 
+ Especially useful for: business decisions, career moves, strategy questions, + architecture choices, investment decisions, content strategy. + + ## Workflows + + ### Full Debate (default) + Trigger: "council:", "debate this", "get perspectives on" + 3 rounds, 4 agents, ~30-90 seconds + + ### Quick Council + Trigger: "quick council", "fast perspectives" + 1 round, 3-4 agents, ~10-20 seconds + + ## Default Council Members + + Adapt these to the user's domains. The user (Krzysztof) works across: + business strategy, content creation, AI/tech, investing, health, personal development. + + | Agent | Perspective | Voice Style | + |---|---|---| + | **Strategist** | Long-term strategy, competitive dynamics, power | Direct, analytical, cites frameworks | + | **Pragmatist** | Implementation reality, resource constraints, timing | Grounded, practical, "yes but how" | + | **Challenger** | Devil's advocate, risks, what could go wrong | Provocative, tough love, contrarian | + | **Domain Expert** | Deep knowledge of the specific domain (varies per topic) | Technical, precise, evidence-based | + + For specific topics, the Domain Expert adapts: + - Business question → Business/Finance expert + - Health question → Health/Science expert + - Tech question → Engineering/Architecture expert + - Content question → Media/Audience expert + - Investing question → Financial analyst + + Optional additional agents (invoke with "council with [role]"): + - **Creative** — lateral thinking, unexpected connections, "what if" + - **User Advocate** — end-user perspective, simplicity, accessibility + - **Historian** — precedent, what happened when others tried this + + ## Three-Round Structure + + ### Round 1: Initial Positions (parallel) + Each agent gets the topic + relevant context. + Each states their position in 50-150 words, first person. + All agents execute IN PARALLEL via Task tool. + + Prompt template for each agent: + "You are [Agent Name], [perspective description]. 
+ Topic: [the user's question] + Context: [relevant domain context from context maps] + + State your initial position on this topic in 50-150 words. + Be specific and substantive. First person voice. + Do NOT hedge or equivocate — take a clear position." + + ### Round 2: Responses & Challenges (parallel) + Each agent receives ALL Round 1 positions. + Each responds to the others' points in 50-150 words. + Must explicitly reference other agents: "I disagree with Strategist's point about X..." + All agents execute IN PARALLEL. + + Prompt template: + "You are [Agent Name]. Here are the Round 1 positions: + [All positions] + + Respond to the other council members' points. 50-150 words. + Explicitly agree or disagree with specific points by name. + Add new arguments that weren't raised in Round 1. + Genuine intellectual friction — don't be polite for its own sake." + + ### Round 3: Convergence (parallel) + Each agent receives Rounds 1 + 2. + Each identifies: what they've changed their mind about (if anything), + what remains unresolved, their final recommendation. + 50-150 words. All execute IN PARALLEL. + + ### Synthesis (sequential, by you — the facilitator) + After all 3 rounds, produce: + + ## Council Transcript + [Full transcript with agent names and rounds clearly marked] + + ## Convergence Points + - Points where 3+ agents agreed + - Strongest arguments (by evidence, not just conviction) + + ## Remaining Disagreements + - Where agents still diverge, and why + - What information would resolve the disagreement + + ## Recommended Path + - Based on weight of arguments (not majority vote) + - Note the risks flagged by the Challenger + + ## Agent Execution + + Use the Task tool to run agents in parallel within each round. + Between rounds, wait for all agents to complete before starting the next round. 
+ + Round 1: 4 parallel Task calls → wait for all + Round 2: 4 parallel Task calls (with Round 1 transcript) → wait for all + Round 3: 4 parallel Task calls (with R1+R2 transcripts) → wait for all + Synthesis: produce final output + + ## Output Storage + Save council transcripts to AI/memory/research/YYYY-MM-DD_council_{topic-slug}.md + (Councils are a form of research — reusable in future sessions) + +□ Test: + → "council: should I pivot my newsletter from weekly to biweekly?" + → Verify: 4 agents debate across 3 rounds + → Verify: transcript includes genuine disagreement + → Verify: synthesis identifies convergence + disagreements + → "quick council: is it worth learning Rust for my projects?" + → Verify: 1 round, 3-4 agents, fast output + → Verify: output saved to AI/memory/research/ + +□ Git commit: "add: council skill for multi-agent debate on decisions" +``` + +--- + +### Step 4C: CreateSkill Skill (Self-Extending) + +**What it does**: Creates new skills when a pattern of requests doesn't map to existing skills. Can also be invoked explicitly. Scaffolds the skill file with proper frontmatter, creates context maps, and validates the result. + +**Why valuable**: The system grows organically. If you start asking about a new topic regularly, it can create a dedicated skill — complete with the right context maps, policies, and voice settings. 
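
Before the step-by-step, here is what the validation half of this skill might look like in TypeScript, the language the existing hooks use. This is a sketch, not the implementation: `validateSkillFile` is a hypothetical helper name, and the substring checks stand in for a real YAML parser. Only the required keys and sections come from the skill-file convention.

```typescript
// Hypothetical helper: checks a skill file against the AI/skills/ conventions.
// Deliberately naive: substring checks stand in for a real YAML parser.

interface SkillCheck {
  ok: boolean;
  problems: string[];
}

function validateSkillFile(source: string): SkillCheck {
  const problems: string[] = [];

  // Frontmatter must open the file, delimited by --- lines.
  const fm = source.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) {
    return { ok: false, problems: ["missing YAML frontmatter"] };
  }

  // Minimum required keys per the Phase 1 convention.
  for (const key of ["name", "triggers", "voice"]) {
    if (!fm[1].includes(`${key}:`)) problems.push(`missing frontmatter key: ${key}`);
  }

  // Triggers must be a non-empty array.
  if (/triggers:\s*\[\s*\]/.test(fm[1])) problems.push("triggers array is empty");

  // Required body sections.
  for (const section of ["## When to Activate", "## Instructions"]) {
    if (!source.includes(section)) problems.push(`missing section: ${section}`);
  }

  return { ok: problems.length === 0, problems };
}
```

A fuller version would also cross-check triggers against every other file in `AI/skills/` and confirm that each referenced context file and policy exists on disk, per the Validation Rules in this step.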
+ +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/SKILL.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/CreateSkill.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-createskill-skill/src/skills/CreateSkill/Workflows/ValidateSkill.md + +□ Create AI/skills/create-skill.md: + --- + name: Create Skill + triggers: ["create a skill", "new skill", "create skill for", + "I need a skill that", "add a new mode"] + context_files: [] + policies: + - security-boundaries + voice: "Methodical. Walk through each decision. Confirm before creating." + --- + + # Create Skill — Self-Extending System + + ## When to Activate + - User explicitly asks to create a new skill + - Weekly synthesis (Phase 3) identifies a recurring request pattern + without a matching skill and proposes creation + - User says "I keep asking about X, let's make a skill for it" + + ## Workflows + + ### Create (default) + Interactive skill creation with validation. + + ### Validate + Check an existing skill file for proper structure. + + ### Update + Add workflows or modify an existing skill. + + ## Create Workflow + + Step 1: Gather Requirements + Ask the user (or read from proposal if auto-triggered): + - What domain does this skill cover? + - What should trigger it? (keywords, phrases) + - What vault files are relevant? (for context map) + - What policies apply? (existing or new?) + - What tone/voice should it use? + - Are there existing skills it overlaps with? 
+ + Step 2: Check for Overlap + - Read ALL existing skills in AI/skills/ + - Check if any existing skill already covers this domain + - If overlap: suggest enhancing the existing skill instead of creating new + - If partial overlap: propose how to divide responsibility + + Step 3: Scaffold the Skill File + Create AI/skills/{skill-name}.md with: + ```yaml + --- + name: [Skill Name] + triggers: [trigger keywords] + context_files: + - AI/context/{domain}.md + policies: + - [relevant policies] + voice: "[tone description]" + --- + ``` + + Include sections: + - # [Skill Name] + - ## When to Activate + - ## Instructions + - ## Workflows (if applicable) + - ## Examples (2-3 concrete usage examples) + - ## Anti-Patterns (what this skill should NOT do) + + Step 4: Create Context Map (if needed) + If no existing context map covers this domain: + - Create AI/context/{domain}.md + - List primary vault files/folders + - List secondary references + - Note related domains + + Step 5: Validate + Check the created skill: + - YAML frontmatter is valid + - Triggers don't conflict with existing skills + - Referenced context_files exist + - Referenced policies exist + - Voice is defined + - At least 2 examples included + + Step 6: Register + - The skill is automatically available because CLAUDE.md routes to AI/skills/ + - No manual registration needed (file-based routing) + - Inform the user: "New skill created: [name]. 
Trigger with: [keywords]" + + Step 7: Git Commit + - Commit the new skill file + context map (if created) + - Message: "add: [skill-name] skill for [domain]" + + ## Validation Rules (for Validate workflow) + Check any skill file against: + □ Has valid YAML frontmatter (name, triggers, voice at minimum) + □ Triggers array is non-empty + □ No trigger conflicts with other skills' triggers + □ context_files references exist on disk + □ policies references exist in AI/policies/ + □ Has ## When to Activate section + □ Has ## Instructions section + □ Has at least 2 ## Examples + + ## Integration with Phase 3 + The weekly synthesis (learning-synthesis.sh) can detect: + - "General Advisor was used 15 times this week for cooking questions" + → Proposes: "Create a Cooking skill? Evidence: 15 unmatched requests." + → Proposal goes to AI/memory/proposals/pending/ + → User approves → CreateSkill executes the creation + +□ Test: + → "create a skill for travel planning" + → Verify: asks requirements, checks overlap, creates file + → Verify: AI/skills/travel-planning.md exists with proper frontmatter + → Verify: context map created if needed + → Verify: git commit made + → "validate the research skill" + → Verify: checks all validation rules, reports pass/fail + → Test trigger overlap: try creating a skill with triggers that match an existing skill + → Verify: warns about overlap, suggests alternative + +□ Git commit: "add: create-skill for self-extending capability" +``` + +--- + +### Step 4D: Telos Skill (Life OS / Goals) + +**What it does**: Manages goals, beliefs, lessons learned, and life areas as structured data. Integrates with the existing Tracking/ hierarchy (Objectives, Key Results, Habits) and Areas/. Produces progress reports and proposes goal revisions. + +**Why valuable**: You already have Tracking/Objectives/, Tracking/Key Results/, and Areas/Life Areas/. 
This skill connects them into a coherent life operating system with periodic reviews — instead of data sitting in separate files. + +**Important adaptation note**: PAI's Telos stores everything in `~/.claude/skills/CORE/USER/TELOS/`. This vault already has the data in Tracking/ and Areas/. Do NOT duplicate it. Create views and workflows that READ from existing locations, not new files. + +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/SKILL.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/Update.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/WriteReport.md + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-telos-skill/src/Workflows/InterviewExtraction.md + +□ Create AI/skills/telos.md: + --- + name: Telos + triggers: ["goals", "objectives", "life review", "telos", + "update my goals", "how am I doing", "goal progress", + "quarterly review", "what am I working toward", + "add lesson", "add belief", "life OS"] + context_files: + - AI/context/goals.md + policies: + - challenger-protocol + voice: "Reflective but honest. Report progress factually. + Challenge goals that aren't being pursued. Celebrate real progress." + --- + + # Telos — Life Operating System + + ## When to Activate + When the user asks about goals, objectives, life direction, progress, + or wants to review/update their personal operating system. 
+ + ## Data Sources (READ from existing vault — do NOT create copies) + + | Data | Location | Format | + |---|---|---| + | Objectives | Tracking/Objectives/ | Markdown with frontmatter | + | Key Results | Tracking/Key Results/ | Markdown with frontmatter | + | Habits | Tracking/Habits/ | Markdown / .base | + | Years | Tracking/Years/ | Yearly review docs | + | Life Areas | Areas/ | Folder per area | + | Projects | Projects/GPR/ | Active projects | + | Deep Profile | AI/My AgentOS/Deep Profile & Operating Manual.md | Psych profile | + | Context Log | AI/memory/context-log.md | Current priorities | + + ## Telos Personal Files (NEW — stored in AI/telos/) + + These files are NEW and managed by this skill. They capture things + NOT already tracked elsewhere in the vault: + + | File | Purpose | + |---|---| + | AI/telos/beliefs.md | What you believe to be true (tested by experience) | + | AI/telos/lessons.md | Hard-won lessons (not repeated mistakes — deep insights) | + | AI/telos/wisdom.md | Frameworks and mental models you use | + | AI/telos/predictions.md | Predictions you've made (track accuracy over time) | + + ## Workflows + + ### Review Progress (default for "how am I doing", "goal progress") + 1. Read Tracking/Objectives/ and Tracking/Key Results/ + 2. Read Projects/GPR/ for active projects + 3. Cross-reference: which objectives have active projects? Which don't? + 4. Read AI/memory/work/ — what have you actually been working on? + 5. 
Produce: + ## Goal Progress Report + ### On Track + - [Objectives with active key results showing progress] + ### At Risk + - [Objectives with stalled or missing key results] + ### Not Started + - [Objectives with no active projects] + ### Recommendation + - [Which goals to focus on, which to reconsider] + + ### Update Goals + Triggered by "update my goals", "add objective", "revise goals" + - Interactive: ask what to add/change/remove + - Update the relevant file in Tracking/Objectives/ or Tracking/Key Results/ + - Follow existing frontmatter schema (read examples first) + - Git commit after changes + + ### Add to Telos (beliefs, lessons, wisdom, predictions) + Triggered by "add lesson", "add belief", "I learned that", "I predict" + - Append to the relevant AI/telos/ file + - Include date, context, and confidence level + - For predictions: include timeframe and how to verify + + ### Quarterly Review + Triggered by "quarterly review" or scheduled (see below) + 1. Read all Tracking/ data + 2. Read AI/memory/learnings/synthesis/ — last 12 weeks of synthesis reports + 3. Read AI/telos/ personal files + 4. Produce comprehensive report: + ## Quarter in Review + ### Goals: What Progressed + ### Goals: What Stalled (and why) + ### Learning Trends (from synthesis reports) + ### Predictions Check (any predictions verifiable now?) + ### Belief Updates (any beliefs challenged by evidence?) + ### Recommended Focus for Next Quarter + + ### Monthly Report (lightweight) + Triggered by "monthly review" or scheduled + Subset of quarterly: just goal progress + learning trends + + ## Integration with Existing Scripts + - weekly-review.sh already runs Sunday — could include a Telos mini-check + - Quarterly review could be a new launchd script or triggered manually + + ## Challenger Protocol Integration + When reviewing goals, ACTIVELY challenge: + - Goals with no progress for 30+ days: "Is this still a priority?" 
+ - Goals that conflict with actual time allocation: "You say X is a priority + but your sessions show you spend time on Y" + - Beliefs that have been contradicted by recent experience + +□ Create directory: AI/telos/ + +□ Create AI/telos/beliefs.md: + # Beliefs + _What I believe to be true, based on experience. Periodically reviewed._ + + | Date Added | Belief | Confidence | Domain | Last Challenged | + |---|---|---|---|---| + +□ Create AI/telos/lessons.md: + # Lessons + _Hard-won insights. Not repeated mistakes — deep understanding._ + + | Date | Lesson | Context | Domain | + |---|---|---|---| + +□ Create AI/telos/wisdom.md: + # Wisdom — Frameworks & Mental Models + _Models I use for thinking and decision-making._ + + | Model | Description | When to Use | Source | + |---|---|---|---| + +□ Create AI/telos/predictions.md: + # Predictions + _Track accuracy over time. Review quarterly._ + + | Date | Prediction | Timeframe | Confidence | Outcome | Verified | + |---|---|---|---|---|---| + +□ Create AI/context/goals.md: + # Goals Context Map + + ## Primary Files + - Tracking/Objectives/ — all current objectives + - Tracking/Key Results/ — measurable key results + - Tracking/Habits/ — habit tracking + - Tracking/Years/ — yearly reviews and themes + + ## Secondary Files + - Projects/GPR/ — active projects (linked to objectives) + - Areas/ — life areas and domains + - AI/telos/ — beliefs, lessons, wisdom, predictions + - AI/memory/context-log.md — current priorities + - AI/My AgentOS/Deep Profile & Operating Manual.md — personal profile + + ## Related Domains + - business (for business goals) + - health (for health goals) + - personal (for life goals) + +□ Test: + → "how am I doing on my goals?" 
→ reads Tracking/, produces progress report + → "add a lesson: rushing architecture decisions always costs more time later" + → Verify: appended to AI/telos/lessons.md + → "I predict AI agents will replace 50% of SaaS tools by 2027" + → Verify: appended to AI/telos/predictions.md with timeframe + → "quarterly review" → comprehensive report with challenger pushback + +□ Git commit: "add: telos skill for life OS and goal management" +``` + +--- + +## PART 4: MANUAL TESTING CHECKLIST + +After all skills are implemented: + +``` +□ Test 1: Research — Quick + → "quick research on the latest Obsidian plugin releases" + → Verify: single source, 3-5 bullets, source links verified + +□ Test 2: Research — Standard + → "research the state of personal AI assistants in 2025" + → Verify: 2 parallel agents, structured report, sources cited + +□ Test 3: Research — Extensive + → "extensive research on knowledge graph tools for Obsidian" + → Verify: 4-5 agents, comprehensive report, stored in AI/memory/research/ + +□ Test 4: Council — Full Debate + → "council: should I invest in building my own AI tools vs using existing ones?" + → Verify: 4 agents, 3 rounds, genuine disagreement, synthesis + +□ Test 5: Council — Quick + → "quick council: biweekly newsletter — good or bad idea?" + → Verify: 1 round, fast, distinct positions + +□ Test 6: CreateSkill + → "create a skill for meal planning" + → Verify: interactive creation, proper frontmatter, no overlap warning (new domain) + → Verify: AI/skills/meal-planning.md exists + +□ Test 7: CreateSkill — Overlap Detection + → "create a skill for financial advice" + → Verify: warns about overlap with financial-advisor skill + → Verify: suggests enhancing existing instead + +□ Test 8: Telos — Progress + → "how am I doing on my goals?" 
+ → Verify: reads Tracking/, reports progress, flags stalled goals + +□ Test 9: Telos — Add + → "add belief: most productivity advice is designed for neurotypical people" + → Verify: added to AI/telos/beliefs.md with date and domain + +□ Test 10: Cross-Skill + → "research the best goal-tracking frameworks, then update my telos with findings" + → Verify: Research runs first, then Telos uses the output +``` + +--- + +## PART 5: UPDATING CLAUDE.md + +After all skills are created, CLAUDE.md's skill routing section may need updating to mention the new skills. Add to the routing instruction: + +``` +Available skills include (but are not limited to): +- research.md — Multi-source research at 3 depth levels +- council.md — Multi-agent debate for decisions +- create-skill.md — Create new skills on demand +- telos.md — Life OS, goals, beliefs, lessons +[...existing skills...] +``` + +Only add these 4 lines. Don't restructure CLAUDE.md. + +--- + +## PART 6: WHAT COMES NEXT (Phase 5 Preview) + +Phase 4 gives you four powerful skills. Phase 5 adds The Algorithm — a structured execution engine that classifies task complexity and routes to appropriate capabilities: + +- **Trivial** → direct answer +- **Quick** → single skill +- **Standard** → skill + research, parallel agents +- **Thorough** → skill + research + council, multiple agents +- **Determined** → everything, extended thinking, red team + +The Algorithm uses an Ideal State Criteria (ISC) framework to define what "done well" looks like for every non-trivial task, then works through each criterion systematically. + +Phase 4's skills become the capabilities that Phase 5 orchestrates. 
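As a closing illustration of how the pieces fit, the trigger-based skill routing that CLAUDE.md delegates to (Part 5) and the overlap check exercised in Test 7 can be sketched in a few lines. This is a hedged sketch only: it assumes each skill exposes a `triggers` list from its YAML frontmatter (as the skill templates above show), and the function names `match_skills` and `trigger_overlaps` are illustrative, not existing PAI or vault tooling.

```python
# Sketch: substring-based trigger matching for skill routing, plus the
# overlap check CreateSkill runs before adding a new skill. Assumes the
# triggers have already been parsed out of each skill's frontmatter.

def match_skills(message: str, skills: dict[str, list[str]]) -> list[str]:
    """Return names of skills whose trigger phrases appear in the message."""
    text = message.lower()
    return [name for name, triggers in skills.items()
            if any(t.lower() in text for t in triggers)]

def trigger_overlaps(new_triggers: list[str],
                     skills: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each existing skill to the proposed triggers it already claims."""
    overlaps = {}
    for name, triggers in skills.items():
        existing = {t.lower() for t in triggers}
        shared = [t for t in new_triggers if t.lower() in existing]
        if shared:
            overlaps[name] = shared
    return overlaps

# Hypothetical excerpt of the trigger registry:
skills = {
    "telos": ["goals", "quarterly review", "life OS"],
    "financial-advisor": ["investing", "financial advice"],
}
print(match_skills("how am I doing on my goals?", skills))        # ['telos']
print(trigger_overlaps(["financial advice", "meal planning"], skills))
# {'financial-advisor': ['financial advice']}
```

In practice the matching would be fuzzier than plain substrings, but the shape of the check is the same: route on trigger hits, and warn when a proposed skill's triggers collide with an existing one.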
diff --git a/Plans/phase-5-briefing.md b/Plans/phase-5-briefing.md new file mode 100644 index 000000000..a59705400 --- /dev/null +++ b/Plans/phase-5-briefing.md @@ -0,0 +1,666 @@ +# Phase 5: The Algorithm (Execution Engine) — Briefing for Claude Code + +## How to Use This File + +You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for **Phase 5** of the AI Genius transformation. Phases 1–4 are complete and merged to main. + +**Reference repo**: `~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ` — clone of Daniel Messler's PAI. Read files there for implementation patterns. Adapt, don't copy. + +**Safety rule**: After EVERY sub-step (5A through 5D), commit to git with a descriptive message. If anything breaks, `git revert` that commit. + +--- + +## PART 1: CURRENT STATE AFTER PHASE 4 + +### What Exists + +Phases 1–4 built the full modular architecture, hook system, learning engine, and advanced skills. + +``` +AI/ +├── skills/ # 23 original + review-proposals + research + council +│ # + create-skill + telos = ~28 skills +├── context/ # 7 original + research + goals = ~9 context maps +├── policies/ # 8 + security-patterns.yaml +├── telos/ # beliefs, lessons, wisdom, predictions +├── memory/ +│ ├── work/ # Auto-tracked sessions +│ ├── learnings/ # Preferences, mistakes, execution, synthesis/ +│ ├── proposals/ # Pending, approved, rejected +│ ├── signals/ratings.jsonl +│ ├── security/ +│ ├── research/ # Research outputs from Phase 4 +│ └── context-log.md +├── hooks/ # 7 hooks +└── scripts/ # 5 original + learning-synthesis.sh +``` + +### What Phase 5 Adds + +The Algorithm is an execution framework that sits ABOVE individual skills. It: +1. Classifies task effort (trivial → determined) +2. Defines what "done well" looks like (ISC — Ideal State Criteria) +3. Routes to appropriate capabilities based on effort level +4. Executes systematically through 7 phases +5. 
Verifies results with a skeptical agent +6. Learns from the outcome + +Think of it as: skills are employees, The Algorithm is the project management methodology. + +### What Phase 5 Does NOT Need + +PAI's Algorithm is deeply integrated with custom TypeScript tools (ISCManager.ts, EffortClassifier.ts, etc.) and a voice server. This vault doesn't have those systems. Phase 5 adapts the Algorithm's **methodology** as a skill + policy, not as a TypeScript tool chain. The ISC is maintained as Markdown (readable in Obsidian), not JSON managed by CLI tools. + +--- + +## PART 2: PAI REFERENCE FILES + +Read these to understand the full Algorithm before simplifying: + +### Core Skill Definition +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/SKILL.md +``` + +### Phase Definitions (read all 7) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Observe.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Think.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Plan.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Build.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Execute.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Verify.md +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Phases/Learn.md +``` + +### ISC Format Reference +``` 
+~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Reference/ISCFormat.md +``` + +### Tools (read for logic, not to port directly) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/EffortClassifier.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/CapabilitySelector.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/TraitModifiers.ts +``` + +--- + +## PART 3: IMPLEMENTATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Existing skills, hooks, policies, memory structure +- Vault folder structure, .base files, tools +- CLAUDE.md (unless adding algorithm routing reference) + +**DESIGN PRINCIPLES**: +- The Algorithm is a **skill + policy**, not a TypeScript tool chain +- ISC is **Markdown**, not JSON managed by CLI tools +- Keep it simple enough that Claude can execute it without custom tooling +- The user can say "use the algorithm" to invoke it, or it auto-activates for non-trivial tasks + +--- + +### Step 5A: Effort Classification Policy + +**What it does**: Defines how to classify task effort. This determines what capabilities are available and how much rigour to apply. Every non-trivial request gets classified. 
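The auto-classification signals this policy describes can be sketched as a small heuristic. This is illustrative only: the keyword lists and the 15-word threshold come from the policy text, the `MULTI_STEP` verb list and the function name are assumptions added for the sketch, and a real classifier applies judgment on top of these signals rather than rules alone.

```python
# Minimal sketch of effort auto-classification from the Step 5A signals.
# MULTI_STEP is an assumed heuristic, not part of the policy.

TRIVIAL_PHRASES = {"hello", "thanks", "yes", "no", "ok"}
PERSISTENCE = ("until done", "keep going", "don't stop until")
COMPLEXITY = {"refactor", "redesign", "architecture", "strategy",
              "thorough", "comprehensive", "extensive"}
MULTI_STEP = {"draft", "plan", "review", "analyse", "evaluate"}  # assumed

def classify_effort(request: str) -> str:
    """Map a request to TRIVIAL / QUICK / STANDARD / THOROUGH / DETERMINED."""
    text = request.lower().strip()
    words = set(text.split())
    if text in TRIVIAL_PHRASES:
        return "TRIVIAL"
    if any(p in text for p in PERSISTENCE):
        return "DETERMINED"              # persistence keywords win
    if COMPLEXITY & words:
        return "THOROUGH"                # explicit depth or complexity
    if MULTI_STEP & words:
        return "STANDARD"                # multi-step verbs imply real work
    return "QUICK" if len(text.split()) <= 15 else "STANDARD"
```

An explicit user override ("algorithm effort THOROUGH: ...") would bypass this entirely, as the policy's Override section specifies.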
+ +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/EffortClassifier.ts + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/TraitModifiers.ts + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-algorithm-skill/src/skills/THEALGORITHM/Tools/CapabilitySelector.ts + +□ Create AI/policies/effort-classification.md: + + # Effort Classification + + Every non-trivial request is classified into an effort level. + Effort determines which capabilities are available, how many iterations + are allowed, and whether parallel agents are used. + + ## Effort Levels + + ### TRIVIAL + **Trigger**: greetings, yes/no questions, acknowledgements, single-word responses + **Response**: Direct answer. No ISC. No skill routing overhead. + **Examples**: "hello", "thanks", "yes", "ok", "what time is it?" 
+ + ### QUICK + **Trigger**: Simple tasks, ≤15 words, single-step actions, low complexity + **Capabilities**: Single skill, no parallel agents, 1 iteration max + **Model**: Default (sonnet) + **Examples**: "translate this to Polish", "what's on my calendar today", + "fix the typo in X", "rename this file" + + ### STANDARD (default for most work) + **Trigger**: Multi-step tasks, requires context, moderate complexity + **Capabilities**: Skill + research (standard depth), 1-3 parallel agents, 2 iterations + **Model**: Default (sonnet) + **ISC**: 5-15 rows + **Examples**: "draft a newsletter issue", "review this business proposal", + "plan next week's content", "analyse this investment opportunity" + + ### THOROUGH + **Trigger**: Complex tasks, explicit request for depth, strategic decisions, + multi-domain work, architecture, refactoring + **Capabilities**: Skill + research (extensive) + council debate, 3-5 parallel agents, + 3-5 iterations + **Model**: Default (sonnet) with deep thinking + **ISC**: 15-50 rows + **Examples**: "redesign the newsletter strategy", "evaluate whether to pivot the business", + "thorough research on X", "create a comprehensive plan for Y" + + ### DETERMINED + **Trigger**: Explicit keywords ("until done", "keep going", "don't stop until", + "walk away and come back to results"), mission-critical tasks + **Capabilities**: Everything — research (extensive), council, all skills, + unlimited iterations, 10+ parallel agents + **Model**: Best available (opus if accessible) + **ISC**: 50+ rows + **Examples**: "until all tests pass", "keep iterating until this is production-ready", + "comprehensive due diligence on this acquisition" + + ## Override + The user can explicitly set effort: + - "algorithm effort THOROUGH: [task]" + - "use thorough effort for this" + - "this is a quick one" + + ## Auto-Classification Signals + - Word count: ≤15 words → likely QUICK + - Complexity keywords: "refactor", "redesign", "architecture", "strategy" → THOROUGH+ + - 
Persistence keywords: "until done", "keep going" → DETERMINED + - Multi-domain: touches 2+ context maps → STANDARD+ + - Explicit depth: "thorough", "comprehensive", "extensive" → THOROUGH + + ## Display + When effort is classified, briefly state it: + "Effort: STANDARD — using research + newsletter skill, 2 iterations max." + Only state this for STANDARD and above. TRIVIAL and QUICK are silent. + +□ Git commit: "add: effort classification policy for capability routing" +``` + +--- + +### Step 5B: The Algorithm Skill + +**What it does**: The core execution methodology. Seven phases: OBSERVE → THINK → PLAN → BUILD → EXECUTE → VERIFY → LEARN. Uses an ISC (Ideal State Criteria) table to track what "done well" looks like. + +**Implementation**: + +``` +□ Read PAI references (all 7 phase files + SKILL.md + ISCFormat.md) + +□ Create AI/skills/algorithm.md: + --- + name: The Algorithm + triggers: ["use the algorithm", "algorithm:", "algorithm effort", + "structured execution", "ISC", "ideal state"] + context_files: [] + policies: + - effort-classification + - challenger-protocol + voice: "Systematic, methodical. Report progress by phase. + Be transparent about what's been done and what remains." + --- + + # The Algorithm — Structured Execution Engine + + ## When to Activate + + Explicitly: + - User says "use the algorithm" or "algorithm effort [LEVEL]: [task]" + - User says "ISC" or "ideal state" + + Implicitly (auto-activate for STANDARD+ effort): + - When effort classification is STANDARD or above AND the task is non-trivial + - Do NOT auto-activate for TRIVIAL or QUICK tasks + + When NOT auto-activated, still follow the effort classification policy + for capability routing. The Algorithm adds the ISC discipline on top. + + ## The ISC (Ideal State Criteria) + + The ISC is a Markdown table that defines what "done well" looks like. + It is THE central artifact of The Algorithm. Every phase mutates the ISC. 
+ + ### ISC Format + + ```markdown + ## ISC: [One-line summary of the request] + **Effort:** [LEVEL] | **Phase:** [CURRENT PHASE] | **Iteration:** [N] + + | # | What Ideal Looks Like | Source | Capability | Status | + |---|---|---|---|---| + | 1 | [Success criterion] | EXPLICIT | [skill/tool] | PENDING | + | 2 | [Inferred requirement] | INFERRED | [skill/tool] | PENDING | + | 3 | [Universal standard] | IMPLICIT | [verification] | PENDING | + ``` + + ### Source Types + - **EXPLICIT**: User literally asked for this + - **INFERRED**: Derived from context (e.g., user's tech stack, preferences, domain) + - **IMPLICIT**: Universal quality standards (no errors, consistent formatting, + follows existing patterns) + - **RESEARCH**: Discovered during research phase + + ### Status Lifecycle + PENDING → ACTIVE → DONE / ADJUSTED / BLOCKED + + - **PENDING**: Not started + - **ACTIVE**: Currently being worked on + - **DONE**: Completed and verified + - **ADJUSTED**: Completed with acceptable deviation (note why) + - **BLOCKED**: Cannot complete (note blocker) + + ### ISC Size by Effort + - QUICK: No ISC (too lightweight) + - STANDARD: 5-15 rows + - THOROUGH: 15-50 rows + - DETERMINED: 50+ rows (no limit) + + ## The 7 Phases + + ### Phase 1: OBSERVE + **Goal**: Understand the request. Create the ISC. + **Actions**: + 1. Read the user's request carefully + 2. Load relevant context (context maps, memory, preferences) + 3. Create ISC rows: + - EXPLICIT rows: what the user literally asked for + - INFERRED rows: what context implies (preferences, patterns, tech stack) + - IMPLICIT rows: universal standards (no errors, consistent style, + follows existing vault patterns, tests pass) + 4. If the request is unclear, ask up to 5 clarifying questions: + - What does success look like when this is done? + - Who will use this and what will they do with it? + - What existing thing is this most similar to? + - What should this definitely NOT do? 
+ - (Only ask what's genuinely unclear — don't interrogate for simple tasks) + + **Gate**: Do I have at least 2 ISC rows? Did I use context to infer beyond the literal request? + + **Show the ISC table to the user after OBSERVE.** + + ### Phase 2: THINK + **Goal**: Ensure nothing is missing. Challenge assumptions. + **Actions**: + 1. Review all ISC rows for completeness + 2. Check for gaps: security, error handling, edge cases, formatting + 3. Consider: what could go wrong? What's the user NOT thinking about? + 4. Invoke challenger protocol: is there a better approach? + 5. For THOROUGH+: use extended thinking or council skill + 6. Add any new rows discovered + + **Gate**: All rows are clear and testable? No obvious gaps? + + ### Phase 3: PLAN + **Goal**: Sequence the work. Identify what can run in parallel. + **Actions**: + 1. Identify dependencies between rows (what must happen before what?) + 2. Mark rows that can execute in parallel (no dependencies, no shared files) + 3. Group into execution phases: + - Phase A: Research (gather information needed) + - Phase B: Thinking (analysis, strategy, decisions) + - Phase C: Execution (do the actual work) + - Phase D: Verification (test results) + 4. Assign capabilities to each row: + - Research skill for research rows + - Relevant domain skill for execution rows + - Council for debate/decision rows + + **Gate**: Dependencies mapped? Execution sequence clear? + + ### Phase 4: BUILD + **Goal**: Make each ISC row testable. Define success criteria. + **Actions**: + 1. For each row: how will we know it's DONE? + 2. Refine vague descriptions: + - "Works well" → specific measurable criterion + - "Looks good" → matches existing style, follows formatting policy + - "Is complete" → all sub-items enumerated + 3. 
Define verification method for each row: + - Read the output and check + - Run a test/command + - Compare against a reference + - Ask the user to confirm + + **Gate**: Every row has a specific, verifiable success criterion? + + ### Phase 5: EXECUTE + **Goal**: Do the work. Update ISC status as you go. + **Actions**: + 1. Work through rows in the planned sequence + 2. For parallel rows: use the Task tool to run multiple agents simultaneously + 3. Update status: PENDING → ACTIVE → DONE (as each completes) + 4. If a row hits a blocker: mark BLOCKED, note the blocker, continue with other rows + 5. If a row needs adjustment: mark ADJUSTED, note the deviation + + **Capability routing by effort**: + - STANDARD: single skill, sequential or 1-3 parallel agents + - THOROUGH: multiple skills, research + execution, 3-5 parallel agents + - DETERMINED: everything available, 10+ parallel agents + + **Gate**: Every row has a final status (DONE, ADJUSTED, or BLOCKED)? + + ### Phase 6: VERIFY + **Goal**: Test results with a skeptical eye. Don't trust your own work blindly. + **Actions**: + 1. For each DONE row: verify against the success criterion from BUILD phase + 2. Be genuinely skeptical — try to find problems, don't just confirm + 3. Run tests, read outputs, check formatting, verify links + 4. If verification fails: mark row back to ACTIVE or BLOCKED + 5. For THOROUGH+: use a separate verification pass (re-read with fresh eyes) + + **Results**: + - PASS → row stays DONE + - ADJUSTED → acceptable deviation, note why + - BLOCKED → issue found, needs iteration + + **Gate**: All DONE rows verified? All BLOCKED rows noted? + + ### Phase 7: LEARN + **Goal**: Capture what happened. Let the user rate the result. + **Actions**: + 1. Show the final ISC table with all statuses + 2. Present deliverables to the user + 3. Note what worked well and what didn't + 4. Do NOT self-rate — let the user provide a rating (captured by feedback hook) + 5. 
If BLOCKED rows exist: explain what's needed to unblock + 6. Suggest whether to iterate (go back to an earlier phase) or ship as-is + + **Iteration logic** (when BLOCKED rows exist): + - Execution problem → loop back to EXECUTE + - Planning problem → loop back to PLAN + - Requirements problem → loop back to THINK or OBSERVE + - Iteration count bounded by effort level: + STANDARD: 2 iterations max + THOROUGH: 3-5 iterations + DETERMINED: unlimited + + ## ISC Storage + + Save the ISC to AI/memory/work/{current-session-dir}/ISC.md + This is automatically created by the auto-work hook (Phase 2). + The ISC persists within the session and is readable in Obsidian. + + ## Example: Algorithm in Action + + User: "Draft this week's newsletter about AI agents" + Effort: STANDARD (multi-step, needs context + research + writing) + + OBSERVE → Creates ISC: + | # | What Ideal Looks Like | Source | Capability | Status | + |---|---|---|---|---| + | 1 | Topic covers AI agents with fresh angle | EXPLICIT | research | PENDING | + | 2 | 3-5 key insights, not obvious takes | EXPLICIT | research | PENDING | + | 3 | Matches AI Equilibrium voice and format | INFERRED | newsletter skill | PENDING | + | 4 | Includes 2-3 actionable takeaways for readers | INFERRED | newsletter skill | PENDING | + | 5 | Cross-links to relevant vault notes | IMPLICIT | linking rules | PENDING | + | 6 | British English, no slop, no emojis | IMPLICIT | formatting policy | PENDING | + | 7 | Draft saved to Content/AI Equilibrium/ | IMPLICIT | vault structure | PENDING | + + THINK → Adds row 8: "References recent AI agent news (last 7 days)" + PLAN → Research first (rows 1,2,8 parallel), then writing (rows 3,4,5,6,7 sequential) + BUILD → Refines "fresh angle" to "angle not covered by top 5 AI newsletters this week" + EXECUTE → Runs research skill, then newsletter skill with findings + VERIFY → Checks format, voice, links, British English + LEARN → Presents draft, shows ISC, awaits user rating + +□ Create 
AI/memory/work/ISC-TEMPLATE.md: + ## ISC: [Request Summary] + **Effort:** [LEVEL] | **Phase:** [PHASE] | **Iteration:** 1 + + | # | What Ideal Looks Like | Source | Capability | Status | + |---|---|---|---|---| + + _Created by The Algorithm. Updated through execution phases._ + +□ Test: + → "algorithm effort STANDARD: draft this week's newsletter about AI agents" + → Verify: ISC created with 5-15 rows + → Verify: phases execute in order + → Verify: ISC status updates as work progresses + → Verify: final ISC shown with all statuses + → "use the algorithm: redesign my morning routine" (STANDARD) + → Verify: reads health context, telos goals, creates ISC + → Simple request without "algorithm": "what's on my calendar?" + → Verify: does NOT invoke algorithm (TRIVIAL/QUICK) + +□ Git commit: "add: the algorithm skill with ISC-based structured execution" +``` + +--- + +### Step 5C: Algorithm Protocol Policy + +**What it does**: Defines when The Algorithm activates, how effort maps to capabilities, and the rules for ISC management. Referenced by the algorithm skill and by CLAUDE.md for automatic activation on STANDARD+ tasks. + +**Implementation**: + +``` +□ Create AI/policies/algorithm-protocol.md: + + # Algorithm Protocol + + ## When The Algorithm Activates + + ### Explicit Activation + User says: "use the algorithm", "algorithm:", "algorithm effort [LEVEL]:", "ISC" + → Always activate. Use the specified effort level, or classify if not specified. + + ### Implicit Activation + Task is classified as STANDARD or higher effort AND is non-trivial. + → Activate The Algorithm automatically. + → State: "This looks like a STANDARD task. Using The Algorithm." 
+ + ### Never Activate + - TRIVIAL tasks (greetings, acknowledgements) + - QUICK tasks (single-step, simple actions) + - When user says "just do it", "skip the algorithm", "no need for ISC" + + ## Capability Routing Matrix + + | Effort | Skills | Research | Council | Parallel Agents | Iterations | ISC Rows | + |---|---|---|---|---|---|---| + | QUICK | 1 skill | none | none | 1 | 1 | none | + | STANDARD | 1-2 skills | standard | none | 1-3 | 2 | 5-15 | + | THOROUGH | any skills | extensive | yes | 3-5 | 3-5 | 15-50 | + | DETERMINED | everything | extensive | yes + red team | 10+ | unlimited | 50+ | + + ## ISC Rules + + 1. **ISC is shown to the user** after OBSERVE phase. User can modify before proceeding. + 2. **ISC rows are never deleted** — only status changes. This preserves the audit trail. + 3. **ADJUSTED is acceptable** — not everything needs to be perfectly DONE. + Document the deviation. + 4. **BLOCKED rows require a decision**: iterate, skip, or escalate to user. + 5. **ISC is saved** to the current work directory (AI/memory/work/{session}/ISC.md). + 6. **Verification is skeptical** — don't just confirm your own work. + Re-read outputs, check for errors, verify against criteria. + + ## Phase Transition Rules + + - Never skip phases (OBSERVE → THINK → PLAN → BUILD → EXECUTE → VERIFY → LEARN) + - Each phase has a gate question. Don't proceed until the gate is satisfied. + - For QUICK tasks that don't use the algorithm: just execute directly. 
+ - For STANDARD: phases can be lightweight (2-3 sentences each for THINK/PLAN/BUILD) + - For THOROUGH/DETERMINED: phases should be substantive + + ## Iteration Rules + + When VERIFY finds issues: + - Execution error → loop to EXECUTE (try a different approach) + - Planning error → loop to PLAN (resequence or reassign capabilities) + - Requirements error → loop to THINK or OBSERVE (clarify with user) + - Each loop increments the iteration counter + - At max iterations for effort level: present results as-is, note what's unresolved + + ## Integration with Existing Skills + + The Algorithm does NOT replace skills. It orchestrates them: + - OBSERVE reads context maps to understand the domain + - PLAN assigns capabilities (research skill, newsletter skill, council, etc.) + - EXECUTE invokes those skills + - VERIFY checks the output against ISC criteria + - LEARN feeds into the memory/learning system (Phase 3) + + The Algorithm also works with: + - Challenger protocol (during THINK phase) + - Preferences (loaded during OBSERVE from memory) + - Mistake patterns (checked during THINK — don't repeat known mistakes) + +□ Git commit: "add: algorithm protocol policy for activation and routing rules" +``` + +--- + +### Step 5D: CLAUDE.md Integration + +**What it does**: Updates CLAUDE.md to reference effort classification and The Algorithm for automatic activation on non-trivial tasks. 
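The routing rule this step adds to CLAUDE.md boils down to a small decision: explicit algorithm requests always activate it, explicit skip phrases always bypass it, and otherwise the effort level decides. A hedged sketch, assuming effort has already been classified upstream; the function and constant names are illustrative, not existing vault tooling.

```python
# Sketch of the Step 5D routing decision: does a classified request run
# through The Algorithm or execute directly via a skill?

SKIP_PHRASES = ("skip the algorithm", "just do it", "no need for isc")
ALGORITHM_LEVELS = {"STANDARD", "THOROUGH", "DETERMINED"}

def route(request: str, effort: str) -> str:
    """Return 'algorithm' or 'direct' for a request with a known effort level."""
    text = request.lower()
    # Explicit activation ("algorithm effort THOROUGH: ...") always wins.
    if text.startswith("algorithm") or "use the algorithm" in text:
        return "algorithm"
    # Explicit user override to skip always wins next.
    if any(phrase in text for phrase in SKIP_PHRASES):
        return "direct"
    return "algorithm" if effort in ALGORITHM_LEVELS else "direct"

print(route("draft a newsletter issue", "STANDARD"))                 # algorithm
print(route("just translate this, skip the algorithm", "STANDARD"))  # direct
print(route("what's on my calendar?", "QUICK"))                      # direct
```

This mirrors the test cases below: STANDARD work activates The Algorithm, the skip override executes directly, and QUICK requests never carry the ISC overhead.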
+
+**Implementation**:
+
+```
+□ Read current CLAUDE.md
+
+□ Add to CLAUDE.md's universal rules section (keep it brief — 5-10 lines):
+
+  ## Effort & Execution
+  - Classify every request by effort level (see AI/policies/effort-classification.md)
+  - TRIVIAL/QUICK: execute directly via the matching skill
+  - STANDARD and above: use The Algorithm (AI/skills/algorithm.md)
+    — create an ISC, work through 7 phases, verify results
+  - User can override: "algorithm effort THOROUGH: [task]" or "skip the algorithm"
+  - Route capabilities based on effort level (see AI/policies/algorithm-protocol.md)
+
+□ Do NOT restructure or rewrite CLAUDE.md. Just add the above block.
+
+□ Test:
+  → Simple request: "what's on my calendar?" → QUICK, no algorithm
+  → Medium request: "draft a newsletter issue" → STANDARD, algorithm activates,
+    ISC created, phases execute
+  → Complex request: "algorithm effort THOROUGH: redesign my content strategy"
+    → THOROUGH, full algorithm with council and research
+  → Override: "just translate this, skip the algorithm" → executes directly
+
+□ Git commit: "integrate: add effort classification and algorithm routing to CLAUDE.md"
+```
+
+---
+
+## PART 4: MANUAL TESTING CHECKLIST
+
+```
+□ Test 1: Effort Classification — TRIVIAL
+  → "hello" → no algorithm, no ISC, direct response
+  → Verify: no effort classification displayed
+
+□ Test 2: Effort Classification — QUICK
+  → "translate 'good morning' to Polish" → no algorithm, routes to translator skill
+  → Verify: no ISC, direct execution
+
+□ Test 3: Effort Classification — STANDARD (auto-algorithm)
+  → "draft this week's newsletter about personal AI assistants"
+  → Verify: effort stated as STANDARD
+  → Verify: ISC created with 5-15 rows
+  → Verify: ISC shown to user after OBSERVE
+  → Verify: phases execute in order
+  → Verify: final ISC with statuses shown
+
+□ Test 4: Effort Classification — THOROUGH (explicit)
+  → "algorithm effort THOROUGH: evaluate whether I should start a YouTube channel"
+  → 
Verify: effort stated as THOROUGH + → Verify: research skill invoked (extensive) + → Verify: council debate invoked + → Verify: ISC with 15+ rows + → Verify: multiple iterations if needed + +□ Test 5: Override — Skip + → "just do it, skip the algorithm: draft a quick email to John" + → Verify: no ISC, direct execution via communication skill + +□ Test 6: ISC Persistence + → Start a STANDARD algorithm task + → Verify: ISC saved to AI/memory/work/{session}/ISC.md + → Verify: ISC readable in Obsidian + +□ Test 7: Iteration + → Start a task where VERIFY finds an issue + → Verify: iteration back to appropriate phase + → Verify: iteration counter increments + → Verify: at max iterations, results presented as-is + +□ Test 8: Cross-Skill Integration + → "algorithm effort STANDARD: research AI agent frameworks and recommend one for my setup" + → Verify: PLAN assigns research skill for gathering, then analysis + → Verify: research skill actually executes within the algorithm + → Verify: results feed back into ISC status updates +``` + +--- + +## PART 5: WHAT THE ALGORITHM LOOKS LIKE IN PRACTICE + +Here's what a STANDARD interaction looks like end-to-end: + +``` +User: "Help me plan next month's newsletter topics" + +Claude: Effort: STANDARD — using The Algorithm with newsletter + research skills. 
+ +**OBSERVE** — Creating ISC: + +## ISC: Plan next month's newsletter topics +**Effort:** STANDARD | **Phase:** OBSERVE | **Iteration:** 1 + +| # | What Ideal Looks Like | Source | Capability | Status | +|---|---|---|---|---| +| 1 | 4 newsletter topics identified (one per week) | EXPLICIT | research | PENDING | +| 2 | Each topic has a unique angle not covered by competitors | EXPLICIT | research | PENDING | +| 3 | Topics align with AI Equilibrium brand voice | INFERRED | newsletter skill | PENDING | +| 4 | Topics connect to current AI industry trends | INFERRED | research | PENDING | +| 5 | At least 1 topic connects to user's business interests | INFERRED | business context | PENDING | +| 6 | Draft outline for each topic (3-5 bullet points) | INFERRED | newsletter skill | PENDING | +| 7 | Topics saved to Content/AI Equilibrium/ | IMPLICIT | vault structure | PENDING | + +Does this ISC look right? Want to adjust before I proceed? + +[User confirms or adjusts] + +**THINK** — Checking for gaps... +Added row 8: "Topics shouldn't overlap with last 4 published issues" +All rows clear and testable. + +**PLAN** — Sequencing: +Phase A (Research): rows 1, 2, 4, 8 in parallel — research trends + competitor analysis +Phase B (Execution): rows 3, 5, 6 sequential — apply brand voice, connect to business, outline +Phase C (Verification): row 7 — save to vault + +**BUILD** — Refining success criteria: +Row 2 refined: "Angle not found in first page of results for 'AI newsletter topics January 2026'" +Row 5 refined: "Connects to at least one active GPR project or business area" + +**EXECUTE** — Working... +[Research skill runs, newsletter skill applies findings, outlines drafted] +[ISC statuses update to DONE as each completes] + +**VERIFY** — Checking results... +All 8 rows verified. Row 5 marked ADJUSTED (connects to investing area, not a GPR project). 
+ +**LEARN** — Here are your 4 newsletter topics with outlines: +[Presents results + final ISC table] +``` + +--- + +## PART 6: WHAT COMES NEXT (Phase 6 Preview) + +Phase 6 adds observability — event logging for every tool call and hook execution, with an optional web dashboard. This is useful for debugging complex Algorithm executions and understanding system behaviour over time. It's the final phase and entirely optional. diff --git a/Plans/phase-6-briefing.md b/Plans/phase-6-briefing.md new file mode 100644 index 000000000..1a6cf09c4 --- /dev/null +++ b/Plans/phase-6-briefing.md @@ -0,0 +1,545 @@ +# Phase 6: Observability & Monitoring — Briefing for Claude Code + +## How to Use This File + +You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for **Phase 6** of the AI Genius transformation. Phases 1–5 are complete and merged to main. This is the final phase. + +**Reference repo**: `~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ` — clone of Daniel Messler's PAI. Read files there for implementation patterns. Adapt, don't copy. + +**Safety rule**: After EVERY sub-step (6A through 6D), commit to git with a descriptive message. If anything breaks, `git revert` that commit. + +--- + +## PART 1: CURRENT STATE AFTER PHASE 5 + +### What Exists + +The full system is functional: modular architecture, 7 hooks, learning engine, 4 advanced skills, The Algorithm with ISC-based execution. + +What's missing: **visibility**. When the system runs a THOROUGH Algorithm execution with research + council + multiple parallel agents, there's no way to see what happened, what tools were called, how long things took, or where time was spent. You're flying blind on complex tasks. + +### What Phase 6 Adds + +1. **Event logging** — every hook event, tool call, and agent action logged to structured JSONL files +2. **PostToolUse capture hook** — records what every tool call actually did +3. 
**Enhanced session summary** — richer end-of-session reports using logged events +4. **Optional dashboard** — real-time web UI showing system activity (Bun HTTP + WebSocket) + +### Design Decision: Lightweight vs Full PAI + +PAI's observability is a full Vue 3 dashboard with WebSocket streaming, agent swim lanes, live pulse charts, and a Bun HTTP server. That's powerful but complex to maintain. + +This vault's approach: **JSONL logging first, dashboard optional**. The JSONL files alone are useful — grep-able, readable, and feed into the weekly synthesis (Phase 3). The dashboard is a nice-to-have for debugging complex Algorithm executions. + +--- + +## PART 2: PAI REFERENCE FILES + +### Observability Server +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/server/src/index.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/server/src/file-ingest.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/server/src/types.ts +``` + +### Event Capture Hook +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/AgentOutputCapture.hook.ts +``` + +### Observability Library +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/observability.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/metadata-extraction.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/TraceEmitter.ts +``` + +### Settings Hook Config +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/config/settings-hooks.json 
+``` + +### Dashboard Frontend (optional — only if building 6D) +``` +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/client/src/App.vue +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/client/src/composables/useWebSocket.ts +~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/client/src/composables/useEventColors.ts +``` + +--- + +## PART 3: IMPLEMENTATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Existing hooks, skills, policies, memory structure +- Vault folder structure, tools, plugins +- CLAUDE.md (no changes needed) + +**DESIGN PRINCIPLES**: +- Logging hooks must NEVER block Claude Code (exit 0 always, fast execution) +- Fire-and-forget: if logging fails, work continues unaffected +- JSONL for structured data (one JSON object per line, append-only) +- The system must work identically with or without the dashboard running + +--- + +### Step 6A: Event Logger Library + +**What it does**: Shared utility that all hooks use to log events. Writes structured JSONL to `AI/memory/events/`. This is the foundation for all observability. + +**Implementation**: + +``` +□ Read PAI references: + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/observability.ts + - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/lib/TraceEmitter.ts + +□ Create directory: AI/memory/events/ + +□ Create AI/hooks/lib/event-logger.ts: + + A shared utility with one main function: + + logEvent(event: { + hook_type: string; // "PreToolUse", "PostToolUse", "UserPromptSubmit", "Stop", "SubagentStop" + tool_name?: string; // "Bash", "Read", "Edit", "Write", "Task", etc. 
+      tool_input_summary?: string;   // Condensed description of what the tool was asked to do
+      tool_result_summary?: string;  // Condensed result (for PostToolUse only)
+      session_id?: string;           // From stdin JSON if available
+      agent_type?: string;           // "main", "researcher", "council-strategist", etc.
+      effort_level?: string;         // If Algorithm is active
+      algorithm_phase?: string;      // If Algorithm is active
+      isc_row?: number;              // If working on specific ISC row
+      duration_ms?: number;          // For PostToolUse: how long the tool call took
+      metadata?: Record<string, unknown>;  // Any additional context
+    })
+
+    Implementation:
+    1. Add timestamp (ISO format) to the event
+    2. Determine log file: AI/memory/events/YYYY-MM-DD.jsonl
+    3. Append one JSON line to the file
+    4. MUST: never throw, never block, wrap in try/catch
+    5. MUST: fast — just a file append, no processing
+
+    Example JSONL line:
+    {"timestamp":"2026-02-04T20:15:30.000Z","hook_type":"PostToolUse","tool_name":"Bash","tool_input_summary":"git status","duration_ms":245,"session_id":"abc123"}
+
+    Also export:
+    - readTodayEvents(): Read today's JSONL file, return parsed array
+    - readEventsForDateRange(start, end): Read multiple days
+    - getEventStats(events): Count by tool_name, by hook_type, total duration
+
+□ Test:
+  → Import and call logEvent() from a test script
+  → Verify: AI/memory/events/YYYY-MM-DD.jsonl created with valid JSONL
+  → Verify: multiple calls append correctly (not overwrite)
+  → Verify: malformed input doesn't crash (silently skipped)
+
+□ Git commit: "add: event logger library for structured JSONL observability"
+```
+
+---
+
+### Step 6B: Event Capture Hooks (PostToolUse + Enhanced Existing)
+
+**What it does**: Adds a PostToolUse hook that logs every tool call result. Enhances existing hooks to also log their activity via the event logger.
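
As a hedged sketch of what this pair could look like (paths from this plan; the stdin payload shape and helper names are assumptions), using Node's `fs` for the append:

```typescript
// Sketch of AI/hooks/event-capture.ts plus the Step 6A append helper.
// Field names follow the logEvent interface above; stdin shape is assumed.
import { appendFileSync, mkdirSync } from "node:fs";
import { join } from "node:path";

// Pure: turn a PostToolUse stdin payload into a loggable event
function toEvent(input: { tool_name?: string; tool_result?: string }) {
  const result = (input.tool_result ?? "").trim();
  return {
    timestamp: new Date().toISOString(),
    hook_type: "PostToolUse",
    tool_name: input.tool_name ?? "unknown",
    tool_result_summary: result.slice(0, 200) || "success", // condense, per spec
  };
}

// Append-only JSONL write; must never throw or block Claude Code
function logEvent(event: object, dir = "AI/memory/events"): void {
  try {
    mkdirSync(dir, { recursive: true });
    const file = join(dir, `${new Date().toISOString().slice(0, 10)}.jsonl`);
    appendFileSync(file, JSON.stringify(event) + "\n");
  } catch {
    // fire-and-forget: a logging failure must not affect the session
  }
}
```

The actual hook would read the JSON payload from stdin, call `logEvent(toEvent(payload))`, and always exit 0 with no stdout output.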
+
+**Implementation**:
+
+```
+□ Read PAI references:
+  - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-hook-system/src/hooks/AgentOutputCapture.hook.ts
+  - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/config/settings-hooks.json
+
+□ Create AI/hooks/event-capture.ts:
+  → PostToolUse hook
+  → Reads JSON from stdin: { tool_name, tool_use_id, tool_result }
+  → Logs via event-logger.ts:
+    - hook_type: "PostToolUse"
+    - tool_name: from stdin
+    - tool_result_summary: condensed (first 200 chars of tool_result, or "success"/"error")
+    - duration_ms: not available from PostToolUse directly — omit or estimate
+  → For Task tool (subagent) completions:
+    - Also log agent_type from tool input
+    - Log a summary of the agent's output (first 300 chars)
+  → MUST: exit 0 always, fast execution, no stdout output
+    (this hook is for logging only, not content injection)
+
+□ Enhance existing hooks to log their activity:
+  Add a single logEvent() call to each existing hook:
+
+  - security-validator.ts: log every decision (allow/block/confirm) with tool_name and pattern matched
+  - on-feedback.ts: log when explicit rating is captured (rating value, has comment)
+  - implicit-sentiment.ts: log when implicit signal detected (sentiment, confidence)
+  - auto-work-creation.ts: log when work directory created or prompt count updated
+  - on-session-start.ts: log when context injection fires (first prompt only)
+  - on-session-end.ts: log when session ends (work completed, duration)
+  - format-enforcer.ts: log that format reminder was injected (just a ping, no content)
+
+  Each addition is ONE LINE: import logEvent, call it with relevant data.
+  Do NOT restructure the hooks. Just add the logging call.
+ +□ Register event-capture.ts in .claude/settings.json: + Add to existing config: + "PostToolUse": [ + { + "matcher": "", + "hooks": [ + {"type": "command", "command": "bun run /Users/krzysztofgoworek/LifeOS/AI/hooks/event-capture.ts"} + ] + } + ] + +□ Updated .claude/settings.json should now have: + - PreToolUse: security-validator + - UserPromptSubmit: on-session-start, on-feedback, format-enforcer, + auto-work-creation, implicit-sentiment + - PostToolUse: event-capture (NEW) + - Stop: on-session-end + +□ Test: + → Start a session, do some work (read files, run commands, edit) + → Verify: AI/memory/events/YYYY-MM-DD.jsonl has entries for each tool call + → Verify: security decisions are logged + → Verify: rating events are logged + → Verify: session start/end events are logged + → Verify: no impact on Claude Code performance (hooks still fast) + +□ Git commit: "add: event capture hook and observability logging across all hooks" +``` + +--- + +### Step 6C: Session Activity Report + +**What it does**: Enhances the session end processing to produce a brief activity report using logged events. Shows what happened in the session: tools used, effort level, ISC status, time spent. + +**Implementation**: + +``` +□ Enhance AI/hooks/on-session-end.ts: + → After existing session-end logic (work completion, state cleanup): + → Import readTodayEvents() and getEventStats() from event-logger.ts + → Read today's events, filter to current session_id + → Compute: + - Total tool calls (by type: Read, Bash, Edit, Write, Task) + - Security events (any blocks or confirmations?) + - Ratings captured (explicit + implicit) + - Session duration (first event to last event) + - Algorithm usage (was ISC created? How many rows? Final status?) 
+ → Write summary to the session's work directory: + AI/memory/work/{session-dir}/activity-report.md + + # Session Activity Report + **Date**: YYYY-MM-DD HH:MM - HH:MM + **Duration**: N minutes + **Effort**: [level] + + ## Tool Usage + | Tool | Calls | + |---|---| + | Read | 15 | + | Bash | 8 | + | Edit | 5 | + | Task | 3 | + | Write | 2 | + + ## Security Events + - N tool calls checked, N blocked, N confirmed + + ## Ratings + - Explicit: N (avg: X.X) + - Implicit: N (avg: X.X) + + ## Algorithm (if used) + - ISC: [summary] — N rows, N done, N blocked + - Iterations: N + + → This report is readable in Obsidian and feeds into weekly synthesis (Phase 3) + → Remember: Stop hook stdout is NOT displayed — this writes to file only + +□ Enhance AI/scripts/learning-synthesis.sh (Phase 3): + → Add to the synthesis prompt: + "Also read AI/memory/events/ for the past 7 days. + Compute: total tool calls, most-used tools, average session duration, + security events, and include in the Weekly Summary section." + +□ Test: + → Run a session with varied activity + → Exit session + → Verify: AI/memory/work/{session}/activity-report.md exists + → Verify: report has accurate tool call counts + → Verify: readable in Obsidian + +□ Git commit: "add: session activity reports from event logs" +``` + +--- + +### Step 6D: Real-Time Dashboard (OPTIONAL) + +**What it does**: A lightweight web dashboard that streams events in real-time via WebSocket. Shows what the system is doing right now. Useful for monitoring complex Algorithm executions with parallel agents. + +**This step is OPTIONAL.** Steps 6A-6C provide full observability via JSONL files and activity reports. The dashboard adds a live visual interface on top. + +**Skip this step if**: you're comfortable reading JSONL files and activity reports. Come back to it later if you find yourself wanting real-time visibility. 
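
Streaming appended JSONL in real time means the server should broadcast only the lines added since its last read, tracking a position per file. A pure sketch of that incremental read (function name is an assumption):

```typescript
// Sketch: given the current file content and the previously seen offset,
// return only the complete new JSONL lines and the advanced offset.
function tailNewLines(
  content: string,
  offset: number
): { lines: string[]; offset: number } {
  const fresh = content.slice(offset);
  const lastNewline = fresh.lastIndexOf("\n");
  if (lastNewline === -1) return { lines: [], offset }; // no complete line yet
  const lines = fresh
    .slice(0, lastNewline)
    .split("\n")
    .filter((line) => line.trim().length > 0); // drop blank lines
  return { lines, offset: offset + lastNewline + 1 };
}
```

On each file-change event the server would call this with the re-read file content, then publish each returned line to connected WebSocket clients; partial trailing lines (a write caught mid-append) are held back until the newline arrives.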
+
+**Implementation**:
+
+```
+□ Read PAI references:
+  - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/server/src/index.ts
+  - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/server/src/file-ingest.ts
+  - ~/Daniel-Messler-personal_ai_infrastructure-claude-ai-genius-system-design-GcvOQ/Packs/pai-observability-server/src/Observability/apps/client/src/App.vue
+
+□ Create AI/observability/ directory
+
+□ Create AI/observability/server.ts:
+  A Bun HTTP + WebSocket server. Simplified version of PAI's:
+
+  HTTP endpoints:
+  - GET /events/recent — last 50 events from today's JSONL
+  - GET /events/today — all events from today
+  - GET /health — server status
+  - POST /events — real-time ingest (target of the event logger's fire-and-forget POST, below)
+
+  WebSocket endpoint:
+  - /stream — real-time event streaming
+
+  File watching:
+  - Watch AI/memory/events/ directory for changes
+  - When JSONL file is appended: read new lines, broadcast to WebSocket clients
+  - Track file position to only read new content
+
+  Server config:
+  - Port: 4200 (avoid conflicts with other services)
+  - Host: localhost (not exposed externally)
+  - No authentication needed (local only)
+
+  Keep it simple:
+  - No database (read JSONL files directly)
+  - No persistence (events already in JSONL)
+  - No themes (minimal UI)
+  - Stateless restarts (re-read from files on start)
+
+□ Create AI/observability/dashboard.html:
+  A SINGLE HTML file with embedded CSS and JavaScript. No build step.
+  No Vue, no React — plain HTML + vanilla JS + WebSocket.
+ + Features: + - WebSocket connection to ws://localhost:4200/stream + - Auto-reconnect on disconnect (3 second delay) + - Event list (newest first, max 200 displayed) + - Each event shows: timestamp, hook_type, tool_name, summary + - Color coding by hook_type: + - PreToolUse: yellow + - PostToolUse: orange + - UserPromptSubmit: cyan + - Stop: red + - SubagentStop: purple + - Filter buttons: All / Tools / Prompts / Security / Ratings + - Simple stats bar: events today, active sessions, last event time + - Auto-scroll to newest event (toggle on/off) + + Style: dark background (#1a1b26), monospace font, minimal. + The whole thing should be < 500 lines of HTML/CSS/JS. + +□ Create AI/observability/start.sh: + #!/bin/bash + # Start the observability dashboard + cd "$(dirname "$0")" + echo "Starting observability server on http://localhost:4200" + bun run server.ts & + SERVER_PID=$! + echo "Server PID: $SERVER_PID" + echo "Dashboard: open AI/observability/dashboard.html in browser" + echo "Or: open http://localhost:4200/events/recent for raw events" + + Make executable: chmod +x AI/observability/start.sh + +□ Optionally create launchd plist to auto-start: + AI/scripts/plists/com.lifeos.observability.plist + → Start on login + → Keep alive + → Log to AI/scripts/logs/observability.log + +□ Update event-logger.ts (from 6A): + → After writing to JSONL, also attempt to POST to http://localhost:4200/events + → Fire-and-forget: if server is not running, silently fail (no retry) + → This enables real-time streaming when dashboard is active + +□ Test: + → Run: bash AI/observability/start.sh + → Open dashboard.html in browser + → Start a Claude Code session, do some work + → Verify: events appear in real-time on dashboard + → Verify: filter buttons work + → Stop the server → verify Claude Code still works normally (no impact) + +□ Git commit: "add: optional real-time observability dashboard" +``` + +--- + +## PART 4: MANUAL TESTING CHECKLIST + +``` +□ Test 1: Event Logging (6A + 
6B) + → Start session, read a file, run a command, edit a file + → Verify: AI/memory/events/YYYY-MM-DD.jsonl has entries + → Verify: each entry is valid JSON on one line + → Verify: entries include tool_name, hook_type, timestamp + +□ Test 2: Security Event Logging + → Trigger a security check (try a risky command) + → Verify: security decision logged in events JSONL + +□ Test 3: Rating Event Logging + → Give an explicit rating "7 - good" + → Verify: rating event logged in events JSONL + +□ Test 4: Session Activity Report (6C) + → Run a session with 10+ tool calls + → Exit session + → Verify: activity-report.md in work directory + → Verify: tool counts match actual usage + → Verify: readable in Obsidian + +□ Test 5: Event File Rotation + → Check that events are in date-stamped files (YYYY-MM-DD.jsonl) + → Verify: new day = new file (no unbounded growth) + +□ Test 6: Graceful Degradation + → Delete AI/memory/events/ directory + → Start a session → verify Claude Code works normally + → Verify: event logger recreates the directory silently + +□ Test 7: Dashboard (if 6D implemented) + → Start server: bash AI/observability/start.sh + → Open dashboard in browser + → Do work in Claude Code + → Verify: events stream in real-time + → Stop server → verify Claude Code unaffected + +□ Test 8: Synthesis Integration + → Accumulate a few days of event logs + → Run learning-synthesis.sh + → Verify: synthesis report includes tool usage stats +``` + +--- + +## PART 5: EVENT LOG STRUCTURE + +After Phase 6, the events directory: + +``` +AI/memory/events/ +├── 2026-02-04.jsonl # One file per day, auto-rotated +├── 2026-02-05.jsonl +└── ... 
+``` + +Each line is a JSON object: +```json +{"timestamp":"2026-02-04T20:15:30.000Z","hook_type":"PostToolUse","tool_name":"Read","tool_input_summary":"AI/skills/research.md","session_id":"abc123","agent_type":"main"} +{"timestamp":"2026-02-04T20:15:31.500Z","hook_type":"PostToolUse","tool_name":"Bash","tool_input_summary":"git status","session_id":"abc123","agent_type":"main","duration_ms":245} +{"timestamp":"2026-02-04T20:15:35.000Z","hook_type":"PostToolUse","tool_name":"Task","tool_input_summary":"research agent spawned","session_id":"abc123","agent_type":"researcher"} +``` + +Benefits: +- **Grep-able**: `grep '"tool_name":"Bash"' AI/memory/events/2026-02-04.jsonl | wc -l` → bash call count +- **Readable in Obsidian**: JSONL files open as text +- **Date-partitioned**: no unbounded growth, easy to archive old files +- **Feeds into synthesis**: weekly learning script reads events for tool usage stats +- **Feeds into activity reports**: session end hook summarises per-session activity + +--- + +## PART 6: COMPLETE SYSTEM OVERVIEW + +With Phase 6 complete, here is the full LifeOS AI Genius architecture: + +``` +┌─────────────────────────────────────────────────────────┐ +│ CLAUDE.md │ +│ (Slim router, ~1,700 tokens) │ +│ Identity → Effort Classification → Skill Routing │ +└──────────────────────┬──────────────────────────────────┘ + │ + ┌─────────────┼─────────────────┐ + ▼ ▼ ▼ + ┌──────────┐ ┌──────────┐ ┌──────────────┐ + │ Skills │ │ Policies │ │ The Algorithm │ + │ (28+) │ │ (10+) │ │ (Phase 5) │ + └────┬─────┘ └──────────┘ │ OBSERVE→LEARN│ + │ │ ISC Table │ + │ ┌─────────────────────┘ + ▼ ▼ + ┌──────────────┐ ┌──────────────┐ + │ Context Maps │ │ Hooks │ + │ (9+) │ │ (8) │ + └──────────────┘ └──────┬───────┘ + │ + ┌────────────────┼────────────────┐ + ▼ ▼ ▼ + ┌──────────────┐ ┌──────────────┐ ┌────────────┐ + │ Memory │ │ Signals │ │ Events │ + │ work/ │ │ ratings.jsonl│ │ YYYY-MM-DD │ + │ learnings/ │ │ (explicit + │ │ .jsonl │ + │ proposals/ │ │ implicit) │ 
│ (Phase 6) │ + │ research/ │ │ │ │ │ + │ telos/ │ │ │ │ │ + └──────┬────────┘ └──────┬───────┘ └─────┬──────┘ + │ │ │ + └─────────────────┼────────────────┘ + ▼ + ┌─────────────────────┐ + │ Weekly Synthesis │ + │ (Phase 3) │ + │ → Proposals │ + │ → Skill Improvement│ + │ → Preference Promo │ + └─────────────────────┘ +``` + +### All Hooks (Final State) + +| Hook | Event Type | Purpose | +|---|---|---| +| on-session-start.ts | UserPromptSubmit | Context injection (first prompt only) | +| on-feedback.ts | UserPromptSubmit | Explicit rating capture (1-10) | +| format-enforcer.ts | UserPromptSubmit | Format reminder injection | +| auto-work-creation.ts | UserPromptSubmit | Automatic session tracking | +| implicit-sentiment.ts | UserPromptSubmit | Passive frustration/satisfaction detection | +| security-validator.ts | PreToolUse | Security pattern checking | +| event-capture.ts | PostToolUse | Tool call logging (Phase 6) | +| on-session-end.ts | Stop | Work completion + activity report | + +### All Scripts + +| Script | Schedule | Purpose | +|---|---|---| +| daily-brief.sh | 8:00 AM daily | Morning briefing with WIP + proposals reminder | +| deep-work-block.sh | 8:30 AM weekdays | Calendar analysis, deep work setup | +| learning-synthesis.sh | 9:00 AM Sunday | Weekly signal analysis + proposals | +| weekly-review.sh | 10:00 AM Sunday | GTD review + AI system health | +| vault-cleanup.sh | 11:00 AM 1st Saturday | Vault maintenance | +| daily-close.sh | 10:00 PM daily | End-of-day processing + WIP update | + +### The Feedback Loop + +``` +User works with Claude + → Hooks capture: ratings, sentiment, tool usage, security events + → Memory accumulates: preferences, mistakes, execution patterns + → Weekly synthesis: analyses signals, finds patterns + → Proposals generated: skill improvements, preference promotions, new rules + → User reviews: approve / reject + → Skills & policies improve + → Next week: better performance + → Repeat +``` + +This is a self-improving system. 
Every interaction makes it slightly better. Phase 6 adds the visibility layer so you can see the improvement happening. diff --git a/Plans/transformation-briefing.md b/Plans/transformation-briefing.md new file mode 100644 index 000000000..b140ae989 --- /dev/null +++ b/Plans/transformation-briefing.md @@ -0,0 +1,690 @@ +# LifeOS → Smart Modular Architecture: Briefing for Claude Code + +## How to Use This File + +You are a Claude Code instance working in the LifeOS Obsidian vault. This file is your complete briefing for the transformation project. It contains: + +1. **Your current state** — what exists in this vault right now +2. **PAI patterns to adopt** — reference implementation at `~/PAI-reference/` +3. **The transformation plan** — step-by-step, safe, rollback-friendly +4. **Constraints** — what you must NOT break + +**Reference repo**: The PAI (Personal AI Infrastructure) repo should be cloned at `~/PAI-reference/`. Read files there for implementation patterns. Do NOT copy files verbatim — adapt patterns to fit this vault's existing structure and conventions. + +**Safety rule**: After EVERY step, commit to git with a descriptive message. If anything breaks, `git revert` that commit. CLAUDE.md stays untouched until Step 6. + +--- + +## PART 1: CURRENT STATE (Audit Summary) + +### 1.1 Primary Instruction File + +`CLAUDE.md` at vault root. 58,119 bytes (~14,500 tokens). Monolithic. Contains: +- Vault overview, folder structure, frontmatter schemas +- Tool documentation (eventkit-cli, mail-cli) +- Identity & core directives (thinking partner, not yes-man) +- Provocation Protocol (counterarguments, draft critique, metacognitive close) +- AI Context Log management rules +- 18 specialized modes (6.1–6.18) +- Council of Experts method +- Emergency protocols +- Proactivity protocol +- Note linking & vault graph building rules + +This file is auto-loaded by Claude Code at every session start. 
+ +### 1.2 Supplementary Context Files (read on demand) + +| File | Purpose | +|---|---| +| `AI/AI Context Log.md` | Current situation, priorities, pipeline — in Polish | +| `AI/My AgentOS/CV Krzysztof Goworek.md` | Professional profile | +| `Areas/Health/Health OS.md` | Health protocols, pharmacotherapy, fitness | +| `AI/My AgentOS/Deep Profile & Operating Manual.md` | Psychological profile, strengths, interaction rules | +| `AI/GPR to Reminders Mapping.md` | Project ↔ Reminders list mappings | + +### 1.3 The 18 Modes (to become 18 skills) + +| # | Mode | Trigger Keywords | +|---|---|---| +| 6.1 | AI Equilibrium Editor | newsletter, AI Equilibrium, content | +| 6.2 | Translator EN→PL | translation requests | +| 6.2b | Editing & Rewriting | editing existing prose | +| 6.3 | Zen Jaskiniowca | life advice, motivation, existential | +| 6.4 | Business Advisor | revenue, strategy, career | +| 6.5 | Strategy Advisor | negotiation, power dynamics | +| 6.6 | Productivity Advisor | time management, prioritisation | +| 6.7 | Vault Janitor | vault organisation, cleanup | +| 6.8 | Health & Fitness | diet, training, supplements, TRT | +| 6.9 | Communication & Writing | emails, LinkedIn, pitches | +| 6.10 | Network Management | relationships, follow-ups | +| 6.11 | Learning & Knowledge | Readwise, frameworks, research | +| 6.12 | Technical Architecture | tech stack, tools, automation | +| 6.13 | Legal Advisor | contracts, legal questions | +| 6.14 | Financial Advisor | portfolio, allocation | +| 6.15 | General Advisor | fallback mode | +| 6.16 | Weekly Review (GTD) | "start weekly review" | +| 6.17 | Daily Shutdown | "end of day", 18:00+ | +| 6.18 | Meeting Notes Processing | "process meetings" | + +### 1.4 Existing Automation + +5 launchd scripts in `AI/scripts/`: +| Script | Schedule | What it does | +|---|---|---| +| `daily-brief.sh` | 8:00 AM daily | Morning briefing via Claude CLI | +| `deep-work-block.sh` | 8:30 AM weekdays | Calendar analysis, deep work setup | +| 
`weekly-review.sh` | 10:00 AM Sunday | 7-phase GTD review | +| `vault-cleanup.sh` | 11:00 AM 1st Saturday | Vault maintenance | +| `daily-close.sh` | 10:00 PM daily | End-of-day processing | + +All invoke `claude -p --permission-mode bypassPermissions --model sonnet`. +Notifications via Discord webhook + Telegram bot fallback (`notify-utils.sh`). + +### 1.5 Tools & Integrations + +**Custom CLI tools** (Swift, in `Tools/`): +- `eventkit-cli` — Apple Calendar & Reminders +- `mail-cli` — Apple Mail + +**Remote control**: `afk-code` (Node.js, `~/afk-code`) — Discord/Telegram → Claude Code relay + +**6 MCP servers**: Gemini, Google Workspace, Brave Search, Firecrawl, Apple MCP (overridden by eventkit-cli), Perplexity + +**Obsidian plugins** (integration-relevant): obsidian-git, dataview, readwise-official, obsidian-claude-chat, templater-obsidian, auto-template-trigger, quickadd, calendar, obsidian-advanced-uri + +### 1.6 Vault Structure + +``` +LifeOS/ +├── AI/ # AgentOS config, prompts, skills, scripts, context log +│ ├── My AgentOS/ # 22 files: CV, deep profile, brand context, cue cards +│ ├── Prompts/ # 18 prompt templates (AIEQ pipeline, Quintant, meta) +│ ├── scripts/ # 5 automation scripts + utilities + launchd plists +│ └── skills/ # 5 existing skill build specs + eventkit-cli docs +├── Archive/ # Completed GPR projects, archived tasks +├── Areas/ # Companies, People, Investing, Health, Life Areas +├── Content/ # AI Equilibrium newsletter, ideas, video, landing pages +├── Notes/ # Daily Notes, Journal, Meeting Notes, Inbox, Topics +├── Projects/ # Active GPR projects, Deals, Proposals +├── Readwise/ # Synced highlights +├── Resources/ # Processes, coaching material +├── Tools/ # eventkit-cli, mail-cli source code +├── Tracking/ # Habits, Objectives, Key Results, Years +├── _System/ # Dashboard, 24 .base database files, Templates, Media +└── CLAUDE.md # Primary instruction file (58KB) +``` + +24 active `.base` database files in `_System/Databases/`. 
+ +### 1.7 Critical Gaps (from audit) + +1. **No cross-session memory** — every session starts from zero +2. **No feedback/improvement loop** — system is static until manually edited +3. **Monolithic instruction file** — 58KB, no conditional loading +4. **No event-driven hooks** — everything is time-scheduled or manual +5. **No work-in-progress persistence** — multi-session tasks have no handoff +6. **No rule priority/conflict resolution** — left to AI judgement +7. **No context budget management** — no selective loading +8. **Proactivity learning resets** every session (not persistent) + +--- + +## PART 2: PAI PATTERNS TO ADOPT + +Reference: `~/PAI-reference/` (clone of https://github.com/danielmiessler/PAI) + +Read these directories for implementation patterns. Adapt, don't copy. + +### 2.1 Skill System + +**Read**: `~/PAI-reference/Packs/pai-core-install/skills/CORE/` + +PAI skill structure: +```yaml +--- +name: SkillName # TitleCase +description: What it does. USE WHEN [triggers]. [Capabilities]. 
+--- + +# SkillName + +## Voice Notification +[What to say when executing] + +## Customization +Check USER/SKILLCUSTOMIZATIONS/{SkillName}/PREFERENCES.md + +## Workflows +- WorkflowName: Description + +## Tools +- ToolName.ts: Does X +``` + +**Adapt for this vault**: Use the same YAML frontmatter pattern, but: +- Add `triggers:` array (keyword matching for routing) +- Add `context_files:` array (which context maps to load) +- Add `policies:` array (which policy files apply) +- Add `voice:` (tone/style for this mode) +- Keep the mode content from current CLAUDE.md — don't rewrite it + +### 2.2 Memory System + +**Read**: `~/PAI-reference/Packs/pai-core-install/skills/CORE/SYSTEM/MEMORYSYSTEM.md` + +PAI memory structure: +``` +MEMORY/ +├── WORK/ # Active work tracking +├── LEARNING/ # By category: SYSTEM, ALGORITHM, SYNTHESIS, SIGNALS +├── RESEARCH/ # Agent outputs +├── SECURITY/ # Audit trail +└── STATE/ # Runtime state (ephemeral) +``` + +**Adapt for this vault**: Simpler version in `AI/memory/`: +``` +memory/ +├── work/current.md # WIP tracking (what, stage, next step) +├── learnings/ +│ ├── execution.md # How to approach tasks +│ ├── preferences.md # User style, tone, habits +│ └── mistakes.md # Errors to not repeat +├── signals/ratings.jsonl # Timestamped quality ratings +└── context-log.md # Moved from AI/AI Context Log.md +``` + +### 2.3 Hook System + +**Read**: `~/PAI-reference/Packs/pai-hook-system/hooks/` + +PAI hooks are TypeScript files registered in Claude Code's `settings.json`. 
Key patterns: +- `SessionStart` hooks — inject context at session start +- `Stop` hooks — capture learnings at session end +- `UserPromptSubmit` hooks — process user input (e.g., rating capture) +- All hooks: read JSON from stdin, write JSON to stdout, exit 0 (never block) +- Fire-and-forget design — never crash Claude Code + +**Key files to study**: +- `~/PAI-reference/Packs/pai-hook-system/hooks/SessionStart/LoadContext.hook.ts` +- `~/PAI-reference/Packs/pai-hook-system/hooks/Stop/SessionSummary.hook.ts` +- `~/PAI-reference/Packs/pai-hook-system/hooks/Stop/WorkCompletionLearning.hook.ts` +- `~/PAI-reference/Packs/pai-hook-system/hooks/UserPromptSubmit/ExplicitRatingCapture.hook.ts` +- `~/PAI-reference/Packs/pai-hook-system/hooks/lib/` (shared utilities) + +**Adapt for this vault**: Create hooks in `AI/hooks/`: +- `on-session-start.ts` — read WIP state, context log, recent learnings +- `on-session-end.ts` — update WIP, capture learnings +- `on-feedback.ts` — detect ratings (1-10 pattern), store in signals +- Register in `~/.claude/settings.json` or project-level `.claude/settings.json` + +### 2.4 CORE Skill (Routing Layer) + +**Read**: `~/PAI-reference/Packs/pai-core-install/skills/CORE/SKILL.md` + +PAI's CORE skill is the "operating system" — auto-loaded, routes to other skills. + +**Adapt for this vault**: The slim CLAUDE.md becomes this. It should contain: +1. Identity (who the AI is, who the user is) +2. Universal rules (apply to ALL skills) +3. Skill routing instruction ("read the matching skill file from AI/skills/") +4. Memory instruction ("read WIP and context at session start, update at end") +5. Vault structure overview (condensed) +6. Tool reference (eventkit-cli, mail-cli essentials) + +Target: ~4,000 tokens vs current 14,500. 
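The hook contract in 2.3 above (read JSON from stdin, write JSON to stdout, exit 0, never block) can be sketched as a pure core plus a thin I/O shell. A sketch only: the stdin payload shape (`{ prompt }`) and the output format are assumptions to verify against the PAI reference hooks before implementing.

```typescript
// Sketch of AI/hooks/on-feedback.ts. The real hook would read stdin,
// call handleHook, print the result, and exit 0 unconditionally.
// The payload shape ({ prompt }) is an assumption; check the PAI
// reference hooks for the exact fields Claude Code sends.

// PAI's documented rating pattern: 1-10, optional "-"/":" separator, comment.
const RATING_RE = /^(10|[1-9])(?:\s*[-:]\s*|\s+)?(.*)$/;

type Rating = { score: number; comment: string };

function parseRating(prompt: string): Rating | null {
  const m = prompt.trim().match(RATING_RE);
  if (!m) return null;
  return { score: Number(m[1]), comment: m[2].trim() };
}

// Pure core: takes the raw stdin payload, returns the stdout payload and,
// when a rating was detected, the JSONL line to append to signals/ratings.jsonl.
function handleHook(stdin: string): { stdout: string; jsonl?: string } {
  try {
    const { prompt = "" } = JSON.parse(stdin) as { prompt?: string };
    const rating = parseRating(prompt);
    if (rating) {
      const entry = { ts: new Date().toISOString(), ...rating };
      return { stdout: JSON.stringify({}), jsonl: JSON.stringify(entry) };
    }
  } catch {
    // Fire-and-forget: malformed input must never crash or block Claude Code.
  }
  return { stdout: JSON.stringify({}) };
}
```

Keeping the logic pure (no file or process I/O in `handleHook`) makes the hook testable without a running Claude Code session; the shell around it stays a few lines.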
+ +### 2.5 Learning & Feedback Capture + +**Read**: `~/PAI-reference/Packs/pai-hook-system/hooks/UserPromptSubmit/ExplicitRatingCapture.hook.ts` +**Read**: `~/PAI-reference/Packs/pai-hook-system/hooks/Stop/WorkCompletionLearning.hook.ts` + +PAI captures: +- Explicit ratings: regex pattern `^(10|[1-9])(?:\s*[-:]\s*|\s+)?(.*)$` +- Work completion learnings: at session end, captures what worked/didn't +- Implicit sentiment: detects frustration in prompts + +**Adapt**: Same patterns, writing to `AI/memory/signals/` and `AI/memory/learnings/`. + +### 2.6 Two-Tier Configuration (SYSTEM/USER) + +**Read**: `~/PAI-reference/Packs/pai-core-install/skills/CORE/SYSTEM/` and `~/PAI-reference/Packs/pai-core-install/skills/CORE/USER/` + +PAI separates system-authored config (SYSTEM/) from user-authored overrides (USER/). System files update with PAI releases; USER files never get overwritten. + +**Adapt**: Not directly needed for this vault (no external updates to protect against), but the principle applies: keep policies as standalone files so skills can be modified independently. + +--- + +## PART 3: TRANSFORMATION STEPS + +### Constraints + +**DO NOT TOUCH**: +- Vault folder structure (Areas/, Projects/, Notes/, Content/, etc.) 
+- All 24 .base database files and their frontmatter schemas +- Daily Notes, Meeting Notes, Journal structure +- eventkit-cli and mail-cli (Tools/ directory) +- afk-code remote control system +- Obsidian plugins and their configs +- Git workflow and .gitignore +- Discord/Telegram webhook setup +- Apple Reminders as primary task system +- GPR project tracking +- CLAUDE.md (until Step 6) + +**ALWAYS**: +- Git commit after every step with descriptive message +- Verify the system works after each step +- Preserve ALL content from the 18 modes — extract, don't rewrite +- Keep everything in the AI/ directory (don't create new top-level folders) + +### Step 0: Prepare + +``` +□ Ensure git status is clean, all committed +□ Create branch: git checkout -b ai-genius-transformation +□ Read CLAUDE.md and document section boundaries (line ranges for each mode, each policy section) +□ Verify all 5 launchd scripts work: check AI/scripts/*.log for recent successful runs +□ Verify eventkit-cli works: run `eventkit-cli reminders list` +□ Verify mail-cli works: run `mail-cli inbox --limit 1` +□ Git commit: "docs: document section boundaries in CLAUDE.md for extraction" +``` + +### Step 1: Create Directory Structure + +**Risk: ZERO** + +``` +□ Create directories: + AI/skills/ + AI/context/ + AI/policies/ + AI/memory/ + AI/memory/learnings/ + AI/memory/signals/ + AI/memory/work/ + AI/hooks/ + AI/hooks/lib/ +□ Create AI/skills/_SKILL-TEMPLATE.md with: + --- + name: [Skill Name] + triggers: ["keyword1", "keyword2"] + context_files: + - AI/context/[domain].md + policies: + - [policy-name] + voice: "[tone description]" + --- + # [Skill Name] + ## When to Activate + ## Instructions + ## Workflows + ## Examples +□ Git commit: "scaffold: create modular AI architecture directories" +``` + +### Step 2: Extract Policies + +**Risk: LOW** — new files only, CLAUDE.md unchanged. + +Read CLAUDE.md. Find each of these sections. 
Extract to standalone files: + +``` +□ Provocation Protocol (Section 4) → AI/policies/provocation-protocol.md + Include: counterargument requirement, draft critique mode, metacognitive close, + provenance tagging, bypass commands ("just execute", "skip provocation") + +□ Council of Experts (Section 7) → AI/policies/council-of-experts.md + Include: the full method definition + +□ Proactivity Protocol → AI/policies/proactivity-protocol.md + Include: when to suggest, when to back off, the "3 nos" rule + +□ Note Linking Rules (Section 3) → AI/policies/linking-rules.md + Include: cross-linking mandates, bidirectional linking, daily note linking + +□ Security & Tool Boundaries → AI/policies/security-boundaries.md + Include: "NEVER use apple-mcp" rules, eventkit-cli mandate, legal disclaimers, + strategy advisor boundaries, all NEVER/ALWAYS rules that apply globally + +□ Emergency Protocols → AI/policies/emergency-protocols.md + Include: anxiety responses, victim mode detection + +□ Formatting & Style Rules → AI/policies/formatting-rules.md + Include: language matching (Polish/English), British English rule, + anti-slop rules, anti-corporatese, structured output rules, + no emojis rule, ADHD-aware formatting + +□ Git commit: "extract: 7 policy files from CLAUDE.md sections" +``` + +### Step 3: Extract Skills + +**Risk: LOW** — new files only, CLAUDE.md unchanged. + +For each mode, create a skill file. Read the PAI skill template at `~/PAI-reference/Packs/pai-core-install/skills/CORE/SKILL.md` for formatting reference. + +``` +□ For EACH of the 18 modes: + 1. Read the mode's full content from CLAUDE.md + 2. Create AI/skills/{skill-name}.md + 3. Add YAML frontmatter with: + - name, triggers, context_files, policies, voice + 4. Copy the FULL mode content (instructions, templates, examples, anti-patterns) + 5. Add policy references (which AI/policies/ files this skill needs) + 6. 
Add context map references (which AI/context/ files this skill needs) + +□ CRITICAL: Do NOT rewrite mode content. Extract verbatim. + The mode instructions were carefully crafted. Preserve them exactly. + Only add structure (frontmatter, headers) around them. + +□ VERIFY: For each skill file, compare against CLAUDE.md source section. + Nothing should be lost. Every rule, every example, every anti-pattern + must be in the skill file. + +□ Git commit: "extract: 18 skill files from CLAUDE.md modes" +``` + +### Step 4: Create Context Maps + +**Risk: LOW** — new files only. + +Each context map tells Claude which vault files matter for a domain. Read the vault structure, identify which folders/files are relevant to each domain. + +``` +□ Create AI/context/newsletter.md + Primary: Content/AI Equilibrium/, AI/Prompts/AIEQ*, _System/Databases/Newsletter Ideas.base + Secondary: Areas/Companies/ (business angles), Notes/Meeting Notes/ (last 7 days) + Related domains: content, business + +□ Create AI/context/business.md + Primary: Projects/GPR/, Areas/Companies/, Projects/Deals/, Projects/Proposals/ + Secondary: AI/GPR to Reminders Mapping.md, Tracking/Objectives/ + Related domains: strategy, network, financial + +□ Create AI/context/health.md + Primary: Areas/Health/, Areas/Health/Health OS.md + Secondary: Tracking/Habits/, Notes/Daily Notes/ (last 3 days for patterns) + Related domains: personal + +□ Create AI/context/content.md + Primary: Content/, _System/Databases/Content Calendar.base, _System/Databases/Short-form Video Ideas.base + Secondary: AI/Prompts/, Content/AI Equilibrium/ (for cross-promotion) + Related domains: newsletter, business + +□ Create AI/context/investing.md + Primary: Areas/Investing/, _System/Databases/Investing Holdings.base + Secondary: none + Related domains: financial + +□ Create AI/context/network.md + Primary: Areas/People/, _System/Databases/People.base + Secondary: Notes/Meeting Notes/ (last 14 days), Projects/Deals/ + Related domains: 
business + +□ Create AI/context/personal.md + Primary: [minimal — advisory only] + Flag: advisory_only: true — do not proactively analyze personal notes + Related domains: health + +□ Git commit: "create: 7 domain context maps for selective loading" +``` + +### Step 5: Bootstrap Memory System + +**Risk: LOW** — mostly new files. One file move (AI Context Log). + +``` +□ Create AI/memory/context-log.md + → Copy FULL content from AI/AI Context Log.md + → In the OLD file (AI/AI Context Log.md), replace content with: + "This file has moved to AI/memory/context-log.md. + All scripts and references should use the new location." + → Update any references in CLAUDE.md to point to new location + → Check AI/scripts/*.sh for references to the old path — update them + +□ Create AI/memory/work/current.md with template: + # Current Work in Progress + _Updated by hooks at session end. Read by hooks at session start._ + ## Active Items + None — fresh start. + ## Last Session + No previous session recorded. + ## Next Steps + None pending. + +□ Create AI/memory/learnings/execution.md with template: + # Execution Learnings + _How to approach tasks better. Updated when patterns are noticed._ + +□ Create AI/memory/learnings/preferences.md + → Seed with KNOWN preferences extracted from CLAUDE.md: + - Language preferences (Polish/English matching, British English) + - Anti-slop rules + - Formatting preferences + - Tool preferences (eventkit-cli over apple-mcp) + - Any explicit "user prefers X" rules + +□ Create AI/memory/learnings/mistakes.md with template: + # Mistake Log + _Errors to avoid. Updated when corrections occur._ + +□ Create empty AI/memory/signals/ratings.jsonl + +□ Git commit: "bootstrap: memory system with initial state and migrated context log" +``` + +### Step 6: Slim Down CLAUDE.md + +**Risk: MEDIUM** — this changes the primary instruction file. 
+ +``` +□ FIRST: Rename current CLAUDE.md → CLAUDE-legacy.md + (safety net — instant rollback by renaming back) + +□ Create NEW CLAUDE.md containing ONLY (~4,000 tokens): + + 1. IDENTITY (condensed from Section 2) + - Who you are: expert assistant, thinking partner, not a yes-man + - Who the user is: Krzysztof, creator/strategist, ADHD-aware + - Voice: laconic, factual, structured, British English / natural Polish + - No emojis unless requested + + 2. UNIVERSAL RULES (apply to ALL skills) + - Read AI/memory/work/current.md at session start + - Read AI/memory/context-log.md at session start + - Before substantive work: read AI/My AgentOS/Deep Profile & Operating Manual.md + - At session end: update AI/memory/work/current.md + - Always read relevant policies referenced by the active skill + + 3. SKILL ROUTING + "Based on the user's request, identify the relevant skill from AI/skills/. + Read the matching skill file. Follow its instructions. + If the skill references context maps (context_files), read them. + If the skill references policies, read those policy files from AI/policies/. + If no skill matches, use AI/skills/general-advisor.md. + Inform the user which skill you selected." + + 4. VAULT STRUCTURE (condensed to ~500 tokens) + - Key folders and what's in them (one line each) + - Database location: _System/Databases/ + - Frontmatter: always include base: reference + + 5. TOOL REFERENCE (condensed) + - eventkit-cli: calendar and reminders operations (NEVER use apple-mcp) + - mail-cli: Apple Mail operations + - Full docs: reference the tool documentation files for details + + 6. 
GIT WORKFLOW (condensed) + - Commit conventions + - Branch rules + +□ TEST the new CLAUDE.md: + → Start a new Claude Code session + → Ask "help me draft a newsletter issue" — should route to ai-equilibrium-editor skill + → Ask "translate this to Polish" — should route to translator skill + → Ask "what's on my calendar today" — should use eventkit-cli + → Ask something that triggers provocation protocol — verify it still fires + → Verify formatting matches expectations (British English, structured, no slop) + +□ If tests pass: git commit "refactor: slim CLAUDE.md as modular skill router" +□ If tests fail: mv CLAUDE-legacy.md CLAUDE.md (instant rollback), debug + +□ Keep CLAUDE-legacy.md for 2 weeks minimum +``` + +### Step 7: Add Hooks + +**Risk: LOW** — additive, does not change existing behavior. + +Study PAI hook implementations first: +``` +Read: ~/PAI-reference/Packs/pai-hook-system/hooks/SessionStart/LoadContext.hook.ts +Read: ~/PAI-reference/Packs/pai-hook-system/hooks/Stop/SessionSummary.hook.ts +Read: ~/PAI-reference/Packs/pai-hook-system/hooks/UserPromptSubmit/ExplicitRatingCapture.hook.ts +Read: ~/PAI-reference/Packs/pai-hook-system/hooks/lib/ +``` + +Then create adapted versions: + +``` +□ Create AI/hooks/on-session-start.ts + → Reads AI/memory/work/current.md + → Reads last 50 lines of AI/memory/context-log.md + → Reads AI/memory/learnings/preferences.md (latest entries) + → Outputs summary as session context injection + → MUST: exit 0 always, never block, fast execution + +□ Create AI/hooks/on-session-end.ts + → Triggered on Stop event + → Prompts Claude to update AI/memory/work/current.md + → Prompts Claude to capture learnings if corrections occurred + → MUST: fire-and-forget, never block + +□ Create AI/hooks/on-feedback.ts + → Triggered on UserPromptSubmit + → Regex: detect rating pattern (1-10 with optional comment) + → Append to AI/memory/signals/ratings.jsonl + → If rating < 6: log to AI/memory/learnings/mistakes.md + → MUST: exit 0 always + +□ 
Create AI/hooks/lib/paths.ts + → Shared vault paths, memory paths + → File read/write utilities + +□ Register hooks in project .claude/settings.json + → Check PAI's settings.json for the registration format + +□ Test: start session (verify on-session-start fires), + give a rating (verify on-feedback captures it), + end session (verify on-session-end updates WIP) + +□ Git commit: "add: PAI-style hook system for memory and feedback" +``` + +### Step 8: Enhance Existing Automation + +**Risk: LOW** — minimal changes to working scripts. + +``` +□ Update AI/scripts/daily-brief.sh: + → Add to the Claude prompt: "Also read AI/memory/work/current.md + for any work in progress, and include WIP status in the briefing." + → Add: "Read AI/memory/learnings/mistakes.md for recent mistakes + to be aware of today." + +□ Update AI/scripts/daily-close.sh: + → Add to the Claude prompt: "Update AI/memory/work/current.md + with end-of-day status." + → Add: "Update AI/memory/context-log.md if today had significant changes." + +□ Update AI/scripts/weekly-review.sh: + → Add: "Read AI/memory/signals/ratings.jsonl for this week's + quality ratings. Summarize trends." + → Add: "Read AI/memory/learnings/ and synthesize patterns. + Propose any improvements to skill files or policies." + → Add: "Check AI/memory/learnings/mistakes.md — are any + mistakes repeating? If so, update the relevant skill file." + +□ Test each script: run manually, verify output includes memory data + +□ Git commit: "enhance: launchd scripts with memory awareness" +``` + +### Step 9: Add Challenger Protocol + +**Risk: LOW** — new policy file only. 
+ +``` +□ Create AI/policies/challenger-protocol.md: + + # Challenger Protocol + + ## When to Activate + - Before executing any task that involves strategy, planning, or decisions + - When the user asks for something that could be done significantly better + - When cross-domain connections are relevant but not mentioned + + ## Behavior + - Consider: can this task be improved beyond what was asked? + - Consider: are there related items in other domains that connect? + - Consider: is the user's approach the best one, or is there a better path? + - If yes to any: briefly suggest the improvement BEFORE executing + - Format: "Before I do this — [suggestion]. Want me to proceed as asked, + or incorporate this?" + + ## Guardrails + - Maximum 1 challenge per interaction (don't be annoying) + - If user says "just do it" or similar: execute immediately, no further challenge + - Track in AI/memory/learnings/execution.md: was challenge accepted or overridden? + - Learn from patterns: if user always overrides a challenge type, stop offering it + + ## Integration + - This policy should be referenced by business-advisor, strategy-advisor, + communication-writing, and ai-equilibrium-editor skills + - Does NOT apply to: translator, vault-janitor, meeting-processing (execution-only skills) + +□ Add reference to challenger-protocol in relevant skill files' policies array + +□ Git commit: "add: challenger protocol for proactive quality improvement" +``` + +### Step 10: Verify and Clean Up + +``` +□ Full verification: + → Trigger at least 5 different skills explicitly — verify routing + → Test memory: end session, start new session, verify WIP handoff + → Test feedback: give a rating, verify it appears in ratings.jsonl + → Test challenger: give a suboptimal business request, verify pushback + → Test cross-domain: ask about newsletter + business, verify both context maps load + → Run all 5 launchd scripts manually, verify they work with memory + → Test afk-code: send a command via 
Discord, verify response + +□ If all pass: + □ Remove CLAUDE-legacy.md + □ Final commit: "complete: LifeOS transformation to modular architecture" + +□ If any fail: + □ Fix the specific issue + □ Or: git revert [specific commit] to undo just that step + □ Re-test after fix +``` + +--- + +## PART 4: POST-TRANSFORMATION + +Once complete, the system supports: + +- **Adding new skills**: create one .md file in AI/skills/ +- **Adding new domains**: create a context map in AI/context/ +- **Adding new policies**: create a .md file in AI/policies/ +- **Adding new hooks**: create a .ts file in AI/hooks/, register in settings.json +- **System self-improvement**: weekly review analyzes ratings and proposes changes to skill files +- **Cross-session continuity**: WIP state + learnings persist automatically +- **Feedback loop**: ratings and corrections accumulate, inform future behavior + +The architecture grows by adding files. No monolith to edit. Each component is independent, testable, and replaceable. diff --git a/Plans/user-manual.md b/Plans/user-manual.md new file mode 100644 index 000000000..1e548c100 --- /dev/null +++ b/Plans/user-manual.md @@ -0,0 +1,366 @@ +# LifeOS AI Genius — User Manual + +## What Changed + +Your vault went from a 58KB monolithic CLAUDE.md to a modular, self-improving system. Claude still works the same way on the surface — you talk to it, it helps you. But underneath, everything is different. + +**Before**: Claude loaded 14,500 tokens of instructions every session. No memory between sessions. No feedback loop. No protection against dangerous commands. Every session started from zero. + +**After**: Claude loads ~1,700 tokens of routing instructions, then selectively loads only the skills, policies, and context relevant to your request. It remembers what you were working on. It captures your preferences and learns from mistakes. It protects against dangerous commands. It tracks its own performance and proposes improvements weekly. 
+ +You don't need to change how you talk to Claude. But knowing what's available lets you get more out of it. + +--- + +## Quick Reference + +| You say | What happens | +|---|---| +| Any normal request | Claude classifies effort, routes to the right skill, loads relevant context | +| "research [topic]" | Research skill activates with appropriate depth | +| "extensive research on [topic]" | Multi-agent research with 4-5 parallel sources | +| "council: [question]" | 4 agents debate across 3 rounds, you get the transcript | +| "quick council: [question]" | Fast 1-round perspective check | +| "algorithm effort THOROUGH: [task]" | Full Algorithm with ISC table, research, council | +| "show proposals" | Review pending improvement proposals from weekly synthesis | +| "create a skill for [domain]" | Interactive new skill creation | +| "how am I doing on my goals?" | Telos reads Tracking/ and reports progress | +| "add belief: [something]" | Recorded in AI/telos/beliefs.md | +| "add lesson: [something]" | Recorded in AI/telos/lessons.md | +| "I predict [something] by [date]" | Tracked in AI/telos/predictions.md | +| "7 - good work" (or any 1-10 rating) | Captured in ratings.jsonl, feeds into learning | +| "skip the algorithm" or "just do it" | Bypasses structured execution, does the task directly | + +--- + +## The Effort System + +Every request is silently classified into an effort level. This determines how much infrastructure Claude deploys. + +### TRIVIAL +"hello", "thanks", "yes" +→ Direct response. No skill routing, no overhead. + +### QUICK +"translate this to Polish", "what's on my calendar today" +→ Routes to one skill. No ISC, no parallel agents. Just does it. + +### STANDARD +"draft a newsletter issue", "review this proposal" +→ The Algorithm activates. ISC table created. Research available. 1-3 parallel agents possible. This is the default for real work. + +### THOROUGH +"redesign my content strategy", "evaluate this business opportunity" +→ Full Algorithm. 
Extensive research. Council debate. 3-5 parallel agents. Deep analysis. Use this for important decisions. + +### DETERMINED +"keep going until all tests pass", "don't stop until this is production-ready" +→ Everything available. Unlimited iterations. 10+ parallel agents. Use rarely — for mission-critical tasks. + +### Overriding Effort + +You can force a level: +- "algorithm effort THOROUGH: [task]" — forces THOROUGH +- "this is a quick one" — forces QUICK +- "skip the algorithm" — bypasses ISC entirely + +### When to Override + +- Something Claude classifies as STANDARD but you know is simple → "this is a quick one" +- Something that looks simple but matters a lot → "algorithm effort THOROUGH: [task]" +- You're in a hurry → "just do it, no ISC" + +--- + +## The Algorithm & ISC + +For STANDARD+ tasks, Claude creates an ISC (Ideal State Criteria) table before doing any work. This is a table of "what done well looks like." + +Example: +``` +## ISC: Draft newsletter about AI agents +**Effort:** STANDARD | **Phase:** OBSERVE | **Iteration:** 1 + +| # | What Ideal Looks Like | Source | Capability | Status | +|---|---|---|---|---| +| 1 | Topic covers AI agents with fresh angle | EXPLICIT | research | PENDING | +| 2 | 3-5 non-obvious insights | EXPLICIT | research | PENDING | +| 3 | Matches AI Equilibrium voice | INFERRED | newsletter skill | PENDING | +| 4 | British English, no slop | IMPLICIT | formatting | PENDING | +``` + +**You see this table after the OBSERVE phase.** You can: +- Add rows: "also make sure it references last week's issue" +- Remove rows: "don't worry about cross-linking" +- Adjust: "row 2 should be 5-7 insights, not 3-5" +- Or just confirm: "looks good, go ahead" + +Claude then works through THINK → PLAN → BUILD → EXECUTE → VERIFY → LEARN, updating the ISC as it goes. At the end, you see the completed table with all statuses. 
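The ISC lifecycle described above (criteria created at OBSERVE, amended by you, then marked off through VERIFY) can be modelled as a small table structure. The field names below are illustrative assumptions, not the system's actual schema.

```typescript
// Illustrative ISC model: a criteria table created during OBSERVE and
// updated as the Algorithm moves through later phases.

type Source = "EXPLICIT" | "INFERRED" | "IMPLICIT";
type Status = "PENDING" | "PASSED" | "FAILED";

interface IscRow {
  id: number;
  ideal: string;   // "what done well looks like"
  source: Source;  // where the criterion came from
  status: Status;
}

class IscTable {
  task: string;
  rows: IscRow[];

  constructor(task: string, rows: IscRow[]) {
    this.task = task;
    this.rows = rows;
  }

  // User feedback after OBSERVE: add a row ("also make sure...").
  add(ideal: string, source: Source): void {
    const id = Math.max(0, ...this.rows.map((r) => r.id)) + 1;
    this.rows.push({ id, ideal, source, status: "PENDING" });
  }

  mark(id: number, status: Status): void {
    const row = this.rows.find((r) => r.id === id);
    if (row) row.status = status;
  }

  // VERIFY passes only when no criterion is pending or failed.
  complete(): boolean {
    return this.rows.every((r) => r.status === "PASSED");
  }
}
```

The useful property is that VERIFY becomes a mechanical check over the table rather than a judgement call: if any row is not PASSED, the task is not done.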
+ +### When to Care About the ISC + +- **For important work**: read the ISC after OBSERVE, make sure it captures what you actually want +- **For routine work**: glance at it, confirm, let Claude work +- **When something goes wrong**: the ISC shows exactly which criterion failed and why + +### When to Ignore It + +- QUICK tasks don't have an ISC +- If you say "just do it" or "skip the algorithm", no ISC is created +- For simple, well-understood tasks, the ISC is overhead — skip it + +--- + +## Research + +Three depth levels, triggered by how you phrase the request. + +### Quick Research +"quick research on [topic]" or simple factual questions +→ One source, 3-5 bullet points, ~10 seconds + +### Standard Research +"research [topic]" or "investigate [topic]" +→ Two parallel agents (WebSearch + Perplexity), cross-referenced findings +→ Structured report with agreements, disagreements, confidence assessment +→ ~15-30 seconds + +### Extensive Research +"extensive research on [topic]" or "deep dive into [topic]" +→ 4-5 parallel agents (WebSearch, Perplexity, Brave, Firecrawl) +→ Comprehensive report with source analysis and open questions +→ ~60-90 seconds + +### Research Outputs Are Saved +Every research result goes to `AI/memory/research/YYYY-MM-DD_{topic}.md`. Next time you ask about the same topic, Claude can reference past research. + +--- + +## Council (Multi-Agent Debate) + +For decisions that benefit from multiple perspectives. + +### Full Debate +"council: should I pivot my newsletter from weekly to biweekly?" + +4 agents debate across 3 rounds: +- **Strategist**: long-term thinking, competitive dynamics +- **Pragmatist**: implementation reality, resource constraints +- **Challenger**: devil's advocate, what could go wrong +- **Domain Expert**: deep knowledge of the specific topic (adapts per question) + +You get: full transcript + convergence points + remaining disagreements + recommended path. + +### Quick Council +"quick council: is biweekly better?" 
+ +1 round, 3-4 agents state positions. No back-and-forth. Fast. + +### When to Use Council +- Business decisions with trade-offs +- Strategy questions where you want friction before deciding +- Anything where you'd normally want to "sleep on it" — council gives you the multiple perspectives immediately + +### Custom Council Members +"council with security: should we use this API?" +→ Adds a security-focused agent to the default four. + +--- + +## Feedback & Learning + +The system learns from your reactions. Two mechanisms: + +### Explicit Ratings +After any response, type a rating: +``` +8 - solid newsletter draft +3 - completely missed the point +10 +6 - ok but too verbose +``` + +Format: number (1-10), optionally followed by a comment. The feedback hook captures it automatically. + +### Implicit Sentiment +You don't need to rate everything. The system detects: +- "no, that's wrong" → negative signal captured +- "perfect, exactly right" → positive signal captured +- "try again" → frustration detected +- Corrections → inferred preference + +### What Happens With Feedback + +1. Ratings accumulate in `AI/memory/signals/ratings.jsonl` +2. Low ratings trigger entries in `AI/memory/learnings/mistakes.md` +3. Weekly synthesis analyses patterns: which skills rate well/poorly, what mistakes repeat +4. If a mistake occurs 3+ times: automatic proposal to add a rule preventing it +5. You review proposals with "show proposals" and approve/reject + +### The Weekly Cycle + +Every Sunday: +- **9:00 AM**: Learning synthesis runs — analyses the week's ratings, mistakes, preferences +- **10:00 AM**: Weekly review runs — includes AI System Health section with rating trends and pending proposals + +The daily brief (8:00 AM) reminds you if there are pending proposals. + +--- + +## Proposals & Self-Improvement + +The system proposes its own improvements. You approve or reject. 
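Proposal evidence starts with aggregating `ratings.jsonl`. A minimal sketch of that step, assuming one JSON object per line with `ts`, `score`, and `comment` fields:

```typescript
// Sketch of rating aggregation over AI/memory/signals/ratings.jsonl.
// The entry shape is an assumption about the JSONL format, not the actual one.

interface RatingEntry { ts: string; score: number; comment: string }

function aggregate(jsonl: string): { count: number; mean: number; low: RatingEntry[] } {
  const entries: RatingEntry[] = jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
  const count = entries.length;
  const mean = count === 0 ? 0 : entries.reduce((sum, e) => sum + e.score, 0) / count;
  // Low scores are the ones that feed mistakes.md in the feedback flow above;
  // the < 6 threshold is taken from the hook design, not a fixed rule.
  const low = entries.filter((e) => e.score < 6);
  return { count, mean, low };
}
```

A pattern such as "three low ratings with similar comments against the same skill" is then straightforward to detect and attach to a proposal as evidence.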
+ +### How Proposals Are Generated +- Weekly synthesis detects patterns (recurring mistakes, low-performing skills, high-confidence preferences) +- Each pattern generates a proposal: what to change, why, and evidence supporting it + +### Reviewing Proposals +Say "show proposals". Claude presents each pending proposal with: +- The skill or policy affected +- What would change +- Evidence (which ratings, mistakes, or patterns support it) +- Confidence level + +For each: approve, reject, or skip. +- **Approve**: change is applied to the skill/policy file, committed to git +- **Reject**: logged with your reason, won't be proposed again +- **Skip**: stays pending for next review + +### What Gets Proposed +- Skill file improvements (better instructions, missing rules) +- New policy rules (from repeated mistakes) +- Preference promotions (observed preference → explicit policy rule) +- Never: changes to CLAUDE.md, hooks, or settings.json + +--- + +## Telos (Life OS) + +Manages your goals, beliefs, lessons, and predictions. + +### Goal Progress +"how am I doing on my goals?" +→ Reads Tracking/Objectives/ and Key Results/, cross-references with active projects, reports what's on track, at risk, or stalled. + +### Adding Personal Knowledge +- "add belief: most productivity advice assumes neurotypical brains" → recorded with date and confidence +- "add lesson: rushing architecture decisions always costs more later" → recorded with context +- "I predict AI agents will replace 50% of SaaS by 2027" → tracked with timeframe, reviewed quarterly + +### Quarterly Review +"quarterly review" +→ Comprehensive report: goal progress, learning trends, prediction accuracy check, belief updates, recommended focus for next quarter. + +The quarterly review uses the challenger protocol: it will push back on goals you're not pursuing and call out mismatches between stated priorities and actual time allocation. 
+ +--- + +## Creating New Skills + +"create a skill for [domain]" + +Interactive process: +1. Claude asks what the skill covers, triggers, relevant vault files, tone +2. Checks for overlap with existing skills +3. Creates the skill file with proper frontmatter +4. Creates a context map if needed +5. Validates the result +6. Commits to git + +The system also proposes new skills automatically: if the weekly synthesis detects you keep asking about a topic that no skill covers, it generates a proposal. + +--- + +## Observability (Phase 6) + +Every tool call, security decision, rating, and session is logged to `AI/memory/events/YYYY-MM-DD.jsonl`. + +### Session Activity Reports +After each session, `AI/memory/work/{session}/activity-report.md` shows: +- Total tool calls by type +- Security events +- Ratings captured +- Algorithm usage (if ISC was created) +- Session duration + +### Optional Dashboard +If you started the observability server (`bash AI/observability/start.sh`), open `AI/observability/dashboard.html` for a real-time view of what Claude is doing. Useful for monitoring complex Algorithm executions with parallel agents. 
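A minimal sketch of how one event becomes a line in the daily JSONL file; the `kind` values and field names are assumptions about the event schema, not the logger's actual format:

```typescript
// Illustrative formatter for one line of AI/memory/events/YYYY-MM-DD.jsonl.

interface LogEvent {
  ts: string;  // ISO timestamp
  kind: "tool_call" | "security" | "rating" | "session";
  detail: Record<string, unknown>;
}

function eventLine(
  kind: LogEvent["kind"],
  detail: Record<string, unknown>,
  ts: Date = new Date(),
): string {
  const event: LogEvent = { ts: ts.toISOString(), kind, detail };
  return JSON.stringify(event); // one event per line, append-only
}

// The daily file name is derived from the same timestamp.
function eventFile(ts: Date = new Date()): string {
  return `AI/memory/events/${ts.toISOString().slice(0, 10)}.jsonl`;
}
```

One JSON object per line keeps the log append-only and greppable, which is what makes the session activity reports cheap to generate.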
+ +--- + +## Memory & Cross-Session Continuity + +### What Persists Between Sessions +- **WIP state**: `AI/memory/work/current.md` — what you were working on, where you left off +- **Preferences**: `AI/memory/learnings/preferences.md` — accumulated style preferences with confidence scores +- **Mistakes**: `AI/memory/learnings/mistakes.md` — errors to avoid, with occurrence counts +- **Ratings**: `AI/memory/signals/ratings.jsonl` — all explicit and implicit feedback +- **Research**: `AI/memory/research/` — past research outputs +- **Context log**: `AI/memory/context-log.md` — current situation, priorities, pipeline + +### What's Injected at Session Start +On your first prompt each session, the context hook injects: +- Current WIP state (what you were working on last) +- Recent preferences +- Recent context log entries + +This means Claude starts each session already knowing what you were doing. + +--- + +## Security + +The security validator checks every Bash command, file edit, file write, and file read against patterns: + +- **Blocked** (hard stop): `rm -rf /`, `git push --force origin main`, accessing `~/.ssh` or credentials +- **Confirm** (asks you): `rm -rf`, `git push --force`, editing CLAUDE.md or settings.json +- **Alert** (logs but allows): `curl | sh`, `sudo`, `eval` + +All decisions are logged to `AI/memory/security/`. The weekly synthesis reviews security events. + +This matters most for the launchd scripts and afk-code sessions that run with `--permission-mode bypassPermissions`. 
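The three-tier decision can be sketched as pattern lists checked in severity order, most severe match wins. The patterns here are only the examples listed above; the real validator's pattern set and matching logic may differ:

```typescript
type Verdict = "block" | "confirm" | "alert" | "allow";

// Illustrative patterns only — taken from the examples above, not the
// validator's actual rule set.
const BLOCKED = [/rm -rf \/(\s|$)/, /git push --force origin main/, /~\/\.ssh/];
const CONFIRM = [/rm -rf/, /git push --force/, /CLAUDE\.md/, /settings\.json/];
const ALERT = [/curl .*\| *sh/, /\bsudo\b/, /\beval\b/];

// Severity order matters: block > confirm > alert > allow, so
// "rm -rf /" is blocked even though "rm -rf" alone only confirms.
function classify(command: string): Verdict {
  if (BLOCKED.some((p) => p.test(command))) return "block";
  if (CONFIRM.some((p) => p.test(command))) return "confirm";
  if (ALERT.some((p) => p.test(command))) return "alert";
  return "allow";
}

console.log(classify("rm -rf /"));        // "block"
console.log(classify("rm -rf ./build"));  // "confirm"
console.log(classify("sudo ls"));         // "alert"
console.log(classify("git status"));      // "allow"
```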
+ +--- + +## File Locations Reference + +| What | Where | +|---|---| +| Main instruction file | `CLAUDE.md` (~1,700 tokens) | +| Skills | `AI/skills/*.md` (28+ files) | +| Policies | `AI/policies/*.md` (10+ files) | +| Context maps | `AI/context/*.md` (9+ files) | +| WIP state | `AI/memory/work/current.md` | +| Session work dirs | `AI/memory/work/YYYY-MM-DD_slug/` | +| Preferences | `AI/memory/learnings/preferences.md` | +| Mistakes | `AI/memory/learnings/mistakes.md` | +| Ratings | `AI/memory/signals/ratings.jsonl` | +| Weekly synthesis | `AI/memory/learnings/synthesis/YYYY-WW.md` | +| Proposals | `AI/memory/proposals/pending/` | +| Research outputs | `AI/memory/research/` | +| Event logs | `AI/memory/events/YYYY-MM-DD.jsonl` | +| Telos personal | `AI/telos/` (beliefs, lessons, wisdom, predictions) | +| Security audit trail | `AI/memory/security/` | +| Hooks | `AI/hooks/*.ts` (8 hooks) | +| Scripts | `AI/scripts/*.sh` (6 scripts) | + +--- + +## Tips + +1. **Rate things.** Even a bare "7" after a response feeds the learning system. Low ratings trigger mistake tracking. High ratings confirm what's working. + +2. **Say "show proposals" on Mondays.** The synthesis runs Sunday morning. Fresh proposals are waiting. + +3. **Use "council:" for decisions you'd otherwise ruminate on.** The intellectual friction is the point — it gives you pre-digested perspectives to synthesise. + +4. **Override effort when it matters.** Claude defaults most work to STANDARD. If something is important, say "algorithm effort THOROUGH" to get the full treatment. + +5. **Don't fight the ISC.** If you see the ISC table and want to skip it, say "just do it". But for important tasks, spending 10 seconds reviewing the ISC catches misunderstandings before Claude does 5 minutes of wrong work. + +6. **Add beliefs and lessons when they strike you.** "Add lesson: [thing I just realised]" takes 2 seconds and builds your personal knowledge base over time. The quarterly review surfaces these. + +7. 
**Let the system propose skills.** If you keep asking about something with no matching skill, the weekly synthesis will notice. Approve the proposal and you get a dedicated skill with proper context loading. + +8. **Check activity reports after complex sessions.** `AI/memory/work/{session}/activity-report.md` shows what Claude actually did — useful when you're not sure if it was thorough enough.