windoliver · Mar 20, 2026
diff --git a/TODOS.md b/TODOS.md
@@ -0,0 +1,62 @@
+# TODOS
+
+Items discovered during Grove v2 CEO review (2026-03-20).
+
+## P2 — Post-v2 cleanup
+
+### Remove TmuxRuntime fallback
+- **What:** Remove `TmuxRuntime` adapter once acpx is proven stable in production.
+- **Why:** TmuxRuntime exists only as a safety net during acpx integration. Once acpx handles all supported agents reliably, the tmux codepath is dead weight.
+- **Effort:** S | **Depends on:** Phase 2 (AgentRuntime) shipped + acpx stability proven (~1 month)
+
+### Separate EventBus semaphore from store semaphore
+- **What:** Nexus per-store semaphore (20 concurrent ops) may be shared with EventBus. EventBus should have its own semaphore to avoid reducing DAG write throughput.
+- **Why:** Under load (10+ agents), EventBus publishes compete with DAG writes for the same semaphore slots.
+- **Effort:** S | **Depends on:** Phase 3 (EventBus routing) shipped
+
+### Git worktree pooling for swarm-scale
+- **What:** Pre-create a pool of git worktrees and assign them to agents on spawn, rather than creating worktrees on-demand.
+- **Why:** `git worktree add` takes 200-500ms. At 50 agents, sequential creation takes 10-25 seconds (50 × 200-500ms), and longer beyond that. Pooling amortizes this.
+- **Effort:** M | **Depends on:** Swarm runtime (prior CEO plan)
+
+### Provider.ts capability flag cleanup
+- **What:** The 11 optional capability interfaces + type guards in `provider.ts` are an anti-pattern. The `AgentRuntime` interface provides a path to simplify: spawn/send/close replace ad-hoc claim/heartbeat/workspace methods.
+- **Why:** New features currently require adding a new interface + type guard + conditional logic everywhere. Fragile and hard to trace.
+- **Effort:** M | **Depends on:** Phase 2 (AgentRuntime) shipped
+
+### Incremental stop condition evaluation for swarm-scale
+- **What:** `evaluateStopConditions()` in lifecycle.ts scans all contributions (O(n)). At 1000+ contributions, this adds 50-100ms per contribute call. Maintain running state (current best score, contribution count, last improvement round) and update it incrementally so that each evaluation is O(1).
+- **Why:** Swarm-scale (50+ agents) produces contributions rapidly. O(n) evaluation becomes a bottleneck.
+- **Effort:** M | **Depends on:** Swarm runtime (prior CEO plan)
+
+## P3 — Edge cases and polish
+
+### Contract re-evaluation on mid-session change
+- **What:** When GROVE.md changes mid-session (cherry-pick #6), optionally re-evaluate existing contributions against the new contract. Currently deferred — only detection + notification + diff ships in v2.
+- **Why:** Semantically tricky: a previously accepted contribution might now violate the new contract. Need a policy for handling this (flag, reject, ignore).
+- **Effort:** M | **Depends on:** Cherry-pick #6 (contract watcher) shipped
+
+### Empty summary validation
+- **What:** `grove_contribute` with an empty string summary passes schema validation. Add minimum length check (e.g., 10 chars).
+- **Why:** Empty summaries provide no value in the contribution DAG and make the TUI feed unreadable.
+- **Effort:** S | **Depends on:** Phase 1 (enforcement pipeline)
+
+### Score tie policy
+- **What:** Define outcome when a new contribution has the same score as the frontier best. Currently undefined — could be "unchanged", "tied", or treated as "improved" (same threshold met).
+- **Why:** Tie-breaking affects frontier ranking and outcome derivation. Need a consistent policy.
+- **Effort:** S | **Depends on:** Phase 1 (outcome derivation)
+
+### Create DESIGN.md
+- **What:** Document Grove's design system: spacing scale, typography, component patterns, voice/tone. Currently only `theme.ts` exists with color tokens.
+- **Why:** Web dashboard (12-month roadmap) needs a design system reference to avoid diverging from the TUI's established aesthetic.
+- **Effort:** M (via /design-consultation) | **Depends on:** Nothing (can be done anytime)
+
+### Raise dimmed color to #777 for WCAG AA contrast
+- **What:** Change `theme.dimmed` from `#666666` to `#777777` in `theme.ts`. Current value fails WCAG AA contrast (3.9:1 vs required 4.5:1).
+- **Why:** Accessibility compliance. Minimal visual change — dimmed still looks dimmed.
+- **Effort:** S (1 line change) | **Depends on:** Nothing
+
+### Contract watcher debounce tuning
+- **What:** The contract file watcher (cherry-pick #6) needs a debounce interval to handle rapid saves (e.g., editor auto-save). Start with 1s, tune based on usage.
+- **Why:** Without debounce, rapid GROVE.md edits trigger multiple diff/notification cycles.
+- **Effort:** S | **Depends on:** Cherry-pick #6 shipped
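Review note on "Incremental stop condition evaluation": the running-state idea in that item could look roughly like the sketch below. Every name here (`StopState`, `StopConditions`, `applyContribution`, `shouldStop`) is hypothetical, since the TODO names only the three tracked values. The sketch also assumes that higher scores are better and that a tie does not count as an improvement, which the "Score tie policy" item still leaves open.

```typescript
// Hypothetical shapes; not taken from lifecycle.ts.
interface Contribution {
  score: number;
  round: number;
}

interface StopState {
  bestScore: number;            // current best score (assumed: higher is better)
  contributionCount: number;    // total contributions seen so far
  lastImprovementRound: number; // round of the most recent improvement
}

interface StopConditions {
  maxContributions?: number; // stop once this many contributions exist
  targetScore?: number;      // stop once bestScore reaches this value
  patienceRounds?: number;   // stop after this many rounds without improvement
}

const initialState: StopState = {
  bestScore: Number.NEGATIVE_INFINITY,
  contributionCount: 0,
  lastImprovementRound: 0,
};

/** O(1) update applied once per contribution instead of rescanning the full list. */
function applyContribution(state: StopState, c: Contribution): StopState {
  // Assumption: only a strictly greater score counts as an improvement.
  const improved = c.score > state.bestScore;
  return {
    bestScore: improved ? c.score : state.bestScore,
    contributionCount: state.contributionCount + 1,
    lastImprovementRound: improved ? c.round : state.lastImprovementRound,
  };
}

/** O(1) check against the running state. */
function shouldStop(state: StopState, cond: StopConditions, currentRound: number): boolean {
  if (cond.maxContributions !== undefined && state.contributionCount >= cond.maxContributions) {
    return true;
  }
  if (cond.targetScore !== undefined && state.bestScore >= cond.targetScore) {
    return true;
  }
  if (cond.patienceRounds !== undefined &&
      currentRound - state.lastImprovementRound >= cond.patienceRounds) {
    return true;
  }
  return false;
}
```

The state would need to be rebuilt with a single O(n) pass on startup or after any contribution is removed. After that, each contribute call costs one `applyContribution` and one `shouldStop`.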
diff --git a/docs/designs/grove-v2-architecture.md b/docs/designs/grove-v2-architecture.md
@@ -0,0 +1,51 @@
+---
+status: ACTIVE
+---
+# CEO Plan: Grove v2 — Contract Enforcement, Event-Driven Routing, acpx Agent Lifecycle
+
+Generated by /plan-ceo-review on 2026-03-20
+Branch: worktree-ticklish-coalescing-stonebraker | Mode: SELECTIVE EXPANSION
+Repo: windoliver/grove
+
+## Vision
+
+### 10x Check
+The 10x version of Grove v2 is not just "enforced contracts + better agent spawning" — it's a **self-healing coordination runtime** where the system automatically detects contract violations, suggests fixes to agents, re-routes work when agents crash, and evolves contracts mid-session based on observed patterns. The enforcement pipeline becomes an intelligent system, not just a validator.
+
+At 10x, Grove becomes the "Kubernetes for AI agents" — you declare the desired state (contract), deploy agents (topology), and the runtime reconciles reality to match. Agents crash? Auto-respawn. Gates too strict? System suggests relaxation. Stop condition met? Graceful shutdown with audit trail.
+
+### Delight Opportunities Surfaced
+1. Structured rejection feedback (agent self-corrects on first try)
+2. Contract validation CLI (catch errors at authoring time)
+3. Dry-run mode for contribute (preview before commit)
+4. Enforcement pipeline audit log (black box recorder)
+5. Auto-reconnect on crash (resilient lifecycle)
+6. Contract diff on mid-session update (live-tuning experiments)
+
+## Scope Decisions
+
+| # | Proposal | Effort | Decision | Reasoning |
+|---|----------|--------|----------|-----------|
+| 1 | Structured Rejection Feedback | S | ACCEPTED | Enforcement without feedback = frustrating enforcement. Agents need to self-correct. |
+| 2 | Contract Validation CLI | S | ACCEPTED | Catching errors at authoring time beats catching them at runtime. Acts as a type checker for the coordination protocol. |
+| 3 | Dry-Run Mode for Contribute | S | ACCEPTED | Preview before commit. Agents can evaluate locally before contributing. |
+| 4 | Enforcement Pipeline Audit Log | M | ACCEPTED | Observability is not optional. Black box recorder for trustworthy experiments. |
+| 5 | Auto-Reconnect on Crash | M | ACCEPTED | Crashes are inevitable with 5+ agents. Resilient lifecycle is a differentiator. |
+| 6 | Contract Diff on Mid-Session Update | M | ACCEPTED | Live-tuning experiments without restart. Researcher workflow improvement. |
+
+## Accepted Scope (added to base v2 plan)
+- Structured rejection errors in enforcement pipeline (all validation steps return typed error objects)
+- `grove contract validate` CLI command
+- `grove_contribute --dry-run` parameter on MCP tool
+- Audit log for every enforcement pipeline run (stored as metadata or system contributions)
+- AgentRuntime auto-reconnect with circuit-breaker (max 3 retries, exponential backoff)
+- Contract change detection + diff + agent notification mid-session
+
+## Deferred to TODOS.md
+- (none — all proposals accepted)
+
+## Skipped
+- (none)
+
+## Relationship to Prior Plans
+- **Swarm Runtime** (2026-03-19, PROMOTED): The swarm design depends on v2's foundation. AgentRuntime interface enables swarm-scale spawning. Enforcement pipeline ensures swarm contributions are valid. Event routing enables swarm convergence signals.
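Review note on the "AgentRuntime auto-reconnect with circuit-breaker (max 3 retries, exponential backoff)" scope item: a minimal sketch of the retry bookkeeping, kept synchronous so it can be unit-tested without timers. `ReconnectPolicy`, `ReconnectBreaker`, and `baseDelayMs` are hypothetical names. The plan fixes only the retry limit and the backoff shape, so the base delay is an assumption.

```typescript
// Hypothetical types; the plan does not define an API for this.
interface ReconnectPolicy {
  maxRetries: number;  // 3 in the plan
  baseDelayMs: number; // assumed; the plan gives no base delay
}

type Decision =
  | { kind: "retry"; attempt: number; delayMs: number }
  | { kind: "open" }; // circuit open: stop respawning and surface the failure

class ReconnectBreaker {
  private failures = 0;
  private readonly policy: ReconnectPolicy;

  constructor(policy: ReconnectPolicy) {
    this.policy = policy;
  }

  /** Call when the agent crashes or its session drops. */
  onFailure(): Decision {
    this.failures += 1;
    if (this.failures > this.policy.maxRetries) {
      return { kind: "open" };
    }
    // Delay doubles with each consecutive failure: base, 2*base, 4*base, ...
    const delayMs = this.policy.baseDelayMs * 2 ** (this.failures - 1);
    return { kind: "retry", attempt: this.failures, delayMs };
  }

  /** Call after a successful reconnect so later crashes start from a clean count. */
  onSuccess(): void {
    this.failures = 0;
  }
}
```

A caller would wait `delayMs` before respawning on a `retry` decision and stop respawning on `open`. Whether the counter should reset on success, as it does here, or decay over time is a policy choice the plan does not specify.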