62 changes: 62 additions & 0 deletions TODOS.md
@@ -0,0 +1,62 @@
# TODOS

Items discovered during Grove v2 CEO review (2026-03-20).

## P2 — Post-v2 cleanup

### Remove TmuxRuntime fallback
- **What:** Remove `TmuxRuntime` adapter once acpx is proven stable in production.
- **Why:** TmuxRuntime exists only as a safety net during acpx integration. Once acpx handles all supported agents reliably, the tmux codepath is dead weight.
- **Effort:** S | **Depends on:** Phase 2 (AgentRuntime) shipped + acpx stability proven (~1 month)

### Separate EventBus semaphore from store semaphore
- **What:** Give EventBus its own semaphore. The Nexus per-store semaphore (20 concurrent ops) may currently be shared with EventBus, which would reduce DAG write throughput.
- **Why:** Under load (10+ agents), EventBus publishes compete with DAG writes for the same semaphore slots.
- **Effort:** S | **Depends on:** Phase 3 (EventBus routing) shipped
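
A minimal sketch of the separation, assuming a simple counting semaphore. The `Semaphore` class, the variable names, and the EventBus limit of 8 are hypothetical; only the store limit of 20 comes from this item.

```typescript
// Hypothetical counting semaphore; the real Nexus implementation may differ.
class Semaphore {
  private inUse = 0;
  private readonly limit: number;
  private readonly waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  // Non-blocking: take a slot if one is free.
  tryAcquire(): boolean {
    if (this.inUse >= this.limit) return false;
    this.inUse += 1;
    return true;
  }

  // Blocking: resolves once a slot is available.
  acquire(): Promise<void> {
    if (this.tryAcquire()) return Promise.resolve();
    return new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  // Pass the slot straight to the next waiter, or free it.
  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // slot changes hands, so inUse stays the same
    else if (this.inUse > 0) this.inUse -= 1;
  }

  get available(): number {
    return this.limit - this.inUse;
  }
}

// Two independent instances: EventBus publishes can no longer consume
// slots that DAG writes need.
const storeSemaphore = new Semaphore(20);   // limit from this TODO item
const eventBusSemaphore = new Semaphore(8); // assumed limit, for illustration
```

With separate instances, saturating `eventBusSemaphore` leaves `storeSemaphore.available` unchanged, which is the property this item is after.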

### Git worktree pooling for swarm-scale
- **What:** Pre-create a pool of git worktrees and assign them to agents on spawn, rather than creating worktrees on-demand.
- **Why:** `git worktree add` takes 200-500ms. At 50 agents, sequential creation takes 10-25 seconds, and longer beyond that. Pooling amortizes this.
- **Effort:** M | **Depends on:** Swarm runtime (prior CEO plan)
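
One possible shape for the pool, as a sketch only. `WorktreePool`, `CreateWorktree`, and the synchronous creator callback are hypothetical; in practice the callback would wrap `git worktree add` and would likely be asynchronous.

```typescript
// Hypothetical creator: given an index, creates a worktree and returns its path.
type CreateWorktree = (index: number) => string;

class WorktreePool {
  private readonly idle: string[] = [];
  private readonly create: CreateWorktree;
  private created = 0;

  constructor(create: CreateWorktree, size: number) {
    this.create = create;
    // Pay the creation cost up front, before any agent spawns.
    for (let i = 0; i < size; i++) {
      this.idle.push(this.create(this.created++));
    }
  }

  // Hand out a pre-created worktree; fall back to on-demand creation
  // when the pool is exhausted.
  acquire(): string {
    return this.idle.pop() ?? this.create(this.created++);
  }

  // Return a worktree for reuse (the caller is assumed to have reset it).
  release(path: string): void {
    this.idle.push(path);
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

The on-demand fallback keeps spawning correct when the pool is undersized; it only loses the latency benefit.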

### Provider.ts capability flag cleanup
- **What:** The 11 optional capability interfaces + type guards in `provider.ts` are an anti-pattern. The `AgentRuntime` interface provides a path to simplify: spawn/send/close replace ad-hoc claim/heartbeat/workspace methods.
- **Why:** New features currently require adding a new interface + type guard + conditional logic everywhere. Fragile and hard to trace.
- **Effort:** M | **Depends on:** Phase 2 (AgentRuntime) shipped

### Incremental stop condition evaluation for swarm-scale
- **What:** `evaluateStopConditions()` in lifecycle.ts scans all contributions (O(n)). At 1000+ contributions, this adds 50-100ms per contribute call. Maintain running state (current best score, contribution count, last improvement round) and update it incrementally so that evaluation is O(1).
- **Why:** Swarm-scale (50+ agents) produces contributions rapidly. O(n) evaluation becomes a bottleneck.
- **Effort:** M | **Depends on:** Swarm runtime (prior CEO plan)
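
A sketch of the running state described above. The types, function names, and the two example stop conditions are hypothetical, since the real shapes in lifecycle.ts are not shown here; it also assumes that a higher score is better.

```typescript
// Hypothetical running state, mirroring the three fields named in this item.
interface StopState {
  bestScore: number;
  contributionCount: number;
  lastImprovementRound: number;
}

// Assumed stop conditions, for illustration only.
interface StopLimits {
  maxContributions: number;
  maxStagnantRounds: number;
}

function initialStopState(): StopState {
  return { bestScore: -Infinity, contributionCount: 0, lastImprovementRound: 0 };
}

// O(1): fold one new contribution into the running state.
function recordContribution(state: StopState, score: number, round: number): StopState {
  const improved = score > state.bestScore; // a tie is not counted as an improvement here
  return {
    bestScore: improved ? score : state.bestScore,
    contributionCount: state.contributionCount + 1,
    lastImprovementRound: improved ? round : state.lastImprovementRound,
  };
}

// O(1): evaluate the stop conditions from the running state alone.
function shouldStop(state: StopState, limits: StopLimits, currentRound: number): boolean {
  return (
    state.contributionCount >= limits.maxContributions ||
    currentRound - state.lastImprovementRound >= limits.maxStagnantRounds
  );
}
```

Neither function touches the contribution list, so the per-contribute cost no longer grows with the number of contributions.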

## P3 — Edge cases and polish

### Contract re-evaluation on mid-session change
- **What:** When GROVE.md changes mid-session (cherry-pick #6), optionally re-evaluate existing contributions against the new contract. Currently deferred — only detection + notification + diff ships in v2.
- **Why:** Semantically tricky: a previously accepted contribution might now violate the new contract. Need a policy for handling this (flag, reject, ignore).
- **Effort:** M | **Depends on:** Cherry-pick #6 (contract watcher) shipped

### Empty summary validation
- **What:** `grove_contribute` with an empty string summary passes schema validation. Add minimum length check (e.g., 10 chars).
- **Why:** Empty summaries provide no value in the contribution DAG and make the TUI feed unreadable.
- **Effort:** S | **Depends on:** Phase 1 (enforcement pipeline)
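
A minimal sketch of the check, written as a plain function rather than against the actual schema library, which is not named here. `validateSummary`, `SummaryCheck`, and the whitespace trimming are assumptions; the 10-character minimum is the example value from this item.

```typescript
const MIN_SUMMARY_LENGTH = 10; // example value from this item, not a settled number

type SummaryCheck = { ok: true } | { ok: false; reason: string };

// Rejects empty, whitespace-only, and very short summaries.
function validateSummary(summary: string): SummaryCheck {
  const length = summary.trim().length; // trimming is an assumption
  if (length < MIN_SUMMARY_LENGTH) {
    return {
      ok: false,
      reason: `summary must be at least ${MIN_SUMMARY_LENGTH} characters (got ${length})`,
    };
  }
  return { ok: true };
}
```

Trimming first means a summary made only of spaces is treated the same as an empty one.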

### Score tie policy
- **What:** Define outcome when a new contribution has the same score as the frontier best. Currently undefined — could be "unchanged", "tied", or treated as "improved" (same threshold met).
- **Why:** Tie-breaking affects frontier ranking and outcome derivation. Need a consistent policy.
- **Effort:** S | **Depends on:** Phase 1 (outcome derivation)

### Create DESIGN.md
- **What:** Document Grove's design system: spacing scale, typography, component patterns, voice/tone. Currently only `theme.ts` exists with color tokens.
- **Why:** Web dashboard (12-month roadmap) needs a design system reference to avoid diverging from the TUI's established aesthetic.
- **Effort:** M (via /design-consultation) | **Depends on:** Nothing (can be done anytime)

### Raise dimmed color to #777 for WCAG AA contrast
- **What:** Change `theme.dimmed` from `#666666` to `#777777` in `theme.ts`. Current value fails WCAG AA contrast (3.9:1 vs required 4.5:1).
- **Why:** Accessibility compliance. Minimal visual change — dimmed still looks dimmed.
- **Effort:** S (1 line change) | **Depends on:** Nothing
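
For checking candidate values, here is a sketch of the WCAG 2.x contrast-ratio formula. The function names are hypothetical, and the background color is not stated in this item, so any background passed in is an assumption; the resulting ratio depends on it.

```typescript
// Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition).
function channelToLinear(value8bit: number): number {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex color such as "#777777".
function relativeLuminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

// Contrast ratio between two colors; the argument order does not matter.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}
```

Calling `contrastRatio(theme.dimmed, background)` with the TUI's actual background color shows whether a candidate value clears the 4.5:1 AA threshold for normal text.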

### Contract watcher debounce tuning
- **What:** The contract file watcher (cherry-pick #6) needs a debounce interval to handle rapid saves (e.g., editor auto-save). Start with 1s, tune based on usage.
- **Why:** Without debounce, rapid GROVE.md edits trigger multiple diff/notification cycles.
- **Effort:** S | **Depends on:** Cherry-pick #6 shipped
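
A sketch of a trailing-edge debounce that could wrap the watcher callback. The `debounce` helper and its `cancel` method are hypothetical and do not describe the existing watcher code; the 1s starting interval is from this item.

```typescript
// Trailing-edge debounce: fn runs once, waitMs after the last call in a burst.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const debounced = (...args: A): void => {
    // Each new call resets the countdown.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };

  // Drop any pending invocation (for example on session shutdown).
  debounced.cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };

  return debounced;
}

const CONTRACT_DEBOUNCE_MS = 1000; // starting value from this item; tune based on usage
```

Wrapping the diff/notify handler as `debounce(handler, CONTRACT_DEBOUNCE_MS)` collapses a burst of saves into one cycle that fires after the final save.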
51 changes: 51 additions & 0 deletions docs/designs/grove-v2-architecture.md
@@ -0,0 +1,51 @@
---
status: ACTIVE
---
# CEO Plan: Grove v2 — Contract Enforcement, Event-Driven Routing, acpx Agent Lifecycle

Generated by /plan-ceo-review on 2026-03-20
Branch: worktree-ticklish-coalescing-stonebraker | Mode: SELECTIVE EXPANSION
Repo: windoliver/grove

## Vision

### 10x Check
The 10x version of Grove v2 is not just "enforced contracts + better agent spawning" — it's a **self-healing coordination runtime** where the system automatically detects contract violations, suggests fixes to agents, re-routes work when agents crash, and evolves contracts mid-session based on observed patterns. The enforcement pipeline becomes an intelligent system, not just a validator.

At 10x, Grove becomes the "Kubernetes for AI agents" — you declare the desired state (contract), deploy agents (topology), and the runtime reconciles reality to match. Agents crash? Auto-respawn. Gates too strict? System suggests relaxation. Stop condition met? Graceful shutdown with audit trail.

### Delight Opportunities Surfaced
1. Structured rejection feedback (agent self-corrects on first try)
2. Contract validation CLI (catch errors at authoring time)
3. Dry-run mode for contribute (preview before commit)
4. Enforcement pipeline audit log (black box recorder)
5. Auto-reconnect on crash (resilient lifecycle)
6. Contract diff on mid-session update (live-tuning experiments)

## Scope Decisions

| # | Proposal | Effort | Decision | Reasoning |
|---|----------|--------|----------|-----------|
| 1 | Structured Rejection Feedback | S | ACCEPTED | Enforcement without feedback = frustrating enforcement. Agents need to self-correct. |
| 2 | Contract Validation CLI | S | ACCEPTED | Authoring-time errors >> runtime errors. Type checker for coordination protocol. |
| 3 | Dry-Run Mode for Contribute | S | ACCEPTED | Preview before commit. Agents can evaluate locally before contributing. |
| 4 | Enforcement Pipeline Audit Log | M | ACCEPTED | Observability is not optional. Black box recorder for trustworthy experiments. |
| 5 | Auto-Reconnect on Crash | M | ACCEPTED | Crashes are inevitable with 5+ agents. Resilient lifecycle is a differentiator. |
| 6 | Contract Diff on Mid-Session Update | M | ACCEPTED | Live-tuning experiments without restart. Researcher workflow improvement. |

## Accepted Scope (added to base v2 plan)
- Structured rejection errors in enforcement pipeline (all validation steps return typed error objects)
- `grove contract validate` CLI command
- `grove_contribute --dry-run` parameter on MCP tool
- Audit log for every enforcement pipeline run (stored as metadata or system contributions)
- AgentRuntime auto-reconnect with circuit-breaker (max 3 retries, exponential backoff)
- Contract change detection + diff + agent notification mid-session
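
The retry schedule for the auto-reconnect item can be sketched as a pure function. The 3-retry cap is from the list above; the 500ms base delay and the name `backoffDelayMs` are assumptions, since the plan does not specify them.

```typescript
const MAX_RETRIES = 3;     // from the accepted scope
const BASE_DELAY_MS = 500; // assumed; the plan does not state a base delay

// Delay before the given retry attempt (1-based), doubling each time.
// Returns null once retries are exhausted, meaning the circuit stays open
// and the runtime should stop respawning the agent.
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number | null {
  if (attempt < 1 || attempt > MAX_RETRIES) return null;
  return baseMs * Math.pow(2, attempt - 1);
}
```

Under these assumptions the three retries wait 500ms, 1000ms, and 2000ms, after which the function returns `null`.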

## Deferred to TODOS.md
- (none — all proposals accepted)

## Skipped
- (none)

## Relationship to Prior Plans
- **Swarm Runtime** (2026-03-19, PROMOTED): The swarm design depends on v2's foundation. AgentRuntime interface enables swarm-scale spawning. Enforcement pipeline ensures swarm contributions are valid. Event routing enables swarm convergence signals.