docs: Grove v2 CEO review — architecture decisions & TODOS #135

Open
windoliver wants to merge 3 commits into main from feat/grove-v2-ceo-review

windoliver commented Mar 20, 2026

Summary

Full review chain for #132 (Grove v2 architecture): /office-hours → /plan-ceo-review → /plan-eng-review → /plan-design-review. Locks in all decisions before implementation.

Scope decisions (6 cherry-picks accepted)

  1. Structured rejection feedback — enforcement pipeline returns typed errors so agents self-correct
  2. Contract validation CLI (grove contract validate) — catch GROVE.md errors at authoring time
  3. Dry-run mode (grove_contribute --dry-run) — preview enforcement result without storing
  4. Enforcement pipeline audit log — black box recorder for every contribute call
  5. Auto-reconnect on crash — AgentRuntime respawns agents with circuit-breaker
  6. Contract diff on mid-session update — live-tune experiments without restart
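
A rough sketch of how items 1 and 3 could fit together: the enforcement step returns a typed result, and a dry run goes through the same checks but skips the store write. Every name here (Rejection, EnforcementResult, enforce, describeRejection, the gate shape) is hypothetical and chosen for illustration; the actual pipeline types may look different.

```typescript
// Hypothetical types for illustration; not taken from the Grove codebase.
type Rejection =
  | { kind: "immutable_path"; path: string }
  | { kind: "gate_failed"; metric: string; max: number; actual: number | null };

type EnforcementResult =
  | { ok: true; stored: boolean }
  | { ok: false; stored: false; rejections: Rejection[] };

interface Contribution {
  paths: string[];
  metrics: Record<string, number>;
}

interface Contract {
  immutablePaths: string[];
  gates: { metric: string; max: number }[]; // assumed gate shape: upper bound per metric
}

function enforce(
  c: Contribution,
  contract: Contract,
  store: Contribution[],
  dryRun: boolean
): EnforcementResult {
  const rejections: Rejection[] = [];

  // Touching an immutable path (e.g. GROVE.md) is always a rejection.
  for (const path of c.paths) {
    if (contract.immutablePaths.indexOf(path) !== -1) {
      rejections.push({ kind: "immutable_path", path });
    }
  }

  // A gate fails when the metric is missing or above its bound.
  for (const gate of contract.gates) {
    const actual: number | undefined = c.metrics[gate.metric];
    if (actual === undefined || actual > gate.max) {
      rejections.push({
        kind: "gate_failed",
        metric: gate.metric,
        max: gate.max,
        actual: actual === undefined ? null : actual,
      });
    }
  }

  if (rejections.length > 0) return { ok: false, stored: false, rejections };

  // Dry run: same verdict, but nothing is written.
  if (!dryRun) store.push(c);
  return { ok: true, stored: !dryRun };
}

// Turns a typed rejection into a message an agent can act on.
function describeRejection(r: Rejection): string {
  switch (r.kind) {
    case "immutable_path":
      return `${r.path} is immutable; remove it from the change set`;
    case "gate_failed":
      return `${r.metric} must be <= ${r.max}, got ${r.actual === null ? "no value" : r.actual}`;
  }
}
```

Because the rejection is a discriminated union rather than a free-form string, an agent (or the audit log) can branch on kind instead of parsing messages.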

Architecture decisions (CEO + Eng review)

  • Enforcement pipeline → new enforcement-pipeline.ts (decorator pattern, like enforcing-store)
  • EventBus → Nexus EventBus directly (cross-process), best-effort (DAG is truth)
  • TUI → consolidate 25 views into 5 screens composing existing panels (simple/advanced toggle)
  • Runtime boundary → AgentRuntime (pure agent ops) + SessionManager (claim+workspace+heartbeat). SpawnManager deleted.
  • Hook failures → flagged in metadata, not rolled back
  • First contribution → auto-passes gates (establishes baseline)
  • GROVE.md → in immutablePaths (agents can't modify their own constraints)
  • AgentRuntime tests → conformance test pattern (like store.conformance.ts)
  • Stop eval → accept O(n) for v2, defer incremental to swarm-scale
  • ~80% of Phase 1 is wiring existing code (evaluateStopConditions(), HookRunner, gate schemas all exist)
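
To make three of the decisions above concrete (decorator pattern, first contribution auto-passes, hook failures flagged rather than rolled back), here is a minimal sketch. The interfaces, MemoryStore, the flag method, and the hookFailures metadata key are all assumptions made for this example; they are not the real enforcement-pipeline.ts or store API.

```typescript
// Hypothetical shapes; the real store and hook interfaces in Grove may differ.
interface Contribution {
  id: string;
  score: number;
  metadata: Record<string, unknown>;
}

interface ContributionStore {
  put(c: Contribution): void;
  flag(id: string, key: string, value: unknown): void;
  count(): number;
}

type Gate = (c: Contribution) => string | null; // null = pass, string = rejection reason
type Hook = (c: Contribution) => void;          // may throw

class MemoryStore implements ContributionStore {
  private items: Record<string, Contribution> = {};

  put(c: Contribution): void {
    this.items[c.id] = c;
  }
  flag(id: string, key: string, value: unknown): void {
    const item = this.items[id];
    if (item !== undefined) item.metadata[key] = value;
  }
  get(id: string): Contribution | undefined {
    return this.items[id];
  }
  count(): number {
    return Object.keys(this.items).length;
  }
}

// Decorator: same interface as the store it wraps, with enforcement added.
class EnforcementPipeline implements ContributionStore {
  constructor(
    private inner: ContributionStore,
    private gates: Gate[],
    private hooks: Hook[]
  ) {}

  put(c: Contribution): void {
    // The first contribution establishes the baseline, so gates are skipped.
    if (this.inner.count() > 0) {
      for (const gate of this.gates) {
        const reason = gate(c);
        if (reason !== null) throw new Error(`rejected: ${reason}`);
      }
    }

    this.inner.put(c);

    // Hooks run after the write; a failure is recorded, never rolled back.
    const failures: string[] = [];
    for (const hook of this.hooks) {
      try {
        hook(c);
      } catch (e) {
        failures.push(e instanceof Error ? e.message : String(e));
      }
    }
    if (failures.length > 0) this.inner.flag(c.id, "hookFailures", failures);
  }

  flag(id: string, key: string, value: unknown): void {
    this.inner.flag(id, key, value);
  }
  count(): number {
    return this.inner.count();
  }
}
```

Since the pipeline implements the same interface as the store it wraps, callers can swap one for the other without changes, which is the property that makes the decorator approach attractive here.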

TUI design decisions (Design review — 4/10 → 8/10)

5-screen flow with wireframes:

Screen 1: PRESET SELECT          Screen 2: AGENT DETECT
┌────────────────────────┐      ┌────────────────────────┐
│ Grove            [q]uit│      │ Grove › review-loop [Esc]│
│                        │      │                        │
│ Pick a preset:         │      │ Agents detected:       │
│ > review-loop    2roles│      │ ● claude-code          │
│   research-loop  3roles│      │ ● codex                │
│   custom...            │      │ ○ gemini  not found    │
│                        │      │                        │
│ Recent sessions:       │      │ Role mapping:          │
│   PR #42   12c    2m   │      │ coder    → claude  ●   │
│                        │      │ reviewer → codex   ●   │
│ j/k  Enter  c  q      │      │ Enter  Esc  Tab        │
└────────────────────────┘      └────────────────────────┘

Screen 3: GOAL INPUT            Screen 4: RUNNING (stacked)
┌────────────────────────┐      ┌────────────────────────┐
│ Grove › goal     [Esc] │      │ Grove › abc123  [Tab]  │
│                        │      │                        │
│ What should agents do? │      │ ⣷ coder   working      │
│ ┌──────────────────┐   │      │ ○ reviewer idle        │
│ │ Review PR #42  _ │   │      │                        │
│ └──────────────────┘   │      │ ── Contributions ──    │
│                        │      │ #3 work  Fixed XSS     │
│ Enter start  Esc back  │      │ #2 review  LGTM        │
└────────────────────────┘      │ #1 work  CSRF tokens   │
                                │                        │
Screen 5: COMPLETE              │ [RUNNING] 3c │ 2m      │
┌────────────────────────┐      │ val_bpb ███░░ 37%      │
│ Grove › abc123    ✓    │      │ j/k  d  Tab  q         │
│                        │      └────────────────────────┘
│ Session complete       │
│ target metric reached  │
│                        │
│ 12c │ 4m 32s │ $0.47  │
│ val_bpb: 0.94 (≤0.95) │
│                        │
│ Enter new  r replay  q │
└────────────────────────┘

Key design decisions:

  • Breadcrumbs hide below 70-col terminals (keyboard-first, Esc to go back)
  • Agent crashes = inline status change (✗ crashed — reconnecting… in yellow, not red). Calm, not catastrophic.
  • Screen 3→4 transition is progressive: agents spawn one-by-one with braille spinners
  • Screen 4 advanced mode: Tab toggles to existing 25-panel boardroom, Tab back to simple
  • Screen 5 is data-specific (no emoji, no "great job", just stats + metric)
  • Dimmed color raised to #777 for WCAG AA contrast (from #666)
  • 3 new components needed: ProgressBar, BreadcrumbBar, SpawnProgress (all use existing theme tokens)
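
For a sense of scale, the string-level logic behind ProgressBar and BreadcrumbBar could be as small as the sketch below. The function names, the 70-column constant name, and the choice to show only the app name when breadcrumbs are hidden are assumptions for this example; the real components would render through the TUI framework and theme tokens rather than return raw strings.

```typescript
// Hypothetical helpers for illustration; not the actual component code.
const BREADCRUMB_MIN_COLS = 70;

// Repeats a single character n times.
function repeatChar(ch: string, n: number): string {
  return new Array(n + 1).join(ch);
}

// Renders a fixed-width bar; fraction is clamped to [0, 1].
function renderProgressBar(fraction: number, width: number): string {
  const clamped = Math.min(1, Math.max(0, fraction));
  const filled = Math.round(clamped * width);
  return repeatChar("█", filled) + repeatChar("░", width - filled);
}

// Renders "label bar NN%" for a status line.
function renderMetricLine(label: string, fraction: number, width: number): string {
  const clamped = Math.min(1, Math.max(0, fraction));
  return `${label} ${renderProgressBar(clamped, width)} ${Math.round(clamped * 100)}%`;
}

// Shows the full trail on wide terminals and only the app name below the cutoff
// (what remains visible when hidden is an assumption).
function renderHeader(appName: string, trail: string[], cols: number): string {
  if (cols < BREADCRUMB_MIN_COLS) return appName;
  return [appName].concat(trail).join(" › ");
}
```

The rounding rule for partially filled cells is a design choice; this sketch rounds to the nearest cell, so a real component might pick a different rule to match the wireframes.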

Interaction state matrix:

| Screen | Loading | Empty | Error | Partial |
| --- | --- | --- | --- | --- |
| Preset Select | Scanning… | No presets, c to create | GROVE.md parse error | |
| Agent Detect | Detecting… | No agents, install hint | Detection failed | 2/3 roles mapped |
| Running | Per-agent spawn spinners | No contributions yet | Agent failed to start | Some crashed |
| Complete | | No contributions (abort) | Unexpected end | Manual stop |

Files

  • docs/designs/grove-v2-architecture.md — CEO plan with vision, scope decisions, accepted expansions
  • TODOS.md — 11 deferred items (P2/P3) from CEO + eng + design reviews

References

Test plan

CEO review of issue #132 (Grove v2 architecture). Accepted 6 scope
expansions, locked in key architectural decisions, created TODOS.md
with 8 deferred items.

Eng review discovered evaluateStopConditions() is O(n) per contribute.
Acceptable for v2 scale but needs incremental evaluation for swarm.

Design review rated plan 4/10 → 8/10. Added wireframes, interaction
states, emotional arc, responsive breakpoints, and 6 design decisions.
Two new TODOs: create DESIGN.md and fix dimmed contrast for WCAG AA.