From fb81970e2dbef875f518a30d8b0f426d77e321ef Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 16:43:50 -0700 Subject: [PATCH 1/8] docs: add user guide and test plan --- README.md | 7 + docs/guides/user-guide.md | 528 ++++++++++++++++++++++++++++++++++++++ docs/testing/test-plan.md | 509 ++++++++++++++++++++++++++++++++++++ 3 files changed, 1044 insertions(+) create mode 100644 docs/guides/user-guide.md create mode 100644 docs/testing/test-plan.md diff --git a/README.md b/README.md index a1154c50..52ec1c44 100644 --- a/README.md +++ b/README.md @@ -77,6 +77,10 @@ bun run src/mcp/serve.ts For a fuller end-to-end walkthrough, including claims, threads, checkout, HTTP server, MCP, and TUI usage, see [QUICKSTART.md](QUICKSTART.md). +For a use-case-driven walkthrough that covers CLI, TUI, MCP, server, Nexus, +GitHub, gossip, and ask-user workflows, see +[docs/guides/user-guide.md](./docs/guides/user-guide.md). + If you want generated `dist/` artifacts and bin entrypoints, run: ```bash @@ -86,6 +90,9 @@ bun run build That emits the compiled entrypoints behind the package bins declared in `package.json`: `grove`, `grove-server`, `grove-mcp`, and `grove-mcp-http`. +For a package-by-package coverage plan and release checklist, see +[docs/testing/test-plan.md](./docs/testing/test-plan.md). + ## Mental Model - Contributions are immutable. Updating work means publishing a new diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md new file mode 100644 index 00000000..e21ddd4c --- /dev/null +++ b/docs/guides/user-guide.md @@ -0,0 +1,528 @@ +# Grove User Guide + +Grove is an asynchronous multi-agent work graph. 
In practice, that means you +can: + +- track work, reviews, reproductions, adoptions, and discussions as immutable + contributions +- coordinate agents with lease-based claims instead of branch locking +- browse the graph from the CLI, HTTP API, or operator TUI +- connect agents through MCP +- persist shared state in local SQLite/CAS or in Nexus-backed stores +- federate servers through gossip +- import/export work from GitHub + +This guide is organized by use case instead of internal modules. + +## Choose Your Surface + +| If you want to... | Use | Entry point | +| --- | --- | --- | +| Initialize and operate a local grove by hand | CLI | `grove` | +| Watch the graph, claims, artifacts, VFS, and agent sessions live | TUI | `grove tui` | +| Serve a grove over HTTP for remote clients | Server | `grove-server` | +| Connect a local agent through MCP stdio | MCP stdio | `grove-mcp` | +| Connect remote agents through MCP HTTP/SSE | MCP HTTP | `grove-mcp-http` | +| Store contributions, claims, outcomes, and CAS blobs in Nexus | Nexus adapters | `src/nexus/*` | +| Import/export work to GitHub Discussions and PRs | GitHub bridge | `grove import`, `grove export` | +| Federate multiple grove servers | Gossip | `grove gossip ...` and `GOSSIP_SEEDS` | +| Answer clarifying questions from agents | ask-user sidecar | `grove ask`, `grove-ask-user` | + +## Use Case 1: Start a Grove + +Install and build: + +```bash +bun install +bun run build +``` + +Initialize a grove in the current repo: + +```bash +grove init "Optimize code search" +``` + +Useful variants: + +- seed an initial artifact set with `grove init ... 
--seed ` +- choose evaluation vs exploration mode with `--mode` +- predeclare metrics with `--metric name:direction` +- edit `GROVE.md` after init to define gates, stop conditions, concurrency, + rate limits, topology, and hooks + +What init creates: + +- `.grove/grove.db` for local metadata +- `.grove/cas/` for local content-addressed artifacts +- `GROVE.md` as the human-readable contract + +Use evaluation mode when you care about measurable scores and frontier ranking. +Use exploration mode when the work is more like code archaeology, architecture +discussion, or investigation. + +## Use Case 2: Submit and Coordinate Work + +### Publish a contribution + +The main write command is `grove contribute`. + +Typical examples: + +```bash +grove contribute \ + --kind work \ + --summary "Replace sequential parser with worker pool" \ + --artifacts src/parser.ts \ + --score throughput=5800 \ + --score latency_p99=32 \ + --tag optimization + +grove contribute \ + --kind review \ + --summary "Sequential path is too slow" \ + --reviews blake3:... \ + --score quality=0.5 + +grove contribute \ + --kind reproduction \ + --summary "Confirmed throughput improvement" \ + --reproduces blake3:... \ + --score throughput=5700 +``` + +Contribution kinds: + +- `work` +- `review` +- `discussion` +- `adoption` +- `reproduction` + +Ingestion modes: + +- `--artifacts ...` +- `--from-git-diff ` +- `--from-git-tree` +- `--from-report ` + +Only one ingestion mode can be used per contribution. + +### Post discussion threads + +Use `grove discuss` as the shorthand for discussion contributions: + +```bash +grove discuss "Should this stay event-driven?" +grove discuss blake3:... "I think the queue should stay explicit" --tag architecture +``` + +### Prevent duplicate work with claims + +Claims are temporary leases over a target. 
+ +```bash +grove claim optimize-parser --intent "Benchmark worker-pool design" +grove claims +grove release +``` + +Use claims whenever multiple agents or operators could collide on the same task +or contribution. + +### Check out artifacts to work locally + +```bash +grove checkout blake3:... --to ./workspace +grove checkout --frontier throughput --to ./workspace +``` + +`grove checkout` materializes contribution artifacts into a directory. It can +target a specific contribution or resolve the current best contribution for a +metric from the frontier. + +## Use Case 3: Explore the Graph + +### Inspect ranking and recency + +```bash +grove frontier +grove frontier --metric throughput +grove frontier --tag h100 +grove frontier --mode exploration +``` + +Frontier output includes: + +- best by metric +- by adoption count +- by recency +- by review score +- by reproduction count + +### Search the grove + +```bash +grove search --query "connection pool" +grove search --kind review --agent codex +grove search --tag optimizer --sort adoption +``` + +### List recent activity + +```bash +grove log +grove log --kind work +grove log --mode exploration +grove log --outcome accepted +``` + +### Inspect lineage and discussion + +```bash +grove tree --from blake3:... +grove thread blake3:... +grove threads --tag architecture +``` + +Use `tree` for structural lineage and `thread` / `threads` for +`responds_to`-style discussions. + +## Use Case 4: Mark Outcomes and Run Bounties + +### Outcomes + +Outcomes annotate whether a contribution was accepted, rejected, crashed, or +invalidated. + +```bash +grove outcome set blake3:... accepted --reason "passes perf and regression checks" +grove outcome list --status accepted -n 10 +grove outcome stats +``` + +Use outcomes when you want operator- or evaluator-driven judgments separate +from the contribution itself. + +### Bounties + +Bounties coordinate incentive-bearing tasks. 
In local dev mode they work even +without a durable credits backend. + +```bash +grove bounty create "Reduce parser latency" --amount 500 --deadline 7d +grove bounty list --status open +grove bounty claim +``` + +Use bounties when you want an explicit task market instead of ad hoc claims. + +## Use Case 5: Operate the TUI + +Launch the TUI: + +```bash +grove tui +grove tui --url http://localhost:4515 +grove tui --nexus http://localhost:2026 +``` + +Provider modes: + +- local: reads directly from local SQLite/CAS/workspace managers +- remote: reads from `grove-server` +- Nexus: reads from Nexus-backed stores and enables VFS browsing + +### Core panels + +Panels `1-4` are always visible: + +- `1` DAG +- `2` Detail +- `3` Frontier +- `4` Claims + +These give you the protocol-level view of the grove. + +### Operator panels + +Panels `5-8` are toggled on demand: + +- `5` Agents +- `6` Terminal +- `7` Artifact +- `8` VFS + +Keybindings: + +- `Tab` / `Shift+Tab`: cycle focus +- `j` / `k` or arrows: move selection +- `Enter`: drill into detail or enter directories in VFS +- `Esc`: back out of detail or exit current mode +- `Ctrl+P`: command palette +- `i`: terminal input mode when Terminal is focused +- `q`: quit + +### What each TUI panel is for + +- DAG: browse graph structure and kind/outcome coloring +- Detail: inspect the full manifest, relations, scores, thread, and outcome +- Frontier: inspect ranked entries across frontier signals +- Claims: inspect active claims, leases, and duplicate targets +- Agents: correlate claims with tmux sessions; shows a graph when topology is + configured +- Terminal: watch captured output from the selected tmux session and type into + it +- Artifact: preview text/binary artifacts and diff parent vs child content +- VFS: browse Nexus VFS directories when running with `--nexus` + +### TUI operator workflow + +Recommended flow: + +1. Start in Dashboard or DAG to see current state. +2. Move to Frontier to find the best current work. +3. 
Open Detail on a contribution to inspect scores, relations, and artifacts. +4. Toggle Claims and Agents to see who is working on what. +5. Toggle Artifact to inspect files and diffs. +6. Toggle Terminal to watch a tmux-backed agent session. +7. Use the command palette to spawn or kill sessions when tmux is available. + +### TUI caveats + +- tmux is required for agent session management and terminal capture +- VFS only appears when the provider supports it, which is the Nexus-backed TUI +- current spawn behavior is shell-first: the command palette starts `$SHELL` + rather than a prewired `claude` or `codex` command +- local mode currently has the most complete claim/workspace/session lifecycle +- Nexus mode is strong for shared state and VFS, but its spawned-session + lifecycle is not yet as complete as local mode + +## Use Case 6: Connect Agents Through MCP + +Grove ships two MCP runtimes: + +- `grove-mcp`: stdio transport for local agents +- `grove-mcp-http`: HTTP/SSE transport for remote or shared agents + +See [mcp-setup.md](./mcp-setup.md) for host-specific configuration. 
+ +### Tool families + +Contribution tools: + +- `grove_contribute` +- `grove_review` +- `grove_reproduce` +- `grove_discuss` + +Coordination tools: + +- `grove_claim` +- `grove_release` +- `grove_checkout` +- `grove_check_stop` + +Query tools: + +- `grove_frontier` +- `grove_search` +- `grove_log` +- `grove_tree` +- `grove_thread` + +Outcome tools: + +- `grove_set_outcome` +- `grove_get_outcome` +- `grove_list_outcomes` + +Bounty tools: + +- `grove_bounty_create` +- `grove_bounty_list` +- `grove_bounty_claim` +- `grove_bounty_settle` + +Sidecar tool: + +- `ask_user` + +### Agent identity + +For both CLI and MCP-hosted agents, set identity metadata when possible: + +- `GROVE_AGENT_ID` +- `GROVE_AGENT_NAME` +- `GROVE_AGENT_PROVIDER` +- `GROVE_AGENT_MODEL` +- `GROVE_AGENT_PLATFORM` +- `GROVE_AGENT_TOOLCHAIN` +- `GROVE_AGENT_RUNTIME` + +Identity matters because it shows up in contributions, claims, frontier +filters, and TUI detail views. + +## Use Case 7: Serve Grove Over HTTP + +Start the server: + +```bash +GROVE_DIR=/path/to/.grove PORT=4515 grove-server +``` + +Optional federation environment: + +```bash +GOSSIP_SEEDS=peer-a@http://host-a:4515,peer-b@http://host-b:4515 +GOSSIP_PEER_ID=my-peer +GOSSIP_ADDRESS=http://my-host:4515 +``` + +Primary route groups: + +- `/api/contributions` +- `/api/frontier` +- `/api/search` +- `/api/dag/*` +- `/api/diff/*` +- `/api/threads*` +- `/api/claims*` +- `/api/outcomes*` +- `/api/grove` +- `/api/gossip/*` + +Server mode is the best fit when you want: + +- multiple operators or agents to point at one grove +- HTTP clients or remote TUI access +- gossip-enabled federation between grove servers + +## Use Case 8: Use Nexus-Backed Storage + +Nexus in this repo is a backend adapter layer, not a standalone `grove-nexus` +runtime. 
+ +What Nexus-backed mode provides: + +- contribution store +- claim store +- outcome store +- CAS over Nexus VFS +- VFS browsing from the TUI +- zone scoping and HTTP client support + +Where you use it today: + +- programmatically through `src/nexus/*` +- from the TUI with `grove tui --nexus ` +- in integration tests under `tests/nexus` + +What Nexus mode is best for: + +- shared storage and shared operator visibility +- browsing artifacts and VFS state across a shared zone +- staging toward multi-machine operation + +Current limitation: + +- there is no dedicated Nexus execution control plane in this repo slice, so + local-mode TUI still provides the most complete claim/workspace/session + lifecycle for spawned agents + +## Use Case 9: Bridge Grove and GitHub + +Export a contribution: + +```bash +grove export --to-discussion owner/repo blake3:... +grove export --to-pr owner/repo blake3:... +``` + +Import existing GitHub work into Grove: + +```bash +grove import --from-pr owner/repo#44 +grove import --from-discussion owner/repo#43 +``` + +Use this when you want Grove to be the system of record for ongoing agent work +while still interoperating with existing GitHub discussions or pull requests. + +Requirements: + +- `gh` CLI installed +- authenticated GitHub session available to `gh` + +## Use Case 10: Federate Servers with Gossip + +Gossip is server-to-server federation, not an agent-facing workflow. 
+ +Query a running server: + +```bash +grove gossip peers --server http://localhost:4515 +grove gossip status --server http://localhost:4515 +grove gossip frontier --server http://localhost:4515 +grove gossip watch --server http://localhost:4515 +``` + +Participate directly from the CLI: + +```bash +grove gossip exchange http://peer:4515 --peer-id local-peer +grove gossip shuffle http://peer:4515 --peer-id local-peer +grove gossip sync peer-a@http://a:4515,peer-b@http://b:4515 --peer-id local-peer +grove gossip daemon peer-a@http://a:4515 --peer-id local-peer --port 4516 --interval 30 +``` + +Use gossip when you want frontier propagation, peer discovery, and liveness +tracking across multiple grove servers. + +## Use Case 11: Route Questions Through ask-user + +There are two surfaces here: + +- `grove ask` for CLI-based question answering +- `grove-ask-user` for a standalone MCP sidecar + +Strategies supported by `@grove/ask-user`: + +- `interactive` +- `rules` +- `llm` +- `agent` + +Examples: + +```bash +grove ask "Should I keep the queue explicit?" +grove ask "Which database?" --options Postgres,MySQL,SQLite +GROVE_ASK_USER_CONFIG=./ask-user.json grove-ask-user +``` + +Use this when agents need a consistent way to request clarification without +embedding one-off prompting logic into every host. + +## Use Case 12: Learn from the Example Scenarios + +The examples are the fastest way to understand Grove's intended collaboration +shapes: + +- `examples/autoresearch`: evaluation-mode work, review, reproduction, adoption, + and stop conditions +- `examples/code-exploration`: exploration-mode findings, replies, and reviews +- `examples/multi-agent`: implement/review/reproduce/adopt collaboration + +The `examples/multi-agent/launch.sh` script is the closest current example of a +real multi-agent workflow using MCP tools and a shared grove. 
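
The `peer-id@address` seed lists used by `GOSSIP_SEEDS` (Use Case 7) and `grove gossip sync` (Use Case 10) follow a simple comma-separated shape. As a minimal sketch of how such a list could be parsed: `parseGossipSeeds` and `GossipSeed` are hypothetical names for illustration and are not taken from the Grove source, and the `http(s)://` check is an assumption based only on the examples in this guide.

```typescript
// Hypothetical helper for illustration only; not part of the Grove codebase.
interface GossipSeed {
  peerId: string;
  address: string;
}

// Parses "peer-a@http://a:4515,peer-b@http://b:4515" into structured seeds.
function parseGossipSeeds(raw: string): GossipSeed[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0) // tolerate trailing commas and blanks
    .map((entry) => {
      // Split on the first "@" so the address part is kept whole.
      const at = entry.indexOf("@");
      if (at <= 0 || at === entry.length - 1) {
        throw new Error(`Invalid seed "${entry}": expected peerId@address`);
      }
      const peerId = entry.slice(0, at);
      const address = entry.slice(at + 1);
      // Assumption: seeds are HTTP(S) URLs, as in every example in this guide.
      if (!/^https?:\/\//.test(address)) {
        throw new Error(`Invalid seed address "${address}": expected an http(s) URL`);
      }
      return { peerId, address };
    });
}
```

Rejecting malformed entries up front, rather than skipping them silently, makes a mistyped seed visible at startup instead of surfacing later as a peer that never appears in `grove gossip peers`.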
+ +## Where to Go Deeper + +- [MCP setup guide](./mcp-setup.md) +- [Protocol spec](../../spec/PROTOCOL.md) +- [Grove contract spec](../../spec/GROVE-CONTRACT.md) +- [Lifecycle spec](../../spec/LIFECYCLE.md) +- [Frontier spec](../../spec/FRONTIER.md) +- [Relations spec](../../spec/RELATIONS.md) +- [Test plan](../testing/test-plan.md) diff --git a/docs/testing/test-plan.md b/docs/testing/test-plan.md new file mode 100644 index 00000000..a956bbf1 --- /dev/null +++ b/docs/testing/test-plan.md @@ -0,0 +1,509 @@ +# Grove Test Plan + +This document defines a package-by-package test plan for Grove. It is intended +to do two things: + +1. preserve confidence in the protocol, storage, and agent-facing surfaces that + already have good coverage +2. close the biggest remaining gaps, especially around the TUI and Nexus-backed + operator workflows + +The plan is grouped by package/module and by test layer. + +## Test Layers + +Use these layers consistently: + +- schema/spec tests: contract and JSON-schema validation +- unit tests: pure logic, parsing, mapping, ranking, error handling +- component/provider tests: adapters, views, hooks, and transport wrappers +- integration tests: multiple modules wired together with realistic stores +- scenario/e2e tests: complete user flows across CLI, MCP, server, or examples +- manual operator tests: required where TUI, tmux, Ghostty, or external systems + are involved + +## Current Coverage Snapshot + +Automated test files currently present: + +| Area | Test files | +| --- | ---: | +| `spec` | 4 | +| `packages/ask-user` | 8 | +| `src/core` | 17 | +| `src/local` | 16 | +| `src/cli` | 26 | +| `src/server` | 2 | +| `tests/server` | 11 | +| `src/mcp` | 9 | +| `src/github` | 6 | +| `src/gossip` | 4 | +| `tests/gossip` | 3 | +| `src/tui` | 7 | +| `tests/nexus` | 6 | +| `examples` | 3 | + +Interpretation: + +- strongest areas: protocol/core logic, local storage/workspaces, CLI, server, + GitHub bridge, and gossip +- medium-confidence areas: MCP and 
Nexus adapter internals +- weakest areas: TUI app/view behavior, remote/Nexus operator workflows, and a + few user-facing edges such as the server diff route and MCP outcomes tools + +## Priority Order + +### P0: highest-value gaps + +- add direct tests for TUI app/view behavior +- add remote-provider and nexus-provider tests +- add Nexus-backed operator workflow tests +- add MCP outcome tool tests +- add server diff route tests + +### P1: next confidence layer + +- add end-to-end flows that connect CLI, MCP, server, and TUI +- add GitHub import/export scenario coverage +- add more gossip daemon and federation workflow coverage +- add manual TUI execution checklists to release criteria + +### P2: maintenance and drift control + +- keep example scenarios aligned with docs +- keep MCP docs aligned with registered tool list +- add regression coverage for any new contribution kinds, topology rules, or + server routes as they land + +## Package-By-Package Plan + +### `spec/*` + +Purpose: + +- define protocol and contract shape +- validate JSON schemas and wire formats + +Keep: + +- schema validation tests for contribution, relation, artifact, claim, and + grove-contract schemas + +Add: + +- golden fixtures for valid and invalid contract frontmatter +- compatibility tests between `spec/*` and CLI/server parsers +- regression tests for new contract fields before implementation ships + +Exit criteria: + +- every contract or schema change updates both schema tests and at least one + higher-level integration test + +### `packages/ask-user` + +Purpose: + +- answer clarification questions through `interactive`, `rules`, `llm`, or + `agent` strategies +- register the `ask_user` MCP tool + +Keep: + +- config parsing/loading tests +- strategy tests for rules, interactive, llm, and agent strategies +- registration tests and stdio e2e + +Add: + +- fallback-chain tests for real config combinations used in Grove +- env-driven config tests that mirror CLI and MCP usage +- 
failure-injection tests for agent subprocess timeout, stderr noise, and + malformed config files + +Manual: + +- run `grove ask` with and without `GROVE_ASK_USER_CONFIG` +- run `grove-ask-user` and confirm an MCP host can discover `ask_user` + +### `src/core` + +Purpose: + +- immutable models, manifest/CID logic, contract parsing, frontier ranking, + lifecycle/stop conditions, threads, backoff, errors, topology, hooks, and + workspace/path-safety protocols + +Keep: + +- model immutability +- manifest determinism and CID verification +- frontier ranking and scale behavior +- lifecycle stop-condition evaluation +- contract parsing/validation +- thread traversal and hot-thread logic +- path-safety and subprocess behavior + +Add: + +- topology validation edge cases: duplicate roles, bad edges, invalid tree + parents, spawn depth/child limits +- lifecycle + frontier combined regression tests for mixed evaluation and + exploration contributions +- property-style tests for relation traversal invariants + +Exit criteria: + +- every protocol invariant is enforced in either unit or integration form + +### `src/local` + +Purpose: + +- production local adapter layer: SQLite stores, filesystem CAS, workspace + manager, reconciler, hook runner, local bounty/outcome/gossip stores + +Keep: + +- SQLite store CRUD, FTS, migrations, and concurrency +- CAS reads/writes and concurrency +- workspace lifecycle and conformance +- reconciler behavior +- hook runner behavior +- local outcome and bounty store tests + +Add: + +- hook + workspace integration tests for checkout/contribute cleanup flow +- failure-injection around partial CAS writes and interrupted workspace cleanup +- reconciliation tests involving stale claims plus multiple agent workspaces + +Manual: + +- create a local grove, contribute files, checkout them, and clean workspaces + +### `src/cli` + +Purpose: + +- human/operator command surface + +Commands covered by the test plan: + +- `init` +- `contribute` +- `discuss` +- 
`claim`, `release`, `claims` +- `checkout` +- `frontier`, `search`, `log`, `tree` +- `thread`, `threads` +- `ask` +- `bounty` +- `outcome` +- `import`, `export` +- `gossip` +- `tui` entry behavior + +Keep: + +- parsing and behavior tests for each command family +- CLI integration tests for end-to-end invocation +- formatting tests for list, table, and DAG output + +Add: + +- command crossovers: `contribute` -> `frontier` -> `checkout` +- agent identity propagation across CLI commands +- regression tests for grove-directory discovery and `--grove` overrides +- GitHub CLI error-path tests when `gh` is missing or unauthenticated +- `tui` argument parsing tests for all provider modes + +Manual: + +- run the full operator flow from the CLI only: init, claim, contribute, log, + thread, outcome, bounty + +### `src/server` plus `tests/server` + +Purpose: + +- HTTP API and remote control-plane surface + +Keep: + +- route tests for contributions, claims, frontier, search, DAG, threads, + outcomes, grove metadata, and integration wiring +- middleware error handling tests +- full-server e2e in `src/server/e2e.test.ts` + +Add: + +- dedicated tests for `/api/diff/:parentCid/:childCid/:artifactName` +- artifact metadata and artifact download negative-path tests +- tests for optional behavior when outcome store or gossip is not configured +- topology route tests for missing vs configured topology +- multipart upload edge cases: empty artifacts, duplicate names, invalid CID + +Manual: + +- start `grove-server`, then exercise it from `grove tui --url ...` and from + raw HTTP clients + +### `src/mcp` + +Purpose: + +- agent-facing tool surface and transport bindings + +Keep: + +- agent identity tests +- server integration test asserting the full registered tool surface +- tool-family tests for contributions, claims, queries, workspace, stop, and + bounties + +Add: + +- direct tests for `src/mcp/tools/outcomes.ts` +- transport tests for HTTP/SSE session lifecycle in `grove-mcp-http` +- 
negative-path tests for missing stores or missing workspace manager +- regression tests for token-saving trimmed responses in query tools +- tests ensuring `ask_user` remains registered in the combined server + +Manual: + +- connect Claude Code or Codex to `grove-mcp` +- connect an HTTP MCP client to `grove-mcp-http` + +### `src/nexus` plus `tests/nexus` + +Purpose: + +- Nexus-backed CAS/store adapters and supporting client/cache/semaphore logic + +Keep: + +- unit and integration coverage for Nexus CAS and store behavior +- resilience and edge-case tests +- mock-client coverage + +Add: + +- end-to-end user workflows using Nexus-backed claims, contributions, outcomes, + and TUI browsing +- concurrency/conflict tests across multiple logical agents +- VFS browsing tests tied to real TUI provider expectations +- explicit tests for zone scoping, revision conflict recovery, and retry + backoff behavior under mixed read/write workloads + +Critical gap: + +- no full operator workflow currently ties Nexus to the TUI, MCP, and + claim/workspace/session lifecycle together + +Manual: + +- run `grove tui --nexus ...` +- browse VFS +- inspect contributions, claims, frontier, and outcomes +- once Nexus execution lifecycle is complete, validate spawn/kill and cleanup + +### `src/github` + +Purpose: + +- import/export bridge between Grove and GitHub Discussions/PRs + +Keep: + +- refs parsing +- mapper tests +- error mapping tests +- adapter unit/integration tests +- client conformance tests + +Add: + +- CLI-level import/export scenario tests with realistic contribution content +- artifact-heavy PR export fixtures +- discussion import/export round trips preserving thread context +- failure tests for missing `gh`, auth failure, and rate limiting + +Manual: + +- import a real PR into a scratch grove +- export a contribution to a test repo as both Discussion and PR + +### `src/gossip` plus `tests/gossip` + +Purpose: + +- server federation via CYCLON peer sampling, frontier exchange, 
and liveness + +Keep: + +- CYCLON behavior +- HTTP transport coverage +- protocol behavior +- convergence, routes, and failure propagation tests + +Add: + +- daemon-mode integration tests with more than two peers +- restart/rejoin tests +- tests for stale peer expiry and liveness transitions across timeouts +- throughput/load reporting assertions if queue depth becomes meaningful + +Manual: + +- run two or more `grove-server` instances with `GOSSIP_SEEDS` +- confirm peer discovery and merged frontier convergence + +### `src/tui` + +Purpose: + +- operator command center for DAG, detail, frontier, claims, agents, terminal, + artifact preview, and Nexus VFS + +Current automated coverage: + +- panel-focus and navigation hooks +- graph layout and edge rendering +- tmux manager +- spawn validator +- local provider + +Largest gaps: + +- `src/tui/app.tsx` root behavior +- `src/tui/main.ts` provider-mode wiring +- all major views and most shared components +- `remote-provider.ts` +- `nexus-provider.ts` +- end-user spawn/kill/operator loops + +Add next: + +- view tests for dashboard, DAG, detail, frontier, claims, activity, artifact, + terminal, agent list/graph, and VFS browser +- component tests for table, status bar, panel bar, command palette, and input + handling +- provider tests for remote and Nexus providers +- app-level tests for: + - panel toggling and focus + - detail drill-in and back navigation + - terminal input mode + - command palette spawn/kill flow + - topology-aware graph mode + - artifact diff toggle +- Ghostty fallback tests for terminal rendering + +Manual TUI checklist: + +1. Local mode + - launch `grove tui` + - confirm DAG, Detail, Frontier, and Claims render with seed data + - open Agent and Terminal panels with tmux available + - spawn a session from the command palette + - confirm claim/session visibility and terminal capture + - kill the session and confirm cleanup +2. 
Remote mode + - launch `grove-server` + - connect with `grove tui --url http://localhost:4515` + - confirm dashboard, detail, frontier, claims, artifacts, and outcomes work +3. Nexus mode + - launch `grove tui --nexus ` + - confirm VFS browsing works + - confirm contribution/detail/frontier/outcome views use Nexus-backed data + - document current spawn limitations until the Nexus lifecycle is complete +4. Topology mode + - add topology to `GROVE.md` + - confirm Agent panel renders graph view and command palette respects role + capacity +5. Artifact mode + - preview text artifacts + - preview binary artifacts + - diff parent vs child artifact content + +Release gate: + +- no release should claim TUI support without completing the manual TUI + checklist on the supported provider modes + +### `examples/*` + +Purpose: + +- scenario-level documentation and regression fixtures + +Keep: + +- autoresearch +- code exploration +- multi-agent collaboration + +Add: + +- GitHub import/export scenario +- bounty-driven scenario +- server-backed operator scenario +- Nexus-first operator scenario +- TUI-oriented walkthrough fixture or scripted smoke check + +## Cross-Surface Scenarios to Add + +These are the highest-value integration additions because they reflect how +people actually use Grove: + +1. Local operator loop + - `grove init` + - `grove claim` + - `grove contribute` + - `grove outcome set` + - inspect from `grove tui` +2. MCP multi-agent loop + - two MCP clients connect + - agent A contributes work + - agent B claims and reviews + - agent A derives from review + - stop conditions evaluated +3. Server + TUI loop + - `grove-server` + - remote TUI connects + - claims, contributions, artifacts, outcomes stay consistent +4. Nexus-backed collaboration loop + - Nexus-backed contribution + claim + outcome flow + - TUI reads the same shared state + - manual or automated operator verification +5. 
GitHub bridge loop + - import PR -> review/adopt/discuss in Grove -> export back out +6. Federated server loop + - contributions created on one server + - frontier converges across peers through gossip + +## Release Checklist + +Before a significant release: + +- run `bun test` +- run `bun test --cwd packages/ask-user` +- run `bun run build` +- run `bun run typecheck` +- run `bun run check` +- run the example scenarios +- run the manual TUI checklist +- if GitHub or Nexus code changed, run their integration suites +- if server or MCP code changed, run at least one real transport smoke test + +## Definition of Done for New Features + +Every new user-visible feature should ship with: + +- one unit test for the local logic +- one integration test at the public boundary that users actually touch +- a doc update in the user guide if it changes operator behavior +- a manual TUI checklist update if it affects the TUI + +This is the standard needed to keep Grove coherent as a product rather than a +collection of partially connected subsystems. From 2c59a0a2424c87b6c5c5e878e8aa3a0eab8cce29 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 16:51:01 -0700 Subject: [PATCH 2/8] docs: clarify nexus-first guidance --- docs/guides/user-guide.md | 52 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index e21ddd4c..0eda828a 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -14,6 +14,32 @@ can: This guide is organized by use case instead of internal modules. +## Default Recommendation + +Treat Nexus as the primary Grove operating mode when you have a shared Nexus +endpoint available. 
+ +That means: + +- use Nexus for shared contributions, claims, outcomes, artifacts, and VFS +- use the TUI as a shared operator view over that state +- connect Claude Code, Codex, or other MCP hosts to the same grove + +Use local mode when: + +- you want a single-machine scratch grove +- you are developing or debugging without shared infrastructure +- you need the most complete spawned-session claim/workspace lifecycle today + +Important caveat: + +- `grove tui` still defaults to local mode unless you pass `--nexus` today +- that default is an implementation detail, not the product story we should + optimize around +- if Grove is going to present Nexus as the first-class path, the TUI should + eventually resolve Nexus by configuration and show its active provider mode + clearly on startup + ## Choose Your Surface | If you want to... | Use | Entry point | @@ -430,6 +456,13 @@ Current limitation: local-mode TUI still provides the most complete claim/workspace/session lifecycle for spawned agents +Recommended stance: + +- treat Nexus as the default shared-state backend in docs and onboarding +- treat local mode as the fallback and single-machine compatibility path +- do not imply full local-mode session lifecycle parity in Nexus mode until the + missing claim/workspace/session plumbing is complete + ## Use Case 9: Bridge Grove and GitHub Export a contribution: @@ -517,6 +550,25 @@ shapes: The `examples/multi-agent/launch.sh` script is the closest current example of a real multi-agent workflow using MCP tools and a shared grove. 
+## Testing and Current Gaps + +The detailed engineering matrix lives in +[../testing/test-plan.md](../testing/test-plan.md), but users should know the +high-level support picture: + +- strongest coverage today: `core`, `local`, `cli`, `server`, `github`, and + `gossip` +- medium-confidence areas: MCP and Nexus adapter internals +- weakest areas: TUI app/view behavior, remote/Nexus operator workflows, the + server diff route, and MCP outcome tools + +For TUI users, the practical takeaway is: + +- local mode is the most complete operator path today +- remote mode is solid for observing a server-backed grove +- Nexus mode is strong for shared state and VFS, but still needs lifecycle work + before it can honestly replace local mode for spawned-session management + ## Where to Go Deeper - [MCP setup guide](./mcp-setup.md) From afe23973561e1cd046ac8eb3f843b0b93ed3e118 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 19:01:02 -0700 Subject: [PATCH 3/8] docs: align nexus-first guidance with main --- docs/guides/user-guide.md | 66 ++++++++++++++++++++++----------------- docs/testing/test-plan.md | 59 ++++++++++++++++++++-------------- 2 files changed, 74 insertions(+), 51 deletions(-) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index 0eda828a..092c4326 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -29,16 +29,17 @@ Use local mode when: - you want a single-machine scratch grove - you are developing or debugging without shared infrastructure -- you need the most complete spawned-session claim/workspace lifecycle today +- you need tmux-backed session spawn, kill, and live terminal management today Important caveat: -- `grove tui` still defaults to local mode unless you pass `--nexus` today -- that default is an implementation detail, not the product story we should - optimize around -- if Grove is going to present Nexus as the first-class path, the TUI should - eventually resolve Nexus by 
configuration and show its active provider mode - clearly on startup +- `grove tui` now auto-selects Nexus when it is configured through + `GROVE_NEXUS_URL` or `.grove/grove.json` +- `--url` still means a remote `grove-server` +- `--nexus` remains as an explicit Nexus override when you want to bypass + auto-detection +- if an auto-detected Nexus endpoint is unavailable, not really Nexus, or + requires auth, the TUI falls back to local mode and prints a warning ## Choose Your Surface @@ -248,6 +249,7 @@ Use bounties when you want an explicit task market instead of ad hoc claims. Launch the TUI: ```bash +export GROVE_NEXUS_URL=http://localhost:2026 grove tui grove tui --url http://localhost:4515 grove tui --nexus http://localhost:2026 @@ -255,9 +257,11 @@ grove tui --nexus http://localhost:2026 Provider modes: -- local: reads directly from local SQLite/CAS/workspace managers +- Nexus: auto-selected from `GROVE_NEXUS_URL` or `.grove/grove.json` and + provides shared contributions, claims, outcomes, artifacts, and VFS - remote: reads from `grove-server` -- Nexus: reads from Nexus-backed stores and enables VFS browsing +- local: fallback mode that reads directly from local SQLite/CAS/workspace + managers ### Core panels @@ -300,7 +304,7 @@ Keybindings: - Terminal: watch captured output from the selected tmux session and type into it - Artifact: preview text/binary artifacts and diff parent vs child content -- VFS: browse Nexus VFS directories when running with `--nexus` +- VFS: browse Nexus VFS directories when the active backend is Nexus ### TUI operator workflow @@ -316,13 +320,14 @@ Recommended flow: ### TUI caveats -- tmux is required for agent session management and terminal capture -- VFS only appears when the provider supports it, which is the Nexus-backed TUI +- active backend selection is shown in the dashboard header +- VFS only appears when the active provider is Nexus +- tmux-backed session management and live terminal capture are still local-mode + features 
today - current spawn behavior is shell-first: the command palette starts `$SHELL` rather than a prewired `claude` or `codex` command -- local mode currently has the most complete claim/workspace/session lifecycle -- Nexus mode is strong for shared state and VFS, but its spawned-session - lifecycle is not yet as complete as local mode +- `--url` is for a remote `grove-server`; `--nexus` is an explicit Nexus + override, not the normal way to opt into Nexus-first usage ## Use Case 6: Connect Agents Through MCP @@ -441,7 +446,10 @@ What Nexus-backed mode provides: Where you use it today: - programmatically through `src/nexus/*` -- from the TUI with `grove tui --nexus ` +- from the TUI via `grove tui` when Nexus is configured through + `GROVE_NEXUS_URL` or `.grove/grove.json` +- from the TUI with `grove tui --nexus ` when you want an explicit + override - in integration tests under `tests/nexus` What Nexus mode is best for: @@ -452,16 +460,17 @@ What Nexus mode is best for: Current limitation: -- there is no dedicated Nexus execution control plane in this repo slice, so - local-mode TUI still provides the most complete claim/workspace/session - lifecycle for spawned agents +- Nexus now covers shared-state reads, claims, outcomes, artifacts, VFS, and + workspace bookkeeping, but tmux-backed session spawn and terminal management + are still attached to local mode Recommended stance: - treat Nexus as the default shared-state backend in docs and onboarding -- treat local mode as the fallback and single-machine compatibility path -- do not imply full local-mode session lifecycle parity in Nexus mode until the - missing claim/workspace/session plumbing is complete +- treat local mode as the fallback and single-machine session-manager path +- do not imply that `--url` and Nexus are interchangeable: `--url` is + `grove-server`, Nexus is auto-detected or explicitly overridden with + `--nexus` ## Use Case 9: Bridge Grove and GitHub @@ -558,16 +567,17 @@ high-level support 
picture: - strongest coverage today: `core`, `local`, `cli`, `server`, `github`, and `gossip` -- medium-confidence areas: MCP and Nexus adapter internals -- weakest areas: TUI app/view behavior, remote/Nexus operator workflows, the - server diff route, and MCP outcome tools +- medium-confidence areas: TUI backend resolution and provider lifecycle, MCP, + and Nexus adapter internals +- weakest areas: TUI app/view behavior, full cross-surface operator scenarios, + the server diff route, and MCP outcome tools For TUI users, the practical takeaway is: -- local mode is the most complete operator path today +- `grove tui` is now Nexus-first when Nexus is configured - remote mode is solid for observing a server-backed grove -- Nexus mode is strong for shared state and VFS, but still needs lifecycle work - before it can honestly replace local mode for spawned-session management +- local mode is still the path for tmux-backed session spawn and terminal + capture ## Where to Go Deeper diff --git a/docs/testing/test-plan.md b/docs/testing/test-plan.md index a956bbf1..396b1789 100644 --- a/docs/testing/test-plan.md +++ b/docs/testing/test-plan.md @@ -5,8 +5,8 @@ to do two things: 1. preserve confidence in the protocol, storage, and agent-facing surfaces that already have good coverage -2. close the biggest remaining gaps, especially around the TUI and Nexus-backed - operator workflows +2. close the biggest remaining gaps, especially around end-user TUI behavior + and cross-surface operator workflows The plan is grouped by package/module and by test layer. 
@@ -34,30 +34,32 @@ Automated test files currently present: | `src/local` | 16 | | `src/cli` | 26 | | `src/server` | 2 | -| `tests/server` | 11 | +| `tests/server` | 10 | | `src/mcp` | 9 | | `src/github` | 6 | | `src/gossip` | 4 | | `tests/gossip` | 3 | -| `src/tui` | 7 | -| `tests/nexus` | 6 | +| `src/tui` | 14 | +| `tests/nexus` | 5 | | `examples` | 3 | Interpretation: - strongest areas: protocol/core logic, local storage/workspaces, CLI, server, GitHub bridge, and gossip -- medium-confidence areas: MCP and Nexus adapter internals -- weakest areas: TUI app/view behavior, remote/Nexus operator workflows, and a - few user-facing edges such as the server diff route and MCP outcomes tools +- medium-confidence areas: TUI backend resolution and provider lifecycle, MCP, + and Nexus adapter internals +- weakest areas: TUI app/view behavior, full cross-surface operator scenarios, + and a few user-facing edges such as the server diff route and MCP outcomes + tools ## Priority Order ### P0: highest-value gaps - add direct tests for TUI app/view behavior -- add remote-provider and nexus-provider tests -- add Nexus-backed operator workflow tests +- add higher-level `src/tui/main.ts` startup and provider-wiring tests +- add Nexus-first cross-surface operator scenario tests - add MCP outcome tool tests - add server diff route tests @@ -290,23 +292,24 @@ Keep: Add: - end-to-end user workflows using Nexus-backed claims, contributions, outcomes, - and TUI browsing + TUI browsing, and backend auto-detection - concurrency/conflict tests across multiple logical agents - VFS browsing tests tied to real TUI provider expectations - explicit tests for zone scoping, revision conflict recovery, and retry backoff behavior under mixed read/write workloads -Critical gap: +Remaining gap: -- no full operator workflow currently ties Nexus to the TUI, MCP, and - claim/workspace/session lifecycle together +- no repeatable end-to-end scenario currently ties Nexus auto-detection, TUI + shared-state 
usage, MCP clients, and example workflows together Manual: +- run `GROVE_NEXUS_URL=http://... grove tui` - run `grove tui --nexus ...` - browse VFS - inspect contributions, claims, frontier, and outcomes -- once Nexus execution lifecycle is complete, validate spawn/kill and cleanup +- verify local fallback behavior when auto-detected Nexus is unhealthy ### `src/github` @@ -372,16 +375,19 @@ Current automated coverage: - graph layout and edge rendering - tmux manager - spawn validator +- spawn manager and spawn lifecycle - local provider +- remote provider +- Nexus provider +- backend resolution +- provider-shared and provider-utils helpers Largest gaps: - `src/tui/app.tsx` root behavior -- `src/tui/main.ts` provider-mode wiring +- `src/tui/main.ts` full startup wiring - all major views and most shared components -- `remote-provider.ts` -- `nexus-provider.ts` -- end-user spawn/kill/operator loops +- end-user operator loops across backend modes Add next: @@ -389,7 +395,6 @@ Add next: terminal, agent list/graph, and VFS browser - component tests for table, status bar, panel bar, command palette, and input handling -- provider tests for remote and Nexus providers - app-level tests for: - panel toggling and focus - detail drill-in and back navigation @@ -397,12 +402,17 @@ Add next: - command palette spawn/kill flow - topology-aware graph mode - artifact diff toggle +- startup integration tests for: + - Nexus auto-detection from `GROVE_NEXUS_URL` + - Nexus auto-detection from `.grove/grove.json` + - fallback to local on unhealthy or auth-protected Nexus + - explicit `--url` and `--nexus` precedence - Ghostty fallback tests for terminal rendering Manual TUI checklist: 1. 
Local mode - - launch `grove tui` + - launch `grove tui` with no Nexus configuration - confirm DAG, Detail, Frontier, and Claims render with seed data - open Agent and Terminal panels with tmux available - spawn a session from the command palette @@ -413,10 +423,12 @@ Manual TUI checklist: - connect with `grove tui --url http://localhost:4515` - confirm dashboard, detail, frontier, claims, artifacts, and outcomes work 3. Nexus mode - - launch `grove tui --nexus ` + - launch `GROVE_NEXUS_URL=http://... grove tui` + - verify the dashboard shows a Nexus backend label - confirm VFS browsing works - confirm contribution/detail/frontier/outcome views use Nexus-backed data - - document current spawn limitations until the Nexus lifecycle is complete + - verify local fallback behavior for unhealthy or auth-protected Nexus + - verify explicit `--nexus ` override still works 4. Topology mode - add topology to `GROVE.md` - confirm Agent panel renders graph view and command palette respects role @@ -474,6 +486,7 @@ people actually use Grove: - claims, contributions, artifacts, outcomes stay consistent 4. Nexus-backed collaboration loop - Nexus-backed contribution + claim + outcome flow + - `grove tui` resolves Nexus from config or env - TUI reads the same shared state - manual or automated operator verification 5. GitHub bridge loop From 3d816a1e7716101400639df03622359a5a308764 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 20:28:50 -0700 Subject: [PATCH 4/8] docs: rewrite early guide as operator story --- docs/guides/user-guide.md | 177 ++++++++++++++++++++++++++------------ 1 file changed, 120 insertions(+), 57 deletions(-) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index 092c4326..97e9a2bd 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -88,13 +88,40 @@ Use evaluation mode when you care about measurable scores and frontier ranking. 
Use exploration mode when the work is more like code archaeology, architecture discussion, or investigation. -## Use Case 2: Submit and Coordinate Work +## Use Case 2: Run One Work Round -### Publish a contribution +Suppose you just ran: -The main write command is `grove contribute`. +```bash +grove init "Optimize code search" +``` + +Now you want one agent to improve the parser, another to review it, and a third +to confirm the benchmark actually reproduces. -Typical examples: +### Start by claiming the target + +Claims are the first coordination step. They are short leases over a target, so +multiple agents do not accidentally work on the same thing at once. + +```bash +grove claim optimize-parser --intent "Benchmark worker-pool design" +grove claims +``` + +If the work stops or moves elsewhere, release the claim: + +```bash +grove release +``` + +Use claims for tasks, subsystems, benchmark targets, or even a specific +contribution that someone is actively reviewing. + +### Publish the first work contribution + +The main write command is `grove contribute`. A typical first contribution is a +`work` contribution with artifacts and scores: ```bash grove contribute \ @@ -104,7 +131,35 @@ grove contribute \ --score throughput=5800 \ --score latency_p99=32 \ --tag optimization +``` +That records a new immutable node in the graph. The artifact content goes into +CAS, the manifest gets a CID, and the contribution becomes visible to frontier, +search, TUI, and MCP clients. 
+ +The common contribution kinds are: + +- `work` for a candidate implementation or result +- `review` for feedback on another contribution +- `discussion` for replies and deliberation +- `adoption` for signaling that a contribution should be carried forward +- `reproduction` for confirming that a result holds up independently + +You can ingest artifacts in one of four ways: + +- `--artifacts ...` for explicit files +- `--from-git-diff ` for a patch-style contribution +- `--from-git-tree` for the current tree +- `--from-report ` for a generated report + +Only one ingestion mode can be used per contribution. + +### Add review and reproduction, not just more code + +Once the first implementation exists, the next useful actions are usually +review and reproduction rather than another blind code change. + +```bash grove contribute \ --kind review \ --summary "Sequential path is too slow" \ @@ -118,24 +173,13 @@ grove contribute \ --score throughput=5700 ``` -Contribution kinds: - -- `work` -- `review` -- `discussion` -- `adoption` -- `reproduction` - -Ingestion modes: +That is the main Grove pattern: work creates a candidate, review critiques it, +and reproduction checks whether the result is real. -- `--artifacts ...` -- `--from-git-diff ` -- `--from-git-tree` -- `--from-report ` +### Use discussion when the graph needs conversation -Only one ingestion mode can be used per contribution. - -### Post discussion threads +Not every next step is another artifact. Sometimes the right move is to attach +reasoning to the work before anyone edits more code. Use `grove discuss` as the shorthand for discussion contributions: @@ -144,33 +188,31 @@ grove discuss "Should this stay event-driven?" grove discuss blake3:... "I think the queue should stay explicit" --tag architecture ``` -### Prevent duplicate work with claims - -Claims are temporary leases over a target. 
- -```bash -grove claim optimize-parser --intent "Benchmark worker-pool design" -grove claims -grove release -``` +Use a root discussion when the question is broad, and a reply when the comment +is about one concrete contribution. -Use claims whenever multiple agents or operators could collide on the same task -or contribution. +### Check out the best candidate locally -### Check out artifacts to work locally +When you want to inspect or build on a contribution, materialize its artifacts +into a workspace: ```bash grove checkout blake3:... --to ./workspace grove checkout --frontier throughput --to ./workspace ``` -`grove checkout` materializes contribution artifacts into a directory. It can -target a specific contribution or resolve the current best contribution for a -metric from the frontier. +`grove checkout` either targets one CID directly or resolves the current best +contribution for a metric from the frontier. -## Use Case 3: Explore the Graph +## Use Case 3: Read the Graph Before You Act Again -### Inspect ranking and recency +After one round of work, review, and reproduction, the operator question +changes from "what command do I run?" to "what does the grove think is true +now?" + +### Start with the frontier + +The frontier is the fastest way to see what currently matters. ```bash grove frontier @@ -179,15 +221,20 @@ grove frontier --tag h100 grove frontier --mode exploration ``` -Frontier output includes: +Frontier views highlight: - best by metric -- by adoption count -- by recency -- by review score -- by reproduction count +- most adopted +- most recent +- highest reviewed +- most reproduced + +If you are running an evaluation grove, this is usually the first screen after +new results land. 
-### Search the grove +### Search for specific work or agents + +When you know roughly what you want, search is faster than graph browsing: ```bash grove search --query "connection pool" @@ -195,7 +242,12 @@ grove search --kind review --agent codex grove search --tag optimizer --sort adoption ``` -### List recent activity +Use search when you want a topic, a kind of contribution, or work from a +specific agent rather than the top-ranked frontier entry. + +### Read the recent log + +The log answers "what just happened?" across the whole grove: ```bash grove log @@ -204,7 +256,13 @@ grove log --mode exploration grove log --outcome accepted ``` -### Inspect lineage and discussion +This is useful when several agents are working in parallel and you need a quick +operator scan before opening the TUI. + +### Inspect lineage and conversation together + +Once one contribution matters, inspect both its structural ancestry and its +discussion context: ```bash grove tree --from blake3:... @@ -212,15 +270,19 @@ grove thread blake3:... grove threads --tag architecture ``` -Use `tree` for structural lineage and `thread` / `threads` for -`responds_to`-style discussions. +Use `tree` for "how did this result get here?" and `thread` or `threads` for +"what are people saying about it?" + +## Use Case 4: Close the Loop -## Use Case 4: Mark Outcomes and Run Bounties +Once the grove has enough evidence, you can either record a decision or open a +new market for work. -### Outcomes +### Record the outcome -Outcomes annotate whether a contribution was accepted, rejected, crashed, or -invalidated. +Outcomes are the operator or evaluator judgment layer. They mark whether a +contribution was accepted, rejected, crashed, or invalidated without changing +the contribution CID itself. ```bash grove outcome set blake3:... 
accepted --reason "passes perf and regression checks" @@ -228,13 +290,13 @@ grove outcome list --status accepted -n 10 grove outcome stats ``` -Use outcomes when you want operator- or evaluator-driven judgments separate -from the contribution itself. +Use outcomes when the question is no longer "what was contributed?" but +"what do we believe about it now?" -### Bounties +### Create or claim a bounty -Bounties coordinate incentive-bearing tasks. In local dev mode they work even -without a durable credits backend. +Bounties are for tasks that should stay open until someone explicitly takes +them and finishes them. ```bash grove bounty create "Reduce parser latency" --amount 500 --deadline 7d @@ -242,7 +304,8 @@ grove bounty list --status open grove bounty claim ``` -Use bounties when you want an explicit task market instead of ad hoc claims. +Use bounties when you want a persistent task market instead of lighter-weight +ad hoc claims. ## Use Case 5: Operate the TUI From 4915b0d944909be22c4fbe1006215ff07b5b7bc8 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 20:42:12 -0700 Subject: [PATCH 5/8] docs: clarify source-checkout CLI usage --- docs/guides/user-guide.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index 97e9a2bd..cc861dbc 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -64,12 +64,24 @@ bun install bun run build ``` +If you are working from this source checkout, set a repo-local CLI helper first: + +```bash +export GROVE="bun run src/cli/main.ts" +``` + +If you have installed the compiled bin onto your `PATH`, you can use `grove` +directly instead. + Initialize a grove in the current repo: ```bash -grove init "Optimize code search" +$GROVE init "Optimize code search" ``` +For readability, the rest of this guide uses `grove ...` in examples. In a +source checkout, substitute `$GROVE ...`. 
+ Useful variants: - seed an initial artifact set with `grove init ... --seed ` @@ -93,7 +105,7 @@ discussion, or investigation. Suppose you just ran: ```bash -grove init "Optimize code search" +$GROVE init "Optimize code search" ``` Now you want one agent to improve the parser, another to review it, and a third From f4dabfcffc1330f4e67e386ae660afe21f20ba73 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 21:03:02 -0700 Subject: [PATCH 6/8] docs: use shell function for source-checkout cli --- README.md | 12 ++++++------ docs/guides/user-guide.md | 16 ++++++++-------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index 52ec1c44..4adaaabb 100644 --- a/README.md +++ b/README.md @@ -61,13 +61,13 @@ workspace `@grove/ask-user` package through its published export map. bun install bun run build export GROVE_AGENT_ID=codex-local -export GROVE="bun run src/cli/main.ts" +grove() { bun run src/cli/main.ts "$@"; } -$GROVE init "Latency hunt" --metric latency_ms:minimize -$GROVE contribute --summary "Baseline measurements" --artifacts README.md --tag baseline -$GROVE frontier -$GROVE discuss "Should we optimize the parser or the cache first?" -$GROVE claims +grove init "Latency hunt" --metric latency_ms:minimize +grove contribute --summary "Baseline measurements" --artifacts README.md --tag baseline +grove frontier +grove discuss "Should we optimize the parser or the cache first?" 
+grove claims # Optional runtime surfaces bun run src/server/serve.ts diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index cc861dbc..5e9a5d71 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -64,23 +64,23 @@ bun install bun run build ``` -If you are working from this source checkout, set a repo-local CLI helper first: +If you are working from this source checkout, define a shell helper first: ```bash -export GROVE="bun run src/cli/main.ts" +grove() { bun run src/cli/main.ts "$@"; } ``` -If you have installed the compiled bin onto your `PATH`, you can use `grove` -directly instead. +If you have installed the compiled bin onto your `PATH`, you can skip this +helper. Initialize a grove in the current repo: ```bash -$GROVE init "Optimize code search" +grove init "Optimize code search" ``` -For readability, the rest of this guide uses `grove ...` in examples. In a -source checkout, substitute `$GROVE ...`. +The rest of this guide uses `grove ...` directly. In a source checkout, that +assumes you defined the helper above. Useful variants: @@ -105,7 +105,7 @@ discussion, or investigation. Suppose you just ran: ```bash -$GROVE init "Optimize code search" +grove init "Optimize code search" ``` Now you want one agent to improve the parser, another to review it, and a third From 94a85af0e201dfd85d5d80f36bfc48be40471782 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 21:07:34 -0700 Subject: [PATCH 7/8] docs: use real artifact path in guide --- docs/guides/user-guide.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index 5e9a5d71..9a54f587 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -138,8 +138,8 @@ The main write command is `grove contribute`. 
A typical first contribution is a ```bash grove contribute \ --kind work \ - --summary "Replace sequential parser with worker pool" \ - --artifacts src/parser.ts \ + --summary "Tighten CLI entrypoint behavior" \ + --artifacts src/cli/main.ts \ --score throughput=5800 \ --score latency_p99=32 \ --tag optimization @@ -149,6 +149,9 @@ That records a new immutable node in the graph. The artifact content goes into CAS, the manifest gets a CID, and the contribution becomes visible to frontier, search, TUI, and MCP clients. +In the examples below, `blake3:...` means "the real CID returned by an earlier +command." + The common contribution kinds are: - `work` for a candidate implementation or result From ac2dc51c7616bfb27ff345f2e2e1432fc0ddd717 Mon Sep 17 00:00:00 2001 From: "taofeng.nju@gmail.com" Date: Wed, 11 Mar 2026 21:12:31 -0700 Subject: [PATCH 8/8] docs: clarify cid flow in guide examples --- docs/guides/user-guide.md | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/docs/guides/user-guide.md b/docs/guides/user-guide.md index 9a54f587..863644c5 100644 --- a/docs/guides/user-guide.md +++ b/docs/guides/user-guide.md @@ -149,8 +149,14 @@ That records a new immutable node in the graph. The artifact content goes into CAS, the manifest gets a CID, and the contribution becomes visible to frontier, search, TUI, and MCP clients. -In the examples below, `blake3:...` means "the real CID returned by an earlier -command." +Before moving on, note the CID that `grove contribute` prints. If you missed +it, you can recover it with `grove log`: + +```bash +grove log -n 1 +``` + +In the examples below, replace `blake3:...` with that real CID. The common contribution kinds are: @@ -178,13 +184,13 @@ review and reproduction rather than another blind code change. grove contribute \ --kind review \ --summary "Sequential path is too slow" \ - --reviews blake3:... 
\ + --reviews \ --score quality=0.5 grove contribute \ --kind reproduction \ --summary "Confirmed throughput improvement" \ - --reproduces blake3:... \ + --reproduces \ --score throughput=5700 ``` @@ -200,7 +206,7 @@ Use `grove discuss` as the shorthand for discussion contributions: ```bash grove discuss "Should this stay event-driven?" -grove discuss blake3:... "I think the queue should stay explicit" --tag architecture +grove discuss "I think the queue should stay explicit" --tag architecture ``` Use a root discussion when the question is broad, and a reply when the comment @@ -212,7 +218,7 @@ When you want to inspect or build on a contribution, materialize its artifacts into a workspace: ```bash -grove checkout blake3:... --to ./workspace +grove checkout --to ./workspace grove checkout --frontier throughput --to ./workspace ``` @@ -280,8 +286,8 @@ Once one contribution matters, inspect both its structural ancestry and its discussion context: ```bash -grove tree --from blake3:... -grove thread blake3:... +grove tree --from +grove thread grove threads --tag architecture ``` @@ -300,7 +306,7 @@ contribution was accepted, rejected, crashed, or invalidated without changing the contribution CID itself. ```bash -grove outcome set blake3:... accepted --reason "passes perf and regression checks" +grove outcome set accepted --reason "passes perf and regression checks" grove outcome list --status accepted -n 10 grove outcome stats ```
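
The CID flow described above (note the CID that `grove contribute` prints, or recover it with `grove log -n 1`) can be scripted. This is a minimal sketch, assuming the command output contains the CID as a `blake3:`-prefixed lowercase hex token. The exact output format is not confirmed here, and the `extract_cid` helper and the sample text are hypothetical:

```bash
# Hypothetical helper: pull the first blake3-prefixed token out of text on stdin.
# Assumes CIDs look like "blake3:" followed by lowercase hex digits.
extract_cid() {
  grep -oE 'blake3:[0-9a-f]+' | head -n 1
}

# Illustration with made-up text (not real grove output).
sample='Contribution recorded: blake3:0123abcd (kind=work)'
CID=$(printf '%s\n' "$sample" | extract_cid)
echo "$CID"
```

In a real session, something like `CID=$(grove log -n 1 | extract_cid)` would play the same role, after which `$CID` can be passed wherever the guide's examples expect a CID. If the real output uses a different CID encoding, the pattern needs adjusting.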