From 15bf9b47bbd4f04e77c38e464307795e46447563 Mon Sep 17 00:00:00 2001
From: kepptic <245740836+kepptic@users.noreply.github.com>
Date: Thu, 23 Apr 2026 22:50:15 -0400
Subject: [PATCH] docs: sync for buckets B/C/D
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Post-ship documentation pass after 8273e67 (buckets B+C+D merged to
main).

- README.md: new features surfaced in the "What ghax does today" list —
  `ghax batch` with auto-re-snapshot, dialog-aware snapshots,
  framework-safe fill. Smoke-check count refreshed.
- ARCHITECTURE.md: ref-resolution section explains why batch exists
  (mid-plan ARIA reshuffles) and what dialog-scoping solves.
- CLAUDE.md: adds a "Run a plan in one round-trip" workflow example
  showing the batch JSON shape and the auto-snapshot semantics.
  Smoke-check count refreshed.
- CONTRIBUTING.md: repo layout rewritten around the Rust CLI crate
  (the cli.ts + browser-launch.ts rows were stale since b2748e7
  deleted the Bun source); "Adding a new command" now walks through
  the Rust dispatch wiring instead of the removed cli.ts case.
  Smoke-check count refreshed.
- TODOS.md: Rust CLI rewrite moved to Completed (shipped across
  phases 1-4, cli.ts deleted in b2748e7, Bun removed in 8d1deb5);
  daemon.ts domain-split remains open.
---
 ARCHITECTURE.md | 14 ++++++++++
 CLAUDE.md       | 23 +++++++++++++++--
 CONTRIBUTING.md | 32 +++++++++++++++++------
 README.md       | 20 ++++++++++----
 TODOS.md        | 69 +++++++------------------------------------------
 5 files changed, 84 insertions(+), 74 deletions(-)

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 99c4a26..118ba60 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -166,6 +166,20 @@ resolve against the wrong DOM.
 If the DOM changed and you run `click @e3`, Playwright fails with a
 clear "no element" error — fix by re-snapshotting.
 
+`ghax batch` skips that re-snapshotting ceremony for you: when a step
+inside a batch plan references an `@e` ref, the daemon auto-runs a
+fresh snapshot first and resolves the ref against the current DOM.
+That's the main reason batch exists — on framework-heavy forms where
+an earlier step (like opening a combobox) reshuffles the ARIA tree,
+the ref lookup inside the same batch still lands on the intended
+element. Opt out with `--no-auto-snapshot`.
+
+`ghax snapshot` is also **dialog-aware**: when a modal is open (by
+`[role=dialog]`, `[role=alertdialog]`, ``, or
+`[aria-modal=true]`), the walker treats the top-most visible modal
+as the new root instead of inheriting `aria-hidden="true"` from the
+outer app. `--no-dialog-scope` falls back to body-rooted.
+
 Shadow DOM: the cursor-interactive pass walks open shadow roots and
 emits Playwright chain selectors (`host >> inner`). This is the only
 form of selector Playwright accepts for descending into shadow trees
diff --git a/CLAUDE.md b/CLAUDE.md
index c5ded28..9ae3d80 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -138,6 +138,25 @@ and type commands at the prompt. `exit`/`quit`/Ctrl-D to leave.
 Blank lines and `#` lines are ignored. Quoting works like a real shell:
 `try --css 'body { color: red }'` passes the whole CSS intact.
 
+### Run a plan in one round-trip (stable refs across the sequence)
+
+```bash
+ghax batch '[
+  {"cmd":"goto","args":["https://app.example.com/settings"]},
+  {"cmd":"snapshot","opts":{"interactive":true}},
+  {"cmd":"click","args":["@e7"]},
+  {"cmd":"wait","args":["200"]},
+  {"cmd":"fill","args":["@e9","new-value"]},
+  {"cmd":"click","args":["@e11"]}
+]'
+```
+
+Unlike `chain` (reads stdin, N round-trips), `batch` ships the whole
+plan in one RPC. Between steps that reference `@e` refs, the
+daemon auto-re-snapshots so opening a combobox mid-plan doesn't
+reindex refs out from under you. Pass `--no-auto-snapshot` for
+strict one-shot semantics.
+
 ### Share the browser with a user who's actively working
 
 ```bash
@@ -172,7 +191,7 @@ Every change must pass:
 cargo build --release     # compile Rust CLI (crates/cli/)
 npm run typecheck         # tsc --noEmit (daemon TS + tests)
 npm run build             # bundle daemon → dist/ghax-daemon.mjs (esbuild)
-npm run test:smoke        # 70-check smoke suite against a live Edge session
+npm run test:smoke        # 95-check smoke suite against a live Edge session
 ```
 
 For bigger changes also run:
@@ -212,7 +231,7 @@ design discussion:
   another machine" case.
 - **Skill acceptance eval harness** — scripted Claude API calls against
   the skills with tool-call assertions. Deferred indefinitely
-  because the 70-check E2E smoke catches the same regressions at zero
+  because the 95-check E2E smoke catches the same regressions at zero
   API cost.
 - ~~Source-map resolution for stack frames.~~ Shipped — opt-in via
   `ghax console --source-maps`.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index cf43130..9293f75 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -9,16 +9,24 @@ moving parts so you can land one without friction.
 ```
 ghax/
   bin/ghax                Shell shim — launches the Rust binary from target/release/ghax
+  crates/cli/             Rust CLI — argv parsing, dispatch, daemon RPC. All user-facing verbs.
+    src/main.rs           Entry point + verb dispatch table
+    src/dispatch.rs       Per-verb routing to daemon RPC or local orchestration
+    src/attach.rs         Daemon spawn, CDP probe, port scan, bundle resolution
+    src/qa.rs             QA orchestrator (parallel URL crawl, screenshots, report)
+    src/canary.rs         Post-deploy canary monitor
+    src/ship.rs           Ship workflow (bump, changelog, commit, push, PR)
+    src/rpc.rs            HTTP+JSON client with transient-error retry
+    src/state.rs          State file resolution + daemon liveness
+    src/shell.rs          Interactive REPL (ghax shell)
   src/
-    cli.ts                Argv → daemon RPC. Verb dispatcher + attach/detach specials.
     daemon.ts             Node HTTP daemon. Playwright connectOverCDP + raw CDP pool.
-    browser-launch.ts     Browser detect + CDP probe + scan/findFreePort + --launch/--headless.
     cdp-client.ts         /json/list target discovery + per-target WebSocket pool.
     config.ts             State file resolution (git root → .ghax/ghax.json).
     buffers.ts            CircularBuffer, ConsoleEntry, NetworkEntry, parseStack().
-    snapshot.ts           aria tree → @e refs, cursor-interactive + shadow-DOM pass.
+    snapshot.ts           aria tree → @e refs, cursor-interactive + shadow-DOM + dialog-scope.
   test/
-    smoke.ts              Live-browser harness (70 checks, ~30s).
+    smoke.ts              Live-browser harness (95 checks, ~30s).
     cross-browser.ts      Iterate every detected Chromium browser; run smoke on each.
     benchmark.ts          Headless CLI benchmark vs gstack-browse, playwright-cli, agent-browser.
     hot-reload-smoke.ts   Scripted MV3 hot-reload probe against test/fixtures/test-extension/.
@@ -84,9 +92,17 @@ Both scripts share the same install path. Idempotent — safe to re-run.
 ## Adding a new command
 
 1. Register a handler in `daemon.ts` via `register('name', async (ctx, args, opts) => {...})`.
-2. Add a CLI case in `src/cli.ts` — usually one line with `makeSimple('name')`.
-3. Update the HELP constant + `README.md` + `design/plan/03-commands.md`.
-4. If it should be recorded by `ghax record`, do nothing (it's recorded
+2. Wire the Rust dispatch in `crates/cli/src/dispatch.rs`. For trivial
+   verbs (parse args → POST /rpc → print), add the verb name to one
+   of the existing `match` arms — `simple()` does the rest. For verbs
+   with CLI-side logic (custom print, multi-RPC, shell-out), add a
+   new module under `crates/cli/src/.rs` exposing
+   `pub fn cmd_(parsed: &Parsed) -> Result`, then wire it
+   in `dispatch.rs::dispatch_inner` and declare `mod ;` in
+   `main.rs`. See `qa.rs`, `ship.rs`, `attach.rs` for templates.
+3. Update `crates/cli/src/help.rs` + `README.md` + `design/plan/03-commands.md`.
+4. Add a smoke check in `test/smoke.ts`.
+5. If it should be recorded by `ghax record`, do nothing (it's recorded
    by default).
    If it's meta / read-only, add the name to `NEVER_RECORD` in `daemon.ts`.
 5. If it has a custom exit code, throw `new DaemonError(msg, code)` and
@@ -126,7 +142,7 @@ npm run test:perf # perf budget test — FAILS if P50 regresses past t
 ```
 
 The smoke test requires a running Chromium-family browser on
-`--remote-debugging-port=9222`. It attaches, runs **70 non-destructive
+`--remote-debugging-port=9222`. It attaches, runs **95 non-destructive
 commands** (navigation, snapshots, interaction, extensions, orchestrated
 verbs, `try`, `perf`, console dedup, network status/HAR, new-window
 workflow, `shell` mode tokenising), and detaches. Takes ~30s end-to-end.
diff --git a/README.md b/README.md
index fc945e3..9f261f2 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@ instead of spinning up sandboxed copies.
 **Status**: v0.4 complete. Flagship `ghax browse` plus an orchestrated
 layer (`qa`, `perf`, `profile`, `diff-state`, `ship`, `canary`, `review`,
 `pair`, `try`) and a background-window workflow
-(`find`, `new-window`, `tab --quiet`) for multi-agent use. 70/70 smoke
+(`find`, `new-window`, `tab --quiet`) for multi-agent use. 95 smoke
 checks on Edge + Chrome. Repo is private under `kepptic` for now;
 open-source release paused.
 
@@ -57,10 +57,20 @@ Attach to a running Chrome or Edge over CDP, then drive it:
   message instead of a raw Playwright stack trace.
 - **Responsive testing**: `ghax responsive` snaps mobile / tablet /
   desktop widths; `ghax viewport WxH` for one-offs.
-- **Batch + record + render**: pipe JSON to `ghax chain` for scripted flows;
-  `ghax record start / stop` captures every command into a replayable
-  `.ghax/recordings/.json`; `ghax gif ` stitches the
-  frames via ffmpeg.
+- **Batch + record + render**: `ghax batch '[{"cmd":"click","args":["@e7"]}, …]'`
+  ships a whole plan in one round-trip and auto-re-snapshots between
+  ref-using steps (so clicking a combobox that reshuffles the ARIA tree
+  doesn't wreck later refs); `ghax chain` reads the same shape from
+  stdin for ad-hoc flows; `ghax record start / stop` captures every
+  command into a replayable `.ghax/recordings/.json`; `ghax gif
+  ` stitches the frames via ffmpeg.
+- **Dialog-aware snapshots**: when a modal is open (`[role=dialog]`,
+  ``, `[aria-modal=true]`), `ghax snapshot` walks the
+  dialog instead of the body. Fall back with `--no-dialog-scope`.
+- **Framework-safe `fill`**: native-setter + `input` for React,
+  explicit `blur` for Angular validators, and `contenteditable` paths
+  for Material chip inputs and rich editors — so `fill @e5 "hello"`
+  actually updates state across every framework you'd hit in the wild.
 
 ## Install
 
diff --git a/TODOS.md b/TODOS.md
index 02f0400..c4a49ad 100644
--- a/TODOS.md
+++ b/TODOS.md
@@ -10,66 +10,9 @@ belong here — either flesh it out or close it.
 
 ## Open
 
-### Rewrite the CLI in Rust (public-release gate)
-
-**What:** Replace `src/cli.ts` (~2,071 lines) with a Rust crate that
-produces platform-specific binaries via `cargo-dist`. Daemon stays
-Node/Playwright. Full design + phasing in
-[`design/plan/06-rust-cli-rewrite.md`](./design/plan/06-rust-cli-rewrite.md).
-
-**Why:** Distribution. The Bun-compiled CLI is 61MB because it embeds
-the Bun runtime. A stripped Rust binary is ~10MB per platform. This
-is the last concrete friction between current ghax and a public
-release we'd be satisfied shipping.
-
-Secondary wins: ~2-5ms cold start (vs 37ms Bun), no runtime
-dependency, standard `cargo install` / `brew install` distribution,
-no per-platform Bun builds needed (one `cargo build --release --target
-` per OS × arch).
-
-**Pros:**
-- 6x smaller binary per platform (~10MB vs 61MB)
-- 7-15x faster cold start for single-command invocations
-- Standard Rust cross-compile toolchain via cargo-dist handles
-  macOS/Linux/Windows × x64/ARM in one CI workflow
-- Opens clean install paths: Homebrew tap, `cargo install ghax`, npm
-  wrapper, direct GitHub Release download
-- Rust binary is a more inviting open-source artifact than a 60MB blob
-
-**Cons:**
-- 3-4 days active dev time (per the phasing plan)
-- Dual-language repo during the rewrite window (mitigated by a parity
-  diff test in CI)
-- Contributor pool shifts slightly — JS/TS folks contributing to CLI
-  vs Rust folks. Daemon stays TS so JS contributors still have turf.
-- Node remains a runtime dependency (for daemon) — we can't eliminate
-  it without replacing Playwright, which is out of scope.
-
-**Context:**
-- Decision recorded 2026-04-19 after a perf deep-dive showed the
-  stack is already at its physical floor for single-command
-  invocations (~30ms, dominated by Bun CLI spawn).
-- The design doc covers architecture, dependency choices, per-verb
-  porting plan, distribution story, phasing (4 phases), risks, and
-  success criteria (8 green checks gate the switch).
-- Phase 1 is template work: 45 trivial verbs that are pure RPC +
-  print. Fast.
-- Phase 2 is the real work: attach, qa, canary, ship, review — 8
-  verbs with CLI-side orchestration logic.
-- Phase 3 is SSE + REPL (console/network --follow, ghax shell).
-- Phase 4 flips `bin/ghax` to prefer the Rust binary.
-
-**Depends on / blocked by:** Nothing. The Rust CLI and Bun CLI can
-coexist during the rewrite. Dual-maintenance window lasts ~1-2 weeks.
-
-**Effort:** ~3-4 days active, spread over 2-3 weeks calendar.
-
-**Success criteria:** All 8 gates in `06-rust-cli-rewrite.md` green
-(binary sizes, smoke parity, perf floor, parity diff, Homebrew
-install, docs, cargo-dist release workflow).
 ### Split `src/daemon.ts` by domain
+
 **What:** Extract handler groups into domain-specific files. Approved in
 plan-eng-review on 2026-04-19.
@@ -127,4 +70,12 @@ smoke re-verification.
 
 ## Completed
 
-(Items move here from "Open" once they ship, with commit reference.)
+- **Rewrite the CLI in Rust (public-release gate)** — shipped across
+  phases 1-4. `src/cli.ts` deleted in `b2748e7` (refactor: remove the
+  Bun CLI source — Rust is the single source of truth). `bin/ghax`
+  shim now prefers `target/release/ghax`; installed users run the
+  Rust binary directly. Bun runtime fully removed in `8d1deb5`;
+  esbuild bundles the daemon, tsx runs the tests. All 8 success
+  gates green: ~2.6 MB stripped Apple Silicon binary (under the 10MB
+  target), 70/70 smoke parity, cold-start floor hit, cross-browser
+  green on Edge + Chrome, install-link/install-release flows live.