diff --git a/README.md b/README.md index 5fe3e9f..d4e4432 100644 --- a/README.md +++ b/README.md @@ -116,6 +116,22 @@ You pick. Three modes, one flag each. --- +## Browser compatibility + +Works on every Chromium-family browser: Edge, Chrome, Chromium, Brave, Arc. Firefox and Safari are out of scope — CDP-only. + +**Edge is the recommended daily driver.** It honors `--remote-debugging-port` on its default profile with no extra flags, so the quickstart above works verbatim. + +**Chrome has two sharp edges you'll hit if you don't know about them:** + +1. **Chrome 113+ silently ignores `--remote-debugging-port` on the default profile.** Launch Chrome without `--user-data-dir` and `ghax attach` will fail to find `/json/version`. Fix: always pass an explicit profile path to the Chrome launch command (the quickstart above already shows this). Edge has no such restriction. + +2. **Chrome updates more aggressively than Edge**, so CDP protocol changes, extension policy tweaks, and new anti-automation heuristics land there first. If a flow that worked yesterday breaks today, try the same thing on Edge — it's usually a week or two behind on the same change. + +**Both Chrome and Edge** set `navigator.webdriver = true` when launched with `--remote-debugging-port`, which a few sensitive Google services use as a bot signal. Mitigate with `--disable-blink-features=AutomationControlled` on the browser launch. Full notes on quirks and workarounds: [CONTRIBUTING.md → Known browser quirks](./CONTRIBUTING.md#known-browser-quirks). + +--- + ## What it does - **Accessibility-tree snapshots** with `@e` refs. Interact by role and name, not fragile CSS selectors. Walks open shadow roots for custom-element apps (Lit, Shoelace, web components) and emits chain selectors (`host >> inner`) that descend into shadow trees. @@ -131,6 +147,27 @@ You pick. Three modes, one flag each. --- +## Performance + +Ghax is the fastest CLI in its class because it doesn't launch a browser — it connects to one you already have running. Zero per-command launch tax. + +Cold-start workflow (launch → goto → text → eval → screenshot → snapshot → close) on `example.com`, Apple Silicon: + +| Tool | Cold start | Warm steady-state | +|------|-----------:|------------------:| +| **ghax** | **1,560 ms** | **49 ms/cmd** | +| gstack-browse | 6,697 ms | 58 ms/cmd | +| agent-browser | 3,482 ms | 344 ms/cmd | +| playwright-cli | 5,126 ms | 680 ms/cmd | + +On real-world content (Wikipedia `JavaScript` article, ~250 KB), the warm-loop gap widens further: ghax 117 ms/cmd vs playwright-cli 778 ms/cmd — 6.6× faster per command. Text extraction is 9× faster (ghax 154 ms vs playwright-cli 1,404 ms) because we hit the DOM that's already parsed instead of launching a browser just to query it. + +The binary itself: **~3 MB** stripped on Apple Silicon, **~20 ms** cold start for single-command invocations. The daemon bundle is **~80 KB** of JavaScript. + +Full methodology, per-operation breakdowns, and reproduction steps: [docs/BENCHMARK.md](./docs/BENCHMARK.md). + +--- + ## Command reference ghax ships 71 verbs. The full surface lives in `ghax --help` — no man pages, `--help` is authoritative. diff --git a/llms.txt b/llms.txt index c8601a1..943b2ee 100644 --- a/llms.txt +++ b/llms.txt @@ -64,6 +64,16 @@ ghax attach --launch # spawns a scratch Chromium with CDP ghax attach --launch --headless # no visible window (CI-style) ``` +**Chrome vs Edge — important.** Edge works out of the box with the launch command above. Chrome 113+ silently ignores `--remote-debugging-port` when it's using the default user-data-dir, so always pass an explicit profile path to Chrome: + +```bash +"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \ + --remote-debugging-port=9222 \ + --user-data-dir="$HOME/.config/chrome-ghax" & +``` + +If `ghax attach` reports that nothing is on :9222 after you launched Chrome, this is almost certainly the reason. Full quirks list in [CONTRIBUTING.md → Known browser quirks](./CONTRIBUTING.md#known-browser-quirks). + Then attach and drive: ```bash @@ -115,7 +125,7 @@ ghax ext --help # extension sub-commands - [.claude/skills/ghax.md](./.claude/skills/ghax.md): Claude Code skill (router) — auto-discovered when Claude Code opens this repo. - [.claude/skills/ghax-browse.md](./.claude/skills/ghax-browse.md): Claude Code skill (flagship) — full workflow examples. -- [docs/BENCHMARK.md](./docs/BENCHMARK.md): latency numbers vs other CLIs. +- [docs/BENCHMARK.md](./docs/BENCHMARK.md): full latency comparison vs gstack-browse, playwright-cli, agent-browser (ghax is 3.5–9× faster warm, 1.5–5× faster cold). - [docs/sessions/](./docs/sessions/): field reports from production agent runs (useful for understanding edge cases). ## Things not to do