refactor: close out whole-codebase audit to 10/10 #1

Merged
kepptic merged 22 commits into main from feat/audit-to-10
Apr 21, 2026

kepptic (Owner) commented Apr 21, 2026

Summary

Lands the twelve high-ROI findings from the three-agent /simplify audit that
were skipped in the first pass because they were wider than a mechanical
cleanup. Grouped by theme:

Daemon DRY (items 1-4):

  • evalInTarget() helper centralises nine Runtime.evaluate sites with
    consistent exceptionDetails handling. Also fixes a latent bug in
    ext.storage where a thrown JS expression was silently returned as
    {ok: true}.
  • getSwTarget() owns the four-step find-sw / empty-check / pool.get /
    Runtime.enable sequence across five extension verbs.
  • withCdpSession() owns session open + try/finally + detach lifecycle
    across five gesture / profile sites.
  • ext.{panel,popup,options}.eval now register in a loop — they differ
    only by URL filter and label.
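
As a rough illustration of the first bullet, a helper in the spirit of evalInTarget() might look like the following. This is a minimal sketch, not the code in daemon.ts: the `CdpTarget` interface, the option name `awaitPromise` on the helper, and the shape of `DaemonError` are assumptions made for the example. Only the `Runtime.evaluate` parameters (`expression`, `returnByValue`, `awaitPromise`) and the `exceptionDetails` response field come from the Chrome DevTools Protocol.

```typescript
// Hypothetical minimal view of a CDP target: anything that can send a command.
interface CdpTarget {
  send(method: string, params: Record<string, unknown>): Promise<EvalResponse>;
}

// Subset of the Runtime.evaluate response used by this sketch.
interface EvalResponse {
  result: { value?: unknown; description?: string };
  exceptionDetails?: { text: string; exception?: { description?: string } };
}

// Assumed error type; the real DaemonError may carry more fields.
class DaemonError extends Error {}

async function evalInTarget(
  target: CdpTarget,
  expression: string,
  opts: { awaitPromise?: boolean } = {},
): Promise<unknown> {
  const res = await target.send("Runtime.evaluate", {
    expression,
    returnByValue: true,
    awaitPromise: opts.awaitPromise ?? false,
  });
  // Always surface a thrown expression instead of reporting success.
  if (res.exceptionDetails) {
    const details = res.exceptionDetails;
    throw new DaemonError(details.exception?.description ?? details.text);
  }
  return res.result.value;
}
```

With every call site going through one function like this, a caller such as ext.storage can no longer return {ok: true} for an expression that threw, because the exception check is not optional.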

Rust CLI DRY (items 5-6):

  • New time_util module consolidates three copies of the ISO-8601 /
    days-to-ymd logic. ship.rs was using a slower year-by-year loop
    algorithm than the other two.
  • New qa_common module shares the console_errors_since /
    failed_requests_since filters between qa and canary.

Rust dep swaps (items 7-8):

  • resolve_url delegates to url::Url::join — replaces 25 hand-rolled
    lines.
  • url_encode delegates to the urlencoding crate.

Invariant fix (item 9):

  • ctx.refs is cleared on tab <id> and new-window when the active
    page changes. The CLAUDE.md invariant 'refs survive only until next
    snapshot' held within a tab but broke across tab switches; a new
    smoke check asserts the fix. See follow-up sprint note in plan.md
    for the intra-tab DOM-mutation case (BUG-JNR-03) which this PR
    does NOT address.
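
The invariant fix can be pictured with a small model. This is a sketch under stated assumptions, not the daemon's implementation: `Ctx`, `switchTab`, and `resolveRef` are hypothetical names, and locators are simplified to strings. What it takes from the description is only the behaviour: refs are dropped when the active page changes, and re-selecting the same tab is a no-op.

```typescript
// Hypothetical daemon context: one global ref map plus the active tab id.
interface Ctx {
  activeTabId: string | null;
  refs: Map<string, string>; // ref name (e.g. "@e3") -> locator, simplified to a string
}

// Switch the active tab; drop all refs only when the active page really changes.
function switchTab(ctx: Ctx, tabId: string): void {
  if (ctx.activeTabId === tabId) return; // re-selecting the same tab keeps refs
  ctx.activeTabId = tabId;
  ctx.refs.clear();
}

// Resolve a ref from the last snapshot; stale refs fail loudly.
function resolveRef(ctx: Ctx, ref: string): string {
  const locator = ctx.refs.get(ref);
  if (locator === undefined) throw new Error(`ref ${ref} not found`);
  return locator;
}
```

In this model a ref taken on tab A throws "not found" after switching to tab B, rather than resolving against tab A's DOM.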

Perf wins (items 10-12):

  • console + network RPCs take a since: <epoch-ms> opt. QA and
    canary previously fetched 200-500 buffer entries per page check and
    threw away everything older than cycle start client-side. Moving the
    filter into the daemon shrinks the HTTP payload 10x+ on busy pages.
  • require_daemon trusts /health as the liveness signal and only
    falls back to kill(pid, 0) when /health fails. Every CLI
    invocation shaves a syscall.
  • snapshot.ts caches getComputedStyle() per element for the
    duration of one walk via a scoped WeakMap. Cursor-interactive pass
    was O(n · depth) forced style recalcs on SPA-sized trees; now O(n).

Pre-landing review fixes (1 additional commit after review):

  • Codex + Claude adversarial both caught a regression: after the
    require_daemon reorder, a stale state file pointing at a port now
    reused by a different ghax daemon could silently route RPCs to the
    wrong browser session. Fixed by verifying /health.pid matches
    state.pid.
  • Restored the result.description fallback in ext.sw.eval that was
    lost in the evalInTarget refactor — valid evals returning
    non-serialisable values (functions, BigInts, chrome.runtime) now
    show their CDP description preview again instead of null.
  • parseSinceOpt() rejects NaN and negative --since values loudly
    instead of silently returning empty or disabling the filter.
  • help.rs documents the new --since flag on console + network.
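
The validation described in the parseSinceOpt() bullet could be sketched as below. This is an illustration only, not the daemon's code: the accepted input types, the convention of returning undefined when the option is absent, and the error message are assumptions made for the example.

```typescript
// Validate a --since value (epoch milliseconds).
// Returns undefined when the option is absent; throws on anything that is
// not a finite, non-negative number.
function parseSinceOpt(raw: unknown): number | undefined {
  if (raw === undefined || raw === null) return undefined;
  let n = NaN;
  if (typeof raw === "number") n = raw;
  else if (typeof raw === "string" && raw.trim() !== "") n = Number(raw);
  if (!Number.isFinite(n) || n < 0) {
    throw new Error(`--since must be a non-negative epoch-ms number, got ${String(raw)}`);
  }
  return n;
}
```

Rejecting bad input up front means that neither an empty result set nor a silently disabled filter can be mistaken for a valid answer.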

Plan completion

All 12 planned items DONE ([x] in plan.md). Three items surfaced
during the run are logged under ## Deferred in plan.md:

  • Download interception (Playwright hijacks downloads into temp
    artifacts dir) — split into its own PR.
  • Google anti-automation on sensitive flows
    (business.google.com/verify/...) — docs-only fix.
  • Forty-point field-report triage from the 2026-04-20 8h operator
    session: four buckets (TOK-*, BUG-JNR-*, GHAX-FR-*) under
    ## Follow-up sprint.

Test gate

  • npm run typecheck
  • cargo build --release
  • npm run build (esbuild daemon bundle) ✓ — 76.5kb
  • npm run test:smoke: 81/81 green at 8be98f9 (the last code-touching
    commit before review fixes). Post-review-fix smoke rerun
    was blocked by an environmental Playwright.connectOverCDP timeout
    that reproduces with a minimal 5-line standalone Playwright script
    (the currently-attached Edge session has 56 targets including
    17 iframes + 5 service workers — Playwright's enumeration protocol
    hangs). The review fixes are narrowly scoped (Rust pid check, TS
    fallbackDescription opt, TS parseSinceOpt validator, help text)
    and none touch the daemon startup path that's hanging. All static
    checks pass.

Pre-landing review

Three reviewers ran against the pre-fix tip:

  • Claude structured review: CRITICAL clean. Nine INFORMATIONAL
    findings at confidence 8-10. Top items actioned: the ext.storage /
    evalInTarget error-surface changes are flagged as Breaking in
    CHANGELOG.
  • Claude adversarial subagent: twelve findings. Actioned the
    P1-equivalents (require_daemon stale-port hijack, parseSinceOpt
    input validation). Deferred the concurrency / pageTargetId-WeakMap-
    invalidation items to their own PRs — real but pre-existing
    patterns, not introduced by this PR.
  • Codex structured review: one P1 (require_daemon pid check —
    actioned) + two P2 (help text for --since — actioned; ext.sw.eval
    description fallback — actioned).

All blocker findings addressed in commit ac604f7.

Scope drift

None. Every commit maps to a numbered plan item plus the three
deferred-log commits.

Commits (19 on the branch)

  • 0188bab docs: log 2026-04-20 field-report triage in plan.md follow-up sprint
  • a879c46 docs: log google-anti-automation friction in plan.md Deferred
  • ac604f7 fix: pre-landing review fixes
  • 157f924 docs: log download-interception bug in plan.md Deferred
  • 8be98f9 docs: sync after autopilot run
  • 534de12 refactor: simplify after autopilot run
  • 6bc8c2c perf: plan item 12 — cache getComputedStyle in snapshot cursor-walk
  • 616103a perf: plan item 11 — require_daemon trusts /health, skips redundant kill probe
  • 324db79 feat: plan item 10 — since: filter on console + network RPCs
  • b626595 fix: plan item 9 — clear ctx.refs on tab switch / newWindow
  • 7ca4243 refactor: plan item 8 — urlencoding crate replaces hand-rolled url_encode
  • 97ccca3 refactor: plan item 7 — resolve_url delegates to url::Url::join
  • 50cadf8 refactor: plan item 6 — qa_common.rs shares since-cycle-start filters
  • 28e96de refactor: plan item 5 — time_util.rs consolidates 3 calendar impls
  • 40f8882 refactor: plan item 4 — loop-register the three ext.view.eval handlers
  • a69545f refactor: plan item 3 — withCdpSession() helper owns session lifecycle
  • 1cd78bd refactor: plan item 2 — getSwTarget() helper collapses 5 SW-lookup sites
  • e2ce7e6 refactor: plan item 1 — evalInTarget() helper collapses 9 CDP-eval sites
  • 3c882aa autopilot: checkpoint before run

Test plan

  • typecheck green
  • cargo build release green
  • daemon bundle builds
  • smoke 81/81 at last code-touching commit (8be98f9)
  • smoke rerun on clean browser state — blocked by environmental
    Playwright connectOverCDP hang; reviewers to verify on fresh
    Edge session

🤖 Generated with Claude Code

kepptic added 22 commits April 20, 2026 01:42
@kepptic kepptic merged commit 95fd3f4 into main Apr 21, 2026
4 checks passed
@kepptic kepptic deleted the feat/audit-to-10 branch April 21, 2026 02:18
kepptic added a commit that referenced this pull request Apr 24, 2026
* refactor: /simplify audit pass — cache pageTargetId, parallelize tab walks, parse aria lines once

Three perf wins and one dead-code cleanup from a whole-codebase audit:

- daemon.ts: WeakMap-cache pageTargetId. Target ids are stable for a
  page's lifetime but reading one costs a full CDPSession open +
  Target.getTargetInfo + detach. Every command that walks tabs
  (activePage, tabs, find, status, tab) used to pay that per page per
  call. Hot path is now O(1).
- daemon.ts: tabs and find handlers fan out per-page pageTargetId +
  page.title() via Promise.all instead of a serial await loop. N
  sequential round-trips become one concurrent batch.
- snapshot.ts: the disambiguation pass called parseLine() twice per
  aria line (once to count role+name duplicates, once to emit). Parse
  once into a reused ParsedNode[]. Meaningful on large SPAs.
- dispatch.rs: deleted dead stub() + EXIT_PHASE_PENDING and refreshed
  the stale "Phase 1 + 2" module doc.
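
The first two bullets combine a per-page cache with a concurrent fan-out. A minimal sketch of that combination follows, assuming a hypothetical `PageLike` interface and an injected `lookup` function standing in for the expensive CDP round-trip; the daemon's own pageTargetId also returns null on failure, which this sketch leaves out.

```typescript
// Hypothetical page handle; in the daemon this would be a Playwright Page.
interface PageLike {
  title(): Promise<string>;
}

// Cache keyed by page object, so entries vanish when the page is collected.
const targetIdCache = new WeakMap<PageLike, string>();

// `lookup` stands in for session open + Target.getTargetInfo + detach.
async function pageTargetId(
  page: PageLike,
  lookup: (p: PageLike) => Promise<string>,
): Promise<string> {
  const cached = targetIdCache.get(page);
  if (cached !== undefined) return cached;
  const id = await lookup(page);
  targetIdCache.set(page, id);
  return id;
}

// Fan out per-page work concurrently instead of awaiting each page in turn.
async function listTabs(
  pages: PageLike[],
  lookup: (p: PageLike) => Promise<string>,
): Promise<{ id: string; title: string }[]> {
  return Promise.all(
    pages.map(async (p) => {
      const [id, title] = await Promise.all([pageTargetId(p, lookup), p.title()]);
      return { id, title };
    }),
  );
}
```

A second call to `listTabs` over the same pages performs no further lookups, which is the O(1) hot path the commit message refers to.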

* autopilot: checkpoint before run

* refactor: plan item 1 — evalInTarget() helper collapses 9 CDP-eval sites

All Runtime.evaluate calls in daemon.ts used to open-code the same
shape (returnByValue + optional awaitPromise + optional IIFE wrap +
inconsistent exceptionDetails check). Extracted one helper that
centralises the shape and always throws DaemonError on exception,
which is a behavior improvement for the three sites that silently
swallowed errors (ext.storage in particular used to return
{ok: true} when the JS expression threw).

* refactor: plan item 2 — getSwTarget() helper collapses 5 SW-lookup sites

The findByExtensionId + empty-check + pool.get + Runtime.enable
sequence appeared in ext.reload, ext.hot-reload's log-subscription,
ext.sw.eval, ext.storage, ext.message. One helper now returns
{target, info} so the one site that needs targetInfo.id
(ensureSwLogSubscription) can destructure both.

* refactor: plan item 3 — withCdpSession() helper owns session lifecycle

Five gesture + profile sites all built the CDPSession + try/finally +
detach dance by hand. One helper now owns it. pageTargetId stays
open-coded since its catch-and-return-null shape doesn't fit the
pattern.
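
A lifecycle helper of this kind is a standard try/finally wrapper. The sketch below is an assumption about its general shape rather than the code in daemon.ts: `SessionLike` and the `open` callback are hypothetical, chosen so the example runs without a browser.

```typescript
// Hypothetical minimal session shape: something that can be detached.
interface SessionLike {
  detach(): Promise<void>;
}

// Open a session, run `fn`, and always detach, even when `fn` throws.
async function withCdpSession<S extends SessionLike, T>(
  open: () => Promise<S>,
  fn: (session: S) => Promise<T>,
): Promise<T> {
  const session = await open();
  try {
    return await fn(session);
  } finally {
    // Swallow detach errors so they never mask the original failure.
    await session.detach().catch(() => undefined);
  }
}
```

Callers then pass only the body that needs the session, and cleanup cannot be forgotten at any individual site.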

* refactor: plan item 4 — loop-register the three ext.view.eval handlers

panel, popup, and options eval differ only in a URL-filter regex and
a label. Declare them in a table and register in a for-of.
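
Table-driven registration of this sort might look like the following. Only the three verb names come from the PR; the `handlers` registry, the placeholder regexes, and the stub handler body are assumptions for illustration.

```typescript
type Handler = (expression: string) => string;

// Hypothetical registry; the daemon's real registration API may differ.
const handlers = new Map<string, Handler>();

// The three views differ only by a URL filter and a human-readable label.
// The regexes here are placeholders, not the filters used in daemon.ts.
const extViews = [
  { verb: "ext.panel.eval", urlFilter: /panel/i, label: "side panel" },
  { verb: "ext.popup.eval", urlFilter: /popup/i, label: "popup" },
  { verb: "ext.options.eval", urlFilter: /options/i, label: "options page" },
];

for (const { verb, urlFilter, label } of extViews) {
  handlers.set(verb, (expression) => {
    // A real handler would find the target whose URL matches urlFilter and
    // evaluate the expression there; this stub only describes the intent.
    return `eval ${expression} in ${label} (url ~ ${urlFilter.source})`;
  });
}
```

Adding a fourth view then becomes a one-line change to the table instead of another copied handler.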

* refactor: plan item 5 — time_util.rs consolidates 3 calendar impls

qa.rs and canary.rs each had an identical 30-line Gregorian conversion;
ship.rs had a slower year-by-year-loop variant of the same thing.
One module based on Howard Hinnant's days-to-civil algorithm now serves
all three, and the CLAUDE.md 'avoid chrono' constraint still holds.
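
For readers unfamiliar with the loop-free conversion, here is a sketch of Howard Hinnant's civil_from_days arithmetic. time_util.rs is a Rust module; this is a TypeScript rendering for illustration only, and the function names `daysToYmd` and `isoDate` are hypothetical.

```typescript
// Convert days since 1970-01-01 to a proleptic Gregorian (year, month, day)
// using civil_from_days. No loop over years is needed.
function daysToYmd(days: number): { year: number; month: number; day: number } {
  const z = days + 719468; // shift the epoch to 0000-03-01
  const era = Math.floor(z / 146097); // 400-year cycles
  const doe = z - era * 146097; // day of era, 0..146096
  const yoe = Math.floor(
    (doe - Math.floor(doe / 1460) + Math.floor(doe / 36524) - Math.floor(doe / 146096)) / 365,
  ); // year of era, 0..399
  const doy = doe - (365 * yoe + Math.floor(yoe / 4) - Math.floor(yoe / 100)); // 0..365
  const mp = Math.floor((5 * doy + 2) / 153); // month index counted from March
  const day = doy - Math.floor((153 * mp + 2) / 5) + 1;
  const month = mp < 10 ? mp + 3 : mp - 9;
  const year = yoe + era * 400 + (month <= 2 ? 1 : 0);
  return { year, month, day };
}

// Format epoch milliseconds as the date part of an ISO-8601 UTC timestamp.
function isoDate(epochMs: number): string {
  const { year, month, day } = daysToYmd(Math.floor(epochMs / 86400000));
  const pad = (n: number) => (n < 10 ? "0" : "") + n;
  return `${year}-${pad(month)}-${pad(day)}`;
}
```

Counting years from March puts the leap day at the end of the cycle, which is why the month and day fall out of plain integer division.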

* refactor: plan item 6 — qa_common.rs shares since-cycle-start filters

qa.rs collected entries, canary.rs counted them; both did the same
'filter by level==error and ts>=start' dance against identical RPC
shapes. One module exposes ConsoleErrorEntry / FailedRequestEntry
plus console_errors_since / failed_requests_since; canary just
.len()'s the returned Vec.

* refactor: plan item 7 — resolve_url delegates to url::Url::join

Replaces 25 lines of hand-rolled base+href handling with a 3-line
delegation. Also adds the url + urlencoding crates to Cargo.toml
(url is already in the transitive tree via reqwest; urlencoding is
tiny and lands item 8 cheaply alongside).

* refactor: plan item 8 — urlencoding crate replaces hand-rolled url_encode

urlencoding is tiny (added alongside url in the previous commit) and
its Cow<str> slots straight into the format! arg.

* fix: plan item 9 — clear ctx.refs on tab switch / newWindow

The CLAUDE.md invariant 'Refs survive only until the next snapshot' was
being violated across tab boundaries. ctx.refs is a single global map;
after 'ghax snapshot -i' on tab A followed by 'ghax tab <B>', the ref
@e3 would resolve against tab A's locator and silently land in the
wrong DOM. Both 'tab' and 'newWindow' now clear the map when the
active page actually changes (no-op if you re-select the same tab).
Adds a smoke check that asserts a stale ref after a tab switch throws
'not found'.

* feat: plan item 10 — since: filter on console + network RPCs

qa and canary used to fetch 200-500 buffer entries per page check and
throw away everything older than page_start client-side. Moving the
filter into the daemon shrinks the HTTP payload by 10x+ on busy pages.
qa_common.rs now passes since + errors (for console) so the daemon
drops non-matching entries before serializing.
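
Server-side filtering of the buffer can be sketched as a single predicate. The entry shape and the option names below are assumptions; the commit message only states that a `since` value and an errors-only flag are passed so the daemon can drop entries before serializing.

```typescript
interface ConsoleEntry {
  ts: number; // epoch milliseconds
  level: "log" | "warn" | "error";
  text: string;
}

// Hypothetical query options for the console RPC.
interface ConsoleQuery {
  since?: number; // keep entries with ts >= since
  errors?: boolean; // keep only level === "error"
}

// Filter inside the daemon so only matching entries are serialized.
function queryConsole(buffer: ConsoleEntry[], q: ConsoleQuery): ConsoleEntry[] {
  return buffer.filter(
    (e) =>
      (q.since === undefined || e.ts >= q.since) &&
      (!q.errors || e.level === "error"),
  );
}
```

With no options set the full buffer is returned, so callers that do not pass `since` see the same results as before.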

* perf: plan item 11 — require_daemon trusts /health, skips redundant kill probe

/health already proves liveness. The kill(pid, 0) syscall is only
needed to give a pid-specific error hint when /health can't reach
the daemon, so reorder: try /health first, fall back to kill probe
only when health fails. Every CLI invocation shaves a syscall.
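
The reordered check, including the pid comparison that the later review-fix commit adds, can be modelled as a small decision function. require_daemon lives in the Rust CLI; the TypeScript below is only a sketch of the control flow, with `Health`, `DaemonStatus`, the injected `pidAlive` probe, and the hint strings all hypothetical.

```typescript
interface Health {
  pid: number; // pid reported by the daemon's /health endpoint
}

type DaemonStatus = { ok: true } | { ok: false; hint: string };

// `health` is the parsed /health response, or null when the request failed.
// `pidAlive` stands in for the kill(pid, 0) probe.
function requireDaemon(
  statePid: number,
  health: Health | null,
  pidAlive: (pid: number) => boolean,
): DaemonStatus {
  // Fast path: /health answered and it is the daemon named in the state file.
  if (health !== null && health.pid === statePid) return { ok: true };
  // Slow path: only now pay for the pid probe, to produce a useful hint.
  if (pidAlive(statePid)) {
    return { ok: false, hint: `pid ${statePid} is alive but /health did not confirm it` };
  }
  return { ok: false, hint: `pid ${statePid} is gone; state file is stale` };
}
```

On the healthy path the probe is never called, which is where the saved syscall comes from.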

* perf: plan item 12 — cache getComputedStyle in snapshot cursor-walk

consider() reads getComputedStyle per element; isInFloating() re-reads
it for every ancestor of every candidate. On a 5k-element SPA that's
O(n · depth) forced style recalcs. A WeakMap scoped to the walk cuts
it to O(n). Cache dies when the walk ends — no cross-invocation
staleness.
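
A walk-scoped cache like this is essentially a memoized reader created once per walk. The sketch below runs without a DOM by injecting `compute` in place of getComputedStyle; the `El` shape and the rule for what counts as "floating" are assumptions, not taken from snapshot.ts.

```typescript
// Build a memoized style reader whose cache lives only as long as one walk.
// `compute` stands in for window.getComputedStyle.
function makeStyleReader<E extends object, S>(compute: (el: E) => S): (el: E) => S {
  const cache = new WeakMap<E, S>();
  return (el: E): S => {
    if (cache.has(el)) return cache.get(el) as S;
    const style = compute(el);
    cache.set(el, style);
    return style;
  };
}

// Hypothetical element model: a parent pointer and a CSS position value.
interface El {
  parent: El | null;
  position: string;
}

// Assumed rule: an element is "in floating" if any ancestor is fixed or sticky.
function isInFloating(el: El, styleOf: (e: El) => { position: string }): boolean {
  for (let a = el.parent; a !== null; a = a.parent) {
    const pos = styleOf(a).position;
    if (pos === "fixed" || pos === "sticky") return true;
  }
  return false;
}
```

Because siblings share ancestors, each ancestor's style is computed once per walk no matter how many candidates sit beneath it, and a fresh reader per walk avoids stale entries across invocations.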

* refactor: simplify after autopilot run

Tightened module/block docs in time_util.rs, qa_common.rs, and the
evalInTarget helper block — removed narration of the refactor history
in favor of what each module is and the live invariant it enforces.

* docs: sync after autopilot run

CHANGELOG gets the Unreleased section covering the 12 plan items.
CLAUDE.md + ARCHITECTURE.md pick up the refs-per-tab invariant now
that it's actually enforced. README command reference documents
the new --since opt on console + network.

* docs: log download-interception bug in plan.md Deferred

Out of scope for the audit-to-10 run, but surfaced during it. Split
into its own PR so this one stays focused on the 12 planned items.

* fix: pre-landing review fixes

- state.rs: /health check now verifies daemon pid matches state.pid.
  Codex + Claude adversarial both flagged: after the /health-first
  reorder, a stale state file pointing at a port later reused by a
  different ghax daemon would silently route RPCs to the wrong
  browser session. Daemon's /health already returns pid; CLI now
  compares it and falls back to the pid-alive probe on mismatch.
- daemon.ts: evalInTarget gets a fallbackDescription opt; ext.sw.eval
  uses it so evals returning non-serialisable values (undefined,
  chrome.runtime, functions, BigInts) still show the CDP description
  preview instead of a bare null. Pre-refactor behavior restored.
- daemon.ts: new parseSinceOpt() rejects non-finite / negative --since
  values loudly. Before, --since=NaN silently emptied the result set
  and --since=-1 silently disabled the filter.
- help.rs: document --since flag on console + network.
- CHANGELOG: flag the ext.storage / ext.message error-surface
  tightenings as Breaking.

* docs: log google-anti-automation friction in plan.md Deferred

Second bug surfaced during the ship review. Splitting into its own
follow-up PR with the download-interception fix since both are
browser-integration polish rather than part of the audit-to-10 scope.

* docs: log 2026-04-20 field-report triage in plan.md follow-up sprint

Forty data points from an 8-hour operator session on Google Ads / GBP
workflows. Triaged into four buckets so the next sprint has a concrete
prioritised list instead of a dispersed bug log. Bucket A extends this
PR's payload-reduction theme (item 10); Bucket B captures the real
refs-stability work that item 9 only partially addresses.

* docs: add 2026-04-20 field report

The 8h operator session that surfaced the 40+ data points plan.md
references in the follow-up sprint section. Committed here so the
plan.md pointer doesn't dangle.

* ci: build Rust CLI + stage into dist/ so cross-platform verify actually runs

The 'Verify binaries' step expected ./dist/ghax (or .exe on Windows) but
'bun run build' only produces the daemon bundle. Ubuntu + macOS silently
passed because 'set -e' without pipefail treats './dist/ghax --help |
head -5' as exit-0 when head succeeds on an empty stream; Windows
failed explicitly because its bash shell has pipefail on. Net effect:
CI never actually ran the Rust CLI on any platform.

Fix: add a dtolnay/rust-toolchain step, cargo-cache the target dir,
run 'cargo build --release', copy the built binary into dist/ before
verify, and flip the shell flags to 'set -eo pipefail' so the check
is meaningful on all three OSes.