From c950f0e8bf67faed39630acc15946d899a7ff563 Mon Sep 17 00:00:00 2001
From: Bingran You
Date: Wed, 8 Apr 2026 17:34:54 -0700
Subject: [PATCH] docs: capture test snapshot boundaries and family anchors

---
 .../evidence-levels-and-missing-artifacts.md  | 39 +++++++++++++++++--
 ...ment-fixtures-and-ci-fail-closed-policy.md |  4 ++
 .../test-framework-overview.md                | 16 +++++++-
 .../test-lane-coverage-map.md                 | 12 ++++++
 4 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md b/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
index e0364d8..468e6db 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
@@ -12,6 +12,19 @@ soft_links:
 
 This repository should distinguish between what the current source snapshot proves, what it strongly suggests, and what it does not expose yet.
 
+## Snapshot packaging boundary
+
+The current Claude Code snapshot behaves like a partial source export rather than a full repository checkout.
+
+Direct root-level scanning did not expose:
+
+- the top-level repository manifest or lockfiles
+- committed workflow files or other CI orchestration assets
+- the committed `test/`, `tests/`, `__tests__`, or `fixtures/` directories themselves
+- runner-specific config files or coverage config files
+
+That absence should be treated as evidence, not as an invitation to guess. The missing runner, CI, coverage, and fixture-corpus details are blocked first by snapshot packaging boundaries, not by tree organization.
+
 ## Confirmed from the current snapshot
 
 The snapshot is sufficient to confirm all of these:
@@ -20,26 +33,44 @@ The snapshot is sufficient to confirm all of these:
 - fixture and VCR replay are first-class testing mechanisms
 - there are direct signals for multiple lane families, including at least one compatibility lane, at least one integration lane, dedicated end-to-end harnesses, conformance-sensitive auth verification, and many narrow regression or fidelity oracles
 - narrow seams such as injected dependencies, exported testing helpers, resets, and test-only helper surfaces are part of the current design
+- a shared `test/preload.ts` layer exists for reset and shard-isolation work across same-process tests
+- at least part of the suite runs in a Bun-flavored environment, and at least one visible lane is invoked through a script-wrapped `npm run test:file ...` entrypoint
+- sharded execution exists, including explicit Windows-shard signals
+- coverage output exists as a generated artifact, even though the tool and thresholds are not exposed
 
 The tree should treat those as lane-family and architecture facts, not as proof of the full hidden runner inventory.
 
+## Directly named family and layout signals
+
+The current snapshot directly names enough test assets to anchor the framework:
+
+- `test/utils/settings/backward-compatibility.test.ts` as a script-addressable compatibility lane
+- `test/utils/transcriptSearch.renderFidelity.test.tsx`, `toolSearchText.test.tsx`, `test/utils/powershell/dangerousCmdlets.test.ts`, and `test/utils/sandbox/webfetch-preapproved-separation.test.ts` as narrow regression, fidelity, or policy-boundary contracts
+- `managedSettingsHeadless.int.test.ts` as a true integration lane
+- `daemon/auth.test.ts`, `bash/prefix.test.ts`, `officialRegistry.test.ts`, `backgroundShells.test.ts`, `diskOutput.test.ts`, `spawn.test.ts`, and `validate.test.ts` as additional family signals outside the visible `test/utils` path
+- JSON `fixtures/` recordings rooted at configurable test fixture paths rather than at one hardcoded machine-local directory
+
+These are evidence anchors, not the full upstream test tree.
+
 ## Strongly suggested but not fully proven
 
 The tree can safely treat these as strong signals rather than as closed facts:
 
-- the TypeScript runner environment is Bun-oriented in at least part of the stack
+- the TypeScript runner environment is Bun-oriented in at least part of the stack, but Bun has not been proven to be the only runner for every lane
 - the regression or unit layer is broader than the few directly named test references exposed in comments and helper exports
 - repo-level scripts wrap at least some runner commands instead of every lane being invoked directly
+- historical or local Jest constraints influenced some helper placement, but that does not prove Jest remains the current primary runner
 
 ## Still missing for exact runner-level reproduction
 
 The current snapshot does not fully expose:
 
 - the top-level repository manifest and script table
-- the complete test directory layout
+- the complete test directory layout and exhaustive file inventory
+- the full contents of `test/preload.ts`
 - the exhaustive lane inventory and lane-to-command matrix
 - the full committed fixture corpus
-- the CI workflow and any sharding or coverage rules
+- the CI workflow and the exact sharding or coverage rules
 
 Those artifacts are the main blockers for claiming exact reproduction of upstream test plumbing.
 
@@ -51,5 +82,7 @@ While those artifacts are missing, the tree should:
 - preserve clear evidence labels for inferred versus confirmed details
 - claim lane purpose and behavior ownership more confidently than lane naming or runner wiring
 - refuse to guess exact runner wiring that the snapshot did not show
+- treat directly named test families as anchor examples rather than as an exhaustive file tree
+- avoid treating generic mentions of external tools or generated workflow templates as proof of Claude Code's own hidden test stack
 
 This is a knowledge-quality rule, not a refusal to make progress. The visible framework is already rich enough to guide a clean-room rebuild of the verification architecture itself.
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
index 2078933..7d3e162 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
@@ -37,8 +37,11 @@ Equivalent behavior should preserve:
 Across those families, the shared contract preserves:
 
 - explicit activation in test posture
+- JSON recordings under a `fixtures/` subtree
 - hash-based fixture naming from normalized inputs
+- more than one naming family, including generic name-plus-hash fixtures, transcript-derived API replay fixtures, and dedicated token-count fixtures
 - replay from a configurable fixture root
+- fixture roots coming from explicit test configuration rather than from hardcoded machine-local paths
 - rehydration back into runtime-shaped results rather than raw text blobs
 - replayed results still participating in the same downstream usage, cost, or accounting paths that live responses would drive
 - input dehydration and path normalization so equivalent tests keep hitting the same recordings across machines
@@ -84,6 +87,7 @@ If a clean-room rebuild keeps external API-backed tests, it should preserve all
 
 - a dedicated test posture
 - multiple fixture families when different API-adjacent callers need different oracle shapes
+- a configurable fixture root and committed `fixtures/` subtree rather than machine-local scratch files
 - deterministic fixture hashing and hydration
 - fail-closed CI behavior for missing recordings
 - explicit recording refresh
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
index e1b5fdc..d3128c8 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
@@ -34,6 +34,17 @@ The snapshot provides direct signals for all of these verification layer familie
 - module-state isolation through exported reset, seed, and cleanup helpers for caches, watchers, registries, and other sticky services
 - domain-owned contract assets derived from upstream-native tests
 
+## Visible entry and layout anchors
+
+Even without the hidden top-level manifest, the snapshot still exposes several concrete anchors for the test framework:
+
+- a script-wrapped single-file compatibility path, which names `test/utils/settings/backward-compatibility.test.ts` directly
+- a Bun-flavored test environment for at least part of the suite, because product-owned comments describe behavior under `bun test`
+- a shared `test/preload.ts` setup layer that clears memoized hooks, plugin registries, and other sticky caches between tests that share one process or shard
+- a visible mix of family naming conventions: `test/utils/...`, `.test.ts`, `.test.tsx`, `.int.test.ts`, and non-`test/utils` families such as daemon, shell, task, and registry-focused contracts
+
+These anchors do not reveal the full upstream file tree, but they do prove that the framework is not one undifferentiated runner with ad hoc cases.
+
 ## Stable tier model
 
 A faithful rebuild should preserve these tiers as distinct concerns:
@@ -52,9 +63,12 @@ The subsystem mapping behind those tiers is spelled out in [test-lane-coverage-m
 
 The tree can safely claim:
 
-- there is a script-oriented entry layer
+- there is a script-oriented entry layer, including at least one single-file lane
 - the product code is written to coexist with a Bun-flavored module-mocking environment
 - the visible framework depends on more than a generic "run tests" command
+- a shared preload or reset layer exists to clean module state between same-shard tests
+- sharded execution exists, including at least one Windows-specific shard
+- coverage output exists as a generated artifact, even though the exact coverage driver and thresholds remain hidden
 - the end-to-end harnesses that are visible are designed to preserve real approval, transport, and credential paths rather than UI-only fakes
 
 The tree should not overclaim:
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
index 67ceb43..5bf448b 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
@@ -63,6 +63,18 @@ The visible compatibility lanes protect durable public formats rather than trans
 
 The clearest current example is settings evolution, where additive schema change, invalid-field preservation, and backward compatibility must remain guarded even as the runtime evolves.
 
+## Visible family anchors
+
+The snapshot does not expose the full test tree, but it does expose representative names that anchor the lane map:
+
+- fast regression or fidelity anchors such as `test/utils/transcriptSearch.renderFidelity.test.tsx`, `toolSearchText.test.tsx`, `test/utils/powershell/dangerousCmdlets.test.ts`, `bash/prefix.test.ts`, `spawn.test.ts`, `validate.test.ts`, `officialRegistry.test.ts`, `backgroundShells.test.ts`, and `diskOutput.test.ts`
+- an integration anchor in `managedSettingsHeadless.int.test.ts`
+- a compatibility anchor in `test/utils/settings/backward-compatibility.test.ts`
+- a sandbox-boundary anchor in `test/utils/sandbox/webfetch-preapproved-separation.test.ts`
+- end-to-end permission, bridge, and remote-transport lanes being exposed more through harness-capable product surfaces than through visible test filenames in this snapshot
+
+These names are evidence anchors, not a claim that the full upstream test tree is now visible.
+
 ## Reconstruction rule
 
 A faithful rebuild should preserve: