From c950f0e8bf67faed39630acc15946d899a7ff563 Mon Sep 17 00:00:00 2001
From: Bingran You <bingran.you@berkeley.edu>
Date: Wed, 8 Apr 2026 17:34:54 -0700
Subject: [PATCH] docs: capture test snapshot boundaries and family anchors

---
 .../evidence-levels-and-missing-artifacts.md  | 39 +++++++++++++++++--
 ...ment-fixtures-and-ci-fail-closed-policy.md |  4 ++
 .../test-framework-overview.md                | 16 +++++++-
 .../test-lane-coverage-map.md                 | 12 ++++++
 4 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md b/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
index e0364d8..468e6db 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/evidence-levels-and-missing-artifacts.md
@@ -12,6 +12,19 @@ soft_links:
 
 This repository should distinguish between what the current source snapshot proves, what it strongly suggests, and what it does not expose yet.
 
+## Snapshot packaging boundary
+
+The current Claude Code snapshot behaves like a partial source export rather than a full repository checkout.
+
+Direct root-level scanning did not expose:
+
+- the top-level repository manifest or lockfiles
+- committed workflow files or other CI orchestration assets
+- the committed `test/`, `tests/`, `__tests__/`, or `fixtures/` directories themselves
+- runner-specific config files or coverage config files
+
+That absence should be treated as evidence, not as an invitation to guess. The missing runner, CI, coverage, and fixture-corpus details are hidden primarily by the snapshot's packaging boundary, not by how the tree is organized.
+
 ## Confirmed from the current snapshot
 
 The snapshot is sufficient to confirm all of these:
@@ -20,26 +33,44 @@ The snapshot is sufficient to confirm all of these:
 - fixture and VCR replay are first-class testing mechanisms
 - there are direct signals for multiple lane families, including at least one compatibility lane, at least one integration lane, dedicated end-to-end harnesses, conformance-sensitive auth verification, and many narrow regression or fidelity oracles
 - narrow seams such as injected dependencies, exported testing helpers, resets, and test-only helper surfaces are part of the current design
+- a shared `test/preload.ts` layer exists for reset and shard-isolation work across same-process tests
+- at least part of the suite runs in a Bun-flavored environment, and at least one visible lane is invoked through a script-wrapped `npm run test:file ...` entrypoint
+- sharded execution exists, including explicit Windows-shard signals
+- coverage output exists as a generated artifact, even though the tool and thresholds are not exposed
 
 The tree should treat those as lane-family and architecture facts, not as proof of the full hidden runner inventory.
 
+## Directly named family and layout signals
+
+The current snapshot directly names enough test assets to anchor the framework:
+
+- `test/utils/settings/backward-compatibility.test.ts` as a script-addressable compatibility lane
+- `test/utils/transcriptSearch.renderFidelity.test.tsx`, `toolSearchText.test.tsx`, `test/utils/powershell/dangerousCmdlets.test.ts`, and `test/utils/sandbox/webfetch-preapproved-separation.test.ts` as narrow regression, fidelity, or policy-boundary contracts
+- `managedSettingsHeadless.int.test.ts` as a true integration lane
+- `daemon/auth.test.ts`, `bash/prefix.test.ts`, `officialRegistry.test.ts`, `backgroundShells.test.ts`, `diskOutput.test.ts`, `spawn.test.ts`, and `validate.test.ts` as additional family signals outside the visible `test/utils` path
+- JSON `fixtures/` recordings rooted at configurable test fixture paths rather than at one hardcoded machine-local directory
+
+These are evidence anchors, not the full upstream test tree.
+
 ## Strongly suggested but not fully proven
 
 The tree can safely treat these as strong signals rather than as closed facts:
 
-- the TypeScript runner environment is Bun-oriented in at least part of the stack
+- the TypeScript runner environment is Bun-oriented in at least part of the stack, but Bun has not been proven to be the only runner for every lane
 - the regression or unit layer is broader than the few directly named test references exposed in comments and helper exports
 - repo-level scripts wrap at least some runner commands instead of every lane being invoked directly
+- historical or local Jest constraints influenced some helper placement, but that does not prove Jest remains the current primary runner
 
 ## Still missing for exact runner-level reproduction
 
 The current snapshot does not fully expose:
 
 - the top-level repository manifest and script table
-- the complete test directory layout
+- the complete test directory layout and exhaustive file inventory
+- the full contents of `test/preload.ts`
 - the exhaustive lane inventory and lane-to-command matrix
 - the full committed fixture corpus
-- the CI workflow and any sharding or coverage rules
+- the CI workflow and the exact sharding or coverage rules
 
 Those artifacts are the main blockers for claiming exact reproduction of upstream test plumbing.
 
@@ -51,5 +82,7 @@ While those artifacts are missing, the tree should:
 - preserve clear evidence labels for inferred versus confirmed details
 - claim lane purpose and behavior ownership more confidently than lane naming or runner wiring
 - refuse to guess exact runner wiring that the snapshot did not show
+- treat directly named test families as anchor examples rather than as an exhaustive file tree
+- avoid treating generic mentions of external tools or generated workflow templates as proof of Claude Code's own hidden test stack
 
 This is a knowledge-quality rule, not a refusal to make progress. The visible framework is already rich enough to guide a clean-room rebuild of the verification architecture itself.
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
index 2078933..7d3e162 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-environment-fixtures-and-ci-fail-closed-policy.md
@@ -37,8 +37,11 @@ Equivalent behavior should preserve:
 Across those families, the shared contract preserves:
 
 - explicit activation in test posture
+- JSON recordings under a `fixtures/` subtree
 - hash-based fixture naming from normalized inputs
+- more than one naming family, including generic name-plus-hash fixtures, transcript-derived API replay fixtures, and dedicated token-count fixtures
 - replay from a configurable fixture root
+- fixture roots coming from explicit test configuration rather than from hardcoded machine-local paths
 - rehydration back into runtime-shaped results rather than raw text blobs
 - replayed results still participating in the same downstream usage, cost, or accounting paths that live responses would drive
 - input dehydration and path normalization so equivalent tests keep hitting the same recordings across machines
@@ -84,6 +87,7 @@ If a clean-room rebuild keeps external API-backed tests, it should preserve all
 
 - a dedicated test posture
 - multiple fixture families when different API-adjacent callers need different oracle shapes
+- a configurable fixture root and committed `fixtures/` subtree rather than machine-local scratch files
 - deterministic fixture hashing and hydration
 - fail-closed CI behavior for missing recordings
 - explicit recording refresh
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
index e1b5fdc..d3128c8 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-framework-overview.md
@@ -34,6 +34,17 @@ The snapshot provides direct signals for all of these verification layer familie
 - module-state isolation through exported reset, seed, and cleanup helpers for caches, watchers, registries, and other sticky services
 - domain-owned contract assets derived from upstream-native tests
 
+## Visible entry and layout anchors
+
+Even without the hidden top-level manifest, the snapshot still exposes several concrete anchors for the test framework:
+
+- a script-wrapped single-file compatibility path, which names `test/utils/settings/backward-compatibility.test.ts` directly
+- a Bun-flavored test environment for at least part of the suite, because product-owned comments describe behavior under `bun test`
+- a shared `test/preload.ts` setup layer that clears memoized hooks, plugin registries, and other sticky caches between tests that share one process or shard
+- a visible mix of family naming conventions: `test/utils/...`, `.test.ts`, `.test.tsx`, `.int.test.ts`, and non-`test/utils` families such as daemon, shell, task, and registry-focused contracts
+
+These anchors do not reveal the full upstream file tree, but they do prove that the framework is not one undifferentiated runner with ad hoc cases.
+
 ## Stable tier model
 
 A faithful rebuild should preserve these tiers as distinct concerns:
@@ -52,9 +63,12 @@ The subsystem mapping behind those tiers is spelled out in [test-lane-coverage-m
 
 The tree can safely claim:
 
-- there is a script-oriented entry layer
+- there is a script-oriented entry layer, including at least one single-file lane
 - the product code is written to coexist with a Bun-flavored module-mocking environment
 - the visible framework depends on more than a generic "run tests" command
+- a shared preload or reset layer exists to clean module state between same-shard tests
+- sharded execution exists, including at least one Windows-specific shard
+- coverage output exists as a generated artifact, even though the exact coverage driver and thresholds remain hidden
 - the end-to-end harnesses that are visible are designed to preserve real approval, transport, and credential paths rather than UI-only fakes
 
 The tree should not overclaim:
diff --git a/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md b/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
index 67ceb43..5bf448b 100644
--- a/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
+++ b/reconstruction-guardrails/verification-and-native-test-oracles/test-lane-coverage-map.md
@@ -63,6 +63,18 @@ The visible compatibility lanes protect durable public formats rather than trans
 
 The clearest current example is settings evolution, where additive schema change, invalid-field preservation, and backward compatibility must remain guarded even as the runtime evolves.
 
+## Visible family anchors
+
+The snapshot does not expose the full test tree, but it does expose representative names that anchor the lane map:
+
+- fast regression or fidelity anchors such as `test/utils/transcriptSearch.renderFidelity.test.tsx`, `toolSearchText.test.tsx`, `test/utils/powershell/dangerousCmdlets.test.ts`, `bash/prefix.test.ts`, `spawn.test.ts`, `validate.test.ts`, `officialRegistry.test.ts`, `backgroundShells.test.ts`, and `diskOutput.test.ts`
+- an integration anchor in `managedSettingsHeadless.int.test.ts`
+- a compatibility anchor in `test/utils/settings/backward-compatibility.test.ts`
+- a sandbox-boundary anchor in `test/utils/sandbox/webfetch-preapproved-separation.test.ts`
+- end-to-end permission, bridge, and remote-transport lanes, which this snapshot exposes more through harness-capable product surfaces than through visible test filenames
+
+These names are evidence anchors, not a claim that the full upstream test tree is now visible.
+
 ## Reconstruction rule
 
 A faithful rebuild should preserve: