agent-team-foundation · bingran-you · Apr 14, 2026 · Apr 14, 2026
@@ -14,6 +14,7 @@ This leaf covers:
 
 - interactive `/login` and `/logout` command behavior inside a running session
 - `claude auth login` and `claude auth logout` subcommand behavior
+- the read-only `claude auth status` surface in JSON and human-readable modes
 - the shared OAuth browser and manual-code transport used to acquire auth codes
 - the central token-install path that replaces one authenticated identity with another
 - logout teardown, cache invalidation, and auth-version-driven re-fetch behavior
@@ -46,6 +47,24 @@ Equivalent behavior should preserve:
 - forced organization validation happening after token installation and before the flow is considered complete
 - incompatible CLI flags for mutually exclusive login modes failing early instead of silently picking one
 
+## Read-only auth status surfaces are provider-aware and source-aware
+
+The analyzed source snapshot and the reconstruction validation corpus expose a top-level `claude auth status` surface. The local `claude 2.1.19` help output on this machine does not advertise the `auth` command family, so the safest clean-room claim is that this is a **version-sensitive public surface**, not a guaranteed baseline for every installed build.
+
+Equivalent behavior should preserve:
+
+- `claude auth status` staying read-only and inspection-oriented rather than silently refreshing, logging in, or mutating auth state
+- a human-readable `--text` branch and a structured JSON branch instead of one hard-coded presentation format
+- logged-in detection considering every real auth route the runtime can currently use, including third-party provider mode, OAuth-style auth, helper-managed API keys, and direct API-key environments
+- the structured output surfacing at least overall logged-in state, resolved auth method, and active API-provider family
+- JSON output appending source-specific detail only when it is actually meaningful, such as API-key source or Claude-account fields
+- text output suppressing empty or `none` properties instead of printing a wall of meaningless placeholders
+- direct environment-key-only setups still producing a useful text breadcrumb rather than looking entirely unauthenticated
+- explicit "not logged in" guidance appearing when no usable auth path exists
+- process exit status remaining meaningful for scripts, with success when some usable auth path exists and failure when no usable auth path exists
+
+The reconstruction-critical point is that auth status is not only "am I signed into Claude.ai?" It is a provider-aware inspection surface over the effective credential posture.
+
 ## OAuth transport contract
 
 Equivalent behavior should preserve:
@@ -131,6 +150,7 @@ Equivalent behavior should preserve:
 
 ## Failure modes
 
+- **status lie**: `auth status` reports only Anthropic OAuth state and hides a still-working third-party provider path, or vice versa
 - **partial account switch**: new tokens are stored, but transcript safety cleanup, feature refresh, or cache invalidation is skipped and the live session still behaves like the previous account
 - **destructive helper flow**: a limited-scope OAuth helper mistakenly reuses the full install path and wipes the user's real logged-in session
 - **stale signature replay**: old signed transcript blocks survive an in-session login and the next request fails because they were bound to a different key
@@ -146,5 +166,6 @@ In the observed source, platform-service behavior is verified through sequencing
 Equivalent coverage should prove:
 
 - config resolution, policy gates, persistence, and service startup ordering preserve the contracts and failure handling described above
+- `claude auth status` correctly reflects third-party, OAuth, helper-key, direct-key, and logged-out states in both text and JSON modes
 - provider-backed or OS-bound branches use fixtures, seeded stores, or narrow seams so auth, update, telemetry, and trust behavior stays reproducible
 - users still encounter the expected startup, settings, trust, diagnostics, and account-state behavior through the real CLI surface
@@ -17,4 +17,5 @@ Relevant leaves:
 - **[sandbox-selection-and-bypass-guards.md](sandbox-selection-and-bypass-guards.md)** — How sandbox selection, excluded commands, policy-gated overrides, and Windows refusal paths interact.
 - **[config-permission-and-sandbox-admin-surfaces.md](config-permission-and-sandbox-admin-surfaces.md)** — Registry-backed config mutation on eligible builds, plus permission browser, denied-command retry, and sandbox admin surfaces.
 - **[yolo-classifier-contracts.md](yolo-classifier-contracts.md)** — Acceptance oracles for automatic-approval transcript shaping, fail-safe classifier verdicts, staged review, and environment-specific deny overlays.
+- **[auto-mode-introspection-cli-surfaces.md](auto-mode-introspection-cli-surfaces.md)** — How version-sensitive `claude auto-mode defaults/config/critique` expose default rules, effective merged rules, and critique-only review without mutating live permission posture.
 - **[e2e-permission-testing-contracts.md](e2e-permission-testing-contracts.md)** — How a test-only approval probe reuses the normal permission dialog path without widening the public tool surface.
@@ -0,0 +1,91 @@
+---
+title: "Auto Mode Introspection CLI Surfaces"
+owners: []
+soft_links:
+  - /tools-and-permissions/permissions/permission-mode-transitions-and-gates.md
+  - /tools-and-permissions/permissions/permission-rule-loading-and-persistence.md
+  - /tools-and-permissions/permissions/yolo-classifier-contracts.md
+  - /platform-services/provider-model-mapping-and-capability-gates.md
+  - /reconstruction-guardrails/verification-and-native-test-oracles/released-cli-e2e-test-set.md
+---
+
+# Auto Mode Introspection CLI Surfaces
+
+The analyzed source snapshot and reconstruction validation corpus expose a small top-level `claude auto-mode` namespace. It does not turn auto mode on, and it is not another permission prompt. It is a read-only diagnostic surface for understanding what the classifier would use: the published default rules, the currently effective merged rules, and an optional critique of custom overrides.
+
+The local `claude 2.1.19` help output on this machine does not advertise `auto-mode`, so the safest clean-room claim is that this is a **version-sensitive public surface**, not a universal baseline command family.
+
+## Scope boundary
+
+This leaf covers:
+
+- the read-only `claude auto-mode defaults`, `config`, and `critique` subcommands
+- how those commands expose default classifier rules, effective merged rules, and critique behavior without mutating live permission state
+- how custom-rule replacement semantics are reflected back to the user
+
+It intentionally does not re-document:
+
+- the deeper permission mode transition graph already covered in [permission-mode-transitions-and-gates.md](permission-mode-transitions-and-gates.md)
+- the classifier acceptance oracle already covered in [yolo-classifier-contracts.md](yolo-classifier-contracts.md)
+- the runtime approval pipeline that actually consumes auto mode during tool decisions
+
+## This command family inspects auto mode; it does not activate it
+
+Equivalent behavior should preserve:
+
+- `claude auto-mode ...` staying separate from live mode-switch commands or startup permission flags
+- these commands remaining safe, read-only inspection surfaces rather than hidden ways to widen tool permissions
+- the command family disappearing entirely when the targeted build does not expose classifier introspection, instead of leaving dead stubs that promise behavior the runtime cannot fulfill
+
+The reconstruction-critical distinction is that auto-mode introspection is about legibility of classifier policy, not changing the current approval posture.
+
+## `defaults` exposes the published external baseline rules
+
+Equivalent behavior should preserve:
+
+- `claude auto-mode defaults` printing a machine-readable JSON payload
+- that payload covering the main user-facing rule sections for the external classifier policy, including allow, soft-deny, and environment-oriented guidance
+- the defaults command exposing the externally relevant baseline rather than leaking internal-only override layers that the ordinary user-facing classifier contract does not promise
+
+## `config` exposes the effective merged rules with replace-by-section semantics
+
+Equivalent behavior should preserve:
+
+- `claude auto-mode config` printing JSON rather than human prose
+- effective config reflecting user-supplied sections when present and falling back to defaults when a section is absent or empty
+- section merging using replace-by-section semantics rather than concatenating user rules onto defaults blindly
+- the output making it possible to see what the classifier would actually read without re-deriving precedence from several settings files by hand
+
+The important clean-room point is that non-empty custom sections replace the corresponding default section; they do not merely append to it.
+
+## `critique` is a side-query review surface over custom rules
+
+Equivalent behavior should preserve:
+
+- `claude auto-mode critique` acting only when the user actually has custom auto-mode rules
+- an explicit empty-state response when no custom rules exist, including guidance to consult defaults for reference
+- critique using the current main-loop model by default, with an optional model override
+- the critique path reviewing custom rules in the context of the defaults they replace, not as isolated disconnected text snippets
+- critique output focusing on clarity, completeness, conflicts, and actionability rather than pretending to be the classifier's final allow or block verdict
+- critique failure surfacing as an explicit command error instead of silently falling back to invented advice
+
+This preserves the user-facing promise: critique helps the operator improve custom rules, but it is not the auto-approval engine itself.
+
+## Failure modes
+
+- **surface conflation**: the command family is rebuilt as a hidden way to enter auto mode rather than to inspect it
+- **append-only drift**: `config` merges user rules with defaults additively when the observed contract uses replace-by-section semantics
+- **default leak**: `defaults` exposes internal-only policy layers or rollout-specific overrides instead of the ordinary external baseline
+- **empty-state silence**: `critique` with no custom rules returns nothing useful and leaves the user without a next step
+- **pseudo-classifier output**: critique is presented as if it were the actual approval verdict rather than advisory analysis of custom rules
+
+## Test Design
+
+In the observed source, this surface is exercised through packaged CLI checks and rule-shaping behavior that reuses the same classifier rule sources the runtime would later consume.
+
+Equivalent coverage should prove:
+
+- `defaults` and `config` return stable JSON shapes with the expected section semantics
+- custom-section replacement versus default fallback behaves exactly as documented here
+- `critique` produces the expected empty-state guidance when no custom rules exist and a bounded advisory response when they do
+- command availability is version- or build-sensitive only where the product truly hides the surface, not because the rebuild forgot to wire it