Skip to content

feat(support-surface): cut over adapters and guidance#205

Closed
MacAttak wants to merge 9 commits intomainfrom
work/04-adapter-and-support-surface-cutover
Closed

feat(support-surface): cut over adapters and guidance#205
MacAttak wants to merge 9 commits intomainfrom
work/04-adapter-and-support-surface-cutover

Conversation

@MacAttak
Copy link
Copy Markdown
Contributor

Summary

  • Align Claude Code, Codex, and Opencode on one shared operator-surface contract for branch, closeout, approval, warnings, and next-step guidance.
  • Align support skills and command wrappers on the same runtime-root and worktree-adoption vocabulary.
  • Add regression proof that fails closed when source guidance or packaged surfaces drift.

Approval Lineage

  • Design approval: STALE (artifact-set-changed), originally approved via /sw-plan.
  • Unit-spec approval for 04-adapter-and-support-surface-cutover: STALE (artifact-set-changed), originally approved via /sw-build.
  • Shipping freshness: fresh against origin/main at ship time.

What Changed

  • Claude Code, Codex, and Opencode now all render the same branch, closeout, approval, warning, and next-command summary through the shared renderWorkInProgressSummary / loadOperatorSurfaceSummary path.
  • sw-status, sw-doctor, sw-init, and sw-guard now use the same operator vocabulary as the adapters for runtime roots, adoption, and next-step guidance.
  • Codex and Opencode command wrappers plus AGENTS.md / CLAUDE.md now teach the same .specwright-local/, git-admin, /sw-adopt, and /sw-status story as the runtime surfaces.
  • The source and packaged regressions now fail closed if built output drifts from the source guidance or the shared operator-surface wiring.
  • This branch also captures the current tracked audit and research artifacts requested during this workstream; they did not change the unit's gate verdicts.

Why The Agent Implemented It This Way

  • Task 1 routed all interactive adapters through one shared operator summary so branch posture, approval posture, closeout fallback, and next command come from one contract instead of per-adapter reconstruction.
  • Task 2 aligned support-skill and command-wrapper wording because a shared runtime surface is only helpful if the surrounding docs stop teaching a conflicting layout or takeover model.
  • Task 3 extended the shell and packaged-build regressions so the published Claude surface and support docs fail closed when the source and build outputs drift.
  • One deliberate deviation was recorded: tests/test-support-surface-cutover-docs.sh was added to the proof set because it is the canonical shell regression that the Claude build harness already executes for support-surface coverage.

Acceptance Criteria

  • AC-1: PASS
    Evidence: adapters/shared/specwright-operator-surface.mjs, adapters/claude-code/hooks/session-start.mjs, adapters/codex/hooks/session-start.mjs, and adapters/opencode/plugin.ts now share the operator-surface path; covered by evals/tests/test_operator_surface_visibility.py.
  • AC-2: PASS
    Evidence: core/skills/sw-status/SKILL.md, core/skills/sw-doctor/SKILL.md, core/skills/sw-init/SKILL.md, and core/skills/sw-guard/SKILL.md now carry the same runtime-root and adoption vocabulary; covered by evals/tests/test_support_surface_status_card.py.
  • AC-3: PASS
    Evidence: AGENTS.md, CLAUDE.md, adapters/claude-code/CLAUDE.md, and the Codex/Opencode command docs now align on .specwright-local/, git-admin, /sw-adopt, and /sw-status; covered by evals/tests/test_support_surface_status_card.py.
  • AC-4: PASS
    Evidence: the shared operator surface and adapter hooks degrade gracefully when closeout or migration state is missing; covered by evals/tests/test_operator_surface_visibility.py.
  • AC-5: PASS
    Evidence: tests/test-support-surface-cutover-docs.sh, tests/test-claude-code-build.sh, and evals/tests/test_support_surface_status_card.py fail closed on source/build drift and coverage-marker regression.

Spec Conformance

  • All five unit acceptance criteria passed in the latest gate-spec run.
  • The unit remains scoped to integration-tier behavior; deliverable-level IC-B verification is deferred because this is Unit 04 of 5.

Gate Summary

Gate Verdict Notes
build PASS bash build/build.sh passed; branch freshness was fresh at verify and ship time
tests PASS python -m pytest evals/tests/ -v && bash tests/test-claude-code-build.sh passed
security PASS No secret, injection, or auth regressions found in the changed files
wiring WARN renderOperationalWarningLines is exported but has no external consumer
semantic PASS No unchecked-error, fail-open, or error-data-leakage findings in the changed executable files
spec PASS All five acceptance criteria mapped to implementation and passing regression evidence

Remaining Attention

  • Refresh the design approval lineage if you want the audit trail to match the current design artifact set.
  • Refresh the unit-spec approval lineage for 04-adapter-and-support-surface-cutover if you want the audit trail to match the current unit artifact set.
  • gate-wiring found one advisory issue: renderOperationalWarningLines is exported from adapters/shared/specwright-operator-surface.mjs but has no external consumer. If it does not need to be part of the public helper surface, make it local to the module.
  • No code-quality gate produced a BLOCK finding.

Evidence Links

  • Specwright's detailed review-packet.md and gate evidence for this unit are clone-local, so this PR body inlines the reviewer-usable summary instead of linking to local-only .git/specwright/... artifacts.
  • Reviewer-verifiable tracked proof in this branch:
    • adapters/shared/specwright-operator-surface.mjs
    • adapters/claude-code/hooks/session-start.mjs
    • adapters/codex/hooks/session-start.mjs
    • adapters/opencode/plugin.ts
    • core/skills/sw-status/SKILL.md
    • core/skills/sw-doctor/SKILL.md
    • core/skills/sw-init/SKILL.md
    • core/skills/sw-guard/SKILL.md
    • AGENTS.md
    • CLAUDE.md
    • adapters/claude-code/CLAUDE.md
    • evals/tests/test_operator_surface_visibility.py
    • evals/tests/test_support_surface_status_card.py
    • tests/test-support-surface-cutover-docs.sh
    • tests/test-claude-code-build.sh

@github-actions
Copy link
Copy Markdown
Contributor

Eval Smoke Results

❌ Smoke evals regressed — 2 regressions, 4 improvements

Eval Pass Rate Duration Tokens (input+output) Verdict
grader-function-tests 1.00 (+0.00) 512ms (-982ms) 0 (+0) improved
structural-handoff-template 1.00 (+0.00) 26ms (-17ms) 0 (+0) improved
structural-skill-validation 0.00 (-1.00) 4306ms (+1748ms) 0 (+0) regression
structural-state-enforcement 1.00 (+0.00) 67ms (-24ms) 0 (+0) improved
workflow-yaml-validation 1.00 (+0.00) 1092ms (-208ms) 0 (+0) improved

Posted by Specwright eval-smoke workflow. This comment is updated on each push.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 22, 2026

Code Review — PR #205: feat(support-surface): cut over adapters and guidance

Summary

This PR does what it sets out to do well: all three adapter session-start hooks now share a single renderWorkInProgressSummary path through loadOperatorSurfaceSummary, documentation and skill files now carry consistent runtime-root vocabulary, and the regression suite fails closed on source/build drift. The overall direction is sound.

Four issues worth addressing before merge, roughly ordered by impact:


[Bug] Codex adapter outer catch silently discards all errors

adapters/codex/hooks/session-start.mjs line 58:

} catch {
  process.exit(0);
}

The Claude Code hook (session-start.mjs) logs to stderr before exiting. The Codex hook exits zero with no output, making any startup failure invisible. Given that this hook reads runtime state, a silent failure here means a broken install looks identical to a clean one. Should at minimum write to stderr like the CC hook does:

} catch (err) {
  process.stderr.write(`Specwright: Failed to read state: ${err.message}\n`);
}

[Gap] Codex continuation handler missing ## Correction Summary extraction

The Claude Code hook (lines 45–48) extracts the ## Correction Summary block from continuation.md and appends it as a separate --- Quality Corrections --- block. The Codex hook just includes the raw snapshot. This means quality corrections from a prior compaction are surfaced to Claude but not to Codex. If this is intentional it should be documented; otherwise the Codex handler should match the CC handler.


[Advisory] renderOperationalWarningLines is exported but has no external consumer

This was flagged by the gate-wiring WARN in the PR body itself. renderOperationalWarningLines is exported from specwright-operator-surface.mjs but is only called internally by renderWorkInProgressSummary. Either make it non-exported (private to the module) or add a comment explaining it's an intentional extension point. Leaving it exported creates a confusing public API surface.


[Test gap] renderOperationalWarningLines path has no regression test

The new test test_primary_operator_surfaces_show_branch_approval_and_next_without_closeout covers branch/approval/next from renderOperatorSurfaceLines. But the renderOperationalWarningLines path (lock warnings, ownership conflict, shipping status) in renderWorkInProgressSummary has zero test coverage. A test with work.lock, ownerConflict, or work.status === 'shipping' would close this gap.

The undocumented .slice(0, 2) cap in formatWarningLines also has no test — if a third warning exists it's silently dropped, with no indication to the caller or user.


Minor observations

  • Warning filter is implicit: formatWarningLines silently suppresses approval-*, missing-closeout, and branch-mismatch codes because those have dedicated rendering sections. This is correct but undocumented — a single inline comment explaining the suppression rationale would help future maintainers.
  • formatBranchStatusLine unavailable fallback untested: The four-branch logic in formatBranchStatusLine is good, but only the happy path (expected === observed) is covered by the new test.
  • Duplicate process.exit(0): Both Codex and CC hooks have a bare process.exit(0) at line 62 after the outer catch. With process.exit(0) already inside the catch, the trailing call is unreachable in the error path and redundant on the success path. Harmless but worth cleaning up.
  • AUDIT.md — F14 status: Prior audit BLOCKER F14 (hook handler unit tests) is listed as "stale" rather than resolved. The new Codex test (_run_codex_session_start) is an integration test, not a unit test. The hook handler unit test gap (path traversal, agent-type routing, code/non-code filtering) from F14 remains open; this PR doesn't close it.

Verdict

The shared-surface cutover itself is clean and well-tested at the integration level. The three issues above (silent Codex error swallowing, missing Correction Summary parity, and the untested warning paths) are the only things I'd push back on before merging.

Comment on lines 30 to 48
@@ -50,23 +47,12 @@ try {
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Claude Code hook parses ## Correction Summary out of the continuation snapshot and appends it as a separate --- Quality Corrections --- block so the model gets targeted guidance on what to avoid. This hook just includes the raw snapshot. If the omission is intentional (Codex doesn't need this guidance), a comment here would help. If it's an oversight, the extraction logic from the CC hook should be replicated.

return lines;
}

export function renderOperationalWarningLines(options = {}) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renderOperationalWarningLines is exported but has no external consumer — it's only called internally by renderWorkInProgressSummary on line 166. The gate-wiring advisory in the PR body already flags this. Either drop the export keyword or add a comment marking it as an intentional extension point, otherwise the public API surface is misleadingly wide.

Comment on lines +64 to +72
return warnings
.filter((warning) => {
const code = normalizeString(warning?.code) ?? '';
if (code.startsWith('approval-')) {
return false;
}

return !['missing-closeout', 'branch-mismatch'].includes(code);
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter suppresses approval-*, missing-closeout, and branch-mismatch codes because those have dedicated rendering sections (formatApprovalStatusLine, formatCloseoutLines, formatBranchStatusLine). This is the right call, but without a comment here it reads as arbitrary exclusion. A future maintainer adding a new code to specwright-status-card.mjs won't know these codes are silently dropped.

Worth adding a one-liner:

Suggested change
return warnings
.filter((warning) => {
const code = normalizeString(warning?.code) ?? '';
if (code.startsWith('approval-')) {
return false;
}
return !['missing-closeout', 'branch-mismatch'].includes(code);
})
.filter((warning) => {
// Approval, closeout, and branch warnings have dedicated rendering sections above;
// suppress them here to avoid duplication.
const code = normalizeString(warning?.code) ?? '';
if (code.startsWith('approval-')) {
return false;
}
return !['missing-closeout', 'branch-mismatch'].includes(code);
})

})
.map((warning) => normalizeString(warning?.summary))
.filter(Boolean)
.slice(0, 2)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.slice(0, 2) silently drops any third-or-later warning with no indication to the caller or the user. There's no test for this boundary and no comment explaining why 2 was chosen. If the intent is to keep the session-start surface terse, the cap is reasonable — but it should be a named constant and tested (e.g. assert that a 3-warning fixture only emits 2 WARNING: lines).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d2cd092f90

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

const continuationPath = stateInfo.continuationPath;
const work = normalizeActiveWork(stateInfo);
const ownerConflict = findSelectedWorkOwnerConflict(stateInfo);
const operatorSummary = loadOperatorSurfaceSummary(stateInfo, work);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip operator summary load for inactive work

Move this summary construction below the shipped/abandoned early-return check. As written, loadOperatorSurfaceSummary runs even when the hook should exit with no output, which still triggers status-card reads (approvals.md, closeout files). If those files are malformed, summarizeApproval logs an error to stderr, so Codex surfaces startup warnings for work that is already inactive.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

The useful operator-surface/support-surface work from this branch has been revived on top of current main in #208. Leaving this PR open for traceability for now, but #208 is the clean reviewable replacement and preserves the user-facing improvements without the current branch conflict noise.

@MacAttak MacAttak closed this Apr 22, 2026
@MacAttak MacAttak deleted the work/04-adapter-and-support-surface-cutover branch April 22, 2026 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant