…logy

Closes seven canonical-methodology gaps surfaced by cross-referencing Anthropic's "Harness Design for Long-Running Agentic Applications" (Mar 2026) against the existing repo. All seven entries land in [Unreleased]; no version bump.

Critical additions (P1 / P2 / P3):

- §12 Context anxiety in docs/ai-operating-contract.md: names the intra-session premature-task-shrink failure, parallel in structure to §11 verbal-completion illusion. Distinguished from §4 (cross-session fact loss) and §3 (evidence-before-completion). Explicit "prefer Context Reset over Compaction" remedy. Glossary entry + cross-refs in resumption-protocol.md §Step 2b (outgoing-session symmetric rule) and ai-project-memory.md §Pre-compression protection list (rescue list is not the trigger condition).
- §Acceptance criteria as a Sprint Contract in docs/multi-agent-handoff.md: Planner-side time-axis discipline that catches unverifiable AC at write-time, not at Implementer's egress self-check. Two named rules: Reviewer-anticipation (Planner imagines Reviewer's audit) and Reverse-shape (AC text reverse-shapes Implementer choices). Three pre-handoff self-check questions. Catches the Anthropic-style sprint-contract concern without breaking role-separation.
- docs/harness-evolution-discipline.md (new file): per-material-model-release re-evaluation of canonical methodology components whose load-bearing-ness depends on a specific model-class failure mode. Sibling to anti-entropy-discipline.md (which excludes canonical methodology by design). Four-step procedure (map / re-test / classify / record). Companion mode-decision-tree.md row makes sweep-backed canonical retirement Lean-eligible (symmetric to existing project-local row), giving the methodology its first canonical-component weight-shedding path. Registered in docs/README.md Tier-3 + docs/file-role-map.md.
Enrichments (1d / 1e / B2 / 1f):

- §Single-agent anti-collusion rule §Why this rule exists: preamble naming the underlying behavioural failure (self-evaluation bias) the structural rule enforces against. Explicit "this is not prompt-engineerable away" with the Tool-permission matrix's no-write row as the load-bearing form.
- §Capability gating by risk level: "Risk is one axis; capability frontier is another" callout. Risk-axis is the encoded mechanical enforcement boundary; capability-frontier is the Planner-judgement signal alongside it. Matrix's gating column is a floor, not a ceiling.
- §Boundary with non-mechanical evaluation in docs/mechanical-enforcement-discipline.md: first canonical three-evaluator comparison (Mechanical / Application-driven / Agentic Reviewer audit) with layering (floor / bridge / ceiling) and a Planner-side allocation rule at Phase 3 (each AC to the cheapest evaluator that catches its failure shape). New anti-pattern: routing by familiarity rather than by failure shape. Companion edit: multi-agent-handoff.md §Reviewer §Must not do gains a "spend audit attention on what a mechanical check should have caught" row.

Discipline:

- No vendor / model names in normative content (model-agnostic).
- No new schema fields, no new manifest enums, no new role definitions.
- All seven entries cite existing fields and rules; new normative content stays inside the file-role-map.md SoT discipline.
- Local CI validators all pass (internal-links, legacy-terms, summary-drift, role-consistency, schema-syntax, changelog-drift, template-conformance, cluster-disjointness, schema-drift).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
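The Planner-side allocation rule described in the commit message (each AC goes to the cheapest evaluator that catches its failure shape) can be sketched roughly as below. This is a minimal illustration, not the repo's implementation: the three evaluator names come from the text, while the `Evaluator` type, the cost ranks, and the failure-shape labels are hypothetical and chosen only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluator:
    name: str
    cost: int                # relative cost rank; lower is cheaper (assumed ordering)
    catches: frozenset[str]  # failure shapes this evaluator can detect (illustrative labels)


# Hypothetical catalogue. Only the evaluator names are taken from the text;
# the costs and failure-shape labels are assumptions for this sketch.
EVALUATORS = (
    Evaluator("mechanical", 1, frozenset({"schema-violation", "broken-link"})),
    Evaluator("application-driven", 2, frozenset({"schema-violation", "runtime-regression"})),
    Evaluator("agentic-reviewer-audit", 3, frozenset({"runtime-regression", "intent-mismatch"})),
)


def allocate(failure_shape: str) -> str:
    """Return the name of the cheapest evaluator that catches the given failure shape."""
    candidates = [e for e in EVALUATORS if failure_shape in e.catches]
    if not candidates:
        raise ValueError(f"no evaluator catches {failure_shape!r}")
    return min(candidates, key=lambda e: e.cost).name
```

Routing on `failure_shape` rather than on which evaluator the Planner reaches for first is the point of the sketch: it is the opposite of the "routing by familiarity" anti-pattern named above.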
Summary
Closes seven canonical-methodology gaps surfaced by cross-referencing Anthropic's "Harness Design for Long-Running Agentic Applications" (Mar 2026) and the existing OpenAI "Harness engineering: leveraging Codex in an agent-first world" (Feb 2026) against this repo.

All seven entries land in [Unreleased]; no version bump. +331 / −1 across 13 files (1 new file).

What changed (seven gaps closed)
Critical (P1 / P2 / P3)

- docs/ai-operating-contract.md §12: names context anxiety, the intra-session premature-task-shrink failure. Cross-refs in resumption-protocol.md §Step 2b (outgoing-session symmetric rule) and ai-project-memory.md §Pre-compression protection list.
- docs/multi-agent-handoff.md §Acceptance criteria as a Sprint Contract: Planner-side discipline that catches unverifiable AC at write-time rather than at the Implementer's egress self-check.
- docs/harness-evolution-discipline.md (new file) + mode-decision-tree.md row + index registrations: per-material-model-release re-evaluation of canonical methodology components. Sibling to anti-entropy-discipline.md (which excludes canonical methodology by design). Four-step procedure (map / re-test / classify / record). Companion mode-decision-tree.md row makes sweep-backed canonical retirement Lean-eligible, the first canonical-component weight-shedding path.

Enrichments (1d / 1e / B2 / 1f)
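- multi-agent-handoff.md §Single-agent anti-collusion rule §Why this rule exists: preamble naming self-evaluation bias as the behavioural failure the structural rule enforces against (… agent-persona-discipline.md's observation that medium reverse-shapes persona).
- multi-agent-handoff.md §Capability gating by risk level: "Risk is one axis; capability frontier is another" callout; the matrix's gating column is a floor, not a ceiling.
- mechanical-enforcement-discipline.md §Boundary with non-mechanical evaluation: three-evaluator comparison (Mechanical / Application-driven / Agentic Reviewer audit) with a Planner-side allocation rule. multi-agent-handoff.md §Reviewer §Must not do gains a row pointing back.

Discipline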
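- No vendor / model names in normative content (model-agnostic; CLAUDE.md §2).
- All seven entries cite existing fields and rules; new normative content stays inside the file-role-map.md SoT discipline. New file registered in docs/README.md Tier-3 and docs/file-role-map.md.
- … CLAUDE.md §Mode implication (canonical methodology edits at L1+).

Test plan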
Local CI validators (mirrors of .github/workflows/validate.yml jobs):

- internal-links: no broken relative links across 189 markdown files
- legacy-terms: no legacy terms outside allow-list
- summary-drift: 53 docs, no TL;DR-vs-body drift
- role-consistency: role contract consistent across SoT + 9 mirrors (8 invariants)
- schema-syntax: 4 schemas valid
- changelog-drift: CHANGELOG.json ↔ CHANGELOG.md in sync
- template-conformance: 7 manifest examples valid
- cluster-disjointness: self-test passes
- schema-drift: generated .json matches .yaml
- CHANGELOG.json: valid JSON syntax

Out of scope (deferred)
Two lower-priority items from the original cross-reference were deliberately not addressed:
- … (… multi-agent-handoff.md §Task Prompt structure); persisting it would split the SoT against the Manifest-as-state-snapshot discipline. Already a "design choice, not gap" in the original analysis.

🤖 Generated with Claude Code
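For readers unfamiliar with what an internal-links check like the one in the test plan verifies, a rough sketch follows. It is not the repo's validator: the function name `broken_relative_links`, the regex, and the decision to skip external URLs and in-page anchors are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches inline markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs for relative links whose target file is missing."""
    broken = []
    for md in sorted(root.rglob("*.md")):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            # Skip external URLs and pure in-page anchors (assumed out of scope here).
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop any #fragment and resolve relative to the linking file's directory.
            path_part = target.split("#", 1)[0]
            if not (md.parent / path_part).exists():
                broken.append((md, target))
    return broken
```

An empty return value corresponds to the "no broken relative links" result reported in the test plan.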