Merged
24 changes: 24 additions & 0 deletions CHANGELOG.json

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions CHANGELOG.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/README.md
@@ -75,6 +75,7 @@ These activate when the change touches a specific dimension. Skip the ones that
| [`playtest-discipline.md`](playtest-discipline.md) | Game / interactive / experience-driven changes need playtest evidence. |
| [`post-delivery-observation.md`](post-delivery-observation.md) | Phase 8 observation: production findings, metrics, continuous-evidence. |
| [`anti-entropy-discipline.md`](anti-entropy-discipline.md) | Time-driven garbage-collection sweeps for accumulated drift (stale CCKNs, expired deprecations, doc-reference rot). Complements the edit-boundary mechanical enforcement and the delivery-event Phase 8 observation. |
| [`harness-evolution-discipline.md`](harness-evolution-discipline.md) | Per-material-model-release re-evaluation of canonical methodology components whose load-bearing-ness depends on a specific model-class failure mode (§11 verbal-completion illusion, §12 Context anxiety, §Pre-handoff self-check, §Anti-rationalization rules, Compaction algorithm, etc.). Three outcomes per component: re-justify / retire-eligible / new-failure-surfaced. Sibling to `anti-entropy-discipline.md` (which excludes canonical methodology); together they let the methodology shed weight as well as accumulate it. |
| [`adoption-strategy.md`](adoption-strategy.md) | Rolling out the methodology to a team that isn't using it yet. |
| [`adoption-anti-metrics.md`](adoption-anti-metrics.md) | How to recognize fake adoption (compliance theatre). |
| [`ci-cd-integration-hooks.md`](ci-cd-integration-hooks.md) | Wiring methodology gates into CI/CD pipelines. |
62 changes: 62 additions & 0 deletions docs/ai-operating-contract.md
@@ -365,6 +365,68 @@ At each of these, couple the narration (if any) with the tool call in the same t

---

## 12. Context anxiety — premature task-shrink under perceived context pressure

### Why this rule exists

A long-running session can drift into a failure shape distinct from §4 (context hygiene) and §11 (verbal-completion illusion): the model **perceives** that it is approaching its context window limit and begins wrapping up the change early — truncating remaining tasks, skipping planned evidence collection, summarising instead of executing, declaring "done" when the work is structurally incomplete. The perception may or may not be accurate; sometimes the session has plenty of context left. From the user's side this looks like the agent gave up halfway through a planned task without raising an escalation. The agent did not crash, did not refuse, did not run out of permission — it **prematurely closed the work** because it expected to run out of room.

This is **context anxiety**: premature task closure under unverified pressure perception. Naming it separately matters because the remedy differs from the adjacent failures — §4 is about facts not written to files; §11 is about a single tool call missed at an action transition; §12 is about the tail of a multi-task plan collapsing into a wrap-up narrative. Misdiagnosing one as another routes the agent into the wrong corrective.

### Symptoms

- The plan had **N** tasks; only the first **M** ≪ **N** were attempted, with the remainder narrated as "follow-ups" without a follow-up artifact.
- Output tone shifts from execution mode ("running tests…") to wrap-up mode ("the change is complete; remaining items can be addressed later") without an explicit escalation marker.
- Evidence rows that were planned to be collected disappear from `evidence_plan.artifacts` rather than acquiring `status: collected`.
- A multi-phase change collapses Phase 4 / 5 / 6 narration into one self-declared "delivered" claim.
- Recovery via Discovery Loop or §5 active escalation is **bypassed in favour of self-truncation**.
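
One of the symptoms above, evidence rows vanishing from `evidence_plan.artifacts` instead of reaching `status: collected`, can be checked mechanically by diffing the planned manifest against the final one. The sketch below is illustrative only: it assumes each artifact row is a dict with a hypothetical `id` key next to the `status` field, and the sample artifact names are invented; the real Change Manifest schema may differ.

```python
def dropped_evidence(planned: dict, final: dict) -> dict:
    """Compare two manifest snapshots and report evidence rows that were
    silently dropped or never collected.

    Assumption: artifacts are dicts with a hypothetical "id" key and a
    "status" key; the real manifest schema may use different names.
    """
    planned_ids = [a["id"] for a in planned["evidence_plan"]["artifacts"]]
    final_rows = {a["id"]: a for a in final["evidence_plan"]["artifacts"]}

    # Rows that were planned but no longer appear at all.
    missing = [i for i in planned_ids if i not in final_rows]
    # Rows still present but never marked as collected.
    uncollected = [
        i for i in planned_ids
        if i in final_rows and final_rows[i].get("status") != "collected"
    ]
    return {"missing": missing, "uncollected": uncollected}


# Invented sample data for illustration.
planned = {"evidence_plan": {"artifacts": [
    {"id": "unit-tests", "status": "planned"},
    {"id": "perf-trace", "status": "planned"},
    {"id": "rollback-drill", "status": "planned"},
]}}
final = {"evidence_plan": {"artifacts": [
    {"id": "unit-tests", "status": "collected"},
    {"id": "rollback-drill", "status": "planned"},
]}}

report = dropped_evidence(planned, final)
print(report)
```

A non-empty `missing` list corresponds to the "rows disappear" symptom; a non-empty `uncollected` list means the rows survived but the evidence was never gathered.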

### Must do

- When you sense context pressure, **verify before truncating**: estimate the tokens still available against what the remaining task list needs (the `resumption-protocol.md §Step 2b` 30% rule applies symmetrically to outgoing sessions). If the pressure is real, route through §5 escalation or produce an outgoing handoff prompt per `skills/engineering-workflow/templates/handoff-prompt-template.md`; if it is only perceived, continue executing.
- If the remaining work will not fit in the remaining context, **declare the truncation explicitly** as a §5 escalation (or through a methodology-specific path such as Discovery Loop or Phase Re-entry, when one applies). The truncation must be a named act, not a silent omission.
- Prefer **Context Reset over Compaction** when a long-running session is approaching its window limit: a fresh session reading a Manifest-backed handoff carries the work forward without the perception that triggered the anxiety; in-place compaction shrinks the conversation but preserves the same session's pressure perception, so the anxiety re-fires on the next stretch of work.
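
The "verify before truncating" step can be sketched as a small decision function that compares an estimate of the remaining work against the remaining window. Everything here is a hypothetical illustration: the `Task` type, the per-task token estimates, and the `reserve_fraction` margin are assumptions, and the margin is not the Step 2b rule itself.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue executing"
    ESCALATE = "declare truncation via escalation or handoff"


@dataclass
class Task:
    name: str
    est_tokens: int  # rough, hypothetical per-task estimate


def verify_pressure(remaining_tokens: int, tasks: list[Task],
                    reserve_fraction: float = 0.1) -> Action:
    """Decide from an estimate rather than from a feeling of pressure.

    reserve_fraction is an illustrative safety margin, not a value
    taken from the methodology.
    """
    needed = sum(t.est_tokens for t in tasks)
    usable = remaining_tokens * (1 - reserve_fraction)
    # Pressure is only "real" when the estimated work exceeds usable room.
    return Action.CONTINUE if needed <= usable else Action.ESCALATE


# Invented plan for illustration: 90,000 tokens of estimated work.
plan = [
    Task("run integration tests", 30_000),
    Task("collect perf evidence", 40_000),
    Task("update manifest", 20_000),
]
roomy = verify_pressure(120_000, plan)
tight = verify_pressure(80_000, plan)
print(roomy, tight)
```

With 120,000 tokens left the plan fits and the result is `Action.CONTINUE`; with 80,000 it does not, and the only acceptable outcome is a named escalation or handoff, never a silent wrap-up.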

### Must not do

- Reduce remaining task scope **without naming the change** in `scope_deltas` or §5 escalation.
- Substitute "the rest is straightforward, leaving as a follow-up" for actual execution when the plan committed to executing — a real follow-up needs an artifact (a new manifest, a tracked task, a `next_action` pointer), not a narrative gesture.
- Treat the conversation's tail as an external gate ("the user will run out of patience, so I'll wrap up here"). Conversation length is not a phase boundary.
- Use **compaction inside an anxious session as the remedy** — compaction summarises the past but does not lift the perception that caused the anxiety; the same session continues with the same perceived pressure and re-fires the anxiety on the next stretch. The remedy that addresses the cause is Context Reset.
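
The rule that a real follow-up needs an artifact lends itself to a simple check before a session closes. The sketch below assumes a hypothetical `Deferral` record whose `artifact` field holds whatever pointer the project uses (a new manifest path, a tracked task ID, a `next_action` pointer); the sample values are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deferral:
    task: str
    # Hypothetical pointer: manifest path, task ID, or next_action reference.
    artifact: Optional[str] = None


def unnamed_deferrals(deferrals: list[Deferral]) -> list[str]:
    """Return the tasks that were deferred with no follow-up artifact,
    i.e. follow-ups that exist only as narrative."""
    return [
        d.task for d in deferrals
        if not (d.artifact and d.artifact.strip())
    ]


# Invented sample data for illustration.
deferred = [
    Deferral("migrate fixtures", "manifests/followup-fixtures.yaml"),
    Deferral("update docs"),
    Deferral("perf check", "   "),
]
violations = unnamed_deferrals(deferred)
print(violations)
```

Any task returned by `unnamed_deferrals` is a scope reduction that has not been named, so it should either be executed or recorded in `scope_deltas` or a §5 escalation.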

### Distinguishing from adjacent failures

| Failure mode | Where it strikes | Visible cause | Remedy |
|---|---|---|---|
| **§4 Context hygiene failure** | Across sessions; compression discards facts not written to files | A compression event | Write key decisions to files proactively |
| **§11 Verbal-completion illusion** | Action-transition points (after task-list updates, plan approval, phase opens) | None — model chose `end_turn` instead of the tool call | Couple narration with the tool call in the same turn |
| **§12 Context anxiety (this rule)** | Tail of long single-session work; remaining tasks shrink prematurely | Perception of approaching context limit (may or may not be real) | Verify before truncating; if real, declare truncation as escalation; prefer Context Reset over Compaction |
| **§3 Evidence-before-completion failure** | Pre-handoff "I'm done" claim without evidence | Confusion of self-claim with verification | Collect evidence; cite paths; pass §10 self-check |

The four failure modes share a surface symptom (work declared done that is not done), but the **upstream cause and the corrective action differ**. Misdiagnosing context anxiety as verbal-completion illusion (or vice versa) routes the agent into the wrong remedy: §11's couple-narration-with-tool-call rule does not lift §12's perceived pressure, and §12's prefer-Context-Reset rule does not address §11's missing tool call.
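
To make the routing concrete, the table can be held as a lookup keyed by section, so that a diagnosis maps to exactly one remedy. This is only a sketch: the `FAILURE_MODES` name and the `remedy_for` helper are hypothetical, and the strings are copied from the table above.

```python
# Section -> (where it strikes, remedy), copied from the table above.
FAILURE_MODES = {
    "§3": ("Pre-handoff done-claim without evidence",
           "Collect evidence; cite paths; pass §10 self-check"),
    "§4": ("Across sessions; compression discards unwritten facts",
           "Write key decisions to files proactively"),
    "§11": ("Action-transition points",
            "Couple narration with the tool call in the same turn"),
    "§12": ("Tail of long single-session work",
            "Verify before truncating; if real, declare truncation as "
            "escalation; prefer Context Reset over Compaction"),
}


def remedy_for(section: str) -> str:
    """Return the remedy for a diagnosed failure mode.

    Raises KeyError for an unknown section so that a misdiagnosis is
    loud rather than silently mapped to a default remedy.
    """
    return FAILURE_MODES[section][1]


print(remedy_for("§12"))
```

Keeping the remedies distinct per key mirrors the point of the table: the §11 and §12 entries never share a corrective, so the diagnosis has to be right before the remedy is applied.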

### Risk-point inventory

Agents observing themselves should treat the following moments as high-risk for context anxiety:

- The narrative shift from "executing task **M** of **N**" to "the major work is done" while **M** < **N**.
- The tail of an intensive tool-use stretch (many code searches, long file reads) just before transitioning into a wrap-up sentence.
- Any moment when the agent considers "I'll leave this as a follow-up" without naming a follow-up artifact (manifest, task, `next_action`).
- Mid-execution moments when the agent counts remaining tokens or estimates window fill — the act of counting is a risk point because the count's interpretation drives the next action.
- After a long stretch of evidence collection, when the next planned task requires another long stretch — the temptation is to declare the first stretch sufficient.
- The tail of a long-running delegation's D2 progress writes, where the canonical role's instinct may be to synthesise rather than continue iterating.

### Relation to other sections

- §4 (Context hygiene) is the **cross-session** form: facts not written to files are lost. §12 is the **intra-session** form: tasks not executed are lost. The two are independent — a session can fail at §4 (lose facts to compression) while passing §12 (still execute every planned task), and vice versa.
- §11 (Verbal-completion illusion) is the sibling intra-session failure: both are silent task-shrink, but §11 strikes at *action-transition points* (one tool call short) while §12 strikes at *the tail of long work* (many tasks short).
- §5 (Active escalation) is the legitimate exit valve. Context anxiety routes around §5 by declaring done; the corrective is to route through §5 instead.
- `skills/engineering-workflow/references/resumption-protocol.md §Step 2b context-budget rule` is the **incoming-session counterpart**: incoming sessions estimate before reading; §12 is the **outgoing-session symmetric rule** — outgoing sessions estimate before declaring done.
- `skills/engineering-workflow/references/long-running-delegation.md` D1 (checkpoint-bounded) and D2 (artifact-grounded progress) prevent §12 from surfacing in the first place: a delegation that writes progress at every iteration cannot collapse into a final wrap-up because the iterations themselves are the audit trail.
- `docs/ai-project-memory.md §Pre-compression protection list` covers what to *rescue* on imminent compression; §12 covers what *not to do* when the agent merely *anticipates* compression. The protection list is the first-aid kit; §12 is the rule against premature triage.

---

## Rejected patterns

This methodology has positive rules (what to do) and negative rules (what not to do, embedded in §1–§11 above). A small set of patterns recur from adjacent AI-assistance frameworks but are **explicitly rejected** in this contract — they appear plausible at first glance but conflict with one or more of the core rules above. Listing them here makes the rejection auditable and prevents accidental adoption when patterns drift in from other tools.
2 changes: 2 additions & 0 deletions docs/ai-project-memory.md
@@ -231,6 +231,8 @@ When the AI detects an imminent context compression (nearing window limits), res

Protection method: **write to files immediately** (if not already written), or **restate near the tail of the conversation** so it escapes the compression zone.

**This list is the rescue protocol, not the trigger condition.** It tells you *what to save* when compression is imminent; it does not authorise *premature task closure* on the perception that compression is approaching. The latter is `docs/ai-operating-contract.md §12 Context anxiety` — declaring work done early because the agent expects to run out of room is a distinct failure mode from losing facts to actual compression. Run this list when compression is real; route through §5 active escalation when remaining work no longer fits.

---

## Cross-session resumption
1 change: 1 addition & 0 deletions docs/anti-entropy-discipline.md
@@ -161,3 +161,4 @@ Anti-entropy sweeps are themselves changes — they go through Phase 0 → Phase
- [`docs/autonomy-ladder-discipline.md`](autonomy-ladder-discipline.md) §Anti-patterns — defines the rung-claim and rung-skipping anti-patterns; the Rung-claim-evidence sweep above is the time-driven detector for the former (the latter is caught at the change boundary, not on calendar).
- [`skills/engineering-workflow/references/mode-decision-tree.md`](../skills/engineering-workflow/references/mode-decision-tree.md) §Scenarios that force Lean — the asymmetric retirement-cost row that lets sweep-backed retirements drop to Lean mode while additions stay Full per L60. This is what makes the methodology able to *shed* weight rather than only accumulate; the Discipline-provenance sweep above produces the finding that the row consumes.
- [`docs/glossary.md §Provenance drift`](glossary.md) — the term defined for the failure mode the Discipline-provenance sweep targets.
- [`docs/harness-evolution-discipline.md`](harness-evolution-discipline.md) — the **canonical-methodology counterpart**. Anti-entropy targets project-local discipline-provenance drift, calendar-driven, and explicitly excludes canonical methodology components ("origins by definition"). Harness-evolution targets canonical methodology components whose load-bearing-ness depends on a specific model-class failure mode, model-release-driven. Together the two cover both axes — project-local time-decay and canonical-methodology model-capability-decay — that would otherwise let the methodology accumulate ceremony monotonically.
2 changes: 1 addition & 1 deletion docs/file-role-map.md
@@ -21,7 +21,7 @@ Without the map, the same rule drifts across `AGENTS.md`, `CLAUDE.md`, `GEMINI.m
|---|---|---|
| [`AGENTS.md`](../AGENTS.md) | SoT — operating contract (the 10 core rules) | Canonical; all runtimes inherit from here |
| [`multi-agent-handoff.md`](multi-agent-handoff.md) | SoT — role contract (Planner / Implementer / Reviewer definitions, field-ownership matrix, tool-permission matrix, anti-collusion, handoff minima, Task Prompt structure) | Canonical for multi-agent discipline; `agents/`, `.cursor/rules/`, `reference-implementations/roles/` all point back here |
| `docs/*.md` (other) | SoT — topic-specific definitions ([`surfaces.md`](surfaces.md), [`source-of-truth-patterns.md`](source-of-truth-patterns.md), [`breaking-change-framework.md`](breaking-change-framework.md), [`rollback-asymmetry.md`](rollback-asymmetry.md), [`phase-gate-discipline.md`](phase-gate-discipline.md), [`ai-operating-contract.md`](ai-operating-contract.md), [`agent-persona-discipline.md`](agent-persona-discipline.md), [`output-craft-discipline.md`](output-craft-discipline.md), [`glossary.md`](glossary.md), [`phase-command-vocabulary.md`](phase-command-vocabulary.md), [`repo-as-context-discipline.md`](repo-as-context-discipline.md), [`mechanical-enforcement-discipline.md`](mechanical-enforcement-discipline.md), [`tool-design-principles.md`](tool-design-principles.md), [`anti-entropy-discipline.md`](anti-entropy-discipline.md), [`autonomy-ladder-discipline.md`](autonomy-ladder-discipline.md), [`observability-legibility-discipline.md`](observability-legibility-discipline.md), [`throughput-first-merge-philosophy.md`](throughput-first-merge-philosophy.md), …) | Canonical per topic; referenced from the contracts above |
| `docs/*.md` (other) | SoT — topic-specific definitions ([`surfaces.md`](surfaces.md), [`source-of-truth-patterns.md`](source-of-truth-patterns.md), [`breaking-change-framework.md`](breaking-change-framework.md), [`rollback-asymmetry.md`](rollback-asymmetry.md), [`phase-gate-discipline.md`](phase-gate-discipline.md), [`ai-operating-contract.md`](ai-operating-contract.md), [`agent-persona-discipline.md`](agent-persona-discipline.md), [`output-craft-discipline.md`](output-craft-discipline.md), [`glossary.md`](glossary.md), [`phase-command-vocabulary.md`](phase-command-vocabulary.md), [`repo-as-context-discipline.md`](repo-as-context-discipline.md), [`mechanical-enforcement-discipline.md`](mechanical-enforcement-discipline.md), [`tool-design-principles.md`](tool-design-principles.md), [`anti-entropy-discipline.md`](anti-entropy-discipline.md), [`harness-evolution-discipline.md`](harness-evolution-discipline.md), [`autonomy-ladder-discipline.md`](autonomy-ladder-discipline.md), [`observability-legibility-discipline.md`](observability-legibility-discipline.md), [`throughput-first-merge-philosophy.md`](throughput-first-merge-philosophy.md), …) | Canonical per topic; referenced from the contracts above |
| [`skills/engineering-workflow/SKILL.md`](../skills/engineering-workflow/SKILL.md) + `skills/**` | SoT — execution layer (modes, phases, templates, references) | Canonical for workflow execution |
| [`schemas/`](../schemas/) + [`skills/engineering-workflow/templates/manifests/`](../skills/engineering-workflow/templates/manifests/) | SoT — machine-readable Change Manifest contract + worked examples | Canonical structural output |
| [`CLAUDE.md`](../CLAUDE.md) | Thin-bridge — Claude Code entry; points at [`AGENTS.md`](../AGENTS.md) + `skills/` | Onboarding only, no new normative content |
16 changes: 16 additions & 0 deletions docs/glossary.md
@@ -394,6 +394,22 @@ Code has merged but:

From the user's or operator's perspective, the change is not complete.

### Context anxiety

The intra-session failure mode in which a long-running agent **perceives** it is approaching its context window limit and prematurely truncates the remaining work — task scope shrinks, planned evidence is dropped, the session ends with a wrap-up narrative instead of executing the rest of the plan. The perception may or may not be accurate; the failure is the agent acting on the perception **without verifying** and **without declaring the truncation as a scope change or escalation**.

Distinguished from adjacent failures by where it strikes and what causes it:

| Failure mode | Strike point | Cause | Remedy |
|---|---|---|---|
| `False completion` (above) | Pre-handoff "done" claim | Self-claim mistaken for verified result | Collect evidence; pass §10 self-check |
| `Context anxiety` (this entry) | Tail of long single-session work | Perception of approaching context limit | Verify before truncating; declare as escalation if real; prefer Context Reset over Compaction |
| Verbal-completion illusion (`ai-operating-contract.md` §11) | Action-transition points | Model chose `end_turn` instead of the tool call | Couple narration with the tool call in the same turn |

The remedy explicitly **prefers Context Reset (fresh session reading a Manifest-backed handoff) over Compaction (in-place summarisation)** when a long session is approaching its window limit: compaction shrinks the conversation but preserves the same session's pressure perception, so the anxiety re-fires on the next stretch; a fresh session does not carry the perception forward.

Canonical reference: `docs/ai-operating-contract.md §12`. Incoming-session counterpart (estimate before reading): `skills/engineering-workflow/references/resumption-protocol.md §Step 2b`. Pre-compression protection list (what to rescue on imminent compression): `docs/ai-project-memory.md §Pre-compression protection list`.

---

## Evidence and quality vocabulary