diff --git a/.gitignore b/.gitignore index e5bcf34c..6f3fad41 100644 --- a/.gitignore +++ b/.gitignore @@ -13,3 +13,7 @@ temp # Python cache __pycache__/ *.pyc + +.codex +*.json +plan.md diff --git a/README.md b/README.md index bd7f9145..904dc765 100644 --- a/README.md +++ b/README.md @@ -15,6 +15,7 @@ A Claude Code plugin that provides iterative development with independent AI rev - **Iteration over Perfection** -- Instead of expecting perfect output in one shot, Humanize leverages continuous feedback loops where issues are caught early and refined incrementally. - **One Build + One Review** -- Claude implements, Codex independently reviews. No blind spots. - **Ralph Loop with Swarm Mode** -- Iterative refinement continues until all acceptance criteria are met. Optionally parallelize with Agent Teams. +- **Manager-Driven Scenario Matrix** -- Humanize keeps a machine-readable task graph in `.humanize/rlcr//scenario-matrix.json`, lets the top-level manager reconcile task state, projects that state back into the Goal Tracker and checkpoint contract, and can nudge a stuck agent toward a narrower recovery path without replacing the single-mainline rule. - **Begin with the End in Mind** -- Before the loop starts, Humanize verifies that *you* understand the plan you are about to execute. The human must remain the architect. ([Details](docs/usage.md#begin-with-the-end-in-mind)) ## How It Works @@ -25,6 +26,14 @@ A Claude Code plugin that provides iterative development with independent AI rev The loop has two phases: **Implementation** (Claude works, Codex reviews summaries) and **Code Review** (Codex checks code quality with severity markers). Issues feed back into implementation until resolved. +New-format loops also maintain a compatibility-first manager orchestration runtime: + +- `scenario-matrix.json` is the machine-native control plane. 
It stores authoritative task state, dependency edges, task packets, repair-wave clustering, checkpoint/convergence metadata, and oversight signals. +- The top-level manager is the only authoritative scheduler and matrix reconciler. Execution agents implement code, while the manager assigns bounded task packets, ingests feedback, and keeps exactly one current primary objective. +- Review findings first enter a raw backlog, are deduplicated and normalized into grouped issue backlogs, and only become executable tasks when the manager explicitly promotes them. +- `goal-tracker.md` and `round-N-contract.md` remain human-facing compatibility views, but their mutable task sections are now projected from the matrix. +- Oversight interventions such as `nudge`, `reframe`, `split`, or `resequence` only steer the active agent back onto the current task. They do not create multiple mainlines or take over implementation authority. + ## Install ```bash @@ -69,6 +78,20 @@ Requires [codex CLI](https://github.com/openai/codex) for review. See the full [ humanize monitor gemini # Gemini invocations only ``` + The RLCR monitor now shows scenario-matrix readiness, the current mainline projection, the current manager checkpoint, convergence state, repair-wave context, and any active oversight action alongside the existing loop status. + +6. **Render the current scenario matrix as an HTML dashboard**: + ```bash + source /scripts/humanize.sh + humanize matrix # latest local RLCR session + humanize matrix --input tmp.json # explicit matrix/session/state file + humanize matrix --serve # local browser client with refresh + ``` + + `humanize matrix` generates a local HTML snapshot with the current primary objective, supporting window, dependency graph, feedback queues, recent events, and convergence/oversight status. + + `humanize matrix --serve` starts a local HTML client on `http://127.0.0.1:/`. 
Leave that page open and use the in-page `Refresh Snapshot` button instead of reopening freshly generated files. + ## Monitor Dashboard

@@ -78,6 +101,7 @@ Requires [codex CLI](https://github.com/openai/codex) for review. See the full [ ## Documentation - [Usage Guide](docs/usage.md) -- Commands, options, environment variables +- [Scenario Matrix Guide](scenario-matrix.md) -- Manager role, task packets, repair waves, and convergence flow - [Install for Claude Code](docs/install-for-claude.md) -- Full installation instructions - [Install for Codex](docs/install-for-codex.md) -- Codex skill runtime setup - [Install for Kimi](docs/install-for-kimi.md) -- Kimi CLI skill setup diff --git a/docs/usage.md b/docs/usage.md index b7e9738a..f5ac4d92 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -11,6 +11,18 @@ Humanize creates an iterative feedback loop with two phases: The loop continues until all acceptance criteria are met or no issues remain. +New-format RLCR loops also keep a compatibility-first runtime artifact at `.humanize/rlcr//scenario-matrix.json`. The matrix is the machine-readable control plane: it records seeded plan tasks, dependency edges, manager authority, task packets, repair waves, checkpoint/convergence state, review-driven state changes, and bounded oversight interventions when the active agent appears stuck. + +The top-level orchestrating session acts as the **manager**. It is the only authoritative scheduler and matrix reconciler. Execution agents implement code, while the manager reviews progress, ingests findings, and keeps exactly one current `primary objective` plus a bounded supporting window. + +Subagents do not receive the full global loop prompt by default. Instead, Humanize projects a **task packet** from the matrix that includes the current primary objective, local task, direct dependencies, downstream impact, allowed scope, success criteria, stop criteria, and explicit out-of-scope boundaries. This is how the runtime avoids "subagent single-player mode" caused by limited LLM context. 
+ +Review findings first enter a raw backlog, are deduplicated, and are normalized into grouped issue backlogs before the manager decides whether any of them should become executable repair tasks. Low-value or out-of-bound findings can stay deferred in a watchlist instead of automatically joining the frontier. + +`goal-tracker.md` and `round-N-contract.md` remain the human-facing workflow. Humanize projects matrix state back into those files so the active checkpoint still has exactly one current mainline objective even when several supporting tasks are queued behind it. + +Oversight does not replace the executing agent. It only injects bounded corrections such as `nudge`, `reframe`, `split`, `reclassify`, or `resequence` when repeated failures suggest the current method or task framing is unhealthy. + ## Begin with the End in Mind Before the RLCR loop starts any work, Humanize runs a **Plan Understanding Quiz** -- a brief pre-flight check that verifies you genuinely understand the plan you are about to execute. @@ -64,6 +76,12 @@ The quiz is advisory, not a gate. You always have the option to proceed. But tha | `/gen-plan --input --output ` | Generate structured plan from draft | | `/refine-plan --input ` | Refine an annotated plan and generate a QA ledger | | `/ask-codex [question]` | One-shot consultation with Codex | +| `humanize matrix [--input ] [--output ] [--serve]` | Render a local HTML dashboard or run a refreshable local matrix client | + +For scenario-matrix inspection, there are now two modes: + +- `humanize matrix` writes a static HTML snapshot next to the current matrix or session. +- `humanize matrix --serve` starts a localhost HTML client. Keep that browser tab open and use the page's `Refresh Snapshot` button to pull the latest matrix view without reopening generated files. 
## Command Reference @@ -227,6 +245,53 @@ for getting a second opinion, reviewing a design, or asking domain-specific ques Responses are saved to `.humanize/skill//` with `input.md`, `output.md`, and `metadata.md` for reference. +### humanize matrix + +After sourcing `scripts/humanize.sh`, you can render a scenario-matrix snapshot into a local HTML dashboard: + +```bash +source scripts/humanize.sh + +humanize matrix +humanize matrix --input .humanize/rlcr/2026-04-01_20-41-00 +humanize matrix --input tmp.json --output /tmp/matrix-view.html +``` + +Input resolution rules: + +- No `--input`: use the latest local RLCR session under `.humanize/rlcr/` +- Session directory: resolve that session's `scenario-matrix.json` +- `state.md` / `*-state.md`: follow `scenario_matrix_file` from the state file +- `.json` file: render that matrix file directly +- Project directory: resolve the latest local RLCR session under that project + +The generated HTML snapshot includes: + +- current primary objective and supporting window +- task board grouped into primary/supporting/active/done/deferred buckets +- dependency edges between tasks +- checkpoint, convergence, and oversight status +- recent events plus execution/review feedback queues +- per-task detail drill-down without reading raw JSON + +## Monitoring + +Load the helper script and run the RLCR monitor: + +```bash +source scripts/humanize.sh +humanize monitor rlcr +humanize matrix +``` + +The monitor remains compatible with legacy loops, but for new loops it also surfaces: + +- scenario-matrix readiness (`ready`, `legacy`, `missing`, `invalid`, or `not_applicable`) +- the current matrix-derived mainline summary +- the current manager checkpoint and convergence status +- the primary repair-wave or cluster context when one is active +- any active oversight action currently steering the next round + ## Configuration Humanize uses a 4-layer config hierarchy (lowest to highest priority): diff --git a/hooks/lib/loop-common.sh 
b/hooks/lib/loop-common.sh index 2425449b..c8ff2669 100755 --- a/hooks/lib/loop-common.sh +++ b/hooks/lib/loop-common.sh @@ -42,6 +42,8 @@ readonly FIELD_PRIVACY_MODE="privacy_mode" readonly FIELD_MAINLINE_STALL_COUNT="mainline_stall_count" readonly FIELD_LAST_MAINLINE_VERDICT="last_mainline_verdict" readonly FIELD_DRIFT_STATUS="drift_status" +readonly FIELD_SCENARIO_MATRIX_FILE="scenario_matrix_file" +readonly FIELD_SCENARIO_MATRIX_REQUIRED="scenario_matrix_required" readonly MAINLINE_VERDICT_ADVANCED="advanced" readonly MAINLINE_VERDICT_STALLED="stalled" @@ -407,6 +409,8 @@ _parse_state_fields() { STATE_MAINLINE_STALL_COUNT=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_MAINLINE_STALL_COUNT}:" | sed "s/${FIELD_MAINLINE_STALL_COUNT}: *//" | tr -d ' ' || true) STATE_LAST_MAINLINE_VERDICT=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_LAST_MAINLINE_VERDICT}:" | sed "s/${FIELD_LAST_MAINLINE_VERDICT}: *//" | tr -d ' ' || true) STATE_DRIFT_STATUS=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_DRIFT_STATUS}:" | sed "s/${FIELD_DRIFT_STATUS}: *//" | tr -d ' ' || true) + STATE_SCENARIO_MATRIX_FILE=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_SCENARIO_MATRIX_FILE}:" | sed "s/${FIELD_SCENARIO_MATRIX_FILE}: *//; s/^\"//; s/\"\$//" || true) + STATE_SCENARIO_MATRIX_REQUIRED=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_SCENARIO_MATRIX_REQUIRED}:" | sed "s/${FIELD_SCENARIO_MATRIX_REQUIRED}: *//" | tr -d ' ' || true) } # Parse state file frontmatter and set variables (tolerant mode with defaults) @@ -457,6 +461,7 @@ parse_state_file() { STATE_MAINLINE_STALL_COUNT="${STATE_MAINLINE_STALL_COUNT:-0}" STATE_LAST_MAINLINE_VERDICT="${STATE_LAST_MAINLINE_VERDICT:-$MAINLINE_VERDICT_UNKNOWN}" STATE_DRIFT_STATUS="${STATE_DRIFT_STATUS:-$DRIFT_STATUS_NORMAL}" + STATE_SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}" # STATE_REVIEW_STARTED left as-is (empty if missing, to allow schema validation) return 0 @@ -536,6 +541,7 @@ parse_state_file_strict() { 
STATE_MAINLINE_STALL_COUNT="${STATE_MAINLINE_STALL_COUNT:-0}" STATE_LAST_MAINLINE_VERDICT="${STATE_LAST_MAINLINE_VERDICT:-$MAINLINE_VERDICT_UNKNOWN}" STATE_DRIFT_STATUS="${STATE_DRIFT_STATUS:-$DRIFT_STATUS_NORMAL}" + STATE_SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}" return 0 } diff --git a/hooks/lib/scenario-matrix.sh b/hooks/lib/scenario-matrix.sh new file mode 100644 index 00000000..aaf68a65 --- /dev/null +++ b/hooks/lib/scenario-matrix.sh @@ -0,0 +1,4137 @@ +#!/usr/bin/env bash +# +# Shared helpers for scenario matrix runtime state. +# + +[[ -n "${_SCENARIO_MATRIX_LOADED:-}" ]] && return 0 2>/dev/null || true +_SCENARIO_MATRIX_LOADED=1 + +readonly SCENARIO_MATRIX_SCHEMA_VERSION=2 + +scenario_matrix_trim() { + printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' +} + +scenario_matrix_normalize_header() { + printf '%s' "$1" \ + | tr '[:upper:]' '[:lower:]' \ + | tr -d '`' \ + | sed 's/[[:space:]]*(.*$//; s/[[:space:]]\+/ /g; s/^[[:space:]]*//; s/[[:space:]]*$//' +} + +scenario_matrix_row_to_cells() { + local row + row=$(scenario_matrix_trim "$1") + [[ "$row" == \|* ]] && row="${row#|}" + [[ "$row" == *\| ]] && row="${row%|}" + printf '%s' "$row" | jq -rR 'split("|") | map(gsub("^\\s+|\\s+$"; "")) | .[]' +} + +scenario_matrix_read_lines_into_array() { + local target_var="$1" + local input_text="$2" + local line quoted_line + + eval "$target_var=()" + if [[ -z "$input_text" ]]; then + return 0 + fi + + while IFS= read -r line; do + printf -v quoted_line '%q' "$line" + eval "$target_var+=( $quoted_line )" + done <<< "$input_text" +} + +scenario_matrix_csv_to_json_array() { + local raw + raw=$(scenario_matrix_trim "$1") + if [[ -z "$raw" || "$raw" == "-" ]]; then + printf '[]\n' + return 0 + fi + + printf '%s' "$raw" | jq -R 'split(",") | map(gsub("^\\s+|\\s+$"; "") | select(length > 0 and . 
!= "-"))' +} + +scenario_matrix_json_array_from_lines() { + if [[ $# -eq 0 ]]; then + printf '[]\n' + return 0 + fi + + printf '%s\n' "$@" | jq -R . | jq -s . +} + +scenario_matrix_lines_to_json_string() { + if [[ $# -eq 0 ]]; then + printf '""\n' + return 0 + fi + + printf '%s\n' "$@" | jq -Rs 'sub("\\n$"; "")' +} + +scenario_matrix_parse_coverage_ledger_paragraph_json() { + local paragraph="$1" + local normalized + + normalized=$(printf '%s' "$paragraph" | jq -Rs 'gsub("\\r"; "") | gsub("\\n+"; " ") | gsub("[[:space:]]+"; " ") | gsub("^\\s+|\\s+$"; "")') + + jq -cn --argjson paragraph "$normalized" ' + def trim_text: + gsub("^\\s+|\\s+$"; ""); + def clean_surface: + trim_text + | sub("^[*-][[:space:]]*"; "") + | sub("^[0-9]+\\.[[:space:]]*"; "") + | sub("^[Ss]urface:[[:space:]]*"; ""); + def clean_notes: + trim_text + | sub("^[Nn]otes?:[[:space:]]*"; "") + | sub("^[;:.,-]+[[:space:]]*"; ""); + + ($paragraph | trim_text) as $raw + | if $raw == "" then + empty + else + ( + (try ($raw | capture("^surface:[[:space:]]*(?<surface>.+?)[[:space:]]*(?:[;|,])[[:space:]]*status:[[:space:]]*(?<status>covered|partial|unclear)\\b(?<notes>.*)$"; "i")) catch null) + // (try ($raw | capture("^(?<surface>.+?)[[:space:]]*\\|[[:space:]]*(?<status>covered|partial|unclear)[[:space:]]*\\|(?<notes>.*)$"; "i")) catch null) + // (try ($raw | capture("^(?<surface>.+?)[[:space:]]*:[[:space:]]*(?<status>covered|partial|unclear)\\b(?<notes>.*)$"; "i")) catch null) + // (try ($raw | capture("^(?<surface>.+?)[[:space:]]+is[[:space:]]+(?<status>covered|partial|unclear)\\b(?<notes>.*)$"; "i")) catch null) + // (try ($raw | capture("^(?<surface>.+?)[[:space:]]*-[[:space:]]*(?<status>covered|partial|unclear)\\b(?<notes>.*)$"; "i")) catch null) + ) as $match + | if $match == null then + empty + else + { + surface: (($match.surface // "") | clean_surface), + status: (($match.status // "unclear") | ascii_downcase), + notes: (($match.notes // "") | clean_notes) + } + | select(.surface != "") + end + end + ' +} + +scenario_matrix_emit_breakdown_result() { + local result_status="$1" + local tasks_json="$2" + local
warnings_json="$3" + + jq -cn \ + --arg result_status "$result_status" \ + --argjson tasks "$tasks_json" \ + --argjson warnings "$warnings_json" \ + '{ + status: $result_status, + tasks: $tasks, + warnings: $warnings + }' +} + +scenario_matrix_extract_task_breakdown_section() { + local plan_path="$1" + + awk ' + /^##[[:space:]]*Task Breakdown([[:space:]]*$|[[:space:][:punct:]].*)/ { + in_section = 1 + next + } + /^##[[:space:]]+/ { + if (in_section) { + exit + } + } + in_section { + print + } + ' "$plan_path" +} + +scenario_matrix_parse_task_breakdown_json() { + local plan_path="$1" + local section + + section=$(scenario_matrix_extract_task_breakdown_section "$plan_path") + if [[ -z "$section" ]]; then + scenario_matrix_emit_breakdown_result "missing" '[]' '[]' + return 0 + fi + + local -a table_lines=() + while IFS= read -r line; do + [[ "$line" =~ ^[[:space:]]*\| ]] || continue + table_lines+=("$line") + done <<< "$section" + + if [[ ${#table_lines[@]} -lt 2 ]]; then + scenario_matrix_emit_breakdown_result \ + "malformed" \ + '[]' \ + "$(scenario_matrix_json_array_from_lines "Task Breakdown section is present but does not contain a valid markdown table.")" + return 0 + fi + + local header_line="${table_lines[0]}" + local separator_line="${table_lines[1]}" + + if ! 
printf '%s\n' "$separator_line" | grep -qE '^[[:space:]|:-]+$'; then + scenario_matrix_emit_breakdown_result \ + "malformed" \ + '[]' \ + "$(scenario_matrix_json_array_from_lines "Task Breakdown table separator row is malformed.")" + return 0 + fi + + local -a header_cells=() + scenario_matrix_read_lines_into_array header_cells "$(scenario_matrix_row_to_cells "$header_line")" + + local idx_task_id=-1 + local idx_description=-1 + local idx_target_ac=-1 + local idx_tag=-1 + local idx_depends_on=-1 + + local i normalized_header + for i in "${!header_cells[@]}"; do + normalized_header=$(scenario_matrix_normalize_header "${header_cells[$i]}") + case "$normalized_header" in + "task id") + idx_task_id=$i + ;; + "description") + idx_description=$i + ;; + "target ac") + idx_target_ac=$i + ;; + tag*) + idx_tag=$i + ;; + "depends on") + idx_depends_on=$i + ;; + esac + done + + if [[ $idx_task_id -lt 0 || $idx_description -lt 0 || $idx_target_ac -lt 0 || $idx_tag -lt 0 || $idx_depends_on -lt 0 ]]; then + scenario_matrix_emit_breakdown_result \ + "malformed" \ + '[]' \ + "$(scenario_matrix_json_array_from_lines "Task Breakdown header must include Task ID, Description, Target AC, Tag, and Depends On columns.")" + return 0 + fi + + local tasks_json='[]' + local malformed_message="" + local row_number task_id description target_ac_raw tag_raw depends_on_raw tag_lower + local target_ac_json depends_on_json state task_json lane seeded_task_count + seeded_task_count=0 + + for ((i = 2; i < ${#table_lines[@]}; i++)); do + local -a row_cells=() + scenario_matrix_read_lines_into_array row_cells "$(scenario_matrix_row_to_cells "${table_lines[$i]}")" + row_number=$i + + task_id=$(scenario_matrix_trim "${row_cells[$idx_task_id]:-}") + description=$(scenario_matrix_trim "${row_cells[$idx_description]:-}") + target_ac_raw=$(scenario_matrix_trim "${row_cells[$idx_target_ac]:-}") + tag_raw=$(scenario_matrix_trim "${row_cells[$idx_tag]:-}") + depends_on_raw=$(scenario_matrix_trim 
"${row_cells[$idx_depends_on]:-}") + + if [[ -z "$task_id" && -z "$description" && -z "$target_ac_raw" && -z "$tag_raw" && -z "$depends_on_raw" ]]; then + continue + fi + + if [[ -z "$task_id" || ! "$task_id" =~ ^[a-zA-Z0-9._-]+$ ]]; then + malformed_message="Task Breakdown row $row_number has an invalid Task ID." + break + fi + + if [[ -z "$description" ]]; then + malformed_message="Task Breakdown row $row_number is missing a Description." + break + fi + + target_ac_json=$(scenario_matrix_csv_to_json_array "$target_ac_raw") + if [[ "$(printf '%s' "$target_ac_json" | jq 'length')" -eq 0 ]]; then + malformed_message="Task Breakdown row $row_number is missing Target AC entries." + break + fi + + tag_lower=$(printf '%s' "$tag_raw" | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]') + if [[ "$tag_lower" != "coding" && "$tag_lower" != "analyze" ]]; then + malformed_message="Task Breakdown row $row_number has an invalid Tag value: $tag_raw" + break + fi + + depends_on_json=$(scenario_matrix_csv_to_json_array "$depends_on_raw") + state="pending" + if [[ "$(printf '%s' "$depends_on_json" | jq 'length')" -eq 0 ]]; then + state="ready" + fi + lane="supporting" + if [[ "$seeded_task_count" -eq 0 ]]; then + lane="mainline" + fi + + task_json=$(jq -cn \ + --arg id "$task_id" \ + --arg title "$description" \ + --arg lane "$lane" \ + --arg routing "$tag_lower" \ + --arg state "$state" \ + --argjson target_ac "$target_ac_json" \ + --argjson depends_on "$depends_on_json" ' + { + id: $id, + title: $title, + lane: $lane, + routing: $routing, + source: "plan", + kind: "feature", + severity: null, + confidence: null, + finding_status: null, + file_ref: null, + review_phase: null, + owner: null, + scope: { + summary: "", + paths: [], + constraints: [] + }, + cluster_id: null, + repair_wave: null, + risk_bucket: "planned", + admission: { + status: "active", + reason: "seeded_from_plan" + }, + authority: { + write_mode: "manager_only", + authoritative_source: "manager" + }, + target_ac: 
$target_ac, + depends_on: $depends_on, + state: $state, + assumptions: [], + strategy: { + current: "", + attempt_count: 0, + repeated_failure_count: 0, + method_switch_required: false + }, + health: { + stuck_score: 0, + last_progress_round: 0 + }, + metadata: { + seed_source: "task_breakdown" + } + }') + + tasks_json=$(jq -cn --argjson tasks "$tasks_json" --argjson task "$task_json" '$tasks + [$task]') + seeded_task_count=$((seeded_task_count + 1)) + done + + if [[ -n "$malformed_message" ]]; then + scenario_matrix_emit_breakdown_result \ + "malformed" \ + '[]' \ + "$(scenario_matrix_json_array_from_lines "$malformed_message")" + return 0 + fi + + scenario_matrix_emit_breakdown_result "parsed" "$tasks_json" '[]' +} + +scenario_matrix_initialize_file() { + local matrix_file="$1" + local logical_plan_file="$2" + local backup_plan_file="$3" + local mode="$4" + local current_round="${5:-0}" + local status_override="${6:-}" + local resolved_backup_plan_file="" + + if [[ -n "$backup_plan_file" ]]; then + if [[ "$backup_plan_file" == /* ]]; then + resolved_backup_plan_file="$backup_plan_file" + elif [[ -f "$backup_plan_file" ]]; then + resolved_backup_plan_file="$backup_plan_file" + else + local matrix_repo_root="" + matrix_repo_root=$(cd "$(dirname "$matrix_file")/../../.." 
2>/dev/null && pwd) || matrix_repo_root="" + if [[ -n "$matrix_repo_root" && -f "$matrix_repo_root/$backup_plan_file" ]]; then + resolved_backup_plan_file="$matrix_repo_root/$backup_plan_file" + fi + fi + fi + + local breakdown_json + case "$status_override" in + "not_applicable") + breakdown_json='{"status":"not_applicable","tasks":[],"warnings":[]}' + ;; + *) + if [[ -n "$resolved_backup_plan_file" && -f "$resolved_backup_plan_file" ]]; then + breakdown_json=$(scenario_matrix_parse_task_breakdown_json "$resolved_backup_plan_file") + else + breakdown_json='{"status":"missing","tasks":[],"warnings":[]}' + fi + ;; + esac + + local temp_file="${matrix_file}.tmp.$$" + jq -cn \ + --arg logical_plan_file "$logical_plan_file" \ + --arg backup_plan_file "$backup_plan_file" \ + --arg mode "$mode" \ + --arg created_at "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \ + --argjson current_round "$current_round" \ + --argjson breakdown "$breakdown_json" \ + '{ + schema_version: '"$SCENARIO_MATRIX_SCHEMA_VERSION"', + created_at: $created_at, + plan: { + file: $logical_plan_file, + backup_file: $backup_plan_file, + task_breakdown_status: ($breakdown.status // "missing"), + warnings: ($breakdown.warnings // []) + }, + runtime: { + mode: $mode, + current_round: $current_round, + projection_mode: "compatibility", + checkpoint: { + sequence: 0, + current_id: "", + frontier_signature: "", + frontier_changed: false, + primary_task_id: null, + supporting_task_ids: [], + frontier_reason: "uninitialized", + updated_at: null + }, + convergence: { + status: "continue", + next_action: "hold_checkpoint", + guidance: "Reconcile the manager frontier before deriving execution guidance.", + residual_risk_score: 0, + must_fix_open_count: 0, + high_risk_open_count: 0, + active_task_count: 0, + watchlist_count: 0, + recent_high_value_novelty_count: 0, + updated_at: null + } + }, + metadata: { + seed_task_count: (($breakdown.tasks // []) | length), + seed_source: ( + if ($breakdown.status // "missing") == "parsed" 
then + "task_breakdown" + elif ($breakdown.status // "missing") == "not_applicable" then + "not_applicable" + else + "fallback" + end + ) + }, + manager: { + role: "top_level_session", + authority_mode: "manager_reconcile", + authoritative_writer: "manager", + current_primary_task_id: ( + first( + ($breakdown.tasks // [])[].id + ) // null + ), + last_reconciled_at: $created_at + }, + feedback: { + execution: [], + review: [] + }, + raw_findings: [], + finding_groups: [], + tasks: ($breakdown.tasks // []), + events: [], + oversight: { + status: "idle", + last_action: "none", + updated_at: null, + intervention: null, + history: [] + } + }' > "$temp_file" + + mv "$temp_file" "$matrix_file" +} + +scenario_matrix_validate_file() { + local matrix_file="$1" + + jq -e ' + def valid_string_array: + type == "array" and all(.[]; type == "string"); + + def valid_optional_string: + . == null or type == "string"; + + def valid_optional_number: + . == null or type == "number"; + + def valid_scope: + type == "object" + and ((.summary // "") | type) == "string" + and ((.paths // []) | valid_string_array) + and ((.constraints // []) | valid_string_array); + + def valid_checkpoint: + type == "object" + and ((.sequence // 0) | type) == "number" + and ((.current_id // "") | type) == "string" + and ((.frontier_signature // "") | type) == "string" + and ((.frontier_changed // false) | type) == "boolean" + and ((.primary_task_id // null) | valid_optional_string) + and ((.supporting_task_ids // []) | valid_string_array) + and ((.frontier_reason // "") | type) == "string" + and ((.updated_at // null) | valid_optional_string); + + def valid_convergence: + type == "object" + and ((.status // "continue") | type) == "string" + and ((.next_action // "hold_checkpoint") | type) == "string" + and ((.guidance // "") | type) == "string" + and ((.residual_risk_score // 0) | type) == "number" + and ((.must_fix_open_count // 0) | type) == "number" + and ((.high_risk_open_count // 0) | type) == "number" 
+ and ((.active_task_count // 0) | type) == "number" + and ((.watchlist_count // 0) | type) == "number" + and ((.recent_high_value_novelty_count // 0) | type) == "number" + and ((.updated_at // null) | valid_optional_string); + + def valid_review_surface_entry: + type == "object" + and ((.surface // "") | type) == "string" + and ((.reason // "") | type) == "string" + and ((.confidence // null) | valid_optional_string); + + def valid_sibling_risk_entry: + type == "object" + and ((.summary // "") | type) == "string" + and ((.derived_from // null) | valid_optional_string) + and ((.expansion_axis // null) | valid_optional_string) + and ((.why_likely // null) | valid_optional_string) + and ((.recommended_check // null) | valid_optional_string) + and ((.confidence // null) | valid_optional_string); + + def valid_coverage_ledger_entry: + type == "object" + and ((.surface // "") | type) == "string" + and ((.status // "") | type) == "string" + and ((.notes // "") | type) == "string"; + + def valid_review_coverage: + type == "object" + and ((.source_phase // null) | valid_optional_string) + and ((.source_round // 0) | type) == "number" + and ((.updated_at // null) | valid_optional_string) + and ((.touched_failure_surfaces // []) | type) == "array" + and all((.touched_failure_surfaces // [])[]; valid_review_surface_entry) + and ((.likely_sibling_risks // []) | type) == "array" + and all((.likely_sibling_risks // [])[]; valid_sibling_risk_entry) + and ((.coverage_ledger // []) | type) == "array" + and all((.coverage_ledger // [])[]; valid_coverage_ledger_entry) + and ((.raw_sections // {}) | type) == "object" + and (((.raw_sections.touched_failure_surfaces // "") | type) == "string") + and (((.raw_sections.likely_sibling_risks // "") | type) == "string") + and (((.raw_sections.coverage_ledger // "") | type) == "string") + and ((.summary // {}) | type) == "object" + and (((.summary.surface_count // 0) | type) == "number") + and (((.summary.sibling_risk_count // 0) | type) == 
"number") + and (((.summary.covered_count // 0) | type) == "number") + and (((.summary.partial_or_unclear_count // 0) | type) == "number"); + + def valid_admission: + type == "object" + and ((.status // "active") | type) == "string" + and ((.reason // "") | type) == "string"; + + def valid_task_authority: + type == "object" + and ((.write_mode // "manager_only") == "manager_only") + and ((.authoritative_source // "manager") == "manager"); + + def valid_v2_task_fields: + ((.owner // null) | valid_optional_string) + and ((.owner // "") != "manager") + and ((.scope // {}) | valid_scope) + and ((.cluster_id // null) | valid_optional_string) + and ((.repair_wave // null) | valid_optional_string) + and ((.risk_bucket // "normal") | type) == "string" + and ((.admission // {}) | valid_admission) + and ((.authority // {}) | valid_task_authority); + + def valid_task: + type == "object" + and (.id | type) == "string" + and (.title | type) == "string" + and (.lane | type) == "string" + and (.routing | type) == "string" + and ((.source // "plan") | type) == "string" + and ((.kind // "feature") | type) == "string" + and ((.severity // null) | valid_optional_string) + and ((.confidence // null) | valid_optional_number) + and ((.finding_status // null) | valid_optional_string) + and ((.file_ref // null) | valid_optional_string) + and ((.review_phase // null) | valid_optional_string) + and (.target_ac | valid_string_array) + and (.depends_on | valid_string_array) + and (.state | type) == "string" + and (.assumptions | valid_string_array) + and (.strategy | type) == "object" + and ((.strategy.current // "") | type) == "string" + and ((.strategy.attempt_count // 0) | type) == "number" + and ((.strategy.repeated_failure_count // 0) | type) == "number" + and ((.strategy.method_switch_required // false) | type) == "boolean" + and (.health | type) == "object" + and ((.health.stuck_score // 0) | type) == "number" + and ((.health.last_progress_round // 0) | type) == "number" + and 
((.metadata // {}) | type) == "object"; + + def valid_feedback_entry: + type == "object" + and ((.source // "") | type) == "string" + and ((.kind // "") | type) == "string" + and ((.summary // "") | type) == "string" + and ((.suggested_by // "") | type) == "string" + and ((.task_id // null) | valid_optional_string) + and ((.source_file // null) | valid_optional_string) + and ((.created_at // null) | valid_optional_string) + and ((.authoritative // false) == false); + + def valid_raw_finding: + type == "object" + and ((.id // "") | type) == "string" + and ((.title // "") | type) == "string" + and ((.summary // "") | type) == "string" + and ((.severity // null) | valid_optional_string) + and ((.confidence // null) == null or (.confidence | type) == "number") + and ((.source // "") | type) == "string" + and ((.kind // "") | type) == "string" + and ((.review_phase // "") | type) == "string" + and ((.cluster_id // null) | valid_optional_string) + and ((.repair_wave_hint // null) | valid_optional_string) + and ((.admission_status // "") | type) == "string" + and ((.admission_reason // "") | type) == "string" + and ((.lane // "") | type) == "string" + and ((.state // "") | type) == "string" + and ((.routing // "") | type) == "string" + and ((.risk_bucket // "") | type) == "string" + and ((.finding_key // "") | type) == "string" + and ((.related_task_id // null) | valid_optional_string) + and ((.link_task_id // null) | valid_optional_string) + and ((.file_ref // null) | valid_optional_string) + and ((.target_ac // []) | valid_string_array) + and ((.depends_on // []) | valid_string_array) + and ((.surface_key // "") | type) == "string" + and ((.surface_label // "") | type) == "string" + and ((.group_key // "") | type) == "string" + and ((.first_seen_round // 0) | type) == "number" + and ((.last_seen_round // 0) | type) == "number" + and ((.occurrence_count // 0) | type) == "number"; + + def valid_finding_group: + type == "object" + and ((.id // "") | type) == "string" + and 
((.title // "") | type) == "string" + and ((.summary // "") | type) == "string" + and ((.surface_key // "") | type) == "string" + and ((.surface_label // "") | type) == "string" + and ((.state // "") | type) == "string" + and ((.admission_status // "") | type) == "string" + and ((.severity // null) | valid_optional_string) + and ((.risk_bucket // "") | type) == "string" + and ((.target_ac // []) | valid_string_array) + and ((.related_task_ids // []) | valid_string_array) + and ((.file_refs // []) | valid_string_array) + and ((.finding_ids // []) | valid_string_array) + and ((.sample_summaries // []) | valid_string_array) + and ((.finding_count // 0) | type) == "number" + and ((.first_seen_round // 0) | type) == "number" + and ((.last_seen_round // 0) | type) == "number"; + + def valid_manager: + type == "object" + and ((.role // "") | type) == "string" + and ((.authority_mode // "") == "manager_reconcile") + and ((.authoritative_writer // "") == "manager") + and ((.current_primary_task_id // null) | valid_optional_string) + and ((.last_reconciled_at // null) | valid_optional_string); + + def active_mainlines: + [ + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + ]; + + (.schema_version | type) == "number" + and (.schema_version >= 1 and .schema_version <= '"$SCENARIO_MATRIX_SCHEMA_VERSION"') + and (.plan | type) == "object" + and (.plan.task_breakdown_status | type) == "string" + and ((.plan.warnings // []) | type) == "array" + and all((.plan.warnings // [])[]; type == "string") + and (.runtime | type) == "object" + and (.runtime.mode | type) == "string" + and ((.runtime.current_round // 0) | type) == "number" + and ((.runtime.checkpoint // {}) | valid_checkpoint) + and ((.runtime.convergence // {}) | valid_convergence) + and (((.runtime.review_coverage // null) == null) or ((.runtime.review_coverage // {}) | valid_review_coverage)) + and (.tasks | type) == "array" + and all(.tasks[]; valid_task) + and (.events | 
type) == "array" + and all(.events[]; type == "object") + and (.oversight | type) == "object" + and ((.oversight.status // "idle") | type) == "string" + and ((.oversight.last_action // "none") | type) == "string" + and ((.oversight.intervention == null) or (.oversight.intervention | type) == "object") + and ((.oversight.history // []) | type) == "array" + and all((.oversight.history // [])[]; type == "object") + and ( + if .schema_version >= 2 then + (.manager | valid_manager) + and all(.tasks[]; valid_v2_task_fields) + and (.feedback | type) == "object" + and ((.feedback.execution // []) | type) == "array" + and all((.feedback.execution // [])[]; valid_feedback_entry) + and ((.feedback.review // []) | type) == "array" + and all((.feedback.review // [])[]; valid_feedback_entry) + and ((.raw_findings // []) | type) == "array" + and all((.raw_findings // [])[]; valid_raw_finding) + and ((.finding_groups // []) | type) == "array" + and all((.finding_groups // [])[]; valid_finding_group) + and ((active_mainlines | length) <= 1) + and ( + if (active_mainlines | length) == 1 then + (.manager.current_primary_task_id // null) == active_mainlines[0].id + elif (.manager.current_primary_task_id // null) == null then + true + else + true + end + ) + and all(.tasks[]; ((.authority.authoritative_source // "manager") == "manager")) + else + true + end + ) + ' "$matrix_file" >/dev/null 2>&1 +} + +scenario_matrix_dependency_hint_from_review() { + local review_content_lower + review_content_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') + + if printf '%s\n' "$review_content_lower" | grep -qE 'depends on|dependency|dependent task|downstream|upstream|interface chang|contract chang|schema chang|follow-on task'; then + echo "true" + else + echo "false" + fi +} + +scenario_matrix_extract_review_coverage_json() { + local review_content="$1" + local current_section="" + local trimmed heading_candidate normalized_header section_end=false + local surfaces_json='[]' + local 
sibling_json='[]' + local coverage_json='[]' + local -a surface_lines=() + local -a sibling_lines=() + local -a coverage_lines=() + local coverage_paragraph="" + local line content surface reason confidence summary derived_from expansion_axis why_likely recommended_check + local -a parts=() + local cell_lines row_json cells_output coverage_entry_json + + while IFS= read -r line; do + trimmed=$(scenario_matrix_trim "$line") + if [[ -z "$trimmed" ]]; then + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + continue + fi + + if [[ "$trimmed" == "COMPLETE" || "$trimmed" == "CONTINUE" || "$trimmed" == "STOP" ]]; then + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + current_section="" + continue + fi + + heading_candidate=$(printf '%s' "$trimmed" | sed 's/^#\{1,6\}[[:space:]]*//') + normalized_header=$(scenario_matrix_normalize_header "$heading_candidate") + case "$normalized_header" in + "touched failure surfaces") + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + current_section="surfaces" + continue + ;; + "likely sibling risks") 
+ if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + current_section="siblings" + continue + ;; + "coverage ledger") + current_section="coverage" + continue + ;; + "mainline gaps"|"blocking side issues"|"queued side issues"|"goal alignment summary") + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + current_section="" + continue + ;; + esac + + if [[ "$trimmed" =~ ^Mainline[[:space:]]+Progress[[:space:]]+Verdict: ]]; then + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + current_section="" + continue + fi + + case "$current_section" in + surfaces) + surface_lines+=("$trimmed") + if [[ "$trimmed" =~ ^[*-][[:space:]]+ ]]; then + content=$(printf '%s' "$trimmed" | sed 's/^[*-][[:space:]]*//') + surface=$(scenario_matrix_trim "${content%%|*}") + reason="" + confidence="" + if [[ "$content" == *"|"* ]]; then + IFS='|' read -r -a parts <<< "$content" + surface=$(scenario_matrix_trim "${parts[0]:-}") + for part in "${parts[@]:1}"; do + part=$(scenario_matrix_trim "$part") + case "$(printf '%s' "$part" | tr '[:upper:]' '[:lower:]')" in + 
why:*) + reason=$(scenario_matrix_trim "${part#*:}") + ;; + confidence:*) + confidence=$(scenario_matrix_trim "${part#*:}") + ;; + esac + done + fi + if [[ -n "$surface" ]]; then + row_json=$(jq -cn \ + --arg surface "$surface" \ + --arg reason "$reason" \ + --arg confidence "$confidence" \ + '{ + surface: $surface, + reason: $reason, + confidence: (if $confidence == "" then null else $confidence end) + }') + surfaces_json=$(jq -cn --argjson rows "$surfaces_json" --argjson row "$row_json" '$rows + [$row]') + fi + fi + ;; + siblings) + sibling_lines+=("$trimmed") + if [[ "$trimmed" =~ ^[*-][[:space:]]+ ]]; then + content=$(printf '%s' "$trimmed" | sed 's/^[*-][[:space:]]*//') + summary=$(scenario_matrix_trim "${content%%|*}") + derived_from="" + expansion_axis="" + why_likely="" + recommended_check="" + confidence="" + if [[ "$content" == *"|"* ]]; then + IFS='|' read -r -a parts <<< "$content" + summary=$(scenario_matrix_trim "${parts[0]:-}") + for part in "${parts[@]:1}"; do + part=$(scenario_matrix_trim "$part") + case "$(printf '%s' "$part" | tr '[:upper:]' '[:lower:]')" in + derived_from:*|derived-from:*) + derived_from=$(scenario_matrix_trim "${part#*:}") + ;; + axis:*) + expansion_axis=$(scenario_matrix_trim "${part#*:}") + ;; + why:*) + why_likely=$(scenario_matrix_trim "${part#*:}") + ;; + check:*|recommended_check:*|recommended-check:*) + recommended_check=$(scenario_matrix_trim "${part#*:}") + ;; + confidence:*) + confidence=$(scenario_matrix_trim "${part#*:}") + ;; + esac + done + fi + if [[ -n "$summary" ]]; then + row_json=$(jq -cn \ + --arg summary "$summary" \ + --arg derived_from "$derived_from" \ + --arg expansion_axis "$expansion_axis" \ + --arg why_likely "$why_likely" \ + --arg recommended_check "$recommended_check" \ + --arg confidence "$confidence" \ + '{ + summary: $summary, + derived_from: (if $derived_from == "" then null else $derived_from end), + expansion_axis: (if $expansion_axis == "" then null else $expansion_axis end), + why_likely: 
(if $why_likely == "" then null else $why_likely end), + recommended_check: (if $recommended_check == "" then null else $recommended_check end), + confidence: (if $confidence == "" then null else $confidence end) + }') + sibling_json=$(jq -cn --argjson rows "$sibling_json" --argjson row "$row_json" '$rows + [$row]') + fi + fi + ;; + coverage) + coverage_lines+=("$trimmed") + if [[ "$trimmed" == \|* ]]; then + if [[ -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + [$row]') + fi + coverage_paragraph="" + fi + if printf '%s\n' "$trimmed" | grep -qE '^[[:space:]|:-]+$'; then + continue + fi + cells_output=$(scenario_matrix_row_to_cells "$trimmed") + scenario_matrix_read_lines_into_array parts "$cells_output" + surface=$(scenario_matrix_trim "${parts[0]:-}") + if [[ "$(printf '%s' "$surface" | tr '[:upper:]' '[:lower:]')" == "surface" ]]; then + continue + fi + if [[ -n "$surface" ]]; then + row_json=$(jq -cn \ + --arg surface "$surface" \ + --arg status "$(scenario_matrix_trim "${parts[1]:-}")" \ + --arg notes "$(scenario_matrix_trim "${parts[2]:-}")" \ + '{ + surface: $surface, + status: (if $status == "" then "unclear" else $status end), + notes: $notes + }') + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$row_json" '$rows + [$row]') + fi + else + if [[ -n "$coverage_paragraph" ]]; then + coverage_paragraph+=$'\n' + fi + coverage_paragraph+="$trimmed" + fi + ;; + esac + done <<< "$review_content" + + if [[ "$current_section" == "coverage" && -n "$coverage_paragraph" ]]; then + coverage_entry_json=$(scenario_matrix_parse_coverage_ledger_paragraph_json "$coverage_paragraph") + if [[ -n "$coverage_entry_json" ]]; then + coverage_json=$(jq -cn --argjson rows "$coverage_json" --argjson row "$coverage_entry_json" '$rows + 
[$row]') + fi + fi + + jq -cn \ + --argjson touched_failure_surfaces "$surfaces_json" \ + --argjson likely_sibling_risks "$sibling_json" \ + --argjson coverage_ledger "$coverage_json" \ + --arg touched_raw "$(scenario_matrix_lines_to_json_string "${surface_lines[@]}")" \ + --arg sibling_raw "$(scenario_matrix_lines_to_json_string "${sibling_lines[@]}")" \ + --arg coverage_raw "$(scenario_matrix_lines_to_json_string "${coverage_lines[@]}")" ' + { + touched_failure_surfaces: $touched_failure_surfaces, + likely_sibling_risks: $likely_sibling_risks, + coverage_ledger: $coverage_ledger, + raw_sections: { + touched_failure_surfaces: ( + try ($touched_raw | fromjson) catch "" + ), + likely_sibling_risks: ( + try ($sibling_raw | fromjson) catch "" + ), + coverage_ledger: ( + try ($coverage_raw | fromjson) catch "" + ) + }, + summary: { + surface_count: ($touched_failure_surfaces | length), + sibling_risk_count: ($likely_sibling_risks | length), + covered_count: ([ $coverage_ledger[] | select(((.status // "") | ascii_downcase) == "covered") ] | length), + partial_or_unclear_count: ([ $coverage_ledger[] | select(((.status // "") | ascii_downcase) == "partial" or ((.status // "") | ascii_downcase) == "unclear") ] | length) + } + }' +} + +scenario_matrix_record_feedback() { + local matrix_file="$1" + local feedback_channel="$2" + local task_id="${3:-}" + local suggested_by="${4:-}" + local feedback_kind="${5:-}" + local feedback_summary="${6:-}" + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + if [[ "$feedback_channel" != "execution" && "$feedback_channel" != "review" ]]; then + return 1 + fi + + local created_at temp_file + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + temp_file="${matrix_file}.tmp.$$" + + jq \ + --arg feedback_channel "$feedback_channel" \ + --arg task_id "${task_id:-}" \ + --arg suggested_by "${suggested_by:-unknown}" \ + --arg feedback_kind "${feedback_kind:-note}" \ + --arg feedback_summary "$feedback_summary" \ + --arg created_at "$created_at" ' + .feedback = (.feedback // {execution: [], review: []}) + | .feedback[$feedback_channel] = (.feedback[$feedback_channel] // []) + | .feedback[$feedback_channel] += [ + { + source: $feedback_channel, + kind: $feedback_kind, + task_id: (if $task_id == "" then null else $task_id end), + summary: $feedback_summary, + suggested_by: $suggested_by, + source_file: null, + created_at: $created_at, + authoritative: false + } + ] + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" +} + +scenario_matrix_record_execution_feedback() { + scenario_matrix_record_feedback "$1" "execution" "${2:-}" "${3:-}" "${4:-}" "${5:-}" +} + +scenario_matrix_record_review_feedback() { + scenario_matrix_record_feedback "$1" "review" "${2:-}" "${3:-}" "${4:-}" "${5:-}" +} + +scenario_matrix_current_primary_task_id() { + local matrix_file="$1" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -r ' + if (.manager.current_primary_task_id // null) != null then + .manager.current_primary_task_id + else + ( + first( + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + | .id + ) // empty + ) + end + ' "$matrix_file" +} + +scenario_matrix_render_task_packet_markdown() { + local matrix_file="$1" + local task_id="${2:-}" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -r --arg task_id "$task_id" ' + . 
as $root + | def clean_text: + tostring + | gsub("\\|"; "/") + | gsub("\\r?\\n"; " "); + def active_task: + first( + $root.tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + ); + def packet_task: + if $task_id != "" then + first($root.tasks[] | select(.id == $task_id)) + elif ($root.manager.current_primary_task_id // null) != null then + first($root.tasks[] | select(.id == $root.manager.current_primary_task_id)) + else + active_task + end; + def target_task: packet_task; + def downstream_ids($id): + [ + $root.tasks[] + | select((.depends_on // []) | index($id)) + | .id + ]; + def cluster_label($task): + if ($task.cluster_id // null) != null then + $task.cluster_id + elif ($task.repair_wave // null) != null then + $task.repair_wave + else + "none" + end; + def scope_summary($task): + if (($task.scope.summary // "") | length) > 0 then + $task.scope.summary + else + "Stay within the current scenario matrix frontier and do not widen into unrelated queued work." + end; + def scope_paths($task): + if (($task.scope.paths // []) | length) > 0 then + (($task.scope.paths // []) | join(", ")) + else + "unspecified" + end; + def protected_constraints($task): + ( + (($task.scope.constraints // []) + ($task.assumptions // [])) + | map(clean_text) + ) as $items + | if ($items | length) > 0 then + ($items | join("; ")) + else + "Keep dependencies, tracker alignment, and the single-mainline objective intact." + end; + def success_criteria($task): + if ($task.state == "needs_replan") then + "Produce a narrower recovery step for " + ($task.id // "task") + " without broadening scope." + elif $task.state == "blocked" then + "Unblock " + ($task.id // "task") + " by resolving the required upstream dependency without widening scope." + else + "Advance " + ($task.id // "task") + " toward " + (($task.target_ac // []) | join(", ")) + " without widening scope beyond the current frontier." 
+ end; + def stop_criteria($task): + "Stop and report back if progress would require widening scope, changing unrelated tasks, or invalidating an upstream/downstream dependency assumption."; + def out_of_scope($task): + ( + [ + $root.tasks[] + | .id as $other_task_id + | select(.id != ($task.id // "")) + | select(.state != "done" and .state != "deferred") + | select(.lane != "mainline") + | select((($root.runtime.checkpoint.supporting_task_ids // []) | index($other_task_id)) == null) + | select(.state != "blocked" and .state != "needs_replan") + | ($other_task_id + ": " + (.title | clean_text)) + ] + ) as $other_open + | if ($other_open | length) > 0 then + ($other_open | join("; ")) + else + "Any unrelated follow-up outside the assigned task packet." + end; + + if (target_task | type) == "object" then + "## Current Task Packet\n\n" + + "- Primary Objective: `" + + (($root.manager.current_primary_task_id // target_task.id // "unknown") | clean_text) + + "`" + + (if (($root.manager.current_primary_task_id // null) != null and ($root.manager.current_primary_task_id != target_task.id)) then + " (this packet is delegated from the current primary objective)" + else + "" + end) + + "\n- Assigned Task: `" + + ((target_task.id // "unknown") | clean_text) + + "` - " + + ((target_task.title // "Untitled task") | clean_text) + + "\n- Cluster / Repair Wave: `" + + (cluster_label(target_task) | clean_text) + + "`" + + "\n- Direct Upstream Dependencies: " + + ( + if ((target_task.depends_on // []) | length) > 0 then + ((target_task.depends_on // []) | join(", ")) + else + "none" + end + ) + + "\n- Direct Downstream Impact: " + + ( + if (downstream_ids(target_task.id) | length) > 0 then + (downstream_ids(target_task.id) | join(", ")) + else + "none" + end + ) + + "\n- Target ACs: " + + ( + if ((target_task.target_ac // []) | length) > 0 then + ((target_task.target_ac // []) | join(", ")) + else + "-" + end + ) + + "\n- Risk Bucket: `" + + ((target_task.risk_bucket // "normal") | 
clean_text) + + "`" + + "\n- Allowed Scope Summary: " + + (scope_summary(target_task) | clean_text) + + "\n- Allowed Scope Paths: " + + (scope_paths(target_task) | clean_text) + + "\n- Protected Constraints: " + + (protected_constraints(target_task) | clean_text) + + "\n- Success Criteria: " + + (success_criteria(target_task) | clean_text) + + "\n- Stop Criteria: " + + (stop_criteria(target_task) | clean_text) + + "\n- Out Of Scope: " + + (out_of_scope(target_task) | clean_text) + + "\n\nIf you delegate work to a subagent, project these same packet fields and narrow the scope instead of dropping the global context." + else + empty + end + ' "$matrix_file" +} + +scenario_matrix_current_task_packet_markdown() { + scenario_matrix_render_task_packet_markdown "$1" "" +} + +scenario_matrix_task_packet_feedback_instructions_markdown() { + cat <<'EOF' +## Task Packet Feedback Readback + +If you delegated work or learned packet-relevant context that should influence future scheduling, record it in your summary under: + +## Task Packet Feedback +| Task ID | Source | Kind | Summary | +|---------|--------|------|---------| + +Allowed `Kind` values: +- `state_suggestion` +- `scope_update` +- `dependency_note` +- `cluster_hint` +- `stop_note` + +Only record non-authoritative observations or suggestions there. Do not treat them as direct task-state edits. 
+EOF +} + +scenario_matrix_extract_task_packet_feedback_section() { + local summary_file="$1" + + awk ' + /^##[[:space:]]*Task Packet Feedback([[:space:]]*$|[[:space:][:punct:]].*)/ { + in_section = 1 + next + } + /^##[[:space:]]+/ { + if (in_section) { + exit + } + } + in_section { + print + } + ' "$summary_file" +} + +scenario_matrix_parse_task_packet_feedback_json() { + local summary_file="$1" + local section + + section=$(scenario_matrix_extract_task_packet_feedback_section "$summary_file") + if [[ -z "$section" ]]; then + printf '[]\n' + return 0 + fi + + local -a table_lines=() + while IFS= read -r line; do + [[ "$line" =~ ^[[:space:]]*\| ]] || continue + table_lines+=("$line") + done <<< "$section" + + if [[ ${#table_lines[@]} -lt 2 ]]; then + printf '[]\n' + return 0 + fi + + local -a header_cells=() + scenario_matrix_read_lines_into_array header_cells "$(scenario_matrix_row_to_cells "${table_lines[0]}")" + + local idx_task_id=-1 + local idx_source=-1 + local idx_kind=-1 + local idx_summary=-1 + local i normalized_header + for i in "${!header_cells[@]}"; do + normalized_header=$(scenario_matrix_normalize_header "${header_cells[$i]}") + case "$normalized_header" in + "task id") + idx_task_id=$i + ;; + "source") + idx_source=$i + ;; + "kind") + idx_kind=$i + ;; + "summary") + idx_summary=$i + ;; + esac + done + + if [[ $idx_task_id -lt 0 || $idx_source -lt 0 || $idx_kind -lt 0 || $idx_summary -lt 0 ]]; then + printf '[]\n' + return 0 + fi + + local entries_json='[]' + local task_id source kind summary row_number + for ((i = 2; i < ${#table_lines[@]}; i++)); do + local -a row_cells=() + scenario_matrix_read_lines_into_array row_cells "$(scenario_matrix_row_to_cells "${table_lines[$i]}")" + row_number=$i + + task_id=$(scenario_matrix_trim "${row_cells[$idx_task_id]:-}") + source=$(scenario_matrix_trim "${row_cells[$idx_source]:-}") + kind=$(scenario_matrix_trim "${row_cells[$idx_kind]:-}") + summary=$(scenario_matrix_trim "${row_cells[$idx_summary]:-}") + + if 
[[ -z "$task_id" && -z "$source" && -z "$kind" && -z "$summary" ]]; then + continue + fi + + if [[ -z "$task_id" || -z "$source" || -z "$kind" || -z "$summary" ]]; then + continue + fi + + if [[ ! "$task_id" =~ ^[a-zA-Z0-9._-]+$ ]]; then + continue + fi + + entries_json=$(jq -cn \ + --argjson entries "$entries_json" \ + --arg task_id "$task_id" \ + --arg source "$source" \ + --arg kind "$kind" \ + --arg summary "$summary" \ + '$entries + [{ + task_id: $task_id, + suggested_by: $source, + kind: $kind, + summary: $summary + }]') + done + + printf '%s\n' "$entries_json" +} + +scenario_matrix_ingest_summary_feedback() { + local matrix_file="$1" + local summary_file="$2" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + if [[ ! -f "$summary_file" ]]; then + return 0 + fi + + local feedback_json source_file created_at temp_file + feedback_json=$(scenario_matrix_parse_task_packet_feedback_json "$summary_file") + source_file=$(basename "$summary_file") + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + temp_file="${matrix_file}.tmp.$$" + + jq \ + --arg source_file "$source_file" \ + --arg created_at "$created_at" \ + --argjson feedback_json "$feedback_json" ' + .feedback = (.feedback // {execution: [], review: []}) + | .feedback.execution = ( + (.feedback.execution // []) + | map(select((.source_file // "") != $source_file)) + ) + | .feedback.execution += ( + $feedback_json + | map( + . 
+ { + source: "execution", + source_file: $source_file, + created_at: $created_at, + authoritative: false + } + ) + ) + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" +} + +scenario_matrix_slugify() { + printf '%s' "$1" \ + | tr '[:upper:]' '[:lower:]' \ + | sed 's/[^a-z0-9][^a-z0-9]*/-/g; s/^-//; s/-$//; s/-\{2,\}/-/g' +} + +scenario_matrix_normalize_finding_key() { + local normalized + normalized=$(scenario_matrix_slugify "$1") + if [[ -n "$normalized" ]]; then + printf '%s\n' "$normalized" + else + printf 'finding\n' + fi +} + +scenario_matrix_severity_rank() { + local severity + severity=$(printf '%s' "${1:-P4}" | tr '[:upper:]' '[:lower:]') + case "$severity" in + p0) echo 0 ;; + p1) echo 1 ;; + p2) echo 2 ;; + p3) echo 3 ;; + *) echo 4 ;; + esac +} + +scenario_matrix_guess_review_kind() { + local summary_lower + summary_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') + + if printf '%s\n' "$summary_lower" | grep -qE 'investigat|question|clarif|unknown|analy[sz]e'; then + echo "investigation" + elif printf '%s\n' "$summary_lower" | grep -qE 'test|validation|assert|coverage|fixture|smoke|regression'; then + echo "validation" + elif printf '%s\n' "$summary_lower" | grep -qE 'cleanup|style|format|typo|wording|docs?|readme|comment|nit'; then + echo "cleanup" + else + echo "defect" + fi +} + +scenario_matrix_guess_review_confidence() { + case "$(scenario_matrix_severity_rank "$1")" in + 0) echo "0.98" ;; + 1) echo "0.92" ;; + 2) echo "0.80" ;; + 3) echo "0.65" ;; + *) echo "0.55" ;; + esac +} + +scenario_matrix_guess_review_cluster() { + local summary_lower + summary_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') + + if printf '%s\n' "$summary_lower" | grep -qE 'depend|downstream|upstream|contract|interface|schema|parser|validator'; then + echo "dependency-contract" + elif printf '%s\n' "$summary_lower" | grep -qE 'test|validation|assert|coverage|fixture|smoke|regression'; then + echo "validation" + elif printf '%s\n' 
"$summary_lower" | grep -qE 'monitor|prompt|tracker|matrix|hook|finalize'; then + echo "runtime-surface" + elif printf '%s\n' "$summary_lower" | grep -qE 'docs?|readme|wording|comment|typo'; then + echo "docs-cleanup" + elif printf '%s\n' "$summary_lower" | grep -qE 'cleanup|style|format|lint|rename|refactor'; then + echo "cleanup" + elif printf '%s\n' "$summary_lower" | grep -qE 'investigat|question|clarif'; then + echo "investigation" + else + echo "general-review" + fi +} + +scenario_matrix_guess_review_admission_status() { + local phase="$1" + local severity="$2" + local kind="$3" + local section="$4" + local summary_lower + local severity_rank + + summary_lower=$(printf '%s' "$5" | tr '[:upper:]' '[:lower:]') + severity_rank=$(scenario_matrix_severity_rank "$severity") + + if [[ "$section" == "queued" || "$kind" == "cleanup" ]]; then + echo "watchlist" + elif printf '%s\n' "$summary_lower" | grep -qE 'docs?-only|wording|typo|comment|nit|style|format'; then + echo "watchlist" + elif [[ "$severity_rank" -ge 3 ]]; then + echo "watchlist" + elif [[ "$phase" == "implementation" && "$section" == "queued" ]]; then + echo "watchlist" + else + echo "active" + fi +} + +scenario_matrix_guess_review_admission_reason() { + local phase="$1" + local admission_status="$2" + local section="$3" + + if [[ "$admission_status" == "watchlist" ]]; then + echo "low_impact_or_out_of_scope" + elif [[ "$phase" == "review" ]]; then + echo "review_blocking" + elif [[ "$section" == "blocking" ]]; then + echo "blocking_follow_up" + else + echo "review_follow_up" + fi +} + +scenario_matrix_guess_review_state() { + local phase="$1" + local severity="$2" + local section="$3" + local admission_status="$4" + local severity_rank + + severity_rank=$(scenario_matrix_severity_rank "$severity") + + if [[ "$admission_status" == "watchlist" ]]; then + echo "deferred" + elif [[ "$phase" == "review" || "$section" == "blocking" || "$severity_rank" -le 1 ]]; then + echo "blocked" + elif [[ "$section" == 
"mainline_gaps" ]]; then + echo "needs_replan" + else + echo "pending" + fi +} + +scenario_matrix_guess_review_lane() { + if [[ "$1" == "watchlist" ]]; then + echo "queued" + else + echo "supporting" + fi +} + +scenario_matrix_guess_review_routing() { + if [[ "$1" == "investigation" ]]; then + echo "analyze" + else + echo "coding" + fi +} + +scenario_matrix_has_task_id() { + local matrix_file="$1" + local task_id="$2" + + jq -e --arg task_id "$task_id" '.tasks | any(.[]; .id == $task_id)' "$matrix_file" >/dev/null 2>&1 +} + +scenario_matrix_summary_mentions_task_id() { + local summary="$1" + local task_id="$2" + local remainder prefix match_start before_char after_char + + [[ -z "$summary" || -z "$task_id" ]] && return 1 + + remainder="$summary" + while [[ "$remainder" == *"$task_id"* ]]; do + prefix="${remainder%%"$task_id"*}" + match_start=${#prefix} + before_char="" + after_char="" + + if [[ "$match_start" -gt 0 ]]; then + before_char="${remainder:$((match_start - 1)):1}" + fi + if [[ $((match_start + ${#task_id})) -lt ${#remainder} ]]; then + after_char="${remainder:$((match_start + ${#task_id})):1}" + fi + + if [[ ! "$before_char" =~ [[:alnum:]_.-] ]] && [[ ! 
"$after_char" =~ [[:alnum:]_.-] ]]; then + return 0 + fi + + remainder="${remainder:$((match_start + 1))}" + done + + return 1 +} + +scenario_matrix_find_explicit_task_reference() { + local matrix_file="$1" + local summary="$2" + local summary_lower task_id task_id_lower + + summary_lower=$(printf '%s' "$summary" | tr '[:upper:]' '[:lower:]') + + while IFS= read -r task_id; do + [[ -z "$task_id" ]] && continue + if scenario_matrix_summary_mentions_task_id "$summary" "$task_id"; then + printf '%s\n' "$task_id" + return 0 + fi + + task_id_lower=$(printf '%s' "$task_id" | tr '[:upper:]' '[:lower:]') + if [[ "$task_id_lower" != "$task_id" ]] && scenario_matrix_summary_mentions_task_id "$summary_lower" "$task_id_lower"; then + printf '%s\n' "$task_id" + return 0 + fi + done < <(jq -r '.tasks[]?.id // empty' "$matrix_file") + + return 1 +} + +scenario_matrix_task_target_ac_json() { + local matrix_file="$1" + local task_id="$2" + + if [[ -z "$task_id" ]]; then + printf '[]\n' + return 0 + fi + + jq -c --arg task_id "$task_id" ' + first(.tasks[] | select(.id == $task_id) | (.target_ac // [])) // [] + ' "$matrix_file" +} + +scenario_matrix_is_dependency_related_summary() { + local summary_lower + summary_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') + if printf '%s\n' "$summary_lower" | grep -qE 'depend|downstream|upstream|contract|interface|schema|validator'; then + return 0 + fi + return 1 +} + +scenario_matrix_infer_related_task_id() { + local matrix_file="$1" + local summary="$2" + local primary_task_id="${3:-}" + local summary_lower + + summary_lower=$(printf '%s' "$summary" | tr '[:upper:]' '[:lower:]') + if [[ -n "$primary_task_id" ]] && scenario_matrix_is_dependency_related_summary "$summary"; then + local dependent_count dependent_task_id + dependent_count=$(jq -r --arg primary_task_id "$primary_task_id" ' + [ + .tasks[] + | select(.id != $primary_task_id) + | select(((.depends_on // []) | index($primary_task_id)) != null) + | .id + ] | length + ' 
"$matrix_file") + + if [[ "$dependent_count" == "1" ]]; then + dependent_task_id=$(jq -r --arg primary_task_id "$primary_task_id" ' + first( + .tasks[] + | select(.id != $primary_task_id) + | select(((.depends_on // []) | index($primary_task_id)) != null) + | .id + ) // empty + ' "$matrix_file") + if [[ -n "$dependent_task_id" ]]; then + printf '%s\n' "$dependent_task_id" + return 0 + fi + elif [[ "$dependent_count" != "0" ]]; then + # When multiple dependent tasks could match the same finding, keep it + # as a standalone bounded task instead of mutating the wrong dependent. + return 0 + fi + fi + + if [[ -n "$primary_task_id" ]]; then + printf '%s\n' "$primary_task_id" + fi +} + +scenario_matrix_build_generated_task_id() { + local matrix_file="$1" + local round="$2" + local finding_index="$3" + local finding_key="$4" + local slug base_id candidate suffix + + slug=$(scenario_matrix_normalize_finding_key "$finding_key") + slug=${slug:0:24} + base_id="finding-r${round}-f${finding_index}" + if [[ -n "$slug" ]]; then + base_id="${base_id}-${slug}" + fi + + candidate="$base_id" + suffix=2 + while scenario_matrix_has_task_id "$matrix_file" "$candidate"; do + candidate="${base_id}-${suffix}" + suffix=$((suffix + 1)) + done + + printf '%s\n' "$candidate" +} + +scenario_matrix_extract_review_findings_json() { + local matrix_file="$1" + local round="$2" + local review_phase="$3" + local review_content="$4" + local primary_task_id current_section findings_json finding_index + local line trimmed trimmed_lower severity summary file_ref kind confidence + local cluster_key cluster_id repair_wave admission_status admission_reason + local lane state routing related_task_id link_task_id target_ac_json + local depends_on_json finding_key task_id event_id explicit_task_id + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + primary_task_id=$(scenario_matrix_current_primary_task_id "$matrix_file" 2>/dev/null || true) + current_section="" + findings_json='[]' + finding_index=0 + + while IFS= read -r line; do + trimmed=$(scenario_matrix_trim "$line") + [[ -z "$trimmed" ]] && continue + + trimmed_lower=$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]') + case "$trimmed_lower" in + "## mainline gaps"*|"mainline gaps"*) + current_section="mainline_gaps" + continue + ;; + "## blocking side issues"*|"blocking side issues"*) + current_section="blocking" + continue + ;; + "## queued side issues"*|"queued side issues"*) + current_section="queued" + continue + ;; + esac + + severity="" + summary="" + file_ref="" + if [[ "$trimmed" =~ ^[*-]?[[:space:]]*\[(P[0-9]+)\][[:space:]]*(.+)$ ]]; then + severity="${BASH_REMATCH[1]}" + summary="${BASH_REMATCH[2]}" + elif [[ -n "$current_section" && "$trimmed" =~ ^[*-][[:space:]]+(.+)$ ]]; then + summary="${BASH_REMATCH[1]}" + case "$current_section" in + blocking) severity="P1" ;; + queued) severity="P3" ;; + *) severity="P2" ;; + esac + else + continue + fi + + summary=$(scenario_matrix_trim "$summary") + if [[ "$summary" =~ ^(.*)[[:space:]]-[[:space:]](\/[^[:space:]]+:[0-9][0-9:-]*)$ ]]; then + summary=$(scenario_matrix_trim "${BASH_REMATCH[1]}") + file_ref="${BASH_REMATCH[2]}" + fi + [[ -z "$summary" ]] && continue + + kind=$(scenario_matrix_guess_review_kind "$summary") + confidence=$(scenario_matrix_guess_review_confidence "$severity") + cluster_key=$(scenario_matrix_guess_review_cluster "$summary") + cluster_id="cluster-${cluster_key}" + repair_wave="wave-r${round}-${cluster_key}" + admission_status=$(scenario_matrix_guess_review_admission_status "$review_phase" "$severity" "$kind" "$current_section" "$summary") + admission_reason=$(scenario_matrix_guess_review_admission_reason "$review_phase" "$admission_status" "$current_section") + lane=$(scenario_matrix_guess_review_lane 
"$admission_status") + state=$(scenario_matrix_guess_review_state "$review_phase" "$severity" "$current_section" "$admission_status") + routing=$(scenario_matrix_guess_review_routing "$kind") + explicit_task_id=$(scenario_matrix_find_explicit_task_reference "$matrix_file" "$summary" 2>/dev/null || true) + if [[ -n "$explicit_task_id" ]]; then + related_task_id="$explicit_task_id" + else + related_task_id=$(scenario_matrix_infer_related_task_id "$matrix_file" "$summary" "$primary_task_id") + fi + link_task_id="" + if [[ -n "$explicit_task_id" ]]; then + link_task_id="$explicit_task_id" + elif [[ -n "$related_task_id" && "$related_task_id" != "$primary_task_id" ]]; then + link_task_id="$related_task_id" + fi + + target_ac_json=$(scenario_matrix_task_target_ac_json "$matrix_file" "${related_task_id:-$primary_task_id}") + depends_on_json='[]' + if [[ "$admission_status" != "watchlist" && -n "$primary_task_id" ]]; then + depends_on_json=$(jq -cn --arg primary_task_id "$primary_task_id" '[ $primary_task_id ]') + fi + + finding_key=$(scenario_matrix_normalize_finding_key "${summary} ${file_ref}") + finding_index=$((finding_index + 1)) + task_id=$(scenario_matrix_build_generated_task_id "$matrix_file" "$round" "$finding_index" "$finding_key") + event_id="evt-finding-${review_phase}-${round}-${finding_index}" + + findings_json=$(jq -cn \ + --argjson findings_json "$findings_json" \ + --arg event_id "$event_id" \ + --arg task_id "$task_id" \ + --arg title "$summary" \ + --arg summary "$summary" \ + --arg severity "$severity" \ + --argjson confidence "$confidence" \ + --arg kind "$kind" \ + --arg source "review" \ + --arg review_phase "$review_phase" \ + --arg cluster_id "$cluster_id" \ + --arg repair_wave "$repair_wave" \ + --arg admission_status "$admission_status" \ + --arg admission_reason "$admission_reason" \ + --arg lane "$lane" \ + --arg state "$state" \ + --arg routing "$routing" \ + --arg risk_bucket "$(if [[ "$(scenario_matrix_severity_rank "$severity")" -le 1 ]]; 
then echo "high"; elif [[ "$(scenario_matrix_severity_rank "$severity")" -eq 2 ]]; then echo "medium"; else echo "low"; fi)" \ + --arg finding_key "$finding_key" \ + --arg related_task_id "${related_task_id:-}" \ + --arg link_task_id "${link_task_id:-}" \ + --arg file_ref "${file_ref:-}" \ + --argjson target_ac "$target_ac_json" \ + --argjson depends_on "$depends_on_json" \ + '$findings_json + [{ + event_id: $event_id, + task_id: $task_id, + title: $title, + summary: $summary, + severity: $severity, + confidence: $confidence, + source: $source, + kind: $kind, + review_phase: $review_phase, + cluster_id: $cluster_id, + repair_wave: $repair_wave, + admission_status: $admission_status, + admission_reason: $admission_reason, + lane: $lane, + state: $state, + routing: $routing, + risk_bucket: $risk_bucket, + finding_key: $finding_key, + related_task_id: (if $related_task_id == "" then null else $related_task_id end), + link_task_id: (if $link_task_id == "" then null else $link_task_id end), + file_ref: (if $file_ref == "" then null else $file_ref end), + target_ac: $target_ac, + depends_on: $depends_on + }]') + done <<< "$review_content" + + printf '%s\n' "$findings_json" +} + +scenario_matrix_ingest_review_findings() { + local matrix_file="$1" + local round="$2" + local review_phase="$3" + local review_content="$4" + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + local findings_json created_at temp_file + findings_json=$(scenario_matrix_extract_review_findings_json "$matrix_file" "$round" "$review_phase" "$review_content") + if [[ "$(printf '%s' "$findings_json" | jq 'length')" -eq 0 ]]; then + return 0 + fi + + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + temp_file="${matrix_file}.tmp.$$" + + jq \ + --arg created_at "$created_at" \ + --arg round "$round" \ + --arg review_phase "$review_phase" \ + --argjson findings "$findings_json" ' + def review_actor: + if $review_phase == "review" then + "code_review" + else + "implementation_review" + end; + + def severity_rank($severity): + (($severity // "P4") | ascii_downcase) as $s + | if $s == "p0" then 0 + elif $s == "p1" then 1 + elif $s == "p2" then 2 + elif $s == "p3" then 3 + else 4 + end; + + def higher_severity($left; $right): + if ($left // null) == null then + $right + elif ($right // null) == null then + $left + elif severity_rank($left) <= severity_rank($right) then + $left + else + $right + end; + + def risk_rank($bucket): + (($bucket // "planned") | ascii_downcase) as $b + | if $b == "high" then 0 + elif $b == "medium" then 1 + elif $b == "planned" then 2 + elif $b == "low" then 3 + else 2 + end; + + def higher_risk_bucket($left; $right): + if risk_rank($left) <= risk_rank($right) then + ($left // "planned") + else + ($right // "planned") + end; + + def review_surface_key_from_summary($summary): + ($summary // "" | ascii_downcase) as $s + | if $s | test("reserv|claim|release|haul|cargo|carried|carry-invariant|operate job") then + "reservation-lifecycle" + elif $s | test("rollback|reassign|interrupt|cancel|restore") then + "rollback-symmetry" + elif $s | test("conduit|payload|pipe|medium|evacuation|gas masses|liquid-to-solid|wires") then + "conduit-medium-integrity" + elif $s | test("thermal|temperature|heat|overpressure") then + "thermal-state" + elif $s | 
test("projection|overlay|snapshot|serializ|render kind") then + "projection-snapshot" + elif $s | test("place|placement|footprint|unreachable interaction|intake/output|home positions|overlap|remove mode") then + "placement-legality" + elif $s | test("door|movement|blocking during footprint|door closure") then + "door-flow-consistency" + elif $s | test("matter|mass|conservation|non-finite") then + "resource-conservation" + elif $s | test("generator|battery|power") then + "power-consistency" + elif $s | test("path|reachable") then + "pathing-reachability" + elif $s | test("test|validation|assert|coverage|fixture|smoke|regression") then + "validation-coverage" + elif $s | test("docs?|readme|wording|comment|typo") then + "docs-cleanup" + elif $s | test("cleanup|style|format|lint|rename|refactor") then + "cleanup" + elif $s | test("investigat|question|clarif") then + "investigation" + else + "general-review" + end; + + def review_surface_label($surface_key): + if $surface_key == "reservation-lifecycle" then + "Reservation / lifecycle backlog" + elif $surface_key == "rollback-symmetry" then + "Rollback symmetry backlog" + elif $surface_key == "conduit-medium-integrity" then + "Conduit / medium integrity backlog" + elif $surface_key == "thermal-state" then + "Thermal state backlog" + elif $surface_key == "projection-snapshot" then + "Projection / snapshot backlog" + elif $surface_key == "placement-legality" then + "Placement legality backlog" + elif $surface_key == "door-flow-consistency" then + "Door / flow consistency backlog" + elif $surface_key == "resource-conservation" then + "Resource conservation backlog" + elif $surface_key == "power-consistency" then + "Power consistency backlog" + elif $surface_key == "pathing-reachability" then + "Pathing / reachability backlog" + elif $surface_key == "validation-coverage" then + "Validation coverage backlog" + elif $surface_key == "docs-cleanup" then + "Docs cleanup backlog" + elif $surface_key == "cleanup" then + "Cleanup 
backlog" + elif $surface_key == "investigation" then + "Investigation backlog" + else + "General review backlog" + end; + + def review_group_key_from_fields($related_task_id; $link_task_id; $admission_status; $summary): + (review_surface_key_from_summary($summary)) as $surface_key + | (if ($admission_status // "active") == "watchlist" then "watchlist" else "active" end) as $mode + | (($link_task_id // $related_task_id // "global") | tostring) as $subject_key + | ($mode + ":" + $subject_key + ":" + $surface_key); + + def feedback_entry($finding): + { + source: "review", + kind: ( + if $finding.admission_status == "watchlist" then + "watchlist_finding" + else + "structured_finding" + end + ), + task_id: ($finding.link_task_id // $finding.related_task_id // null), + summary: $finding.summary, + suggested_by: review_actor, + source_file: null, + created_at: $created_at, + authoritative: false + }; + + def event_entry($finding): + { + id: $finding.event_id, + type: "review_finding", + round: ($round | tonumber), + phase: $review_phase, + task_id: ($finding.link_task_id // $finding.related_task_id // null), + severity: $finding.severity, + kind: $finding.kind, + review_phase: $finding.review_phase, + finding_key: $finding.finding_key, + finding_id: $finding.task_id, + group_key: review_group_key_from_fields(($finding.related_task_id // null); ($finding.link_task_id // null); $finding.admission_status; $finding.summary), + cluster_id: $finding.cluster_id, + repair_wave: $finding.repair_wave, + admission_status: $finding.admission_status, + created_at: $created_at + }; + + def is_embedded_finding_task($task): + (($task.metadata.seed_source // "") == "review_finding") + or ( + (($task.id // "") | startswith("finding-r")) + and ( + (($task.source // "") == "review") + or ((($task.metadata.finding_key // "") | length) > 0) + or ((($task.metadata.finding_summary // "") | length) > 0) + or ((($task.review_phase // "") | length) > 0) + ) + ); + + def 
raw_finding_from_task($task): + (($task.metadata.finding_summary // $task.title // "Review finding") | tostring) as $summary + | (review_surface_key_from_summary($summary)) as $surface_key + | { + id: ($task.id // "finding"), + title: ($task.title // $summary), + summary: $summary, + severity: ($task.severity // null), + confidence: ($task.confidence // null), + source: "review", + kind: ($task.kind // "defect"), + review_phase: ($task.review_phase // $review_phase), + cluster_id: ($task.cluster_id // null), + repair_wave_hint: ($task.repair_wave // null), + admission_status: ($task.admission.status // "active"), + admission_reason: ($task.admission.reason // "legacy_review_finding"), + lane: ($task.lane // "queued"), + state: ($task.state // "blocked"), + routing: ($task.routing // "coding"), + risk_bucket: ($task.risk_bucket // "planned"), + finding_key: ( + $task.metadata.finding_key + // ( + ($summary + " " + ($task.file_ref // "")) + | ascii_downcase + | gsub("[^a-z0-9._ -]"; " ") + | gsub("\\s+"; "-") + | gsub("^-|-$"; "") + ) + ), + related_task_id: ($task.metadata.related_task_id // null), + link_task_id: null, + file_ref: ($task.file_ref // null), + target_ac: ($task.target_ac // []), + depends_on: ($task.depends_on // []), + surface_key: $surface_key, + surface_label: review_surface_label($surface_key), + group_key: review_group_key_from_fields(($task.metadata.related_task_id // null); null; ($task.admission.status // "active"); $summary), + first_seen_round: ($task.metadata.source_round // ($round | tonumber)), + last_seen_round: ($task.metadata.source_round // ($round | tonumber)), + occurrence_count: 1 + }; + + def raw_finding_from_finding($finding): + (review_surface_key_from_summary($finding.summary)) as $surface_key + | { + id: $finding.task_id, + title: $finding.title, + summary: $finding.summary, + severity: $finding.severity, + confidence: $finding.confidence, + source: $finding.source, + kind: $finding.kind, + review_phase: 
$finding.review_phase, + cluster_id: ($finding.cluster_id // null), + repair_wave_hint: ($finding.repair_wave // null), + admission_status: $finding.admission_status, + admission_reason: $finding.admission_reason, + lane: $finding.lane, + state: $finding.state, + routing: $finding.routing, + risk_bucket: $finding.risk_bucket, + finding_key: $finding.finding_key, + related_task_id: ($finding.related_task_id // null), + link_task_id: ($finding.link_task_id // null), + file_ref: ($finding.file_ref // null), + target_ac: ($finding.target_ac // []), + depends_on: ($finding.depends_on // []), + surface_key: $surface_key, + surface_label: review_surface_label($surface_key), + group_key: review_group_key_from_fields(($finding.related_task_id // null); ($finding.link_task_id // null); $finding.admission_status; $finding.summary), + first_seen_round: ($round | tonumber), + last_seen_round: ($round | tonumber), + occurrence_count: 1 + }; + + def merge_raw_finding($existing; $incoming): + $existing + | .title = $incoming.title + | .summary = $incoming.summary + | .severity = higher_severity(($existing.severity // null); ($incoming.severity // null)) + | .confidence = ($incoming.confidence // $existing.confidence) + | .source = ($incoming.source // $existing.source) + | .kind = ($incoming.kind // $existing.kind) + | .review_phase = ($incoming.review_phase // $existing.review_phase) + | .cluster_id = ($incoming.cluster_id // $existing.cluster_id) + | .repair_wave_hint = ($incoming.repair_wave_hint // $existing.repair_wave_hint) + | .admission_status = ($incoming.admission_status // $existing.admission_status) + | .admission_reason = ($incoming.admission_reason // $existing.admission_reason) + | .lane = ($incoming.lane // $existing.lane) + | .state = ($incoming.state // $existing.state) + | .routing = ($incoming.routing // $existing.routing) + | .risk_bucket = higher_risk_bucket(($existing.risk_bucket // "planned"); ($incoming.risk_bucket // "planned")) + | .related_task_id = 
($incoming.related_task_id // $existing.related_task_id) + | .link_task_id = ($incoming.link_task_id // $existing.link_task_id) + | .file_ref = ($incoming.file_ref // $existing.file_ref) + | .target_ac = (((($existing.target_ac // []) + ($incoming.target_ac // [])) | unique)) + | .depends_on = (((($existing.depends_on // []) + ($incoming.depends_on // [])) | unique)) + | .surface_key = ($incoming.surface_key // $existing.surface_key) + | .surface_label = ($incoming.surface_label // $existing.surface_label) + | .group_key = ($incoming.group_key // $existing.group_key) + | .first_seen_round = ($existing.first_seen_round // ($round | tonumber)) + | .last_seen_round = ($round | tonumber) + | .occurrence_count = (($existing.occurrence_count // 0) + 1); + + def annotate_linked_task($task; $finding): + $task + | (.metadata // {}) as $metadata + | .metadata = ( + $metadata + + { + last_review_round: ($round | tonumber), + last_review_phase: $review_phase, + last_review_finding_key: $finding.finding_key, + last_review_summary: $finding.summary, + review_finding_keys: (((($metadata.review_finding_keys // []) + [$finding.finding_key]) | unique)), + related_task_id: ($finding.related_task_id // $metadata.related_task_id // null) + } + ) + | if $finding.admission_status == "watchlist" then + . 
+ else + .state = $finding.state + | .risk_bucket = higher_risk_bucket((.risk_bucket // "planned"); ($finding.risk_bucket // "planned")) + end; + + def task_title_by_id($task_id; $tasks): + first($tasks[] | select(.id == $task_id) | .title) // $task_id; + + def finding_group_state($items): + if ($items | all(.[]; ((.admission_status // "active") == "watchlist") or ((.state // "") == "deferred"))) then + "deferred" + elif ($items | any(.[]; ((.state // "") == "blocked") or ((.state // "") == "needs_replan"))) then + "blocked" + else + "queued" + end; + + def finding_group_risk_bucket($items): + ($items | map(.risk_bucket // "planned") | sort_by(risk_rank(.)) | first) // "planned"; + + def finding_group_severity($items): + ($items | map(.severity // "P4") | sort_by(severity_rank(.)) | first) // null; + + def build_finding_groups($raw_findings; $tasks): + [ + $raw_findings[] + | select( + (.link_task_id // null) == null + or (.admission_status // "active") == "watchlist" + ) + ] + | sort_by([.group_key, .finding_key]) + | group_by(.group_key) + | map( + . as $items + | $items[0] as $rep + | ( + [ + $items[] + | (.link_task_id // .related_task_id // null) + | select(. != null) + ] | unique + ) as $related_ids + | ([$related_ids[] | task_title_by_id(.; $tasks)] | unique) as $related_titles + | ([$items[] | select((.file_ref // null) != null) | .file_ref] | unique) as $file_refs + | ([$items[] | .target_ac[]?] 
| unique) as $target_acs + | ([$items[] | .id] | unique) as $finding_ids + | ([$items[] | .summary] | unique) as $sample_summaries + | (finding_group_state($items)) as $group_state + | { + id: $rep.group_key, + title: ( + ($rep.surface_label // "Review backlog") + + ( + if ($related_titles | length) > 0 then + " for " + ($related_titles | join(", ")) + else + "" + end + ) + ), + summary: ( + ( + if ($sample_summaries | length) > 0 then + $sample_summaries[0] + else + ($rep.summary // "Review backlog") + end + ) + + " [" + + (($finding_ids | length) | tostring) + + " findings]" + ), + surface_key: $rep.surface_key, + surface_label: $rep.surface_label, + state: $group_state, + admission_status: ( + if $group_state == "deferred" then + "watchlist" + else + "active" + end + ), + severity: finding_group_severity($items), + risk_bucket: finding_group_risk_bucket($items), + target_ac: $target_acs, + related_task_ids: $related_ids, + file_refs: $file_refs, + finding_ids: $finding_ids, + sample_summaries: ($sample_summaries[:3]), + finding_count: ($finding_ids | length), + first_seen_round: ($items | map(.first_seen_round // ($round | tonumber)) | min), + last_seen_round: ($items | map(.last_seen_round // ($round | tonumber)) | max) + } + ) + | sort_by([ + ( + if .state == "blocked" then + 0 + elif .state == "queued" then + 1 + else + 2 + end + ), + -(.last_seen_round // 0), + .id + ]); + + .feedback = (.feedback // {execution: [], review: []}) + | .events = (.events // []) + | .raw_findings = (.raw_findings // []) + | .finding_groups = (.finding_groups // []) + | ([.tasks[] | select(is_embedded_finding_task(.)) | raw_finding_from_task(.)]) as $migrated_findings + | .tasks |= map(select(is_embedded_finding_task(.) 
| not)) + | reduce $migrated_findings[] as $raw_finding ( + .; + (.raw_findings | map(.finding_key // "") | index($raw_finding.finding_key)) as $existing_idx + | if $existing_idx != null then + .raw_findings[$existing_idx] = merge_raw_finding(.raw_findings[$existing_idx]; $raw_finding) + else + .raw_findings += [$raw_finding] + end + ) + | reduce $findings[] as $finding ( + .; + (raw_finding_from_finding($finding)) as $raw_finding + | (.raw_findings | map(.finding_key // "") | index($raw_finding.finding_key)) as $existing_raw_idx + | if $existing_raw_idx != null then + .raw_findings[$existing_raw_idx] = merge_raw_finding(.raw_findings[$existing_raw_idx]; $raw_finding) + else + .raw_findings += [$raw_finding] + end + | ( + if ($finding.link_task_id // null) == null or ($finding.link_task_id // "") == "" then + null + else + (.tasks | map(.id) | index($finding.link_task_id)) + end + ) as $linked_idx + | if $linked_idx != null then + .tasks[$linked_idx] = annotate_linked_task(.tasks[$linked_idx]; $finding) + else + . + end + | .feedback.review += [feedback_entry($finding)] + | .events += [event_entry($finding)] + ) + | .finding_groups = build_finding_groups(.raw_findings; .tasks) + | .raw_findings |= sort_by([-(.last_seen_round // 0), .finding_key]) + | .feedback.review |= sort_by([(.created_at // ""), (.summary // "")]) | .feedback.review |= reverse + | .events |= sort_by([(.round // 0), (.id // "")]) | .events |= reverse + | if (.manager | type) == "object" then + .manager.last_reconciled_at = $created_at + else + . + end + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" || { + # Do not leave a stale temp file behind when jq or mv fails. + rm -f "$temp_file" + return 1 + } +} + +scenario_matrix_reconcile_manager_state() { + local matrix_file="$1" + local current_round="${2:-}" + local frontier_reason="${3:-manager_reconcile}" + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + local created_at temp_file + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + temp_file="${matrix_file}.tmp.$$" + + jq \ + --arg created_at "$created_at" \ + --arg current_round "$current_round" \ + --arg frontier_reason "$frontier_reason" ' + . as $root + | + def severity_rank($severity): + (($severity // "P4") | ascii_downcase) as $s + | if $s == "p0" then 0 + elif $s == "p1" then 1 + elif $s == "p2" then 2 + elif $s == "p3" then 3 + else 4 + end; + + def severity_score($severity): + if severity_rank($severity) == 0 then 90 + elif severity_rank($severity) == 1 then 75 + elif severity_rank($severity) == 2 then 50 + elif severity_rank($severity) == 3 then 20 + else 0 + end; + + def risk_score($bucket): + (($bucket // "planned") | ascii_downcase) as $b + | if $b == "high" then 70 + elif $b == "medium" then 45 + elif $b == "planned" then 25 + elif $b == "low" then 10 + else 20 + end; + + def state_score($state): + if $state == "needs_replan" then 105 + elif $state == "blocked" then 95 + elif $state == "in_progress" then 85 + elif $state == "ready" then 75 + elif $state == "pending" then 55 + else 0 + end; + + def is_open_task($task): + ($task.state // "pending") != "done" + and ($task.state // "pending") != "deferred"; + + def is_watchlist($task): + ($task.admission.status // "active") == "watchlist"; + + def is_active_candidate($task): + is_open_task($task) and (is_watchlist($task) | not); + + def is_open_finding_group($group): + ($group.state // "queued") != "deferred"; + + def is_watchlist_finding_group($group): + ($group.admission_status // "active") == "watchlist" + or (($group.state // "queued") == "deferred"); + + def is_active_finding_group($group): + is_open_finding_group($group) and (is_watchlist_finding_group($group) | not); + + def dependency_criticality($task): + [ + $root.tasks[] + | select(.id != ($task.id // "")) + | select(((.depends_on // []) | index($task.id)) != null) + | .id + ] | 
length; + + def repair_wave_size($task): + if (($task.repair_wave // "") | length) > 0 then + [ + $root.tasks[] + | select((.repair_wave // "") == $task.repair_wave) + | select(is_active_candidate(.)) + | .id + ] | length + else + 0 + end; + + def frontier_task_score($task): + (state_score($task.state // "pending")) + + (risk_score($task.risk_bucket // "planned")) + + (severity_score($task.severity // null)) + + ((dependency_criticality($task)) * 12) + + ( + if repair_wave_size($task) > 1 then + (repair_wave_size($task) * 6) + else + 0 + end + ) + + ( + if ($task.kind // "feature") == "defect" then + 15 + elif ($task.kind // "feature") == "validation" then + 10 + elif ($task.kind // "feature") == "investigation" then + 5 + else + 0 + end + ) + + ( + if ($task.id // "") == ($root.manager.current_primary_task_id // "") then + 30 + else + 0 + end + ) + + ( + if ($task.lane // "") == "mainline" then + 15 + else + 0 + end + ); + + def task_by_id($task_id): + first($root.tasks[] | select(.id == $task_id)); + + def current_primary_candidate: + (.manager.current_primary_task_id // null) as $current_primary_task_id + | if $current_primary_task_id == null then + null + else + (task_by_id($current_primary_task_id) // null) as $candidate + | if ($candidate | type) == "object" and is_active_candidate($candidate) then + $candidate + else + null + end + end; + + def is_runnable_primary_candidate($task): + is_active_candidate($task) + and (($task.state // "pending") != "blocked") + and (($task.state // "pending") != "needs_replan"); + + def ranked_primary_candidate: + ( + [ + $root.tasks[] + | select(is_runnable_primary_candidate(.)) + | {id: .id, score: frontier_task_score(.)} + ] + ) as $runnable + | ( + if ($runnable | length) > 0 then + $runnable + else + [ + $root.tasks[] + | select(is_active_candidate(.)) + | {id: .id, score: frontier_task_score(.)} + ] + end + ) as $candidate_pool + | if ($candidate_pool | length) > 0 then + ($candidate_pool | sort_by([.score, .id]) | last 
| .id) + else + null + end; + + def related_to_primary($task; $primary): + ($task.id // "") != ($primary.id // "") + and ( + (($task.state // "pending") == "blocked") + or (($task.state // "pending") == "needs_replan") + or ((($primary.depends_on // []) | index($task.id)) != null) + or ( + (($primary.repair_wave // "") | length) > 0 + and (($task.repair_wave // "") == ($primary.repair_wave // "")) + ) + ); + + def supporting_ids($primary_id): + (task_by_id($primary_id) // null) as $primary + | if ($primary | type) != "object" then + [] + else + [ + $root.tasks[] + | select(is_active_candidate(.)) + | select(related_to_primary(.; $primary)) + | {id: .id, score: frontier_task_score(.)} + ] + | sort_by([-.score, .id]) + | .[:3] + | map(.id) + end; + + def residual_task_risk($task): + if is_watchlist($task) then + 6 + elif ($task.state // "pending") == "done" or ($task.state // "pending") == "deferred" then + 0 + else + ((risk_score($task.risk_bucket // "planned") / 2) | floor) + + ((severity_score($task.severity // null) / 3) | floor) + + ( + if ($task.state // "pending") == "blocked" or ($task.state // "pending") == "needs_replan" then + 22 + elif ($task.state // "pending") == "in_progress" then + 14 + else + 8 + end + ) + end; + + def residual_finding_group_risk($group): + if is_watchlist_finding_group($group) then + 6 + elif ($group.state // "queued") == "blocked" or ($group.state // "queued") == "needs_replan" then + ((risk_score($group.risk_bucket // "planned") / 2) | floor) + + ((severity_score($group.severity // null) / 3) | floor) + + 22 + else + ((risk_score($group.risk_bucket // "planned") / 2) | floor) + + ((severity_score($group.severity // null) / 3) | floor) + + 8 + end; + + def convergence_guidance($status; $frontier_changed; $primary_id): + if $status == "converged" then + "No must-fix or high-risk active tasks remain. Reuse the current checkpoint for verification/closure and avoid opening new implementation scope." 
+ elif $status == "stabilizing" then + "Residual risk is low and recent checkpoints are not producing new high-value findings. Hold the current checkpoint and focus on bounded verification." + elif $frontier_changed then + "The frontier changed. Refresh the contract from the new checkpoint and keep exactly one primary objective." + elif $primary_id != null then + "Continue within the current checkpoint. Do not widen scope unless the manager promotes a new frontier." + else + "Reconcile the frontier before starting new work." + end; + + (.runtime.current_round // 0) as $existing_round + | ( + if ($current_round | length) > 0 then + ($current_round | tonumber) + else + $existing_round + end + ) as $round_num + | (current_primary_candidate) as $current_primary_task + | ( + if ($current_primary_task | type) == "object" then + ($current_primary_task.id // null) + else + ranked_primary_candidate + end + ) as $primary_id + | (supporting_ids($primary_id)) as $supporting_ids + | .runtime.current_round = $round_num + | .tasks |= map( + .id as $task_id + | + if (.state // "pending") == "done" or (.state // "pending") == "deferred" then + . + else + .lane = ( + if ($primary_id != null and $task_id == $primary_id) then + "mainline" + elif is_watchlist(.) then + "queued" + elif ($supporting_ids | index($task_id)) != null then + "supporting" + elif (.state == "blocked" or .state == "needs_replan") then + "supporting" + else + "queued" + end + ) + end + ) + | if (.manager | type) == "object" then + .manager.current_primary_task_id = $primary_id + | .manager.last_reconciled_at = $created_at + else + . 
+ end + | ( + (task_by_id($primary_id) // null) as $primary_task + | ({ + primary_task_id: $primary_id, + supporting_task_ids: $supporting_ids, + open_active_ids: [ + .tasks[] + | select(is_active_candidate(.)) + | .id + ], + open_finding_group_ids: [ + (.finding_groups // [])[] + | select(is_active_finding_group(.)) + | .id + ], + blocked_ids: [ + .tasks[] + | select(is_active_candidate(.)) + | select(.state == "blocked" or .state == "needs_replan") + | .id + ], + blocked_finding_group_ids: [ + (.finding_groups // [])[] + | select(is_active_finding_group(.)) + | select(.state == "blocked" or .state == "needs_replan") + | .id + ], + repair_waves: [ + .tasks[] + | select(is_active_candidate(.)) + | select((.repair_wave // "") != "") + | .repair_wave + ] | unique + }) as $frontier_descriptor + | ($frontier_descriptor | @json) as $frontier_signature + | (.runtime.checkpoint // {}) as $checkpoint + | (($checkpoint.frontier_signature // "") != $frontier_signature) as $frontier_changed + | ( + if (($checkpoint.sequence // 0) == 0) then + 1 + elif $frontier_changed then + (($checkpoint.sequence // 0) + 1) + else + ($checkpoint.sequence // 0) + end + ) as $checkpoint_sequence + | ( + ([.tasks[] | select(is_active_candidate(.))] | length) + + ([.finding_groups[]? | select(is_active_finding_group(.))] | length) + ) as $active_task_count + | ( + ([.tasks[] | select(is_watchlist(.))] | length) + + ([.finding_groups[]? | select(is_watchlist_finding_group(.))] | length) + ) as $watchlist_count + | ( + ([.tasks[] + | select(is_active_candidate(.)) + | select( + (.state == "blocked") + or (.state == "needs_replan") + or ((.risk_bucket // "planned") == "high") + or (severity_rank(.severity // null) <= 1) + ) + ] | length) + + ([.finding_groups[]? 
+ | select(is_active_finding_group(.)) + | select( + (.state == "blocked") + or (.state == "needs_replan") + or ((.risk_bucket // "planned") == "high") + or (severity_rank(.severity // null) <= 1) + ) + ] | length) + ) as $must_fix_open_count + | ( + ([.tasks[] + | select(is_active_candidate(.)) + | select((.risk_bucket // "planned") == "high" or severity_rank(.severity // null) <= 1) + ] | length) + + ([.finding_groups[]? + | select(is_active_finding_group(.)) + | select((.risk_bucket // "planned") == "high" or severity_rank(.severity // null) <= 1) + ] | length) + ) as $high_risk_open_count + | ( + (([.tasks[] | residual_task_risk(.)] | add) // 0) + + (([.finding_groups[]? | residual_finding_group_risk(.)] | add) // 0) + ) as $residual_risk_score + | ( + [ + .events[]? + | select((.type // "") == "review_finding") + | select(((.round // 0) >= ($round_num - 1))) + | select(severity_rank(.severity // null) <= 2) + | (.finding_key // .id) + ] | unique | length + ) as $recent_high_value_novelty_count + | ( + if $must_fix_open_count == 0 and $high_risk_open_count == 0 and $active_task_count == 0 then + "converged" + elif $must_fix_open_count == 0 and $high_risk_open_count == 0 and $recent_high_value_novelty_count == 0 and $residual_risk_score <= 25 then + "stabilizing" + else + "continue" + end + ) as $convergence_status + | .runtime.checkpoint = { + sequence: $checkpoint_sequence, + current_id: ("checkpoint-" + ($checkpoint_sequence | tostring)), + frontier_signature: $frontier_signature, + frontier_changed: $frontier_changed, + primary_task_id: $primary_id, + supporting_task_ids: $supporting_ids, + frontier_reason: $frontier_reason, + updated_at: $created_at + } + | .runtime.convergence = { + status: $convergence_status, + next_action: ( + if $convergence_status == "converged" then + "prepare_closure" + elif $convergence_status == "stabilizing" then + "hold_checkpoint" + elif $frontier_changed then + "advance_checkpoint" + else + "hold_checkpoint" + end + ), + 
guidance: convergence_guidance($convergence_status; $frontier_changed; $primary_id), + residual_risk_score: $residual_risk_score, + must_fix_open_count: $must_fix_open_count, + high_risk_open_count: $high_risk_open_count, + active_task_count: $active_task_count, + watchlist_count: $watchlist_count, + recent_high_value_novelty_count: $recent_high_value_novelty_count, + updated_at: $created_at + } + ) + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" +} + +scenario_matrix_current_checkpoint_markdown() { + local matrix_file="$1" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -r ' + . as $root + | + def clean_text: + tostring + | gsub("\\|"; "/") + | gsub("\\r?\\n"; " "); + + def primary_summary: + if ($root.runtime.checkpoint.primary_task_id // null) != null then + ( + first($root.tasks[] | select(.id == $root.runtime.checkpoint.primary_task_id)) // null + ) as $task + | if ($task | type) == "object" then + ($task.id // "unknown") + " - " + (($task.title // "Untitled task") | clean_text) + else + ($root.runtime.checkpoint.primary_task_id // "unknown") + end + else + "none" + end; + + "## Manager Checkpoint\n\n" + + "- Checkpoint: `" + + ((.runtime.checkpoint.current_id // "checkpoint-0") | clean_text) + + "`" + + (if (.runtime.checkpoint.frontier_changed // false) then " (frontier changed)" else " (frontier stable)" end) + + "\n- Primary Objective: " + + (primary_summary | clean_text) + + "\n- Supporting Window: " + + ( + if ((.runtime.checkpoint.supporting_task_ids // []) | length) > 0 then + ((.runtime.checkpoint.supporting_task_ids // []) | join(", ")) + else + "none" + end + ) + + "\n- Residual Risk: `" + + ((.runtime.convergence.residual_risk_score // 0) | tostring) + + "`" + + " (must-fix=" + + ((.runtime.convergence.must_fix_open_count // 0) | tostring) + + ", high-risk=" + + ((.runtime.convergence.high_risk_open_count // 0) | tostring) + + ", novelty=" + + 
((.runtime.convergence.recent_high_value_novelty_count // 0) | tostring) + + ")" + + ( + if ((.runtime.review_coverage // null) | type) == "object" + and ( + (((.runtime.review_coverage.touched_failure_surfaces // []) | length) > 0) + or (((.runtime.review_coverage.likely_sibling_risks // []) | length) > 0) + or (((.runtime.review_coverage.coverage_ledger // []) | length) > 0) + ) then + "\n- Review Coverage Snapshot: surfaces=" + + (((.runtime.review_coverage.summary.surface_count // 0) | tostring)) + + ", sibling-risks=" + + (((.runtime.review_coverage.summary.sibling_risk_count // 0) | tostring)) + + ", partial-or-unclear=" + + (((.runtime.review_coverage.summary.partial_or_unclear_count // 0) | tostring)) + else + "" + end + ) + + "\n- Convergence Status: `" + + ((.runtime.convergence.status // "continue") | clean_text) + + "`" + + "\n- Guidance: " + + ((.runtime.convergence.guidance // "Reconcile the manager frontier before starting work.") | clean_text) + ' "$matrix_file" +} + +scenario_matrix_current_mainline_summary() { + local matrix_file="$1" + + if ! scenario_matrix_validate_file "$matrix_file"; then + echo "No valid scenario matrix is available." + return 1 + fi + + jq -r ' + . as $root + | + def active_task: + if ($root.manager.current_primary_task_id // null) != null then + ( + first( + $root.tasks[] + | select(.id == $root.manager.current_primary_task_id) + | select(.state != "done" and .state != "deferred") + ) // null + ) + else + ( + first( + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + ) // null + ) + end; + + if (active_task | type) == "object" then + (active_task.id // "unknown") + + " - " + + (active_task.title // "Untitled task") + + " [state=" + + (active_task.state // "unknown") + + ", routing=" + + (active_task.routing // "unknown") + + "]" + elif (.tasks | length) > 0 then + "No active mainline task is recorded; reconcile the matrix before editing the contract." 
+ else + "No tasks are currently recorded in the scenario matrix." + end + ' "$matrix_file" +} + +scenario_matrix_has_projectable_tasks() { + local matrix_file="$1" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -e ' + (.runtime.mode // "implementation") == "implementation" + and (.tasks | length) > 0 + ' "$matrix_file" >/dev/null 2>&1 +} + +scenario_matrix_monitor_snapshot() { + local matrix_file="$1" + local matrix_required="${2:-false}" + + if [[ ! -f "$matrix_file" ]]; then + if [[ "$matrix_required" == "true" ]]; then + echo "missing|0|Scenario matrix file is missing.|idle|none|n/a|unknown|n/a|none" + else + echo "legacy|0|Legacy loop without scenario matrix.|idle|none|n/a|legacy|n/a|none" + fi + return 0 + fi + + if ! scenario_matrix_validate_file "$matrix_file"; then + echo "invalid|0|Scenario matrix file is invalid.|idle|none|n/a|unknown|n/a|none" + return 0 + fi + + jq -r ' + . as $root + | + def primary_task: + if ($root.manager.current_primary_task_id // null) != null then + ( + first( + $root.tasks[] + | select(.id == $root.manager.current_primary_task_id) + | select(.state != "done" and .state != "deferred") + ) // null + ) + else + ( + first( + $root.tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + ) // null + ) + end; + + def clean_text: + tostring + | gsub("\\|"; "/") + | gsub("\\r?\\n"; " "); + + def wave_label($task): + if ($task | type) == "object" then + if (($task.repair_wave // "") | length) > 0 then + $task.repair_wave + elif (($task.cluster_id // "") | length) > 0 then + $task.cluster_id + else + "none" + end + else + "none" + end; + + [ + ( + if (.plan.task_breakdown_status // "missing") == "not_applicable" then + "not_applicable" + else + "ready" + end + ), + ((.tasks | length) | tostring), + ( + if (primary_task | type) == "object" then + (primary_task.id // "unknown") + + " - " + + (primary_task.title // "Untitled task") + + " [state=" + + 
(primary_task.state // "unknown") + + ", routing=" + + (primary_task.routing // "unknown") + + "]" + elif (.tasks | length) > 0 then + "No active mainline task is recorded." + else + "No tasks are currently recorded." + end + | clean_text + ), + ((.oversight.status // "idle") | clean_text), + ( + ( + .oversight.last_action + // (.oversight.intervention.action // "none") + // "none" + ) + | clean_text + ), + ((.runtime.checkpoint.current_id // "n/a") | clean_text), + ((.runtime.convergence.status // "unknown") | clean_text), + ((.runtime.convergence.next_action // "n/a") | clean_text), + (wave_label(primary_task) | clean_text) + ] | join("|") + ' "$matrix_file" +} + +scenario_matrix_render_goal_tracker_active_section() { + local matrix_file="$1" + local variant="${2:-full}" + + jq -r --arg variant "$variant" ' + def md_text: + tostring + | gsub("\\|"; "\\\\|") + | gsub("\\r?\\n"; " "); + + def owner: + if ((.owner // "") | length) > 0 then + .owner + elif (.routing // "") == "analyze" then + "codex" + else + "claude" + end; + + def notes_full: + [ + ("id=" + (.id // "unknown")), + (if ((.depends_on // []) | length) > 0 then + "depends_on=" + ((.depends_on // []) | join(", ")) + else + empty + end) + ] | join("; "); + + def notes_compact: + [ + ("id=" + (.id // "unknown")), + ("routing=" + (.routing // "unknown")), + (if ((.depends_on // []) | length) > 0 then + "depends_on=" + ((.depends_on // []) | join(", ")) + else + empty + end) + ] | join("; "); + + def target_acs: + ((.target_ac // []) | map(md_text) | join(", ")); + + def active_rows_full: + [ + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + | "| [mainline] " + + ((.title // "Untitled task") | md_text) + + " | " + + target_acs + + " | " + + (.state // "pending") + + " | " + + (.routing // "coding") + + " | " + + owner + + " | " + + notes_full + + " |" + ]; + + def active_rows_compact: + [ + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" 
and .state != "deferred") + | "| [mainline] " + + ((.title // "Untitled task") | md_text) + + " | " + + target_acs + + " | " + + (.state // "pending") + + " | " + + notes_compact + + " |" + ]; + + "#### Active Tasks\n" + + ( + if $variant == "compact" then + "| Task | Target AC | Status | Notes |\n" + + "|------|-----------|--------|-------|\n" + + ( + if (active_rows_compact | length) > 0 then + (active_rows_compact | join("\n")) + else + "| [matrix] No active mainline task recorded | - | pending | Reconcile scenario-matrix.json before editing the contract. |" + end + ) + else + "| Task | Target AC | Status | Tag | Owner | Notes |\n" + + "|------|-----------|--------|-----|-------|-------|\n" + + ( + if (active_rows_full | length) > 0 then + (active_rows_full | join("\n")) + else + "| [matrix] No active mainline task recorded | - | pending | coding | claude | Reconcile scenario-matrix.json before editing the contract. |" + end + ) + end + ) + ' "$matrix_file" +} + +scenario_matrix_render_goal_tracker_blocking_section() { + local matrix_file="$1" + + jq -r ' + def md_text: + tostring + | gsub("\\|"; "\\\\|") + | gsub("\\r?\\n"; " "); + + def blocking_rows: + [ + .tasks[] + | select(.lane != "mainline") + | select(.state == "blocked" or .state == "needs_replan") + | "| " + + ((.title // "Untitled task") | md_text) + + " [" + + (.id // "unknown") + + "] | " + + ((.runtime.current_round // 0) | tostring) + + " | " + + ((.target_ac // []) | map(md_text) | join(", ")) + + " | " + + ( + if ((.depends_on // []) | length) > 0 then + "Repair or confirm dependency: " + + ((.depends_on // []) | join(", ")) + else + "Replan the supporting task before promoting it." 
+ end + ) + + " |" + ]; + + def group_rows: + [ + (.finding_groups // [])[] + | select(.state == "blocked") + | "| " + + ((.title // "Review backlog") | md_text) + + " | " + + ((.last_seen_round // .first_seen_round // .runtime.current_round // 0) | tostring) + + " | " + + ((.target_ac // []) | map(md_text) | join(", ")) + + " | " + + ( + "Resolve the grouped review backlog before promoting more follow-up work. " + + ((.summary // "") | md_text) + ) + + " |" + ]; + + "### Blocking Side Issues\n" + + "| Issue | Discovered Round | Blocking AC | Resolution Path |\n" + + "|-------|-----------------|-------------|-----------------|\n" + + ( + if ((blocking_rows + group_rows) | length) > 0 then + ((blocking_rows + group_rows) | join("\n")) + else + "" + end + ) + ' "$matrix_file" +} + +scenario_matrix_render_goal_tracker_queued_section() { + local matrix_file="$1" + + jq -r ' + def md_text: + tostring + | gsub("\\|"; "\\\\|") + | gsub("\\r?\\n"; " "); + + def queued_rows: + [ + .tasks[] + | select(.lane != "mainline") + | select(.state != "done" and .state != "deferred") + | select(.state != "blocked" and .state != "needs_replan") + | "| " + + ((.title // "Untitled task") | md_text) + + " [" + + (.id // "unknown") + + "] | " + + ((.runtime.current_round // 0) | tostring) + + " | " + + ( + if (.state // "pending") == "in_progress" then + "Supporting work is active but does not replace the current mainline." + else + "Supporting work is queued behind the current single-mainline objective." + end + ) + + " | " + + ( + if ((.depends_on // []) | length) > 0 then + "After dependency completion: " + + ((.depends_on // []) | join(", ")) + else + "When the next round contract promotes it." 
+ end + ) + + " |" + ]; + + def group_rows: + [ + (.finding_groups // [])[] + | select(.state == "queued") + | "| " + + ((.title // "Review backlog") | md_text) + + " | " + + ((.last_seen_round // .first_seen_round // .runtime.current_round // 0) | tostring) + + " | " + + ("Grouped review backlog remains out of scope until the manager promotes it. " + ((.summary // "") | md_text)) + + " | " + + ( + if ((.related_task_ids // []) | length) > 0 then + "When related task context changes: " + + ((.related_task_ids // []) | join(", ")) + else + "When the manager promotes this backlog into active repair work." + end + ) + + " |" + ]; + + "### Queued Side Issues\n" + + "| Issue | Discovered Round | Why Not Blocking | Revisit Trigger |\n" + + "|-------|-----------------|------------------|-----------------|\n" + + ( + if ((queued_rows + group_rows) | length) > 0 then + ((queued_rows + group_rows) | join("\n")) + else + "" + end + ) + ' "$matrix_file" +} + +scenario_matrix_render_goal_tracker_completed_section() { + local matrix_file="$1" + + jq -r ' + def md_text: + tostring + | gsub("\\|"; "\\\\|") + | gsub("\\r?\\n"; " "); + + def completed_rows: + [ + .tasks[] + | select(.state == "done") + | "| " + + ((.target_ac // []) | map(md_text) | join(", ")) + + " | " + + ((.title // "Untitled task") | md_text) + + " [" + + (.id // "unknown") + + "] | " + + ((.health.last_progress_round // .runtime.current_round // 0) | tostring) + + " | " + + ((.health.last_progress_round // .runtime.current_round // 0) | tostring) + + " | " + + "scenario-matrix.json:" + + (.id // "unknown") + + " |" + ]; + + "### Completed and Verified\n" + + "| AC | Task | Completed Round | Verified Round | Evidence |\n" + + "|----|------|-----------------|----------------|----------|\n" + + ( + if (completed_rows | length) > 0 then + (completed_rows | join("\n")) + else + "" + end + ) + ' "$matrix_file" +} + +scenario_matrix_render_goal_tracker_deferred_section() { + local matrix_file="$1" + + jq -r ' + def 
md_text: + tostring + | gsub("\\|"; "\\\\|") + | gsub("\\r?\\n"; " "); + + def deferred_rows: + [ + .tasks[] + | select(.state == "deferred" or ((.admission.status // "") == "watchlist")) + | "| " + + ((.title // "Untitled task") | md_text) + + " [" + + (.id // "unknown") + + "] | " + + ((.target_ac // []) | map(md_text) | join(", ")) + + " | " + + ((.metadata.deferred_since_round // .metadata.source_round // .runtime.current_round // 0) | tostring) + + " | " + + ((.admission.reason // "Deferred until the manager promotes it.") | md_text) + + " | " + + ( + if ((.repair_wave // "") | length) > 0 then + "When repair wave " + (.repair_wave | md_text) + " is promoted." + elif ((.depends_on // []) | length) > 0 then + "When " + ((.depends_on // []) | join(", ")) + " changes or is promoted." + else + "When the manager promotes it into the active frontier." + end + ) + + " |" + ]; + + def group_rows: + [ + (.finding_groups // [])[] + | select(.state == "deferred") + | "| " + + ((.title // "Review backlog") | md_text) + + " | " + + ((.target_ac // []) | map(md_text) | join(", ")) + + " | " + + ((.first_seen_round // .last_seen_round // .runtime.current_round // 0) | tostring) + + " | " + + ("Deferred grouped review backlog. " + ((.summary // "") | md_text)) + + " | " + + "When the manager promotes this backlog into active repair work." + + " |" + ]; + + "### Explicitly Deferred\n" + + "| Task | Original AC | Deferred Since | Justification | When to Reconsider |\n" + + "|------|-------------|----------------|---------------|-------------------|\n" + + ( + if ((deferred_rows + group_rows) | length) > 0 then + ((deferred_rows + group_rows) | join("\n")) + else + "" + end + ) + ' "$matrix_file" +} + +scenario_matrix_replace_goal_tracker_section() { + local tracker_file="$1" + local start_pattern="$2" + local end_pattern="$3" + local replacement_file="$4" + local temp_file="${tracker_file}.tmp.$$" + + if ! 
awk \ + -v start_pattern="$start_pattern" \ + -v end_pattern="$end_pattern" \ + -v replacement_file="$replacement_file" ' + BEGIN { + in_section = 0 + replaced = 0 + } + { + if (!in_section && $0 ~ start_pattern) { + while ((getline line < replacement_file) > 0) { + print line + } + close(replacement_file) + in_section = 1 + replaced = 1 + next + } + if (in_section) { + if (end_pattern != "" && $0 ~ end_pattern) { + in_section = 0 + print + } + next + } + print + } + END { + if (!replaced) { + exit 1 + } + if (in_section && end_pattern != "") { + exit 1 + } + } + ' "$tracker_file" > "$temp_file"; then + rm -f "$temp_file" + return 1 + fi + + mv "$temp_file" "$tracker_file" +} + +scenario_matrix_sync_goal_tracker() { + local matrix_file="$1" + local tracker_file="$2" + + if [[ ! -f "$tracker_file" ]]; then + return 0 + fi + + if ! scenario_matrix_has_projectable_tasks "$matrix_file"; then + return 0 + fi + + if ! grep -q '^#### Active Tasks$' "$tracker_file" || \ + ! grep -q '^### Blocking Side Issues$' "$tracker_file" || \ + ! 
grep -q '^### Queued Side Issues$' "$tracker_file"; then + return 0 + fi + + local active_variant="compact" + if grep -q '^| Task | Target AC | Status | Tag | Owner | Notes |$' "$tracker_file"; then + active_variant="full" + fi + local has_completed_section="false" + local has_deferred_section="false" + if grep -q '^### Completed and Verified$' "$tracker_file"; then + has_completed_section="true" + fi + if grep -q '^### Explicitly Deferred$' "$tracker_file"; then + has_deferred_section="true" + fi + + local active_file="${tracker_file}.active.$$" + local blocking_file="${tracker_file}.blocking.$$" + local queued_file="${tracker_file}.queued.$$" + local completed_file="${tracker_file}.completed.$$" + local deferred_file="${tracker_file}.deferred.$$" + + scenario_matrix_render_goal_tracker_active_section "$matrix_file" "$active_variant" > "$active_file" + scenario_matrix_render_goal_tracker_blocking_section "$matrix_file" > "$blocking_file" + scenario_matrix_render_goal_tracker_queued_section "$matrix_file" > "$queued_file" + scenario_matrix_render_goal_tracker_completed_section "$matrix_file" > "$completed_file" + scenario_matrix_render_goal_tracker_deferred_section "$matrix_file" > "$deferred_file" + + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^#### Active Tasks$' '^### Blocking Side Issues$' "$active_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Blocking Side Issues$' '^### Queued Side Issues$' "$blocking_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + if [[ "$has_completed_section" == "true" ]]; then + local queued_end_pattern='^### Completed and Verified$' + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Queued Side Issues$' "$queued_end_pattern" "$queued_file" || { + rm -f "$active_file" "$blocking_file" 
"$queued_file" "$completed_file" "$deferred_file" + return 1 + } + if [[ "$has_deferred_section" == "true" ]]; then + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Completed and Verified$' '^### Explicitly Deferred$' "$completed_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Explicitly Deferred$' '' "$deferred_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + else + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Completed and Verified$' '' "$completed_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + fi + else + local queued_end_pattern='' + if [[ "$has_deferred_section" == "true" ]]; then + queued_end_pattern='^### Explicitly Deferred$' + fi + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Queued Side Issues$' "$queued_end_pattern" "$queued_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + if [[ "$has_deferred_section" == "true" ]]; then + scenario_matrix_replace_goal_tracker_section "$tracker_file" '^### Explicitly Deferred$' '' "$deferred_file" || { + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" + return 1 + } + fi + fi + + rm -f "$active_file" "$blocking_file" "$queued_file" "$completed_file" "$deferred_file" +} + +scenario_matrix_render_round_contract() { + local matrix_file="$1" + local round="$2" + local mode="${3:-implementation}" + + jq -r --arg round "$round" --arg mode "$mode" ' + . 
as $root + | + def clean_text: + tostring + | gsub("\\|"; "/") + | gsub("\\r?\\n"; " "); + + def active_task: + if ($root.runtime.checkpoint.primary_task_id // null) != null then + ( + first( + $root.tasks[] + | select(.id == $root.runtime.checkpoint.primary_task_id) + | select(.state != "done" and .state != "deferred") + ) // null + ) + else + ( + first( + $root.tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + ) // null + ) + end; + + def checkpoint_supporting_items: + [ + ($root.runtime.checkpoint.supporting_task_ids // [])[] + | . as $task_id + | first($root.tasks[] | select(.id == $task_id)) + | ($task_id + ": " + ((.title // "Untitled task") | clean_text)) + ]; + + def blocking_items: + [ + $root.tasks[] + | select(.lane != "mainline") + | select(.state == "blocked" or .state == "needs_replan") + | ((.id // "task") + ": " + ((.title // "Untitled task") | clean_text)) + ] + + [ + ($root.finding_groups // [])[] + | select(.state == "blocked") + | ("issue:" + (.id // "review-backlog") + ": " + ((.title // "Review backlog") | clean_text)) + ]; + + def queued_items: + [ + $root.tasks[] + | .id as $task_id + | select(.lane != "mainline") + | select(.state != "done" and .state != "deferred") + | select(.state != "blocked" and .state != "needs_replan") + | select((($root.runtime.checkpoint.supporting_task_ids // []) | index($task_id)) == null) + | ((.id // "task") + ": " + ((.title // "Untitled task") | clean_text)) + ] + + [ + ($root.finding_groups // [])[] + | select(.state == "queued") + | ("issue:" + (.id // "review-backlog") + ": " + ((.title // "Review backlog") | clean_text)) + ]; + + "# Round " + $round + " Contract\n\n" + + "- Mainline Objective: " + + ( + if (active_task | type) == "object" then + (active_task.id // "task") + + ": " + + ((active_task.title // "Untitled task") | clean_text) + else + "Reconcile scenario-matrix.json and recover a single mainline objective." 
+ end + ) + + "\n- Target ACs: " + + ( + if (active_task | type) == "object" then + ((active_task.target_ac // []) | join(", ")) + else + "-" + end + ) + + "\n- Checkpoint: " + + (($root.runtime.checkpoint.current_id // "checkpoint-0") | clean_text) + + ( + if ($root.runtime.checkpoint.frontier_changed // false) then + " (frontier changed)" + else + " (frontier stable)" + end + ) + + "\n- Supporting Window In Scope: " + + ( + if (checkpoint_supporting_items | length) > 0 then + (checkpoint_supporting_items | join("; ")) + else + "none" + end + ) + + "\n- Blocking Side Issues In Scope: " + + ( + if (blocking_items | length) > 0 then + (blocking_items | join("; ")) + else + "none" + end + ) + + "\n- Queued Side Issues Out of Scope: " + + ( + if (queued_items | length) > 0 then + (queued_items | join("; ")) + else + "none" + end + ) + + "\n- Residual Risk: " + + ( + "score=" + + (($root.runtime.convergence.residual_risk_score // 0) | tostring) + + ", must-fix=" + + (($root.runtime.convergence.must_fix_open_count // 0) | tostring) + + ", high-risk=" + + (($root.runtime.convergence.high_risk_open_count // 0) | tostring) + + ", novelty=" + + (($root.runtime.convergence.recent_high_value_novelty_count // 0) | tostring) + ) + + "\n- Convergence Status: " + + (($root.runtime.convergence.status // "continue") | clean_text) + + "\n- Success Criteria: " + + ( + if $mode == "review" then + "Resolve the review-blocking work while keeping the scenario matrix and goal tracker aligned with the same single mainline objective." + elif $mode == "recovery" then + "Recover mainline progress without widening scope beyond the current scenario matrix frontier." + else + "Advance the current mainline objective without widening scope beyond the scenario matrix frontier." 
+ end + ) + + "\n- Checkpoint Guidance: " + + (($root.runtime.convergence.guidance // "Reconcile the scenario matrix before widening scope.") | clean_text) + ' "$matrix_file" +} + +scenario_matrix_write_round_contract_scaffold() { + local matrix_file="$1" + local contract_file="$2" + local round="$3" + local mode="${4:-implementation}" + local force_write="${5:-false}" + + if ! scenario_matrix_has_projectable_tasks "$matrix_file"; then + return 0 + fi + + if [[ -f "$contract_file" && "$force_write" != "true" ]]; then + return 0 + fi + + local temp_file="${contract_file}.tmp.$$" + scenario_matrix_render_round_contract "$matrix_file" "$round" "$mode" > "$temp_file" + mv "$temp_file" "$contract_file" +} + +scenario_matrix_current_oversight_markdown() { + local matrix_file="$1" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -r ' + (.oversight.intervention // null) as $intervention + | if (.oversight.status // "idle") == "active" and ($intervention | type) == "object" then + "## Oversight Intervention\n\n" + + "- Action: `" + ($intervention.action // "unknown") + "`\n" + + (if ($intervention.target_task_id // "") != "" then + "- Target Task: `" + $intervention.target_task_id + "`\n" + else + "" + end) + + "- Reason: " + ($intervention.reason // "Repeated failures require a narrower recovery path.") + "\n" + + "- Guidance: " + ($intervention.message // "Stay on the current task and try a different method.") + else + empty + end + ' "$matrix_file" +} + +scenario_matrix_current_review_coverage_markdown() { + local matrix_file="$1" + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + jq -r ' + (.runtime.review_coverage // null) as $coverage + | if ($coverage | type) == "object" + and ( + (($coverage.touched_failure_surfaces // []) | length) > 0 + or (($coverage.likely_sibling_risks // []) | length) > 0 + or (($coverage.coverage_ledger // []) | length) > 0 + ) then + "## Recent Review Coverage\n\n" + + "- Source: `round " + + ((($coverage.source_round // 0) | tostring)) + + " / " + + (($coverage.source_phase // "implementation")) + + "`\n" + + "- Touched Failure Surfaces: " + + ( + if (($coverage.touched_failure_surfaces // []) | length) > 0 then + ( + ($coverage.touched_failure_surfaces // []) + | map( + (.surface // "unknown") + + ( + if (.confidence // null) != null and (.confidence // "") != "" then + " [confidence=" + (.confidence // "") + "]" + else + "" + end + ) + + ( + if (.reason // "") != "" then + ": " + (.reason // "") + else + "" + end + ) + ) + | .[:4] + | join("; ") + ) + else + "none captured" + end + ) + + "\n- Likely Sibling Risks: " + + ( + if (($coverage.likely_sibling_risks // []) | length) > 0 then + ( + ($coverage.likely_sibling_risks // []) + | map( + (.summary // "unknown") + + ( + if (.expansion_axis // null) != null and (.expansion_axis // "") != "" then + " [axis=" + (.expansion_axis // "") + "]" + else + "" + end + ) + + ( + if (.confidence // null) != null and (.confidence // "") != "" then + " [confidence=" + (.confidence // "") + "]" + else + "" + end + ) + ) + | .[:4] + | join("; ") + ) + else + "none captured" + end + ) + + ( + if (($coverage.coverage_ledger // []) | length) > 0 then + "\n\n| Surface | Status | Notes |\n|---------|--------|-------|\n" + + ( + ($coverage.coverage_ledger // []) + | map( + "| " + + (.surface // "unknown") + + " | " + + (.status // "unclear") + + " | " + + ((.notes // "") | gsub("\\|"; "/") | gsub("\\r?\\n"; " ")) + + " |" + ) + | join("\n") + ) + else + "" + end + ) + else + empty + end + ' "$matrix_file" +} + 
+scenario_matrix_apply_implementation_review() { + local matrix_file="$1" + local next_round="$2" + local verdict="$3" + local review_content="$4" + + if ! scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + local verdict_lower dependency_hint created_at event_id temp_file review_content_lower review_coverage_json + verdict_lower=$(printf '%s' "$verdict" | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]') + dependency_hint=$(scenario_matrix_dependency_hint_from_review "$review_content") + review_content_lower=$(printf '%s' "$review_content" | tr '[:upper:]' '[:lower:]') + review_coverage_json=$(scenario_matrix_extract_review_coverage_json "$review_content") + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + event_id="evt-impl-${next_round}-$(date +%s)" + temp_file="${matrix_file}.tmp.$$" + + jq \ + --arg created_at "$created_at" \ + --arg event_id "$event_id" \ + --arg verdict "$verdict_lower" \ + --arg round "$next_round" \ + --arg dependency_hint "$dependency_hint" \ + --arg review_content_lower "$review_content_lower" \ + --argjson review_coverage "$review_coverage_json" ' + def active_mainline_id: + first( + .tasks[] + | select(.lane == "mainline") + | select(.state != "done" and .state != "deferred") + | .id + ) // null; + + def should_split: + $review_content_lower | test("split|smaller|decompose|narrow"); + + def should_reclassify: + $review_content_lower | test("queued|not blocking|non-blocking|follow-up|defer"); + + def dependency_state($tasks; $id): + first($tasks[] | select(.id == $id) | .state) // "missing"; + + def reopened_dependency_state($tasks; $task): + if (($task.depends_on // []) | any(.[]; dependency_state($tasks; .) == "needs_replan")) then + "needs_replan" + elif (($task.depends_on // []) | any(.[]; dependency_state($tasks; .) == "blocked")) then + "blocked" + elif (($task.depends_on // []) | length) == 0 then + "ready" + elif (($task.depends_on // []) | all(.[]; dependency_state($tasks; .)
== "done")) then + "ready" + else + "pending" + end; + + (active_mainline_id) as $active_id + | .runtime.current_round = ($round | tonumber) + | .runtime.review_coverage = ( + $review_coverage + + { + source_phase: "implementation", + source_round: ($round | tonumber), + updated_at: $created_at + } + ) + | .runtime.last_review = { + phase: "implementation", + verdict: $verdict, + dependency_hint: ($dependency_hint == "true"), + coverage_available: ( + (($review_coverage.touched_failure_surfaces // []) | length) > 0 + or (($review_coverage.likely_sibling_risks // []) | length) > 0 + or (($review_coverage.coverage_ledger // []) | length) > 0 + ), + coverage_summary: ($review_coverage.summary // {}), + updated_at: $created_at + } + | if (.manager | type) == "object" then + .manager.last_reconciled_at = $created_at + | .manager.current_primary_task_id = ( + if $active_id == null then + (.manager.current_primary_task_id // null) + else + $active_id + end + ) + else + . + end + | .events += [{ + id: $event_id, + type: "implementation_review", + round: ($round | tonumber), + phase: "implementation", + verdict: $verdict, + dependency_hint: ($dependency_hint == "true"), + created_at: $created_at + }] + | .tasks |= ( + map( + if (.state == "done" or .state == "deferred") then + . 
+ elif .id == $active_id then + if $verdict == "advanced" then + .state = "in_progress" + | .health.stuck_score = 0 + | .health.last_progress_round = ($round | tonumber) + | .strategy.repeated_failure_count = 0 + | .strategy.method_switch_required = false + elif $verdict == "stalled" then + .state = (if .state == "ready" then "in_progress" else .state end) + | .health.stuck_score = ((.health.stuck_score // 0) + 1) + | .strategy.repeated_failure_count = ((.strategy.repeated_failure_count // 0) + 1) + | .strategy.method_switch_required = (((.strategy.repeated_failure_count // 0) + 1) >= 2) + elif $verdict == "regressed" then + .state = "needs_replan" + | .health.stuck_score = ((.health.stuck_score // 0) + 1) + | .strategy.repeated_failure_count = ((.strategy.repeated_failure_count // 0) + 1) + | .strategy.method_switch_required = (((.strategy.repeated_failure_count // 0) + 1) >= 2) + else + . + end + elif $dependency_hint == "true" + and $active_id != null + and $verdict == "stalled" + and ((.depends_on // []) | index($active_id)) != null then + .state = "blocked" + elif $dependency_hint == "true" + and $active_id != null + and $verdict == "regressed" + and ((.depends_on // []) | index($active_id)) != null then + .state = "needs_replan" + else + . + end + ) as $updated_tasks + | $updated_tasks + | map( + if $verdict == "advanced" + and $active_id != null + and ((.depends_on // []) | index($active_id)) != null + and (.state == "blocked" or .state == "needs_replan") then + .state = reopened_dependency_state($updated_tasks; .) + else + . 
+ end + ) + ) + | (first(.tasks[] | select(.id == $active_id)) // null) as $active_task + | ($active_task.strategy.repeated_failure_count // 0) as $failure_count + | ($active_task.health.stuck_score // 0) as $stuck_score + | .oversight = ( + if $verdict == "advanced" then + { + status: "idle", + last_action: "none", + updated_at: $created_at, + intervention: null, + history: (.oversight.history // []) + } + elif ($active_task | type) == "object" and ($failure_count >= 2 or $stuck_score >= 2) then + ( + if should_reclassify then + { + action: "reclassify", + reason: "Review feedback indicates that some findings should stay queued instead of replacing the current mainline objective.", + message: "Stay on the current mainline task and move non-blocking follow-up work back to queued status." + } + elif $dependency_hint == "true" then + { + action: "resequence", + reason: "An upstream dependency changed and invalidated downstream work.", + message: "Stay on the current mainline task, repair the upstream dependency first, and resequence dependent work afterward." + } + elif should_split then + { + action: "split", + reason: "Repeated failures suggest the current step is too broad.", + message: "Stay on the current mainline task, but split it into a smaller recovery step before changing more code." + } + elif $verdict == "regressed" then + { + action: "reframe", + reason: "Repeated regressions indicate the current method is not working.", + message: "Stay on the current mainline task and try a different method instead of repeating the same path." + } + else + { + action: "nudge", + reason: "Repeated stalled rounds indicate local thrashing without forward movement.", + message: "Stay on the current mainline task and try a narrower corrective step before expanding scope." 
+ } + end + ) as $intervention + | { + status: "active", + last_action: $intervention.action, + updated_at: $created_at, + intervention: ( + $intervention + + { + target_task_id: ($active_task.id // null), + generated_for_round: ($round | tonumber), + failure_count: $failure_count, + stuck_score: $stuck_score + } + ), + history: ( + (.oversight.history // []) + + [ + ( + $intervention + + { + target_task_id: ($active_task.id // null), + generated_for_round: ($round | tonumber), + failure_count: $failure_count, + stuck_score: $stuck_score, + created_at: $created_at + } + ) + ] + ) + } + else + { + status: "idle", + last_action: (.oversight.last_action // "none"), + updated_at: $created_at, + intervention: null, + history: (.oversight.history // []) + } + end + ) + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" +} + +scenario_matrix_record_code_review_cycle() { + local matrix_file="$1" + local next_round="$2" + local event_type="${3:-code_review_issues}" + local implementation_review_content="${4:-}" + + if ! 
scenario_matrix_validate_file "$matrix_file"; then + return 1 + fi + + local created_at event_id temp_file implementation_review_coverage + created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ) + event_id="evt-review-${next_round}-$(date +%s)" + temp_file="${matrix_file}.tmp.$$" + implementation_review_coverage='{}' + if [[ -n "$implementation_review_content" ]]; then + implementation_review_coverage=$(scenario_matrix_extract_review_coverage_json "$implementation_review_content") + fi + + jq \ + --arg created_at "$created_at" \ + --arg event_id "$event_id" \ + --arg event_type "$event_type" \ + --arg round "$next_round" \ + --argjson implementation_review_coverage "$implementation_review_coverage" ' + def coverage_has_entries($coverage): + ( + (($coverage.touched_failure_surfaces // []) | length) > 0 + or (($coverage.likely_sibling_risks // []) | length) > 0 + or (($coverage.coverage_ledger // []) | length) > 0 + ); + .runtime.current_round = ($round | tonumber) + | if coverage_has_entries($implementation_review_coverage) then + .runtime.review_coverage = ( + $implementation_review_coverage + + { + source_phase: "implementation", + source_round: ($round | tonumber), + updated_at: $created_at + } + ) + else + . + end + | .runtime.last_review = { + phase: "review", + verdict: $event_type, + dependency_hint: false, + coverage_available: ( + if (.runtime.review_coverage // null) | type == "object" then + ( + (((.runtime.review_coverage.touched_failure_surfaces // []) | length) > 0) + or (((.runtime.review_coverage.likely_sibling_risks // []) | length) > 0) + or (((.runtime.review_coverage.coverage_ledger // []) | length) > 0) + ) + else + false + end + ), + coverage_summary: ((.runtime.review_coverage.summary // {})), + updated_at: $created_at + } + | if (.manager | type) == "object" then + .manager.last_reconciled_at = $created_at + else + . 
+ end + | .events += [{ + id: $event_id, + type: $event_type, + round: ($round | tonumber), + phase: "review", + verdict: $event_type, + dependency_hint: false, + created_at: $created_at + }] + ' "$matrix_file" > "$temp_file" && mv "$temp_file" "$matrix_file" +} diff --git a/hooks/loop-codex-stop-hook.sh b/hooks/loop-codex-stop-hook.sh index 0682ff6e..8eb22311 100755 --- a/hooks/loop-codex-stop-hook.sh +++ b/hooks/loop-codex-stop-hook.sh @@ -45,6 +45,7 @@ LOOP_BASE_DIR="$PROJECT_ROOT/.humanize/rlcr" # Source shared loop functions and template loader SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)" source "$SCRIPT_DIR/lib/loop-common.sh" +source "$SCRIPT_DIR/lib/scenario-matrix.sh" # Source portable timeout wrapper for git operations PLUGIN_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" @@ -155,6 +156,8 @@ fi MAINLINE_STALL_COUNT="${STATE_MAINLINE_STALL_COUNT:-0}" LAST_MAINLINE_VERDICT="${STATE_LAST_MAINLINE_VERDICT:-$MAINLINE_VERDICT_UNKNOWN}" DRIFT_STATUS="${STATE_DRIFT_STATUS:-$DRIFT_STATUS_NORMAL}" +SCENARIO_MATRIX_FILE_REL="${STATE_SCENARIO_MATRIX_FILE:-}" +SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}" # Re-validate Codex Model and Effort for YAML safety (in case state.md was manually edited) # Use same validation patterns as setup-rlcr-loop.sh if [[ ! 
"$CODEX_EXEC_MODEL" =~ ^[a-zA-Z0-9._-]+$ ]]; then @@ -203,6 +206,15 @@ fi LAST_MAINLINE_VERDICT=$(normalize_mainline_progress_verdict "$LAST_MAINLINE_VERDICT") DRIFT_STATUS=$(normalize_drift_status "$DRIFT_STATUS") +if [[ -n "$SCENARIO_MATRIX_FILE_REL" ]]; then + SCENARIO_MATRIX_FILE="$PROJECT_ROOT/$SCENARIO_MATRIX_FILE_REL" +else + SCENARIO_MATRIX_FILE="$LOOP_DIR/scenario-matrix.json" +fi +if [[ "$SCENARIO_MATRIX_REQUIRED" != "true" && "$SCENARIO_MATRIX_REQUIRED" != "false" ]]; then + SCENARIO_MATRIX_REQUIRED="false" +fi + # ======================================== # Quick-check 0: Schema Validation (v1.1.2+ fields) # ======================================== @@ -886,6 +898,46 @@ Please fill in the Goal Tracker ({{GOAL_TRACKER_FILE}}): fi fi +# ======================================== +# Scenario Matrix Validation +# ======================================== + +if [[ "$IS_FINALIZE_PHASE" != "true" ]] && [[ "$SCENARIO_MATRIX_REQUIRED" == "true" ]]; then + if [[ ! -f "$SCENARIO_MATRIX_FILE" ]]; then + REASON="# Scenario Matrix Missing + +This loop requires a scenario matrix file, but it is missing: + +\`$SCENARIO_MATRIX_FILE\` + +Before continuing, restore or recreate the matrix so task dependencies and round re-anchoring remain consistent." + jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - scenario matrix missing" \ + '{"decision": "block", "reason": $reason, "systemMessage": $msg}' + exit 0 + fi + + if ! scenario_matrix_validate_file "$SCENARIO_MATRIX_FILE"; then + REASON="# Scenario Matrix Invalid + +This loop requires a valid scenario matrix file, but the current file failed validation: + +\`$SCENARIO_MATRIX_FILE\` + +Do not continue until the matrix is repaired. The next-round contract and task graph would otherwise drift." 
+ jq -n --arg reason "$REASON" --arg msg "Loop: Blocked - scenario matrix invalid" \ + '{"decision": "block", "reason": $reason, "systemMessage": $msg}' + exit 0 + fi +fi + +SCENARIO_MATRIX_ENABLED=false +SCENARIO_MATRIX_MAINLINE_SUMMARY="No valid scenario matrix is available." +if [[ -f "$SCENARIO_MATRIX_FILE" ]] && scenario_matrix_validate_file "$SCENARIO_MATRIX_FILE"; then + SCENARIO_MATRIX_ENABLED=true + scenario_matrix_ingest_summary_feedback "$SCENARIO_MATRIX_FILE" "$SUMMARY_FILE" || true + SCENARIO_MATRIX_MAINLINE_SUMMARY=$(scenario_matrix_current_mainline_summary "$SCENARIO_MATRIX_FILE" 2>/dev/null || echo "No active mainline task is recorded in the scenario matrix.") +fi + # ======================================== # Check Max Iterations (skip in Finalize Phase - already post-COMPLETE) # ======================================== @@ -1171,7 +1223,7 @@ Provider: codex # Note: detect_review_issues() is defined in loop-common.sh and sourced above # Run code review and handle the result -# Arguments: $1=round_number, $2=success_system_message +# Arguments: $1=round_number, $2=success_system_message, $3=latest_implementation_review_content (optional) # This function consolidates the common pattern of: # 1. Running codex review (no prompt - uses --base only) # 2. Checking results and handling outcomes @@ -1184,6 +1236,7 @@ Provider: codex run_and_handle_code_review() { local round="$1" local success_msg="$2" + local implementation_review_content="${3:-}" echo "Running codex review against base branch: $BASE_BRANCH..." 
>&2 @@ -1204,7 +1257,7 @@ run_and_handle_code_review() { block_review_failure "$round" "Codex review produced no stdout output" "N/A" elif [[ "$detect_exit" -eq 0 ]] && [[ -n "$merged_content" ]]; then # Issues found - continue review loop - continue_review_loop_with_issues "$round" "$merged_content" + continue_review_loop_with_issues "$round" "$merged_content" "$implementation_review_content" else # No issues found (exit code 1) - proceed to finalize echo "Code review passed with no issues. Proceeding to finalize phase." >&2 @@ -1314,6 +1367,122 @@ Follow the plan's per-task routing tags strictly: ROUTING_EOF } +# Append scenario matrix reminder to follow-up prompts. +# Arguments: $1=prompt_file_path $2=scenario_matrix_file $3=mainline_summary +append_scenario_matrix_note() { + local prompt_file="$1" + local scenario_matrix_file="$2" + local mainline_summary="$3" + local task_packet_markdown="" + local feedback_readback_markdown="" + local checkpoint_markdown="" + local review_coverage_markdown="" + + if [[ -f "$scenario_matrix_file" ]]; then + task_packet_markdown=$(scenario_matrix_current_task_packet_markdown "$scenario_matrix_file" 2>/dev/null || true) + feedback_readback_markdown=$(scenario_matrix_task_packet_feedback_instructions_markdown 2>/dev/null || true) + checkpoint_markdown=$(scenario_matrix_current_checkpoint_markdown "$scenario_matrix_file" 2>/dev/null || true) + review_coverage_markdown=$(scenario_matrix_current_review_coverage_markdown "$scenario_matrix_file" 2>/dev/null || true) + fi + + cat >> "$prompt_file" << EOF + +## Scenario Matrix Re-anchor + +Before rewriting the round contract or starting implementation work: +- Re-read @$scenario_matrix_file +- Refresh task states, dependency edges, and any invalidated assumptions in the matrix +- Keep the matrix aligned with the mutable goal tracker section and the current round contract +- Preserve exactly one current mainline objective for this round even if multiple supporting tasks are ready + 
+Current matrix mainline projection: +- $mainline_summary +EOF + + if [[ -n "$task_packet_markdown" ]]; then + printf '\n%s\n' "$task_packet_markdown" >> "$prompt_file" + fi + + if [[ -n "$checkpoint_markdown" ]]; then + printf '\n%s\n' "$checkpoint_markdown" >> "$prompt_file" + fi + + if [[ -n "$review_coverage_markdown" ]]; then + printf '\n%s\n' "$review_coverage_markdown" >> "$prompt_file" + fi + + if [[ -n "$feedback_readback_markdown" ]]; then + printf '\n%s\n' "$feedback_readback_markdown" >> "$prompt_file" + fi +} + +build_scenario_matrix_reanchor_steps() { + local scenario_matrix_enabled="$1" + local scenario_matrix_file="$2" + + if [[ "$scenario_matrix_enabled" != "true" ]]; then + return 0 + fi + + cat <> "$prompt_file" +} + # Stop the loop when mainline progress has stalled for too many consecutive rounds. # Arguments: $1=stall_count, $2=last_verdict stop_for_mainline_drift() { @@ -1388,10 +1557,11 @@ Files: } # Continue review loop when issues are found -# Arguments: $1=round_number, $2=review_content +# Arguments: $1=round_number, $2=review_content, $3=latest_implementation_review_content (optional) continue_review_loop_with_issues() { local round="$1" local review_content="$2" + local implementation_review_content="${3:-}" echo "Code review found issues. Continuing review loop..." 
>&2 @@ -1419,6 +1589,10 @@ continue_review_loop_with_issues() { ## Remaining Items - [List unresolved items, if any] +## Task Packet Feedback (Optional) +| Task ID | Source | Kind | Summary | +|---------|--------|------|---------| + ## BitLesson Delta - Action: none|add|update - Lesson ID(s): NONE @@ -1426,6 +1600,24 @@ continue_review_loop_with_issues() { EOF fi local next_contract_file="$LOOP_DIR/round-${round}-contract.md" + local scenario_matrix_mainline_summary="$SCENARIO_MATRIX_MAINLINE_SUMMARY" + local scenario_matrix_reanchor_steps="" + local scenario_matrix_mainline_block="" + local scenario_matrix_tracker_note="" + local scenario_matrix_oversight_markdown="" + + if [[ "$SCENARIO_MATRIX_ENABLED" == "true" ]]; then + scenario_matrix_record_code_review_cycle "$SCENARIO_MATRIX_FILE" "$round" "code_review_issues" "$implementation_review_content" || true + scenario_matrix_ingest_review_findings "$SCENARIO_MATRIX_FILE" "$round" "review" "$review_content" || true + scenario_matrix_reconcile_manager_state "$SCENARIO_MATRIX_FILE" "$round" "review_follow_up" || true + scenario_matrix_sync_goal_tracker "$SCENARIO_MATRIX_FILE" "$GOAL_TRACKER_FILE" || true + scenario_matrix_write_round_contract_scaffold "$SCENARIO_MATRIX_FILE" "$next_contract_file" "$round" "review" || true + scenario_matrix_mainline_summary=$(scenario_matrix_current_mainline_summary "$SCENARIO_MATRIX_FILE" 2>/dev/null || echo "$SCENARIO_MATRIX_MAINLINE_SUMMARY") + scenario_matrix_reanchor_steps=$(build_scenario_matrix_reanchor_steps "$SCENARIO_MATRIX_ENABLED" "$SCENARIO_MATRIX_FILE") + scenario_matrix_mainline_block=$(build_scenario_matrix_mainline_block "$SCENARIO_MATRIX_ENABLED" "$scenario_matrix_mainline_summary") + scenario_matrix_tracker_note=$(build_scenario_matrix_tracker_note "$SCENARIO_MATRIX_ENABLED" "$SCENARIO_MATRIX_FILE" "review") + scenario_matrix_oversight_markdown=$(scenario_matrix_current_oversight_markdown "$SCENARIO_MATRIX_FILE" 2>/dev/null || true) + fi local fallback="# Code 
Review Findings @@ -1451,6 +1643,9 @@ You are in the **Review Phase** of the RLCR loop. Codex has performed a code rev "PLAN_FILE=$PLAN_FILE" \ "GOAL_TRACKER_FILE=$GOAL_TRACKER_FILE" \ "ROUND_CONTRACT_FILE=$next_contract_file" \ + "SCENARIO_MATRIX_REANCHOR_STEPS=$scenario_matrix_reanchor_steps" \ + "SCENARIO_MATRIX_MAINLINE_BLOCK=$scenario_matrix_mainline_block" \ + "SCENARIO_MATRIX_TRACKER_NOTE=$scenario_matrix_tracker_note" \ "CURRENT_ROUND=$round" > "$next_prompt_file" if [[ "$BITLESSON_REQUIRED" == "true" ]] && ! grep -q 'bitlesson-selector' "$next_prompt_file"; then cat >> "$next_prompt_file" << EOF @@ -1467,6 +1662,10 @@ Reference: @$BITLESSON_FILE EOF fi append_task_tag_routing_note "$next_prompt_file" + if [[ "$SCENARIO_MATRIX_ENABLED" == "true" ]]; then + append_scenario_matrix_note "$next_prompt_file" "$SCENARIO_MATRIX_FILE" "$scenario_matrix_mainline_summary" + append_oversight_note "$next_prompt_file" "$scenario_matrix_oversight_markdown" + fi jq -n \ --arg reason "$(cat "$next_prompt_file")" \ @@ -1802,7 +2001,7 @@ if [[ "$LAST_LINE_TRIMMED" == "$MARKER_COMPLETE" ]]; then # Run code review and handle results (may exit on issues/failure/success) # Pass CURRENT_ROUND + 1 so all review phase files use the next round number echo "Implementation complete. Running initial code review..." 
>&2 - run_and_handle_code_review "$((CURRENT_ROUND + 1))" "Loop: Finalize Phase - Simplify and refactor code before completion" + run_and_handle_code_review "$((CURRENT_ROUND + 1))" "Loop: Finalize Phase - Simplify and refactor code before completion" "$REVIEW_CONTENT" fi fi fi @@ -1890,6 +2089,21 @@ upsert_state_fields "$STATE_FILE" \ "${FIELD_LAST_MAINLINE_VERDICT}=${NEXT_LAST_MAINLINE_VERDICT}" \ "${FIELD_DRIFT_STATUS}=${NEXT_DRIFT_STATUS}" +NEXT_SCENARIO_MATRIX_MAINLINE_SUMMARY="$SCENARIO_MATRIX_MAINLINE_SUMMARY" +NEXT_SCENARIO_MATRIX_REANCHOR_STEPS="" +NEXT_SCENARIO_MATRIX_MAINLINE_BLOCK="" +NEXT_SCENARIO_MATRIX_TRACKER_NOTE="" +NEXT_SCENARIO_MATRIX_OVERSIGHT_MARKDOWN="" +if [[ "$SCENARIO_MATRIX_ENABLED" == "true" ]]; then + scenario_matrix_apply_implementation_review "$SCENARIO_MATRIX_FILE" "$NEXT_ROUND" "$NEXT_LAST_MAINLINE_VERDICT" "$REVIEW_CONTENT" || true + scenario_matrix_ingest_review_findings "$SCENARIO_MATRIX_FILE" "$NEXT_ROUND" "implementation" "$REVIEW_CONTENT" || true + scenario_matrix_reconcile_manager_state "$SCENARIO_MATRIX_FILE" "$NEXT_ROUND" "implementation_follow_up" || true + NEXT_SCENARIO_MATRIX_MAINLINE_SUMMARY=$(scenario_matrix_current_mainline_summary "$SCENARIO_MATRIX_FILE" 2>/dev/null || echo "$SCENARIO_MATRIX_MAINLINE_SUMMARY") + NEXT_SCENARIO_MATRIX_REANCHOR_STEPS=$(build_scenario_matrix_reanchor_steps "$SCENARIO_MATRIX_ENABLED" "$SCENARIO_MATRIX_FILE") + NEXT_SCENARIO_MATRIX_MAINLINE_BLOCK=$(build_scenario_matrix_mainline_block "$SCENARIO_MATRIX_ENABLED" "$NEXT_SCENARIO_MATRIX_MAINLINE_SUMMARY") + NEXT_SCENARIO_MATRIX_OVERSIGHT_MARKDOWN=$(scenario_matrix_current_oversight_markdown "$SCENARIO_MATRIX_FILE" 2>/dev/null || true) +fi + # Create next round prompt NEXT_PROMPT_FILE="$LOOP_DIR/round-${NEXT_ROUND}-prompt.md" NEXT_SUMMARY_FILE="$LOOP_DIR/round-${NEXT_ROUND}-summary.md" @@ -1909,6 +2123,10 @@ if [[ ! 
-f "$NEXT_SUMMARY_FILE" ]]; then ## Remaining Items - [List unresolved items, if any] +## Task Packet Feedback (Optional) +| Task ID | Source | Kind | Summary | +|---------|--------|------|---------| + ## BitLesson Delta - Action: none|add|update - Lesson ID(s): NONE @@ -1916,6 +2134,14 @@ if [[ ! -f "$NEXT_SUMMARY_FILE" ]]; then EOF fi NEXT_CONTRACT_FILE="$LOOP_DIR/round-${NEXT_ROUND}-contract.md" +if [[ "$SCENARIO_MATRIX_ENABLED" == "true" ]]; then + scenario_matrix_sync_goal_tracker "$SCENARIO_MATRIX_FILE" "$GOAL_TRACKER_FILE" || true + if [[ "$DRIFT_REPLAN_REQUIRED" == "true" ]]; then + scenario_matrix_write_round_contract_scaffold "$SCENARIO_MATRIX_FILE" "$NEXT_CONTRACT_FILE" "$NEXT_ROUND" "recovery" || true + else + scenario_matrix_write_round_contract_scaffold "$SCENARIO_MATRIX_FILE" "$NEXT_CONTRACT_FILE" "$NEXT_ROUND" "implementation" || true + fi +fi # Build the next round prompt from templates NEXT_ROUND_FALLBACK="# Next Round Instructions @@ -1955,6 +2181,9 @@ if [[ "$DRIFT_REPLAN_REQUIRED" == "true" ]]; then "GOAL_TRACKER_FILE=$GOAL_TRACKER_FILE" \ "BITLESSON_FILE=$BITLESSON_FILE" \ "ROUND_CONTRACT_FILE=$NEXT_CONTRACT_FILE" \ + "SCENARIO_MATRIX_REANCHOR_STEPS=$NEXT_SCENARIO_MATRIX_REANCHOR_STEPS" \ + "SCENARIO_MATRIX_MAINLINE_BLOCK=$NEXT_SCENARIO_MATRIX_MAINLINE_BLOCK" \ + "SCENARIO_MATRIX_TRACKER_NOTE=$(build_scenario_matrix_tracker_note "$SCENARIO_MATRIX_ENABLED" "$SCENARIO_MATRIX_FILE" "recovery")" \ "CURRENT_ROUND=$NEXT_ROUND" \ "STALL_COUNT=$NEXT_MAINLINE_STALL_COUNT" \ "LAST_MAINLINE_VERDICT=$NEXT_LAST_MAINLINE_VERDICT" > "$NEXT_PROMPT_FILE" @@ -1965,6 +2194,9 @@ else "GOAL_TRACKER_FILE=$GOAL_TRACKER_FILE" \ "BITLESSON_FILE=$BITLESSON_FILE" \ "ROUND_CONTRACT_FILE=$NEXT_CONTRACT_FILE" \ + "SCENARIO_MATRIX_REANCHOR_STEPS=$NEXT_SCENARIO_MATRIX_REANCHOR_STEPS" \ + "SCENARIO_MATRIX_MAINLINE_BLOCK=$NEXT_SCENARIO_MATRIX_MAINLINE_BLOCK" \ + "SCENARIO_MATRIX_TRACKER_NOTE=$(build_scenario_matrix_tracker_note "$SCENARIO_MATRIX_ENABLED" 
"$SCENARIO_MATRIX_FILE" "implementation")" \ "CURRENT_ROUND=$NEXT_ROUND" \ "STALL_COUNT=$NEXT_MAINLINE_STALL_COUNT" \ "LAST_MAINLINE_VERDICT=$NEXT_LAST_MAINLINE_VERDICT" > "$NEXT_PROMPT_FILE" @@ -2059,6 +2291,10 @@ Commit your changes and write summary to {{NEXT_SUMMARY_FILE}}" load_and_render_safe "$TEMPLATE_DIR" "claude/next-round-footer.md" "$FOOTER_FALLBACK" \ "NEXT_SUMMARY_FILE=$NEXT_SUMMARY_FILE" >> "$NEXT_PROMPT_FILE" append_task_tag_routing_note "$NEXT_PROMPT_FILE" +if [[ "$SCENARIO_MATRIX_ENABLED" == "true" ]]; then + append_scenario_matrix_note "$NEXT_PROMPT_FILE" "$SCENARIO_MATRIX_FILE" "$NEXT_SCENARIO_MATRIX_MAINLINE_SUMMARY" + append_oversight_note "$NEXT_PROMPT_FILE" "$NEXT_SCENARIO_MATRIX_OVERSIGHT_MARKDOWN" +fi # Add push instruction only if push_every_round is true if [[ "$PUSH_EVERY_ROUND" == "true" ]]; then diff --git a/prompt-template/claude/drift-replan-prompt.md b/prompt-template/claude/drift-replan-prompt.md index a5970c59..24947c5c 100644 --- a/prompt-template/claude/drift-replan-prompt.md +++ b/prompt-template/claude/drift-replan-prompt.md @@ -19,6 +19,7 @@ This round is a **drift recovery round**. Do not continue with normal issue-clea Before changing code: - Re-read @{{PLAN_FILE}} - Re-read @{{GOAL_TRACKER_FILE}} +{{SCENARIO_MATRIX_REANCHOR_STEPS}} - Re-read the recent round summaries and review results that led here - Rewrite the round contract at @{{ROUND_CONTRACT_FILE}} @@ -32,6 +33,11 @@ Your recovery contract must contain: Do not start implementation until the recovery contract exists. +{{SCENARIO_MATRIX_MAINLINE_BLOCK}} + +Treat the Current Task Packet and Manager Checkpoint as manager-issued recovery scope. +Do not rewrite authoritative matrix state directly; report packet corrections or task-shape concerns through the summary feedback section. 
+ ## Task Lane Rules Use the Task system (TaskCreate, TaskUpdate, TaskList) with one required tag per task: @@ -60,6 +66,8 @@ Before starting work, **read and update** @{{GOAL_TRACKER_FILE}} as needed: - Keep blocking vs queued issue classification accurate - Ensure the tracker and contract now describe the same recovered mainline objective +{{SCENARIO_MATRIX_TRACKER_NOTE}} + ## Recovery Guardrails - Do not spend this round mostly on queued cleanup diff --git a/prompt-template/claude/next-round-prompt.md b/prompt-template/claude/next-round-prompt.md index fd1b1cfe..607fea91 100644 --- a/prompt-template/claude/next-round-prompt.md +++ b/prompt-template/claude/next-round-prompt.md @@ -14,6 +14,7 @@ This plan contains the full scope of work and requirements. Ensure your work ali Before writing code: - Re-read @{{PLAN_FILE}} - Re-read @{{GOAL_TRACKER_FILE}} +{{SCENARIO_MATRIX_REANCHOR_STEPS}} - Re-read the most recent round summaries/reviews that led to this round - Write the current round contract to @{{ROUND_CONTRACT_FILE}} @@ -26,6 +27,11 @@ Your round contract must contain: Do not start implementation until the round contract exists. +{{SCENARIO_MATRIX_MAINLINE_BLOCK}} + +Treat the Current Task Packet and Manager Checkpoint as manager-issued scope. +Do not rewrite authoritative matrix state directly; report packet corrections or task-shape concerns through the summary feedback section. + ## Task Lane Rules Use the Task system (TaskCreate, TaskUpdate, TaskList) with one required tag per task: @@ -64,6 +70,8 @@ Before starting work, **read** @{{GOAL_TRACKER_FILE}} to understand: Do NOT change the immutable section after Round 0. If you cannot safely reconcile the tracker yourself, include an optional "Goal Tracker Update Request" section in your summary (see below). 
+{{SCENARIO_MATRIX_TRACKER_NOTE}} + ## Mainline Guardrails - Keep the mainline objective from @{{ROUND_CONTRACT_FILE}} stable for this round diff --git a/prompt-template/claude/review-phase-prompt.md b/prompt-template/claude/review-phase-prompt.md index e180e418..1b729741 100644 --- a/prompt-template/claude/review-phase-prompt.md +++ b/prompt-template/claude/review-phase-prompt.md @@ -7,10 +7,16 @@ You are in the **Review Phase**. Codex has performed a code review and found iss Before touching code: - Re-read the original plan at @{{PLAN_FILE}} - Re-read the goal tracker at @{{GOAL_TRACKER_FILE}} +{{SCENARIO_MATRIX_REANCHOR_STEPS}} - Refresh the current round contract at @{{ROUND_CONTRACT_FILE}} The round contract must preserve a single mainline objective. Code review findings do NOT automatically become the new round objective. +{{SCENARIO_MATRIX_MAINLINE_BLOCK}} + +Treat the Current Task Packet and Manager Checkpoint as manager-issued fix scope. +Do not rewrite authoritative matrix state directly; report packet corrections or task-shape concerns through the summary feedback section. + ## Review Results {{REVIEW_CONTENT}} @@ -56,3 +62,4 @@ Your summary should include: - You must address the code review findings to proceed - After you commit and write your summary, Codex will perform another code review - The loop continues until no `[P0-9]` issues are found +{{SCENARIO_MATRIX_TRACKER_NOTE}} diff --git a/prompt-template/codex/full-alignment-review.md b/prompt-template/codex/full-alignment-review.md index 02997dd8..760cacaa 100644 --- a/prompt-template/codex/full-alignment-review.md +++ b/prompt-template/codex/full-alignment-review.md @@ -70,9 +70,56 @@ The `Mainline Progress Verdict` line is mandatory. 
If you omit it, the Humanize - Identify any gaps, bugs, or incomplete work - Reference @{{DOCS_PATH}} for design documents -## Part 4: {{GOAL_TRACKER_UPDATE_SECTION}} - -## Part 5: Progress Stagnation Check (MANDATORY for Full Alignment Rounds) +## Part 4: Failure-Surface Coverage Pass (MANDATORY) + +Before you write the final lane findings, you MUST expand review coverage across the touched failure surfaces: + +0. **Historical Tail-Repair Scan** + - Inspect recent git history before you settle on a narrow diff-only review. + - Start with `git log --oneline --stat -n 20` to understand the recent repair pattern. + - If the current round or recent rounds keep touching the same hotspot file or module, also inspect file-scoped history such as `git log --stat -- `. + - Treat these as a long-tail repair-chain signal: + - repeated small `fix` commits + - repeated blocker cleanup in the same file or module + - several follow-up patches that keep revisiting one hotspot + - If that signal appears, widen the audit: + - inspect neighboring call sites in the hotspot + - inspect sibling lifecycle and rollback paths + - search for system-level consistency failures beyond the newest local patch + - Reflect that broader scan in `Touched Failure Surfaces`, `Likely Sibling Risks`, and `Coverage Ledger`. + +1. **Touched Failure Surfaces** + - Map the high-risk failure surfaces touched by the recent diff, summaries, and changed tests. + - Prefer system-oriented surfaces such as lifecycle symmetry, rollback correctness, resource cleanup, state consistency, snapshot/projection consistency, dependency propagation, and cross-subsystem synchronization. + - If the git history shows a long-tail repair chain, prefer the broader failure surface over a single-file symptom label. + - Use this exact bullet format so the runtime can retain the analysis: + - `- | why: | confidence: high|medium|low` + +2. 
**Likely Sibling Risks** + - For each confirmed issue, extend the search into sibling paths: + - symmetric paths + - parallel resources + - adjacent state transitions + - neighboring call sites in the same hotspot module + - If the git history shows repeated small fixes in one hotspot, increase skepticism and search wider before you stop. + - Report high-confidence sibling risks even if they are not yet elevated into blocking findings. + - Use this exact bullet format: + - `- | derived_from: | axis: | why: | check: | confidence: high|medium|low` + +3. **Coverage Ledger** + - End the review with a short ledger describing which touched surfaces are `covered`, `partial`, or `unclear`. + - **Do NOT render the Coverage Ledger as `-` / `*` bullet findings.** Use a markdown table or short plain-text paragraphs instead. + - This preserves compatibility with the current finding parser. + - Preferred format: + ``` + | Surface | Status | Notes | + |---------|--------|-------| + | rollback-symmetry | partial | cancel path checked; restore path still unclear | + ``` + +## Part 5: {{GOAL_TRACKER_UPDATE_SECTION}} + +## Part 6: Progress Stagnation Check (MANDATORY for Full Alignment Rounds) To implement the original plan at @{{PLAN_FILE}}, we have completed **{{COMPLETED_ITERATIONS}} iterations** (Round 0 to Round {{CURRENT_ROUND}}). @@ -99,14 +146,24 @@ The project's `.humanize/rlcr/{{LOOP_TIMESTAMP}}/` directory contains the histor **If development is stagnating**, write **STOP** (as a single word on its own line) as the last line of your review output @{{REVIEW_RESULT_FILE}} instead of COMPLETE. -## Part 6: Output Requirements +## Part 7: Output Requirements - If issues found OR any AC is NOT MET (including deferred ACs), write your findings to @{{REVIEW_RESULT_FILE}} +- Structure the review in this order: + 1. `Touched Failure Surfaces` + 2. `Likely Sibling Risks` + 3. `Mainline Gaps` + 4. `Blocking Side Issues` + 5. `Queued Side Issues` + 6. `Mainline Progress Verdict` + 7. 
`Coverage Ledger` - Include specific action items for Claude to address, classified into: - Mainline Gaps - Blocking Side Issues - Queued Side Issues -- **If development is stagnating** (see Part 4), write "STOP" as the last line +- Keep the lane headings exactly as written above so the runtime can continue to classify findings safely. +- Keep lane findings as explicit issue bullets, preferably with `[P0-9]` severity markers. +- **If development is stagnating** (see Part 6), write "STOP" as the last line - **CRITICAL**: Only write "COMPLETE" as the last line if ALL ACs from the original plan are FULLY MET with no deferrals - DEFERRED items are considered INCOMPLETE - do NOT output COMPLETE if any AC is deferred - The ONLY condition for COMPLETE is: all original plan tasks are done, all ACs are met, no deferrals allowed diff --git a/prompt-template/codex/regular-review.md b/prompt-template/codex/regular-review.md index 7db26ea2..4ea138f9 100644 --- a/prompt-template/codex/regular-review.md +++ b/prompt-template/codex/regular-review.md @@ -44,12 +44,53 @@ Include a brief Goal Alignment Summary in your review: ACs: X/Y addressed | Forgotten items: N | Unjustified deferrals: N ``` -## Part 3: Required Finding Classification - -You MUST classify your findings into these lanes: -- **Mainline Gaps**: plan-derived work or AC progress that is missing, incomplete, or regressing -- **Blocking Side Issues**: bugs or implementation issues that block the current mainline objective from succeeding safely -- **Queued Side Issues**: valid non-blocking follow-up issues that should be documented but must NOT take over the next round +## Part 3: Failure-Surface Coverage Pass (MANDATORY) + +Before you write those lane findings, you MUST first run a failure-surface coverage pass: + +0. **Historical Tail-Repair Scan** + - Before you settle on a narrow diff-only review, inspect recent git history. + - Start with `git log --oneline --stat -n 12` to understand the recent repair pattern. 
+ - If the current round appears to touch a hotspot file or module, also inspect file-scoped history such as `git log --stat -- `. + - Treat these as a long-tail repair-chain signal: + - repeated small `fix` commits + - repeated review-blocker fixes in the same file or module + - several follow-up patches that only nibble at one hotspot + - If you detect that signal, widen your review strategy beyond the latest patch: + - inspect neighboring call sites in the hotspot + - inspect sibling state transitions and rollback paths + - look for system-level consistency failures, not just the newest local diff defect + - Reflect that broader scan in `Touched Failure Surfaces`, `Likely Sibling Risks`, and `Coverage Ledger`. + +1. **Touched Failure Surfaces** + - Map the high-risk failure surfaces touched by this round's diff, summary claims, and changed tests. + - Prefer system-oriented surfaces such as lifecycle symmetry, rollback correctness, resource cleanup, state consistency, snapshot/projection consistency, dependency propagation, and cross-subsystem synchronization. + - If the git history shows a long-tail repair chain around one hotspot, prefer naming the broader failure surface instead of reporting only a single-file symptom. + - Keep this section concise, but do not skip it just because you already found one bug. + - Use this exact bullet format so the runtime can retain the analysis: + - `- | why: | confidence: high|medium|low` + +2. **Likely Sibling Risks** + - For each confirmed issue, expand at least one round further across sibling paths: + - symmetric paths + - parallel resources + - adjacent state transitions + - neighboring call sites in the same hotspot module + - If the git history shows repeated small fixes in the same hotspot, increase skepticism and search wider before you stop. + - Report high-confidence sibling risks even if they are not yet as strongly confirmed as the main finding. 
+ - Use this exact bullet format: + - `- | derived_from: | axis: | why: | check: | confidence: high|medium|low` + +3. **Coverage Ledger** + - Close the review with a short coverage ledger describing which touched failure surfaces are `covered`, `partial`, or `unclear`. + - **Do NOT render the Coverage Ledger as `-` / `*` bullet findings.** Use a markdown table or short plain-text paragraphs instead. + - This is required because Humanize currently parses lane bullets into machine-managed findings. + - Preferred format: + ``` + | Surface | Status | Notes | + |---------|--------|-------| + | rollback-symmetry | partial | cancel path checked; restore path still unclear | + ``` Also include a one-line verdict: ``` @@ -60,12 +101,28 @@ This verdict line is mandatory. If you omit it, the Humanize stop hook will bloc If Claude mostly worked on queued side issues and failed to advance the mainline, say so explicitly. -## Part 4: {{GOAL_TRACKER_UPDATE_SECTION}} +## Part 4: Required Finding Classification + +You MUST classify your findings into these lanes: +- **Mainline Gaps**: plan-derived work or AC progress that is missing, incomplete, or regressing +- **Blocking Side Issues**: bugs or implementation issues that block the current mainline objective from succeeding safely +- **Queued Side Issues**: valid non-blocking follow-up issues that should be documented but must NOT take over the next round + +## Part 5: {{GOAL_TRACKER_UPDATE_SECTION}} -## Part 5: Output Requirements +## Part 6: Output Requirements - In short, your review comments can include: problems/findings/blockers; claims that don't match reality; implementation plans for deferred work (to be implemented now); implementation plans for unfinished work; goal alignment issues. -- Your output should be structured so Claude can tell which items are mainline gaps, blocking side issues, and queued side issues. +- Your output should be structured in this order: + 1. `Touched Failure Surfaces` + 2. 
`Likely Sibling Risks` + 3. `Mainline Gaps` + 4. `Blocking Side Issues` + 5. `Queued Side Issues` + 6. `Mainline Progress Verdict` + 7. `Coverage Ledger` +- Keep the lane headings exactly as written above so the runtime can continue to classify findings safely. +- Keep lane findings as explicit issue bullets, preferably with `[P0-9]` severity markers. - If after your investigation the actual situation does not match what Claude claims to have completed, or there is pending work to be done, output your review comments to @{{REVIEW_RESULT_FILE}}. - **CRITICAL**: Only output "COMPLETE" as the last line if ALL tasks from the original plan are FULLY completed with no deferrals - DEFERRED items are considered INCOMPLETE - do NOT output COMPLETE if any task is deferred diff --git a/scripts/humanize.sh b/scripts/humanize.sh index 9804bde5..d45f26c6 100755 --- a/scripts/humanize.sh +++ b/scripts/humanize.sh @@ -14,6 +14,9 @@ HUMANIZE_HOOKS_LIB_DIR="$(cd "$HUMANIZE_SCRIPT_DIR/../hooks/lib" && pwd)" if [[ -f "$HUMANIZE_HOOKS_LIB_DIR/loop-common.sh" ]]; then source "$HUMANIZE_HOOKS_LIB_DIR/loop-common.sh" fi +if [[ -f "$HUMANIZE_HOOKS_LIB_DIR/scenario-matrix.sh" ]]; then + source "$HUMANIZE_HOOKS_LIB_DIR/scenario-matrix.sh" +fi # ======================================== # Public helper functions (can be called directly for testing) @@ -153,6 +156,55 @@ humanize_parse_goal_tracker() { echo "${total_acs}|${completed_acs}|${active_tasks}|${completed_tasks}|${deferred_tasks}|${open_issues}|${goal_summary}" } +humanize_parse_scenario_matrix() { + local session_dir="$1" + local state_file="${2:-}" + + if [[ -z "$session_dir" || ! 
-d "$session_dir" ]]; then + echo "missing|0|No scenario matrix session.|idle|none|n/a|unknown|n/a|none" + return + fi + + if [[ -z "$state_file" ]]; then + local state_info + state_info=$(monitor_find_state_file "$session_dir") + state_file="${state_info%%|*}" + fi + + local matrix_required="false" + local matrix_rel="" + if [[ -f "$state_file" ]]; then + matrix_required=$(grep -E "^scenario_matrix_required:" "$state_file" 2>/dev/null | sed 's/scenario_matrix_required: *//' | tr -d ' "') + matrix_rel=$(grep -E "^scenario_matrix_file:" "$state_file" 2>/dev/null | sed 's/scenario_matrix_file: *//' | sed 's/^"//; s/"$//') + fi + + if [[ "$matrix_required" != "true" && "$matrix_required" != "false" ]]; then + matrix_required="false" + fi + + local project_root + project_root=$(cd "$session_dir/../../.." 2>/dev/null && pwd) + + local matrix_file="" + if [[ -n "$matrix_rel" && -n "$project_root" ]]; then + matrix_file="$project_root/$matrix_rel" + else + matrix_file="$session_dir/scenario-matrix.json" + fi + + if declare -F scenario_matrix_monitor_snapshot >/dev/null 2>&1; then + scenario_matrix_monitor_snapshot "$matrix_file" "$matrix_required" + else + if [[ -f "$matrix_file" ]]; then + echo "ready|0|Scenario matrix library unavailable.|idle|none|n/a|unknown|n/a|none" + elif [[ "$matrix_required" == "true" ]]; then + echo "missing|0|Scenario matrix file is missing.|idle|none|n/a|unknown|n/a|none" + else + echo "legacy|0|Legacy loop without scenario matrix.|idle|none|n/a|legacy|n/a|none" + fi + fi +} + # Detect special git repository states # Returns: state_name (one of: normal, detached, rebase, merge, shallow, permission_error) humanize_detect_git_state() { @@ -267,7 +319,7 @@ _humanize_monitor_codex() { local current_file="" local current_session_dir="" local check_interval=2 # seconds between checking for new files - local status_bar_height=11 # number of lines for status bar (includes loop status line) + local status_bar_height=12 # number of lines for status bar 
(includes loop status line) # Check if .humanize/rlcr exists if [[ ! -d "$loop_dir" ]]; then @@ -461,6 +513,19 @@ _humanize_monitor_codex() { local blocking_issues="${issue_parts[0]}" local queued_issues="${issue_parts[1]}" + # Parse scenario-matrix runtime + local -a matrix_parts + _split_to_array matrix_parts "$(humanize_parse_scenario_matrix "$session_dir" "$state_file")" + local matrix_status="${matrix_parts[0]:-legacy}" + local matrix_task_count="${matrix_parts[1]:-0}" + local matrix_mainline_summary="${matrix_parts[2]:-Legacy loop without scenario matrix.}" + local matrix_oversight_status="${matrix_parts[3]:-idle}" + local matrix_oversight_action="${matrix_parts[4]:-none}" + local matrix_checkpoint_id="${matrix_parts[5]:-n/a}" + local matrix_convergence_status="${matrix_parts[6]:-unknown}" + local matrix_next_action="${matrix_parts[7]:-n/a}" + local matrix_wave_label="${matrix_parts[8]:-none}" + # Parse git status local -a git_parts _split_to_array git_parts "$(_parse_git_status)" @@ -494,6 +559,36 @@ _humanize_monitor_codex() { local prefix_len=$((max_display_len - 3)) goal_display="${goal_summary:0:$prefix_len}..." 
fi + local matrix_task_label="${matrix_task_count} tasks" + [[ "$matrix_task_count" == "1" ]] && matrix_task_label="1 task" + local matrix_oversight_label="$matrix_oversight_status" + if [[ -n "$matrix_oversight_action" ]] && [[ "$matrix_oversight_action" != "none" ]] && [[ "$matrix_oversight_action" != "$matrix_oversight_status" ]]; then + matrix_oversight_label="${matrix_oversight_status}/${matrix_oversight_action}" + fi + local matrix_convergence_label="$matrix_convergence_status" + if [[ -n "$matrix_next_action" ]] && [[ "$matrix_next_action" != "n/a" ]] && [[ "$matrix_next_action" != "$matrix_convergence_status" ]]; then + matrix_convergence_label="${matrix_convergence_status}/${matrix_next_action}" + fi + local matrix_prefix_plain="${matrix_status} (${matrix_task_label}) | Oversight: ${matrix_oversight_label} | Mainline: " + local matrix_mainline_max=$((term_width - 12 - ${#matrix_prefix_plain})) + local matrix_display="$matrix_mainline_summary" + if [[ "$matrix_mainline_max" -lt 24 ]]; then + matrix_mainline_max=24 + fi + if [[ ${#matrix_display} -gt $matrix_mainline_max ]]; then + local matrix_prefix_len=$((matrix_mainline_max - 3)) + matrix_display="${matrix_display:0:$matrix_prefix_len}..." + fi + local control_prefix_plain="Checkpoint: ${matrix_checkpoint_id} | Convergence: ${matrix_convergence_label} | Wave: " + local control_max=$((term_width - 12 - ${#control_prefix_plain})) + local control_display="$matrix_wave_label" + if [[ "$control_max" -lt 18 ]]; then + control_max=18 + fi + if [[ ${#control_display} -gt $control_max ]]; then + local control_prefix_len=$((control_max - 3)) + control_display="${control_display:0:$control_prefix_len}..." 
+ fi # Save cursor position and move to top tput sc @@ -623,6 +718,45 @@ _humanize_monitor_codex() { fi printf "${clr_eol}\n" + local matrix_status_color="${green}" + case "$matrix_status" in + legacy) + matrix_status_color="${dim}" + ;; + not_applicable) + matrix_status_color="${cyan}" + ;; + missing|invalid) + matrix_status_color="${red}" + ;; + esac + local oversight_color="${dim}" + if [[ "$matrix_oversight_status" == "active" ]]; then + oversight_color="${yellow}" + fi + local convergence_color="${dim}" + case "$matrix_convergence_status" in + converged) + convergence_color="${green}" + ;; + stabilizing) + convergence_color="${cyan}" + ;; + continue) + convergence_color="${yellow}" + ;; + unknown) + convergence_color="${red}" + ;; + legacy) + convergence_color="${dim}" + ;; + esac + printf "${magenta}Matrix:${reset} ${matrix_status_color}%s${reset} (%s) | Oversight: ${oversight_color}%s${reset} | Mainline: %s${clr_eol}\n" \ + "$matrix_status" "$matrix_task_label" "$matrix_oversight_label" "$matrix_display" + printf "${magenta}Control:${reset} Checkpoint: %s | Convergence: ${convergence_color}%s${reset} | Wave: %s${clr_eol}\n" \ + "$matrix_checkpoint_id" "$matrix_convergence_label" "$control_display" + # Git status line (same color as Progress) local git_total=$((git_modified + git_added + git_deleted)) printf "${magenta}Git:${reset} " @@ -1193,6 +1327,14 @@ humanize() { shift case "$cmd" in + matrix) + local viewer_script="$HUMANIZE_SCRIPT_DIR/render-scenario-matrix.py" + if [[ ! 
-x "$viewer_script" ]]; then + echo "Error: scenario matrix viewer script not found: $viewer_script" >&2 + return 1 + fi + "$viewer_script" "$@" + ;; monitor) local target="$1" shift 2>/dev/null || true @@ -1231,6 +1373,7 @@ humanize() { echo "Usage: humanize [args]" echo "" echo "Commands:" + echo " matrix Render the latest scenario matrix as a local HTML dashboard" echo " monitor rlcr Monitor the latest RLCR loop log" echo " monitor skill Monitor all skill invocations (codex + gemini)" echo " monitor codex Monitor ask-codex skill invocations only" diff --git a/scripts/render-scenario-matrix.py b/scripts/render-scenario-matrix.py new file mode 100755 index 00000000..0a208d58 --- /dev/null +++ b/scripts/render-scenario-matrix.py @@ -0,0 +1,1925 @@ +#!/usr/bin/env python3 +""" +Generate a local HTML dashboard for a Humanize scenario matrix snapshot. +""" + +from __future__ import annotations + +import argparse +import json +import os +import subprocess +import sys +from collections import Counter, defaultdict +from datetime import datetime, timezone +from html import escape +from http import HTTPStatus +from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer +from pathlib import Path +from threading import Thread +from typing import Any +from urllib.parse import urlparse + + +def die(message: str) -> int: + print(f"[scenario-matrix-view] Error: {message}", file=sys.stderr) + return 1 + + +def read_state_value(state_file: Path, key: str) -> str: + if not state_file.is_file(): + return "" + + try: + for line in state_file.read_text(encoding="utf-8").splitlines(): + if line.startswith(f"{key}:"): + return line.split(":", 1)[1].strip().strip('"') + except OSError: + return "" + return "" + + +def git_project_root(start: Path) -> Path | None: + try: + result = subprocess.run( + ["git", "-C", str(start), "rev-parse", "--show-toplevel"], + check=True, + capture_output=True, + text=True, + ) + except (OSError, subprocess.CalledProcessError): + return None + root 
= result.stdout.strip() + return Path(root) if root else None + + +def project_root_from_session(session_dir: Path) -> Path: + try: + return session_dir.resolve().parents[2] + except IndexError: + return session_dir.resolve() + + +def find_latest_session(loop_dir: Path) -> Path | None: + if not loop_dir.is_dir(): + return None + + latest: Path | None = None + for child in loop_dir.iterdir(): + if child.is_dir() and len(child.name) == 19 and child.name[4] == "-" and child.name[7] == "-" and child.name[10] == "_": + if latest is None or child.name > latest.name: + latest = child + return latest + + +def resolve_matrix_from_session(session_dir: Path) -> tuple[Path, Path]: + state_candidates = [ + session_dir / "state.md", + session_dir / "methodology-analysis-state.md", + session_dir / "finalize-state.md", + ] + state_candidates.extend(sorted(session_dir.glob("*-state.md"))) + state_file = next((candidate for candidate in state_candidates if candidate.is_file()), session_dir / "state.md") + + matrix_rel = read_state_value(state_file, "scenario_matrix_file") + matrix_file: Path | None = None + if matrix_rel: + matrix_candidate = project_root_from_session(session_dir) / matrix_rel + if matrix_candidate.is_file(): + matrix_file = matrix_candidate + + if matrix_file is None: + fallback = session_dir / "scenario-matrix.json" + if fallback.is_file(): + matrix_file = fallback + + if matrix_file is None: + raise FileNotFoundError(f"no scenario matrix found for session: {session_dir}") + return matrix_file, session_dir + + +def resolve_input(input_arg: str | None) -> tuple[Path, Path | None]: + if input_arg: + candidate = Path(input_arg).expanduser() + else: + base = Path(os.environ.get("CLAUDE_PROJECT_DIR") or os.getcwd()) + project_root = git_project_root(base) or base.resolve() + latest = find_latest_session(project_root / ".humanize" / "rlcr") + if latest is None: + raise FileNotFoundError("no RLCR session found under .humanize/rlcr") + return 
resolve_matrix_from_session(latest) + + candidate = candidate.resolve() + + if candidate.is_file(): + if candidate.suffix.lower() == ".json": + return candidate, candidate.parent if candidate.name == "scenario-matrix.json" else None + if candidate.name.endswith("-state.md") or candidate.name == "state.md": + return resolve_matrix_from_session(candidate.parent) + raise FileNotFoundError(f"unsupported input file: {candidate}") + + if candidate.is_dir(): + if (candidate / ".humanize" / "rlcr").is_dir(): + latest = find_latest_session(candidate / ".humanize" / "rlcr") + if latest is None: + raise FileNotFoundError(f"no RLCR session found under {candidate / '.humanize' / 'rlcr'}") + return resolve_matrix_from_session(latest) + if (candidate / "scenario-matrix.json").is_file(): + return resolve_matrix_from_session(candidate) + if (candidate / "state.md").is_file() or list(candidate.glob("*-state.md")): + return resolve_matrix_from_session(candidate) + raise FileNotFoundError(f"directory is not a session dir, project dir, or matrix dir: {candidate}") + + raise FileNotFoundError(f"input path not found: {candidate}") + + +def choose_output_path(matrix_file: Path, session_dir: Path | None, explicit_output: str | None) -> Path: + if explicit_output: + return Path(explicit_output).expanduser().resolve() + + if matrix_file.name == "scenario-matrix.json": + filename = "scenario-matrix-view.html" + else: + filename = f"{matrix_file.stem}-view.html" + + base_dir = session_dir if session_dir is not None else matrix_file.parent + return (base_dir / filename).resolve() + + +def load_matrix(matrix_file: Path) -> dict[str, Any]: + try: + matrix = json.loads(matrix_file.read_text(encoding="utf-8")) + except FileNotFoundError as exc: + raise FileNotFoundError(f"matrix file does not exist: {matrix_file}") from exc + except json.JSONDecodeError as exc: + raise ValueError(f"invalid JSON in {matrix_file}: {exc}") from exc + + if not isinstance(matrix, dict): + raise ValueError("matrix root 
must be a JSON object") + if not isinstance(matrix.get("tasks"), list): + raise ValueError("matrix.tasks must be an array") + return matrix + + +def classify_bucket(task: dict[str, Any], primary_id: str | None, supporting_ids: set[str]) -> str: + task_id = str(task.get("id") or "") + state = str(task.get("state") or "pending") + if state == "done": + return "done" + if state == "deferred" or str(task.get("admission", {}).get("status") or "") == "deferred": + return "deferred" + if task_id and task_id == primary_id: + return "primary" + if task_id in supporting_ids: + return "supporting" + return "active" + + +def should_render_task(raw_task: Any) -> tuple[bool, str | None]: + if not isinstance(raw_task, dict): + return False, "invalid task entry" + + task_id = str(raw_task.get("id") or "").strip() + title = raw_task.get("title") + source = str(raw_task.get("source") or "").strip().lower() + normalized_title = " ".join(str(title or "").split()).upper() + + if not task_id: + return False, "missing task id" + if not isinstance(title, str) or not title.strip(): + return False, f"{task_id}: missing task title" + if normalized_title in {"FINDING", "STRUCTURED_FINDING", "WATCHLIST_FINDING"}: + return False, f"{task_id}: placeholder finding title" + if task_id.startswith("finding-r") and source == "review": + return False, f"{task_id}: transient review finding" + return True, None + + +def build_view_model(matrix: dict[str, Any], matrix_file: Path, session_dir: Path | None) -> dict[str, Any]: + raw_tasks = matrix.get("tasks", []) + runtime = matrix.get("runtime", {}) + manager = matrix.get("manager", {}) + checkpoint = runtime.get("checkpoint", {}) + convergence = runtime.get("convergence", {}) + oversight = matrix.get("oversight", {}) + feedback = matrix.get("feedback", {}) + events = matrix.get("events", []) + + primary_id = manager.get("current_primary_task_id") or checkpoint.get("primary_task_id") + supporting_ids = { + str(task_id) + for task_id in 
checkpoint.get("supporting_task_ids", []) + if isinstance(task_id, str) + } + + tasks: list[dict[str, Any]] = [] + hidden_tasks: list[str] = [] + for raw_task in raw_tasks: + should_render, reason = should_render_task(raw_task) + if should_render: + tasks.append(raw_task) + elif reason: + hidden_tasks.append(reason) + + dependents: dict[str, list[str]] = defaultdict(list) + for task in tasks: + task_id = str(task.get("id") or "") + for dep in task.get("depends_on", []): + if isinstance(dep, str) and task_id: + dependents[dep].append(task_id) + + feedback_entries = list(feedback.get("execution", [])) + list(feedback.get("review", [])) + feedback_by_task: dict[str, list[dict[str, Any]]] = defaultdict(list) + for entry in feedback_entries: + task_id = str(entry.get("task_id") or "") + if task_id: + feedback_by_task[task_id].append(entry) + + state_counts = Counter() + bucket_counts = Counter() + task_cards = [] + for raw_task in tasks: + task_id = str(raw_task.get("id") or "") + bucket = classify_bucket(raw_task, primary_id, supporting_ids) + state = str(raw_task.get("state") or "pending") + state_counts[state] += 1 + bucket_counts[bucket] += 1 + task_cards.append( + { + "id": task_id, + "title": str(raw_task.get("title") or "Untitled task"), + "lane": str(raw_task.get("lane") or "queued"), + "routing": str(raw_task.get("routing") or "coding"), + "state": state, + "kind": str(raw_task.get("kind") or "feature"), + "bucket": bucket, + "is_primary": task_id == primary_id, + "is_supporting": task_id in supporting_ids, + "risk_bucket": str(raw_task.get("risk_bucket") or "planned"), + "owner": raw_task.get("owner"), + "target_ac": raw_task.get("target_ac", []), + "depends_on": raw_task.get("depends_on", []), + "dependent_ids": sorted(dependents.get(task_id, [])), + "cluster_id": raw_task.get("cluster_id"), + "repair_wave": raw_task.get("repair_wave"), + "wave_label": raw_task.get("repair_wave") or raw_task.get("cluster_id"), + "scope": raw_task.get("scope", {}), + 
"assumptions": raw_task.get("assumptions", []), + "strategy": raw_task.get("strategy", {}), + "health": raw_task.get("health", {}), + "admission": raw_task.get("admission", {}), + "metadata": raw_task.get("metadata", {}), + "feedback": feedback_by_task.get(task_id, []), + } + ) + + task_cards.sort( + key=lambda task: ( + {"primary": 0, "supporting": 1, "active": 2, "done": 3, "deferred": 4}.get(task["bucket"], 9), + {"in_progress": 0, "ready": 1, "pending": 2, "blocked": 3, "needs_replan": 4, "done": 5, "deferred": 6}.get(task["state"], 9), + task["id"], + ) + ) + + primary_task = next((task for task in task_cards if task["id"] == primary_id), None) + event_cards = [] + for event in events: + if not isinstance(event, dict): + continue + event_cards.append( + { + "id": str(event.get("id") or ""), + "type": str(event.get("type") or "event"), + "round": event.get("round"), + "phase": event.get("phase"), + "task_id": event.get("task_id"), + "verdict": event.get("verdict"), + "severity": event.get("severity"), + "finding_key": event.get("finding_key"), + "created_at": event.get("created_at"), + "summary": event.get("summary") or event.get("reason") or event.get("message"), + } + ) + event_cards.sort(key=lambda event: (str(event.get("created_at") or ""), event.get("id", "")), reverse=True) + + return { + "meta": { + "title": "Scenario Matrix Dashboard", + "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"), + "source_file": str(matrix_file), + "session_dir": str(session_dir) if session_dir else "", + "schema_version": matrix.get("schema_version"), + }, + "plan": matrix.get("plan", {}), + "metadata": matrix.get("metadata", {}), + "summary": { + "mode": runtime.get("mode"), + "projection_mode": runtime.get("projection_mode"), + "current_round": runtime.get("current_round"), + "primary_task_id": primary_id, + "primary_task_title": primary_task["title"] if primary_task else None, + "task_count": len(task_cards), + "hidden_task_count": len(hidden_tasks), 
+ "state_counts": state_counts, + "bucket_counts": bucket_counts, + "event_count": len(event_cards), + "execution_feedback_count": len(feedback.get("execution", [])), + "review_feedback_count": len(feedback.get("review", [])), + }, + "checkpoint": checkpoint, + "convergence": convergence, + "last_review": runtime.get("last_review", {}), + "manager": manager, + "oversight": oversight, + "tasks": task_cards, + "hidden_tasks": hidden_tasks, + "events": event_cards, + "feedback": { + "execution": feedback.get("execution", []), + "review": feedback.get("review", []), + }, + "raw_matrix": matrix, + } + + +def load_view_model_from_input(input_arg: str | None) -> tuple[dict[str, Any], Path, Path | None]: + matrix_file, session_dir = resolve_input(input_arg) + matrix = load_matrix(matrix_file) + return build_view_model(matrix, matrix_file, session_dir), matrix_file, session_dir + + +def render_html(view_model: dict[str, Any], page_title: str) -> str: + payload = json.dumps(view_model, ensure_ascii=False).replace(" + + + + + __TITLE__ + + + +

+ __TITLE__
+ Manager-facing snapshot of the current scenario matrix. This view groups tasks into the active frontier, shows dependency edges, exposes checkpoint and convergence status, and keeps feedback/events visible without forcing you to read raw JSON.
+ Generated: __GENERATED_AT__
+ Source: __SOURCE_FILE__
+ Session: __SESSION_DIR__
+ Workflow Graph
+ Read the matrix as a left-to-right dependency chain, closer to a workflow graph than a stacked kanban board.
+ Tasks are staged from left to right by dependency depth. Drag cards to refine local branches, drag the background to pan, and use reset if you want the workflow chain rebuilt from the current matrix.
+ 100%
+ 60s
+ Recent Events
+ Feedback Queues
+ Warnings and Notes
+ Raw Matrix JSON
+ + + + + +""" + return ( + template.replace("__TITLE__", html_title) + .replace("__SOURCE_FILE__", source_file) + .replace("__GENERATED_AT__", generated_at) + .replace("__SESSION_DIR__", session_dir) + .replace("__PAYLOAD__", payload) + ) + + +def serve_dashboard(input_arg: str | None, page_title: str, bind: str, port: int, once: bool) -> int: + class ScenarioMatrixHandler(BaseHTTPRequestHandler): + server_version = "ScenarioMatrixViewer/1.0" + + def log_message(self, format: str, *args: Any) -> None: + return + + def _send_text(self, status: HTTPStatus, body: str, content_type: str) -> None: + encoded = body.encode("utf-8") + self.send_response(status) + self.send_header("Content-Type", f"{content_type}; charset=utf-8") + self.send_header("Content-Length", str(len(encoded))) + self.send_header("Cache-Control", "no-store") + self.end_headers() + self.wfile.write(encoded) + + def _maybe_shutdown(self) -> None: + if once: + Thread(target=self.server.shutdown, daemon=True).start() + + def do_GET(self) -> None: + parsed = urlparse(self.path) + if parsed.path == "/healthz": + self._send_text(HTTPStatus.OK, "ok\n", "text/plain") + return + + if parsed.path not in ("/", "/index.html"): + self._send_text(HTTPStatus.NOT_FOUND, "Not found\n", "text/plain") + return + + try: + view_model, _, _ = load_view_model_from_input(input_arg) + html = render_html(view_model, page_title) + self._send_text(HTTPStatus.OK, html, "text/html") + except (FileNotFoundError, ValueError) as exc: + message = escape(str(exc)) + body = ( + "" + f"{escape(page_title)} — Error" + f"

{escape(page_title)}

Unable to load the current scenario matrix snapshot.

" + f"
{message}
" + ) + self._send_text(HTTPStatus.INTERNAL_SERVER_ERROR, body, "text/html") + self._maybe_shutdown() + + class ScenarioMatrixServer(ThreadingHTTPServer): + allow_reuse_address = True + daemon_threads = True + + try: + server = ScenarioMatrixServer((bind, port), ScenarioMatrixHandler) + except OSError as exc: + raise OSError(f"unable to start scenario matrix client on {bind}:{port}: {exc}") from exc + + host, actual_port = server.server_address[:2] + display_host = "127.0.0.1" if host in ("0.0.0.0", "", "::") else str(host) + print(f"http://{display_host}:{actual_port}/", flush=True) + + try: + server.serve_forever() + except KeyboardInterrupt: + pass + finally: + server.server_close() + return 0 + + +def main(argv: list[str]) -> int: + parser = argparse.ArgumentParser( + description="Render a Humanize scenario matrix snapshot into a local HTML dashboard." + ) + parser.add_argument( + "--input", + help="Matrix JSON, RLCR session dir, state file, or project dir. Defaults to the latest local RLCR session.", + ) + parser.add_argument( + "--output", + help="HTML output path. Defaults next to the matrix/session as *-view.html.", + ) + parser.add_argument( + "--title", + default="Scenario Matrix Dashboard", + help="Page title for the generated HTML.", + ) + parser.add_argument( + "--serve", + action="store_true", + help="Run a local HTML client that re-renders the current matrix snapshot on each page refresh.", + ) + parser.add_argument( + "--bind", + default="127.0.0.1", + help="Bind address for --serve mode. Default: 127.0.0.1", + ) + parser.add_argument( + "--port", + type=int, + default=8765, + help="Port for --serve mode. Use 0 to auto-select an open port. Default: 8765", + ) + parser.add_argument( + "--once", + action="store_true", + help="Serve a single HTML request and then exit. 
Useful for tests.", + ) + args = parser.parse_args(argv) + + if args.serve: + try: + return serve_dashboard(args.input, args.title, args.bind, args.port, args.once) + except OSError as exc: + return die(str(exc)) + + try: + matrix_file, session_dir = resolve_input(args.input) + output_file = choose_output_path(matrix_file, session_dir, args.output) + matrix = load_matrix(matrix_file) + except (FileNotFoundError, ValueError) as exc: + return die(str(exc)) + + output_file.parent.mkdir(parents=True, exist_ok=True) + view_model = build_view_model(matrix, matrix_file, session_dir) + html = render_html(view_model, args.title) + output_file.write_text(html, encoding="utf-8") + print(output_file) + return 0 + + +if __name__ == "__main__": + raise SystemExit(main(sys.argv[1:])) diff --git a/scripts/setup-rlcr-loop.sh b/scripts/setup-rlcr-loop.sh index 9d45363c..03a3b1b3 100755 --- a/scripts/setup-rlcr-loop.sh +++ b/scripts/setup-rlcr-loop.sh @@ -31,6 +31,7 @@ source "$SCRIPT_DIR/portable-timeout.sh" # before invoking this script. 
HOOKS_LIB_DIR="$(cd "$SCRIPT_DIR/../hooks/lib" && pwd)" source "$HOOKS_LIB_DIR/loop-common.sh" +source "$HOOKS_LIB_DIR/scenario-matrix.sh" # ======================================== # Parse Arguments @@ -823,6 +824,7 @@ LOOP_DIR="$LOOP_BASE_DIR/$TIMESTAMP" mkdir -p "$LOOP_DIR" # Copy plan file to loop directory as backup (or create placeholder for skip-impl) +PLAN_BACKUP_REL=".humanize/rlcr/$TIMESTAMP/plan.md" if [[ "$SKIP_IMPL_NO_PLAN" == "true" ]]; then # Create placeholder plan file for skip-impl mode cat > "$LOOP_DIR/plan.md" << 'SKIP_IMPL_PLAN_EOF' @@ -841,7 +843,7 @@ The loop will: SKIP_IMPL_PLAN_EOF # Update PLAN_FILE to point to the actual placeholder location (repo-relative path) # Using relative path because git ls-files requires repo-relative paths - PLAN_FILE=".humanize/rlcr/$TIMESTAMP/plan.md" + PLAN_FILE="$PLAN_BACKUP_REL" else cp "$FULL_PLAN_PATH" "$LOOP_DIR/plan.md" fi @@ -849,6 +851,36 @@ fi # Docs path default DOCS_PATH="docs" +# ======================================== +# Initialize Scenario Matrix File +# ======================================== + +SCENARIO_MATRIX_FILE="$LOOP_DIR/scenario-matrix.json" +SCENARIO_MATRIX_FILE_REL=".humanize/rlcr/$TIMESTAMP/scenario-matrix.json" +SCENARIO_MATRIX_MODE="implementation" +SCENARIO_MATRIX_STATUS_OVERRIDE="" +if [[ "$SKIP_IMPL" == "true" ]]; then + SCENARIO_MATRIX_MODE="skip_impl" +fi +if [[ "$SKIP_IMPL_NO_PLAN" == "true" ]]; then + SCENARIO_MATRIX_STATUS_OVERRIDE="not_applicable" +fi + +scenario_matrix_initialize_file \ + "$SCENARIO_MATRIX_FILE" \ + "$PLAN_FILE" \ + "$PLAN_BACKUP_REL" \ + "$SCENARIO_MATRIX_MODE" \ + 0 \ + "$SCENARIO_MATRIX_STATUS_OVERRIDE" + +if ! 
scenario_matrix_validate_file "$SCENARIO_MATRIX_FILE"; then + echo "Error: Failed to initialize a valid scenario matrix file" >&2 + exit 1 +fi + +scenario_matrix_reconcile_manager_state "$SCENARIO_MATRIX_FILE" 0 "initial_setup" || true + # ======================================== # Initialize BitLesson File # ======================================== @@ -892,6 +924,8 @@ ask_codex_question: $ASK_CODEX_QUESTION session_id: agent_teams: $AGENT_TEAMS privacy_mode: $PRIVACY_MODE +scenario_matrix_file: $SCENARIO_MATRIX_FILE_REL +scenario_matrix_required: true bitlesson_required: $BITLESSON_STATE_VALUE bitlesson_file: $BITLESSON_FILE_REL bitlesson_allow_empty_none: $BITLESSON_ALLOW_EMPTY_NONE @@ -1176,6 +1210,11 @@ write_summary_template() { [List any deferred or pending items] +## Task Packet Feedback (Optional) + +| Task ID | Source | Kind | Summary | +|---------|--------|------|---------| + ## BitLesson Delta Action: none @@ -1195,6 +1234,11 @@ ROUND_CONTRACT_PATH="$LOOP_DIR/round-0-contract.md" # validation and BitLesson Delta checks have a valid target file. write_summary_template "$SUMMARY_PATH" +if [[ "$SCENARIO_MATRIX_MODE" == "implementation" ]]; then + scenario_matrix_sync_goal_tracker "$SCENARIO_MATRIX_FILE" "$GOAL_TRACKER_FILE" || true + scenario_matrix_write_round_contract_scaffold "$SCENARIO_MATRIX_FILE" "$ROUND_CONTRACT_PATH" 0 "implementation" "true" || true +fi + if [[ "$SKIP_IMPL" == "true" ]]; then if [[ "$SKIP_IMPL_PLAN_ANCHORED" == "true" ]]; then cat > "$ROUND_CONTRACT_PATH" << EOF @@ -1237,15 +1281,17 @@ Do not try to execute anything to trigger the review - just stop and it will run Before requesting review, read: - @$PLAN_FILE - @$GOAL_TRACKER_FILE +- @$SCENARIO_MATRIX_FILE - @$ROUND_CONTRACT_PATH ## Your Task 1. Review your current work -2. When ready, try to exit - Codex will review your code -3. Fix any issues Codex finds -4. Repeat until no issues remain -5. Enter finalize phase for code simplification +2. 
Refresh @$SCENARIO_MATRIX_FILE so review-only work stays aligned with the current task graph +3. When ready, try to exit - Codex will review your code +4. Fix any issues Codex finds +5. Repeat until no issues remain +6. Enter finalize phase for code simplification ## Review Objective @@ -1273,6 +1319,7 @@ EOF cat >> "$LOOP_DIR/round-0-prompt.md" << EOF Keep @$ROUND_CONTRACT_PATH updated if the blocking/queued split changes materially during review iterations. +Keep @$SCENARIO_MATRIX_FILE updated if review findings change task readiness, dependency assumptions, or the mainline/supporting split. When you're ready for review, write a brief summary of your changes and try to exit (do not try to execute anything, just stop). @@ -1311,6 +1358,17 @@ Use this contract to keep the round focused. Do NOT let non-blocking bugs or cle **IMPORTANT**: The IMMUTABLE SECTION can only be modified in Round 0. After this round, it becomes read-only. +## Scenario Matrix Setup (REQUIRED BEFORE CODING) + +Before starting implementation, you MUST read and refresh @$SCENARIO_MATRIX_FILE: + +1. Verify the seeded task, routing, and dependency data copied from the plan +2. If the task table was missing or malformed, repair the matrix with the smallest safe defaults +3. Keep the matrix aligned with the round contract and mutable goal tracker state +4. Track dependency-sensitive task readiness there before touching code + +The matrix may describe several related tasks, but this round still has exactly one current mainline objective. + --- ## Implementation Plan @@ -1414,6 +1472,18 @@ Throughout your work, you MUST maintain the Goal Tracker: - Add to "Blocking Side Issues" only if mainline progress is blocked - Otherwise add to "Queued Side Issues" or keep them as \`[queued]\` tasks/backlog +## Scenario Matrix Rules + +Throughout your work, you MUST maintain @$SCENARIO_MATRIX_FILE: + +1. Refresh task states and dependency edges before rewriting the round contract +2. 
If upstream work changes a downstream assumption, update the affected task state instead of blindly following the old plan +3. Keep exactly one current mainline objective for the round even if several supporting tasks are ready +4. If matrix state and tracker state disagree, reconcile the matrix first, then update the mutable tracker section + +Treat the Current Task Packet and Manager Checkpoint as manager-issued scope. +Do not rewrite authoritative matrix state directly; report packet corrections or task-shape concerns through the summary feedback section. + --- Note: You MUST NOT try to exit \`start-rlcr-loop\` loop by lying or edit loop state file or try to execute \`cancel-rlcr-loop\` @@ -1436,6 +1506,21 @@ fi fi # End of skip-impl prompt handling +if scenario_matrix_has_projectable_tasks "$SCENARIO_MATRIX_FILE"; then + ROUND0_TASK_PACKET=$(scenario_matrix_current_task_packet_markdown "$SCENARIO_MATRIX_FILE" 2>/dev/null || true) + ROUND0_CHECKPOINT=$(scenario_matrix_current_checkpoint_markdown "$SCENARIO_MATRIX_FILE" 2>/dev/null || true) + ROUND0_TASK_PACKET_FEEDBACK=$(scenario_matrix_task_packet_feedback_instructions_markdown 2>/dev/null || true) + if [[ -n "$ROUND0_TASK_PACKET" ]]; then + printf '\n%s\n' "$ROUND0_TASK_PACKET" >> "$LOOP_DIR/round-0-prompt.md" + fi + if [[ -n "$ROUND0_CHECKPOINT" ]]; then + printf '\n%s\n' "$ROUND0_CHECKPOINT" >> "$LOOP_DIR/round-0-prompt.md" + fi + if [[ -n "$ROUND0_TASK_PACKET_FEEDBACK" ]]; then + printf '\n%s\n' "$ROUND0_TASK_PACKET_FEEDBACK" >> "$LOOP_DIR/round-0-prompt.md" + fi +fi + # ======================================== # Output Setup Message # ======================================== diff --git a/tests/run-all-tests.sh b/tests/run-all-tests.sh index b6ba6b24..02c76835 100755 --- a/tests/run-all-tests.sh +++ b/tests/run-all-tests.sh @@ -82,6 +82,8 @@ TEST_SUITES=( "test-gen-plan.sh" "test-refine-plan.sh" "test-task-tag-routing.sh" + "test-scenario-matrix.sh" + "test-review-coverage-prompts.sh" "test-config-merge.sh" 
"test-config-error-handling.sh" "test-codex-hook-install.sh" diff --git a/tests/test-codex-hook-install.sh b/tests/test-codex-hook-install.sh index 2d70bb2d..a85308d4 100755 --- a/tests/test-codex-hook-install.sh +++ b/tests/test-codex-hook-install.sh @@ -253,6 +253,7 @@ PATH="$FAKE_BIN:$PATH" TEST_CODEX_FEATURE_LOG="$FEATURE_LOG" XDG_CONFIG_HOME="$X --target codex \ --codex-config-dir "$CODEX_HOME_DIR" \ --codex-skills-dir "$CODEX_HOME_DIR/skills" \ + --command-bin-dir "$COMMAND_BIN_DIR" \ > "$TEST_DIR/install-2.log" 2>&1 PY_OUTPUT_2="$( @@ -289,7 +290,9 @@ fi UNSUPPORTED_BIN="$TEST_DIR/bin-unsupported" UNSUPPORTED_HOME="$TEST_DIR/codex-home-unsupported" -mkdir -p "$UNSUPPORTED_BIN" "$UNSUPPORTED_HOME" +UNSUPPORTED_XDG_CONFIG_HOME="$TEST_DIR/xdg-config-unsupported" +UNSUPPORTED_COMMAND_BIN_DIR="$TEST_DIR/command-bin-unsupported" +mkdir -p "$UNSUPPORTED_BIN" "$UNSUPPORTED_HOME" "$UNSUPPORTED_XDG_CONFIG_HOME" "$UNSUPPORTED_COMMAND_BIN_DIR" cat > "$UNSUPPORTED_BIN/codex" <<'EOF' #!/usr/bin/env bash @@ -309,10 +312,12 @@ chmod +x "$UNSUPPORTED_BIN/codex" set +e PATH="$UNSUPPORTED_BIN:$PATH" \ + XDG_CONFIG_HOME="$UNSUPPORTED_XDG_CONFIG_HOME" \ "$INSTALL_SCRIPT" \ --target codex \ --codex-config-dir "$UNSUPPORTED_HOME" \ --codex-skills-dir "$UNSUPPORTED_HOME/skills" \ + --command-bin-dir "$UNSUPPORTED_COMMAND_BIN_DIR" \ > "$TEST_DIR/install-unsupported.log" 2>&1 UNSUPPORTED_EXIT=$? set -e diff --git a/tests/test-review-coverage-prompts.sh b/tests/test-review-coverage-prompts.sh new file mode 100755 index 00000000..17603ac6 --- /dev/null +++ b/tests/test-review-coverage-prompts.sh @@ -0,0 +1,85 @@ +#!/usr/bin/env bash + +set -uo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" + +RED='\033[0;31m' +GREEN='\033[0;32m' +NC='\033[0m' + +TESTS_PASSED=0 +TESTS_FAILED=0 + +pass() { + echo -e "${GREEN}PASS${NC}: $1" + TESTS_PASSED=$((TESTS_PASSED + 1)) +} + +fail() { + echo -e "${RED}FAIL${NC}: $1" + if [[ -n "${2:-}" ]]; then + echo " Expected: $2" + fi + if [[ -n "${3:-}" ]]; then + echo " Got: $3" + fi + TESTS_FAILED=$((TESTS_FAILED + 1)) +} + +regular_template="$PROJECT_ROOT/prompt-template/codex/regular-review.md" +full_template="$PROJECT_ROOT/prompt-template/codex/full-alignment-review.md" + +require_string() { + local file="$1" + local pattern="$2" + local description="$3" + + if grep -Fq -- "$pattern" "$file"; then + pass "$description" + else + fail "$description" "$pattern in $file" "$(sed -n '1,220p' "$file")" + fi +} + +require_string "$regular_template" "Touched Failure Surfaces" "regular review prompt requires touched failure surfaces" +require_string "$regular_template" "Likely Sibling Risks" "regular review prompt requires likely sibling risks" +require_string "$regular_template" "Coverage Ledger" "regular review prompt requires coverage ledger" +require_string "$regular_template" "Historical Tail-Repair Scan" "regular review prompt requires historical tail-repair scan" +require_string "$regular_template" "git log --oneline --stat -n 12" "regular review prompt inspects recent git history" +require_string "$regular_template" "long-tail repair-chain signal" "regular review prompt defines long-tail repair-chain signal" +require_string "$regular_template" "Do NOT render the Coverage Ledger as \`-\` / \`*\` bullet findings." "regular review prompt preserves parser-safe coverage ledger formatting" +require_string "$regular_template" "Keep the lane headings exactly as written above" "regular review prompt preserves machine-readable lane headings" +require_string "$regular_template" "1. 
\`Touched Failure Surfaces\`" "regular review prompt documents output order starting with touched failure surfaces" +require_string "$regular_template" "3. \`Mainline Gaps\`" "regular review prompt keeps lane output after analysis" +require_string "$regular_template" "7. \`Coverage Ledger\`" "regular review prompt ends output order with coverage ledger" +require_string "$regular_template" "- \`- | why: | confidence: high|medium|low\`" "regular review prompt specifies parser-friendly touched surface format" +require_string "$regular_template" "- \`- | derived_from: | axis: | why: | check: | confidence: high|medium|low\`" "regular review prompt specifies parser-friendly sibling risk format" +require_string "$regular_template" "| Surface | Status | Notes |" "regular review prompt specifies parser-friendly coverage ledger table" + +require_string "$full_template" "Touched Failure Surfaces" "full alignment prompt requires touched failure surfaces" +require_string "$full_template" "Likely Sibling Risks" "full alignment prompt requires likely sibling risks" +require_string "$full_template" "Coverage Ledger" "full alignment prompt requires coverage ledger" +require_string "$full_template" "Historical Tail-Repair Scan" "full alignment prompt requires historical tail-repair scan" +require_string "$full_template" "git log --oneline --stat -n 20" "full alignment prompt inspects recent git history" +require_string "$full_template" "long-tail repair-chain signal" "full alignment prompt defines long-tail repair-chain signal" +require_string "$full_template" "Do NOT render the Coverage Ledger as \`-\` / \`*\` bullet findings." "full alignment prompt preserves parser-safe coverage ledger formatting" +require_string "$full_template" "Keep the lane headings exactly as written above" "full alignment prompt preserves machine-readable lane headings" +require_string "$full_template" "1. 
\`Touched Failure Surfaces\`" "full alignment prompt documents output order starting with touched failure surfaces" +require_string "$full_template" "3. \`Mainline Gaps\`" "full alignment prompt keeps lane output after analysis" +require_string "$full_template" "7. \`Coverage Ledger\`" "full alignment prompt ends output order with coverage ledger" +require_string "$full_template" "- \`- | why: | confidence: high|medium|low\`" "full alignment prompt specifies parser-friendly touched surface format" +require_string "$full_template" "- \`- | derived_from: | axis: | why: | check: | confidence: high|medium|low\`" "full alignment prompt specifies parser-friendly sibling risk format" +require_string "$full_template" "| Surface | Status | Notes |" "full alignment prompt specifies parser-friendly coverage ledger table" + +echo "" +echo "========================================" +echo "Review Coverage Prompt Tests" +echo "========================================" +echo "Passed: ${TESTS_PASSED}" +echo "Failed: ${TESTS_FAILED}" + +if [[ "$TESTS_FAILED" -ne 0 ]]; then + exit 1 +fi diff --git a/tests/test-scenario-matrix.sh b/tests/test-scenario-matrix.sh new file mode 100755 index 00000000..2987d373 --- /dev/null +++ b/tests/test-scenario-matrix.sh @@ -0,0 +1,1572 @@ +#!/usr/bin/env bash +# +# Tests for scenario matrix foundation in setup-rlcr-loop.sh +# + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)" +PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" +source "$SCRIPT_DIR/test-helpers.sh" + +SETUP_SCRIPT="$PROJECT_ROOT/scripts/setup-rlcr-loop.sh" +STOP_HOOK="$PROJECT_ROOT/hooks/loop-codex-stop-hook.sh" +HUMANIZE_SCRIPT="$PROJECT_ROOT/scripts/humanize.sh" +SCENARIO_MATRIX_LIB="$PROJECT_ROOT/hooks/lib/scenario-matrix.sh" + +source "$SCENARIO_MATRIX_LIB" + +echo "========================================" +echo "Scenario Matrix Foundation Tests" +echo "========================================" +echo "" + +create_mock_codex() { + local bin_dir="$1" + local exec_output="${2:-Need follow-up work}" + local review_output="${3:-No issues found.}" + + mkdir -p "$bin_dir" + printf '%s\n' "$exec_output" > "$bin_dir/exec-output.txt" + printf '%s\n' "$review_output" > "$bin_dir/review-output.txt" + cat > "$bin_dir/codex" << 'EOF' +#!/usr/bin/env bash +script_dir="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)" +subcommand="" +for arg in "$@"; do + if [[ "$arg" == "exec" || "$arg" == "review" ]]; then + subcommand="$arg" + break + fi +done +if [[ "$subcommand" == "exec" ]]; then + cat "$script_dir/exec-output.txt" +elif [[ "$subcommand" == "review" ]]; then + cat "$script_dir/review-output.txt" +else + exit 0 +fi +EOF + chmod +x "$bin_dir/codex" +} + +create_repo_with_plan() { + local repo_dir="$1" + local plan_body="$2" + + init_test_git_repo "$repo_dir" + mkdir -p "$repo_dir/plans" + cat > "$repo_dir/plans/plan.md" << EOF +$plan_body +EOF + cat > "$repo_dir/.gitignore" << 'EOF' +plans/ +.humanize/ +bin/ +.cache/ +EOF + git -C "$repo_dir" add .gitignore + git -C "$repo_dir" commit -q -m "Add gitignore for scenario matrix tests" +} + +run_setup() { + local repo_dir="$1" + shift + + ( + cd "$repo_dir" + PATH="$repo_dir/bin:$PATH" CLAUDE_PROJECT_DIR="$repo_dir" bash "$SETUP_SCRIPT" "$@" + ) +} + +run_setup_from_subdir() { + local repo_dir="$1" + local subdir="$2" + shift 2 + + ( + cd "$repo_dir/$subdir" + PATH="$repo_dir/bin:$PATH" CLAUDE_PROJECT_DIR="$repo_dir" bash "$SETUP_SCRIPT" "$@" + ) +} + 
+setup_matrix_test_dir() { + setup_test_dir + export XDG_CACHE_HOME="$TEST_DIR/.cache" + mkdir -p "$XDG_CACHE_HOME" +} + +find_matrix_file() { + local repo_dir="$1" + find "$repo_dir/.humanize/rlcr" -name "scenario-matrix.json" -type f | head -1 +} + +setup_manual_loop_repo() { + local repo_dir="$1" + local round="$2" + local review_started="$3" + local scenario_matrix_required="${4:-true}" + + init_test_git_repo "$repo_dir" + mkdir -p "$repo_dir/plans" + cat > "$repo_dir/plans/plan.md" << 'EOF' +# Hook Plan + +## Goal +Keep matrix state aligned with runtime prompts. + +## Acceptance Criteria +- AC-1: Scenario matrix is refreshed +EOF + cat > "$repo_dir/.gitignore" << 'EOF' +plans/ +.humanize/ +bin/ +.cache/ +EOF + git -C "$repo_dir" add .gitignore + git -C "$repo_dir" commit -q -m "Add gitignore for hook matrix tests" + + local current_branch + current_branch=$(git -C "$repo_dir" rev-parse --abbrev-ref HEAD) + + local loop_dir="$repo_dir/.humanize/rlcr/2024-03-01_12-00-00" + mkdir -p "$loop_dir" + cat > "$loop_dir/state.md" << EOF +--- +current_round: $round +max_iterations: 10 +codex_model: gpt-5.4 +codex_effort: high +codex_timeout: 5400 +push_every_round: false +plan_file: plans/plan.md +plan_tracked: false +start_branch: $current_branch +base_branch: main +base_commit: abc123 +review_started: $review_started +ask_codex_question: false +full_review_round: 5 +session_id: +scenario_matrix_file: .humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json +scenario_matrix_required: $scenario_matrix_required +drift_status: normal +mainline_stall_count: 0 +last_mainline_verdict: unknown +--- +EOF + cp "$repo_dir/plans/plan.md" "$loop_dir/plan.md" + cat > "$loop_dir/goal-tracker.md" << 'EOF' +# Goal Tracker +## IMMUTABLE SECTION +### Ultimate Goal +Keep matrix state aligned with runtime prompts. 
+### Acceptance Criteria +| ID | Criterion | +|----|-----------| +| AC-1 | Matrix instructions remain present | +--- +## MUTABLE SECTION +#### Active Tasks +| Task | Target AC | Status | Tag | Owner | Notes | +|------|-----------|--------|-----|-------|-------| +| Keep matrix aligned | AC-1 | in_progress | coding | claude | matrix-aware round | + +### Blocking Side Issues +| Issue | Discovered Round | Blocking AC | Resolution Path | +|-------|-----------------|-------------|-----------------| + +### Queued Side Issues +| Issue | Discovered Round | Why Not Blocking | Revisit Trigger | +|-------|-----------------|------------------|-----------------| +EOF + cat > "$loop_dir/round-${round}-contract.md" << EOF +# Round $round Contract + +- Mainline Objective: Keep the scenario matrix aligned with the current round. +- Target ACs: AC-1 +- Blocking Side Issues In Scope: none +- Queued Side Issues Out of Scope: none +- Success Criteria: The next prompt still re-anchors on the scenario matrix. +EOF + cat > "$loop_dir/round-${round}-prompt.md" << EOF +# Round $round Prompt + +Continue the work. +EOF + cat > "$loop_dir/round-${round}-summary.md" << 'EOF' +# Round Summary + +Progress made, but more work remains. + +## BitLesson Delta +- Action: none +- Lesson ID(s): NONE +- Notes: No new lessons in this fixture. 
+EOF + + if [[ "$review_started" == "true" ]]; then + echo "build_finish_round=$round" > "$loop_dir/.review-phase-started" + fi + + cat > "$loop_dir/scenario-matrix.json" << 'EOF' +{ + "schema_version": 2, + "created_at": "2026-04-01T00:00:00Z", + "plan": { + "file": "plans/plan.md", + "backup_file": ".humanize/rlcr/2024-03-01_12-00-00/plan.md", + "task_breakdown_status": "parsed", + "warnings": [] + }, + "runtime": { + "mode": "implementation", + "current_round": 2, + "projection_mode": "compatibility" + }, + "metadata": { + "seed_task_count": 2, + "seed_source": "task_breakdown" + }, + "manager": { + "role": "top_level_session", + "authority_mode": "manager_reconcile", + "authoritative_writer": "manager", + "current_primary_task_id": "task1", + "last_reconciled_at": "2026-04-01T00:00:00Z" + }, + "feedback": { + "execution": [], + "review": [] + }, + "tasks": [ + { + "id": "task1", + "title": "Repair parser contract", + "lane": "mainline", + "routing": "coding", + "owner": null, + "scope": { + "summary": "", + "paths": [], + "constraints": [] + }, + "cluster_id": null, + "repair_wave": null, + "risk_bucket": "planned", + "admission": { + "status": "active", + "reason": "fixture" + }, + "authority": { + "write_mode": "manager_only", + "authoritative_source": "manager" + }, + "target_ac": ["AC-1"], + "depends_on": [], + "state": "ready", + "assumptions": [], + "strategy": { + "current": "repair-parser", + "attempt_count": 0, + "repeated_failure_count": 0, + "method_switch_required": false + }, + "health": { + "stuck_score": 0, + "last_progress_round": 0 + }, + "metadata": { + "seed_source": "fixture" + } + }, + { + "id": "task2", + "title": "Update downstream validator", + "lane": "supporting", + "routing": "coding", + "owner": null, + "scope": { + "summary": "", + "paths": [], + "constraints": [] + }, + "cluster_id": null, + "repair_wave": null, + "risk_bucket": "planned", + "admission": { + "status": "active", + "reason": "fixture" + }, + "authority": { + 
"write_mode": "manager_only", + "authoritative_source": "manager" + }, + "target_ac": ["AC-1"], + "depends_on": ["task1"], + "state": "pending", + "assumptions": [], + "strategy": { + "current": "update-validator", + "attempt_count": 0, + "repeated_failure_count": 0, + "method_switch_required": false + }, + "health": { + "stuck_score": 0, + "last_progress_round": 0 + }, + "metadata": { + "seed_source": "fixture" + } + } + ], + "events": [], + "oversight": { + "status": "idle", + "last_action": "none", + "updated_at": null + } +} +EOF +} + +# ======================================== +# Test 1: Valid task breakdown seeds tasks and dependencies +# ======================================== + +setup_matrix_test_dir +REPO_VALID_DIR="$TEST_DIR/repo-valid" +create_repo_with_plan "$REPO_VALID_DIR" '# Scenario Matrix Plan + +## Goal +Seed the scenario matrix from a valid task table. + +## Acceptance Criteria +- AC-1: Parsed successfully +- AC-2: Dependencies preserved + +## Task Breakdown +| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On | +|---------|-------------|-----------|----------------------------|------------| +| task1 | Implement parser | AC-1 | coding | - | +| task2 | Analyze rollout risk | AC-2 | analyze | task1 +' +create_mock_codex "$REPO_VALID_DIR/bin" + +run_setup "$REPO_VALID_DIR" plans/plan.md > /dev/null 2>&1 +MATRIX_FILE=$(find_matrix_file "$REPO_VALID_DIR") + +if [[ -n "$MATRIX_FILE" && -f "$MATRIX_FILE" ]]; then + pass "setup creates scenario-matrix.json for valid task table" +else + fail "setup creates scenario-matrix.json for valid task table" "matrix file exists" "not found" +fi + +if [[ -n "$MATRIX_FILE" ]] && jq -e '.schema_version == 2 and .plan.task_breakdown_status == "parsed" and .metadata.seed_task_count == 2' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "matrix metadata records parsed task table under manager-owned schema" +else + fail "matrix metadata records parsed task table under manager-owned schema" 
"schema_version 2 with parsed status and 2 tasks" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +if [[ -n "$MATRIX_FILE" ]] && jq -e ' + .tasks[0].id == "task1" + and .tasks[0].lane == "mainline" + and .tasks[0].routing == "coding" + and .tasks[0].state == "ready" + and .tasks[1].id == "task2" + and .tasks[1].lane == "queued" + and .tasks[1].routing == "analyze" + and .tasks[1].depends_on == ["task1"] + and .tasks[1].state == "pending" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "matrix seeds routing, queue placement, and dependency state from task table" +else + fail "matrix seeds routing, queue placement, and dependency state from task table" "task1 ready, task2 queued pending with dependency" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +if [[ -n "$MATRIX_FILE" ]] && jq -e ' + .manager.authority_mode == "manager_reconcile" + and .manager.authoritative_writer == "manager" + and .manager.current_primary_task_id == "task1" + and .runtime.checkpoint.sequence == 1 + and .runtime.checkpoint.current_id == "checkpoint-1" + and .runtime.checkpoint.primary_task_id == "task1" + and .runtime.checkpoint.supporting_task_ids == [] + and .runtime.convergence.status == "continue" + and .runtime.convergence.active_task_count == 2 + and (.feedback.execution | length) == 0 + and (.feedback.review | length) == 0 + and .tasks[0].authority.write_mode == "manager_only" + and .tasks[0].authority.authoritative_source == "manager" + and .tasks[0].scope.paths == [] + and .tasks[0].owner == null + and .tasks[0].risk_bucket == "planned" + and .tasks[0].admission.status == "active" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "manager authority and task ownership metadata seed into new matrix" +else + fail "manager authority and task ownership metadata seed into new matrix" "manager block plus task authority/scope/admission defaults" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + 
+PACKET_MARKDOWN=$(scenario_matrix_current_task_packet_markdown "$MATRIX_FILE" 2>/dev/null || true) +if echo "$PACKET_MARKDOWN" | grep -q '^## Current Task Packet$' && \ + echo "$PACKET_MARKDOWN" | grep -q 'Primary Objective: `task1`' && \ + echo "$PACKET_MARKDOWN" | grep -q 'Assigned Task: `task1` - Implement parser' && \ + echo "$PACKET_MARKDOWN" | grep -q 'Direct Downstream Impact: task2' && \ + echo "$PACKET_MARKDOWN" | grep -q 'Target ACs: AC-1'; then + pass "scenario matrix renders primary task packet with dependency and AC context" +else + fail "scenario matrix renders primary task packet with dependency and AC context" "task packet markdown with primary objective, downstream impact, and ACs" "$PACKET_MARKDOWN" +fi + +CHECKPOINT_MARKDOWN=$(scenario_matrix_current_checkpoint_markdown "$MATRIX_FILE" 2>/dev/null || true) +if echo "$CHECKPOINT_MARKDOWN" | grep -q '^## Manager Checkpoint$' && \ + echo "$CHECKPOINT_MARKDOWN" | grep -q 'Checkpoint: `checkpoint-1`' && \ + echo "$CHECKPOINT_MARKDOWN" | grep -q 'Supporting Window: none' && \ + echo "$CHECKPOINT_MARKDOWN" | grep -q 'Convergence Status: `continue`'; then + pass "scenario matrix renders checkpoint and convergence guidance" +else + fail "scenario matrix renders checkpoint and convergence guidance" "checkpoint markdown with supporting window and convergence state" "$CHECKPOINT_MARKDOWN" +fi + +GOAL_TRACKER_FILE=$(find "$REPO_VALID_DIR/.humanize/rlcr" -name "goal-tracker.md" -type f | head -1) +if [[ -n "$GOAL_TRACKER_FILE" ]] && \ + grep -q '\[mainline\] Implement parser' "$GOAL_TRACKER_FILE" && \ + grep -q 'Analyze rollout risk \[task2\]' "$GOAL_TRACKER_FILE" && \ + ! 
grep -q '\[To be populated by Claude based on plan\]' "$GOAL_TRACKER_FILE"; then + pass "setup projects scenario matrix into goal tracker mutable sections" +else + fail "setup projects scenario matrix into goal tracker mutable sections" "mainline and queued task rows without placeholder" "$(cat "$GOAL_TRACKER_FILE" 2>/dev/null || echo 'missing tracker')" +fi + +ROUND0_CONTRACT=$(find "$REPO_VALID_DIR/.humanize/rlcr" -name "round-0-contract.md" -type f | head -1) +if [[ -n "$ROUND0_CONTRACT" ]] && \ + grep -q 'Mainline Objective: task1: Implement parser' "$ROUND0_CONTRACT" && \ + grep -q 'Checkpoint: checkpoint-1' "$ROUND0_CONTRACT" && \ + grep -q 'Supporting Window In Scope: none' "$ROUND0_CONTRACT" && \ + grep -q 'Queued Side Issues Out of Scope: task2: Analyze rollout risk' "$ROUND0_CONTRACT" && \ + grep -q 'Convergence Status: continue' "$ROUND0_CONTRACT"; then + pass "setup seeds round-0 contract scaffold from scenario matrix" +else + fail "setup seeds round-0 contract scaffold from scenario matrix" "matrix-derived mainline and queued task in contract" "$(cat "$ROUND0_CONTRACT" 2>/dev/null || echo 'missing contract')" +fi + +if ! grep -q "mapfile" "$PROJECT_ROOT/hooks/lib/scenario-matrix.sh"; then + pass "scenario matrix parser avoids bash-4-only mapfile" +else + fail "scenario matrix parser avoids bash-4-only mapfile" "no mapfile usage" "$(grep -n "mapfile" "$PROJECT_ROOT/hooks/lib/scenario-matrix.sh")" +fi + +# ======================================== +# Test 1b: Setup from subdirectory still parses copied plan backup +# ======================================== + +setup_matrix_test_dir +REPO_SUBDIR_DIR="$TEST_DIR/repo-subdir" +create_repo_with_plan "$REPO_SUBDIR_DIR" '# Scenario Matrix Plan + +## Goal +Seed the scenario matrix even when launched below repo root. 
+ +## Acceptance Criteria +- AC-1: Parsed successfully +- AC-2: Dependencies preserved + +## Task Breakdown +| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On | +|---------|-------------|-----------|----------------------------|------------| +| task1 | Implement parser | AC-1 | coding | - | +| task2 | Analyze rollout risk | AC-2 | analyze | task1 +' +mkdir -p "$REPO_SUBDIR_DIR/work/nested" +create_mock_codex "$REPO_SUBDIR_DIR/bin" + +run_setup_from_subdir "$REPO_SUBDIR_DIR" "work/nested" "plans/plan.md" > /dev/null 2>&1 +MATRIX_FILE=$(find_matrix_file "$REPO_SUBDIR_DIR") + +if [[ -n "$MATRIX_FILE" ]] && jq -e '.plan.task_breakdown_status == "parsed" and .metadata.seed_task_count == 2 and (.tasks | length) == 2' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "setup launched from subdirectory still resolves copied plan backup for matrix seed" +else + fail "setup launched from subdirectory still resolves copied plan backup for matrix seed" "parsed task breakdown from copied backup plan" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +# ======================================== +# Test 2: Missing task breakdown still creates valid empty matrix +# ======================================== + +setup_matrix_test_dir +create_repo_with_plan "$TEST_DIR/repo-missing" '# No Task Table Plan + +## Goal +Still create a matrix. 
+ +## Acceptance Criteria +- AC-1: Setup succeeds +- AC-2: Matrix is valid +' +create_mock_codex "$TEST_DIR/repo-missing/bin" + +run_setup "$TEST_DIR/repo-missing" plans/plan.md > /dev/null 2>&1 +MATRIX_FILE=$(find_matrix_file "$TEST_DIR/repo-missing") + +if [[ -n "$MATRIX_FILE" ]] && jq -e '.plan.task_breakdown_status == "missing" and .metadata.seed_task_count == 0 and (.tasks | length) == 0' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "missing task table produces empty valid matrix" +else + fail "missing task table produces empty valid matrix" "missing status with zero tasks" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +# ======================================== +# Test 3: Malformed task breakdown degrades safely +# ======================================== + +setup_matrix_test_dir +create_repo_with_plan "$TEST_DIR/repo-malformed" '# Broken Task Table Plan + +## Goal +Do not write invalid matrix JSON. + +## Acceptance Criteria +- AC-1: Setup succeeds safely +- AC-2: Matrix reports malformed task table + +## Task Breakdown +| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On | +|---------|-------------|-----------|----------------------------|------------| +| task1 | Broken tag row | AC-1 | codng | - +' +create_mock_codex "$TEST_DIR/repo-malformed/bin" + +run_setup "$TEST_DIR/repo-malformed" plans/plan.md > /dev/null 2>&1 +MATRIX_FILE=$(find_matrix_file "$TEST_DIR/repo-malformed") + +if [[ -n "$MATRIX_FILE" ]] && jq -e '.plan.task_breakdown_status == "malformed" and .metadata.seed_task_count == 0 and (.tasks | length) == 0 and (.plan.warnings | length) >= 1' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "malformed task table produces warning-backed empty matrix" +else + fail "malformed task table produces warning-backed empty matrix" "malformed status with warnings and zero tasks" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +INVALID_MATRIX="$TEST_DIR/invalid-structure.json" +cat > "$INVALID_MATRIX" << 
'EOF' +{ + "schema_version": 2, + "plan": { + "task_breakdown_status": "parsed", + "warnings": [] + }, + "runtime": { + "mode": "implementation", + "current_round": 0 + }, + "manager": { + "role": "top_level_session", + "authority_mode": "manager_reconcile", + "authoritative_writer": "manager", + "current_primary_task_id": "task1", + "last_reconciled_at": null + }, + "feedback": { + "execution": [], + "review": [] + }, + "tasks": [1], + "events": [], + "oversight": { + "status": "idle", + "last_action": "none" + } +} +EOF + +if ! scenario_matrix_validate_file "$INVALID_MATRIX"; then + pass "scenario matrix validation rejects structurally invalid task entries" +else + fail "scenario matrix validation rejects structurally invalid task entries" "invalid tasks array should fail validation" "$(cat "$INVALID_MATRIX")" +fi + +INVALID_AUTHORITY_MATRIX="$TEST_DIR/invalid-authority.json" +VALID_SEEDED_MATRIX=$(find_matrix_file "$REPO_VALID_DIR") +jq ' + .tasks = [ + .tasks[0], + ( + .tasks[0] + | .id = "task-conflict" + | .title = "Conflicting mainline" + | .owner = "manager" + ) + ] +' "$VALID_SEEDED_MATRIX" > "$INVALID_AUTHORITY_MATRIX" + +if ! scenario_matrix_validate_file "$INVALID_AUTHORITY_MATRIX"; then + pass "scenario matrix validation rejects contradictory manager ownership and multiple active mainlines" +else + fail "scenario matrix validation rejects contradictory manager ownership and multiple active mainlines" "invalid owner and duplicate active mainline should fail validation" "$(cat "$INVALID_AUTHORITY_MATRIX")" +fi + +FEEDBACK_MATRIX="$TEST_DIR/feedback-matrix.json" +cp "$VALID_SEEDED_MATRIX" "$FEEDBACK_MATRIX" +scenario_matrix_record_execution_feedback "$FEEDBACK_MATRIX" "task2" "subagent-1" "state_suggestion" "Suggest keeping validator work queued behind parser repair." +scenario_matrix_record_review_feedback "$FEEDBACK_MATRIX" "task1" "review-agent" "cluster_hint" "This issue likely belongs to the parser-contract repair wave." 
+ +if jq -e ' + .manager.current_primary_task_id == "task1" + and .tasks[0].state == "ready" + and .tasks[1].state == "pending" + and (.feedback.execution | length) == 1 + and .feedback.execution[0].authoritative == false + and .feedback.execution[0].task_id == "task2" + and .feedback.execution[0].suggested_by == "subagent-1" + and (.feedback.review | length) == 1 + and .feedback.review[0].authoritative == false + and .feedback.review[0].task_id == "task1" + and .feedback.review[0].kind == "cluster_hint" +' "$FEEDBACK_MATRIX" >/dev/null 2>&1; then + pass "non-authoritative subagent and review feedback stay in feedback queues without mutating task state" +else + fail "non-authoritative subagent and review feedback stay in feedback queues without mutating task state" "feedback queues updated while authoritative task state stays unchanged" "$(cat "$FEEDBACK_MATRIX")" +fi + +# ======================================== +# Test 4: Skip-impl without a plan uses not_applicable seed mode +# ======================================== + +setup_matrix_test_dir +init_test_git_repo "$TEST_DIR/repo-skip" +cat > "$TEST_DIR/repo-skip/.gitignore" << 'EOF' +.humanize/ +bin/ +.cache/ +EOF +git -C "$TEST_DIR/repo-skip" add .gitignore +git -C "$TEST_DIR/repo-skip" commit -q -m "Add gitignore for skip-impl matrix test" +create_mock_codex "$TEST_DIR/repo-skip/bin" + +run_setup "$TEST_DIR/repo-skip" --skip-impl > /dev/null 2>&1 +MATRIX_FILE=$(find_matrix_file "$TEST_DIR/repo-skip") + +if [[ -n "$MATRIX_FILE" ]] && jq -e '.runtime.mode == "skip_impl" and .plan.task_breakdown_status == "not_applicable" and .metadata.seed_source == "not_applicable" and (.tasks | length) == 0' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "skip-impl without plan creates not_applicable matrix scaffold" +else + fail "skip-impl without plan creates not_applicable matrix scaffold" "skip_impl mode with not_applicable seed" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing file')" +fi + +PROMPT_FILE=$(find 
"$REPO_VALID_DIR/.humanize/rlcr" -name "round-0-prompt.md" -type f | head -1) +if [[ -n "$PROMPT_FILE" ]] && grep -q "Scenario Matrix Setup" "$PROMPT_FILE" && grep -q "scenario-matrix.json" "$PROMPT_FILE"; then + pass "round-0 prompt includes scenario matrix setup guidance" +else + fail "round-0 prompt includes scenario matrix setup guidance" "scenario matrix setup section in prompt" "$(cat "$PROMPT_FILE" 2>/dev/null || echo 'missing prompt')" +fi + +if [[ -n "$PROMPT_FILE" ]] && \ + grep -q '^## Current Task Packet$' "$PROMPT_FILE" && \ + grep -q 'Primary Objective: `task1`' "$PROMPT_FILE" && \ + grep -q '^## Manager Checkpoint$' "$PROMPT_FILE" && \ + grep -q '^## Task Packet Feedback Readback$' "$PROMPT_FILE" && \ + grep -q 'manager-issued scope' "$PROMPT_FILE"; then + pass "round-0 prompt includes current task packet, manager checkpoint, and scope authority note" +else + fail "round-0 prompt includes current task packet, manager checkpoint, and scope authority note" "round-0 prompt with task packet, checkpoint, feedback readback, and manager-issued scope guidance" "$(cat "$PROMPT_FILE" 2>/dev/null || echo 'missing prompt')" +fi + +SUMMARY_FILE=$(find "$REPO_VALID_DIR/.humanize/rlcr" -name "round-0-summary.md" -type f | head -1) +if [[ -n "$SUMMARY_FILE" ]] && \ + grep -q '^## Task Packet Feedback (Optional)$' "$SUMMARY_FILE" && \ + grep -q '^| Task ID | Source | Kind | Summary |$' "$SUMMARY_FILE"; then + pass "summary scaffold includes task packet feedback table" +else + fail "summary scaffold includes task packet feedback table" "summary scaffold with optional task packet feedback section" "$(cat "$SUMMARY_FILE" 2>/dev/null || echo 'missing summary')" +fi + +# ======================================== +# Test 5: Stop hook refreshes matrix and next-round prompt after implementation review +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-impl" 2 false true +create_mock_codex 
"$TEST_DIR/repo-hook-impl/bin" "## Review Feedback + +## Touched Failure Surfaces +- dependency-contract | why: parser and downstream validator drifted together | confidence: high +- rollback-symmetry | why: replanning touches recovery and invalidation paths | confidence: medium + +## Likely Sibling Risks +- Validator follow-up may still assume the old parser shape | derived_from: dependency-contract | axis: adjacent state transitions | why: downstream synchronization already regressed once | check: audit the validator update path and stale assumptions | confidence: high + +## Mainline Gaps +- [P1] Dependency mismatch still breaks review. + +Mainline Progress Verdict: REGRESSED + +An upstream dependency changed and downstream work must be replanned. + +## Coverage Ledger +| Surface | Status | Notes | +|---------|--------|-------| +| dependency-contract | covered | checked the parser and downstream validator contract edges | +| rollback-symmetry | partial | replanning path inspected, but downstream restore path still needs follow-up | + +CONTINUE" + +HOOK_INPUT='{"stop_hook_active": false, "transcript": [], "session_id": ""}' +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-impl/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-impl" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +NEXT_PROMPT="$TEST_DIR/repo-hook-impl/.humanize/rlcr/2024-03-01_12-00-00/round-3-prompt.md" +MATRIX_FILE="$TEST_DIR/repo-hook-impl/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" +GOAL_TRACKER_FILE="$TEST_DIR/repo-hook-impl/.humanize/rlcr/2024-03-01_12-00-00/goal-tracker.md" +NEXT_CONTRACT="$TEST_DIR/repo-hook-impl/.humanize/rlcr/2024-03-01_12-00-00/round-3-contract.md" + +if [[ -f "$NEXT_PROMPT" ]] && grep -q "Scenario Matrix Re-anchor" "$NEXT_PROMPT" && grep -q "Current matrix mainline projection" "$NEXT_PROMPT"; then + pass "implementation follow-up prompt includes scenario matrix re-anchor" +else + fail "implementation follow-up prompt includes scenario matrix re-anchor" 
"scenario matrix section in round-3 prompt" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if [[ -f "$NEXT_PROMPT" ]] && \ + grep -q '^## Current Task Packet$' "$NEXT_PROMPT" && \ + grep -q '^## Manager Checkpoint$' "$NEXT_PROMPT" && \ + grep -q '^## Recent Review Coverage$' "$NEXT_PROMPT" && \ + grep -q '^## Task Packet Feedback Readback$' "$NEXT_PROMPT"; then + pass "implementation follow-up prompt includes task packet projection, recent review coverage, and readback instructions" +else + fail "implementation follow-up prompt includes task packet projection, recent review coverage, and readback instructions" "task packet, checkpoint, recent review coverage, and feedback readback sections in round-3 prompt" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if [[ -f "$GOAL_TRACKER_FILE" ]] && \ + grep -q 'Repair parser contract' "$GOAL_TRACKER_FILE" && \ + grep -q 'Update downstream validator \[task2\]' "$GOAL_TRACKER_FILE"; then + pass "implementation review syncs matrix projection back into goal tracker" +else + fail "implementation review syncs matrix projection back into goal tracker" "tracker shows mainline and blocking projection" "$(cat "$GOAL_TRACKER_FILE" 2>/dev/null || echo 'missing tracker')" +fi + +if [[ -f "$NEXT_CONTRACT" ]] && \ + grep -q 'Mainline Objective: task1: Repair parser contract' "$NEXT_CONTRACT" && \ + grep -q 'Checkpoint: checkpoint-' "$NEXT_CONTRACT" && \ + grep -q 'Residual Risk: score=' "$NEXT_CONTRACT" && \ + grep -q 'Blocking Side Issues In Scope: task2: Update downstream validator' "$NEXT_CONTRACT"; then + pass "implementation review writes next-round contract scaffold from scenario matrix" +else + fail "implementation review writes next-round contract scaffold from scenario matrix" "matrix-derived next-round contract" "$(cat "$NEXT_CONTRACT" 2>/dev/null || echo 'missing contract')" +fi + +if jq -e ' + .runtime.current_round == 3 + and .runtime.last_review.phase == "implementation" + and 
.runtime.last_review.verdict == "regressed" + and .runtime.last_review.coverage_available == true + and .runtime.last_review.coverage_summary.surface_count == 2 + and .runtime.last_review.coverage_summary.sibling_risk_count == 1 + and .runtime.last_review.coverage_summary.partial_or_unclear_count == 1 + and .runtime.review_coverage.source_phase == "implementation" + and .runtime.review_coverage.source_round == 3 + and ([.runtime.review_coverage.touched_failure_surfaces[] | select(.surface == "dependency-contract" and .confidence == "high")] | length) == 1 + and ([.runtime.review_coverage.likely_sibling_risks[] | select(.derived_from == "dependency-contract" and .confidence == "high")] | length) == 1 + and ([.runtime.review_coverage.coverage_ledger[] | select(.surface == "rollback-symmetry" and .status == "partial")] | length) == 1 + and (.events | length) >= 1 + and .tasks[0].state == "needs_replan" + and .tasks[1].state == "blocked" + and .oversight.status == "idle" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "implementation review stores review coverage, event log, and dependent task states" +else + fail "implementation review stores review coverage, event log, and dependent task states" "round advanced with review coverage snapshot and replanning states" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5a: Paragraph-format coverage ledgers are parsed into retained review coverage +# ======================================== + +PARAGRAPH_COVERAGE_JSON=$(scenario_matrix_extract_review_coverage_json $'## Touched Failure Surfaces\n- rollback-symmetry | why: cancellation and restore touch the same state machine | confidence: high\n\n## Coverage Ledger\nrollback-symmetry: partial. cancel path checked; restore path still unclear.\n\nresource-cleanup is covered. 
release path and rollback path were both inspected.\n\nCONTINUE') + +if jq -e ' + ([.touched_failure_surfaces[] | select(.surface == "rollback-symmetry" and .confidence == "high")] | length) == 1 + and ([.coverage_ledger[] | select(.surface == "rollback-symmetry" and .status == "partial")] | length) == 1 + and ([.coverage_ledger[] | select(.surface == "resource-cleanup" and .status == "covered")] | length) == 1 + and .summary.covered_count == 1 + and .summary.partial_or_unclear_count == 1 +' <<< "$PARAGRAPH_COVERAGE_JSON" >/dev/null 2>&1; then + pass "paragraph-format coverage ledgers are parsed into structured review coverage" +else + fail "paragraph-format coverage ledgers are parsed into structured review coverage" "paragraph coverage entries retained with correct counts" "$PARAGRAPH_COVERAGE_JSON" +fi + +# ======================================== +# Test 5aa: Code review cycle preserves the most recent implementation review coverage snapshot +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-review-coverage-preserve" 2 false true +MATRIX_FILE="$TEST_DIR/repo-review-coverage-preserve/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +scenario_matrix_apply_implementation_review "$MATRIX_FILE" 3 "stalled" $'## Touched Failure Surfaces\n- dependency-contract | why: parser and validator drifted together | confidence: high\n\n## Likely Sibling Risks\n- Validator update path may still rely on the old parser contract | derived_from: dependency-contract | axis: adjacent state transitions | why: downstream sync is brittle | check: inspect stale field assumptions | confidence: medium\n\nMainline Progress Verdict: STALLED\n\n## Coverage Ledger\n| Surface | Status | Notes |\n|---------|--------|-------|\n| dependency-contract | partial | parser repair checked; downstream validation path still open |\n\nCONTINUE' +scenario_matrix_record_code_review_cycle "$MATRIX_FILE" 4 "code_review_issues" + +if jq -e ' + 
.runtime.current_round == 4 + and .runtime.last_review.phase == "review" + and .runtime.last_review.verdict == "code_review_issues" + and .runtime.last_review.coverage_available == true + and .runtime.last_review.coverage_summary.surface_count == 1 + and .runtime.review_coverage.source_phase == "implementation" + and .runtime.review_coverage.source_round == 3 + and ([.runtime.review_coverage.coverage_ledger[] | select(.surface == "dependency-contract" and .status == "partial")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "code review cycle preserves the latest implementation review coverage snapshot" +else + fail "code review cycle preserves the latest implementation review coverage snapshot" "review phase keeps the last implementation coverage analysis intact" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5ab: COMPLETE -> code review handoff preserves the newest implementation review coverage snapshot +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-complete-handoff-coverage" 2 false true +create_mock_codex "$TEST_DIR/repo-complete-handoff-coverage/bin" $'## Review Feedback\n\n## Touched Failure Surfaces\n- handoff-hotspot | why: the final implementation review still touches the same rollback hotspot | confidence: high\n\n## Likely Sibling Risks\n- Restore path may still drift from cancel semantics | derived_from: handoff-hotspot | axis: symmetric paths | why: the hotspot already needed multiple follow-up adjustments | check: audit the rollback restore branch and neighboring call sites | confidence: high\n\nMainline Progress Verdict: ADVANCED\n\n## Coverage Ledger\nrollback-symmetry: partial. 
cancel path checked; restore path still unclear.\n\nCOMPLETE' $'[P1] rollback restore path still breaks review.\n' + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-complete-handoff-coverage/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-complete-handoff-coverage" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +NEXT_PROMPT="$TEST_DIR/repo-complete-handoff-coverage/.humanize/rlcr/2024-03-01_12-00-00/round-3-prompt.md" +MATRIX_FILE="$TEST_DIR/repo-complete-handoff-coverage/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +if [[ -f "$NEXT_PROMPT" ]] && \ + grep -q '^## Recent Review Coverage$' "$NEXT_PROMPT" && \ + grep -q 'handoff-hotspot' "$NEXT_PROMPT" && \ + grep -q 'rollback-symmetry' "$NEXT_PROMPT"; then + pass "review follow-up prompt keeps the newest implementation review coverage after COMPLETE handoff" +else + fail "review follow-up prompt keeps the newest implementation review coverage after COMPLETE handoff" "recent review coverage from the final implementation review in round-3 prompt" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if jq -e ' + .runtime.current_round == 3 + and .runtime.last_review.phase == "review" + and .runtime.last_review.verdict == "code_review_issues" + and .runtime.last_review.coverage_available == true + and .runtime.last_review.coverage_summary.surface_count == 1 + and .runtime.last_review.coverage_summary.partial_or_unclear_count == 1 + and .runtime.review_coverage.source_phase == "implementation" + and .runtime.review_coverage.source_round == 3 + and ([.runtime.review_coverage.touched_failure_surfaces[] | select(.surface == "handoff-hotspot" and .confidence == "high")] | length) == 1 + and ([.runtime.review_coverage.coverage_ledger[] | select(.surface == "rollback-symmetry" and .status == "partial")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "code-review follow-up retains the latest implementation review coverage after COMPLETE handoff" +else + fail "code-review follow-up retains the 
latest implementation review coverage after COMPLETE handoff" "review phase matrix keeps the final implementation review coverage snapshot" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5d: Summary task-packet feedback is ingested as non-authoritative execution feedback +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-feedback" 2 false true +cat >> "$TEST_DIR/repo-hook-feedback/.humanize/rlcr/2024-03-01_12-00-00/round-2-summary.md" << 'EOF' + +## Task Packet Feedback +| Task ID | Source | Kind | Summary | +|---------|--------|------|---------| +| task2 | subagent-validator | dependency_note | Validator work should stay queued until parser repair stabilizes. | +EOF +create_mock_codex "$TEST_DIR/repo-hook-feedback/bin" "## Review Feedback + +Mainline Progress Verdict: ADVANCED + +Continue with the current mainline. + +CONTINUE" + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-feedback/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-feedback" bash "$STOP_HOOK" > /dev/null 2>&1 || true +MATRIX_FILE="$TEST_DIR/repo-hook-feedback/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +if jq -e ' + (.feedback.execution | length) == 1 + and .feedback.execution[0].task_id == "task2" + and .feedback.execution[0].suggested_by == "subagent-validator" + and .feedback.execution[0].kind == "dependency_note" + and .feedback.execution[0].source_file == "round-2-summary.md" + and .feedback.execution[0].authoritative == false + and (.feedback.execution[0].summary | contains("queued until parser repair stabilizes")) + and .tasks[0].state == "in_progress" + and .tasks[1].state == "pending" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "stop hook ingests task packet feedback into non-authoritative execution queue" +else + fail "stop hook ingests task packet feedback into non-authoritative execution queue" "execution feedback entry with 
preserved authoritative task state" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5b: Recovered mainline clears stale replan state and reopens dependents +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-recovery" 2 false true +MATRIX_FILE="$TEST_DIR/repo-recovery/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +scenario_matrix_apply_implementation_review "$MATRIX_FILE" 3 "regressed" "An upstream dependency changed and downstream work must be replanned." +scenario_matrix_apply_implementation_review "$MATRIX_FILE" 4 "advanced" "Recovered the parser contract and resumed steady mainline progress." + +if jq -e ' + .runtime.current_round == 4 + and .runtime.last_review.phase == "implementation" + and .runtime.last_review.verdict == "advanced" + and .tasks[0].state == "in_progress" + and .tasks[0].health.stuck_score == 0 + and .tasks[0].strategy.repeated_failure_count == 0 + and .tasks[1].state == "pending" + and .oversight.status == "idle" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "advanced review clears stale needs_replan state and reopens dependent tasks" +else + fail "advanced review clears stale needs_replan state and reopens dependent tasks" "mainline in_progress with dependent task pending" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5e: Frontier reconcile promotes the next active task when the current primary is complete +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-frontier-reconcile" 2 false true +MATRIX_FILE="$TEST_DIR/repo-frontier-reconcile/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .tasks[0].state = "done" + | .tasks[0].health.last_progress_round = 2 + | .manager.current_primary_task_id = "task1" +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv 
"$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_reconcile_manager_state "$MATRIX_FILE" 3 "frontier_shift" + +if jq -e ' + .manager.current_primary_task_id == "task2" + and .runtime.checkpoint.primary_task_id == "task2" + and .runtime.checkpoint.frontier_changed == true + and .runtime.checkpoint.sequence == 1 + and .tasks[1].lane == "mainline" + and .runtime.convergence.status == "stabilizing" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "frontier reconcile promotes the next active task into the single primary objective" +else + fail "frontier reconcile promotes the next active task into the single primary objective" "task2 promoted to checkpoint primary" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5g: Frontier reconcile prefers runnable work over blocked follow-up +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-frontier-runnable" 2 false true +MATRIX_FILE="$TEST_DIR/repo-frontier-runnable/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .metadata.seed_task_count = 3 + | .tasks[0].state = "done" + | .tasks[0].health.last_progress_round = 2 + | .tasks[1].state = "blocked" + | .tasks += [ + ( + .tasks[1] + | .id = "task3" + | .title = "Finalize executor cleanup" + | .lane = "queued" + | .state = "ready" + | .depends_on = [] + | .metadata.seed_source = "fixture" + ) + ] + | .manager.current_primary_task_id = "task1" +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_reconcile_manager_state "$MATRIX_FILE" 3 "frontier_shift" + +if jq -e ' + .manager.current_primary_task_id == "task3" + and .runtime.checkpoint.primary_task_id == "task3" + and (.tasks[] | select(.id == "task3") | .lane == "mainline" and .state == "ready") + and (.tasks[] | select(.id == "task2") | .lane == "supporting" and .state == "blocked") +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "frontier 
reconcile prefers runnable work over blocked follow-up when selecting a new primary objective" +else + fail "frontier reconcile prefers runnable work over blocked follow-up when selecting a new primary objective" "task3 promoted while blocked task2 stays supporting" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5f: Convergence stabilizes when only deferred watchlist work remains +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-convergence" 2 false true +MATRIX_FILE="$TEST_DIR/repo-convergence/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .tasks[0].state = "done" + | .tasks[1].state = "deferred" + | .tasks[1].lane = "queued" + | .tasks[1].admission.status = "watchlist" + | .tasks[1].admission.reason = "deferred_by_manager" + | .tasks[1].metadata.deferred_since_round = 2 +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_reconcile_manager_state "$MATRIX_FILE" 3 "convergence_check" + +if jq -e ' + .manager.current_primary_task_id == null + and .runtime.convergence.status == "converged" + and .runtime.convergence.next_action == "prepare_closure" + and .runtime.convergence.must_fix_open_count == 0 + and .runtime.convergence.high_risk_open_count == 0 + and .runtime.convergence.active_task_count == 0 + and .runtime.convergence.watchlist_count == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "convergence reconcile recognizes when only deferred watchlist work remains" +else + fail "convergence reconcile recognizes when only deferred watchlist work remains" "converged runtime with only watchlist residue" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 5h: Blocked finding backlog prevents premature convergence +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo 
"$TEST_DIR/repo-convergence-finding-groups" 2 false true +MATRIX_FILE="$TEST_DIR/repo-convergence-finding-groups/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .tasks[0].state = "done" + | .tasks[1].state = "done" + | .tasks[1].lane = "queued" + | .manager.current_primary_task_id = null + | .runtime.checkpoint.primary_task_id = null +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "[P1] Downstream dependency mismatch still breaks review." +scenario_matrix_reconcile_manager_state "$MATRIX_FILE" 4 "finding_group_frontier" + +if jq -e ' + .manager.current_primary_task_id == null + and .runtime.convergence.status == "continue" + and .runtime.convergence.next_action == "advance_checkpoint" + and .runtime.convergence.must_fix_open_count == 1 + and .runtime.convergence.high_risk_open_count == 1 + and .runtime.convergence.active_task_count == 1 + and .runtime.convergence.watchlist_count == 0 + and ((.runtime.checkpoint.frontier_signature | fromjson | .blocked_finding_group_ids | length) == 1) +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "blocked grouped review backlog keeps the manager frontier open until it is resolved" +else + fail "blocked grouped review backlog keeps the manager frontier open until it is resolved" "continue convergence state with one blocked finding group in the checkpoint frontier" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi +# ======================================== +# Test 5c: Blocked dependents reopen once upstream starts advancing again +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-recovery-blocked" 2 false true +MATRIX_FILE="$TEST_DIR/repo-recovery-blocked/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +scenario_matrix_apply_implementation_review "$MATRIX_FILE" 3 "stalled" "The upstream dependency changed and downstream work is 
blocked." +scenario_matrix_apply_implementation_review "$MATRIX_FILE" 4 "advanced" "Recovered the parser contract and resumed steady mainline progress." + +if jq -e ' + .tasks[0].state == "in_progress" + and .tasks[1].state == "pending" +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "advanced review reopens previously blocked dependent tasks" +else + fail "advanced review reopens previously blocked dependent tasks" "dependent task returns to pending after upstream recovery" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 6: Repeated failures trigger bounded oversight intervention +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-oversight" 2 false true +MATRIX_FILE="$TEST_DIR/repo-hook-oversight/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" +jq ' + .tasks[0].strategy.repeated_failure_count = 1 + | .tasks[0].health.stuck_score = 1 +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" +create_mock_codex "$TEST_DIR/repo-hook-oversight/bin" "## Review Feedback + +Mainline Progress Verdict: REGRESSED + +The current approach is too broad. Split the recovery into smaller steps before editing more files. 
+ +CONTINUE" + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-oversight/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-oversight" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +NEXT_PROMPT="$TEST_DIR/repo-hook-oversight/.humanize/rlcr/2024-03-01_12-00-00/round-3-prompt.md" +MATRIX_FILE="$TEST_DIR/repo-hook-oversight/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +if [[ -f "$NEXT_PROMPT" ]] && grep -q "Oversight Intervention" "$NEXT_PROMPT" && grep -q 'Action: `split`' "$NEXT_PROMPT"; then + pass "repeated failures inject oversight intervention into next-round prompt" +else + fail "repeated failures inject oversight intervention into next-round prompt" "oversight section with split action" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if jq -e ' + .oversight.status == "active" + and .oversight.last_action == "split" + and .oversight.intervention.action == "split" + and .oversight.intervention.target_task_id == "task1" + and .tasks[0].strategy.method_switch_required == true + and .tasks[0].strategy.repeated_failure_count == 2 + and .tasks[0].health.stuck_score == 2 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "repeated failures persist oversight action and task health in matrix" +else + fail "repeated failures persist oversight action and task health in matrix" "active split intervention with incremented health counters" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 7: Missing required matrix blocks stop hook +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-missing" 1 false true +rm -f "$TEST_DIR/repo-hook-missing/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" +create_mock_codex "$TEST_DIR/repo-hook-missing/bin" + +OUTPUT=$(echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-missing/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-missing" bash "$STOP_HOOK" 2>&1 || true) +if echo 
"$OUTPUT" | grep -q "Scenario Matrix Missing"; then + pass "missing required matrix blocks stop hook" +else + fail "missing required matrix blocks stop hook" "Scenario Matrix Missing block" "$OUTPUT" +fi + +# ======================================== +# Test 8: Review-phase follow-up prompt includes scenario matrix re-anchor +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-review" 4 true true +create_mock_codex "$TEST_DIR/repo-hook-review/bin" "unused" "[P1] Dependency mismatch still breaks review." + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-review/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-review" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +NEXT_PROMPT="$TEST_DIR/repo-hook-review/.humanize/rlcr/2024-03-01_12-00-00/round-5-prompt.md" +MATRIX_FILE="$TEST_DIR/repo-hook-review/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +if [[ -f "$NEXT_PROMPT" ]] && grep -q "scenario matrix" "$NEXT_PROMPT" && grep -q "Scenario Matrix Re-anchor" "$NEXT_PROMPT" && grep -q 'manager-issued fix scope' "$NEXT_PROMPT"; then + pass "review-phase follow-up prompt includes scenario matrix guidance" +else + fail "review-phase follow-up prompt includes scenario matrix guidance" "scenario matrix text in round-5 prompt" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if [[ -f "$NEXT_PROMPT" ]] && ! 
grep -q 'Out Of Scope: task2: Update downstream validator' "$NEXT_PROMPT"; then + pass "review-phase task packet does not mark blocking in-scope work as out of scope" +else + fail "review-phase task packet does not mark blocking in-scope work as out of scope" "task packet without blocking task2 in Out Of Scope" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +if jq -e '.runtime.current_round == 5 and .runtime.last_review.phase == "review" and .runtime.last_review.verdict == "code_review_issues"' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "review-phase follow-up records review cycle in matrix runtime state" +else + fail "review-phase follow-up records review cycle in matrix runtime state" "round 5 review-phase runtime state" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +if jq -e ' + (.tasks | length) == 2 + and ( + .tasks[] + | select(.id == "task2") + | .state == "blocked" + and .risk_bucket == "high" + and .metadata.last_review_finding_key == "dependency-mismatch-still-breaks-review" + and .metadata.review_finding_keys == ["dependency-mismatch-still-breaks-review"] + ) + and (.raw_findings | length) == 1 + and ( + .raw_findings[0] + | .finding_key == "dependency-mismatch-still-breaks-review" + and .link_task_id == "task2" + and .cluster_id == "cluster-dependency-contract" + and .repair_wave_hint == "wave-r5-dependency-contract" + ) + and (.finding_groups | length) == 0 + and (.feedback.review | length) == 1 + and ([.events[] | select(.type == "review_finding" and .task_id == "task2")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "review-phase findings annotate linked tasks while keeping review findings out of the task graph" +else + fail "review-phase findings annotate linked tasks while keeping review findings out of the task graph" "task2 annotated plus one linked raw finding" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 9: 
Finalize phase does not require matrix artifact to complete +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-finalize" 2 false true +LOOP_DIR="$TEST_DIR/repo-hook-finalize/.humanize/rlcr/2024-03-01_12-00-00" +mv "$LOOP_DIR/state.md" "$LOOP_DIR/finalize-state.md" +cat > "$LOOP_DIR/finalize-summary.md" << 'EOF' +# Finalize Summary + +Ready to exit. +EOF +rm -f "$LOOP_DIR/scenario-matrix.json" + +OUTPUT=$(echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-finalize/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-finalize" bash "$STOP_HOOK" 2>&1 || true) +if [[ -f "$LOOP_DIR/complete-state.md" ]] && [[ ! -f "$LOOP_DIR/finalize-state.md" ]] && ! echo "$OUTPUT" | grep -q "Scenario Matrix Missing"; then + pass "finalize phase ignores missing matrix artifact" +else + fail "finalize phase ignores missing matrix artifact" "complete-state.md without matrix enforcement block" "$OUTPUT" +fi + +# ======================================== +# Test 10: Legacy loops do not receive matrix re-anchor instructions +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-legacy" 2 false false +STATE_FILE="$TEST_DIR/repo-hook-legacy/.humanize/rlcr/2024-03-01_12-00-00/state.md" +grep -v '^scenario_matrix_' "$STATE_FILE" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE" +rm -f "$TEST_DIR/repo-hook-legacy/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" +create_mock_codex "$TEST_DIR/repo-hook-legacy/bin" "## Review Feedback + +Mainline Progress Verdict: STALLED + +Please tighten the current implementation path. + +CONTINUE" + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-legacy/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-legacy" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +NEXT_PROMPT="$TEST_DIR/repo-hook-legacy/.humanize/rlcr/2024-03-01_12-00-00/round-3-prompt.md" +if [[ -f "$NEXT_PROMPT" ]] && ! 
grep -qi "scenario matrix" "$NEXT_PROMPT"; then + pass "legacy loops omit scenario matrix prompt guidance" +else + fail "legacy loops omit scenario matrix prompt guidance" "next-round prompt without scenario matrix references" "$(cat "$NEXT_PROMPT" 2>/dev/null || echo 'missing prompt')" +fi + +# ======================================== +# Test 11: Implementation review findings create structured tasks and watchlist work +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-hook-impl-findings" 2 false true +create_mock_codex "$TEST_DIR/repo-hook-impl-findings/bin" $'## Review Feedback\n\nMainline Progress Verdict: STALLED\n\n- [P1] Dependency mismatch still breaks review.\n- [P3] Monitor wording nit should stay deferred.\n\nCONTINUE' + +echo "$HOOK_INPUT" | PATH="$TEST_DIR/repo-hook-impl-findings/bin:$PATH" CLAUDE_PROJECT_DIR="$TEST_DIR/repo-hook-impl-findings" bash "$STOP_HOOK" > /dev/null 2>&1 || true + +MATRIX_FILE="$TEST_DIR/repo-hook-impl-findings/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" +if jq -e ' + (.runtime.last_review.phase == "implementation") + and ( + .tasks[] + | select(.id == "task2") + | .state == "blocked" + and .risk_bucket == "high" + and .metadata.last_review_finding_key == "dependency-mismatch-still-breaks-review" + ) + and (.raw_findings | length) == 2 + and ([.raw_findings[] | select(.link_task_id == "task2" and .finding_key == "dependency-mismatch-still-breaks-review")] | length) == 1 + and ([.raw_findings[] | select(.admission_status == "watchlist" and .state == "deferred" and .finding_key == "monitor-wording-nit-should-stay-deferred")] | length) == 1 + and ([.finding_groups[] | select(.state == "deferred" and .surface_key == "docs-cleanup")] | length) == 1 + and (.feedback.review | length) == 2 + and ([.events[] | select(.type == "review_finding")] | length) == 2 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "implementation review findings annotate linked tasks and 
defer grouped backlog entries" +else + fail "implementation review findings annotate linked tasks and defer grouped backlog entries" "linked blocking finding plus deferred cleanup backlog" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 12: Repeated findings dedupe and deferred projection stay stable +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-finding-dedupe" 2 false true +LOOP_DIR="$TEST_DIR/repo-finding-dedupe/.humanize/rlcr/2024-03-01_12-00-00" +GOAL_TRACKER_FILE="$LOOP_DIR/goal-tracker.md" +cat >> "$GOAL_TRACKER_FILE" << 'EOF' + +### Completed and Verified +| AC | Task | Completed Round | Verified Round | Evidence | +|----|------|-----------------|----------------|----------| + +### Explicitly Deferred +| Task | Original AC | Deferred Since | Justification | When to Reconsider | +|------|-------------|----------------|---------------|-------------------| +EOF + +MATRIX_FILE="$LOOP_DIR/scenario-matrix.json" +REVIEW_FINDINGS=$'[P1] Dependency mismatch still breaks review.\n[P3] Monitor wording nit should stay deferred.' 
+scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "$REVIEW_FINDINGS" +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 5 "review" "$REVIEW_FINDINGS" +scenario_matrix_sync_goal_tracker "$MATRIX_FILE" "$GOAL_TRACKER_FILE" + +if jq -e ' + (.tasks | length) == 2 + and (.raw_findings | length) == 2 + and ([.raw_findings[] | select(.finding_key == "dependency-mismatch-still-breaks-review" and .occurrence_count == 2 and .link_task_id == "task2")] | length) == 1 + and ([.raw_findings[] | select(.finding_key == "monitor-wording-nit-should-stay-deferred" and .occurrence_count == 2 and .admission_status == "watchlist")] | length) == 1 + and ([.finding_groups[] | select(.state == "deferred" and .finding_count == 1)] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "repeated findings dedupe into raw findings without duplicating executable tasks" +else + fail "repeated findings dedupe into raw findings without duplicating executable tasks" "two raw findings with occurrence_count=2 and one deferred backlog group" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +if grep -q '^### Explicitly Deferred$' "$GOAL_TRACKER_FILE" && grep -q 'Monitor wording nit should stay deferred' "$GOAL_TRACKER_FILE"; then + pass "deferred watchlist tasks project into the goal tracker deferred section" +else + fail "deferred watchlist tasks project into the goal tracker deferred section" "deferred section containing the watchlist finding" "$(cat "$GOAL_TRACKER_FILE" 2>/dev/null || echo 'missing tracker')" +fi + +if grep -q 'Docs cleanup backlog for Repair parser contract | AC-1 | 4 |' "$GOAL_TRACKER_FILE"; then + pass "deferred tracker projection preserves the original defer round" +else + fail "deferred tracker projection preserves the original defer round" "deferred row with Deferred Since = 4" "$(cat "$GOAL_TRACKER_FILE" 2>/dev/null || echo 'missing tracker')" +fi + +# ======================================== +# Test 12b: Linked watchlist findings 
still project into grouped deferred backlog +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-linked-watchlist-finding" 2 false true +LOOP_DIR="$TEST_DIR/repo-linked-watchlist-finding/.humanize/rlcr/2024-03-01_12-00-00" +MATRIX_FILE="$LOOP_DIR/scenario-matrix.json" +GOAL_TRACKER_FILE="$LOOP_DIR/goal-tracker.md" + +cat >> "$GOAL_TRACKER_FILE" << 'EOF' + +### Completed and Verified +| AC | Task | Completed Round | Verified Round | Evidence | +|----|------|-----------------|----------------|----------| + +### Explicitly Deferred +| Task | Original AC | Deferred Since | Justification | When to Reconsider | +|------|-------------|----------------|---------------|-------------------| +EOF + +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "[P3] task2 wording nit should stay deferred." +scenario_matrix_sync_goal_tracker "$MATRIX_FILE" "$GOAL_TRACKER_FILE" + +if jq -e ' + (.tasks | length) == 2 + and ([.raw_findings[] | select(.link_task_id == "task2" and .admission_status == "watchlist" and .state == "deferred")] | length) == 1 + and ([.finding_groups[] | select(.state == "deferred" and .related_task_ids == ["task2"] and .surface_key == "docs-cleanup")] | length) == 1 + and ([.feedback.review[] | select(.task_id == "task2" and .kind == "watchlist_finding")] | length) == 1 + and ( + .tasks[] + | select(.id == "task2") + | .state == "pending" + and .risk_bucket == "planned" + and .metadata.last_review_finding_key == "task2-wording-nit-should-stay-deferred" + ) +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "linked watchlist findings stay non-authoritative while still projecting into grouped deferred backlog" +else + fail "linked watchlist findings stay non-authoritative while still projecting into grouped deferred backlog" "one linked watchlist raw finding plus one deferred finding group for task2" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +if grep -q 'Docs cleanup backlog 
for Update downstream validator' "$GOAL_TRACKER_FILE"; then + pass "linked watchlist grouped backlog is visible in the goal tracker deferred section" +else + fail "linked watchlist grouped backlog is visible in the goal tracker deferred section" "goal tracker deferred row for linked watchlist backlog" "$(cat "$GOAL_TRACKER_FILE" 2>/dev/null || echo 'missing tracker')" +fi + +# ======================================== +# Test 13: Ambiguous dependency findings stay as standalone bounded work +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-finding-ambiguous" 2 false true +MATRIX_FILE="$TEST_DIR/repo-finding-ambiguous/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .metadata.seed_task_count = 3 + | .tasks += [{ + id: "task3", + title: "Refresh downstream serializer", + lane: "supporting", + routing: "coding", + owner: null, + scope: { + summary: "", + paths: [], + constraints: [] + }, + cluster_id: null, + repair_wave: null, + risk_bucket: "planned", + admission: { + status: "active", + reason: "fixture" + }, + authority: { + write_mode: "manager_only", + authoritative_source: "manager" + }, + target_ac: ["AC-1"], + depends_on: ["task1"], + state: "pending", + assumptions: [], + strategy: { + current: "refresh-serializer", + attempt_count: 0, + repeated_failure_count: 0, + method_switch_required: false + }, + health: { + stuck_score: 0, + last_progress_round: 0 + }, + metadata: { + seed_source: "fixture" + } + }] +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "[P1] Downstream dependency mismatch still breaks review." 
+ +if jq -e ' + (.tasks | length) == 3 + and ([.tasks[] | select(.id == "task2" or .id == "task3") | select((.metadata.last_review_finding_key // null) == null)] | length) == 2 + and (.raw_findings | length) == 1 + and ( + .raw_findings[0] + | .finding_key == "downstream-dependency-mismatch-still-breaks-review" + and .related_task_id == null + and .link_task_id == null + and .state == "blocked" + and .depends_on == ["task1"] + ) + and ([.finding_groups[] | select(.state == "blocked" and .related_task_ids == [])] | length) == 1 + and ([.feedback.review[] | select(.task_id == null and .kind == "structured_finding")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "ambiguous dependency findings stay in grouped backlog instead of mutating an arbitrary dependent" +else + fail "ambiguous dependency findings stay in grouped backlog instead of mutating an arbitrary dependent" "unchanged task2/task3 plus one blocked raw finding group" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 13b: Explicit descriptive task ids link findings back to existing tasks +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-finding-explicit-id" 2 false true +MATRIX_FILE="$TEST_DIR/repo-finding-explicit-id/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .manager.current_primary_task_id = "parser-fix" + | .runtime.checkpoint.primary_task_id = "parser-fix" + | .tasks[0].id = "parser-fix" + | .tasks[1].id = "validator-sync" + | .tasks[1].depends_on = ["parser-fix"] +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "[P1] parser-fix still breaks review." 
+ +if jq -e ' + (.tasks | length) == 2 + and ( + .tasks[] + | select(.id == "parser-fix") + | .lane == "mainline" + and .state == "blocked" + and .risk_bucket == "high" + and .metadata.last_review_finding_key == "parser-fix-still-breaks-review" + ) + and ([.events[] | select(.task_id == "parser-fix" and .type == "review_finding")] | length) == 1 + and ([.raw_findings[] | select(.link_task_id == "parser-fix" and .finding_key == "parser-fix-still-breaks-review")] | length) == 1 + and (.finding_groups | length) == 0 + and ([.feedback.review[] | select(.task_id == "parser-fix" and .kind == "structured_finding")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "explicit descriptive task ids annotate the referenced existing task without creating grouped backlog work" +else + fail "explicit descriptive task ids annotate the referenced existing task without creating grouped backlog work" "parser-fix updated in place with one linked raw finding and no finding group" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 13c: Legitimate task ids that start with finding-r are not migrated away +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-finding-prefix-plan-task" 2 false true +MATRIX_FILE="$TEST_DIR/repo-finding-prefix-plan-task/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .manager.current_primary_task_id = "finding-rules" + | .runtime.checkpoint.primary_task_id = "finding-rules" + | .tasks[0].id = "finding-rules" + | .tasks[0].title = "Define finding rules" + | .tasks[0].metadata.seed_source = "plan_task" + | .tasks[0].source = "plan" +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_ingest_review_findings "$MATRIX_FILE" 4 "review" "[P1] task2 dependency mismatch still breaks review." 
+ +if jq -e ' + (.tasks | length) == 2 + and ([.tasks[] | select(.id == "finding-rules" and .title == "Define finding rules")] | length) == 1 + and ([.tasks[] | select(.id == "task2" and .state == "blocked")] | length) == 1 + and ([.raw_findings[] | select(.link_task_id == "task2" and .finding_key == "task2-dependency-mismatch-still-breaks-review")] | length) == 1 +' "$MATRIX_FILE" >/dev/null 2>&1; then + pass "plan tasks with finding-r prefixes stay in the executable task graph during review ingestion" +else + fail "plan tasks with finding-r prefixes stay in the executable task graph during review ingestion" "finding-rules still present as a plan task plus one linked raw finding for task2" "$(cat "$MATRIX_FILE" 2>/dev/null || echo 'missing matrix')" +fi + +# ======================================== +# Test 13d: Supporting-window tasks do not also appear as queued out-of-scope work +# ======================================== + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-contract-supporting-window" 2 false true +MATRIX_FILE="$TEST_DIR/repo-contract-supporting-window/.humanize/rlcr/2024-03-01_12-00-00/scenario-matrix.json" + +jq ' + .tasks[0].repair_wave = "wave-r2-parser-contract" + | .tasks[1].repair_wave = "wave-r2-parser-contract" + | .tasks[1].state = "ready" + | .tasks[1].lane = "queued" +' "$MATRIX_FILE" > "$MATRIX_FILE.tmp" && mv "$MATRIX_FILE.tmp" "$MATRIX_FILE" + +scenario_matrix_reconcile_manager_state "$MATRIX_FILE" 2 "supporting_window_projection" +CONTRACT_OUTPUT=$(scenario_matrix_render_round_contract "$MATRIX_FILE" 2 "implementation") + +if echo "$CONTRACT_OUTPUT" | grep -q 'Supporting Window In Scope: task2: Update downstream validator' && \ + ! 
echo "$CONTRACT_OUTPUT" | grep -q 'Queued Side Issues Out of Scope: .*task2: Update downstream validator'; then + pass "round contract excludes supporting-window tasks from queued out-of-scope projection" +else + fail "round contract excludes supporting-window tasks from queued out-of-scope projection" "task2 only listed in Supporting Window In Scope" "$CONTRACT_OUTPUT" +fi + +# ======================================== +# Test 14: Monitor helper reports matrix and legacy status safely +# ======================================== + +VALID_SESSION_DIR=$(find "$REPO_VALID_DIR/.humanize/rlcr" -mindepth 1 -maxdepth 1 -type d | head -1) +VALID_STATE_FILE="$VALID_SESSION_DIR/state.md" +VALID_MONITOR_OUTPUT=$(SESSION_DIR="$VALID_SESSION_DIR" STATE_FILE="$VALID_STATE_FILE" HUMANIZE_SCRIPT="$HUMANIZE_SCRIPT" bash -lc 'source "$HUMANIZE_SCRIPT"; humanize_parse_scenario_matrix "$SESSION_DIR" "$STATE_FILE"') + +if echo "$VALID_MONITOR_OUTPUT" | grep -q '^ready|2|task1 - Implement parser \[state=ready, routing=coding\]|idle|none|checkpoint-1|continue|advance_checkpoint|none$'; then + pass "monitor helper reports ready matrix state, checkpoint, and convergence for new loops" +else + fail "monitor helper reports ready matrix state, checkpoint, and convergence for new loops" "ready matrix snapshot with checkpoint and convergence fields" "$VALID_MONITOR_OUTPUT" +fi + +setup_matrix_test_dir +setup_manual_loop_repo "$TEST_DIR/repo-monitor-legacy" 2 false false +LEGACY_SESSION_DIR="$TEST_DIR/repo-monitor-legacy/.humanize/rlcr/2024-03-01_12-00-00" +LEGACY_STATE_FILE="$LEGACY_SESSION_DIR/state.md" +grep -v '^scenario_matrix_' "$LEGACY_STATE_FILE" > "$LEGACY_STATE_FILE.tmp" && mv "$LEGACY_STATE_FILE.tmp" "$LEGACY_STATE_FILE" +rm -f "$LEGACY_SESSION_DIR/scenario-matrix.json" +LEGACY_MONITOR_OUTPUT=$(SESSION_DIR="$LEGACY_SESSION_DIR" STATE_FILE="$LEGACY_STATE_FILE" HUMANIZE_SCRIPT="$HUMANIZE_SCRIPT" bash -lc 'source "$HUMANIZE_SCRIPT"; humanize_parse_scenario_matrix "$SESSION_DIR" 
"$STATE_FILE"') + +if echo "$LEGACY_MONITOR_OUTPUT" | grep -q '^legacy|0|Legacy loop without scenario matrix\.|idle|none|n/a|legacy|n/a|none$'; then + pass "monitor helper treats pre-matrix loops as legacy instead of missing" +else + fail "monitor helper treats pre-matrix loops as legacy instead of missing" "legacy matrix snapshot" "$LEGACY_MONITOR_OUTPUT" +fi + + +print_test_summary "Scenario Matrix Foundation Tests"