Context
Nine-round RLCR session that converged successfully (all acceptance criteria met; code review ended with no [P0-9] issues remaining at round 8 of a 42-round budget). Methodology analysis surfaced eight project-agnostic improvements derived purely from per-round summaries and review results; no project-specific information was used.
Reviews were high-signal throughout. Every flagged gap was real, evidence-backed, and tied to a contract. The implementation side responded with focused fixes plus regression tests in the very next round; no rework cycles were needed for any reviewer-named gap.
Two structural patterns are worth improving for future RLCR runs on other projects:
- Early "scope underclaim → scope re-anchor": round 0 narrowed scope significantly more than the plan authorized, the reviewer rejected the narrowing, and the loop spent effort re-anchoring.
- Late "drip-fed review findings": rounds 5–8 each surfaced two to three correctness gaps in the same module, with each round closing one batch. In aggregate this looks like one pass of medium-depth review spread across four serial rounds. The implementation side never proactively self-audited the module to surface the remaining defects in one pass.
Methodology improvement suggestions
- Mandate a scope-claim audit subsection in every round contract before execution begins. The contract template should enumerate every plan task and tag each as `in-scope-this-round`, `in-scope-future-round`, or `explicitly-deferred-with-justification`. The reviewer's first job each round becomes ratifying the deferral list against the plan before any code is read. This catches unjustified narrowing at contract time rather than at review time.
- Add a contract field for a "deterministic gate metric" on every acceptance-criterion claim. For every AC the round claims to satisfy, name the exact deterministic gate (a number, a boolean, a count, an inequality) and where in the artifact it lives. This forces the implementation side to surface gating evidence at contract time; the reviewer can reject the contract before any implementation work happens if the gate is non-deterministic (e.g., "screenshots and counts" instead of "a numeric inequality").
- Require a pre-review self-audit pass on any module the round modifies, before submitting for external review. A structured self-audit checklist: for every public surface touched, list the failure modes not yet tested, and either test them now or queue them as known gaps with rationale. Machine-checkable rule: every newly added public surface must have at least one negative-path test. This consolidates findings and reduces the review rounds required to reach steady state.
- Lock summary files at round close and provide a separate corrigenda file for downstream corrections. When a later round identifies a factual error in an earlier summary, the correction is appended to a per-loop corrigenda artifact with a back-reference to the original line. Tooling treats corrigenda as authoritative for any downstream consumer. This avoids the "amend or restate" dilemma when an edit hook prevents amendment of prior-round summaries.
- Forbid the implementation side from declaring the plan "closed" in the shared tracker before the reviewer has ratified all gates. Only the reviewer can move a task from "active" to "closed-and-verified"; the implementation side can mark tasks "implemented, awaiting verification", which is informationally distinct. This avoids the "premature victory" pattern, where the tracker drifts to "all closed / no active tasks" before the reviewer has verified the closures.
- Add a "review depth budget" parameter to the loop to surface drip-fed findings earlier. Let the loop request a "deep audit of module X" rather than the default "review what was changed this round", and auto-trigger this mode when three consecutive review rounds find new issues in the same module. The implementation side can then close the surfaced backlog in one or two consolidated rounds rather than four serial single-batch rounds.
- Standardize a regression-coverage delta report in every summary, as a per-round structured field: tests added (count), surfaces covered (list), negative-path tests added (count), contract-locking tests added (count). This lets the reviewer quickly verify that a round's fixes are durably locked rather than implicitly hoping the next review pass will re-catch a regression. It is especially valuable in long tails of review-only rounds.
- Require the contract to explicitly classify each round as "feature work" versus "review-fix only", with different acceptance gates. Tag the round type in the contract; review-fix rounds focus primarily on "did the named findings get closed and locked with regression tests" rather than re-auditing scope. This lets the loop apply different stagnation heuristics to each round type: a long tail of review-fix rounds is a different signal than a long tail of feature rounds.
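The contract-side suggestions (scope-claim audit, deterministic gate metrics, round-type classification) can be combined into one machine-checkable contract schema. The sketch below is a hypothetical illustration in Python, not part of any existing RLCR tooling; all field names are assumptions, and the `explicitly-deferred` tag is shortened here, with the justification carried in a separate map:

```python
from dataclasses import dataclass, field

SCOPE_TAGS = {"in-scope-this-round", "in-scope-future-round", "explicitly-deferred"}
ROUND_TYPES = {"feature-work", "review-fix-only"}

@dataclass
class GateClaim:
    ac_id: str          # acceptance criterion the round claims to satisfy
    gate: str           # e.g. "error_count == 0" or "p95_latency_ms < 200"
    artifact_path: str  # where in the artifact the gating evidence lives

@dataclass
class RoundContract:
    round_type: str
    scope_tags: dict[str, str] = field(default_factory=dict)   # plan task -> tag
    deferral_justifications: dict[str, str] = field(default_factory=dict)
    gate_claims: list[GateClaim] = field(default_factory=list)

def validate_contract(contract: RoundContract, plan_tasks: set[str]) -> list[str]:
    """Reject a contract before any implementation work happens."""
    errors: list[str] = []
    if contract.round_type not in ROUND_TYPES:
        errors.append(f"unknown round type: {contract.round_type}")
    # Scope-claim audit: every plan task must carry a valid tag,
    # and every deferral must carry a justification.
    for task in plan_tasks:
        tag = contract.scope_tags.get(task)
        if tag is None:
            errors.append(f"plan task not tagged: {task}")
        elif tag not in SCOPE_TAGS:
            errors.append(f"invalid scope tag on {task}: {tag}")
        elif tag == "explicitly-deferred" and task not in contract.deferral_justifications:
            errors.append(f"deferred without justification: {task}")
    # Deterministic gate metric: crude check that the gate is a comparison,
    # not free-form evidence like "screenshots and counts".
    for claim in contract.gate_claims:
        if not any(op in claim.gate for op in ("==", "<", ">")):
            errors.append(f"non-deterministic gate for {claim.ac_id}: {claim.gate}")
        if not claim.artifact_path:
            errors.append(f"no artifact location for {claim.ac_id}")
    return errors
```

The reviewer's first job each round then reduces to running the validator and reading the deferral justifications, before any code is opened.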
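The review-depth-budget auto-trigger is a small heuristic over per-round review findings. A sketch under the assumptions above (three consecutive finding-rounds in the same module trips the trigger; the per-round data shape is hypothetical):

```python
from collections import defaultdict

DEEP_AUDIT_THRESHOLD = 3  # consecutive rounds with new findings in the same module

def modules_needing_deep_audit(findings_by_round: list[set[str]]) -> set[str]:
    """findings_by_round[i] = modules with new review findings in round i.
    Returns every module whose streak of consecutive finding-rounds reached
    the threshold at any point, i.e. candidates for a consolidated deep audit
    instead of further serial single-batch rounds."""
    streak: dict[str, int] = defaultdict(int)
    flagged: set[str] = set()
    for round_findings in findings_by_round:
        # A round with no new findings in a module resets its streak.
        for module in list(streak):
            if module not in round_findings:
                streak[module] = 0
        for module in round_findings:
            streak[module] += 1
            if streak[module] >= DEEP_AUDIT_THRESHOLD:
                flagged.add(module)
    return flagged
```

In the drip-fed pattern described above, the trigger would have fired at the third consecutive finding-round, converting the fourth serial round into one consolidated audit-and-close round.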
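The reviewer-only closure rule amounts to a small state machine on tracker tasks. A minimal sketch, assuming three states and two actor roles; all names are illustrative, not an existing tracker API:

```python
def advance(state: str, actor: str, approve: bool = True) -> str:
    """Advance a tracker task one step; only the reviewer can close.

    States: "active" -> "implemented-awaiting-verification" -> "closed-and-verified",
    with reviewer rejection sending the task back to "active".
    """
    if state == "active" and actor == "implementation":
        return "implemented-awaiting-verification"
    if state == "implemented-awaiting-verification" and actor == "reviewer":
        return "closed-and-verified" if approve else "active"
    # Everything else (notably: implementation trying to close) is forbidden.
    raise PermissionError(f"{actor!r} may not advance a task from {state!r}")
```

Because no transition lets the implementation actor reach "closed-and-verified", the tracker cannot drift to "all closed" without the reviewer ratifying each closure.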
How these suggestions were derived
Eight rounds of round-summary and round-review records were analyzed by an opus agent under strict sanitization rules (no file paths, no symbol names, no commit hashes, no business-domain terms, no code fragments). The full analysis report is preserved in the originating loop's artifact directory; this issue contains only the project-agnostic methodology recommendations.