feat: add NAESatisfiability → SetSplitting reduction (#841) [full pipeline v2]#985
Closed
feat: add NAESatisfiability → SetSplitting reduction (#841) [full pipeline v2]#985
Conversation
Typst: Construction (3 steps) + Correctness (⟹ + ⟸) + Extraction + Overhead + YES/NO examples Constructor: 32,513 checks, 0 failures (exhaustive n ≤ 5, 7 sections) Adversary: 46,774 checks, 0 failures (independent impl + hypothesis) Cross-comparison: 615 instances, all agree Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implemented via /verify-reduction -> /add-reduction pipeline. Includes: reduction rule, 7 tests, canonical example in example_db. Verification artifacts cleaned up (79K+ Python checks preceded this Rust impl). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #985 +/- ##
========================================
Coverage 98.03% 98.04%
========================================
Files 784 786 +2
Lines 82310 82461 +151
========================================
+ Hits 80695 80846 +151
Misses 1615 1615 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
zazabap
added a commit
that referenced
this pull request
Apr 1, 2026
Root cause: PRs #985 and #991 failed CI because: 1. Local clippy doesn't use -D warnings but CI does (caught needless_range_loop) 2. New reductions can create paths that dominate existing direct reductions (test_find_dominated_rules_returns_known_set has hardcoded known set) Added to Step 6: - Mandatory `cargo clippy -- -D warnings` (not just `cargo clippy`) - Mandatory `cargo test` (full suite, not filtered) - Explicit dominated-rules gate with fix instructions Added to Common Mistakes: - clippy without -D warnings - dominated rules test - skipping full cargo test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves clippy -D warnings failure on Rust 1.94 CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GiggleLiu
added a commit
that referenced
this pull request
Apr 6, 2026
* docs: add design spec for proposed reductions Typst note Covers 9 reductions: 2 NP-hardness chain extensions (#973, #198), 4 Tier 1a blocked issues (#379, #380, #888, #822), and 3 Tier 1b blocked issues (#892, #894, #890). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add verify-reduction skill for mathematical verification of reductions New skill: /verify-reduction <issue-number> End-to-end pipeline that takes a reduction rule issue and produces: 1. Typst proof (Construction/Correctness/Extraction/Overhead + YES/NO examples) 2. Python verification script (7 mandatory sections, ≥5000 checks, exhaustive n≤5) 3. Lean 4 lemmas (non-trivial structural proofs required) Follows issue-to-pr conventions: creates worktree, works in isolation, submits PR. Strict quality gates (zero tolerance): - No "trivial" category — every reduction ≥5000 checks - 7 mandatory Python sections including NO (infeasible) example - Non-trivial Lean required (rfl/omega tautologies rejected) - Zero hand-waving in Typst ("clearly", "obviously" → rejected) - Mandatory gap analysis: every proof claim must have a test - Self-review checklist with 20+ items across 4 categories Developed and validated through PR #975 (800K+ checks, 3 bugs caught) and tested on issues #868 (caught wrong example) and #841 (35K checks). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: add YAML frontmatter + professional tone to verify-reduction skill - Added frontmatter (name, description) matching other skills' convention - Toned down aggressive language ("ZERO TOLERANCE", "THE HARSHEST STEP", "NON-NEGOTIABLE") to professional but firm language - All quality gates unchanged — same strictness, better presentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: adversarial multi-agent verification in verify-reduction skill Replaces Lean-required gates with adversarial second agent: - Step 5: Adversary agent independently implements reduce() and extract_solution() from theorem statement only (not constructor's script) - Step 5c: Cross-comparison of both implementations on 1000 instances - Lean downgraded from required to optional - hypothesis property-based testing for n up to 50 - Quality gates: 2 independent scripts ≥5000 checks each + cross-comparison Design rationale (docs/superpowers/specs/2026-04-01-adversarial-verification-design.md): - Same agent writing proof + test is the #1 risk for AI verification - Two independent implementations agreeing > one + trivial Lean - Lean caught 0 bugs in PR #975; Python caught all 4 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: design spec for verify-reduction skill enhancements Typst↔Python auto-matching, test vectors JSON for downstream consumption by add-rule and review-pipeline, adversary tailoring by reduction type, compositional verification via pred CLI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: implementation plan for verify-reduction enhancements 5 tasks: update verify-reduction (Step 4.5 auto-matching, Step 5 typed adversary, Step 8 downstream artifacts), create add-reduction skill, register in CLAUDE.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: enhance verify-reduction with test vectors export, typed adversary, pipeline integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: create add-reduction skill — consumes verified artifacts from verify-reduction * feat: register add-reduction skill in CLAUDE.md, update verify-reduction description Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: three improvements to verify-reduction and add-reduction skills 1. verify-reduction Step 1: type compatibility gate — checks source/target Value types before proceeding. Stops and comments on issue if types are incompatible (e.g., optimization → decision needs K parameter). 2. add-reduction Step 7: mandatory cleanup of verification artifacts from docs/paper/verify-reductions/ — Python scripts, JSON, Typst, PDF must not get into the library. 3. add-reduction Steps 4/4b/5: mandatory requirements from #974 — canonical example in rule_builders.rs (Check 9), example-db lookup test (Check 10), paper reduction-rule entry (Check 11). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: concise verify-reduction (761→124 lines) + self-contained add-reduction verify-reduction: removed verbose templates, condensed checklists into prose, kept all requirements but removed boilerplate code blocks that the agent can derive from context. add-reduction: integrated add-rule Steps 1-6, write-rule-in-paper Steps 1-6, and #974 requirements (Checks 9/10/11) into a single self-contained skill. No need to read 3 other skills. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: restore structural requirements in verify-reduction (124→274 lines) The previous rewrite over-condensed the skill, removing gates that agents need to follow: 7-section descriptions with table, minimum check count table, check count audit template, gap analysis format, common mistakes table, and self-review checklist with checkboxes. Restored: all structural gates and requirements. Kept concise: no verbose Python/Typst code templates (agent derives these). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden add-reduction with file-level verification gates Steps 4, 4b, 5 now have HARD GATE labels with verification commands that check the SPECIFIC required files appear in `git diff --name-only`. Step 8 has a pre-commit gate that lists all 6 required files and blocks commit if any is missing. Root cause: subagents skipped Steps 4 (put example in rule file instead of rule_builders.rs) and 5 (skipped paper entry entirely) because the skill said "MANDATORY" but had no mechanical enforcement. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add CI-equivalent checks to add-reduction pre-commit gate Root cause: PRs #985 and #991 failed CI because: 1. Local clippy doesn't use -D warnings but CI does (caught needless_range_loop) 2. New reductions can create paths that dominate existing direct reductions (test_find_dominated_rules_returns_known_set has hardcoded known set) Added to Step 6: - Mandatory `cargo clippy -- -D warnings` (not just `cargo clippy`) - Mandatory `cargo test` (full suite, not filtered) - Explicit dominated-rules gate with fix instructions Added to Common Mistakes: - clippy without -D warnings - dominated rules test - skipping full cargo test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: correct add-reduction HARD GATE for canonical examples rule_builders.rs is a 4-line pass-through — canonical examples are registered via canonical_rule_example_specs() in each rule file, wired through mod.rs. Updated Step 4 to match actual architecture. Also added analysis.rs to git add list (for dominated-rules updates). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: parent-side verification + pre-commit hook for add-reduction Two enforcement mechanisms that don't rely on subagent compliance: 1. Parent-side verification (Step 8a): After subagent reports DONE, the parent runs file gate checks independently. If any required file is missing, sends the subagent back — doesn't trust self-report. 2. Pre-commit hook (.claude/hooks/add-reduction-precommit.sh): Mechanically blocks commits of new rule files unless example_db.rs, reductions.typ, and mod.rs are also staged. Subagents cannot bypass. Root cause: subagents skip HARD GATE steps despite skill text saying "MANDATORY". Text-based enforcement doesn't work — need mechanical checks that run after the subagent, not instructions the subagent reads. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: strengthen type compatibility gate in verify-reduction skill Expanded the type compatibility table to explicitly list all incompatible pairs (Min→Or, Max→Or, Max→Min, Min→Max, Or→Sum, etc.) with concrete examples from batch verification (#198 MVC→HamCircuit, #890 MaxCut→OLA). Added common mistake entry for proceeding past the type gate. Learned from batch run: 5 out of 50 reductions were mathematically verified but turned out to be unimplementable as ReduceTo due to type mismatches that the original gate didn't catch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Delete docs/superpowers/plans/2026-04-01-verify-reduction-enhancements.md * Delete docs/superpowers/specs/2026-03-31-proposed-reductions-note-design.md * Delete .claude/CLAUDE.md * Revert "Delete .claude/CLAUDE.md" This reverts commit 71c1444. * chore: remove docs/superpowers/specs/ directory Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: integrate verify-reduction into add-rule pipeline - Delete /add-reduction skill and pre-commit hook (absorbed into /add-rule) - /add-rule now runs /verify-reduction by default; --no-verify to skip - /verify-reduction simplified: no worktree, no PR, no saved artifacts - /issue-to-pr passes --no-verify flag through to /add-rule - Update CLAUDE.md skill descriptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Jinguo Liu <cacate0129@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements NAESatisfiability → SetSplitting using the updated
/verify-reduction→/add-reductionpipeline. Compare with PR #984 (old pipeline) to see the differences.What's new vs PR #984
Or→Orverified in Step 1rule_builders.rsexample_db.rsCommit structure
168f6be— verification artifacts (Typst proof + Python scripts from PR docs: verify reduction #841 — NAESatisfiability → SetSplitting #980)17e9bca— Rust implementation + example_db + tests + artifact cleanupThe second commit shows the clean pattern: -6942 lines (artifacts removed) + 324 lines (Rust added).
Pipeline flow demonstrated
Test plan
cargo test naesatisfiability_to_setsplitting— 7 tests passcargo clippy— cleanCloses #841
🤖 Generated with Claude Code