feat(sumcheck): complete y-first cutover and recover perf #46

Open
quangvdao wants to merge 9 commits into main from quang/shifted-eq-dp
@quangvdao quangvdao commented Apr 7, 2026

Motivation

Technique 2's y-first cutover is now complete across both Stage 1 and Stage 2. We need to preserve the semantic switch to y-first ordering, keep the compact prefix fast paths, and recover the profiling regressions that showed up during the migration.

Summary

  • complete the Stage 1 plus Stage 2 y-first cutover, with no return to x-first compatibility shims
  • keep the compact witness and recursive opening-point flows aligned with y-first challenge ordering
  • recover the observed Stage 1 onehot regression by fusing the sparse y-stage full-table fold with next-round polynomial generation
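The fusion named in the last bullet can be illustrated on a dense toy table. This is a minimal sketch, not the PR's sparse onehot kernel: the modulus `P`, every function name, and the plain-sum round polynomial (g(0), g(1)) are assumptions chosen for illustration. The point it shows is that folding a table and computing the next round's polynomial can share one pass instead of two.

```rust
/// Toy prime modulus (2^61 - 1). A stand-in for the real field; assumption only.
const P: u64 = (1u64 << 61) - 1;

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    ((a as u128 + P as u128 - b as u128) % P as u128) as u64
}

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Unfused step 1: bind the lowest variable to `r`,
/// out[i] = t[2i] + r * (t[2i+1] - t[2i]).
fn fold(table: &[u64], r: u64) -> Vec<u64> {
    table
        .chunks_exact(2)
        .map(|p| add(p[0], mul(r, sub(p[1], p[0]))))
        .collect()
}

/// Unfused step 2: (g(0), g(1)) of a plain multilinear sum over the lowest variable.
fn round_poly(table: &[u64]) -> (u64, u64) {
    table
        .chunks_exact(2)
        .fold((0, 0), |(g0, g1), p| (add(g0, p[0]), add(g1, p[1])))
}

/// Fused version: a single pass over `table` emits the folded table and
/// accumulates the next round's (g(0), g(1)), so the folded output is never re-read.
/// Requires `table.len()` to be a multiple of 4.
fn fold_and_next_round(table: &[u64], r: u64) -> (Vec<u64>, (u64, u64)) {
    assert!(table.len() % 4 == 0, "need at least two variables left");
    let mut out = Vec::with_capacity(table.len() / 2);
    let (mut g0, mut g1) = (0u64, 0u64);
    for q in table.chunks_exact(4) {
        let lo = add(q[0], mul(r, sub(q[1], q[0]))); // folded entry with next bit = 0
        let hi = add(q[2], mul(r, sub(q[3], q[2]))); // folded entry with next bit = 1
        g0 = add(g0, lo);
        g1 = add(g1, hi);
        out.push(lo);
        out.push(hi);
    }
    (out, (g0, g1))
}
```

Calling `fold_and_next_round(&table, r)` should return the same pair as running `fold` followed by `round_poly`, while touching the input only once.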

Key changes

  • reorder Stage 1 inputs to y-first at the call sites and make the recursive opening point carry y-first challenges directly
  • store compact witness evaluations as x-outer, y-inner slices and thread that layout through Stage 1, Stage 2, the tree backend, and two-round prefix reconstruction
  • keep verifier-side Stage 2 m(x) sourcing explicit via Stage2MEvalSource and y-first ring-switch outputs
  • add a fused Stage 1 sparse-x/y handoff so the y-first cutover keeps onehot performance close to or better than main without reverting semantics
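The x-outer, y-inner layout from the second bullet can be pictured as a flat buffer in which each fixed x owns one contiguous y-slice. The sketch below is hypothetical: `CompactTable`, `lerp`, the toy modulus `P`, and the choice to bind the lowest y variable first are illustrative assumptions, not the types or conventions used in this PR.

```rust
/// Toy prime modulus (2^61 - 1). A stand-in for the real field; assumption only.
const P: u64 = (1u64 << 61) - 1;

/// a + r * (b - a) mod P, computed in u128 to avoid overflow.
fn lerp(a: u64, b: u64, r: u64) -> u64 {
    let p = P as u128;
    let diff = (b as u128 + p - a as u128) % p;
    ((a as u128 + (r as u128 * diff) % p) % p) as u64
}

/// Hypothetical flat table: x is the outer (slow) index, y the inner (fast) index.
struct CompactTable {
    x_len: usize,
    y_len: usize,
    data: Vec<u64>,
}

impl CompactTable {
    /// Entry at (x, y) lives at offset x * y_len + y.
    fn get(&self, x: usize, y: usize) -> u64 {
        self.data[x * self.y_len + y]
    }

    /// The contiguous y-slice for a fixed x.
    fn row(&self, x: usize) -> &[u64] {
        &self.data[x * self.y_len..(x + 1) * self.y_len]
    }

    /// Bind one y variable to `r` in every row, halving `y_len`.
    /// Because y is the inner index, each fold reads adjacent entries.
    fn bind_low_y(&self, r: u64) -> CompactTable {
        assert!(self.y_len % 2 == 0, "y_len must be even to bind a y variable");
        let mut data = Vec::with_capacity(self.data.len() / 2);
        for x in 0..self.x_len {
            for pair in self.row(x).chunks_exact(2) {
                data.push(lerp(pair[0], pair[1], r));
            }
        }
        CompactTable { x_len: self.x_len, y_len: self.y_len / 2, data }
    }
}
```

With this layout a y-first round never strides across rows, which is one plausible reason to store y innermost when y challenges are bound before x challenges.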

Performance

Benchmarks on the local profiling machine, using warmed reruns against a clean main worktree:

  • onehot nv32: commit 0.290s -> 0.190s, prove 0.833s -> 0.748s, verify 0.105s -> 0.095s
  • dense nv26: commit 8.640s -> 5.672s, prove 2.258s -> 1.830s, verify 0.0420s -> 0.0341s

Test coverage

  • cargo fmt -q
  • cargo clippy --all --message-format=short -q -- -D warnings
  • cargo test

Files/components impacted

  • src/algebra/shifted_eq.rs
  • src/protocol/commitment_scheme.rs
  • src/protocol/hachi_poly_ops/onehot.rs
  • src/protocol/ring_switch.rs
  • src/protocol/sumcheck/hachi_stage1.rs
  • src/protocol/sumcheck/hachi_stage1_tree.rs
  • src/protocol/sumcheck/hachi_stage2.rs
  • src/protocol/sumcheck/two_round_prefix.rs

Posted by Cursor assistant (model: GPT-5.4) on behalf of the user (Quang Dao) with approval.

Made with Cursor

Switch Stage 2 to the y-first witness layout and compute ring-switch
m(x) on demand so the verifier no longer materializes m_evals_x.

Keep Stage 1 x-first behind compatibility shims, including compact
witness transposes, two-round-prefix updates, and wiring-layer
challenge reordering, so recursive proofs continue to chain
correctly during the Phase 1 transition.

Made-with: Cursor
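
The idea of computing m(x) on demand rather than materializing `m_evals_x` can be sketched with a small enum. Everything here is hypothetical: `MEvalSourceSketch`, its variants, and `sum_m` are illustrative names and do not reflect the actual definition of `Stage2MEvalSource` in the repository.

```rust
/// Hypothetical stand-in for an "m(x) evaluation source".
/// NOT the PR's `Stage2MEvalSource`; variant names are assumptions.
enum MEvalSourceSketch<'a> {
    /// Evaluations were computed up front and stored in a table.
    Materialized(&'a [u64]),
    /// Evaluations are computed when requested, so no table is allocated.
    OnDemand(&'a dyn Fn(usize) -> u64),
}

impl MEvalSourceSketch<'_> {
    /// Returns m at index `x`, regardless of how it is sourced.
    fn eval(&self, x: usize) -> u64 {
        match self {
            MEvalSourceSketch::Materialized(table) => table[x],
            MEvalSourceSketch::OnDemand(f) => (*f)(x),
        }
    }
}

/// A consumer that only depends on `eval`, so callers choose the source explicitly.
fn sum_m(src: &MEvalSourceSketch<'_>, len: usize) -> u64 {
    (0..len).fold(0u64, |acc, x| acc.wrapping_add(src.eval(x)))
}
```

Making the source an explicit argument means a verifier-like caller can pass the on-demand variant and skip the allocation, while tests can still pass a precomputed table and compare results.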

cursor bot commented Apr 7, 2026

PR Summary

High Risk
High risk because it changes core cryptographic proving/verifying logic (sumcheck challenge ordering, witness table layout, and recursive opening-point derivation), where subtle ordering bugs could invalidate proofs or weaken soundness.

Overview
Completes the y-first challenge-order cutover across ring switch, Stage 1, Stage 2, and recursive level chaining, removing the prior x/y reordering shim and instead threading y-first challenges directly as the next-level opening point.

The compact witness layout is switched to x-outer/y-inner throughout, updating Stage 1/2 folding, prefix fast-paths, and the Stage1 tree backend accordingly; a new Stage 1 sparse x/y fusion path is added to regain performance after the ordering change. Verifier-side Stage 2 now takes an explicit Stage2MEvalSource for m(x) evals, ring-switch verifier input validation is tightened, and multiple tests are updated/added to lock in the new opening geometry and fused transitions.

Reviewed by Cursor Bugbot for commit 7350955. Bugbot is set up for automated code reviews on this repo.

github-actions bot commented Apr 7, 2026

Benchmark Report

  • Latest run: 91fb27c
  • Message: Merge 7350955 into 9c8bbaf
  • Ref: quang/shifted-eq-dp
  • Main baseline: 9c8bbaf from merge-base on main.
  • Previous run: 02c8b9d from the previous successful PR update.
  • Binary: target/release/examples/profile.
  • Memory: maximum resident set size from /usr/bin/time on the benchmark process.

Single Polynomial x 32 Variables

  • Benchmark: 1-of-256 one-hot with 32 variables
  • Sparsity: each polynomial is 1-of-256 one-hot (equivalently, 1-sparse over 256 slots, density 0.39%).
  • Command: target/release/examples/profile with HACHI_MODE=onehot HACHI_NUM_VARS=32 HACHI_NUM_POLYS=1 HACHI_PROFILE_TRACE=0 HACHI_PROFILE_SPAN_CLOSES=0 HACHI_PROFILE_LOG=info HACHI_PROFILE_ANSI=0.
| Metric | Main baseline | Previous run | Latest run | Unit |
|---|---|---|---|---|
| Setup | 3.716 | 4.016 | 3.774 | s |
| Commit | 1.759 | 1.826 | 1.765 | s |
| Prove (Hachi) | 4.077 | 4.624 | 4.529 | s |
| Prove (Total) | 4.077 | 4.624 | 4.529 | s |
| Verify (Hachi) | 0.177 | 0.198 | 0.181 | s |
| Verify (Total) | 0.177 | 0.198 | 0.181 | s |
| Max RSS | 2834.2 | 2834.5 | 2834.9 | MiB |
  • Proof size: 73,712 B
  • Hachi fold bytes: 34,288 B
  • Tail bytes: 39,424 B
  • Proof framing bytes: 0 B
  • Hachi levels: 7
  • Tail shape: 78,848 elems at 4 bits/elem
  • Observed terminal state: w_len=78,848 with log_basis=4
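Per-level parameters

| L | Config | D | nA | nB | nD | lb | l1 | m | r | δcommit | δopen | δfold | next w (ring) | next w (field) | planned bytes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L0 | D32-na3 | 32 | 3 | 2 | 2 | 2 | 256 | 16 | 11 | 1 | 64 | 11 | 1,245,760 | 39,864,320 | 4,672 B |
| L1 | D32-na2 | 32 | 2 | 2 | 2 | 2 | 256 | 13 | 8 | 1 | 64 | 10 | 98,334 | 3,146,688 | 4,352 B |
| L2 | D32-na2 | 32 | 2 | 2 | 2 | 3 | 256 | 11 | 6 | 1 | 43 | 6 | 17,822 | 570,304 | 4,832 B |
| L3 | D32-na2 | 32 | 2 | 2 | 2 | 4 | 256 | 10 | 5 | 1 | 32 | 5 | 6,113 | 195,616 | 5,216 B |
| L4 | D32-na2 | 32 | 2 | 2 | 2 | 4 | 256 | 9 | 4 | 1 | 32 | 5 | 3,707 | 118,624 | 5,072 B |
| L5 | D32-na2 | 32 | 2 | 2 | 2 | 4 | 256 | 9 | 3 | 1 | 32 | 4 | 2,880 | 92,160 | 5,072 B |
| L6 | D32-na2 | 32 | 2 | 2 | 2 | 4 | 256 | 9 | 3 | 1 | 32 | 4 | 2,464 | 78,848 | 5,072 B |

Per-level proof-size breakdown

| L | total | y_ring | v | stage1 sc | interstage | s_claim | stage2 sc | next_w_commit | next_w_eval |
|---|---|---|---|---|---|---|---|---|---|
| L0 | 4,672 B | 512 | 1,024 | 832 | 0 | 16 | 1,248 | 1,024 | 16 |
| L1 | 4,352 B | 512 | 1,024 | 704 | 0 | 16 | 1,056 | 1,024 | 16 |
| L2 | 4,832 B | 512 | 1,024 | 1,280 | 0 | 16 | 960 | 1,024 | 16 |
| L3 | 5,216 B | 512 | 1,024 | 1,728 | 32 | 16 | 864 | 1,024 | 16 |
| L4 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |
| L5 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |
| L6 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |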
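
The figures reported for the 32-variable run can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the report; the helper names are hypothetical, and the relation "next w (field) = next w (ring) × D" is an observation that happens to hold for the listed rows, not something the report states.

```rust
/// Bytes needed to pack `elems` values at `bits_per_elem` bits each (rounded up).
fn packed_bytes(elems: u64, bits_per_elem: u64) -> u64 {
    (elems * bits_per_elem + 7) / 8
}

/// Sum of the per-level component sizes, in bytes.
fn level_total(parts: &[u64]) -> u64 {
    parts.iter().sum()
}

/// Recomputes the headline numbers of the 32-variable report from its own tables.
fn report_is_consistent() -> bool {
    // Tail: 78,848 elements at 4 bits each.
    let tail = packed_bytes(78_848, 4);
    // L0 row of the proof-size breakdown.
    let l0 = level_total(&[512, 1_024, 832, 0, 16, 1_248, 1_024, 16]);
    // Per-level totals for L0..L6.
    let fold_bytes: u64 = [4_672u64, 4_352, 4_832, 5_216, 5_072, 5_072, 5_072]
        .iter()
        .sum();
    // Ring-to-field width at L0 with D = 32 (observed relation, see lead-in).
    let l0_field = 1_245_760u64 * 32;

    tail == 39_424
        && l0 == 4_672
        && fold_bytes == 34_288
        && fold_bytes + tail + 0 == 73_712 // framing bytes are reported as 0
        && l0_field == 39_864_320
}
```

The same check applied to the tail shape explains the reported 39,424 tail bytes: 78,848 four-bit elements pack into exactly half as many bytes.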

4 Polynomials x 30 Variables

  • Benchmark: 4 same-point 1-of-256 one-hot polynomials with 30 variables each
  • Batch: same-point opening of 4 polynomials, each with 30 variables.
  • Sparsity: each polynomial is 1-of-256 one-hot (equivalently, 1-sparse over 256 slots, density 0.39%).
  • Command: target/release/examples/profile with HACHI_MODE=onehot HACHI_NUM_VARS=30 HACHI_NUM_POLYS=4 HACHI_PROFILE_TRACE=0 HACHI_PROFILE_SPAN_CLOSES=0 HACHI_PROFILE_LOG=info HACHI_PROFILE_ANSI=0.
| Metric | Main baseline | Previous run | Latest run | Unit |
|---|---|---|---|---|
| Setup | 5.141 | 5.376 | 5.168 | s |
| Commit | 2.247 | 2.158 | 2.227 | s |
| Prove (Hachi) | 3.913 | 4.682 | 4.375 | s |
| Prove (Total) | 3.914 | 4.682 | 4.375 | s |
| Verify (Hachi) | 0.178 | 0.199 | 0.179 | s |
| Verify (Total) | 0.178 | 0.199 | 0.179 | s |
| Max RSS | 2892.0 | 2892.0 | 2892.5 | MiB |
  • Proof size: 75,248 B
  • Hachi fold bytes: 35,824 B
  • Tail bytes: 39,424 B
  • Proof framing bytes: 0 B
  • Hachi levels: 7
  • Tail shape: 78,848 elems at 4 bits/elem
  • Observed terminal state: w_len=78,848 with log_basis=4
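Per-level proof-size breakdown

| L | total | y_ring | v | stage1 sc | interstage | s_claim | stage2 sc | next_w_commit | next_w_eval |
|---|---|---|---|---|---|---|---|---|---|
| L1 | 4,352 B | 512 | 1,024 | 704 | 0 | 16 | 1,056 | 1,024 | 16 |
| L2 | 4,832 B | 512 | 1,024 | 1,280 | 0 | 16 | 960 | 1,024 | 16 |
| L3 | 5,216 B | 512 | 1,024 | 1,728 | 32 | 16 | 864 | 1,024 | 16 |
| L4 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |
| L5 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |
| L6 | 5,072 B | 512 | 1,024 | 1,632 | 32 | 16 | 816 | 1,024 | 16 |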

Posted by Cursor assistant (model: GPT-5.4) on behalf of the user (Quang Dao) with approval.

Restore the fused stage-2 round-2 handoff so y-first prefix proofs stop
rescanning the compact witness after the two-round-prefix transition, and
clear the prefix state once that handoff completes to avoid stale-path
reentry.

Also narrow the temporary dead-code allowances introduced during the phase-1
y-first cutover by routing the verifier through the shared shifted-eq
dispatcher and dropping now-unused test helpers.

Made-with: Cursor
Make stage 1 bind y-first and move the only coordinate reorder to
stage-1 input so stage 2 consumes r_stage1 directly.

Preserve the compact two-round prefix path and sparse-x handling
while removing the old stage1-to-stage2 compatibility bridge.

Made-with: Cursor
Remove the dead planner option builder so PlannerOptions only exposes
configuration toggles that are still wired into the search flow.

Made-with: Cursor
Remove the unused shifted-eq evaluation helpers and the stale test that only exercised them so CI can keep treating dead code as an error.

Made-with: Cursor
After the y-first cutover, recursive stage transitions can carry the
sumcheck challenges directly as the next opening point. Remove the
identity helper and the dead width bookkeeping it forced the prover and
verifier to carry.

Made-with: Cursor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 54be014.

Fuse the sparse y-stage full-table fold with next-round polynomial generation so the y-first Stage 1 cutover recovers the onehot regression without reverting semantics.

Made-with: Cursor
@quangvdao quangvdao changed the title from "feat(sumcheck): implement phase-1 y-first cutover" to "feat(sumcheck): complete y-first cutover and recover perf" on Apr 10, 2026
Now that y-first is the only ordering, remove the _y_first suffix
from reorder_stage1_coords, build_compact_s_table, and all related
variable names. Deduplicate pad_compact_witness, advance_stage1_claim,
and reorder helpers that were copy-pasted across three test modules.
Delete the unused shifted_eq module.

Made-with: Cursor