metal with deterministic proof gen#15

Open
lighter-zz wants to merge 10 commits into dev from zz/metal

@lighter-zz
Contributor

No description provided.

timemeansalot and others added 10 commits March 9, 2026 15:27
GPU-accelerate two proving bottlenecks via Apple Metal:

1. Merkle tree construction (Poseidon2) — GPU hashing for trees
   with 2^13 to 2^20 leaves
2. Quotient polynomial evaluation — fused gate eval + alpha reduction
   with zero device memory allocation

Enable with `features = ["metal"]`. Automatic runtime dispatch via
TypeId checks; falls back to CPU when conditions are not met.
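The dispatch rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SupportedField`, `OtherField`, `Backend`, and `select_merkle_backend` are hypothetical names, the leaf window is assumed to be inclusive at both ends, and the `metal_enabled` flag stands in for whatever compile-time feature check the crate actually performs.

```rust
use std::any::TypeId;

/// Hypothetical stand-in for a field type that has a Metal kernel.
pub struct SupportedField;
/// Hypothetical stand-in for a field type without a Metal kernel.
pub struct OtherField;

/// Leaf-count window quoted in the commit message (2^13 to 2^20 leaves).
/// Treating both ends as inclusive is an assumption.
const MIN_GPU_LEAVES: usize = 1 << 13;
const MAX_GPU_LEAVES: usize = 1 << 20;

#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    Cpu,
    Metal,
}

/// Picks a backend for Merkle tree construction.
/// `metal_enabled` is passed in explicitly so the sketch is testable;
/// real code would more likely use `cfg!(feature = "metal")` here.
pub fn select_merkle_backend<F: 'static>(metal_enabled: bool, num_leaves: usize) -> Backend {
    // Runtime type check: only the supported field type goes to the GPU.
    let supported_field = TypeId::of::<F>() == TypeId::of::<SupportedField>();
    // Size check: only trees inside the assumed leaf window go to the GPU.
    let in_range = (MIN_GPU_LEAVES..=MAX_GPU_LEAVES).contains(&num_leaves);

    if metal_enabled && supported_field && in_range {
        Backend::Metal
    } else {
        // Any unmet condition falls back to the CPU path.
        Backend::Cpu
    }
}
```

Under these assumptions, `select_merkle_backend::<SupportedField>(true, 1 << 16)` selects `Backend::Metal`, while an unsupported field type, a tree outside the window, or a disabled feature all select `Backend::Cpu`.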

Benchmark (lighter-prover, 500 txs, 125 chunks):
- CPU-only: 653.5s
- GPU Metal: 471.6s (1.39x speedup, 27.8% less wall-clock time)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verified with back-to-back runs: time savings range from 21% to 28% across
runs due to thermal/load variance. Report the conservative 1.27x (21.3%).
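The two kinds of figures quoted in these commits (an "Nx" speedup and a percentage) are related by simple arithmetic. The sketch below assumes the percentages mean reduction in wall-clock time rather than increase in throughput. The function names are illustrative only.

```rust
/// Speedup factor: how many times faster the GPU run is than the CPU run.
pub fn speedup(cpu_secs: f64, gpu_secs: f64) -> f64 {
    cpu_secs / gpu_secs
}

/// Percentage of wall-clock time saved relative to the CPU run.
pub fn time_reduction_pct(cpu_secs: f64, gpu_secs: f64) -> f64 {
    (cpu_secs - gpu_secs) / cpu_secs * 100.0
}

/// Converts a speedup factor into the equivalent time reduction in percent.
pub fn reduction_pct_from_speedup(factor: f64) -> f64 {
    (1.0 - 1.0 / factor) * 100.0
}
```

With the benchmark numbers above, `speedup(653.5, 471.6)` rounds to 1.39 and `time_reduction_pct(653.5, 471.6)` rounds to 27.8. Likewise, `reduction_pct_from_speedup(1.27)` rounds to 21.3, matching the conservative figure.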

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shows Apple M4 GPU History: active with Metal on, idle with Metal off.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shows Merkle vs Quotient contribution for each circuit type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove per-gate Instant::now() + AtomicU64 profiling that broke compiler
auto-vectorization and added ~40% overhead to CPU gate evaluation.
GPU quotient kernel and caching remain enabled.
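For context, the pattern this commit describes removing looks roughly like the first function below: a clock read and an atomic update inside the hot loop. The second function times the loop once from outside, which is one common alternative and not necessarily what this PR does. All names here (`GATE_NANOS`, `eval_gate`, and both evaluation functions) are hypothetical, and `eval_gate` is a toy stand-in for real gate constraint evaluation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

/// Hypothetical global counter of nanoseconds spent evaluating gates.
static GATE_NANOS: AtomicU64 = AtomicU64::new(0);

/// Toy stand-in for one gate's constraint evaluation.
#[inline(always)]
fn eval_gate(x: u64) -> u64 {
    x.wrapping_mul(x).wrapping_add(1)
}

/// Per-gate profiling: every iteration reads the clock and updates an
/// atomic, which places opaque side effects inside the loop body and
/// makes it hard for the compiler to vectorize.
pub fn eval_profiled_per_gate(inputs: &[u64]) -> u64 {
    let mut acc = 0u64;
    for &x in inputs {
        let start = Instant::now();
        acc = acc.wrapping_add(eval_gate(x));
        GATE_NANOS.fetch_add(start.elapsed().as_nanos() as u64, Ordering::Relaxed);
    }
    acc
}

/// Coarse profiling: the loop body is pure arithmetic, and the clock
/// and atomic are touched once per call instead of once per gate.
pub fn eval_profiled_once(inputs: &[u64]) -> u64 {
    let start = Instant::now();
    let acc = inputs
        .iter()
        .fold(0u64, |acc, &x| acc.wrapping_add(eval_gate(x)));
    GATE_NANOS.fetch_add(start.elapsed().as_nanos() as u64, Ordering::Relaxed);
    acc
}

/// Reads the accumulated time in nanoseconds.
pub fn total_gate_nanos() -> u64 {
    GATE_NANOS.load(Ordering::Relaxed)
}
```

Both functions compute the same result. The difference is only in where the timing side effects sit relative to the inner loop.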

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- GPU_BENCHMARK_COMBINED.md: Metal vs CUDA side-by-side comparison
- GPU_COST_ANALYSIS.md: AWS cost per proof analysis (mac-m4 vs g6e L40S)
- METAL_BENCH_GUIDE.md: Step-by-step guide for running Metal benchmark
- cost_per_proof.png + plot_cost.py: Cost comparison bar chart
- GPU_BENCHMARK_RESULTS.md: Updated with validated 1.56-1.61x speedup numbers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2 participants