Formal Verification of Sharpness-Aware Minimization with Z-Score Gradient Filtering in Lean 4.
LeanSharp is the formal, mathematical sister project to ZSharp. While ZSharp provides an empirical PyTorch implementation of Z-Score filtered SAM (achieving +5.26% accuracy over SGD), this repository constructs a fully rigorous, machine-checked foundation for the algorithm using the Lean 4 interactive theorem prover.
Machine Learning optimization algorithms are notoriously difficult to analyze theoretically. Proofs of convergence for Deep Learning optimizers often rely on informal heuristics or hidden assumptions about the loss landscape.
By formally verifying Z-Score SAM in Lean 4, every mathematical step—from the Fréchet derivative of the loss function to the final contraction properties of the gradient filter—is rigorously checked by Lean's trusted kernel.
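For instance, Mathlib's Fréchet-derivative machinery (`ContDiff`, `Differentiable`) lets smoothness hypotheses be stated and discharged directly. The following toy example is illustrative only (it is not taken from this repository): a `C^1` loss on an abstract parameter space `W` is Fréchet differentiable everywhere.

```lean
import Mathlib

-- Toy illustration (not from LeanSharp): a C¹ loss on a normed
-- parameter space `W` is differentiable everywhere. The hypothesis
-- `ContDiff ℝ 1 loss` says `loss` is once continuously Fréchet
-- differentiable.
variable {W : Type*} [NormedAddCommGroup W] [NormedSpace ℝ W]

example (loss : W → ℝ) (h : ContDiff ℝ 1 loss) :
    Differentiable ℝ loss :=
  h.differentiable le_rfl
```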
For a detailed overview of the project's design patterns, including the W parameter space abstraction and the recursive Chain/ChainData architecture, see ARCHITECTURE.md.
- **Robust Convergence Theory**: Proved $O(1/T)$ stochastic convergence under $\alpha$-stable noise and established geometric convergence ($O(c^T)$) for strongly convex objectives. Established $50\%$ outlier stability through formalized breakdown-point analysis.
- **Unified Alignment Framework**: Established the definitive `AlignmentCondition` bridge, mathematically linking deterministic gradient geometry to stochastic Z-score filtering.
- **Formal Stability & Regularity**: Completed `StabilityCertificate` $C^2$ smoothness and Lipschitz regularity proofs for the entire core stack, including `Linear`, `Softmax`, `Attention`, `LayerNorm`, and `BatchNorm`.
- **Generalization Theory**: Fully formalized the PAC-Bayesian population risk bounds by proving the Donsker–Varadhan variational inequality using Mathlib's information-theoretic machinery.
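As a rough illustration of how such statements read in Lean 4, here is a hypothetical sketch of a coordinate-wise Z-score filter and an alignment-style descent condition. The names (`zscoreFilter`, `AlignmentCondition`) and the exact definitions are placeholders chosen for exposition, not the repository's actual code.

```lean
import Mathlib

-- Hypothetical sketch, not the repository's definitions.
-- Keep a gradient coordinate only when its standardized magnitude
-- `|g i - μ| / σ` meets a threshold `τ`; otherwise zero it out.
def zscoreFilter {n : ℕ} (μ σ τ : ℝ) (g : Fin n → ℝ) : Fin n → ℝ :=
  fun i => if τ ≤ |g i - μ| / σ then g i else 0

-- An alignment-style condition: the filtered gradient `fg` remains a
-- descent direction, i.e. its inner product with the full gradient `g`
-- is bounded below by a fraction `c` of `‖g‖²`.
def AlignmentCondition {n : ℕ} (c : ℝ) (g fg : Fin n → ℝ) : Prop :=
  c * Finset.univ.sum (fun i => g i ^ 2) ≤
    Finset.univ.sum (fun i => fg i * g i)
```

A convergence theorem in this style would then take `AlignmentCondition c g (zscoreFilter μ σ τ g)` as a hypothesis and conclude a descent or contraction bound for the SAM update.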
| Task | Priority | Justification |
|---|---|---|
| Z-Score CLT | High | Formally characterize the statistical limit of the filtered gradient. |
| Non-Convex Population Risk | Medium | Extend PAC-Bayes analysis to non-convex landscapes using local stability certificates. |
| Transformer Invariance | Medium | Prove permutation and scaling invariance properties for the formalized MHA architecture. |
| Task | Priority | Justification |
|---|---|---|
| Stochastic Generalization | High | Extend alignment to heavy-tailed noise distributions. |
| NTK Dynamics | Low | Prove network initialization and NTK-regime bounds. |
| Optimality Bound | Low | Prove statistical lower bounds via information theory. |
| Diffusion Stability | Low | Formalize SDE objectives and stability for DDPMs. |
Make sure you have `elan` installed for Lean 4 toolchain management.
```sh
git clone https://github.com/bangyen/leansharp.git
cd leansharp
lake exe cache get  # Downloads the pre-compiled Mathlib libraries
lake build
```

This repo uses standard Mathlib naming conventions. If you're a Lean 4 wizard interested in ML optimization theory, feel free to submit PRs targeting the roadmap!
If you use this work in your research, please cite:
```bibtex
@misc{pham_leansharp_2026,
  author = {Pham, Bangyen},
  title  = {LeanSharp: Formal Verification of Sharpness-Aware Minimization with Z-Score Gradient Filtering in Lean 4},
  year   = {2026},
  url    = {https://github.com/bangyen/leansharp}
}
```