Promote GPU Wave 9 and Hypercomplex Wave 9 #41
Open
agourakis82 wants to merge 105 commits into main from
…shability + associator field
5 new experiments for Paper A (Non-Associative SSMs):
1. octonion_168_associators: Verified 168 = |PSL(2,7)| nonzero basis
associators. Fano plane, anticommutativity, Moufang, alternativity,
binary {0,2} norm, antisymmetry — all PASS.
2. octonion_path_products: Path products on graphs. Parenthesization
dependence proven: (e1·e3)·e5 = -e4 vs e1·(e3·e5) = +e4.
42 associative + 168 non-associative = 210 distinct triples.
Fano cycle → scalar. Conjugation reversal verified.
3. ossm_order_sensitivity: Cross-dim 1.34x more order-sensitive than
diagonal (state distance [0,1] vs [1,0]). Full backprop training.
4. ossm_distinguishability: Cross-dim 4.14x wider state spread than
diagonal across all 32 binary sequences of length 5. Both produce
32 distinct states, but cross-dim separates them 4x further.
1.29x more permutation-sensitive.
5. octonion_associator_field: Associator field on 8-node connectome-like
graph. 6 triangles, 4 non-associative (order-dependent), 2 Fano-aligned.
Path dependence: 0→1→4→7 gives -e2, 0→2→5→7 gives -e3.
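The 168 count and the sign flip in experiment 2 can be reproduced with a small sketch. It assumes the oriented Fano triples follow the convention e_i·e_(i+1) = e_(i+3) (indices mod 7), which happens to reproduce the signs quoted above; the names `FANO_TRIPLES`, `build_table`, and `mul` are illustrative and not taken from the repository.

```python
from itertools import product

# Assumed convention: e_i * e_{i+1} = e_{i+3}, indices taken mod 7 in 1..7.
FANO_TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
                (5, 6, 1), (6, 7, 2), (7, 1, 3)]

def build_table():
    """Return table[i][j] = (sign, k), meaning e_i * e_j = sign * e_k."""
    table = [[None] * 8 for _ in range(8)]
    for i in range(8):
        table[0][i] = (1, i)   # e0 is the identity
        table[i][0] = (1, i)
    for i in range(1, 8):
        table[i][i] = (-1, 0)  # imaginary units square to -1
    for a, b, c in FANO_TRIPLES:
        # Cyclic rotations of a triple multiply positively, reversals negatively.
        for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
            table[x][y] = (1, z)
            table[y][x] = (-1, z)
    return table

TABLE = build_table()

def mul(p, q):
    """Multiply signed basis elements given as (sign, index) pairs."""
    (sp, i), (sq, j) = p, q
    s, k = TABLE[i][j]
    return (sp * sq * s, k)

def e(i):
    return (1, i)

# A basis associator [a, b, c] is nonzero exactly when (ab)c != a(bc).
nonzero = sum(
    mul(mul(e(i), e(j)), e(k)) != mul(e(i), mul(e(j), e(k)))
    for i, j, k in product(range(1, 8), repeat=3)
)
# Among ordered triples of distinct units, count the associative ones.
associative_distinct = sum(
    mul(mul(e(i), e(j)), e(k)) == mul(e(i), mul(e(j), e(k)))
    for i, j, k in product(range(1, 8), repeat=3)
    if len({i, j, k}) == 3
)

left = mul(mul(e(1), e(3)), e(5))   # (e1*e3)*e5
right = mul(e(1), mul(e(3), e(5)))  # e1*(e3*e5)
print(nonzero, associative_distinct, left, right)
```

Under this convention the two parenthesizations differ only in sign, so the sketch covers both the count in experiment 1 and the path-product example in experiment 2.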
…ison
Three deep Paper A experiments:
1. ossm_fano_selective: α-sweep over Fano/cross-Fano coupling.
α=0 (pure associative) wins on loss (0.713), but α=1 has 6.66x higher
order sensitivity. Core tradeoff: optimization ease vs representational
richness.
2. ossm_moufang_dynamics: Complete algebraic hierarchy verification.
7/7 PASS: Right Moufang (343), Left Moufang (343), Right Bol (343),
Flexibility (49), Diassociativity (49), 7 quaternion subalgebras
enumerated (= Fano lines), power-associativity (x⁴=1).
3. ossm_scaled_comparison: 7-dim O-SSM (84 params) vs diagonal (42).
Cross-dim has higher training loss (1.07 vs 0.69) but WINS on test
accuracy (55% vs 45%). Wider state spread enables better generalization
despite harder optimization.
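The 343 and 49 counts in the Moufang check correspond to all ordered triples and pairs of the seven imaginary units. A minimal sketch of such a check follows, assuming the same Fano orientation e_i·e_(i+1) = e_(i+3); the helper names are hypothetical and only the left and right Moufang identities and flexibility are covered.

```python
from itertools import product

# Assumed convention: e_i * e_{i+1} = e_{i+3}, indices taken mod 7 in 1..7.
FANO_TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
                (5, 6, 1), (6, 7, 2), (7, 1, 3)]

def build_table():
    """Return table[i][j] = (sign, k), meaning e_i * e_j = sign * e_k."""
    table = [[None] * 8 for _ in range(8)]
    for i in range(8):
        table[0][i] = (1, i)
        table[i][0] = (1, i)
    for i in range(1, 8):
        table[i][i] = (-1, 0)
    for a, b, c in FANO_TRIPLES:
        for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
            table[x][y] = (1, z)
            table[y][x] = (-1, z)
    return table

TABLE = build_table()

def mul(p, q):
    """Multiply signed basis elements given as (sign, index) pairs."""
    (sp, i), (sq, j) = p, q
    s, k = TABLE[i][j]
    return (sp * sq * s, k)

UNITS = [(1, i) for i in range(1, 8)]  # e1..e7

def left_moufang(x, y, z):
    # z(x(zy)) == ((zx)z)y
    return mul(z, mul(x, mul(z, y))) == mul(mul(mul(z, x), z), y)

def right_moufang(x, y, z):
    # ((xz)y)z == x(z(yz))
    return mul(mul(mul(x, z), y), z) == mul(x, mul(z, mul(y, z)))

def flexible(x, y):
    # (xy)x == x(yx)
    return mul(mul(x, y), x) == mul(x, mul(y, x))

left_ok = sum(left_moufang(*t) for t in product(UNITS, repeat=3))
right_ok = sum(right_moufang(*t) for t in product(UNITS, repeat=3))
flex_ok = sum(flexible(*t) for t in product(UNITS, repeat=2))
print(left_ok, right_ok, flex_ok)
```

Checking basis elements only is weaker than checking arbitrary octonions, because the Moufang identities are not linear in the repeated variable; the sketch mirrors the counts reported above, not a full proof.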
Holonomy computation on complete graph K7 with generic octonion labels.
Key results:
- 35 triangles: 7 flat (±e0) + 28 curved (±ei)
- The 7 flat triangles correspond exactly to 7 Fano triples
- Holonomy spectrum perfectly uniform: exactly 4 per curved basis
- 80% associator curvature density
- Same 7/28 split for quadrilateral holonomies
- Parenthesization-dependent triangles = 28/35
The 7/28 = Fano/non-Fano split is a manifestation of the PSL(2,7)
symmetry in the holonomy of octonion-labeled graphs.
Composition algebra property |xy|=|x||y| verified computationally:
- 50 random octonion products: ZERO relative error
- Unit octonion transition preserves norm: 1.000→0.999999 (10 steps)
- Diagonal SSM DOES NOT preserve norm: 1.000→0.329 (drift 67%)
Theoretical foundation:
- Hurwitz (1898): normed division algebras in dim 1,2,4,8 ONLY
- O-SSM uses dim 8: the MAXIMAL composition algebra
- No zero divisors confirmed → normed division algebra
- Cayley-Dickson tower: O is last step with norm preservation
- Sedenions (dim 16) have zero divisors → cannot build S-SSM
This proves O-SSM is architecturally unique: the deepest SSM with
guaranteed state norm preservation, by Hurwitz's theorem.
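The norm check above can be sketched with a recursive Cayley-Dickson product. This is an illustration under stated assumptions, not the experiment's code: it uses the doubling formula (a,b)(c,d) = (ac − d*b, da + bc*), Gaussian random coefficients, and hypothetical helper names (`cd_mul`, `max_norm_error`).

```python
import math
import random

def conj(x):
    """Cayley-Dickson conjugate: (a, b)* = (a*, -b)."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]

def cd_mul(x, y):
    """Product in the Cayley-Dickson algebra of dimension len(x) (a power of 2)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    # (a, b)(c, d) = (ac - d* b, da + b c*)
    first = [p - q for p, q in zip(cd_mul(a, c), cd_mul(conj(d), b))]
    second = [p + q for p, q in zip(cd_mul(d, a), cd_mul(b, conj(c)))]
    return first + second

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def max_norm_error(dim, trials=50, seed=0):
    """Largest relative deviation of |xy| from |x||y| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        y = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        expected = norm(x) * norm(y)
        worst = max(worst, abs(norm(cd_mul(x, y)) - expected) / expected)
    return worst

octonion_error = max_norm_error(8)    # composition algebra: rounding error only
sedenion_error = max_norm_error(16)   # not a composition algebra
print(octonion_error, sedenion_error)
```

In floating point the octonion error is at rounding level, not literally zero, while the dimension-16 error is visibly nonzero, consistent with the Hurwitz bound cited above.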
…ranching
Three frontier experiments:
1. octonion_g2_automorphisms: Exhaustive enumeration of ALL 5040
permutations of {1..7}. Exactly 168 preserve Fano plane structure
= |PSL(2,7)| = |GL(3,F₂)|. Second independent proof of 168.
Cyclic shift and doubling map verified as automorphisms.
σ(ei·ej) = σ(ei)·σ(ej) for all 49 basis pairs.
2. octonion_cross_product_7d: A vector cross product exists ONLY in
dims 1, 3, 7 (Brown-Gray 1967). Verified: antisymmetry, orthogonality
(zero error), Lagrange identity |a×b|² = |a|²|b|² - (a·b)² (zero error),
Jacobi identity FAILS with |J|²=9 (NOT a Lie algebra — hallmark of
non-associativity). This IS the geometric operation in O-SSM mixing.
3. octonion_catalan_branching: For n factors, C(n-1) parenthesizations.
n=3: 168/210 = 80% non-associative (168 again!).
n=4: ALL 840 quadruples give exactly 2 distinct results out of C(3)=5.
Branching saturates at 2 due to Moufang constraints.
n=10: 4862 phantom trajectories, n=20: 1.77 billion.
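The 168 figure from experiment 1 and the Catalan counts from experiment 3 are quick to reproduce. The sketch below assumes the Fano lines {i, i+1, i+3} mod 7 and checks incidence only: a permutation counts if it maps every line to a line. Preserving the signs of the octonion product is a stronger condition that is not tested here. All names are illustrative.

```python
from itertools import permutations
from math import comb

# Assumed labeling of the seven Fano lines: {i, i+1, i+3} with indices mod 7.
LINES = {frozenset(t) for t in [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
                                (5, 6, 1), (6, 7, 2), (7, 1, 3)]}

def preserves_lines(perm):
    """perm[i - 1] is the image of point i; True if every line maps to a line."""
    return all(frozenset(perm[p - 1] for p in line) in LINES for line in LINES)

# Exhaustive scan of all 7! = 5040 permutations of {1..7}.
collineations = sum(preserves_lines(p) for p in permutations(range(1, 8)))

cyclic_shift = (2, 3, 4, 5, 6, 7, 1)  # i -> i + 1 (mod 7)
doubling = (2, 4, 6, 1, 3, 5, 7)      # i -> 2i (mod 7), with 7 fixed

def catalan(n):
    """n-th Catalan number; n factors admit catalan(n - 1) parenthesizations."""
    return comb(2 * n, n) // (n + 1)

print(collineations, preserves_lines(cyclic_shift), preserves_lines(doubling))
print(catalan(2), catalan(3), catalan(9), catalan(19))
```

The last line gives the parenthesization counts for n = 3, 4, 10, and 20 factors, matching the figures quoted in experiment 3.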
…l (honest negative)
…A fixed, accuracy plateaus at 24%)
…34.5%)
Fixed BPTT with continuous value encoding (not token discretization).
Two-phase training: output-only warmup (30 ep) then full BPTT (50 ep).
O-SSM (full BPTT): 45.0% (1.35x random, +8.5pp over output-only)
Diagonal (full BPTT): 34.5% (random, BPTT doesn't help diagonal)
O-SSM (output-only): 36.5% (reference)
BPTT through A matrix improves O-SSM by 8.5 percentage points.
Diagonal stays at random even WITH full backprop, proving cross-dim
coupling is the source of advantage, not training method.
… with full backprop
…0x, Ensemble 15%/20x
Four-way UQ comparison on sin(2πx) with σ=0.05 noise:
- E-KAN GUM: 100% coverage, 1x cost (analytical, single pass)
- Laplace Approx: 100% coverage, 3x cost (Hessian computation)
- MC Dropout: 43% coverage, 50x cost (N=50 forward passes)
- Deep Ensemble: 15% coverage, 20x cost (N=20 independent MLPs)
E-KAN GUM matches Laplace at 1/3 cost, crushes MC Dropout and Deep
Ensemble on coverage. The hat-basis piecewise-linear representation
enables both accurate uncertainty AND good function approximation.
On x² (quadratic): E-KAN GUM 100%, Deep Ensemble 57%.
Addresses reviewer criticism "compare to MC Dropout, Laplace, SNGP,
not just small ensembles." E-KAN GUM is competitive with Laplace and
strictly better than MC Dropout/Ensemble on coverage AND cost.
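For readers unfamiliar with the coverage metric, here is a minimal sketch of how empirical coverage of a predictive interval can be computed on the same sin(2πx), σ=0.05 setup. The commit does not state the nominal level, so the 95% interval (z = 1.96) is an assumption, as are the function name `empirical_coverage` and the two toy predictors; none of the four methods above is implemented here.

```python
import math
import random

def empirical_coverage(y_true, mean, std, z=1.96):
    """Fraction of targets inside mean ± z*std (z = 1.96 assumes a 95% interval)."""
    hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mean, std))
    return hits / len(y_true)

rng = random.Random(0)
n = 2000
xs = [rng.random() for _ in range(n)]
truth = [math.sin(2 * math.pi * x) for x in xs]
ys = [t + rng.gauss(0.0, 0.05) for t in truth]  # noisy observations, sigma = 0.05

# Toy predictor 1: correct mean, reported std equal to the true noise level.
calibrated = empirical_coverage(ys, truth, [0.05] * n)
# Toy predictor 2: correct mean, but an overconfident std of 0.01.
overconfident = empirical_coverage(ys, truth, [0.01] * n)
print(calibrated, overconfident)
```

A calibrated predictor should land near the nominal level; coverage of 100% points to intervals wider than necessary, and coverage far below nominal points to overconfidence.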
…50x, Ensemble 15%/20x
…tive overcoverage
…r (Sounio)
- Paper A: added as co-author via \And in NeurIPS format
- Paper B: added as co-author via \And in NeurIPS format
- Website about page: added Contributors section with Dionisio
Author formatting:
- Both papers: superscript numbering with ORCID on separate line
Hallucinated references FIXED (Paper B):
- hassan2024bayeskan: wrong arXiv ID (2408.02243→2408.02706),
  wrong authors (T. Hassan, A. Devkota → M. M. Hassan)
- mollaali2025conformal: wrong authors (added H. Gupta → real 6 authors),
  missing arXiv ID (added 2504.15240)
- ju2025svgpkan: wrong authors (T. Ju, Y. Li, Z. Zhang → Y. S. Ju),
  wrong venue (IEEE TNNLS under review → arXiv:2512.05306)
- saunders2019rapamycin: wrong year/venue (2019 KIR → 2001 KI),
  incomplete authors (et al → 3 named)
Reference FIXED (Paper A):
- brandstetter2023clifford: typo J.~"; Gupta → J.~K.~Gupta
AI prose removed:
- "Crucially" → removed
- "rich mathematical landscape" → "mathematical framework"
- "comprehensive algebraic treatments" → "cover the algebra in detail"
- "leveraging" → "exploiting" (2 occurrences)
Paper A:
- Intro rewritten: "Every modern SSM assumes associativity... We ask:
  what if non-associativity is a feature?"
- "path dependence is not a defect" framing
- Discussion: "choose algebra the way a physicist chooses coordinates"
- Parallel scan: "we do not pretend otherwise"
- Limitations: "What we have not shown" (honest framing)
- Conclusion: "SSMs have treated associativity as a requirement.
  We showed it is a choice."
Paper B:
- Intro: "A neural network that cannot say 'I don't know' is dangerous
  in a hospital."
- "None of them give you what a metrologist actually needs"
- Conclusion: "The question was simple: can a neural network produce
  uncertainty estimates that a metrologist would sign off on?"
- "We found three walls, all expected"
# Conflicts:
#	docs/governance/DOCS_ACCEPTANCE_REPORT.md
Summary
- gpu-compiler-capability-wave9-codex
- hypercomplex-algebra-wave9-codex
Notes
- compiler/main --self-test remains red in the known import-budget/SRC overflow zone and is not treated here as a blocking signal for the Hypercomplex track.
Operational guidance
Note
Medium Risk
Medium risk because CI is restructured to introduce new self-host authority/provenance/dual-trust and ABI/parity gates with longer timeouts and artifact uploads, which may impact required-check behavior and mergeability. Most other changes are docs/governance and generated artifact updates with low runtime risk.
Overview
CI is reworked to formalize self-host baseline verification. The prior native-selfhost job is replaced by selfhost-authority and selfhost-abi-parity, running selfhost_authority_gate.sh, provenance verification, and selfhost_dual_trust_gate.sh, publishing markdown summaries to the GitHub step summary, increasing timeouts, and uploading detailed gate artifacts.
GPU public-contract evidence is refreshed and made more granular. The committed gpu_public_contract.v1.json is updated to include explicit multidimensional GPU.launch surface coverage and dedicated negative fixtures per unsupported gpu.* builtin, and the GPU docs are updated to reference a new repo-local capability taxonomy and gates.
Docs/governance expands with new architecture notes, ADRs, and operator runbooks. Adds a new docs/architecture/* set, an ADR series under docs/decisions/, maintainer-facing selfhost authority/release-train/debt-register and baseline stewardship docs, plus new paper submission helper files; the docs governance registry/matrix/report is updated to include these new topics and updated counts.
Written by Cursor Bugbot for commit d064f25.