Left matmul 100x performance improvements by dance858 · Pull Request #44 · SparseDifferentiation/SparseDiffEngine

dance858 · 2026-02-12T23:17:58Z

This PR accelerates left matrix multiplication by roughly 100x (!) by avoiding explicit construction of the Kronecker product. Instead of treating the operation as a generic sparse matrix–matrix multiply, we use specialized logic that exploits the block/Kronecker structure. The initialization that took 15 seconds on one of Max's problems now takes 0.17 seconds, a speedup of about 88x.
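To illustrate the idea (a minimal sketch, not the code in this PR): assuming `X` is n×p and vectorized column by column, `vec(A X) = (I_p ⊗ A) vec(X)`, so the Kronecker factor is block-diagonal with p copies of `A`. Applying it amounts to multiplying each length-n block of `vec(X)` by `A`. The function names and the dense row-major storage of `A` below are assumptions made for illustration; the engine itself works with sparse matrices.

```c
#include <assert.h>
#include <stdlib.h>

/* Reference version: form K = I_p kron A explicitly (dense, row-major,
   size (m*p) x (n*p)) and compute y = K x. Returns 0 on success and
   -1 if the allocation fails. Memory grows with p^2. */
int left_matmul_explicit_kron(const double *A, int m, int n, int p,
                              const double *x, double *y)
{
    assert(m > 0 && n > 0 && p > 0);
    size_t rows = (size_t) m * (size_t) p;
    size_t cols = (size_t) n * (size_t) p;
    double *K = calloc(rows * cols, sizeof(double));
    if (K == NULL) return -1;

    /* Copy A into each of the p diagonal blocks. */
    for (int b = 0; b < p; b++)
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                K[((size_t) b * m + i) * cols + ((size_t) b * n + j)] =
                    A[i * n + j];

    for (size_t r = 0; r < rows; r++)
    {
        double s = 0.0;
        for (size_t c = 0; c < cols; c++) s += K[r * cols + c] * x[c];
        y[r] = s;
    }
    free(K);
    return 0;
}

/* Kronecker-free version: apply A to each length-n block of x and write
   the result into the matching length-m block of y. */
void left_matmul_blockwise(const double *A, int m, int n, int p,
                           const double *x, double *y)
{
    assert(m > 0 && n > 0 && p > 0);
    for (int b = 0; b < p; b++)
    {
        const double *xb = x + (size_t) b * n;
        double *yb = y + (size_t) b * m;
        for (int i = 0; i < m; i++)
        {
            double s = 0.0;
            for (int j = 0; j < n; j++) s += A[i * n + j] * xb[j];
            yb[i] = s;
        }
    }
}
```

Both functions compute the same vector, but the explicit version allocates an (mp)×(np) matrix, while the blockwise version touches only `A` and the two vectors. This is the kind of structural saving the PR description refers to.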

@Transurgeon, this will make the parameter code for left matmul much simpler, since you only need to update A when refreshing a parameter.
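A sketch of why the refresh gets simpler, under the assumption that the sparsity pattern of A stays fixed and only its values change: if only the small matrix is stored, a refresh overwrites nnz(A) numbers, and the next evaluation picks them up for every block. With an explicit block-diagonal copy, each value would have to be written once per block. The names `small_csr`, `refresh_values`, and `block_forward` are hypothetical and are not taken from the repository.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container: the small m x n matrix A in CSR form. */
typedef struct
{
    int m, n;
    const int *row_ptr; /* length m + 1, fixed sparsity pattern */
    const int *col_idx; /* length nnz, fixed sparsity pattern */
    double *vals;       /* length nnz, the only data a refresh touches */
} small_csr;

/* Refresh the parameter: overwrite the nnz stored values of A. */
void refresh_values(small_csr *A, const double *new_vals)
{
    assert(A != NULL && new_vals != NULL);
    size_t nnz = (size_t) A->row_ptr[A->m];
    memcpy(A->vals, new_vals, nnz * sizeof(double));
}

/* Evaluate y = (I_p kron A) x block by block, reading A directly. */
void block_forward(const small_csr *A, int p, const double *x, double *y)
{
    assert(A != NULL && p > 0);
    for (int b = 0; b < p; b++)
    {
        const double *xb = x + (size_t) b * A->n;
        double *yb = y + (size_t) b * A->m;
        for (int i = 0; i < A->m; i++)
        {
            double s = 0.0;
            for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++)
                s += A->vals[k] * xb[A->col_idx[k]];
            yb[i] = s;
        }
    }
}
```

Because `block_forward` reads the values of `A` on every call, no derived structure needs to be rebuilt after `refresh_values`.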

I have not yet checked that our Python tests in DNLP pass with this code, so let's hold off on merging until I have.

We should also do this refactor for right matmul, but that's for another day. I wonder if Claude can code it up by mimicking my implementation of left_matmul? @Transurgeon

dance858 · 2026-02-13T17:56:23Z

I verified that it works on the Python side and added a few extra tests there. I also addressed your comments @Transurgeon, so I'm merging this.

Sync parameter-support with main's left matmul 100x perf improvements (PR #44) and right matmul refactor (PR #46).
Simplify param matmul to store only the small A matrix instead of block-diagonal; the block_left_multiply_* functions handle the rest.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dance858 added 10 commits February 12, 2026 07:30

test for profiling

de23e18

90 times faster sparsity pattern

486c03c

fill values without forming A kron

f54382c

update forward pass

37fc13a

fix test

3bd25ce

ran formatter

40f85e0

profile forward pass

7a1bd4e

ran formatter

f3bd95f

removed kroenecker product from hessian

d668708

minor changes

8259371

Transurgeon reviewed Feb 13, 2026

include/subexpr.h (outdated, resolved)

include/utils/linalg.h (outdated, resolved)

src/bivariate/left_matmul.c (resolved)

include/subexpr.h (outdated, resolved)

dance858 added 3 commits February 13, 2026 09:09

broadcast fix

6b2ae18

ran formatter

fa947bc

improved documentation of block_left_mult

0542a4d

dance858 merged commit 0e804a0 into main Feb 13, 2026
11 checks passed

dance858 deleted the profile-left-matmul branch February 13, 2026 20:58
