feat: add chunksize parameter to `AutoEnzyme` #124

gdalle · 2025-08-15T22:14:38Z

Checklist

Appropriate tests were added
Any code changes were done in a way that does not break public API
All documentation related to code changes were updated
The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
Any new documentation only uses public API

Additional context

Fixes #85, fixes #114

Note that these changes rely on downstream packages (mostly DI) to interpret them correctly when reconstructing an Enzyme mode object before differentiation. Between the release of ADTypes v1.18 and the release of the corresponding DI patch, the new backend parameters introduced here will have zero effect. That's not a great situation but ADTypes has no way to require a future DI version before introducing these changes. The real solution to prevent such gaps would probably be to merge ADTypes into DI, so that the semantics are defined at the same time as the backends.

wsmoses

What if specifying the runtime activity flag together with the auto forward mode created an EnzymeCore.set_runtime_activity(EnzymeCore.Forward)? That would mean that downstream users don't need to change anything.

gdalle · 2025-08-15T22:27:18Z

Downstream code needs to change anyway, if only because of the chunk size.
Indeed I thought of adding a converter inside the EnzymeCore extension, but that would make it necessary for EnzymeCore to be loaded for the AutoEnzyme(; mode=ADTypes.ForwardMode()) constructor to run. I guess it depends how extreme you want to be about #113 / #123: does your ADTypes constructor need to work before Enzyme is loaded or not?

wsmoses · 2025-08-16T06:08:38Z

Yeah but making sure that the downstream users don't need to reimplement the "to mode" logic is nice -- and especially making sure that it's implemented correctly here/consistently.

Is there a reason for not doing that offhand?

gdalle · 2025-08-16T06:27:41Z

The main reason is people wanting to be able to specify the mode without having Enzyme loaded (see #113 or #123). If you really want to do that, then you need to defer the to_mode call to the differentiation step.
But in all honesty, that sounds like a bad idea to me anyway. If you're going to use Enzyme, you might as well load it, or rather load EnzymeCore which is very lightweight. Furthermore, having more than one syntax to specify the mode is unneeded complexity. I'd much prefer removing everything from this PR except the chunk size addition, which is not present in the Enzyme mode object and which is consistent with other ADTypes backends.

gdalle · 2025-08-16T06:30:23Z

The middle ground you're suggesting is for people who are ok with loading EnzymeCore before constructing the backend object, but are not okay using the EnzymeCore mode objects, so they want something like this

using EnzymeCore, ADTypes
backend = AutoEnzyme(mode=ADTypes.ForwardMode())

rather than something like that

using EnzymeCore, ADTypes
backend = AutoEnzyme(mode=EnzymeCore.Forward)

which sounds very strange to me

gdalle · 2025-08-16T06:35:15Z

To clarify, I completely agree with you that getting the EnzymeCore mode object as soon as possible and as uniquely as possible is a good thing for correctness. I'm just trying to be the devil's advocate and argue in favor of a request like #113, which I tried to entertain in this PR. But my opinion is that the added complexity is not worth it, and I don't really see a scenario where loading EnzymeCore at backend definition time is a bad thing. With that in mind, I'd rather keep one single mode specification, the existing one.

wsmoses · 2025-08-16T06:40:46Z

I think it's mostly about making it as easy as possible for the user (who only sees the ADTypes package from their end, and doesn't know they should import EnzymeCore in place of Enzyme, which seems to be the case in the linked issue). Similarly for consistency, if other backends also use the mode specifier from here.

Also, one way we can potentially avoid the problem of importing two packages to register the conversion to a mode is to just depend on EnzymeCore (it is dependency-free and rarely changes, so it shouldn't cause any problems).

gdalle · 2025-08-16T06:47:20Z

To sum up, we have four options:

1. Specify mode with EnzymeCore only (current behavior)
2. Specify mode with EnzymeCore or ADTypes, conversion to EnzymeCore downstream (current state of this PR)
3. Specify mode with EnzymeCore or ADTypes, conversion to EnzymeCore in ADTypes extension
4. Specify mode with EnzymeCore or ADTypes, conversion to EnzymeCore in ADTypes itself with new dependency on EnzymeCore

I agree that 2 is not great for correctness. I'm putting a veto on 4, because I don't want ADTypes or DI getting a hard dependency on any AD package, no matter how lightweight. This leaves 1 or 3. My preference would be to keep 1, precisely because I find it simpler to have just one syntax, but if you think your users would prefer 3 with the additional ADTypes mode and the conversion in EnzymeCoreExt, I'm happy to adapt the PR. Your call.

gdalle · 2025-08-27T05:53:07Z

@wsmoses I just remembered a reason why options 3 and 4 are not possible anyway: when the user selects AutoEnzyme(; mode = nothing), this means that we need to pick the best mode depending on the differentiation operator that gets used. For DI.pushforward, it will be a forward mode, but for DI.gradient, it will be a reverse mode (and I assume other packages like Optimization.jl do something similar). This kind of choice needs to happen downstream, so the conversion to an EnzymeCore object needs to happen downstream as well (because we cannot convert nothing to an EnzymeCore.Mode without knowing the operator to apply).
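The point about deferring the choice can be illustrated with a small self-contained sketch. Everything here is hypothetical: `ToyMode`, `ToyForward`, `ToyReverse`, `ToyAutoEnzyme` and `resolve_mode` are stand-ins invented for this example, not the actual types or functions of ADTypes, EnzymeCore or DI. It only shows why a `nothing` mode cannot be resolved before the operator is known.

```julia
# Stand-in types for illustration only (assumptions, not the real EnzymeCore modes).
abstract type ToyMode end
struct ToyForward <: ToyMode end
struct ToyReverse <: ToyMode end

# Toy backend: `mode === nothing` means "let the downstream package decide".
struct ToyAutoEnzyme{M <: Union{Nothing, ToyMode}}
    mode::M
end
ToyAutoEnzyme(; mode = nothing) = ToyAutoEnzyme(mode)

# An explicit mode is returned unchanged, whatever the operator.
resolve_mode(backend::ToyAutoEnzyme{<:ToyMode}, operator::Symbol) = backend.mode

# A `nothing` mode can only be resolved once the operator is known.
function resolve_mode(::ToyAutoEnzyme{Nothing}, operator::Symbol)
    operator === :pushforward && return ToyForward()
    operator === :gradient && return ToyReverse()
    throw(ArgumentError("no default mode known for operator $operator"))
end
```

With these definitions, `resolve_mode(ToyAutoEnzyme(), :gradient)` yields a `ToyReverse()` while the same backend used with `:pushforward` yields a `ToyForward()`, so the conversion has to live wherever the operator is dispatched.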

gdalle · 2025-08-27T05:53:27Z

In other words, I think this PR is the right version, and I'd appreciate other reviews

src/dense.jl

wsmoses · 2025-08-27T11:28:55Z

I actually dislike that convention, because it means that people don't know what to expect from downstream packages (e.g. maybe someone chooses forward vs reverse, and something fails unexpectedly). Similarly here you end up in the situation where you have multiple conflicting ways of specifying things -- leading to further confusion/ambiguity.

Since I think I would push for 4 or 3, and veto 2; and you would push for 2 and veto 4, I think that means we stick with 1.

gdalle · 2025-08-27T13:20:23Z

> I actually dislike that convention, because it means that people don't know what to expect from downstream packages (e.g. maybe someone chooses forward vs reverse, and something fails unexpectedly). Similarly here you end up in the situation where you have multiple conflicting ways of specifying things -- leading to further confusion/ambiguity.

Yeah, the default AutoEnzyme() is a compromise. It's open to downstream interpretation, but at least it doesn't expect beginners to know what forward and reverse modes are, or which one is (usually) best for their given application. In any case, it's probably too breaking to remove.

> Since I think I would push for 4 or 3, and veto 2; and you would push for 2 and veto 4, I think that means we stick with 1.

Alright then, I removed the ADTypes mode specification, but I left runtime activity and chunksize choices. In particular, this will allow people to use AutoEnzyme(; runtime_activity=true) in e.g. Turing and have DI automatically pick set_runtime_activity(Reverse) as the mode when computing gradients, which is a nice QoL improvement.

If you think this is good to go, we can merge and I can do the DI follow up.

wsmoses · 2025-08-27T13:22:22Z

I think without the conversion to mode in ADTypes, we shouldn't include runtime activity, as that's also in the mode and would lead to ambiguity.

gdalle · 2025-08-27T13:35:33Z

Fair enough, I removed it in the last commit. Now this only adds a chunk size and changes nothing else

gdalle · 2025-09-08T08:07:38Z

Gentle bump @wsmoses

wsmoses · 2025-09-08T13:39:11Z

src/dense.jl

+        if C isa Int
+            @assert C > 0
+        elseif C isa Float64
+            @assert C == Inf


we should give a better error message here
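One possible shape for such a check is sketched below. `check_chunksize` is a hypothetical helper written for this illustration, not the code that was merged; treating `nothing` as "no chunk size specified" is also an assumption. The idea is simply to replace the bare `@assert` with an `ArgumentError` that names the accepted values.

```julia
# Hypothetical helper (not part of ADTypes): validate a chunk size and
# report what was expected instead of failing on a bare assertion.
function check_chunksize(C)
    if C isa Int
        C > 0 || throw(ArgumentError("chunk size must be a positive `Int`, got $C"))
    elseif C isa Float64
        C == Inf || throw(ArgumentError("the only accepted `Float64` chunk size is `Inf`, got $C"))
    elseif C !== nothing
        # Assumption: `nothing` stands for "no chunk size specified".
        throw(ArgumentError("chunk size must be `nothing`, a positive `Int` or `Inf`, got a $(typeof(C))"))
    end
    return C
end
```

Calling `check_chunksize(0)` or `check_chunksize(2.5)` then throws an `ArgumentError` with a message that says which values are allowed.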

wsmoses · 2025-09-08T13:40:14Z

src/dense.jl


-      + an object subtyping `EnzymeCore.Mode` (like `EnzymeCore.Forward` or `EnzymeCore.Reverse`) if a specific mode is required
-      + `nothing` to choose the best mode automatically
+      + a positive `Int` to fix a constant chunk size


I kind of wonder if a chunk size of 0 here would be a good way to represent maximum chunk size

I get where you're coming from but I have two objections:

1. semantically, chunksize=Inf makes much more sense even to the uninformed reader
2. practically, a zero chunksize is also a thing, and it is very very tricky to handle correctly across backends (see this recent discussion in DI: "Inconsistency in handling empty arguments", JuliaDiff/DifferentiationInterface.jl#802) so I'd rather not confuse the two

reading that, I still don't understand why a zero chunksize is semantically meaningful?

If you denote by $N$ the dimension, $C$ the chunk size, $N_C$ the number of chunks, you have $N = N_C \cdot C$ (plus a remainder possibly). For $N = 0$, you can either pick $C = 0$ or $N_C = 0$. Neither of those means a lot to be honest, but different backends have different conventions, and ForwardDiff picks a zero chunk size in the zero-length case by default (JuliaDiff/DifferentiationInterface.jl#835 (comment)) while Enzyme doesn't. That's why I'd rather steer clear of this whole mess.
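To make the arithmetic concrete, here is a minimal sketch of the chunk count under one of the two conventions. `number_of_chunks` is a hypothetical helper for illustration, and it arbitrarily picks $N_C = 0$ for empty inputs; it does not claim to reproduce what any particular backend does.

```julia
# Hypothetical helper: number of chunks N_C needed to cover N inputs with chunk size C.
# `C = Inf` is read as "a single chunk holding every input".
function number_of_chunks(N::Integer, C::Union{Int, Float64})
    N >= 0 || throw(ArgumentError("dimension must be non-negative, got $N"))
    (C isa Int ? C > 0 : C == Inf) ||
        throw(ArgumentError("chunk size must be a positive `Int` or `Inf`, got $C"))
    N == 0 && return 0    # convention chosen here: N_C = 0 (the alternative would be C = 0)
    C == Inf && return 1  # maximal chunk size: one pass over all inputs
    return cld(N, C)      # ceiling division accounts for a possible remainder
end
```

For instance, `number_of_chunks(10, 4)` gives 3 (two full chunks plus a remainder of 2), and `number_of_chunks(10, Inf)` gives 1.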

I see where you're coming from vis-a-vis forward diff making a different design choice, but I'm not sure that is most critical here.

Alternatively, I kind of wonder, if it would be best to make an EnzymeCore.MaxChunk (which equally can be used by the Enzyme.gradient/jacobian wrappers), which would be the alternate here like there is for EnzymeCore.Mode

I'm gonna veto the zero but if you want to add the max chunk setting to EnzymeCore that's fine by me too, your call

Hi Billy, just following up on this, do you want to add that setting to EnzymeCore?

yeah I think that's the right move. If you have cycles before I do, feel free to open a PR on

feat: add runtime activity and chunksize parameters to AutoEnzyme

a95bf02

gdalle marked this pull request as draft August 15, 2025 22:15

gdalle requested review from wsmoses and ChrisRackauckas August 15, 2025 22:15

wsmoses reviewed Aug 15, 2025

gdalle marked this pull request as ready for review August 27, 2025 05:53

gdalle commented Aug 27, 2025

Update src/dense.jl

428bae4

Remove ADTypes mode

8e4cd22

Remove runtime activity

2790b3c

gdalle changed the title ~~feat: add runtime activity and chunksize parameters to AutoEnzyme~~ feat: add chunksize parameter to AutoEnzyme Aug 27, 2025

gdalle requested a review from wsmoses August 27, 2025 13:40

wsmoses reviewed Sep 8, 2025
