Make performance more consistent by sorting dims before passing to mean, min, ... #10611
It looks like having the order of the dims in reduce operations (mean, min, ...) match the order of the dims in the array makes the performance of these operations more consistent. This has to do with how the data is stored: ascending ordering is best for C-ordered arrays, and descending is better for Fortran-ordered arrays. So I'm not 100% sure this is a good idea. It might be better to just add a note to the docs or something (I did not check whether such a note is already in there).
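As a minimal sketch of the idea (not necessarily the exact change in this PR; `sort_reduce_dims` is a hypothetical helper), sorting the requested dims to match the array's own dim order could look like:

```python
import numpy as np
import xarray as xr

def sort_reduce_dims(da, dims):
    # Hypothetical helper: reorder the requested reduction dims so they
    # match the order in which they appear on the array itself.
    position = {d: i for i, d in enumerate(da.dims)}
    return sorted(dims, key=position.__getitem__)

da = xr.DataArray(np.zeros((4, 5, 6)), dims=("x", "y", "z"))

# Whatever order the caller passes, the reduction then always sees the
# dims in array order, so da.mean(dim=["z", "x"]) and
# da.mean(dim=["x", "z"]) take the same code path.
print(sort_reduce_dims(da, ["z", "x"]))  # -> ['x', 'z']
```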
Related issue: ….mean calls according to dimensions' sizes? (#10606)
User visible changes are documented in whats-new.rst.
Demo
For the following C-ordered data array:
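The original snippet didn't survive here; a plausible reconstruction of the setup (the shape, dim names, and timed calls are assumptions) is:

```python
import numpy as np
import xarray as xr

# A large C-ordered (row-major, numpy's default) array.
data = np.random.rand(200, 200, 200)
da = xr.DataArray(data, dims=("x", "y", "z"))

# Time the same reduction with the dims passed in different orders,
# e.g. in IPython:
#   %timeit da.mean(dim=["x", "y"])
#   %timeit da.mean(dim=["y", "x"])
```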
On main, the timings depend on the order in which the dims are passed; on this PR, the dims are sorted first, so both orders give the same (faster) timing.
If we use a Fortran-ordered array (wrap np.asfortranarray around data) we also get consistent performance on this PR, but it is consistent with the worse of the two timings: on main the timings again depend on the dim order, while on this PR the ascending order that the sort picks is the slower one for Fortran layout.
Honestly, this might be something that makes more sense to do on the numpy side.
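To see that this is purely a memory-layout effect that exists below xarray, a small numpy-only check (array size and repeat count are arbitrary) is:

```python
import timeit
import numpy as np

c = np.random.rand(2000, 2000)   # C-ordered: rows are contiguous
f = np.asfortranarray(c)         # same values, Fortran-ordered

for name, arr in [("C", c), ("F", f)]:
    for ax in (0, 1):
        t = timeit.timeit(lambda: arr.mean(axis=ax), number=20)
        print(f"{name}-ordered, axis={ax}: {t:.3f}s")

# The timings typically differ depending on whether the reduced axis
# walks contiguous memory, which is what makes the dim order matter.
```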