
Make performance more consistent by sorting dims before passing to mean, min... #10611

Closed
jsignell wants to merge 1 commit

Conversation


@jsignell jsignell commented Aug 6, 2025

It looks like having the order of the dims in reduce operations (mean, min, ...) match the order of the dims in the array makes the performance of these operations more consistent.

This has to do with how the data is stored in memory: ascending order (matching the array's dim order) is best for C-ordered arrays, while descending order is better for Fortran-ordered arrays. So I'm not 100% sure this is a good idea; it might be better to just add a note to the docs or something (I did not check whether such a note is already there).
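
For illustration, here is a minimal sketch of the idea (not the actual change in this PR; the helper name is made up): reorder the requested dims to match their position in the array before reducing.

import xarray as xr

def mean_with_sorted_dims(da: xr.DataArray, dims: list[str]) -> xr.DataArray:
    # Sort the requested dims by their position in the array, so the reduction
    # walks memory in storage order (ascending is optimal for C-ordered data).
    ordered = sorted(dims, key=da.dims.index)
    return da.mean(ordered)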

Demo

For the following C-ordered data array:

In [2]: import numpy as np
   ...: import pandas as pd
   ...: import xarray as xr
   ...: 
   ...: # Create a large test DataArray with multiple dimensions
   ...: np.random.seed(42)
   ...: data = np.random.random((1000, 200, 10, 3))  # Large enough to see performance differences
   ...: dims = ['time', 'lat', 'lon', 'level']
   ...: coords = {
   ...:     'time': pd.date_range('2000-01-01', periods=1000),
   ...:     'lat': np.linspace(-90, 90, 200),
   ...:     'lon': np.linspace(-180, 180, 10),
   ...:     'level': np.arange(3)
   ...: }
   ...: 
   ...: da = xr.DataArray(data, dims=dims, coords=coords)

On main:

In [3]:  %timeit da.mean(["time", "lat", "lon"])
3.88 ms ± 207 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]:  %timeit da.mean(["lon", "lat", "time"])
67.8 ms ± 1.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

On this PR:

In [3]:  %timeit da.mean(["time", "lat", "lon"])
3.78 ms ± 70.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]:  %timeit da.mean(["lon", "lat", "time"])
3.73 ms ± 85.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

If we use a Fortran-ordered array (wrapping np.asfortranarray around data, see the sketch below), we still get consistent performance on this PR, but it is consistently the slower of the two timings.
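
For reference, the only change to the setup above is making the data Fortran-ordered before building the DataArray (a sketch):

data_f = np.asfortranarray(data)  # column-major copy of the same values
da = xr.DataArray(data_f, dims=dims, coords=coords)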

On main:

In [3]: %timeit da.mean(["time", "lat", "lon"])
18 ms ± 317 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit da.mean(["lon", "lat", "time"])
2.75 ms ± 135 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

On this PR:

In [3]: %timeit da.mean(["time", "lat", "lon"])
18.1 ms ± 201 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit da.mean(["lon", "lat", "time"])
18.9 ms ± 674 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Honestly, this might be something that makes more sense to do on the numpy side.
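
For context, a quick way to check whether the asymmetry exists at the plain-numpy level (the timings above go through xarray, which may dispatch to numbagg rather than numpy, so this is only a sketch):

import numpy as np

a = np.random.random((1000, 200, 10, 3))  # C-ordered, same shape as the demo

# The reduced values are identical regardless of the order of the axis tuple,
# so only speed can differ; timing both orderings (e.g. with %timeit) shows
# whether numpy itself is sensitive to the axis order.
assert np.allclose(a.mean(axis=(0, 1, 2)), a.mean(axis=(2, 1, 0)))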

github-actions bot added the topic-NamedArray (Lightweight version of Variable) label on Aug 6, 2025

jsignell commented Aug 6, 2025

I opened this as an issue on numbagg: numbagg/numbagg#402

jsignell closed this on Aug 6, 2025
jsignell deleted the sort-dims branch on August 6, 2025 at 16:48
Labels: topic-NamedArray, topic-performance

Successfully merging this pull request may close these issues.

Optimize .mean calls according to dimensions' sizes?