Coordinates.dims: only include dimensions found in coordinate variables. #10609

benbovy · 2025-08-06T13:47:04Z

Closes "ds.coords.dims can display dims not present in the coordinate vars" #9466 (leaving the addition of a "dimensions" section to the Coordinates repr for a follow-up PR)
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
New functions/methods are listed in api.rst

Two comments:

To me it looks rather like a bug fix. Do others view this as a breaking change?
Dataset.coords.dims, DataArray.coords.dims and DataTree.coords.dims now all calculate the dimensions on the fly (no cache), which may have an impact on performance. I'm not sure those properties are heavily used, though (there are not many reports apart from "ds.coords.dims can display dims not present in the coordinate vars" #9466).
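To make the "on the fly" computation concrete, here is a minimal pure-Python sketch of the idea, not the actual xarray implementation. The helper name `coords_dims` and the representation of each coordinate variable as a `(dims, shape)` pair are assumptions made for illustration only.

```python
from collections.abc import Hashable, Mapping


def coords_dims(
    coord_vars: Mapping[Hashable, tuple[tuple[Hashable, ...], tuple[int, ...]]],
) -> dict[Hashable, int]:
    """Collect dimension sizes from coordinate variables only.

    Hypothetical helper: each value is a (dims, shape) pair standing in
    for a coordinate variable.
    """
    dims: dict[Hashable, int] = {}
    for name, (var_dims, shape) in coord_vars.items():
        for dim, size in zip(var_dims, shape):
            # Keep the first size seen and reject inconsistent ones.
            if dims.setdefault(dim, size) != size:
                raise ValueError(
                    f"conflicting sizes for dimension {dim!r} in {name!r}"
                )
    return dims


# An object with dims ("x", "y") that only has a coordinate along "x":
# "y" never appears because no coordinate variable uses it.
print(coords_dims({"x": (("x",), (2,))}))  # {'x': 2}
```

Since the result is rebuilt from the variables on every access, the cost grows with the number of coordinate variables, which is the performance concern raised above.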

Exclude the dimension(s) present in the wrapped Dataset / DataArray / DataTree object that do not have any coordinate variable.

This function is used internally for Xarray -> Pandas conversion, where we still want dimensions without coordinates to be converted to an index.

TomNicholas

The test failures here look real?

TomNicholas · 2025-08-06T15:15:21Z

xarray/tests/test_dataarray.py

@@ -1522,6 +1522,12 @@ def test_coords(self) -> None:
            self.mda["level_1"] = ("x", np.arange(4))
            self.mda.coords["level_1"] = ("x", np.arange(4))

+    def test_coords_dims(self) -> None:


Suggested change

-    def test_coords_dims(self) -> None:
+    # https://github.com/pydata/xarray/issues/9466
+    def test_coords_dims(self) -> None:

TomNicholas · 2025-08-06T15:45:04Z

xarray/tests/test_datatree.py

@@ -666,6 +666,18 @@ def test_properties(self) -> None:
            "b": np.dtype("int64"),
        }

+    def test_dims(self) -> None:


We should probably also add a test here similar to the one you used for DataArray.

benbovy · 2025-08-07T07:55:28Z

The test failures here look real?

This is caused by this line

xarray/xarray/core/dataarray.py

Line 453 in 6e9414d

dims = getattr(data, "dims", getattr(coords, "dims", None))

where DataArray dimensions, if not explicitly set via the constructor argument, may ultimately be set from coords.dims. However, if I replace getattr(coords, "dims", None) with None, other tests fail.
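The fallback chain in that line can be illustrated in isolation. The sketch below reuses the quoted expression verbatim, but `CoordsStandIn` and `infer_dims` are hypothetical names introduced here; they are stand-ins, not xarray classes or functions.

```python
class CoordsStandIn:
    """Hypothetical stand-in for a coordinates object exposing ``dims``."""

    def __init__(self, dims):
        self.dims = tuple(dims)


def infer_dims(data, coords):
    # Same expression as the quoted line: prefer data.dims, then coords.dims.
    return getattr(data, "dims", getattr(coords, "dims", None))


values = [[10, 20], [30, 40]]  # plain nested list, so no ``dims`` attribute

proxy = CoordsStandIn(("x", "y"))   # mimics coords reporting the wrapped object's dims
standalone = CoordsStandIn(("x",))  # mimics dims taken from coordinate variables only

print(infer_dims(values, proxy))          # ('x', 'y')
print(infer_dims(values, standalone))     # ('x',)
print(infer_dims(values, {"x": [0, 1]}))  # None
```

With the second result, 2-D data is paired with a single dimension name, which is the kind of mismatch behind the failing tests.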

This is why this PR may be considered as both a bug fix and a breaking change. Apparently quite a few tests (and maybe 3rd-party code?) rely on the dimensions of the object wrapped by the Coordinates proxy object (passed via the coords argument) to set the dimension names of the new DataArray.

On the main branch:

>>> import numpy as np
>>> import xarray as xr
>>> da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))
>>> values = np.array([[10, 20], [30, 40]])

Creating a new DataArray by directly passing the Coordinates proxy object (success, dimension name "y" magically discovered from da.coords.dims pointing to da.dims!!):

>>> coords = da.coords
>>> xr.DataArray(values, coords=coords) 
<xarray.DataArray (x: 2, y: 2)> Size: 32B
array([[10, 20],
       [30, 40]])
Coordinates:
  * x        (x) int64 16B 0 1
Dimensions without coordinates: y

Creating a new DataArray by passing a new "standalone" Coordinates object (error, the coordinates have only one "x" dimension):

>>> coords = xr.Coordinates(da.coords)
>>> xr.DataArray(values, coords=coords)
ValueError: different number of dimensions on data and dims: 2 vs 1

This seems very confusing to me, and I think that we should deprecate the first behavior above (the one that currently succeeds).
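For code that currently relies on the implicit behavior, one way to avoid depending on what coords.dims reports is to name the dimensions explicitly. The following is a sketch under that assumption, using only the constructor arguments already shown in this thread; it is not part of the PR.

```python
import numpy as np
import xarray as xr

da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))
values = np.array([[10, 20], [30, 40]])

# Pass the dimension names explicitly instead of letting the constructor
# infer them from the coordinates object.
new = xr.DataArray(values, coords=xr.Coordinates(da.coords), dims=da.dims)
print(new.dims)
```

Because the dimension names come from the dims argument, the result should not depend on whether coords is the proxy object or a standalone Coordinates object.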

benbovy added 3 commits August 6, 2025 15:29

coords.dims: only include dims found in variables

2b90358

Exclude the dimension(s) present in the wrapped Dataset / DataArray / DataTree object that do not have any coordinate variable.

Coordinates.to_index: still use full data dims

d7c40a1

This function is used internally for Xarray -> Pandas conversion, where we still want dimensions without coordinates to be converted to an index.

update tests

6e9414d

benbovy requested a review from TomNicholas August 6, 2025 13:47

TomNicholas reviewed Aug 6, 2025
