
Coordinates.dims: only include dimensions found in coordinate variables. #10609


Open: benbovy wants to merge 3 commits into main


benbovy
Member

@benbovy benbovy commented Aug 6, 2025

Two comments:

  • To me it looks rather like a bug fix. Do others view this as a breaking change?
  • Dataset.coords.dims, DataArray.coords.dims and DataTree.coords.dims now all calculate the dimensions on the fly (no cache), which may have an impact on performance. I'm not sure those properties are heavily used, though (there are not many reports apart from "ds.coords.dims can display dims not present in the coordinate vars", #9466).
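To make the second point concrete, here is a minimal sketch of computing the dimensions on the fly from the coordinate variables alone. The helper name `dims_from_coord_variables` is hypothetical and is not the implementation in this PR; it only iterates over the public `coords` mapping, so its result does not depend on what `coords.dims` itself returns.

```python
import numpy as np
import xarray as xr


def dims_from_coord_variables(obj):
    """Hypothetical helper: dimension names used by at least one coordinate variable."""
    seen = {}  # a dict keeps first-seen order and drops duplicates
    for coord in obj.coords.values():
        for dim in coord.dims:
            seen.setdefault(dim, None)
    return tuple(seen)


# "y" is a dimension of the array but has no coordinate variable.
da = xr.DataArray(np.zeros((2, 2)), coords={"x": [0, 1]}, dims=("x", "y"))

coord_dims = dims_from_coord_variables(da)  # only "x" is found
array_dims = da.dims                        # both "x" and "y"
```

Every call walks all coordinate variables, which is the cost the bullet above refers to when it mentions the missing cache.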

benbovy added 3 commits August 6, 2025 15:29
Exclude the dimension(s) present in the wrapped Dataset / DataArray /
DataTree object that do not have any coordinate variable.
This function is used internally for Xarray -> Pandas conversion, where
we still want dimensions without coordinates to be converted to an index.
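The conversion behavior described in this commit message can be observed through the public API. The sketch below is only an illustration under that assumption: it uses `DataArray.to_series()` and does not show which internal function the message refers to, since the message leaves it unnamed.

```python
import xarray as xr

# "y" has no coordinate variable, only "x" does.
da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))

# Converting to pandas still produces an index level for "y",
# built from default integer positions.
series = da.to_series()
index_names = list(series.index.names)
```

Here the resulting index has one level per dimension of the array, including the dimension without a coordinate.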
@benbovy benbovy requested a review from TomNicholas August 6, 2025 13:47
Member

@TomNicholas TomNicholas left a comment


The test failures here look real?

@@ -1522,6 +1522,12 @@ def test_coords(self) -> None:
self.mda["level_1"] = ("x", np.arange(4))
self.mda.coords["level_1"] = ("x", np.arange(4))

def test_coords_dims(self) -> None:

Suggested change
- def test_coords_dims(self) -> None:
+ # https://github.com/pydata/xarray/issues/9466
+ def test_coords_dims(self) -> None:

@@ -666,6 +666,18 @@ def test_properties(self) -> None:
"b": np.dtype("int64"),
}

def test_dims(self) -> None:

Should probably also add a test here similar to the one you used for DataArray.

@benbovy
Member Author

benbovy commented Aug 7, 2025

> The test failures here look real?

This is caused by the following line:

dims = getattr(data, "dims", getattr(coords, "dims", None))

where the DataArray dimensions, if not explicitly set via the constructor argument, may ultimately be set from coords.dims. However, if I replace getattr(coords, "dims", None) with None, other tests fail.

This is why this PR may be considered both a bug fix and a breaking change. Apparently quite a few tests (and maybe third-party code?) rely on the dimensions of the object wrapped by the Coordinates proxy object (passed via the coords argument) to set the dimension names of the new DataArray.
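The fallback chain in that line can be sketched without xarray. In the snippet below, `CoordsStandIn` and `infer_dims` are hypothetical names used only for illustration; the stand-in class mimics nothing more than the `.dims` attribute of a coordinates object.

```python
class CoordsStandIn:
    """Hypothetical stand-in for a coordinates object that only exposes `.dims`."""

    def __init__(self, dims):
        self.dims = tuple(dims)


def infer_dims(data, coords):
    # Same fallback chain as the quoted line: prefer data.dims,
    # then coords.dims, then None.
    return getattr(data, "dims", getattr(coords, "dims", None))


values = [[10, 20], [30, 40]]  # plain nested list, so it has no `.dims`

# A proxy whose `.dims` mirrors the wrapped 2-D object.
proxy_like = CoordsStandIn(["x", "y"])
# Standalone coordinates that only know about their own variables.
standalone_like = CoordsStandIn(["x"])

dims_from_proxy = infer_dims(values, proxy_like)            # ("x", "y")
dims_from_standalone = infer_dims(values, standalone_like)  # ("x",)
dims_without_coords = infer_dims(values, None)              # None
```

With the proxy-like object the 2-D data gets two dimension names; with the standalone-like object it gets only one, which is the mismatch behind the failing tests.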

On the main branch:

>>> import numpy as np
>>> import xarray as xr
>>> da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))
>>> values = np.array([[10, 20], [30, 40]])

Creating a new DataArray by directly passing the Coordinates proxy object succeeds: the dimension name "y" is magically discovered from da.coords.dims, which points to da.dims!

>>> coords = da.coords
>>> xr.DataArray(values, coords=coords) 
<xarray.DataArray (x: 2, y: 2)> Size: 32B
array([[10, 20],
       [30, 40]])
Coordinates:
  * x        (x) int64 16B 0 1
Dimensions without coordinates: y

Creating a new DataArray by passing a new "standalone" Coordinates object raises an error, because the coordinates have only one dimension, "x":

>>> coords = xr.Coordinates(da.coords)
>>> xr.DataArray(values, coords=coords)
ValueError: different number of dimensions on data and dims: 2 vs 1

This seems very confusing to me, and I think that we should deprecate the first behavior above (the one that succeeds).
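For code that currently relies on the implicit behavior, one possible way to avoid it is to pass the dimension names explicitly. This is only a sketch of that idea, not something proposed in this thread; it reuses the `da` and `values` objects from the example above.

```python
import numpy as np
import xarray as xr

da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))
values = np.array([[10, 20], [30, 40]])

# Passing dims explicitly means the constructor does not have to
# infer them from coords.dims.
new_da = xr.DataArray(values, coords=da.coords, dims=da.dims)
```

Because `dims` is given, the result does not depend on which dimensions `da.coords.dims` reports.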


Successfully merging this pull request may close these issues.

ds.coords.dims can display dims not present in the coordinate vars
2 participants