Coordinates.dims: only include dimensions found in coordinate variables. #10609
Exclude the dimension(s) present in the wrapped Dataset / DataArray / DataTree object that do not have any coordinate variable.
This function is used internally for Xarray -> Pandas conversion, where we still want dimensions without coordinates to be converted to an index.
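As a rough illustration of the rule described above, here is a minimal pure-Python sketch. It is a toy model, not xarray's implementation: the `coordinate_dims` helper and the dict-based representation of variables are hypothetical and exist only to show which dimensions are kept.

```python
from typing import Dict, Tuple

# Toy model (hypothetical, not xarray's data structures): each coordinate
# variable is described only by the tuple of dimension names it spans.
VarDims = Dict[str, Tuple[str, ...]]


def coordinate_dims(object_sizes: Dict[str, int], coord_vars: VarDims) -> Dict[str, int]:
    """Return the sizes of only those dimensions used by a coordinate variable."""
    used = {dim for dims in coord_vars.values() for dim in dims}
    # Keep the order of the wrapped object's dimensions.
    return {dim: size for dim, size in object_sizes.items() if dim in used}


# A 2-D object with dimensions ("x", "y") and a single coordinate on "x".
object_sizes = {"x": 2, "y": 2}
coord_vars = {"x": ("x",)}

# "y" has no coordinate variable, so it is excluded from the result.
print(coordinate_dims(object_sizes, coord_vars))
```

In this sketch the dimension "y" is dropped because no coordinate variable spans it, which mirrors the behavior proposed for `Coordinates.dims`.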
The test failures here look real?
@@ -1522,6 +1522,12 @@ def test_coords(self) -> None:
self.mda["level_1"] = ("x", np.arange(4))
self.mda.coords["level_1"] = ("x", np.arange(4))

def test_coords_dims(self) -> None:
Suggested change:
- def test_coords_dims(self) -> None:
+ # https://github.com/pydata/xarray/issues/9466
+ def test_coords_dims(self) -> None:
@@ -666,6 +666,18 @@ def test_properties(self) -> None:
"b": np.dtype("int64"),
}

def test_dims(self) -> None:
We should probably also add a test here similar to the one you used for DataArray.
This is caused by this line (xarray/xarray/core/dataarray.py, line 453 in 6e9414d), where DataArray dimensions, if not explicitly set via the constructor argument, may be ultimately set from [...]

This is why this PR may be considered both a bug fix and a breaking change. Apparently quite a few tests (and maybe 3rd-party code?) rely on the dimensions of the object wrapped by the Coordinates proxy object (passed via the [...]

On the [...]

>>> import numpy as np
>>> import xarray as xr
>>> da = xr.DataArray([[1, 2], [3, 4]], coords={"x": [0, 1]}, dims=("x", "y"))
>>> values = np.array([[10, 20], [30, 40]])

Creating a new DataArray by directly passing the Coordinates proxy object (success, dimension name "y" magically discovered from [...]

>>> coords = da.coords
>>> xr.DataArray(values, coords=coords)
<xarray.DataArray (x: 2, y: 2)> Size: 32B
array([[10, 20],
       [30, 40]])
Coordinates:
  * x        (x) int64 16B 0 1
Dimensions without coordinates: y

Creating a new DataArray by passing a new "standalone" Coordinates object (error, the coordinates have only one "x" dimension):

>>> coords = xr.Coordinates(da.coords)
>>> xr.DataArray(values, coords=coords)
ValueError: different number of dimensions on data and dims: 2 vs 1

This seems very confusing to me and I think that we should deprecate the succeeding behavior above.
Two comments:
Dataset.coords.dims, DataArray.coords.dims and DataTree.coords.dims now all calculate the dimensions on the fly (no cache), which may have an impact on performance. Not sure those properties are heavily used, though (not many reports apart from "ds.coords.dims can display dims not present in the coordinate vars", #9466).
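The trade-off between computing on the fly and caching can be sketched in plain Python. This is only an illustration under assumed names: `dims_of`, `OnTheFlyCoords` and `CachedCoords` are hypothetical and do not correspond to xarray classes. The sketch shows why recomputing costs a pass over the coordinate variables on each access, while a cache can go stale once the coordinates change.

```python
from functools import cached_property
from typing import Dict, Tuple

VarDims = Dict[str, Tuple[str, ...]]


def dims_of(coord_vars: VarDims) -> Tuple[str, ...]:
    """Collect dimension names from all coordinate variables, in first-seen order."""
    return tuple(dict.fromkeys(d for dims in coord_vars.values() for d in dims))


class OnTheFlyCoords:
    """Hypothetical proxy: recomputes dims on every access (always up to date)."""

    def __init__(self, coord_vars: VarDims) -> None:
        self.coord_vars = coord_vars

    @property
    def dims(self) -> Tuple[str, ...]:
        return dims_of(self.coord_vars)


class CachedCoords:
    """Hypothetical proxy: computes dims once, then reuses the stored result."""

    def __init__(self, coord_vars: VarDims) -> None:
        self.coord_vars = coord_vars

    @cached_property
    def dims(self) -> Tuple[str, ...]:
        return dims_of(self.coord_vars)


coord_vars: VarDims = {"x": ("x",)}
fresh, cached = OnTheFlyCoords(coord_vars), CachedCoords(coord_vars)
first_access = (fresh.dims, cached.dims)  # both see only "x" at this point

coord_vars["y"] = ("y",)  # a coordinate is added after the first access
print(fresh.dims)   # recomputed, so it includes "y"
print(cached.dims)  # still the value stored at first access
```

Computing on the fly avoids any cache invalidation logic, at the cost of one pass over the coordinate variables per access.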