Support non-ndarray computations, cache unit calls, and add slots to DataArray by stubbiali · Pull Request #47 · mcgibbon/sympl

stubbiali · 2021-07-13T15:33:15Z

A few disparate minor changes.

Improve integration with packages implementing NumPy's ndarray API (in this specific case, GT4Py) by (i) avoiding axes transposition unless necessary, (ii) accessing the data rather than the values attribute of DataArray, and (iii) avoiding explicit coercion to np.ndarray.
Cache calls to UnitRegistry to improve performance.
Add __slots__ class attribute to DataArray to suppress warning.
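
As a rough illustration of the caching idea, here is a minimal sketch that memoizes unit parsing and skips the lookup entirely when the two unit strings are identical. The names parse_units, cached_parse_units and units_are_same are hypothetical, and parse_units is only a stand-in for an expensive UnitRegistry call; this is not the code in the PR.

```python
from functools import lru_cache


def parse_units(units):
    """Hypothetical stand-in for an expensive unit-registry lookup.

    It normalizes a whitespace-separated unit string so that the order
    of the factors does not matter ("m s" and "s m" compare equal).
    """
    return tuple(sorted(units.split()))


@lru_cache(maxsize=None)
def cached_parse_units(units):
    # Each distinct unit string is parsed at most once; repeated calls
    # with the same string are served from the cache.
    return parse_units(units)


def units_are_same(units1, units2):
    # Cheap path: identical strings need no parsing at all.
    if units1 == units2:
        return True
    # Slow path: compare the parsed (and cached) representations.
    return cached_parse_units(units1) == cached_parse_units(units2)
```

Since unit strings are hashable and a model typically uses a small, fixed set of them, an unbounded cache seems reasonable here; a bounded maxsize would be the safer choice if arbitrary strings could be passed.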

stubbiali · 2021-07-13T15:38:26Z

sympl/_core/restore_dataarray.py

-    for name, value in array_dict.items():
-        if not isinstance(value, np.ndarray):
-            array_dict[name] = np.asarray(value)
+    pass


This is a temporary and dirty solution. We could think of a mechanism to control whether arrays must be coerced or not.

We can't merge this change as-is. What problem is being solved here, and what other solutions are available for it?

NumPy's asarray function coerces the array-like input value into an ndarray. This operation could break, for example, the data layout and the memory alignment of value. In the specific case of Tasmania, value could be a GT4Py storage, whose low-level details and features are tailored to the target computing architecture and thus must be preserved. The problem can be circumvented by monkey-patching NumPy via the function gt4py.storage.prepare_numpy(), but this is very GT4Py-specific. We could think of a more organic solution, or just pass value to DataArray as it is and let DataArray perform all type checks (and raise exceptions if needed).
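
One possible shape for a mechanism that controls coercion is an explicit flag, sketched below. FakeStorage is a hypothetical stand-in for an array-like storage (it is not the GT4Py API), and prepare_arrays and its coerce parameter are assumptions made for illustration only.

```python
import numpy as np


class FakeStorage:
    """Hypothetical array-like storage (stand-in for e.g. a GT4Py storage)."""

    def __init__(self, data):
        self._data = np.array(data)

    @property
    def shape(self):
        return self._data.shape

    def __array__(self, dtype=None, copy=None):
        # Called by np.asarray; the result is a plain ndarray, so any
        # storage-specific properties are lost after coercion.
        return np.asarray(self._data, dtype=dtype)


def prepare_arrays(array_dict, coerce=True):
    """Return a new dict, coercing values to np.ndarray only if requested."""
    if not coerce:
        # Pass the storages through untouched and leave type checks
        # to whatever consumes them.
        return dict(array_dict)
    return {
        name: value if isinstance(value, np.ndarray) else np.asarray(value)
        for name, value in array_dict.items()
    }
```

With coerce=True this reproduces the removed loop; with coerce=False the original objects are handed on as they are, which is the behavior the diff above hard-codes.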

mcgibbon · 2021-08-25T18:57:02Z

sympl/_core/dataarray.py



 class DataArray(xr.DataArray):
+    __slots__ = []


What are the implications of setting this? What warning is it suppressing, and what behavior does it cause when you set this to an empty list?

This is meant to suppress the warning "FutureWarning: xarray subclass DataArray should explicitly define __slots__". Here is a nice explanation of how __slots__ works.
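
To make the effect of an empty __slots__ concrete, here is a small pure-Python sketch. Base, WithDict and Slotted are hypothetical classes standing in for a slotted parent (such as xr.DataArray) and its subclasses; no xarray is involved.

```python
class Base:
    # The parent declares its attributes through __slots__,
    # so its instances carry no per-instance __dict__.
    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


class WithDict(Base):
    # No __slots__ here: instances silently regain a __dict__.
    pass


class Slotted(Base):
    # Empty __slots__: no new attributes and no __dict__ either.
    __slots__ = []


with_dict = WithDict("a")
with_dict.extra = 1  # allowed, stored in the instance __dict__

slotted = Slotted("b")
has_dict = hasattr(slotted, "__dict__")  # False
try:
    slotted.extra = 1
    rejected = False
except AttributeError:
    # Arbitrary attributes cannot be attached to a fully slotted instance.
    rejected = True
```

The practical implication is that a subclass with empty __slots__ cannot store new instance attributes of its own; anything it needs must fit in the slots the parent already defines.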

mcgibbon · 2021-08-25T19:00:01Z

sympl/_core/get_np_arrays.py

    """
-    if len(data_array.values.shape) == 0 and len(out_dims) == 0:
-        return data_array.values  # special case, 0-dimensional scalar array
+    if len(data_array.data.shape) == 0 and len(out_dims) == 0:


This change alters the behavior of this function, which is OK, but the variable names, function name, file name, and docstring need to be updated. For example, I would suggest naming the function something like get_underlying_data.

The tests in test_get_restore_numpy_array.py should also be updated to cover cases where data is not a numpy array (and have similar re-namings).

That's correct. I will rename the function and update the tests.
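
A simplified sketch of what the renamed helper might look like is given below. It assumes out_dims lists exactly the dimension names of the DataArray; the signature is hypothetical and is not necessarily the one used in sympl.

```python
import numpy as np
import xarray as xr


def get_underlying_data(data_array, out_dims):
    """Return the array wrapped by data_array, ordered as out_dims.

    Uses .data (the wrapped array-like) instead of .values (which
    always coerces to np.ndarray), and transposes only when the
    dimension order actually differs.
    """
    if len(data_array.data.shape) == 0 and len(out_dims) == 0:
        return data_array.data  # special case, 0-dimensional scalar array
    if tuple(data_array.dims) != tuple(out_dims):
        data_array = data_array.transpose(*out_dims)
    return data_array.data


# Usage: same order returns the data as stored, a different order
# returns a transposed view.
da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y"))
same_order = get_underlying_data(da, ("x", "y"))
swapped = get_underlying_data(da, ("y", "x"))
```

A test for the non-NumPy case could wrap any array-like object supported by xarray and check that the returned object keeps its original type.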

mcgibbon · 2021-08-25T19:01:09Z

sympl/_core/restore_dataarray.py

-    for name, value in array_dict.items():
-        if not isinstance(value, np.ndarray):
-            array_dict[name] = np.asarray(value)
+    pass


We can't merge this change as-is. What problem is being solved here, and what other solutions are available for it?

mcgibbon · 2021-08-25T19:04:45Z

Could you also please update the PR name with a brief description of what these changes do? e.g. "support non-ndarray computations, cache unit calls, add slots to DataArray"? If the name is too long, these can be put into separate PRs. The unit caching for example is mergeable as-is.

stubbiali · 2022-02-02T07:14:22Z

I think the new name is not too long ;)

stubbiali added 6 commits October 28, 2019 09:52

Transpose fields only when really needed.

9d8ec5b

Added __slots__ to DataArray class.

12327bb

Use data rather than values property of DataArrays.

be38d65

Do not coerce raw storages to np.ndarrays.

56620ea

Avoid relying on unit_registry to check if units are the same.

9ddcebd

Cache calls to UnitRegistry.

29b5f23

stubbiali changed the title ~~Miscellaneous~~ Support non-ndarray computations, cache unit calls, and add slots to DataArray Feb 2, 2022

stubbiali wants to merge 6 commits into mcgibbon:master from stubbiali:ubbiali
