@davidhassell

Fixes #354 and #355

Orientation

cfdm/data/aggregatedarray.py

  • Allow URIs to be Python strings or scalar numpy string types.
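
A minimal sketch of accepting both kinds of URI value, assuming only that the numpy scalar string type (`numpy.str_`) is a subclass of the built-in `str`. The helper name `normalise_uri` and the stand-in class are hypothetical and are not taken from cfdm:

```python
def normalise_uri(uri):
    """Return `uri` as an exact built-in `str` (hypothetical helper)."""
    if isinstance(uri, str):
        # numpy.str_ subclasses str, so it is accepted here too;
        # str() then returns a plain built-in string.
        return str(uri)

    raise TypeError(f"URI must be a string, got {type(uri).__name__}")


class StrSubclass(str):
    """Stand-in for a scalar numpy string, so the sketch needs no numpy."""


print(type(normalise_uri(StrSubclass("file:///data/fragment.nc"))).__name__)
```

Returning an exact `str` keeps later code free of type checks on the URI value.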

cfdm/data/data.py

  • Include Zarr dataset sharding methods
  • Include Zarr fragment class
  • Move the test for an "empty" slice to the parent construct (in cfdm/mixin/propertiesdata.py), because sometimes we want to allow slices on abstract data (just like numpy does)
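
As a rough illustration of the last point, the sketch below separates computing a sliced shape from deciding whether an empty result is an error. Both function names are hypothetical and are not taken from cfdm, and only built-in `slice` objects are handled:

```python
def sliced_shape(shape, indices):
    """Return the shape produced by applying one slice per axis.

    An empty result (a zero-sized axis) is allowed at this level.
    """
    return tuple(
        len(range(*index.indices(size)))
        for size, index in zip(shape, indices)
    )


def parent_sliced_shape(shape, indices):
    """Caller-level wrapper that rejects an empty result."""
    new_shape = sliced_shape(shape, indices)
    if 0 in new_shape:
        raise IndexError(
            f"Indices {indices!r} give an empty shape: {new_shape}"
        )
    return new_shape


# The data-level helper tolerates the empty slice on the second axis
print(sliced_shape((4, 5), (slice(1, 3), slice(2, 2))))
```

With the check in the wrapper, code that legitimately needs a zero-sized result can call the lower-level function directly.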

cfdm/data/fragment/fragmentfilearray.py

  • Include Zarr fragment class

cfdm/data/fragment/fragmentzarrarray.py

  • New file defining a Zarr fragment (very similar to cfdm/data/fragment/fragmentnetcdf4array.py)

cfdm/data/netcdfindexer.py

  • Account for zarr using the new numpy T data type
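
A sketch of what accounting for the new type can look like, assuming the variable-width string dtype reports the kind code "T" alongside the older "S" (bytes) and "U" (fixed-width unicode) kinds. The helper name is hypothetical, and `SimpleNamespace` objects stand in for real dtypes so that the sketch runs without numpy:

```python
from types import SimpleNamespace

# Kind codes treated as strings: "S" bytes, "U" fixed-width unicode,
# "T" variable-width strings (assumed to be what zarr now returns)
STRING_KINDS = frozenset("SUT")


def is_string_dtype(dtype):
    """Return True if `dtype` has a string-like kind code."""
    return getattr(dtype, "kind", None) in STRING_KINDS


variable_width = SimpleNamespace(kind="T")  # stand-in for the new dtype
float64 = SimpleNamespace(kind="f")  # stand-in for a float dtype

print(is_string_dtype(variable_width), is_string_dtype(float64))
```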

cfdm/mixin/netcdf.py

  • New mixin class for dataset shards

cfdm/read_write/abstract/abstractio.py

  • Rename "file" to "dataset"

cfdm/read_write/netcdf/flatten/flatten.py

  • Rename "file" to "dataset"
  • Use match clauses to switch between dataset format APIs
  • Changes to allow Zarr datasets to be flattened
    • Zarr datasets do not have well-defined Dimensions, so finding
      which dimensions belong to which groups for Zarr datasets is more
      involved, and needs configuring with the new
      group_dimension_search keyword.

cfdm/read_write/netcdf/netcdfread.py

  • Rename "file" to "dataset"
  • Use match clauses to switch between dataset format APIs
  • Handle Zarr shards
  • Performance improvements in _cache_data_elements, largely aimed at
    reducing the number of requests to disk

cfdm/read_write/netcdf/netcdfwrite.py

  • Rename "file" to "dataset"
  • Use match clauses to switch between dataset format APIs
  • Handle Zarr shards

cfdm/read_write/netcdf/zarr.py

  • Include a reference variable in ZarrDimension

cfdm/read_write/read.py

  • New keywords store_dataset_shards and group_dimension_search

cfdm/read_write/write.py

  • Rename "file" to "dataset"
  • New keyword dataset_shards

setup.py

  • Make the zarr import optional
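
One common way to treat a dependency as optional at run time is to test for it only when a feature needs it. The sketch below uses the standard library for this; the helper names and the error message are hypothetical and are not taken from cfdm or its setup.py:

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def require_module(name, feature):
    """Raise an informative error if an optional module is missing."""
    if not module_available(name):
        raise ModuleNotFoundError(
            f"The '{name}' package is needed to {feature}"
        )


# Only code paths that touch Zarr datasets would make this check
if module_available("zarr"):
    print("zarr is installed")
else:
    print("zarr is not installed; Zarr features are unavailable")
```

Deferring the check means that users who only work with netCDF files do not need zarr installed.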

@davidhassell added this to the NEXTVERSION milestone Nov 13, 2025
@davidhassell added the labels enhancement (New feature or request), dataset write (Relating to writing datasets) and dataset read (Relating to reading datasets) Nov 13, 2025

Development

Successfully merging this pull request may close these issues.

Write Zarr datasets