How will we be handling test data in v4 of Parcels? At the moment we download it via parcels/tools/exampledata_utils.py, but perhaps it would be better to:
- Generate the data (i.e., xarray datasets) on the fly in the idealised cases
- Download the data (from some hosting provider) for realistic cases, perhaps using pooch rather than our current custom downloading mechanism.
This will also mean that we can remove parcels/data (and any other committed data files) which I think would be good.
Let's continue this discussion in an issue :)
Originally posted by @VeckoTheGecko in #1946 (comment)
Okay. Let's flesh this out...
Summary
- For idealised cases: Generate the data
- For real-world cases: Download the data
  - For small datasets (<100 MB per file): Host on GitHub via https://github.com/OceanParcels/parcels-data
  - For large datasets: We won't have any example datasets in this category. If we need them, we can investigate using Zenodo.
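For the idealised case, a rough sketch of what "generating the data" could look like is below. The function name, the variable names (`U`, `V`), the grid sizes, and the solid-body rotation flow are all assumptions made for illustration; none of them come from the Parcels codebase.

```python
import numpy as np
import xarray as xr


def generate_idealised_flow(nx=20, ny=10, nt=3):
    """Return a small synthetic velocity field as an xarray Dataset.

    Hypothetical helper: the name, the variables and the flow itself
    are illustrative assumptions, not the Parcels API.
    """
    lon = np.linspace(0.0, 10.0, nx)
    lat = np.linspace(0.0, 5.0, ny)
    time = np.arange(nt, dtype=float) * 3600.0  # seconds since start

    # Solid-body rotation about the domain centre (arbitrary units):
    # u = -(y - y0), v = (x - x0). meshgrid returns arrays of shape (ny, nx).
    x, y = np.meshgrid(lon - lon.mean(), lat - lat.mean())

    # The flow is steady, so repeat the same 2D field along the time axis.
    u = np.broadcast_to(-y, (nt, ny, nx)).copy()
    v = np.broadcast_to(x, (nt, ny, nx)).copy()

    return xr.Dataset(
        data_vars={
            "U": (("time", "lat", "lon"), u),
            "V": (("time", "lat", "lon"), v),
        },
        coords={"time": time, "lat": lat, "lon": lon},
    )


ds = generate_idealised_flow()
```

Because the dataset is built in memory, nothing needs to be committed to the repository or downloaded for tests that only need a simple analytical field.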
In all cases the return value should be an xarray Dataset object. Feedback is welcome on this point: is this suitable for unstructured grids, or would it be better to return a uxarray dataset? Are there instances where we would want a collection (e.g., a list) of xr.Dataset objects?
TODO
- MovingEddies_data dataset to be generated
- download_example_dataset in favor of get_example_dataset (which will return an xarray object)
- tests/test_data that don't need to be there (i.e., they can be generated with code)