Summary
download_dataset has ~18 method definitions across src/DataWrangling/ and ext/, and they don't agree on a return value. Three different return types are in use, and there isn't a clear pattern (e.g. by single-date vs multi-date) that callers can rely on.
Surfaced while reviewing the API in #93.
Current return types
String (single file path) — 5 methods
src/DataWrangling/ETOPO/ETOPO.jl:58 — download_dataset(::ETOPOMetadatum)
src/DataWrangling/ORCA/ORCA.jl:110 — download_dataset(::ORCAMetadatum)
src/DataWrangling/OSPapa/OSPapa_ocean_observations.jl:69 — download_dataset(::OSPapaMetadata)
ext/NumericalEarthCopernicusMarineExt.jl:24 — download_dataset(::GLORYSMetadatum; …)
ext/NumericalEarthCDSAPIExt.jl:105 — download_dataset(::ERA5Metadatum; …)
Vector{String} (multiple file paths) — 3 methods
src/DataWrangling/DataWrangling.jl:246 — generic fallback download_dataset(::Metadata)
ext/NumericalEarthCopernicusMarineExt.jl:16 — download_dataset(::GLORYSMetadata; …)
ext/NumericalEarthCDSAPIExt.jl:227 — download_dataset(names, ::ERA5PressureMetadatum; …)
nothing — 8+ methods
src/DataWrangling/EN4/EN4.jl:207 — download_dataset(::Metadata{<:EN4Monthly})
src/DataWrangling/JRA55/JRA55_metadata.jl:192 — download_dataset(::JRA55Metadata)
src/DataWrangling/ECCO/ECCO.jl:304 — download_dataset(::ECCOMetadata)
src/DataWrangling/OSPapa/OSPapa_flux_observations.jl:97 — download_dataset(::OSPapaFluxMetadata)
ext/NumericalEarthWOAExt.jl:38 — download_dataset(::Metadata{<:WOAClimatology}; …)
ext/NumericalEarthCDSAPIExt.jl:138 — download_dataset(::ERA5Metadata; …) (implicit nothing)
ext/NumericalEarthCDSAPIExt.jl:214 — download_dataset(names, ::ERA5PressureMetadata; …)
ext/NumericalEarthCDSAPIExt.jl:309 — download_dataset(names, ::ERA5Dataset, datetimes; …)
Delegating overloads (inherit one of the above)
ext/NumericalEarthCDSAPIExt.jl:291, 298, 327
Why it matters
-
Within a single module. ext/NumericalEarthCDSAPIExt.jl returns String, Vector{String}, or nothing depending on which overload you hit — even for the same dataset family. A caller has to read the source to know which one applies.
-
Composability. A consistent return of paths would let callers chain into NCDataset, metadata_path validators, etc. without re-deriving the path from the metadata.
Suggested resolution
Standardize on "return the path(s) on disk":
- Single-file methods →
String
- Multi-file methods →
Vector{String}
- Drop the
nothing returns; replace with return metadata_path(metadata) (which already gives String for Metadatum and Vector{String} for Metadata).
This is mechanical (~8 lines, one per offending method), cheap, and makes the dispatch tree predictable. It's a behavior change in a public-ish API, so it should land as its own small PR (separate from feature work) and call out the change in the next release notes.
Summary
download_datasethas ~18 method definitions acrosssrc/DataWrangling/andext/, and they don't agree on a return value. Three different return types are in use, and there isn't a clear pattern (e.g. by single-date vs multi-date) that callers can rely on.Surfaced while reviewing the API in #93.
Current return types
String(single file path) — 5 methodssrc/DataWrangling/ETOPO/ETOPO.jl:58—download_dataset(::ETOPOMetadatum)src/DataWrangling/ORCA/ORCA.jl:110—download_dataset(::ORCAMetadatum)src/DataWrangling/OSPapa/OSPapa_ocean_observations.jl:69—download_dataset(::OSPapaMetadata)ext/NumericalEarthCopernicusMarineExt.jl:24—download_dataset(::GLORYSMetadatum; …)ext/NumericalEarthCDSAPIExt.jl:105—download_dataset(::ERA5Metadatum; …)Vector{String}(multiple file paths) — 3 methodssrc/DataWrangling/DataWrangling.jl:246— generic fallbackdownload_dataset(::Metadata)ext/NumericalEarthCopernicusMarineExt.jl:16—download_dataset(::GLORYSMetadata; …)ext/NumericalEarthCDSAPIExt.jl:227—download_dataset(names, ::ERA5PressureMetadatum; …)nothing— 8+ methodssrc/DataWrangling/EN4/EN4.jl:207—download_dataset(::Metadata{<:EN4Monthly})src/DataWrangling/JRA55/JRA55_metadata.jl:192—download_dataset(::JRA55Metadata)src/DataWrangling/ECCO/ECCO.jl:304—download_dataset(::ECCOMetadata)src/DataWrangling/OSPapa/OSPapa_flux_observations.jl:97—download_dataset(::OSPapaFluxMetadata)ext/NumericalEarthWOAExt.jl:38—download_dataset(::Metadata{<:WOAClimatology}; …)ext/NumericalEarthCDSAPIExt.jl:138—download_dataset(::ERA5Metadata; …)(implicitnothing)ext/NumericalEarthCDSAPIExt.jl:214—download_dataset(names, ::ERA5PressureMetadata; …)ext/NumericalEarthCDSAPIExt.jl:309—download_dataset(names, ::ERA5Dataset, datetimes; …)Delegating overloads (inherit one of the above)
ext/NumericalEarthCDSAPIExt.jl:291, 298, 327Why it matters
Within a single module.
ext/NumericalEarthCDSAPIExt.jlreturnsString,Vector{String}, ornothingdepending on which overload you hit — even for the same dataset family. A caller has to read the source to know which one applies.Composability. A consistent return of paths would let callers chain into
NCDataset,metadata_pathvalidators, etc. without re-deriving the path from the metadata.Suggested resolution
Standardize on "return the path(s) on disk":
StringVector{String}nothingreturns; replace withreturn metadata_path(metadata)(which already givesStringforMetadatumandVector{String}forMetadata).This is mechanical (~8 lines, one per offending method), cheap, and makes the dispatch tree predictable. It's a behavior change in a public-ish API, so it should land as its own small PR (separate from feature work) and call out the change in the next release notes.