Skip to content

Inconsistent return types across download_dataset implementations #201

@ewquon

Description

@ewquon

Summary

download_dataset has ~18 method definitions across src/DataWrangling/ and ext/, and they don't agree on a return value. Three different return types are in use, and there isn't a clear pattern (e.g. by single-date vs multi-date) that callers can rely on.

Surfaced while reviewing the API in #93.

Current return types

String (single file path) — 5 methods

  • src/DataWrangling/ETOPO/ETOPO.jl:58download_dataset(::ETOPOMetadatum)
  • src/DataWrangling/ORCA/ORCA.jl:110download_dataset(::ORCAMetadatum)
  • src/DataWrangling/OSPapa/OSPapa_ocean_observations.jl:69download_dataset(::OSPapaMetadata)
  • ext/NumericalEarthCopernicusMarineExt.jl:24download_dataset(::GLORYSMetadatum; …)
  • ext/NumericalEarthCDSAPIExt.jl:105download_dataset(::ERA5Metadatum; …)

Vector{String} (multiple file paths) — 3 methods

  • src/DataWrangling/DataWrangling.jl:246 — generic fallback download_dataset(::Metadata)
  • ext/NumericalEarthCopernicusMarineExt.jl:16download_dataset(::GLORYSMetadata; …)
  • ext/NumericalEarthCDSAPIExt.jl:227download_dataset(names, ::ERA5PressureMetadatum; …)

nothing — 8+ methods

  • src/DataWrangling/EN4/EN4.jl:207download_dataset(::Metadata{<:EN4Monthly})
  • src/DataWrangling/JRA55/JRA55_metadata.jl:192download_dataset(::JRA55Metadata)
  • src/DataWrangling/ECCO/ECCO.jl:304download_dataset(::ECCOMetadata)
  • src/DataWrangling/OSPapa/OSPapa_flux_observations.jl:97download_dataset(::OSPapaFluxMetadata)
  • ext/NumericalEarthWOAExt.jl:38download_dataset(::Metadata{<:WOAClimatology}; …)
  • ext/NumericalEarthCDSAPIExt.jl:138download_dataset(::ERA5Metadata; …) (implicit nothing)
  • ext/NumericalEarthCDSAPIExt.jl:214download_dataset(names, ::ERA5PressureMetadata; …)
  • ext/NumericalEarthCDSAPIExt.jl:309download_dataset(names, ::ERA5Dataset, datetimes; …)

Delegating overloads (inherit one of the above)

  • ext/NumericalEarthCDSAPIExt.jl:291, 298, 327

Why it matters

  1. Within a single module. ext/NumericalEarthCDSAPIExt.jl returns String, Vector{String}, or nothing depending on which overload you hit — even for the same dataset family. A caller has to read the source to know which one applies.

  2. Composability. A consistent return of paths would let callers chain into NCDataset, metadata_path validators, etc. without re-deriving the path from the metadata.

Suggested resolution

Standardize on "return the path(s) on disk":

  • Single-file methods → String
  • Multi-file methods → Vector{String}
  • Drop the nothing returns; replace with return metadata_path(metadata) (which already gives String for Metadatum and Vector{String} for Metadata).

This is mechanical (~8 lines, one per offending method), cheap, and makes the dispatch tree predictable. It's a behavior change in a public-ish API, so it should land as its own small PR (separate from feature work) and call out the change in the next release notes.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions