Skip to content

DataWrangling cleanup: standardize download_dataset returns, drop ERA5 dead code#207

Closed
ewquon wants to merge 3 commits intoNumericalEarth:mainfrom
ewquon:eq/datawrangling_cleanup
Closed

DataWrangling cleanup: standardize download_dataset returns, drop ERA5 dead code#207
ewquon wants to merge 3 commits intoNumericalEarth:mainfrom
ewquon:eq/datawrangling_cleanup

Conversation

@ewquon
Copy link
Copy Markdown
Collaborator

@ewquon ewquon commented May 1, 2026

Summary

  • Standardize download_dataset return types across all ~17 methods: String for single-file Metadatum, Vector{String} for multi-file Metadata. Drops the previous mix of String / Vector{String} / nothing returns. Closes Inconsistent return types across download_dataset implementations #201.
  • Refactor ERA5 helpers (download_era5_day, download_era5_multivar_day) to return their already-computed path lists instead of nothing, and to construct paths via joinpath(dir, metadata_filename(...)) directly rather than instantiating throwaway Metadatum objects.
  • Remove dead code in the ERA5 module: unused ERA5_single_level_accumulated_variables Set, unused Day import, redundant metadata_path import, and 1-arg metadata_prefix overloads that duplicated the 4-arg form (tests updated to call 4-arg form directly).

Test plan

  • Existing CPU test suite passes
  • test_cds_downloading.jl still exercises the ERA5 surface + pressure-level paths
  • Spot-check that download_dataset(...) callers receive paths (previously discarded)

🤖 Generated with Claude Code

ewquon and others added 2 commits May 1, 2026 13:53
- Drop the unused `ERA5_single_level_accumulated_variables` Set in
  `ERA5_single_levels.jl` — it was defined with a comment about CDS
  step types but never wired up to anything (no references in src/,
  ext/, test/, or examples/).
- Drop the unused `Day` from the `using Dates: ...` import in `ERA5.jl`
  (only `DateTime`, `Month`, `Hour` are referenced).
- Drop the redundant `metadata_path` from the `import` block — already
  brought in via `using NumericalEarth.DataWrangling: metadata_path`,
  and the ERA5 module never extends it (the `import` form is for
  extending only).
- Delete the 1-arg `metadata_prefix(::ERA5Metadata)` /
  `(::ERA5PressureMetadata)` overloads. They duplicate the logic of
  the 4-arg form `metadata_prefix(dataset, name, date, region)` (which
  is what `metadata_filename` actually calls). Update the
  metadata_prefix unit tests to call the 4-arg form directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make download_dataset consistently return paths instead of nothing:
String for single-file Metadatum, Vector{String} for multi-file Metadata.
ERA5 helpers now return their already-computed path lists rather than
re-deriving them via throwaway Metadatum construction.

Closes NumericalEarth#201.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ewquon ewquon added the clean up 🧹 technical debt is real people label May 1, 2026
download_era5_day and download_era5_multivar_day now return path
vectors instead of nothing on the skip-existing short-circuit; update
the assertions accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 1, 2026

Codecov Report

❌ Patch coverage is 43.47826% with 13 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
ext/NumericalEarthCDSAPIExt.jl 35.29% 11 Missing ⚠️
...c/DataWrangling/OSPapa/OSPapa_flux_observations.jl 0.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@ewquon
Copy link
Copy Markdown
Collaborator Author

ewquon commented May 1, 2026

Oops, accidentally opened a PR from my fork again.

@ewquon ewquon closed this May 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clean up 🧹 technical debt is real people

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent return types across download_dataset implementations

1 participant