Skip to content

Consolidate duplicated metadata-helper code across dataset modules #196

@ewquon

Description

@ewquon

Background

Surfaced during code review of the merge resolution in #93 (ERA5 pressure-levels branch picking up the bbox→region refactor from #142). Several near-identical helper patterns now live in multiple dataset modules and could be hoisted into src/DataWrangling/metadata.jl (or a sibling utility file) for reuse.

These are pre-existing duplications, not regressions from #93, but the merge made them more visible and the new region schema is a natural moment to factor them out.

Findings

1. bbox_strs / region_suffix duplicated between ERA5 and GLORYS

  • src/DataWrangling/ERA5/ERA5.jl — defines bbox_strs(::Nothing), bbox_strs(::Number), bbox_strs(c) and region_suffix
  • src/DataWrangling/GLORYS/GLORYS.jl — defines essentially identical functions

These are pure string-formatting utilities that depend only on the structure of BoundingBox / Column regions. They should live in metadata.jl and be reused.

2. metadata_prefix(dataset, name, date, region) pattern across modules

ERA5, GLORYS, ECCO, and EN4 each implement metadata_prefix following nearly identical logic:

  1. Look up the variable name string for the dataset
  2. Format start and end dates
  3. Append region_suffix(region)
  4. Apply colon2dash / underscore_spaces cleanup

A shared metadata_prefix_base(varname_str, ds_name, start_str, end_str, region) helper in metadata.jl would let each dataset module shrink to a thin wrapper that supplies the variable-name lookup and dataset_name.

3. inpainted_metadata_filename reimplemented per dataset

src/DataWrangling/ERA5/ERA5.jl, src/DataWrangling/GLORYS/GLORYS.jl, and src/DataWrangling/EN4/EN4.jl each define inpainted_metadata_filename(metadata) = ... * "_inpainted.jld2".

A default implementation in metadata.jl that derives the inpainted filename from metadata.filename would let datasets opt out only when they need custom naming.

Suggested approach

Tackle (1) → (2) → (3) in order; each enables the next. (1) is a pure relocation, (2) introduces the shared base helper, (3) folds in once metadata_prefix is centralised.

These changes touch ERA5, GLORYS, ECCO, and EN4 modules, so timing them after #93 lands (to avoid extending the merge surface) is probably wise.

Metadata

Metadata

Assignees

No one assigned

    Labels

    clean up 🧹technical debt is real people

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions