Background
Surfaced during code review of the merge resolution in #93 (ERA5 pressure-levels branch picking up the bbox→region refactor from #142). Several near-identical helper patterns now live in multiple dataset modules and could be hoisted into src/DataWrangling/metadata.jl (or a sibling utility file) for reuse.
These are pre-existing duplications, not regressions from #93, but the merge made them more visible and the new region schema is a natural moment to factor them out.
Findings
1. bbox_strs / region_suffix duplicated between ERA5 and GLORYS
src/DataWrangling/ERA5/ERA5.jl — defines bbox_strs(::Nothing), bbox_strs(::Number), bbox_strs(c) and region_suffix
src/DataWrangling/GLORYS/GLORYS.jl — defines essentially identical functions
These are pure string-formatting utilities that depend only on the structure of BoundingBox / Column regions. They should live in metadata.jl and be reused.
2. metadata_prefix(dataset, name, date, region) pattern across modules
ERA5, GLORYS, ECCO, and EN4 each implement metadata_prefix following nearly identical logic:
- Look up the variable name string for the dataset
- Format start and end dates
- Append
region_suffix(region)
- Apply
colon2dash / underscore_spaces cleanup
A shared metadata_prefix_base(varname_str, ds_name, start_str, end_str, region) helper in metadata.jl would let each dataset module shrink to a thin wrapper that supplies the variable-name lookup and dataset_name.
3. inpainted_metadata_filename reimplemented per dataset
src/DataWrangling/ERA5/ERA5.jl, src/DataWrangling/GLORYS/GLORYS.jl, and src/DataWrangling/EN4/EN4.jl each define inpainted_metadata_filename(metadata) = ... * "_inpainted.jld2".
A default implementation in metadata.jl that derives the inpainted filename from metadata.filename would let datasets opt out only when they need custom naming.
Suggested approach
Tackle (1) → (2) → (3) in order; each enables the next. (1) is a pure relocation, (2) introduces the shared base helper, (3) folds in once metadata_prefix is centralised.
These changes touch ERA5, GLORYS, ECCO, and EN4 modules, so timing them after #93 lands (to avoid extending the merge surface) is probably wise.
Background
Surfaced during code review of the merge resolution in #93 (ERA5 pressure-levels branch picking up the bbox→region refactor from #142). Several near-identical helper patterns now live in multiple dataset modules and could be hoisted into
src/DataWrangling/metadata.jl(or a sibling utility file) for reuse.These are pre-existing duplications, not regressions from #93, but the merge made them more visible and the new
regionschema is a natural moment to factor them out.Findings
1.
bbox_strs/region_suffixduplicated between ERA5 and GLORYSsrc/DataWrangling/ERA5/ERA5.jl— definesbbox_strs(::Nothing),bbox_strs(::Number),bbox_strs(c)andregion_suffixsrc/DataWrangling/GLORYS/GLORYS.jl— defines essentially identical functionsThese are pure string-formatting utilities that depend only on the structure of
BoundingBox/Columnregions. They should live inmetadata.jland be reused.2.
metadata_prefix(dataset, name, date, region)pattern across modulesERA5, GLORYS, ECCO, and EN4 each implement
metadata_prefixfollowing nearly identical logic:region_suffix(region)colon2dash/underscore_spacescleanupA shared
metadata_prefix_base(varname_str, ds_name, start_str, end_str, region)helper inmetadata.jlwould let each dataset module shrink to a thin wrapper that supplies the variable-name lookup anddataset_name.3.
inpainted_metadata_filenamereimplemented per datasetsrc/DataWrangling/ERA5/ERA5.jl,src/DataWrangling/GLORYS/GLORYS.jl, andsrc/DataWrangling/EN4/EN4.jleach defineinpainted_metadata_filename(metadata) = ... * "_inpainted.jld2".A default implementation in
metadata.jlthat derives the inpainted filename frommetadata.filenamewould let datasets opt out only when they need custom naming.Suggested approach
Tackle (1) → (2) → (3) in order; each enables the next. (1) is a pure relocation, (2) introduces the shared base helper, (3) folds in once
metadata_prefixis centralised.These changes touch ERA5, GLORYS, ECCO, and EN4 modules, so timing them after #93 lands (to avoid extending the merge surface) is probably wise.