tifs_to_geozarr: propagate GDAL Unit Type + long_name into zarr attrs#53

Merged
scottstanie merged 1 commit into opera-adt:main from scottstanie:feat/tifs-to-geozarr-units on Apr 24, 2026

@scottstanie
Collaborator

Summary

When bowser tifs-to-geozarr converts a stack of GeoTIFFs whose band has a GDAL Unit Type set (e.g. dolphin's velocity.tif, with unit m/year), that unit was getting dropped on the floor. Every variable in the output zarr ended up with just {grid_mapping: spatial_ref} attrs.

Meanwhile the backend has been surfacing attrs["units"] as dsInfo.unit on /datasets for a while, and the frontend ColormapBar has a unit slot next to the colorbar that was always empty because no variable ever set the attr.

What changes

  • _Loaded dataclass gains a units: str | None field.
  • _load_group reads src.units[0] from the first file of the group at the same time it sniffs dtype (one extra line, same rasterio context). Empty-string units are normalised to None.
  • _write_variable_to_all_levels stamps da.attrs["units"] when present and always stamps da.attrs["long_name"] = lv.display_name. Both are CF-conventions attrs, so other zarr readers benefit too.
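
For readers who want the shape of the change without opening the diff, here is a minimal sketch of the three steps above. It is not the code from this PR: `LoadedSketch`, `normalise_units`, `read_first_band_units`, and `build_variable_attrs` are hypothetical names, and the attrs are built as a plain dict rather than stamped onto a DataArray. The only library behaviour assumed is that rasterio's `src.units` yields one entry per band, which may be `None` or an empty string when no unit is set.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass
class LoadedSketch:
    """Hypothetical stand-in for the PR's _Loaded dataclass."""

    name: str                 # sanitized zarr variable name, e.g. "velocity"
    display_name: str         # human-readable group name, e.g. "Velocity"
    units: str | None = None  # GDAL Unit Type of the first band, if any


def normalise_units(band_units: Sequence[str | None]) -> str | None:
    """Return the first band's unit, mapping a missing or empty value to None."""
    if not band_units:
        return None
    first = band_units[0]
    return first if first else None


def read_first_band_units(path: str) -> str | None:
    """Read the first band's unit from a GeoTIFF (requires rasterio)."""
    import rasterio  # imported lazily so the rest of the sketch runs without it

    with rasterio.open(path) as src:
        return normalise_units(src.units)


def build_variable_attrs(lv: LoadedSketch) -> dict[str, str]:
    """Build the attrs for one variable: long_name always, units only if known."""
    attrs = {"grid_mapping": "spatial_ref", "long_name": lv.display_name}
    if lv.units is not None:
        attrs["units"] = lv.units
    return attrs


lv = LoadedSketch("velocity", "Velocity", normalise_units(("m/year",)))
print(build_variable_attrs(lv))
```

Omitting `units` entirely when it is unknown, rather than writing an empty string, lets a reader tell "no unit recorded" apart from a real value.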

Side effect: the colorbar title now reads Velocity (the RasterGroup's display_name) instead of the sanitized zarr variable name velocity.

Verified

Against /Volumes/.../velocity.tif (GDAL Unit Type: m/year):

>>> lv = _load_group({'name': 'Velocity', 'file_list': [velocity_tif]}, ref)
>>> lv.name, lv.display_name, lv.units
('velocity', 'Velocity', 'm/year')

Caveat

Existing cubes must be rebuilt with bowser tifs-to-geozarr to pick up the new attrs — this is a write-time fix, not a back-fill.
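
One way to spot cubes that predate this change is to look for the attr that is now always written. The sketch below assumes you have already collected each raster variable's attrs into a plain dict (how you read them from the zarr store is up to you). `variables_missing_new_attrs` is a hypothetical helper, not part of bowser. It keys on `long_name` because `units` is legitimately absent when the source GeoTIFF had no Unit Type.

```python
from __future__ import annotations

from collections.abc import Mapping


def variables_missing_new_attrs(
    attrs_by_variable: Mapping[str, Mapping[str, object]],
) -> list[str]:
    """Return the names of variables that lack long_name, sorted alphabetically."""
    return sorted(
        name
        for name, attrs in attrs_by_variable.items()
        if "long_name" not in attrs
    )


# Attrs as they looked before the change: only grid_mapping is present.
old_cube = {"velocity": {"grid_mapping": "spatial_ref"}}
print(variables_missing_new_attrs(old_cube))
```

A non-empty result suggests the cube was written by an older converter and needs rebuilding with bowser tifs-to-geozarr.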

Test plan

  • _load_group returns the expected units on a real file (see above)
  • ruff / ruff-format / mypy pass
  • Reporter to rebuild a cube and confirm the colorbar shows m/year under the velocity bar in the UI

🤖 Generated with Claude Code

rasterio exposes the source GeoTIFF's GDAL Unit Type (set by
`SetUnitType`) as `src.units[0]`; on the palos-verdes velocity.tif it
reads as `'m/year'`. The converter was dropping it — variables in the
output zarr had only `grid_mapping: spatial_ref` attrs.

The backend already surfaces `attrs["units"]` as `dsInfo.unit` on the
`/datasets` endpoint, and the frontend `ColormapBar` already has a
`unit` slot next to the colorbar that was always empty because no
variable ever set the attr. Writing it at conversion time fills it
in automatically.

Also sets `long_name` to the RasterGroup's human-readable `display_name`
(e.g. `"Velocity"`) so the colorbar title is "Velocity" instead of
the sanitized zarr variable name `"velocity"`.

Both are CF-conventions attrs, so other zarr readers benefit too.

Note: existing cubes must be rebuilt with `bowser tifs-to-geozarr` to
pick up the attrs — the fix is at write time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scottstanie scottstanie merged commit 53f5fe0 into opera-adt:main Apr 24, 2026
0 of 2 checks passed
@scottstanie scottstanie deleted the feat/tifs-to-geozarr-units branch April 24, 2026 21:56