Add GEBCO, IBCSO, and IBCAO bathymetry dataset modules#180

Open
aklocker42 wants to merge 19 commits intoNumericalEarth:mainfrom
aklocker42:add-gebco-ibcso-ibcao

@aklocker42
Collaborator

Adds three new bathymetry dataset modules to DataWrangling:

  • GEBCO: GEBCO 2024 global bathymetry (ZipFile-based, no system unzip needed)
  • IBCSO: International Bathymetric Chart of the Southern Ocean v2
  • IBCAO: International Bathymetric Chart of the Arctic Ocean

Each module follows the existing dataset pattern with metadata_filename, download_dataset, and Bathymetry loading support.

Comment thread src/DataWrangling/IBCAO/IBCAO.jl Outdated
using Oceananigans.DistributedComputations: @root
using Scratch
using NCDatasets
using ArchGDAL
Member

I don't think we have ArchGDAL as a dependency. Is it a large, mature, well-maintained package? Is it worth bringing in as a dependency?

@simone-silvestri
Member

Very nice and clear. My only (small) concern is bringing in another package (ArchGDAL) as a hard dependency.
Is it a heavy package? If so, we could think about implementing an extension.

I would add a couple of tests to the test suite. It would be nice to keep them lean, since the test suite is already quite long. Maybe regrid onto a simple LatitudeLongitudeGrid and verify that the outputs are sensible?

latitude_interfaces,
z_interfaces,
reversed_vertical_axis,
validate_dataset_coverage
Member

Suggested change
validate_dataset_coverage
validate_dataset_coverage

This function does not exist in NumericalEarth.DataWrangling.
It is a nice idea to have it, though. We can define a fallback in NumericalEarth.DataWrangling

validate_dataset_coverage(grid, metadata) = nothing

and then wire it into the regrid_bathymetry function, so that datasets with limited coverage can add their own method.
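
A minimal sketch of the fallback-plus-wiring idea, using toy stand-ins rather than the real NumericalEarth types. `ToyMetadata`, `ToyGrid`, `GlobalDataset`, `SouthernDataset`, `toy_regrid_bathymetry`, and the -50° coverage limit are all assumptions made for illustration; only `validate_dataset_coverage(grid, metadata) = nothing` comes from the suggestion above.

```julia
# Hypothetical stand-ins for the real metadata and grid types.
struct ToyMetadata{D}
    dataset :: D
end

struct GlobalDataset end     # assumed to cover the whole globe
struct SouthernDataset end   # assumed to cover latitudes south of -50°

struct ToyGrid
    latitude :: Tuple{Float64, Float64}
end

# Fallback: datasets with global coverage need no check.
validate_dataset_coverage(grid, metadata) = nothing

# A regional dataset adds a more specific method that throws for out-of-range grids.
function validate_dataset_coverage(grid, ::ToyMetadata{SouthernDataset})
    north = grid.latitude[2]
    if north > -50
        throw(ArgumentError("grid extends to $(north)°, outside the dataset coverage (south of -50°)"))
    end
    return nothing
end

# Wiring: run the check before any data is read or interpolated.
function toy_regrid_bathymetry(grid, metadata)
    validate_dataset_coverage(grid, metadata)
    return :regridded   # placeholder for the actual regridding
end

toy_regrid_bathymetry(ToyGrid((-90, 90)),  ToyMetadata(GlobalDataset()))    # passes via the fallback
toy_regrid_bathymetry(ToyGrid((-90, -50)), ToyMetadata(SouthernDataset()))  # passes the regional check
```

With this layout, global datasets pay nothing for the check, and each regional dataset only has to define one extra method.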

Collaborator Author

Not sure how heavy ArchGDAL is, but I'm currently checking whether there is an easier way without it. I'll try to write a test.

Collaborator Author

Added test/test_polar_bathymetry.jl — tests metadata interfaces (longitude/latitude extents, grid sizes, variable names) and validate_dataset_coverage for IBCSO, GEBCO, and IBCAO. Includes a regridding test for IBCSO over the Drake Passage that checks for finite values and physically sensible ocean depths.

@simone-silvestri
Member

A comparison of the new bathymetries with ETOPO2022:

[Image: compare_new_bathymetries]

generated by

using NumericalEarth
using NumericalEarth.DataWrangling: GEBCO2024, IBCSOv2, IBCAOv5
using Oceananigans
using CairoMakie

function tenth_degree_grid(latitude)
    Nφ = round(Int, 10 * (latitude[2] - latitude[1]))
    Nλ = 3600
    return LatitudeLongitudeGrid(size = (Nλ, Nφ, 1),
                                 longitude = (-180, 180),
                                 latitude  = latitude,
                                 z = (0, 1),
                                 halo = (7, 7, 1))
end

global_grid   = tenth_degree_grid((-90, 90))
southern_grid = tenth_degree_grid((-90, -50))
arctic_grid   = tenth_degree_grid((64, 90))

h_gebco   = regrid_bathymetry(global_grid;   dataset = GEBCO2024())
h_etopo_g = regrid_bathymetry(global_grid;   dataset = ETOPO2022())

h_ibcso   = regrid_bathymetry(southern_grid; dataset = IBCSOv2())
h_etopo_s = regrid_bathymetry(southern_grid; dataset = ETOPO2022())

h_ibcao   = regrid_bathymetry(arctic_grid;   dataset = IBCAOv5())
h_etopo_a = regrid_bathymetry(arctic_grid;   dataset = ETOPO2022())

function mask_land!(h)
    land = interior(h) .>= 0
    interior(h)[land] .= NaN
    return h
end

for h in (h_gebco, h_etopo_g, h_ibcso, h_etopo_s, h_ibcao, h_etopo_a)
    mask_land!(h)
end

function difference_field(h_new, h_ref)
    Δh = deepcopy(h_new)
    interior(Δh) .= interior(h_new) .- interior(h_ref)
    return Δh
end

Δ_gebco = difference_field(h_gebco, h_etopo_g)
Δ_ibcso = difference_field(h_ibcso, h_etopo_s)
Δ_ibcao = difference_field(h_ibcao, h_etopo_a)

# Symmetric diverging range around zero, ignoring NaN
symmetric_range(Δh) = let vmax = maximum(x -> isfinite(x) ? abs(x) : 0.0, interior(Δh))
    (-vmax, vmax)
end

rows = (("GEBCO2024 (global)",       h_gebco, Δ_gebco),
        ("IBCSOv2 (Southern Ocean)", h_ibcso, Δ_ibcso),
        ("IBCAOv5 (Arctic)",         h_ibcao, Δ_ibcao))

fig = Figure(size = (2000, 1400), fontsize = 20)

for (i, (title, h, Δh)) in enumerate(rows)
    ax1 = Axis(fig[i, 1]; title, xlabel = "Longitude", ylabel = "Latitude")
    hm1 = heatmap!(ax1, h; colormap = Reverse(:deep), nan_color = :lightgray)
    Colorbar(fig[i, 2], hm1; label = "Bottom height (m)")

    ax2 = Axis(fig[i, 3];
               title = title * " − ETOPO2022",
               xlabel = "Longitude", ylabel = "Latitude")
    hm2 = heatmap!(ax2, Δh;
                   colormap = :balance,
                   nan_color = :lightgray,
                   colorrange = (-200, 200))
    Colorbar(fig[i, 4], hm2; label = "Δ bottom height (m)")
end

save("compare_new_bathymetries.png", fig)

To make this work, in addition to adding validate_dataset_coverage to NumericalEarth.DataWrangling, we need to change

z_data = convert(Array{FT}, dataset["z"][:, :])

to read

z_data = convert(Array{FT}, dataset[dataset_variable_name(metadata)][:, :])

since GEBCO names its bottom-height variable "elevation" rather than "z".
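
To illustrate why the lookup helps, here is a small sketch in which the variable name is chosen by dispatch instead of being hard-coded. `ToyETOPO`, `ToyGEBCO`, and `read_bottom_height` are hypothetical, plain `Dict`s stand in for the opened NetCDF files, and the real `dataset_variable_name` takes the metadata object rather than a bare dataset type.

```julia
# Hypothetical dataset tags; the real types live in NumericalEarth.DataWrangling.
struct ToyETOPO end
struct ToyGEBCO end

# One method per dataset: "z" was the previously hard-coded name, "elevation" is GEBCO's.
dataset_variable_name(::ToyETOPO) = "z"
dataset_variable_name(::ToyGEBCO) = "elevation"

# Read the bottom height under whatever name the dataset uses.
function read_bottom_height(file, dataset, FT = Float32)
    name = dataset_variable_name(dataset)
    return convert(Array{FT}, file[name][:, :])
end

# Dicts standing in for opened files (values are made up).
etopo_file = Dict("z"         => [-10.0 -20.0; 5.0 -3000.0])
gebco_file = Dict("elevation" => [-12 -18; 7 -2990])

z_etopo = read_bottom_height(etopo_file, ToyETOPO())
z_gebco = read_bottom_height(gebco_file, ToyGEBCO())
```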

aklocker42 and others added 5 commits April 22, 2026 13:17
…lidation

Tests verify correct longitude/latitude extents, grid sizes, variable names,
and that validate_dataset_coverage throws for out-of-range grids.
All 22 tests pass on Olivia (job 559437).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t variable name, doc table reorganization

- Add validate_dataset_coverage(grid, metadata) = nothing fallback in metadata.jl
  and export it from DataWrangling; wire it into regrid_bathymetry so coverage
  checks run automatically for IBCSO and IBCAO
- Fix hardcoded dataset["z"] in regrid_bathymetry to use dataset_variable_name(metadata),
  accommodating GEBCO's "elevation" variable name
- Add test that regrid_bathymetry throws for out-of-range grids (IBCSO and IBCAO)
- Reorganize dataset table in docs into Bathymetry / Ocean reanalysis / Atmospheric
  forcing sections; fix link text for new bathymetry entries to use "overview"
@codecov

codecov Bot commented Apr 22, 2026

- Move _reproject_ibcao_to_netcdf into NumericalEarthArchGDALExt so ArchGDAL
  is only required when loading IBCAO data, not at package load time
- ArchGDAL moved from [deps] to [weakdeps] in Project.toml
- ArchGDAL added as regular dep in test/Project.toml
@simone-silvestri
Member

OK, so the decision is to make an ArchGDAL extension. ArchGDAL brings in much of the GDAL ecosystem, including a number of JLL packages, among them:

Arrow_jll, Blosc_jll, Expat_jll, GEOS_jll, HDF4_jll, HDF5_jll, LERC_jll, LibCURL_jll, LibPQ_jll (PostgreSQL!), Libtiff_jll, Lz4_jll, NetCDF_jll, OpenJpeg_jll, PCRE2_jll, PROJ_jll, Qhull_jll, SQLite_jll, XML2_jll, XZ_jll, Zlib_jll, Zstd_jll, libgeotiff_jll, libpng_jll, libwebp_jll, muparser_jll

So it is good to keep it separate, so that a limited use case (the IBCAO bathymetry) does not weigh down precompilation for everyone.
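
For reference, a rough single-script simulation of the stub-plus-method pattern that an extension relies on. The core package declares a function with no methods, and the extension adds the method once the weak dependency is loaded. `reproject_to_netcdf`, `load_regional_bathymetry`, and the file names are hypothetical, and no real ArchGDAL call is made here.

```julia
# Stub declared by the core package: a generic function with no methods yet.
function reproject_to_netcdf end

# Core code calls the stub and gives an actionable error if no method exists.
function load_regional_bathymetry(path::String)
    if !hasmethod(reproject_to_netcdf, Tuple{String})
        error("reprojection needs the optional dependency; load it first with `using ArchGDAL`")
    end
    return reproject_to_netcdf(path)
end

# Without the extension, the call fails with the message above.
message_without_extension = try
    load_regional_bathymetry("ibcao.tif")
catch err
    sprint(showerror, err)
end

# What the extension module would contribute once the weak dependency is loaded
# (here a string substitution stands in for the real reprojection).
reproject_to_netcdf(path::String) = replace(path, ".tif" => ".nc")

converted = load_regional_bathymetry("ibcao.tif")
```

In the actual package the method definition would live in the extension module, so users who never touch IBCAO never load the GDAL stack.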

@simone-silvestri
Member

Thinking about it a bit more, bathymetry datasets share a lot of methods (none of them has a time dimension, for example), so it might be worth introducing an AbstractBathymetryDataset that captures the common behavior.
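
A possible shape for this, as a sketch only. The concrete type names are toy stand-ins, `has_time_dimension` is a hypothetical helper, and the `"z"` default is an assumption based on the name that was previously hard-coded.

```julia
abstract type AbstractBathymetryDataset end

# Toy stand-ins for the concrete dataset types.
struct ToyGEBCO2024 <: AbstractBathymetryDataset end
struct ToyIBCSOv2   <: AbstractBathymetryDataset end

# Shared behavior, defined once on the abstract type.
has_time_dimension(::AbstractBathymetryDataset) = false
dataset_variable_name(::AbstractBathymetryDataset) = "z"   # assumed default

# Only dataset-specific details need their own method.
dataset_variable_name(::ToyGEBCO2024) = "elevation"

names = [dataset_variable_name(d) for d in (ToyGEBCO2024(), ToyIBCSOv2())]
```

Each new bathymetry module would then only override what actually differs from the defaults.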
