evetion commented Aug 31, 2025

Not sure if this is something we want; it doesn't work exactly with my formats (multiple independent groups in an HDF5 file). It still lacks DiskArray support. I believe @rafaqz once looked at that for Rasters, and there's AStupidBear/HDF5Utils.jl#8.

cc @danlooo

lazarusA (Collaborator) commented Sep 2, 2025

Well, some people have asked me about this, so adding support for it would be nice, I think. We need some tests. It looks like the goal is reading, so maybe tests just for that, like in:

@testset "Reading ArchGDAL" begin

codecov bot commented Sep 2, 2025

Codecov Report

❌ Patch coverage is 0% with 53 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.69%. Comparing base (37d701f) to head (d503966).

Files with missing lines    Patch %    Lines
ext/HDF5Ext.jl              0.00%      53 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (37d701f) and HEAD (d503966).

HEAD has 3 fewer uploads than BASE: 9 for BASE (37d701f) versus 6 for HEAD (d503966).
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #41      +/-   ##
==========================================
- Coverage   52.27%   46.69%   -5.59%     
==========================================
  Files          12       13       +1     
  Lines         461      514      +53     
==========================================
- Hits          241      240       -1     
- Misses        220      274      +54     

rafaqz commented Sep 2, 2025

Rasters.jl already has this via NetCDF. I guess it's good to have a native Julia version?

danlooo (Member) commented Sep 9, 2025

Is this for HDF5 files that do not follow CF conventions? Unfortunately, we don't have a YAXDimTree yet.

evetion (Author) commented Sep 11, 2025

This is mostly for opening .h5 files that have many groups and hundreds of variables (typical NASA products), indeed not necessarily CF-compliant at all, although individual groups could be. Most of these variables share a dimension (like time or latitude) with the other variables of their group. See screenshot below.

It might also be an API question: ideally we could specify which variables we want, instead of only being able to exclude some of them.
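
As a rough illustration of that API question, here is a minimal sketch of a selection step that supports both naming the wanted variables and excluding unwanted ones. The function `select_variables`, its `keep` and `drop` keywords, and the variable names are all hypothetical and not part of any existing package API.

```julia
# Hypothetical helper, not an existing API.
# keep = nothing means "start from every variable"; drop removes names afterwards.
function select_variables(names; keep=nothing, drop=())
    selected = keep === nothing ? collect(names) : [n for n in names if n in keep]
    return [n for n in selected if !(n in drop)]
end

# Made-up variable names standing in for one group of an .h5 file.
vars = ["time", "latitude", "longitude", "h_li", "quality_flag"]

wanted = select_variables(vars; keep=["h_li", "time"])     # only the requested variables
remaining = select_variables(vars; drop=["quality_flag"])  # everything except the dropped ones
```

The order of the result follows the order of `names`, so an explicit `keep` list does not reorder the variables.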

[Image: Screenshot 2025-09-11 at 14 32 53]

felixcremer (Member) commented

I have this function, which I developed for SentinelDataSource:

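# Assumes the following are in scope (not shown in the original comment):
# CDM = CommonDataModel, ZarrDataset (ZarrDatasets.jl), Raster (Rasters.jl),
# and DimTree (DimensionalData.jl).
export open_tree

# Open a Zarr store from a path and build the tree from it.
open_tree(path::AbstractString; kwargs...) = open_tree(ZarrDataset(path); kwargs...)

# Recursively build a DimTree: data variables become lazy Rasters,
# and each group becomes a nested tree.
function open_tree(dataset::ZarrDataset; prefer_datetime=true)
    stem = DimTree()
    groupnames = CDM.groupnames(dataset)
    varnames = CDM.varnames(dataset)
    alldimnames = nesteddimnames(dataset)
    # Skip variables that are themselves dimensions (coordinate variables).
    for v in setdiff(varnames, alldimnames)
        @show v  # debug output
        setindex!(stem, Raster(CDM.variable(dataset, v); lazy=true, prefer_datetime), Symbol(v))
    end
    for g in groupnames
        @show g  # debug output
        setindex!(stem, open_tree(CDM.group(dataset, g); prefer_datetime), Symbol(g))
    end
    stem
end

# Collect the unique dimension names used by any variable in the dataset.
function nesteddimnames(zarrdataset)
    alldims = []
    for v in CDM.varnames(zarrdataset)
        append!(alldims, CDM.dimnames(CDM.variable(zarrdataset, v)))
    end
    unique(alldims)
end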

https://github.com/JuliaGeo/SentinelDataSource.jl
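
To make the variable/dimension split in that function easier to follow, here is a toy sketch of the same `setdiff` idea on plain Julia data, with no dataset libraries involved. The dictionary `vardims` and its variable names are made up for illustration; it stands in for what `CDM.varnames` and `CDM.dimnames` would report for one group.

```julia
# Made-up stand-in for one group: variable name => names of its dimensions.
vardims = Dict(
    "time"   => ["time"],
    "lat"    => ["lat"],
    "height" => ["time", "lat"],
    "flag"   => ["time"],
)

varnames = sort(collect(keys(vardims)))

# Same idea as nesteddimnames: gather every dimension name used by any variable.
alldims = String[]
for v in varnames
    append!(alldims, vardims[v])
end
unique!(alldims)

# Variables that are not themselves dimensions are the data variables.
datavars = setdiff(varnames, alldims)
```

Here `time` and `lat` are treated as coordinate variables and dropped, leaving `height` and `flag` as the entries that would become Rasters in the tree.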
