-
Notifications
You must be signed in to change notification settings - Fork 5
Add write_to_netcdf support for OutputVar #370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ph-kev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this issue! I left some comments that should be addressed. Also, could you also add this to NEWS.md as well? Let me know if there was anything that isn't clear.
| """ | ||
| write_to_netcdf(path::AbstractString, var::OutputVar) | ||
|
|
||
| Write the `OutputVar` to a NetCDF file at `path`, overwriting any existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should be overwriting a file if it exists. An error should be thrown instead.
| error( | ||
| "Dimension $dim_name must be 1D to be written to NetCDF (ndims=$(ndims(dim_val)))", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is okay to have. Can you mention it in the documentation?
| var_dim_names = collect(keys(var.dims)) | ||
| ndims(var.data) == length(var.dims) || error( | ||
| "Number of dimensions in data ($(ndims(var.data))) does not match number of dims ($(length(var.dims)))", | ||
| ) | ||
| for (i, (dim_name, dim_val)) in enumerate(var.dims) | ||
| if ndims(dim_val) == 1 | ||
| expected_len = length(dim_val) | ||
| actual_len = size(var.data, i) | ||
| expected_len == actual_len || error( | ||
| "Dimension $dim_name has length $expected_len but data has size $actual_len along dimension $i", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to do this check. This check is already done in the OutputVar constructor. If this is possible, then this is a bug in ClimaAnalysis.
| "Attributes:\n long_name => hi\nDimension attributes:\n lat:\n units => deg\nData defined over:\n lat with 0 element" | ||
| end | ||
|
|
||
| @testset "Write to NetCDF" begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For your tests, can you use TemplateVar instead? This should simplify the construction of the OutputVar.
| @test var_read.attributes["short_name"] == attribs["short_name"] | ||
| @test var_read.attributes["long_name"] == attribs["long_name"] | ||
| @test var_read.attributes["units"] == attribs["units"] | ||
| @test var_read.dims["lon"] == lon | ||
| @test var_read.dims["lat"] == lat | ||
| @test var_read.data ≈ data | ||
| @test var_read.dim_attributes["lon"]["units"] == "deg" | ||
| @test var_read.dim_attributes["lat"]["units"] == "deg" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably simplify this to something like
| @test var_read.attributes["short_name"] == attribs["short_name"] | |
| @test var_read.attributes["long_name"] == attribs["long_name"] | |
| @test var_read.attributes["units"] == attribs["units"] | |
| @test var_read.dims["lon"] == lon | |
| @test var_read.dims["lat"] == lat | |
| @test var_read.data ≈ data | |
| @test var_read.dim_attributes["lon"]["units"] == "deg" | |
| @test var_read.dim_attributes["lat"]["units"] == "deg" | |
| @test var_read.attributes == var.attributes | |
| @test var_read.dims == var.dims | |
| @test var_read.data ≈ var.data | |
| @test var_read.dim_attributes == var_read.dim_attributes |
|
|
||
| nc_var[:] = var.data | ||
|
|
||
| # Write attributes with type safety (overwrite existing keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. It would be better to write that attributes of NetCDF files must be basic scalar types or strings (like what you did for the documentation of _netcdf_safe_attrib.
| # Test 6: NaN round-trip | ||
| nan_path = joinpath(tmpdir, "nan_file.nc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need a test case for this.
| # # Test 7: Scalar (0D) round-trip | ||
| # scalar_path = joinpath(tmpdir, "scalar_file.nc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not possible in general with NetCDF files?
| var = ClimaAnalysis.OutputVar(attribs, dims, dim_attribs, data) | ||
| ClimaAnalysis.write_to_netcdf(nc_path, var) | ||
|
|
||
| var_read = ClimaAnalysis.read_var(nc_path; short_name = "test_var") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to pass a short name here.
| var_read = ClimaAnalysis.read_var(nc_path; short_name = "test_var") | |
| var_read = ClimaAnalysis.read_var(nc_path) |
| v isa AbstractString && return v | ||
| v isa Number && return v | ||
| v isa AbstractArray && eltype(v) <: Union{Number, AbstractString} && | ||
| return v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are different concrete types of AbstractString like String, LazyString, and SubString. Even though, we probably won't expect any of these, I think it would be better to do
v isa AbstractString && string(v)I think you can apply a similar reasoning to the other paths of the function as well.
Purpose
closes #368
This PR adds support for writing an
OutputVarback to a NetCDF file viawrite_to_netcdf.The goal is to enable round trip workflow where users can read data, perform analysis, and then persist the resulting variable for later use.
To-do
Content
write_to_netcdf(path, var)to write anOutputVarto a NetCDF file, including its data, dimensions, and associated attributes._netcdf_safe_attribto convert attribute values to NetCDF-compatible types when writing.
short_namerequirement for NetCDF output.Notes on scalar variables
Scalar (0D) variables: A round-trip test for scalar
OutputVars (Test 7 intest/test_Var.jl) is currently commented out.When reading a scalar variable from NetCDF,
read_varreturns an empty, untypedOrderedDictfor dimensions. This does not match the typed dictionary expected by theOutputVarconstructor and results in aMethodErrorduring the read step.Since this behavior originates in the existing read/constructor path and not in the new NetCDF writing logic, it was left unchanged here to keep this PR focused on issue #368. This can be addressed separately in a follow-up PR if desired.
If there are any adjustments you’d like to see I’m happy to make those changes.