Update Utilities/mean_nemo to handle time-varying masks and make other improvements #28

dcalve · 2026-02-04T18:11:56Z

PR Summary

Code Reviewer:

Add support for time-varying masks to mean_nemo, as well as other improvements relating to performance and testing.

Closes #26- see this for more details on the changes.

Code Quality Checklist

(Some checks are automatically carried out via the CI pipeline)

I have performed a self-review of my own code
My code follows the project's style guidelines
Comments have been included that aid undertanding and enhance the
readability of the code
My changes generate no new warnings

Testing

I have tested this change locally, using the Moci rose-stem suite
If any tests fail (rose-stem or CI) the reason is understood and
acceptable (eg. kgo changes)
I have added tests to cover new functionality as appropriate (eg. system
tests, unit tests, etc.)
- No tests have been added per se, but a new script (simple_test_data.py) has been added to generate NetCDF files containing testing datasets intended as input to mean_nemo, as well as separate files containing the expected results from mean_nemo for these input files

Other testing

Regression testing

Data from several models were used as input to mean_nemo and their results before/after the changes compared using nccmp -F -f -c 1 -d -m -w format:

GloSea ORCA025 data

The first daily mean of:

moose:devfc/rosie_u-cw134_ERA5/field.nc.file/prodh_rs_sfc_o1d_T_18_20161101_20161207_001.nc

was extracted and passed as input 30/31 times to simulate 30/31-day averages. The purpose of this was to test that the missing data issues described by MOSRS #652 did not reoccur.
GOSI9 ORCA025 data

The following files were passed as input (separately for each grid type):

moose:/crum/u-ct401/onm.nc.file/*1m_19770[678]*
GOSI10 ORCA025 data

The following files were passed as input (separately for each grid type):

moose:/crum/u-dc531/ony.nc.file/*1y_200[012]*
UKESM1.3 ORCA1 data

The following files were passed as input (separately for each grid and file type):

moose:/crum/u-dw112/onm.nc.file/*_1m_32{47,57,74}0101-*

mean_nemo produced identical results for all data, except for thickness-weighted diagnostics (those with cell_methods=time: mean (thickness weighted)) which have bit level differences. This is an expected consequence of the following change to the calculation of the average:

- meandata_4d_sp(:,:,:,:) = meandata_4d_sp(:,:,:,:) / (meancellthick_4d_sp(:,:,:,:) * ntimes_sp)
+ meandata_4d_sp(:,:,:,:) = meandata_4d_sp(:,:,:,:) / meancellthick_4d_sp(:,:,:,:)

The average of input data $D$ weighted by cell thickness $C$ is $\frac{\sum{C D}}{\sum{C}}$. meancellthick_4d* variables were previously calculated as exactly that- the average cell thickness- so $\sum{C}$ had to be recovered multplying by the number of time records used to calculate the average.

meancellthick_4d* now represents $\sum{C}$, so the extra * ntimes calculation is not necessary. This means the floating point arithmetic produces slightly different results, but since fewer calculations are performed the result will be more accurate (less accumulation of round off error).

Correctness testing

The new simple_test_data.py script generates input files for mean_nemo, as well as files containing the expected results from running mean_nemo with these input files. The variables in these files use different precisions, shapes etc and test much more mean_nemo functionality than standard model data.

The files produced by mean_nemo were compared with the expected results files generated by this script, using the same nccmp command as the regression testing.

Only one byte type variable fails this test, because mean_nemo calculates this average using byte precision arithmetic and the result overflows:

DIFFER : VARIABLE : byte_4d_unweighted_time_unmasked : POSITION : [10,10,2,1] : VALUES : 0xFFFFFFEB <> 0x15

The calculation should instead be performed using a higher precision and the result cast back to byte type. This is not addressed here, since we rarely use byte data types.

Security Considerations

I have reviewed my changes for potential security issues
~~Sensitive data is properly handled (if applicable)~~
~~Authentication and authorisation are properly implemented (if applicable)~~

Performance Impact

Performance of the code has been considered and, if applicable, suitable
performance measurements have been conducted

In addition to the timing calipers added to the code, the overall elapsed wallclock time and memory usage as given by qstat -fxw were monitored (although the memory usage fluctuates too much to be useful). The changes have no significant impact on performance.

AI Assistance and Attribution

Some of the content of this change has been produced with the assistance
of Generative AI tool name (e.g., Met Office Github Copilot Enterprise,
Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the
Simulation Systems AI policy
(including attribution labels)

Documentation

Where appropriate I have updated documentation related to this change and
confirmed that it builds correctly

Code Review

All dependencies have been resolved
Related Issues have been properly linked and addressed
CLA compliance has been confirmed
Code quality standards have been met
Tests are adequate and have passed
Documentation is complete and accurate
Security considerations have been addressed
Performance impact is acceptable

…-weighted data

…l errors are consistently reported

dcalve added 7 commits February 2, 2026 17:34

Add support for time-varying masked data

260077d

Add support for cell thickness masking not matching that of thickness…

f307cfb

…-weighted data

Ensure all stop errors resulting directly from NetCDF Fortran API cal…

13dd91e

…l errors are consistently reported

Update compilation scripts

bb89959

Remove unnecessary zeroing of input data arrays

a3896c1

Add basic timing calipers

32d8663

Add script for generating a simple dataset for testing mean_nemo

f8164a3

github-actions bot added the cla-required The CLA has not yet been signed by the author of this PR - added by GA label Feb 4, 2026

Update CONTRIBUTORS.md

527bac0

github-actions bot added cla-signed The CLA has been signed as part of this PR - added by GA and removed cla-required The CLA has not yet been signed by the author of this PR - added by GA labels Feb 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update Utilities/mean_nemo to handle time-varying masks and make other improvements #28

Update Utilities/mean_nemo to handle time-varying masks and make other improvements #28

dcalve commented Feb 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Update Utilities/mean_nemo to handle time-varying masks and make other improvements #28

Are you sure you want to change the base?

Update Utilities/mean_nemo to handle time-varying masks and make other improvements #28

Conversation

dcalve commented Feb 4, 2026

PR Summary

Code Quality Checklist

Testing

Other testing

Security Considerations

Performance Impact

AI Assistance and Attribution

Documentation

Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant