Skip to content

Conversation

@dcalve
Copy link

@dcalve dcalve commented Feb 4, 2026

PR Summary

Code Reviewer:

Add support for time-varying masks to mean_nemo, as well as other improvements relating to performance and testing.

Closes #26- see this for more details on the changes.

Code Quality Checklist

(Some checks are automatically carried out via the CI pipeline)

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid undertanding and enhance the
    readability of the code
  • My changes generate no new warnings

Testing

  • I have tested this change locally, using the Moci rose-stem suite
  • If any tests fail (rose-stem or CI) the reason is understood and
    acceptable (eg. kgo changes)
  • I have added tests to cover new functionality as appropriate (eg. system
    tests, unit tests, etc.)
    • No tests have been added per se, but a new script (simple_test_data.py) has been added to generate NetCDF files containing testing datasets intended as input to mean_nemo, as well as separate files containing the expected results from mean_nemo for these input files

Other testing

Regression testing

Data from several models were used as input to mean_nemo and their results before/after the changes compared using nccmp -F -f -c 1 -d -m -w format:

  • GloSea ORCA025 data

    The first daily mean of:

    moose:devfc/rosie_u-cw134_ERA5/field.nc.file/prodh_rs_sfc_o1d_T_18_20161101_20161207_001.nc

    was extracted and passed as input 30/31 times to simulate 30/31-day averages. The purpose of this was to test that the missing data issues described by MOSRS #652 did not reoccur.

  • GOSI9 ORCA025 data

    The following files were passed as input (separately for each grid type):

    moose:/crum/u-ct401/onm.nc.file/*1m_19770[678]*

  • GOSI10 ORCA025 data

    The following files were passed as input (separately for each grid type):

    moose:/crum/u-dc531/ony.nc.file/*1y_200[012]*

  • UKESM1.3 ORCA1 data

    The following files were passed as input (separately for each grid and file type):

    moose:/crum/u-dw112/onm.nc.file/*_1m_32{47,57,74}0101-*

mean_nemo produced identical results for all data, except for thickness-weighted diagnostics (those with cell_methods=time: mean (thickness weighted)) which have bit level differences. This is an expected consequence of the following change to the calculation of the average:

- meandata_4d_sp(:,:,:,:) = meandata_4d_sp(:,:,:,:) / (meancellthick_4d_sp(:,:,:,:) * ntimes_sp)
+ meandata_4d_sp(:,:,:,:) = meandata_4d_sp(:,:,:,:) / meancellthick_4d_sp(:,:,:,:)

The average of input data $D$ weighted by cell thickness $C$ is $\frac{\sum{C D}}{\sum{C}}$. meancellthick_4d* variables were previously calculated as exactly that- the average cell thickness- so $\sum{C}$ had to be recovered multplying by the number of time records used to calculate the average.

meancellthick_4d* now represents $\sum{C}$, so the extra * ntimes calculation is not necessary. This means the floating point arithmetic produces slightly different results, but since fewer calculations are performed the result will be more accurate (less accumulation of round off error).

Correctness testing

The new simple_test_data.py script generates input files for mean_nemo, as well as files containing the expected results from running mean_nemo with these input files. The variables in these files use different precisions, shapes etc and test much more mean_nemo functionality than standard model data.

The files produced by mean_nemo were compared with the expected results files generated by this script, using the same nccmp command as the regression testing.

Only one byte type variable fails this test, because mean_nemo calculates this average using byte precision arithmetic and the result overflows:

DIFFER : VARIABLE : byte_4d_unweighted_time_unmasked : POSITION : [10,10,2,1] : VALUES : 0xFFFFFFEB <> 0x15

The calculation should instead be performed using a higher precision and the result cast back to byte type. This is not addressed here, since we rarely use byte data types.

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

Performance Impact

  • Performance of the code has been considered and, if applicable, suitable
    performance measurements have been conducted

In addition to the timing calipers added to the code, the overall elapsed wallclock time and memory usage as given by qstat -fxw were monitored (although the memory usage fluctuates too much to be useful). The changes have no significant impact on performance.

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance
    of Generative AI tool name (e.g., Met Office Github Copilot Enterprise,
    Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the
    Simulation Systems AI policy
    (including attribution labels)

Documentation

  • Where appropriate I have updated documentation related to this change and
    confirmed that it builds correctly

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • CLA compliance has been confirmed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Documentation is complete and accurate
  • Security considerations have been addressed
  • Performance impact is acceptable

@github-actions github-actions bot added the cla-required The CLA has not yet been signed by the author of this PR - added by GA label Feb 4, 2026
@github-actions github-actions bot added cla-signed The CLA has been signed as part of this PR - added by GA and removed cla-required The CLA has not yet been signed by the author of this PR - added by GA labels Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed The CLA has been signed as part of this PR - added by GA

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update Utilities/mean_nemo to handle time-varying masks and make other improvements

1 participant