Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates AMOCatlas dataset metadata and the generated documentation reports, primarily to improve attribution/institution fields, OSNAP variable long names, and reduce report churn from canonical date_modified handling.
Changes:
- Removes the auto-appended “canonical date_modified” note from generated reports.
- Updates standardisation to preserve variable attributes during W→PW conversion and adds support for a
remove_unwanted_attributesmetadata directive. - Refreshes multiple array metadata YAMLs (OSNAP/MOCHA/NOAC/Arctic Gateway) and regenerates corresponding docs reports; updates
CITATION.cffrelease metadata.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/source/reports/zheng2024_report.rst | Removes the appended canonical-date explanatory note from the rendered Comment field. |
| docs/source/reports/wh41n_report.rst | Updates contributing institution naming/vocabulary and removes canonical-date note lines. |
| docs/source/reports/samba_report.rst | Removes canonical-date note lines from report output. |
| docs/source/reports/rapid_report.rst | Removes canonical-date note lines from report output. |
| docs/source/reports/osnap_report.rst | Updates license fields and OSNAP variable long-name display in the report output. |
| docs/source/reports/noac47n_report.rst | Updates contributing institutions display and removes canonical-date note line. |
| docs/source/reports/move_report.rst | Trims canonical-date note from Comment content. |
| docs/source/reports/mocha_report.rst | Improves variable descriptions/mappings shown in report and removes canonical-date note line. |
| docs/source/reports/fw2015_report.rst | Removes canonical-date note line from report output. |
| docs/source/reports/fbc_report.rst | Removes canonical-date note line from report output. |
| docs/source/reports/dso_report.rst | Updates History timestamp and trims canonical-date note from Comment. |
| docs/source/reports/calafat2025_report.rst | Removes canonical-date note line from report output. |
| docs/source/reports/arcticgateway_report.rst | Updates contributor/institution display and variable descriptions; removes creator-type/comment lines. |
| amocatlas/standardise.py | Preserves attrs during W→PW conversion; adds remove_unwanted_attributes processing in metadata pipeline. |
| amocatlas/report.py | Stops appending canonical-date explanation text into comment when generating reports. |
| amocatlas/metadata/osnap55n.yml | Updates OSNAP license and variable long_name fields / key adjustments. |
| amocatlas/metadata/noac47n.yml | Adds contributing institution metadata for NOAC 47N. |
| amocatlas/metadata/mocha26n.yml | Adds mapping/metadata for moc streamfunction variable; adjusts maxmoc long_name. |
| amocatlas/metadata/institution_registry.yml | Adds NASA JPL (EDMO 1224) institution registry entry and name variants. |
| amocatlas/metadata/arcticgateway.yml | Adds remove_unwanted_attributes directive; fixes variable metadata keys (CT/SA/v) to match source variables. |
| CITATION.cff | Updates version and release date metadata. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # 4.4) Process remove_unwanted_attributes directive | ||
| remove_unwanted = all_yaml_metadata.get("remove_unwanted_attributes", []) | ||
| if remove_unwanted: | ||
| for attr_name in remove_unwanted: | ||
| if attr_name in cleaned: | ||
| cleaned.pop(attr_name) | ||
| log_debug(f"Removed unwanted attribute: '{attr_name}'") | ||
| # Also remove the remove_unwanted_attributes directive itself | ||
| cleaned.pop("remove_unwanted_attributes", None) |
There was a problem hiding this comment.
remove_unwanted_attributes is applied after _consolidate_contributors(). Since _consolidate_contributors() pops creator_*/publisher_* and folds them into contributor_*, removing those fields afterward won’t prevent them from influencing the consolidated contributor list (e.g., institutional creators/publishers may still appear as contributors). Apply the removals before calling _consolidate_contributors() (e.g., delete the listed keys from merged_attrs/combined first), then drop the directive key so it doesn’t propagate.
| # 4.4) Process remove_unwanted_attributes directive | ||
| remove_unwanted = all_yaml_metadata.get("remove_unwanted_attributes", []) | ||
| if remove_unwanted: | ||
| for attr_name in remove_unwanted: | ||
| if attr_name in cleaned: | ||
| cleaned.pop(attr_name) | ||
| log_debug(f"Removed unwanted attribute: '{attr_name}'") | ||
| # Also remove the remove_unwanted_attributes directive itself | ||
| cleaned.pop("remove_unwanted_attributes", None) |
There was a problem hiding this comment.
This introduces new metadata behavior (remove_unwanted_attributes) but there’s no test coverage for it. Add a unit/integration test that constructs attrs/YAML including remove_unwanted_attributes and verifies the listed keys are removed and do not affect contributor consolidation (e.g., creator_name/publisher_name don’t get merged into contributor_name).
| @@ -222,32 +222,32 @@ files: | |||
| units: PW | |||
| standard_name: northward_ocean_heat_transport | |||
| MFT: | |||
There was a problem hiding this comment.
In OSNAP_MOC_MHT_MFT_TimeSeries_201408_202207_2025.nc, variable_mapping indicates the source variable is MFT_ALL (mapped to MFT), and standardise_data() applies original_variable_metadata before renaming. The original_variable_metadata entry is currently keyed as MFT, so it won’t be applied to the source dataset variable and the long_name/attrs may remain missing. Rename this metadata key to MFT_ALL (and keep the attrs the same) so it is applied before the rename.
| MFT: | |
| MFT_ALL: |
Description:
Small fixes to metadata:
Checklist:
pytestto check that all tests pass.pre-commit run --all-filesto lint and format the code.