Skip to content

[DOC] Update metadata + reports#123

Merged
eleanorfrajka merged 1 commit intomainfrom
fixes/metadata-updates
Feb 12, 2026
Merged

[DOC] Update metadata + reports#123
eleanorfrajka merged 1 commit intomainfrom
fixes/metadata-updates

Conversation

@eleanorfrajka
Copy link
Collaborator

Description:

Small fixes to metadata:

  • Remove comment at the end
  • Fix OSNAP variables (not showing long name)
  • Enable a "remove_unwanted_attributes" in the amocatlas/metadata/*yml files, so that if any attributes are listed here and match the original dataset attributes, they'll be deleted. This is to simplify some of the renaming between contributors/contributing institutions and other formats that use "creator_institution", etc

Checklist:

  • I have followed the coding conventions.
  • I have updated or added tests to cover my changes.
  • I have updated the documentation if needed.
  • I have run pytest to check that all tests pass.
  • I have run pre-commit run --all-files to lint and format the code.

@eleanorfrajka eleanorfrajka merged commit ff14205 into main Feb 12, 2026
8 checks passed
@eleanorfrajka eleanorfrajka deleted the fixes/metadata-updates branch February 12, 2026 12:30
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates AMOCatlas dataset metadata and the generated documentation reports, primarily to improve attribution/institution fields, OSNAP variable long names, and reduce report churn from canonical date_modified handling.

Changes:

  • Removes the auto-appended “canonical date_modified” note from generated reports.
  • Updates standardisation to preserve variable attributes during W→PW conversion and adds support for a remove_unwanted_attributes metadata directive.
  • Refreshes multiple array metadata YAMLs (OSNAP/MOCHA/NOAC/Arctic Gateway) and regenerates corresponding docs reports; updates CITATION.cff release metadata.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
docs/source/reports/zheng2024_report.rst Removes the appended canonical-date explanatory note from the rendered Comment field.
docs/source/reports/wh41n_report.rst Updates contributing institution naming/vocabulary and removes canonical-date note lines.
docs/source/reports/samba_report.rst Removes canonical-date note lines from report output.
docs/source/reports/rapid_report.rst Removes canonical-date note lines from report output.
docs/source/reports/osnap_report.rst Updates license fields and OSNAP variable long-name display in the report output.
docs/source/reports/noac47n_report.rst Updates contributing institutions display and removes canonical-date note line.
docs/source/reports/move_report.rst Trims canonical-date note from Comment content.
docs/source/reports/mocha_report.rst Improves variable descriptions/mappings shown in report and removes canonical-date note line.
docs/source/reports/fw2015_report.rst Removes canonical-date note line from report output.
docs/source/reports/fbc_report.rst Removes canonical-date note line from report output.
docs/source/reports/dso_report.rst Updates History timestamp and trims canonical-date note from Comment.
docs/source/reports/calafat2025_report.rst Removes canonical-date note line from report output.
docs/source/reports/arcticgateway_report.rst Updates contributor/institution display and variable descriptions; removes creator-type/comment lines.
amocatlas/standardise.py Preserves attrs during W→PW conversion; adds remove_unwanted_attributes processing in metadata pipeline.
amocatlas/report.py Stops appending canonical-date explanation text into comment when generating reports.
amocatlas/metadata/osnap55n.yml Updates OSNAP license and variable long_name fields / key adjustments.
amocatlas/metadata/noac47n.yml Adds contributing institution metadata for NOAC 47N.
amocatlas/metadata/mocha26n.yml Adds mapping/metadata for moc streamfunction variable; adjusts maxmoc long_name.
amocatlas/metadata/institution_registry.yml Adds NASA JPL (EDMO 1224) institution registry entry and name variants.
amocatlas/metadata/arcticgateway.yml Adds remove_unwanted_attributes directive; fixes variable metadata keys (CT/SA/v) to match source variables.
CITATION.cff Updates version and release date metadata.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1453 to +1461
# 4.4) Process remove_unwanted_attributes directive
remove_unwanted = all_yaml_metadata.get("remove_unwanted_attributes", [])
if remove_unwanted:
for attr_name in remove_unwanted:
if attr_name in cleaned:
cleaned.pop(attr_name)
log_debug(f"Removed unwanted attribute: '{attr_name}'")
# Also remove the remove_unwanted_attributes directive itself
cleaned.pop("remove_unwanted_attributes", None)
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_unwanted_attributes is applied after _consolidate_contributors(). Since _consolidate_contributors() pops creator_*/publisher_* and folds them into contributor_*, removing those fields afterward won’t prevent them from influencing the consolidated contributor list (e.g., institutional creators/publishers may still appear as contributors). Apply the removals before calling _consolidate_contributors() (e.g., delete the listed keys from merged_attrs/combined first), then drop the directive key so it doesn’t propagate.

Copilot uses AI. Check for mistakes.
Comment on lines +1453 to +1461
# 4.4) Process remove_unwanted_attributes directive
remove_unwanted = all_yaml_metadata.get("remove_unwanted_attributes", [])
if remove_unwanted:
for attr_name in remove_unwanted:
if attr_name in cleaned:
cleaned.pop(attr_name)
log_debug(f"Removed unwanted attribute: '{attr_name}'")
# Also remove the remove_unwanted_attributes directive itself
cleaned.pop("remove_unwanted_attributes", None)
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces new metadata behavior (remove_unwanted_attributes) but there’s no test coverage for it. Add a unit/integration test that constructs attrs/YAML including remove_unwanted_attributes and verifies the listed keys are removed and do not affect contributor consolidation (e.g., creator_name/publisher_name don’t get merged into contributor_name).

Copilot uses AI. Check for mistakes.
@@ -222,32 +222,32 @@ files:
units: PW
standard_name: northward_ocean_heat_transport
MFT:
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In OSNAP_MOC_MHT_MFT_TimeSeries_201408_202207_2025.nc, variable_mapping indicates the source variable is MFT_ALL (mapped to MFT), and standardise_data() applies original_variable_metadata before renaming. The original_variable_metadata entry is currently keyed as MFT, so it won’t be applied to the source dataset variable and the long_name/attrs may remain missing. Rename this metadata key to MFT_ALL (and keep the attrs the same) so it is applied before the rename.

Suggested change
MFT:
MFT_ALL:

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants