A repository for the management of issues related to vocabularies managed by the Argo Data Management Team

OneArgo/ArgoVocabs

ArgoVocabs

Argo vocabulary (a.k.a. tables, a.k.a. collections) is exposed on the web through the NERC Vocabulary Server (NVS):
https://vocab.nerc.ac.uk/search_nvs/cvl/?searchstr=argo&options=governance

Direct links to each of the vocabulary tables can be found in the last section of this readme.

I. The Argo Vocabulary Task Team

A repository for the management of issues related to vocabularies managed by the Argo Vocabulary Task Team (AVTT) under the umbrella of the Argo Data Management Team. This team is composed of (alphabetical order):

Regular meetings (every 2 months) are held to review the ticket lists. The last AVTT meeting before the ADMT meeting also reviews and lists the tickets that require an ADMT discussion/approval.

The dashboard for ArgoVocabs tickets is: https://github.com/orgs/OneArgo/projects/4

II. Management of Issues (a.k.a Tickets)

II.a. General case

  1. Raise and describe your issue, label it, and indicate an assignee.
  2. Discuss the issue in the ticket thread.
  3. Update the labels as relevant (targeted table(s), mappings, documentation relevance, File checker, discussion needed at AVTT level, ADMT approval requested, etc.).
  4. On the right panel, add the project "AVTT issues Management" and select one of the statuses "to do", "in progress", "avtt approval" or "done": this ensures the ticket appears correctly on the dashboard https://github.com/orgs/OneArgo/projects/4
  5. Once an agreement has been reached, summarize it so that readers do not have to read the full thread to know what to do.
  6. Publish an action list in the form of tick boxes, naming the person responsible for each action (and add them as assignees so that they can filter on this to quickly get the list of actions they have to do).
  7. Once the assignee has actually started working on the ticket, label the ticket as "on-going" and move the project status to "in progress".
  8. Label the ticket as "done" when all the sub-actions are completely fulfilled. Before ticking an action, wait for the change to be visible in the operational flow: for instance, if you modify a table entry, wait for it to be accepted by the NOC team and actually published in the table. This ensures that issues in the final steps do not go unnoticed.
  9. The list of "done" tickets is reviewed at the following AVTT meeting for closure. Moving the project status to "done" automatically closes the ticket.

N.B.: there is some duplication between statuses (for the project) and labels. For the moment, we keep this duplication because it is easy to filter on labels in the main issue page, whereas statuses help to order the dashboard nicely. This may be adapted later on.

II.b. Special case when issues stem from an ADMT action

Sometimes, vocabulary discussions stem from a point raised at the ADMT. When this is the case, the chairs of the ADMT repository open a OneArgo/ADMT action and attach the "AVTT issues management" project to it. The action then appears on the left-hand side of the dashboard https://github.com/orgs/OneArgo/projects/4. It is the responsibility of the AVTT chair(s) to:

  • create as many sub-actions as necessary within the OneArgo/ArgoVocabs repository;
  • add the parent ADMT action to them (using the "relationship" item in the right panel; tip: you need to type the "<-" symbol and then OneArgo/ADMT to be able to select the ADMT action);
  • in the parent ADMT action, change the status in the "AVTT issues management" project from "no status" to "ADMT repository action": this indicates that the above steps have been performed.

The ADMT action generally has a wider scope than the AVTT sub-actions and can include more than vocabulary-related sub-actions. Once the AVTT sub-actions are completed, the AVTT chair(s) report their completion within the OneArgo/ADMT action.

II.c. The importance of labels

  • Labels are used for proper management of tickets and for filtering during reviews, action follow-up, etc.
  • Each ticket should be flagged with the appropriate labels, and the labeling evolves as the discussion progresses.
  • Each label has a short description of its meaning.
  • While everyone should label tickets, it is part of the responsibilities of the AVTT co-chairs to regularly review and update ticket labeling.

III. Standard process for releasing vocab, documentation and checker

vocab

  • Updates and releases of collections are made "on the fly", following ticket requirements.
  • When a new vocabulary collection is created, the documentation is updated accordingly (at the very least to indicate the new collection URL).
  • When a new metadata field is created, it is good practice to first make it optional. It should remain optional during a transition phase, which usually lasts from 1 to 3 years, giving DACs the time they need to update their processing chains. The field can then be switched to mandatory and the Format_version number is increased. [EDIT] This statement may need to be revised; discussions on this subject are on-going.

documentation

  • New versions of the documentation are released regularly. There is no formal schedule, but there is a review process within 2 to 3 months after the ADMT for tickets carrying the label "documentation" together with "AVTT approved" or "ADMT approved".
  • The UM is updated accordingly when relevant.
  • A draft is sent out for comment to the argo-dm@groups.wmo.int mailing list.
  • The new version of the documentation is advertised through the argo-dm@groups.wmo.int mailing list, around March/April.

FileChecker

  • Once the documentation is released, the checker is updated.
  • The new version of the file checker must be advertised through the argo-dm@groups.wmo.int mailing list.

IV. Some information on NVS updates

IV.a. Inputs needed to request the creation of a new collection from NOC/BODC

The Vocab editors do not have the right to create new collections. This is only granted to NOC/BODC. This is to avoid duplication of similar collections when something already in place may be suitable for the purpose. Thus, to create a new collection, a request must be sent to NOC/BODC (Dani). This request must contain the following information:

  • Governance: who is the editor in charge of the table update
  • Collection Name
  • Description: description of the content of the collection
  • An Excel or CSV file containing the elements of the table: ID; Preferred Label; Alternative Label (if relevant); Definition
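
As an illustration of the last item, here is a minimal Python sketch that writes such a file with the four columns listed above. The exact column headers, the delimiter and the two entries are assumptions made for the example (the entries are placeholders, not real Argo concepts); check with NOC/BODC for the layout they expect.

```python
import csv
import io

# Column names taken from the list above; the exact header spelling is an assumption.
COLUMNS = ["ID", "Preferred Label", "Alternative Label", "Definition"]


def collection_csv(entries):
    """Render collection entries (a list of dicts keyed by COLUMNS) as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        # The alternative label is optional, so default it to an empty string.
        writer.writerow({column: entry.get(column, "") for column in COLUMNS})
    return buffer.getvalue()


# Hypothetical placeholder entries, for illustration only.
entries = [
    {"ID": "EXAMPLE_1", "Preferred Label": "Example one", "Definition": "First placeholder."},
    {"ID": "EXAMPLE_2", "Preferred Label": "Example two",
     "Alternative Label": "Ex. 2", "Definition": "Second placeholder."},
]
print(collection_csv(entries))
```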

IV.b. Mappings

Mappings are used to describe relationships between concepts: for instance, to list all the sensor_models manufactured by one sensor_maker, or all the platform_types manufactured by one platform_maker. They are used by the FileChecker to ensure consistency between these metadata fields in the Argo dataset. The Vocab editors have the right to insert mappings. It is advisable to ask NOC/BODC for the appropriate mapping type before proceeding.

i) How to read a mapping?

Mappings are relationships between [NVS] concepts. There is a "subject", a "predicate" and an "object". "predicate" indicates the relationship type between the "subject" and the "object". There are two kinds of predicates: "narrower/broader" when there is a hierarchy between the subject and the object, and "related" when the subject is related to the object without strict hierarchy.
An example mapping can be found here https://vocab.nerc.ac.uk/mapping/I/1700614/. In this example, the "subject" https://vocab.nerc.ac.uk/collection/R27/current/AANDERAA_OPTODE_3830/ has a relationship to a "broader" concept (the "object"): http://vocab.nerc.ac.uk/collection/R26/current/AANDERAA/. The manufacturer is a broader concept than the more granular sensor designed and developed by the manufacturer.
Mappings can also be seen by clicking on the individual URIs for each concept. For instance, in https://vocab.nerc.ac.uk/collection/R27/current/AANDERAA_OPTODE_3830/, the optode (subject of the relationship) has a relationship to the broader concept ‘Aanderaa’, the manufacturer.

It is important to note that the inverse mapping/relationship must also exist for each of these, so if there is a ‘broader’ mapping in one direction between subject and object, a ‘narrower’ mapping must also exist in the other direction between object and subject. Otherwise the mappings won’t resolve on the NVS. For ‘related’, the inverse mapping is also ‘related’. By clicking on the ‘object’ in the above example: http://vocab.nerc.ac.uk/collection/R26/current/AANDERAA/, ‘Aanderaa’ is now the subject, and the relationship is now ‘narrower’ as the manufacturer is related to all the narrower, individual sensors.
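
The subject/predicate/object reading and the inverse rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not code used by the NVS; the helper names are hypothetical and the URI pattern is copied from the example URLs.

```python
# Inverse predicate for each relationship type described above.
INVERSE_PREDICATE = {
    "broader": "narrower",
    "narrower": "broader",
    "related": "related",
}


def concept_uri(table, concept_id):
    """Build a concept URI following the pattern of the example URLs above."""
    return f"http://vocab.nerc.ac.uk/collection/{table}/current/{concept_id}/"


def inverse_mapping(subject, predicate, obj):
    """Return the mapping that must exist in the other direction."""
    return (obj, INVERSE_PREDICATE[predicate], subject)


optode = concept_uri("R27", "AANDERAA_OPTODE_3830")
maker = concept_uri("R26", "AANDERAA")

# The optode (subject) has a "broader" relationship to its manufacturer (object).
forward = (optode, "broader", maker)
# Seen from the manufacturer, the same relationship reads "narrower".
backward = inverse_mapping(*forward)
print(backward)
```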

ii) How to export mappings from the NVS?

There are several options. Besides SPARQL queries, there is a URL where you can see at a glance all the mappings between two tables.
For instance, the URL below lets you view and export (CSV, TSV) all the mappings between R15 and RTV:
https://vocab.nerc.ac.uk/search_nvs/cmap/?a=R15&b=RTV
Just replace R15 and RTV in the URL with the tables you wish to check.
You can also reach this view through https://vocab.nerc.ac.uk/search_nvs/: go to the "Explore Mapping" section and enter the first table. A view with the associated tables shows up. Selecting the "view mapping" icon on the right redirects you to the URL above.
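
If you need this URL for several table pairs, a small Python helper can build it. This sketch only assembles the address shown above from its two parameters (a and b); the function name is hypothetical and nothing is downloaded.

```python
from urllib.parse import urlencode

# Base address taken from the example URL above.
CMAP_BASE = "https://vocab.nerc.ac.uk/search_nvs/cmap/"


def mapping_view_url(table_a, table_b):
    """Return the URL of the NVS page listing the mappings between two tables."""
    return f"{CMAP_BASE}?{urlencode({'a': table_a, 'b': table_b})}"


print(mapping_view_url("R15", "RTV"))
print(mapping_view_url("R27", "R26"))
```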

iii) How to create a mapping?

When mappings are loaded via the vocab editor, you only need to load one set of mappings, in one direction (i.e. only the broader ones, or only the narrower ones, etc.). The inverse mappings are generated automatically by a trigger when the loaded mappings are processed.

For "broader/narrower" relationships, the "BRD" predicate code is used; for "related" relationships, the "MIN" predicate code (minor match) is used.

Care must be taken as subject and object are reversed in the file, which can create confusion. Columns are as follows:

  • object_NVS_table, object_concept_id, predicate_code, subject_NVS_table, subject_concept_id, modification_type (I for Insertion)

Below is a concrete example of the content of a mapping file submitted to the NVS editor (bulk update option), with one mapping between table R23 (platform_type) and R08 (instrument_type = platform_type + mounted CTD sensor type), and one between table R27 (sensor_model) and table R25 (sensor):

  • R23, ALTO, BRD, R08, 873, I
  • R27, AANDERAA_OPTODE, MIN, R25, OPTODE_DOXY, I
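
Because the file lists the object before the subject, it can help to generate the rows from named fields rather than typing them by hand. The Python sketch below does this for the two example rows above. It is an illustration under assumptions: the Mapping class and function names are hypothetical, and comma-separated output with no header row is assumed from the example rather than taken from the NVS editor specification.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Mapping:
    """One mapping, described in the natural subject -> object order."""
    subject_table: str
    subject_id: str
    predicate_code: str  # "BRD" (broader/narrower) or "MIN" (related)
    object_table: str
    object_id: str

    def to_row(self, modification_type="I"):
        # The file expects the object first, then the predicate, then the subject.
        return [
            self.object_table,
            self.object_id,
            self.predicate_code,
            self.subject_table,
            self.subject_id,
            modification_type,
        ]


def render_mapping_file(mappings):
    """Render mappings as comma-separated lines, one mapping per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for mapping in mappings:
        writer.writerow(mapping.to_row())
    return buffer.getvalue()


# The two example rows given above, expressed subject first.
example = [
    Mapping("R08", "873", "BRD", "R23", "ALTO"),
    Mapping("R25", "OPTODE_DOXY", "MIN", "R27", "AANDERAA_OPTODE"),
]
print(render_mapping_file(example))
```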

The Vocab editors cannot delete mappings. If a correction is needed, the editor must ask NOC/BODC (Dani) to perform the deletion.

iv) Which Argo tables require a mapping?

The following tables require a mapping for Argo workflow purposes. This means that when a new entry is added to one of these tables, the associated mapping(s) must also be created; otherwise, the entry will be rejected by the FileChecker.

| Table | Mapped to |
| --- | --- |
| R08 | R23 |
| R15 | RMC, RTV |
| R23 | R08, R24 |
| R24 | R23 |
| R25 | R27 |
| R26 | R27 |
| R27 | R25, R26 |
| RMC | R15 |
| RTV | R15 |
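
When adding an entry to one of these tables, the list above can serve as a checklist. The Python sketch below encodes it as a dictionary and reports which target tables still lack a mapping. It is a hypothetical helper for editors, not the FileChecker's actual logic.

```python
# Tables that require a mapping for the Argo workflow, copied from the table above.
REQUIRED_MAPPINGS = {
    "R08": {"R23"},
    "R15": {"RMC", "RTV"},
    "R23": {"R08", "R24"},
    "R24": {"R23"},
    "R25": {"R27"},
    "R26": {"R27"},
    "R27": {"R25", "R26"},
    "RMC": {"R15"},
    "RTV": {"R15"},
}


def missing_mappings(table, mapped_tables):
    """Return the target tables still lacking a mapping for a new entry in `table`."""
    return REQUIRED_MAPPINGS.get(table, set()) - set(mapped_tables)


def is_symmetric(requirements):
    """Check that every 'A mapped to B' line has a matching 'B mapped to A' line."""
    return all(
        source in requirements.get(target, set())
        for source, targets in requirements.items()
        for target in targets
    )


# A new R27 entry mapped only to R25 still needs a mapping to R26.
print(missing_mappings("R27", ["R25"]))
print(is_symmetric(REQUIRED_MAPPINGS))
```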

The following tables require a mapping for external Argo purposes

| Table | Mapped to |
| --- | --- |
| R03 | P01 |

V. M2M access to the NVS via API

For machine to machine (M2M) access to the Argo Vocabularies on the NVS, the NVS SPARQL endpoint can be used.

General information on the NVS SPARQL endpoint can be found on the NVS website: https://vocab.nerc.ac.uk/sparql/

SPARQL queries can be integrated into code written in other programming languages (Python, Matlab etc.).

For a basic example, please see the "m2m_NVS_sparql.ipynb" file linked to this repo. To test, open the file in Jupyter Notebook; edit the lines marked by '# Switch' to select either prefLabel or altLabel, and point to a specific Argo vocabulary by inserting its name (e.g. 'R03') where the line is marked by '# Edit'.
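
For readers without the notebook at hand, the same idea can be sketched with the Python standard library only. This is not the content of "m2m_NVS_sparql.ipynb": the endpoint address, the use of skos:member to list the concepts of a collection, and the function names are assumptions to be checked against the NVS SPARQL page before use.

```python
import json
import urllib.parse
import urllib.request

# Assumed query endpoint; see https://vocab.nerc.ac.uk/sparql/ for the current address.
SPARQL_ENDPOINT = "https://vocab.nerc.ac.uk/sparql/sparql"


def build_query(table, label="prefLabel"):
    """Build a SPARQL query listing the concepts of one collection with a label."""
    if label not in ("prefLabel", "altLabel"):
        raise ValueError("label must be 'prefLabel' or 'altLabel'")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        f"  <http://vocab.nerc.ac.uk/collection/{table}/current/> skos:member ?concept .\n"
        f"  ?concept skos:{label} ?label .\n"
        "}"
    )


def fetch_labels(table, label="prefLabel"):
    """Send the query to the endpoint and return {concept URI: label}."""
    params = urllib.parse.urlencode({"query": build_query(table, label)})
    request = urllib.request.Request(
        f"{SPARQL_ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return {
        binding["concept"]["value"]: binding["label"]["value"]
        for binding in payload["results"]["bindings"]
    }


# Nothing is sent over the network until fetch_labels() is called, e.g.:
#     labels = fetch_labels("R03", label="prefLabel")
print(build_query("R03"))
```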

VI. Special management case for the R03, R14 and R18 tables

Three tables are special cases: R03, R14 and R18, because they hold more information than just a list of terms.

Responsibility for managing each of these tables is held by one main person:

  • R. Cancouet is responsible for R14 and the R18 core subpart
  • C. Schmechtig is responsible for R03 and the R18 BGC subpart

The name of the person responsible is also highlighted in bold in the table in the next section.

When a change is performed on R03/R14/R18, the associated Excel spreadsheets are updated accordingly:

VII. List of Argo NVS tables and their editors

The people responsible for updating Argo tables (or collections in NVS jargon) should be registered as editors in the NVS.
The interface that allows table updates is accessible through this url: https://vocab.nerc.ac.uk/editor/.
Once registered, the editors are allowed to edit the corresponding tables (collections) using this interface.
The editing process mainly consists of three steps: loading, submission by the editor, and review by the NOC/NVS team.

| Argo NVS tables (collections) | Editors |
| --- | --- |
| R01 - DATA_TYPE | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| RR2 - RT_QC_FLAG | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| RD2 - DM_QC_FLAG | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| RP2 - PROF_QC_FLAG | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R03 - PARAMETER | **C. Schmechtig**, V. Turpin, T. Carval, apswong, D. Dobler |
| R04 - DATA_CENTRE_CODES | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R05 - POSITION_ACCURACY | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R06 - DATA_STATE_INDICATOR | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R07 - HISTORY_ACTION | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R08 - ARGO_WMO_INST_TYPE | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R09 - POSITIONING_SYSTEM | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R10 - TRANS_SYSTEM | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R11 - RTQC_TESTID | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R12 - HISTORY_STEP | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R13 - OCEAN_CODE | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R14 - TECHNICAL_PARAMETER_NAME | **R. Cancouet**, V. Turpin, T. Carval, D. Dobler |
| R18 - CONFIG_PARAMETER_NAME | **C. Schmechtig** (BGC), **R. Cancouet** (core), V. Turpin, T. Carval, D. Dobler |
| R15 - MEASUREMENT_CODE_ID | V. Turpin, T. Carval, mscanderbeg, D. Dobler |
| RMC - MEASUREMENT_CODE_CATEGORY | V. Turpin, T. Carval, mscanderbeg, D. Dobler |
| RTV - CYCLE_TIMING_VARIABLE | V. Turpin, T. Carval, mscanderbeg, D. Dobler |
| R16 - VERTICAL_SAMPLING_SCHEME | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R19 - STATUS | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R20 - GROUNDED | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R21 - REPRESENTATIVE_PARK_PRESSURE_STATUS | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R22 - PLATFORM_FAMILY | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R23 - PLATFORM_TYPE | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R24 - PLATFORM_MAKER | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R25 - SENSOR | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R26 - SENSOR_MAKER | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R27 - SENSOR_MODEL | V. Turpin, T. Carval, M. Krieger, R. Cancouet, D. Dobler |
| R28 - CONTROLLER_BOARD_TYPE | V. Turpin, T. Carval, M. Krieger, R. Cancouet, C. Bellingham, D. Dobler |
| R40 - PI_NAME | V. Turpin, T. Carval, M. Krieger, R. Cancouet, R. Wright, D. Dobler |
