From f41c601182b93814b3fc55db3ea62fc8df90fcd3 Mon Sep 17 00:00:00 2001
From: evacougnon
Date: Thu, 23 Jul 2020 16:26:25 +1000
Subject: [PATCH 1/4] add a readme file for parameters mapping workflow

---
 .../README.parameters-mapping.md | 40 +++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 PARAMETERS_MAPPING/README.parameters-mapping.md

diff --git a/PARAMETERS_MAPPING/README.parameters-mapping.md b/PARAMETERS_MAPPING/README.parameters-mapping.md
new file mode 100644
index 00000000..8e56635f
--- /dev/null
+++ b/PARAMETERS_MAPPING/README.parameters-mapping.md
@@ -0,0 +1,40 @@
+# Parameters Mapping or how to add a metadata header for CSV files
+
+1. check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
+ * ```parameters.csv``` list of available parameters and their ids
+ * ```qc_flags.csv```
+ * ```qc_scheme.csv```
+ * ```unit_view.csv``` list the available units and their ids (cf names, longnames and id)
+
+1. New parameters
+ * needs to follow the IMOS vocabulary [BENE PLEASE UPDATE]
+
+1. map the parameters for your dataset collection
+ * update ```parameters_mapping.csv```. This is the file where all the information from the other files is brought together, and where a variable name as written in the column name of the csv is matched to a unique id for each parameter found in ```parameters.csv```, units found in ```unit_view.csv```, ...
+
+1. Create view in Parameters mapping harvester: update the liquibase to update/include new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING)
+ * start your stack restoring the parameters_mapping schema and the schema you are working on
+ ```RestoreDatabaseSchemas: - schema: parameters_mapping, - schema: working_schema```
+ * open pgadmin and access your stack-db to test the sql query that will be used to create/update the view in the parameters_mapping harvester, as it is easier to get a better understanding of the query before updating the liquibase via Talend
+ * start your pipeline box and Talend
+ * update liquibase in the second component ```Create parameters_mapping views```
+ * the query will crash because 6 views call their respective dataset collection schemas:
+ `aatams_biologging_shearwater_metadata_summary`;
+ `aatams_biologging_snowpetrel_metadata_summary`;
+ `aatams_sattag_dm_metadata_summary`;
+ `aatams_sattag_nrt_metadata_summary`;
+ `aodn_nt_sattag_hawksbill_metadata_summary`;
+ `aodn_nt_sattag_oliveridley_metadata_summary`
+ * write the new view you are working on at the top of the liquibase script, so Talend can run and create it before crashing at `aatams_biologging_shearwater_metadata_summary`
+ * check in the stack database that the views are created as expected
+
+1. merge the changes made in
+ * [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
+ * [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) to test on RC before merging to production
+
+1. test on RC, check the csv files a user can download from the portal
+
+# Other information
+The [PARAMETERS_MAPPING harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) runs on a cron job daily , Monday to Friday.
+It harvests the content of these 5 files into the parameters_mapping DB schema and creates a `_metadata_summary` view for each of the collections listed (it is not IMOS specific; for example, we have a mapping for the AODN `_WAVE_DM` + `NRT` collections)
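The mapping step above (update ```parameters_mapping.csv```) matches each CSV column name to ids found in ```parameters.csv``` and ```unit_view.csv```. With the parameters_mapping schema restored in a stack, those ids can also be looked up with a quick query rather than by scanning the CSV files. The sketch below is illustrative only: it assumes the harvested tables mirror the CSV file names and expose `id`/`name`/`longname` columns, which may differ from the actual schema.

```sql
-- Illustrative lookups only; table and column names are assumptions based on
-- the CSV file names, so adjust them to the real parameters_mapping schema.
SELECT id, name
FROM parameters_mapping.parameters
WHERE name ILIKE '%temperature%';

SELECT id, name, longname
FROM parameters_mapping.unit_view
WHERE name ILIKE '%celsius%';
```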
From 572d67050cbbd9449505d60d3073a0ff6888fb7d Mon Sep 17 00:00:00 2001
From: Ana Berger <68037383+ana-berger@users.noreply.github.com>
Date: Mon, 17 Aug 2020 18:01:11 +1000
Subject: [PATCH 2/4] Update README.parameters-mapping.md

---
 PARAMETERS_MAPPING/README.parameters-mapping.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/PARAMETERS_MAPPING/README.parameters-mapping.md b/PARAMETERS_MAPPING/README.parameters-mapping.md
index 8e56635f..fd5fa69e 100644
--- a/PARAMETERS_MAPPING/README.parameters-mapping.md
+++ b/PARAMETERS_MAPPING/README.parameters-mapping.md
@@ -1,6 +1,6 @@
 # Parameters Mapping or how to add a metadata header for CSV files

-1. check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
+1. Check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
  * ```parameters.csv``` list of available parameters and their ids
  * ```qc_flags.csv```
  * ```qc_scheme.csv```
  * ```unit_view.csv``` list the available units and their ids (cf names, longnames and id)
@@ -9,7 +9,7 @@
 1. New parameters
  * needs to follow the IMOS vocabulary [BENE PLEASE UPDATE]

-1. map the parameters for your dataset collection
+1. Map the parameters for your dataset collection
  * update ```parameters_mapping.csv```. This is the file where all the information from the other files is brought together, and where a variable name as written in the column name of the csv is matched to a unique id for each parameter found in ```parameters.csv```, units found in ```unit_view.csv```, ...

 1. Create view in Parameters mapping harvester: update the liquibase to update/include new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING)
@@ -28,11 +28,17 @@
  * write the new view you are working on at the top of the liquibase script, so Talend can run and create it before crashing at `aatams_biologging_shearwater_metadata_summary`
  * check in the stack database that the views are created as expected

-1. merge the changes made in
+1. Merge the changes made in
  * [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
  * [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) to test on RC before merging to production

-1. test on RC, check the csv files a user can download from the portal
+1. Test on RC, check the csv files a user can download from the portal
+ * once the harvester is deployed in RC, in the terminal ssh to 4-nec-hob (first ssh to jumpbox, then to 4-nec-hob as explained [here](https://github.com/aodn/internal-discussions/wiki/AODN-Remote-Access#ssh))
+ * type the command `talend_run parameters_mapping-parameters_mapping`
+ * type `talend_log parameters_mapping-parameters_mapping` to check it ran successfully
+ * to confirm, check the new metadata additions are in the respective view in pgadmin, and ultimately in the downloaded csv file from the [rc-portal](http://portal-rc.aodn.org.au/)
+
+1. Merging to production

 # Other information
 The [PARAMETERS_MAPPING harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) runs on a cron job daily , Monday to Friday.
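Patch 2 adds the RC verification steps. The pgadmin check it describes (confirming the new metadata additions are in the respective view) usually comes down to a quick query against the collection's summary view once `talend_run` has finished. A sketch, with `my_collection_metadata_summary` standing in for whichever view your collection uses:

```sql
-- Sketch only: confirm the newly mapped parameters appear in the summary view
-- after the talend_run job completes. The view name is a placeholder.
SELECT *
FROM parameters_mapping.my_collection_metadata_summary
ORDER BY 1
LIMIT 20;
```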
From 8eb966919d9ec1602563e738f683a21d76846d34 Mon Sep 17 00:00:00 2001
From: Ana Berger <68037383+ana-berger@users.noreply.github.com>
Date: Mon, 3 May 2021 19:38:57 +1000
Subject: [PATCH 3/4] Update README.parameters-mapping.md

---
 .../README.parameters-mapping.md | 44 +++++++++++--------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/PARAMETERS_MAPPING/README.parameters-mapping.md b/PARAMETERS_MAPPING/README.parameters-mapping.md
index fd5fa69e..cd5bccc1 100644
--- a/PARAMETERS_MAPPING/README.parameters-mapping.md
+++ b/PARAMETERS_MAPPING/README.parameters-mapping.md
@@ -2,19 +2,26 @@
 1. Check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
  * ```parameters.csv``` list of available parameters and their ids
- * ```qc_flags.csv```
- * ```qc_scheme.csv```
- * ```unit_view.csv``` list the available units and their ids (cf names, longnames and id)
+ * ```qc_flags.csv``` list of the quality control flags and meanings
+ * ```qc_scheme.csv``` list of the quality control scheme and definitions
+ * ```unit_view.csv``` list of the available units and their ids (cf names, longnames and id).

-1. New parameters
+2. New parameters
  * needs to follow the IMOS vocabulary [BENE PLEASE UPDATE]

-1. Map the parameters for your dataset collection
- * update ```parameters_mapping.csv```. This is the file where all the information from the other files is brought together, and where a variable name as written in the column name of the csv is matched to a unique id for each parameter found in ```parameters.csv```, units found in ```unit_view.csv```, ...
+3. Map the parameters for your dataset collection
+ * update ```parameters_mapping.csv```. This is the file where all the information from the other files is brought together, and where a variable name as written in the column name of the csv is matched to a unique id for each parameter found in ```parameters.csv```, units found in ```unit_view.csv```, and qc_scheme found in ```qc_scheme.csv``` and ```qc_flags.csv``` in case the data has quality control.

-1. Create view in Parameters mapping harvester: update the liquibase to update/include new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING)
+4. Create view in Parameters mapping harvester: update the liquibase to update/include new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING)
  * start your stack restoring the parameters_mapping schema and the schema you are working on
- ```RestoreDatabaseSchemas: - schema: parameters_mapping, - schema: working_schema```
+ ```
+
+ RestoreDatabaseSchemas:
+
+ - schema: parameters_mapping
+
+ - schema: working_schema
+ ```
  * open pgadmin and access your stack-db to test the sql query that will be used to create/update the view in the parameters_mapping harvester, as it is easier to get a better understanding of the query before updating the liquibase via Talend
  * start your pipeline box and Talend
  * update liquibase in the second component ```Create parameters_mapping views```
  * the query will crash because 6 views call their respective dataset collection schemas:
  `aatams_biologging_shearwater_metadata_summary`;
  `aatams_biologging_snowpetrel_metadata_summary`;
  `aatams_sattag_dm_metadata_summary`;
  `aatams_sattag_nrt_metadata_summary`;
  `aodn_nt_sattag_hawksbill_metadata_summary`;
  `aodn_nt_sattag_oliveridley_metadata_summary`
  * write the new view you are working on at the top of the liquibase script, so Talend can run and create it before crashing at `aatams_biologging_shearwater_metadata_summary`
- * check in the stack database that the views are created as expected
+ * check in the stack database that the views are created as expected.

-1. Merge the changes made in
+5. Merge the changes made in
  * [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
- * [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) to test on RC before merging to production
+ * [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) to test on RC before merging to production.

-1. Test on RC, check the csv files a user can download from the portal
- * once the harvester is deployed in RC, in the terminal ssh to 4-nec-hob (first ssh to jumpbox, then to 4-nec-hob as explained [here](https://github.com/aodn/internal-discussions/wiki/AODN-Remote-Access#ssh))
+
+6. Test on RC: check the csv files a user can download from the portal
+ * once the harvester is deployed in RC, in the terminal ssh to 4-nec-hob
  * type the command `talend_run parameters_mapping-parameters_mapping`
  * type `talend_log parameters_mapping-parameters_mapping` to check it ran successfully
- * to confirm, check the new metadata additions are in the respective view in pgadmin, and ultimately in the downloaded csv file from the [rc-portal](http://portal-rc.aodn.org.au/)
+ * to confirm, check the new metadata additions are in the respective view in pgadmin, and ultimately in the downloaded csv file from the [rc-portal](http://portal-rc.aodn.org.au/).

-1. Merging to production
+7. Promoting to production
+ * on Jenkins, go to [data-services_tag_promote](https://build.aodn.org.au/view/projectofficer/job/data-services_tag_promote/) and proceed to the next steps `tag_prod` and `push_prod`. At 5pm on the same day, the changes will be in production.

 # Other information
-The [PARAMETERS_MAPPING harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) runs on a cron job daily , Monday to Friday.
-It harvests the content of these 5 files into the parameters_mapping DB schema and creates a `_metadata_summary` view for each of the collections listed (it is not IMOS specific; for example, we have a mapping for the AODN `_WAVE_DM` + `NRT` collections)
-
+The [PARAMETERS_MAPPING harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) runs on a cron job daily, Monday to Friday.
+It harvests the content of these 5 files (```parameters.csv```, ```parameters_mapping.csv```, ```qc_flags.csv```, ```qc_scheme.csv```, ```unit_view.csv```) into the parameters_mapping DB schema and creates a `_metadata_summary` view for each of the collections listed (it is not IMOS specific; for example, we have a mapping for the AODN `_WAVE_DM` + `NRT` collections).
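Patch 3 stresses that the new view must sit at the top of the liquibase script so Talend creates it before the run fails on the six sat-tag views whose source schemas are not restored in the stack. The changelog is maintained through the Talend component, so its exact format may differ, but written as a liquibase formatted-SQL changeset the entry could look roughly like the sketch below; the author, changeset id, and the table/column names in the view body are placeholders (use the query you tested in pgadmin):

```sql
--liquibase formatted sql

--changeset your_name:add_my_collection_metadata_summary runOnChange:true
-- Keep this changeset above the six sat-tag views so it is created before the
-- script crashes on schemas that are not restored in the stack.
CREATE OR REPLACE VIEW parameters_mapping.my_collection_metadata_summary AS
SELECT pm.variable_name,         -- column name as written in the collection's CSV
       p.name     AS parameter,  -- resolved via parameters.csv
       u.longname AS unit        -- resolved via unit_view.csv
FROM parameters_mapping.parameters_mapping pm
JOIN parameters_mapping.parameters p ON p.id = pm.parameter_id
JOIN parameters_mapping.unit_view  u ON u.id = pm.unit_id
WHERE pm.collection = 'my_collection';
--rollback DROP VIEW parameters_mapping.my_collection_metadata_summary;
```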
From 96b1a0156556dd014857a0194adb4e855ed205d4 Mon Sep 17 00:00:00 2001
From: Leonardo Laiolo
Date: Mon, 30 Aug 2021 13:51:07 +1000
Subject: [PATCH 4/4] few edits and changes

---
 PARAMETERS_MAPPING/README.parameters-mapping.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/PARAMETERS_MAPPING/README.parameters-mapping.md b/PARAMETERS_MAPPING/README.parameters-mapping.md
index cd5bccc1..8411e851 100644
--- a/PARAMETERS_MAPPING/README.parameters-mapping.md
+++ b/PARAMETERS_MAPPING/README.parameters-mapping.md
@@ -1,10 +1,10 @@
 # Parameters Mapping or how to add a metadata header for CSV files

 1. Check that the parameters to add are listed in [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
- * ```parameters.csv``` list of available parameters and their ids
+ * ```parameters.csv``` list of parameters and their ids
  * ```qc_flags.csv``` list of the quality control flags and meanings
  * ```qc_scheme.csv``` list of the quality control scheme and definitions
- * ```unit_view.csv``` list of the available units and their ids (cf names, longnames and id).
+ * ```unit_view.csv``` list of the units and their ids (cf names, longnames and id).

 2. New parameters
  * needs to follow the IMOS vocabulary [BENE PLEASE UPDATE]

@@ -13,13 +13,12 @@
  * update ```parameters_mapping.csv```. This is the file where all the information from the other files is brought together, and where a variable name as written in the column name of the csv is matched to a unique id for each parameter found in ```parameters.csv```, units found in ```unit_view.csv```, and qc_scheme found in ```qc_scheme.csv``` and ```qc_flags.csv``` in case the data has quality control.

 4. Create view in Parameters mapping harvester: update the liquibase to update/include new views in the [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING)
- * start your stack restoring the parameters_mapping schema and the schema you are working on
+ * start your stack restoring the parameters_mapping schema and the schema you are working on. You can add ```restore_data: True``` under each schema if you want the data from a schema in your stack as well; this may be useful if you need to test the ```PARAMETERS_MAPPING``` harvester.
  ```

  RestoreDatabaseSchemas:

  - schema: parameters_mapping
-
- - schema: working_schema
  ```
  * open pgadmin and access your stack-db to test the sql query that will be used to create/update the view in the parameters_mapping harvester, as it is easier to get a better understanding of the query before updating the liquibase via Talend
  * start your pipeline box and Talend
  * update liquibase in the second component ```Create parameters_mapping views```
@@ -39,7 +38,6 @@
  * [data-services](https://github.com/aodn/data-services/tree/master/PARAMETERS_MAPPING)
  * [harvester](https://github.com/aodn/harvesters/tree/master/workspace/PARAMETERS_MAPPING) to test on RC before merging to production.
-
 6. Test on RC: check the csv files a user can download from the portal
  * once the harvester is deployed in RC, in the terminal ssh to 4-nec-hob
  * type the command `talend_run parameters_mapping-parameters_mapping`
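Whether on your stack after the liquibase run or on RC after `talend_run`, listing the views in the parameters_mapping schema is a quick way to confirm the expected `_metadata_summary` views were created. A sketch, assuming the PostgreSQL database accessed through pgadmin above:

```sql
-- Sketch: list the *_metadata_summary views currently present, e.g. to check
-- that the new view was created alongside the existing collection views.
SELECT table_name
FROM information_schema.views
WHERE table_schema = 'parameters_mapping'
  AND table_name LIKE '%metadata_summary'
ORDER BY table_name;
```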