From dc88886377858203b964f2d9093a82b7ddd44ee7 Mon Sep 17 00:00:00 2001
From: George Brownbridge <gbrownbridge@cmcl.io>
Date: Fri, 13 Feb 2026 14:40:32 +0000
Subject: [PATCH 1/4] Document `additionalMetadata` in README

Added section for additional metadata in stack-data-uploader README.
---
 stack-data-uploader/README.md | 40 ++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/stack-data-uploader/README.md b/stack-data-uploader/README.md
index c29839a1..2f950501 100644
--- a/stack-data-uploader/README.md
+++ b/stack-data-uploader/README.md
@@ -128,6 +128,7 @@ The following table shows the top level nodes allowed in a configuration file.
 | [`"workspace"`](#workspace)                     | No        | The dataset's name                       | The GeoServer workspace into which any 2D geospatial data layers, vector and raster, will be added                                                 |
 | [`"namespace"`](#namespace)                     | No        | The dataset's name                       | The Blazegraph namespace into which RDF data will be added. The long syntax can be used to specify properties if the namespace needs to be created |
 | [`"externalDatasets"`](#externaldatasets)       | No*       | `[]`                                     | A list of other datasets' names. Each listed dataset will also be loaded if this dataset is loaded by name                                         |
+| [`"additionalMetadata"`](#additionalmetadata)   | No        | `{}`                                     | Additional metadata discribing the dataset                                                                                                         |
 | [`"dataSubsets"`](#datasubsets)                 | No*       | `[]`                                     | A list of *data subset* objects                                                                                                                    |
 | [`"styles"`](#styles)                           | No*       | `[]`                                     | A list of GeoServer style file definition objects                                                                                                  |
 | [`"mappings"`](#mappings)                       | No*       | `[]`                                     | A list of Ontop mapping (OBDA) file names                                                                                                          |
@@ -222,6 +223,28 @@ Be aware though that some of the property keys contain the namespace's name so c
 
 Any datasets that are named under this node will be included if this dataset is loaded by name, either because the stack has the same name or because it appears in the `"externalDatasets"` list of another dataset that is loaded by name.
 
+### `"additionalMetadata"`
+
+Additional metadata discribing the dataset (or data subset). For example additional information that corresponds to concepts in the dcat or prov-o ontologies.
+
+| Key                | Required? | Default value | Description                                                                                                                                                                          |
+| ------------------ | --------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `"prefixes"`       | No        | {}            | Key-value map of prefixes to IRIs                                                                                                                                                    |
+| `"triplePatterns"` | No        | ""            | Turtle formatted triples as a single-line string with double-quote characters escaped. The dataset IRI can be accessed as `?dataset`. For data subsets the variable is `?dataSubset` |
+
+Example:
+```json
+"additionalMetadata": {
+        "prefixes": {
+            "ex": "https://dcat.example.org/",
+            "dct": "http://purl.org/dc/terms/",
+            "dcat": "http://www.w3.org/ns/dcat#",
+            "xsd": "http://www.w3.org/2001/XMLSchema#"
+        },
+        "triplePatterns": "?dataset dct:publisher ex:finance-ministry ; dct:spatial <http://sws.geonames.org/6695072/> ; dct:temporal [ a dct:PeriodOfTime ; dcat:startDate \"2011-07-01\"^^xsd:date ;  dcat:endDate \"2011-09-30\"^^xsd:date ; ] ."
+    },
+```
+
 ### `"dataSubsets"`
 
 This node should contain a list of data subset objects.
@@ -231,13 +254,14 @@ Each data subset should then have its own subdirectory.
 These specify how to load the data from a particular set of files.
 Each data subset must have the following values specified:
 
-| Key                               | Required? | Default value                      | Description                                                                                                                                                              |
-| --------------------------------- | --------- | ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| [`"name"`](#name-1)               | No        | Last component of the subdirectory | The name of the data subset                                                                                                                                              |
-| [`"type"`](#type)                 | Yes       | N/A                                | The type of the data                                                                                                                                                     |
-| [`"subdirectory"`](#subdirectory) | Yes       | N/A                                | The subdirectory within the dataset directory that contains the data in this data subset                                                                                 |
-| `"skip"`                          | No        | `false`                            | If set to `true` this data subset will be ignored by the data uploader                                                                                                   |
-| `"sql"`                           | No        | N/A                                | If the data is being loaded into the PostgreSQL database then the query provided here is run straight after the data is loaded [:open_file_folder:](#value-by-file-name) |
+| Key                                           | Required? | Default value                      | Description                                                                                                                                                              |
+| --------------------------------------------- | --------- | ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| [`"name"`](#name-1)                           | No        | Last component of the subdirectory | The name of the data subset                                                                                                                                              |
+| [`"type"`](#type)                             | Yes       | N/A                                | The type of the data                                                                                                                                                     |
+| [`"subdirectory"`](#subdirectory)             | Yes       | N/A                                | The subdirectory within the dataset directory that contains the data in this data subset                                                                                 |
+| `"skip"`                                      | No        | `false`                            | If set to `true` this data subset will be ignored by the data uploader                                                                                                   |
+| `"sql"`                                       | No        | N/A                                | If the data is being loaded into the PostgreSQL database then the query provided here is run straight after the data is loaded [:open_file_folder:](#value-by-file-name) |
+| [`"additionalMetadata"`](#additionalmetadata) | No        | `{}`                               | Additional metadata discribing the data subset                                                                                                                           |
 
 #### `"name"`
 
@@ -1011,4 +1035,4 @@ This way you can look at look at the user interfaces of the various services (se
 
 [zone-id]: https://docs.oracle.com/javase/8/docs/api/java/time/ZoneId.html
 
-[ontop-lenses]: https://ontop-vkg.org/guide/advanced/lenses.html#lenses
\ No newline at end of file
+[ontop-lenses]: https://ontop-vkg.org/guide/advanced/lenses.html#lenses

From 689bc222a5bef89f443064d9bf1c7f6024199703 Mon Sep 17 00:00:00 2001
From: George Brownbridge <gbrownbridge@cmcl.io>
Date: Fri, 13 Feb 2026 14:42:17 +0000
Subject: [PATCH 2/4] Fix formatting in README

---
 stack-data-uploader/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/stack-data-uploader/README.md b/stack-data-uploader/README.md
index 2f950501..e3b1e17b 100644
--- a/stack-data-uploader/README.md
+++ b/stack-data-uploader/README.md
@@ -229,8 +229,8 @@ Additional metadata discribing the dataset (or data subset). For example additio
 
 | Key                | Required? | Default value | Description                                                                                                                                                                          |
 | ------------------ | --------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `"prefixes"`       | No        | {}            | Key-value map of prefixes to IRIs                                                                                                                                                    |
-| `"triplePatterns"` | No        | ""            | Turtle formatted triples as a single-line string with double-quote characters escaped. The dataset IRI can be accessed as `?dataset`. For data subsets the variable is `?dataSubset` |
+| `"prefixes"`       | No        | `{}`          | Key-value map of prefixes to IRIs                                                                                                                                                    |
+| `"triplePatterns"` | No        | `""`          | Turtle-formatted triples as a single-line string with double-quote characters escaped. The dataset IRI can be accessed as `?dataset`. For data subsets the variable is `?dataSubset` |
 
 Example:
 ```json

From 06f49e8579d02e095de2e823480d421a7262dda6 Mon Sep 17 00:00:00 2001
From: George Brownbridge <gbrownbridge@cmcl.io>
Date: Mon, 16 Feb 2026 12:25:45 +0000
Subject: [PATCH 3/4] Fixed spelling of "describing"

Applied suggestions from code review

Co-authored-by: Sebastian Mosbach <46816676+sm453@users.noreply.github.com>
---
 stack-data-uploader/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/stack-data-uploader/README.md b/stack-data-uploader/README.md
index e3b1e17b..f5504e98 100644
--- a/stack-data-uploader/README.md
+++ b/stack-data-uploader/README.md
@@ -128,7 +128,7 @@ The following table shows the top level nodes allowed in a configuration file.
 | [`"workspace"`](#workspace)                     | No        | The dataset's name                       | The GeoServer workspace into which any 2D geospatial data layers, vector and raster, will be added                                                 |
 | [`"namespace"`](#namespace)                     | No        | The dataset's name                       | The Blazegraph namespace into which RDF data will be added. The long syntax can be used to specify properties if the namespace needs to be created |
 | [`"externalDatasets"`](#externaldatasets)       | No*       | `[]`                                     | A list of other datasets' names. Each listed dataset will also be loaded if this dataset is loaded by name                                         |
-| [`"additionalMetadata"`](#additionalmetadata)   | No        | `{}`                                     | Additional metadata discribing the dataset                                                                                                         |
+| [`"additionalMetadata"`](#additionalmetadata)   | No        | `{}`                                     | Additional metadata describing the dataset                                                                                                         |
 | [`"dataSubsets"`](#datasubsets)                 | No*       | `[]`                                     | A list of *data subset* objects                                                                                                                    |
 | [`"styles"`](#styles)                           | No*       | `[]`                                     | A list of GeoServer style file definition objects                                                                                                  |
 | [`"mappings"`](#mappings)                       | No*       | `[]`                                     | A list of Ontop mapping (OBDA) file names                                                                                                          |
@@ -225,7 +225,7 @@ Any datasets that are named under this node will be included if this dataset is
 
 ### `"additionalMetadata"`
 
-Additional metadata discribing the dataset (or data subset). For example additional information that corresponds to concepts in the dcat or prov-o ontologies.
+Additional metadata describing the dataset (or data subset). For example additional information that corresponds to concepts in the dcat or prov-o ontologies.
 
 | Key                | Required? | Default value | Description                                                                                                                                                                          |
 | ------------------ | --------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
@@ -261,7 +261,7 @@ Each data subset must have the following values specified:
 | [`"subdirectory"`](#subdirectory)             | Yes       | N/A                                | The subdirectory within the dataset directory that contains the data in this data subset                                                                                 |
 | `"skip"`                                      | No        | `false`                            | If set to `true` this data subset will be ignored by the data uploader                                                                                                   |
 | `"sql"`                                       | No        | N/A                                | If the data is being loaded into the PostgreSQL database then the query provided here is run straight after the data is loaded [:open_file_folder:](#value-by-file-name) |
-| [`"additionalMetadata"`](#additionalmetadata) | No        | `{}`                               | Additional metadata discribing the data subset                                                                                                                           |
+| [`"additionalMetadata"`](#additionalmetadata) | No        | `{}`                               | Additional metadata describing the data subset                                                                                                                           |
 
 #### `"name"`
 

From 71dcd646a1bb846282452e11bc1c0493be9c5b71 Mon Sep 17 00:00:00 2001
From: George Brownbridge <gbrownbridge@cmcl.io>
Date: Mon, 16 Feb 2026 14:17:38 +0000
Subject: [PATCH 4/4] Enhance clarity of additionalMetadata description

Clarify additional metadata section in README.
---
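Writing `"triplePatterns"` by hand means collapsing the Turtle onto a single
line and escaping every double quote. A minimal sketch of doing that with
Python's standard `json` module follows, assuming the configuration file is
generated by a script. The helper name `build_additional_metadata` is
hypothetical and is not part of stack-data-uploader; only the
`"additionalMetadata"`, `"prefixes"` and `"triplePatterns"` keys come from the
README.

```python
import json


def build_additional_metadata(prefixes, turtle):
    """Hypothetical helper: wrap prefixes and Turtle in the documented keys.

    Whitespace runs (including newlines) are collapsed to single spaces so the
    triples fit on one line. Caveat: this also collapses repeated spaces inside
    string literals, so it only suits literals without significant whitespace.
    """
    single_line = " ".join(turtle.split())
    return {
        "additionalMetadata": {
            "prefixes": prefixes,
            "triplePatterns": single_line,
        }
    }


# Readable, multi-line Turtle using the `?dataset` variable from the README.
turtle = """
?dataset dct:spatial <http://sws.geonames.org/6695072/> ;
    dct:temporal [ a dct:PeriodOfTime ;
        dcat:startDate "2011-07-01"^^xsd:date ;
        dcat:endDate "2011-09-30"^^xsd:date ] .
"""

prefixes = {
    "dct": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

config = build_additional_metadata(prefixes, turtle)

# json.dumps escapes the double quotes, so the output can be pasted into a
# dataset configuration file (minus the outer braces).
fragment = json.dumps(config, indent=4)
print(fragment)
```

For a data subset the same approach should apply with `?dataSubset` in place
of `?dataset`.
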
 stack-data-uploader/README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/stack-data-uploader/README.md b/stack-data-uploader/README.md
index f5504e98..bcc99ba9 100644
--- a/stack-data-uploader/README.md
+++ b/stack-data-uploader/README.md
@@ -225,7 +225,10 @@ Any datasets that are named under this node will be included if this dataset is
 
 ### `"additionalMetadata"`
 
-Additional metadata describing the dataset (or data subset). For example additional information that corresponds to concepts in the dcat or prov-o ontologies.
+Additional metadata describing the dataset (or data subset).
+For example, this can be additional information that corresponds to concepts in the DCAT or PROV-O ontologies.
+Any triples added here are in addition to the basic metadata that is added by default.
+All metadata triples are added to the `kb` namespace in Blazegraph.
 
 | Key                | Required? | Default value | Description                                                                                                                                                                          |
 | ------------------ | --------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |