Multiple language tags trigger SHACL violation for multiple values

This has been a [long-standing issue](https://github.com/OP-TED/ted-rdf-mapping/issues/407), not least because of the split between `rdf:PlainLiteral` (more recently `xsd:string`) and `rdf:langString` in ontologies like [ePO](https://github.com/OP-TED/ePO/tree/master/implementation/ePO_core/shacl_shapes), whose constraints are generated by model2owl. Data like the following:

```ttl
epd:id_d9997c19-6dd6-43e2-9706-28df10f8f1eb_AwardCriterion_Y6iaTUeQDukaqhJjdTfKhV
  a epo:AwardCriterion;
  epo:hasAwardCriterionType <http://publications.europa.eu/resource/authority/award-criterion-type/quality>;
  epo:hasWeightValueType <http://publications.europa.eu/resource/authority/number-weight/per-exa>;
  cccev:weight 10.0;
  dct:description "See purchase documents"@en, "Nurodyta pirkimo dokumentuose"@lt;
  skos:prefLabel "Delivery time for goods"@en, "Delivery time for goods"@lt .
```

will raise a `sh:MaxCountConstraintComponent` ("More than 1 values") violation for `dct:description` and/or `skos:prefLabel`.

There is a simple fix for this: use [`sh:uniqueLang true`](https://www.w3.org/TR/shacl/#UniqueLangConstraintComponent) in place of `sh:maxCount 1`, so that at most one value per language tag is allowed.
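As a minimal sketch of the simple variant, assuming a hand-written shape rather than model2owl output: the shape name `ex:DescriptionUniqueLangShape` and the `ex:` namespace are hypothetical, and the `epo:` namespace IRI is an assumption that should be checked against the ePO release in use.

```ttl
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix epo: <http://data.europa.eu/a4g/ontology#> .   # assumed ePO namespace
@prefix ex:  <http://example.org/shapes#> .            # hypothetical namespace

# Hypothetical shape: one dct:description per language tag,
# with no sh:maxCount limiting the total number of values.
ex:DescriptionUniqueLangShape
  a sh:NodeShape ;
  sh:targetClass epo:AwardCriterion ;
  sh:property [
    sh:path dct:description ;
    sh:uniqueLang true ;
  ] .
```

With a shape like this, the `@en` and `@lt` descriptions in the data above should no longer conflict, since `sh:uniqueLang` only reports values that share a language tag. Note that it places no limit on literals without a language tag.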

However, that's not the whole story. A more sophisticated variant also allows plain, non-language-tagged literals to co-exist with language-tagged ones (at most one plain literal, and at most one value per language tag) by combining `sh:uniqueLang` and `sh:qualifiedValueShape`:

```ttl
ex:MaxOneRDFLabelShape
  a sh:NodeShape ;
  sh:targetSubjectsOf rdf:type ;
  sh:property [
    sh:path rdfs:label ;
    sh:uniqueLang true ;
  ] ;
  sh:property [
    sh:path rdfs:label ;
    sh:qualifiedMaxCount 1 ;
    sh:qualifiedValueShape [
      sh:datatype xsd:string ;
    ] ;
    sh:message "Violation of standard practice: More than one `rdfs:label` exists without a language tag" ;
  ]
.
```

This was implemented for the [SEMIC validator](https://www.itb.ec.europa.eu/shacl/semicstyleguide/upload) (see [shape](https://github.com/meaningfy-ws/semic-styleguide-rdf-validator/blob/main/shapes/owl/max_one_label.shacl.ttl) and accompanying [test data](https://github.com/meaningfy-ws/semic-styleguide-rdf-validator/tree/main/tests/test_data/owl/max_one_label)).

The above technique would allow the following to pass:

```ttl
ex:Note a owl:Class ;
  skos:prefLabel "note"@en , "nota"@es ;
  rdfs:label "note"@en , "nota"@es ;
  rdfs:comment "note" , "notee" .
```

but not:

```ttl
ex:Note a owl:Class ;
  skos:prefLabel "note" , "notee" , "note"@en , "notee"@en ;
  rdfs:label "note" , "notee" , "note"@en , "notee"@en .
```
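To make it clearer which constraint each case trips, here is a small sketch of test data for the `rdfs:label` shape above. The node names and the `ex:` namespace are hypothetical, and the outcomes in the comments follow from the SHACL definitions of `sh:uniqueLang` and `sh:qualifiedMaxCount` rather than from a run of the SEMIC validator.

```ttl
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Expected to pass: one plain literal plus one value per language tag.
ex:Mixed a owl:Class ;
  rdfs:label "note" , "note"@en , "nota"@es .

# Expected to fail sh:qualifiedMaxCount 1: two plain (xsd:string) literals.
ex:TwoPlain a owl:Class ;
  rdfs:label "note" , "notee" .

# Expected to fail sh:uniqueLang: two values share the language tag "en".
ex:TwoEnglish a owl:Class ;
  rdfs:label "note"@en , "notee"@en .
```

The failing example above combines both problems in one node, so a validator would be expected to report both a unique-language and a qualified-count violation for it.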

The technique [currently seen](https://github.com/OP-TED/ePO/blob/566816f815071970b22ae3485f979316b92d56a2/implementation/ePO_core/shacl_shapes/ePO_core_shapes.ttl#L1702) with `sh:or`:

```ttl
...
	sh:minCount 0 ;
	sh:maxCount 1 ;
	sh:or (
		[
			sh:datatype xsd:string ;
		]
		[
			sh:datatype rdf:langString ;
		]
	) .
```

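as implemented based on https://github.com/OP-TED/model2owl/issues/219, does _not_ work for multiple language tags: `sh:maxCount 1` counts all values of the property regardless of language, so a second translation already violates it.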
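One way such a property constraint might be adapted is sketched below, assuming the `sh:or` datatype check is kept and only the cardinality handling changes. This is not what model2owl generates: the shape name and `ex:` namespace are hypothetical, the `epo:` namespace IRI is an assumption, and `dct:description` on `epo:AwardCriterion` is used only because it appears in the data above.

```ttl
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix epo: <http://data.europa.eu/a4g/ontology#> .   # assumed ePO namespace
@prefix ex:  <http://example.org/shapes#> .            # hypothetical namespace

ex:DescriptionMixedLiteralShape
  a sh:NodeShape ;
  sh:targetClass epo:AwardCriterion ;
  sh:property [
    sh:path dct:description ;
    # Replaces sh:maxCount 1: at most one value per language tag.
    sh:uniqueLang true ;
    # Datatype check kept from the existing sh:or technique.
    sh:or (
      [ sh:datatype xsd:string ]
      [ sh:datatype rdf:langString ]
    ) ;
  ] ;
  sh:property [
    sh:path dct:description ;
    # At most one literal without a language tag.
    sh:qualifiedMaxCount 1 ;
    sh:qualifiedValueShape [ sh:datatype xsd:string ] ;
  ] .
```

Dropping the second property shape would give the simple variant, which leaves the number of plain literals unconstrained.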

Whether to implement the simple variant or the advanced one, which allows plain and language-tagged literals to co-exist, is perhaps a question for the ontology stakeholders.
