didomi/consent-string-schema


Consent String Schema

Consent strings store user consent status as binary data, encoded in base64. Encoding user consent status into a binary string, or decoding it from an existing one, requires an encoder/decoder library. The schema defines the consent string format in a human-readable way. This allows new formats to be defined, or existing ones updated, without modifying the encoder/decoder library.

Schema format

The schema has the following structure:

  • consent_string_type
  • specification_version
  • tests
  • types
  • fields or segments

Type

Encoders and decoders need to know the type of data they're working with. The schema defines fields with specific types, much like variables in programming. The following types must be supported by the encoder/decoder library:

| # | type id | description |
|---|---|---|
| 1 | u1 | Unsigned integer, 1 bit (0 or 1) |
| 2 | u2 | Unsigned integer, 2 bits |
| 3 | u3 | Unsigned integer, 3 bits |
| 4 | u4 | Unsigned integer, 4 bits |
| 5 | u6 | Unsigned integer, 6 bits |
| 6 | u12 | Unsigned integer, 12 bits |
| 7 | u16 | Unsigned integer, 16 bits |
| 8 | u24 | Unsigned integer, 24 bits |
| 9 | u32 | Unsigned integer, 32 bits |
| 10 | date | Date |
| 11 | uuid | User ID, Base16 string xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx |
| 12 | fibonacci | Fibonacci number |
| 13 | fibonacci_range | Range of ids with start_id and number_of_ids encoded as Fibonacci numbers |
| 14 | u16_range | Range of ids with start_id and end_id as u16 numbers |
| 15 | bit_field | Array of bits |
| 16 | fixed_bit_field | Fixed-size bit field; the size property must be specified in the schema |
| 17 | bit_field_2_bits | Bit field of tuples (bool, bool) |
| 18 | ranges_u16 | Array of id ranges (start_id: u16, end_id: u16) |
| 19 | ranges_fibonacci | Array of id ranges (start_id: fibonacci, number_of_ids: fibonacci) |
| 20 | string | A fixed number of bits representing a string. 65 is subtracted from each character's ASCII code and the result is encoded as a 6-bit integer. |
| 21 | optimized_range | A Boolean flag (u1) followed by a Fibonacci range if the flag is true, or a bit field if it is false |
| 22 | optimized_u16_range | Optimized bit field with u16_range: (max_id: u12, range_flag: u1, type: [u16_range, fixed_bit_field(size: max_id)]) |
| 23 | array_of_optimized_u16_ranges | Array of optimized_u16_range: (number_of_records: u12, key: u6, type: u2, ids: optimized_u16_range) |
| 24 | n_array_of_ranges_x_y | Generic type<X,Y>: (number_of_records: u12, key: u, type: u, ids: optimized_range) |
| 25 | optimized_array_of_u16_ranges | (max_id: u16, is_range_encoding: u1, type: [(number_of_entries: u12, is_a_range: u1, type: [(start_id: u16, end_id: u16), start_id: u16]), fixed_bit_field(size: max_id)]) |
| 26 | array_of_u16_ranges | (number_of_records: u12, key: u6, type: u2, number_of_entries: u12, is_a_range: u1, type: [(start_id: u16, end_id: u16), start_id: u16]) |
| 27 | segment_type | The 3-bit segment type code |
| 28 | enabled_disabled_ids | Enabled and disabled ids, encoded with one of the types listed in the variants property |
| 29 | array_of_attributed_u16_ranges | Array of attributed ranges, used for publisher restriction encoding |
| 30 | version | The consent string version, encoded as a 6-bit number |
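
To make the fixed-width unsigned types (u1 to u32) concrete, here is a minimal sketch of how such values could be written to and read from a bit stream. The function names are hypothetical, the bit stream is modelled as a string of "0"/"1" characters for readability, and most-significant-bit-first order is an assumption; none of this is taken from the encoder/decoder library itself.

```typescript
// Hypothetical helpers: names and MSB-first bit order are assumptions, not the library's API.

/** Encode a non-negative integer as a fixed-width, MSB-first bit string. */
function encodeUint(value: number, bits: number): string {
  if (!Number.isInteger(value) || value < 0 || value >= Math.pow(2, bits)) {
    throw new RangeError(`${value} does not fit in ${bits} bits`);
  }
  return value.toString(2).padStart(bits, "0");
}

/** Read `bits` bits starting at `offset`; return the value and the next offset. */
function decodeUint(
  bitStream: string,
  offset: number,
  bits: number,
): { value: number; next: number } {
  const slice = bitStream.slice(offset, offset + bits);
  if (slice.length !== bits) {
    throw new RangeError("unexpected end of bit stream");
  }
  return { value: parseInt(slice, 2), next: offset + bits };
}

// A u6 followed by a u12, written back to back with no padding between them.
const uintStream = encodeUint(2, 6) + encodeUint(300, 12);
const firstUint = decodeUint(uintStream, 0, 6);
const secondUint = decodeUint(uintStream, firstUint.next, 12);
```

Each decode call returns the offset where the next field starts, which is how a decoder can walk the fields in schema order.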

Field

The main building block of the schema is a field. A field is analogous to a variable in a programming language, and each field must have a type. Here is an example field definition:

{
  "type": "version",
  "description": "The Didomi Consent String version 6bits number",
  "key": "dcs_version",
  "value": 2  
}

This defines a field of type "version" with the key "dcs_version". The encoder uses the key property to look up the field value in the user status object. If the value property is set, the encoder reads the value from the schema instead of the user status object.
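
The lookup rule can be sketched as follows. The FieldDefinition and UserStatus shapes and the resolveFieldValue name are hypothetical, introduced only to illustrate the rule; the library's actual types may differ.

```typescript
// Hypothetical shapes for illustration only; not the library's actual types.
interface FieldDefinition {
  type: string;
  description: string;
  key: string;
  value?: number; // optional constant taken from the schema
}

type UserStatus = Record<string, unknown>;

/** Use the schema constant when `value` is set; otherwise look the key up in the user status. */
function resolveFieldValue(field: FieldDefinition, userStatus: UserStatus): unknown {
  return field.value !== undefined ? field.value : userStatus[field.key];
}

const versionField: FieldDefinition = {
  type: "version",
  description: "The Didomi Consent String version 6bits number",
  key: "dcs_version",
  value: 2,
};

const syncField: FieldDefinition = { type: "u6", description: "example", key: "sync" };

// The schema constant takes precedence over the user status object.
const fromSchema = resolveFieldValue(versionField, { dcs_version: 1 });
// Without a constant, the value comes from the user status object.
const fromUserStatus = resolveFieldValue(syncField, { sync: 7 });
```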

Another field property is optional. Any field can be marked optional, which instructs the encoder to place a 1-bit flag before the field value in the encoded string. If the flag is 1, the value is present and is decoded. If it is 0, the value is absent and is skipped during decoding. Here is an example of an optional field definition:

{
  "type": "date",
  "key": "sync",
  "optional": true
}
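
The presence flag described above can be sketched like this. The helper names are hypothetical, and because the bit layout of the date type is not described here, a 6-bit unsigned integer stands in for the payload.

```typescript
// Hypothetical helpers for illustration; not the library's API.
type Decoded<T> = { value: T; next: number };

/** Write a presence flag, then the payload only when a value is present. */
function encodeOptional<T>(value: T | undefined, encodeValue: (v: T) => string): string {
  return value === undefined ? "0" : "1" + encodeValue(value);
}

/** Read the presence flag at `offset`; decode the payload only when the flag is 1. */
function decodeOptional<T>(
  bits: string,
  offset: number,
  decodeValue: (bits: string, offset: number) => Decoded<T>,
): Decoded<T | undefined> {
  const flag = bits.charAt(offset);
  if (flag === "") throw new RangeError("unexpected end of bit stream");
  return flag === "1"
    ? decodeValue(bits, offset + 1)
    : { value: undefined, next: offset + 1 };
}

// Stand-in payload codec: a 6-bit unsigned integer (an assumption, not the `date` layout).
const writeU6 = (v: number): string => v.toString(2).padStart(6, "0");
const readU6 = (bits: string, offset: number): Decoded<number> => ({
  value: parseInt(bits.slice(offset, offset + 6), 2),
  next: offset + 6,
});

const presentBits = encodeOptional(5, writeU6); // flag 1, then 6 payload bits
const absentBits = encodeOptional<number>(undefined, writeU6); // flag 0 only
const decodedPresent = decodeOptional(presentBits, 0, readU6);
const decodedAbsent = decodeOptional(absentBits, 0, readU6);
```

An absent optional field therefore costs exactly one bit in the encoded string.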

Some fields can be encoded with different encoding algorithms for optimization purposes. The variants field attribute specifies which algorithms (types) the encoder may use to encode the value:

{
  "type": "enabled_disabled_ids",
  "description": "User Status for purposes opt-out",
  "key": "purposes_li",
  "variants": [
    "bit_field_2_bits",
    "ranges_u16",
    "ranges_fibonacci"
  ]
}

This example shows how to encode user status for purposes using three methods: bit_field_2_bits, ranges_u16, and ranges_fibonacci. The encoder chooses the method producing the shortest result and includes the chosen method's code before the encoded value, allowing the decoder to correctly decode the string.
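
The "pick the shortest variant" rule can be sketched as below. Everything here is an assumption made for illustration: the two variants are simplified stand-ins (a plain bit field and a list of 16-bit ids), not the real bit_field_2_bits, ranges_u16, or ranges_fibonacci encodings, and the 2-bit width and numeric values of the variant code are guesses.

```typescript
// Hypothetical sketch: variant encoders, code values, and code width are assumptions.
interface Variant {
  code: number; // written before the payload so the decoder knows which variant follows
  encode: (ids: number[]) => string; // returns a string of "0"/"1" characters
}

const toBits = (value: number, width: number): string =>
  value.toString(2).padStart(width, "0");

// Stand-in variant 1: one bit per id from 1 up to the largest id.
const bitFieldVariant: Variant = {
  code: 0,
  encode: (ids) => {
    const maxId = ids.length > 0 ? Math.max(...ids) : 0;
    const present = new Set(ids);
    let bits = "";
    for (let id = 1; id <= maxId; id++) bits += present.has(id) ? "1" : "0";
    return bits;
  },
};

// Stand-in variant 2: each id written as a 16-bit number.
const idListVariant: Variant = {
  code: 1,
  encode: (ids) => ids.map((id) => toBits(id, 16)).join(""),
};

/** Try every variant and keep the shortest output, prefixed with the variant's code. */
function encodeShortest(ids: number[], variants: Variant[], codeWidth = 2): string {
  let best: string | undefined;
  for (const variant of variants) {
    const candidate = toBits(variant.code, codeWidth) + variant.encode(ids);
    if (best === undefined || candidate.length < best.length) best = candidate;
  }
  if (best === undefined) throw new Error("at least one variant is required");
  return best;
}

const denseIds = encodeShortest([1, 2, 3], [bitFieldVariant, idListVariant]);
const sparseIds = encodeShortest([1, 500], [bitFieldVariant, idListVariant]);
```

Dense id sets favour the bit field, while a few widely spaced ids favour the list, which is why letting the encoder choose per value can shorten the string.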

Some types need size information to be encoded and decoded correctly:

{
  "type": "string",
  "description": "Two-letter ISO 639-1 language code in which the CMP UI was presented",
  "key": "consent_language",
  "size": 12
}

This defines a 12-bit string field, which holds two characters at 6 bits each.
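
As an illustration of the string type, the following sketch applies the rule from the type table (subtract 65 from each character's ASCII code and store the result in 6 bits). The function names are hypothetical, and the uppercase "EN" input is an assumption; the expected casing is not specified here.

```typescript
// Hypothetical helpers illustrating the 6-bits-per-character rule; not the library's API.

/** Encode each character as (ASCII code - 65) in 6 bits; `sizeInBits` must match the text length. */
function encodeSixBitString(text: string, sizeInBits: number): string {
  if (text.length * 6 !== sizeInBits) {
    throw new RangeError(`"${text}" does not fill exactly ${sizeInBits} bits`);
  }
  let bits = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i) - 65;
    if (code < 0 || code > 63) {
      throw new RangeError(`character "${text.charAt(i)}" cannot be stored in 6 bits`);
    }
    bits += code.toString(2).padStart(6, "0");
  }
  return bits;
}

/** Reverse the encoding: read 6 bits at a time and add 65 back. */
function decodeSixBitString(bits: string): string {
  let text = "";
  for (let i = 0; i + 6 <= bits.length; i += 6) {
    text += String.fromCharCode(parseInt(bits.slice(i, i + 6), 2) + 65);
  }
  return text;
}

// "E" is 69 - 65 = 4 and "N" is 78 - 65 = 13.
const languageBits = encodeSixBitString("EN", 12);
const languageText = decodeSixBitString(languageBits);
```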

The field attributes are summarized below:

| # | field attribute | mandatory | description |
|---|---|---|---|
| 1 | type | Y | field type |
| 2 | key | Y | key used to extract the value from the model |
| 3 | description | Y | field description |
| 4 | size | N | type size in bits |
| 5 | optional | N | true if the value could be missing |
| 6 | value | N | constant field value |
| 7 | variants | N | allowed encoding method variants |

Consent String Type

The consent string schema supports the following types:

| # | string type | description |
|---|---|---|
| 1 | didomi_consent_string | Didomi Consent String |
| 2 | iab_tcf_string | IAB TCF Consent String |
| 3 | gpp_string | GPP String |

Specification Version

Several specification versions can be defined for each string type. For example, the Didomi Consent String has specification_version 1 and 2, where version 1 places some restrictions on the encoder.

Test

The consent string schema defines tests for the encoder and decoder in the tests array. The encoder/decoder library should run these tests to make sure it produces the expected results based on the schema.

Types

The types array lists the subset of supported types (see Type) that this schema uses. Each type in types must appear at least once in a field definition, and every type used in a field definition must be listed in the types array.

Fields and segments

The consent string schema has either a fields array or a segments array at the top level. The fields array contains field definitions (see Field). The segments array defines a segmented string structure, where each segment is encoded separately and the encoded segments are joined with the . character.

For example:

{
  "tests": {
    "encoded": "CQH-gkAQH-gkAAHABBENBOFgAPAAAELAAAAAF5wAQF5gXnABAXmAAAAA.YAAAAAAAAAAA"
  }
}

Here the encoded value is an IAB TCF string with two segments: core and publisher.
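
The segmented structure can be inspected with a few lines of code. This sketch splits the test string on the . character and expands each segment into bits. It assumes the URL-safe base64 alphabet (the example contains a "-" character), that the core segment starts with the 6-bit version, and that a non-core segment starts with the 3-bit segment_type. All names are hypothetical.

```typescript
// Hypothetical sketch: alphabet choice and field positions are assumptions.
const BASE64URL_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/** Expand each base64url character into its 6-bit value. */
function base64UrlToBits(encodedSegment: string): string {
  let bits = "";
  for (let i = 0; i < encodedSegment.length; i++) {
    const index = BASE64URL_ALPHABET.indexOf(encodedSegment.charAt(i));
    if (index < 0) {
      throw new Error(`invalid base64url character: ${encodedSegment.charAt(i)}`);
    }
    bits += index.toString(2).padStart(6, "0");
  }
  return bits;
}

const encodedTcString =
  "CQH-gkAQH-gkAAHABBENBOFgAPAAAELAAAAAF5wAQF5gXnABAXmAAAAA.YAAAAAAAAAAA";

// Each segment is decoded independently of the others.
const segmentBits = encodedTcString.split(".").map(base64UrlToBits);

// First 6 bits of the core segment: the string version.
const coreVersion = parseInt(segmentBits[0].slice(0, 6), 2);
// First 3 bits of the second segment: its segment type code.
const secondSegmentType = parseInt(segmentBits[1].slice(0, 3), 2);
```

For this string the second segment's type code is 3, which is the code the IAB TCF specification assigns to the publisher segment.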

In the schema, a segment is defined as an object like the following:

{
  "name": "Disclosed Vendors",
  "key": "disclosed_vendors",
  "optional": true,
  "fields": [
    {
      "type": "segment_type",
      "description": "DisclosedVendors segment is 1 which is 001 in binary.",
      "key": "disclosed_vendors_segment_type",
      "value": 1
    },
    {
      "type": "optimized_array_of_u16_ranges",
      "description": "Vendors Disclosure",
      "key": "disclosed_vendors"
    }
  ]
}

Each segment has name, key, optional and fields properties. All of them are mandatory except optional.

Schema Validation

The consent string schema library provides a validator function that validates schemas in three steps:

  • validate schema structure using zod library
  • validate schema types
  • validate schema keys

Structure validation

We use zod to define the consent string schema structure and to constrain types. For example, here is the field definition using zod:

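import { z } from 'zod';
// SchemaTypes (the enum of supported type ids) and uniqueNotEmptyArray (an array helper)
// are defined elsewhere in the library; their imports are omitted from this excerpt.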

export const availableTypesSchema = z.nativeEnum(SchemaTypes);

export const typesSchema = uniqueNotEmptyArray(availableTypesSchema);

export const consentStringTypeSchema = z.enum([
  'didomi_consent_string',
  'iab_tcf_string',
  'gpp_string',
]);

export const variantsTypes = z.enum([SchemaTypes.bit_field_2_bits, SchemaTypes.ranges_fibonacci, SchemaTypes.ranges_u16]);

export const variantsSchema = uniqueNotEmptyArray(variantsTypes);

export const fieldSchema = z.object({
  type: availableTypesSchema,
  description: z.string(),
  key: z.string(),
  optional: z.boolean().optional(),
  size: z.number().or(z.string()).optional(),
  value: z.number().optional(),
  variants: variantsSchema.optional()
});

This schema ensures the type property contains only permitted types, and that the variants property, when present, is a non-empty array of unique values drawn from the three allowed variant types.

Type validation

While zod validates types, the schema must also ensure all types in its types array are used in field definitions, and that all field types are listed in the types array. This mirrors programming languages, where types are defined before use and unused definitions may trigger warnings or errors. The library's typeValidator enforces this constraint after zod validation.
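
The two-way check can be sketched as a pair of set comparisons. The SchemaLike shape and the findTypeErrors name are hypothetical, and the sketch only looks at a top-level fields array; it is not the library's typeValidator.

```typescript
// Hypothetical sketch of the two-way type check; not the library's typeValidator.
interface SchemaLike {
  types: string[];
  fields: { type: string; key: string }[];
}

/** Return one message per type that is used but not declared, or declared but not used. */
function findTypeErrors(schema: SchemaLike): string[] {
  const declared = new Set(schema.types);
  const used = new Set(schema.fields.map((field) => field.type));
  const errors: string[] = [];

  used.forEach((type) => {
    if (!declared.has(type)) {
      errors.push(`type "${type}" is used by a field but not listed in types`);
    }
  });
  declared.forEach((type) => {
    if (!used.has(type)) {
      errors.push(`type "${type}" is listed in types but not used by any field`);
    }
  });
  return errors;
}

const consistentSchema: SchemaLike = {
  types: ["version", "date"],
  fields: [
    { type: "version", key: "dcs_version" },
    { type: "date", key: "sync" },
  ],
};

const inconsistentSchema: SchemaLike = {
  types: ["version", "uuid"],
  fields: [
    { type: "version", key: "dcs_version" },
    { type: "date", key: "sync" },
  ],
};

const noTypeErrors = findTypeErrors(consistentSchema);
const twoTypeErrors = findTypeErrors(inconsistentSchema);
```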

Key validation

The last validation step ensures that all keys in the field definitions are unique. The library's keyValidator runs after the typeValidator to enforce this constraint.
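
A uniqueness check of this kind can be sketched in a few lines. The findDuplicateKeys name is hypothetical, and the sketch checks a single fields array; whether keys must also be unique across segments is not specified here.

```typescript
// Hypothetical sketch of a key uniqueness check; not the library's keyValidator.

/** Return every key that appears more than once, each reported a single time. */
function findDuplicateKeys(fields: { key: string }[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const field of fields) {
    if (seen.has(field.key)) {
      duplicates.add(field.key);
    } else {
      seen.add(field.key);
    }
  }
  return Array.from(duplicates);
}

const uniqueKeys = findDuplicateKeys([{ key: "dcs_version" }, { key: "sync" }]);
const repeatedKeys = findDuplicateKeys([
  { key: "sync" },
  { key: "purposes_li" },
  { key: "sync" },
  { key: "sync" },
]);
```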
