14 changes: 14 additions & 0 deletions docs/modeling/metadata-model.md
@@ -21,6 +21,20 @@ Conceptually, metadata is modeled using the following abstractions
- [institutionalMemory](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/InstitutionalMemory.pdl): Captures internal company Documents associated with an Entity (e.g. links!)
- [status](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/Status.pdl): Captures the "deletion" status of an Entity, i.e. whether it should be soft-deleted.
- [subTypes](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/SubTypes.pdl): Captures one or more "sub types" of a more generic Entity type. An example can be a "Looker Explore" Dataset, a "View" Dataset. Specific sub types can imply that certain additional aspects are present for a given Entity.
- [editableSchemaMetadata](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/schema/EditableSchemaMetadata.pdl): Stores schema metadata edited through the UI, such as field descriptions, so that these edits are not overwritten by subsequent ingestions.
- [schemaMetadata](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/schema/SchemaMetadata.pdl): Stores schema metadata ingested from external sources.

# Handling UI-Edited Metadata

DataHub stores metadata edited through the UI separately from metadata ingested from external sources. Because the two live in different aspects, an ingestion run can update the ingested metadata without overwriting manual edits made in the UI.

- **Editable Schema Metadata**: UI-edited metadata, such as descriptions, is stored in the `editableSchemaMetadata` aspect. These edits are preserved and not overwritten by subsequent ingestions.

- **Schema Metadata**: Metadata ingested from external sources is stored in the `schemaMetadata` aspect.
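
The separation between the two aspects can be illustrated with a minimal Python sketch. This is not DataHub code: the `Dataset` class and the `ingest`, `edit_in_ui`, and `displayed_description` helpers are hypothetical names, and the sketch assumes that a reader prefers the UI-edited description whenever one exists.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Dataset:
    """Hypothetical stand-in for an entity that holds two separate aspects."""
    # Field path -> description ingested from the external source.
    schema_metadata: Dict[str, str] = field(default_factory=dict)
    # Field path -> description edited through the UI.
    editable_schema_metadata: Dict[str, str] = field(default_factory=dict)


def ingest(dataset: Dataset, field_descriptions: Dict[str, str]) -> None:
    """Replace the ingested aspect only; the editable aspect is left untouched."""
    dataset.schema_metadata = dict(field_descriptions)


def edit_in_ui(dataset: Dataset, field_path: str, description: str) -> None:
    """Record a manual edit in the editable aspect only."""
    dataset.editable_schema_metadata[field_path] = description


def displayed_description(dataset: Dataset, field_path: str) -> Optional[str]:
    """Assumed read rule: prefer the UI edit, fall back to the ingested value."""
    if field_path in dataset.editable_schema_metadata:
        return dataset.editable_schema_metadata[field_path]
    return dataset.schema_metadata.get(field_path)


ds = Dataset()
ingest(ds, {"user_id": "PK from source"})
edit_in_ui(ds, "user_id", "Unique user identifier")
ingest(ds, {"user_id": "PK from source, v2"})  # a later ingestion run

print(displayed_description(ds, "user_id"))
```

The second `ingest` call replaces only `schema_metadata`, so the manual edit remains available in `editable_schema_metadata`.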

To avoid losing manual edits during ingestion, configure the ingestion recipe to preserve existing metadata. For tags, use the `simple_add_dataset_tags` transformer with `semantics: PATCH`; for descriptions, set allow/deny patterns.
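
As a rough sketch, the tag portion of such a recipe might look like the following, written here as a Python dictionary that mirrors the structure of a YAML recipe. The source type, host, server URL, and tag URN are placeholder assumptions, not values from this document. The allow/deny patterns for descriptions are not shown, because their exact configuration keys are not specified here.

```python
import json

# Hypothetical recipe; every concrete value below is a placeholder.
recipe = {
    "source": {
        "type": "mysql",  # assumed source type, for illustration only
        "config": {"host_port": "localhost:3306"},
    },
    "transformers": [
        {
            "type": "simple_add_dataset_tags",
            "config": {
                # PATCH is intended to add to the tags already on the dataset
                # instead of replacing them.
                "semantics": "PATCH",
                "tag_urns": ["urn:li:tag:NeedsDocumentation"],  # placeholder tag
            },
        }
    ],
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},  # placeholder server
    },
}

# Print the recipe so it can be inspected or converted to YAML.
print(json.dumps(recipe, indent=2))
```

The same structure can be written as a YAML recipe file, with `source`, `transformers`, and `sink` as top-level keys.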


- **Relationships**: A relationship represents a named edge between two entities. Relationships are declared via foreign key attributes within Aspects, along with a custom annotation (@Relationship), and permit edges to be traversed bi-directionally. For example, a Chart may refer to a CorpUser as its owner via a relationship named "OwnedBy". This edge would be walkable starting from the Chart *or* the CorpUser instance.