diff --git a/docs/modeling/metadata-model.md b/docs/modeling/metadata-model.md
index 4c97cadc88417..911c8e4d943b9 100644
--- a/docs/modeling/metadata-model.md
+++ b/docs/modeling/metadata-model.md
@@ -21,6 +21,18 @@ Conceptually, metadata is modeled using the following abstractions
     - [institutionalMemory](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/InstitutionalMemory.pdl): Captures internal company Documents associated with an Entity (e.g. links!)
     - [status](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/Status.pdl): Captures the "deletion" status of an Entity, i.e. whether it should be soft-deleted.
     - [subTypes](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/SubTypes.pdl): Captures one or more "sub types" of a more generic Entity type. An example can be a "Looker Explore" Dataset, a "View" Dataset. Specific sub types can imply that certain additional aspects are present for a given Entity.
+    - [editableSchemaMetadata](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/schema/EditableSchemaMetadata.pdl): Stores schema-level edits made through the UI, such as field descriptions, separately from ingested metadata so that subsequent ingestion runs do not overwrite them.
+    - [schemaMetadata](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/schema/SchemaMetadata.pdl): Stores the schema of a Dataset as ingested from external sources.
+
+# Handling UI-Edited Metadata
+
+DataHub distinguishes between metadata that is edited through the UI and metadata that is ingested from external sources. Keeping the two in separate aspects is what allows manual edits made in the UI to survive re-ingestion.
+
+- **Editable Schema Metadata**: UI-edited metadata, such as descriptions, is stored in the `editableSchemaMetadata` aspect. Ingestion does not write to this aspect, so these edits are preserved across subsequent ingestion runs.
+
+- **Schema Metadata**: Metadata ingested from external sources is stored in the `schemaMetadata` aspect.
+
+To avoid losing manual edits during ingestion, users can configure their ingestion recipes to preserve existing metadata. For tags, this means using the `simple_add_dataset_tags` transformer with `semantics: PATCH`, so that tags are merged with the existing ones rather than replacing them. For descriptions, it means setting allow/deny patterns.
 - **Relationships**: A relationship represents a named edge between 2 entities. They are declared via foreign key attributes within Aspects along with a custom annotation (@Relationship). Relationships permit edges to be traversed bi-directionally. For example, a Chart may refer to a CorpUser as its owner via a relationship named "OwnedBy". This edge would be walkable starting from the Chart *or* the CorpUser instance.
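
As a note on the recipe configuration mentioned in the added section: a minimal sketch of what a recipe using the `simple_add_dataset_tags` transformer with `semantics: PATCH` might look like, written as a Python dict that mirrors the usual YAML recipe layout. The source type, connection details, sink endpoint, and tag URN are placeholders chosen for illustration and do not come from this change.

```python
import json

# Ingestion recipe expressed as a plain dict (same shape as a YAML recipe).
# Everything under "source" and "sink" is a placeholder for illustration.
recipe = {
    "source": {
        "type": "mysql",  # placeholder source type
        "config": {"host_port": "localhost:3306"},  # placeholder connection
    },
    "transformers": [
        {
            "type": "simple_add_dataset_tags",
            "config": {
                # PATCH is intended to merge these tags with the tags already
                # present on the dataset instead of overwriting them.
                "semantics": "PATCH",
                "tag_urns": ["urn:li:tag:NeedsDocumentation"],  # hypothetical tag
            },
        }
    ],
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},  # placeholder endpoint
    },
}

# Print only the transformer section, which is the part relevant here.
print(json.dumps(recipe["transformers"], indent=2))
```

The sketch only builds and prints the configuration. Running an actual ingestion with it would require the DataHub ingestion framework and a reachable source and sink.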