diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/README.md b/modules/accelerators/contextualization/cdf_three_dimenssion/README.md new file mode 100644 index 00000000..57c6d648 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/README.md @@ -0,0 +1,109 @@ +# 3D contextualization (CDF Toolkit) + +This CDF Toolkit module deploys a **data modeling (DM)–first** pipeline that contextualizes 3D CAD nodes with Cognite Asset instances: manual and rule-based input, ML matching, quality review, and RAW outputs for iterative tuning. + +It is parameterized by **`default_location`** and **`source_name`** (for example `clov` / `navisworks`), so resource external IDs follow the pattern `…_{{default_location}}_{{source_name}}` after build. + +## Prerequisites + +- A **DM-only or hybrid** CDF project with the 3D chain and views required for Industrial Tools / CAD contextualization. Follow **[docs/3d_contextualization_dm_only_guide.md](docs/3d_contextualization_dm_only_guide.md)** for spaces, `Cognite3DObject` / `CADNode` linking, scene configuration, and common pitfalls. +- After configuring **`asset_instance_space`**, **view** (`assetView*` / `asset_view_*`), and pipeline IDs for your project, use **[docs/CONFIGURATION_VERIFICATION.md](docs/CONFIGURATION_VERIFICATION.md)** to verify live CDF pipeline config and instance vs view spaces. +- A **3D model** ingested and processed in CDF (file upload via UI or the file extractor pattern in this module). +- Toolkit **`variables`** set in your environment config (see `default.config.dev.yaml` and `env.template`). + +Optional reference data (for demos): small asset / OID-style examples can be adapted; the module itself is not tied to a single field like Valhall. 
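The `…_{{default_location}}_{{source_name}}` expansion can be sketched with a small substitution helper. This is a toy illustration of the naming pattern, not the Toolkit's actual implementation; the `render` helper and the example values (`clov`, `navisworks`) are assumptions for the demo:

```python
import re

def render(template: str, variables: dict[str, str]) -> str:
    """Toy stand-in for the {{var}} substitution that `cdf build` performs."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: variables[m.group(1)], template)

variables = {"default_location": "clov", "source_name": "navisworks"}
print(render("ep_ctx_3d_{{default_location}}_{{source_name}}_annotation", variables))
# → ep_ctx_3d_clov_navisworks_annotation
```

Deployed IDs for the groups, data set, pipelines, and functions listed below all follow this expansion.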
+ +## Processing workflow + +High-level contextualization flow (manual input → ML → good/bad RAW → retune): + +![Processing workflow](https://github.com/cognitedata/toolkit/assets/31886431/b29522f8-7f4b-4e23-b06a-f3ffffde103c) + +## Managed resources + +Names below use toolkit variables; **deployed** external IDs use your configured `default_location` and `source_name`. + +### 1. Auth groups + +This module ships a **single merged group** (see `auth/3d.Group.yaml`) that combines the extractor, processing, and read roles for a simplified dev / small-team setup: + +| Pattern | Purpose | +|--------|---------| +| `gp_3d_{{default_location}}` | Extractor: RAW, files, extraction runs for the 3D dataset and source pipeline. Processing: contextualization functions, RAW, DM instances, files, 3D, sessions, pipelines. Read: access to 3D / files for consumers | + +The group's source ID comes from your IdP (`3d_location_group_source_id` in config). + +### 2. Data set + +| External ID | Role | +|-------------|------| +| `ds_3d_{{default_location}}` | Lineage for extraction pipelines, functions, RAW, and files | + +### 3. Extraction pipelines + +| External ID pattern | Role | +|---------------------|------| +| `ep_src_3d_{{default_location}}_{{source_name}}` | File extractor: upload 3D files from local disk, SharePoint, etc. | +| `ep_ctx_3d_{{default_location}}_{{source_name}}_annotation` | Main **3D ↔ asset** contextualization (CDF Function) | +| `ep_ctx_3d_{{default_location}}_{{source_name}}_annotation_quality_check` | Quality check on contextualization results | +| `ep_ctx_3d_{{default_location}}_{{source_name}}_upload_manual_mappings` | Load **CSV** mappings into RAW (`3DId`, `assetId`, …) | +| `ep_ctx_3d_{{default_location}}_{{source_name}}_upload_asset_hierarchy` | Load **asset hierarchy** CSV into DM asset instances | + +Pipeline documentation and **runtime parameters** (RAW DB, DM asset/CAD spaces, model name, thresholds) live in the corresponding `*.config.Config.yaml` next to each pipeline. + +### 4. 
Functions + +| External ID pattern | Role | +|---------------------|------| +| `fn_context_3d_{{default_location}}_{{source_name}}_asset` | Main annotation / contextualization | +| `fn_context_3d_{{default_location}}_{{source_name}}_quality_check` | Post-run quality checks | +| `fn_context_3d_{{default_location}}_{{source_name}}_upload_manual_mappings` | CSV → `contextualization_manual_input` | +| `fn_context_3d_{{default_location}}_{{source_name}}_upload_asset_hierarchy` | CSV → DM assets | + +In normal operation, the main contextualization function follows the flow documented in the annotation extraction pipeline: read manual RAW input, apply overrides, match assets using the DM configuration, and write **good** / **bad** tables for the workflow and tuning. + +### 5. RAW database and tables + +| Database | Tables | +|----------|--------| +| `3d_{{default_location}}_{{source_name}}` | `contextualization_good`, `contextualization_bad`, `contextualization_manual_input`, `contextualization_rule` (optional rule-based mapping before ML) | + +### Illustrations + +**3D data pipeline (time series / context):** + +![3D data pipeline](https://github.com/cognitedata/toolkit/assets/31886431/f1129181-bab0-42cb-8366-860e8fb30d7e) + +**Contextualization workflow** (good/bad tables, manual and rule modules): + +![Contextualization workflow](https://github.com/cognitedata/toolkit/assets/31886431/0e990b47-0c06-4040-b680-7e2dddcdccee) + +## Variables + +Set module variables in **`default.config.yaml`** (merged by the Toolkit). Values such as DM spaces, model names, and `function_space` are read from **environment variables** (see **`env.template`**): `FUNCTION_SPACE`, `THREE_D_MODEL_NAME`, `CAD_MODEL_NAME`, `DEFAULT_CAD_SPACE`, `DEFAULT_DM_SPACE`, `DATA_MODEL_SPACE`, `DM_EXT_ID`, `DM_VERSION`, `ASSET_INSTANCE_SPACE`, `CAD_NODE_INSTANCE_SPACE`, plus `CDF_PROJECT`, `CDF_CLUSTER`, IDP and CI/CD vars as already referenced in `default.config.yaml`. 
Copy `env.template` → `.env` and fill in real values before `cdf build`. + +| Variable | Description | +|----------|-------------| +| `default_location` | Short location key used in resource names (e.g. `clov`) | +| `source_name` | Source system key (e.g. `navisworks`) | +| `3d_model_name` | 3D / CAD model identifier in CDF (from `.env`: `THREE_D_MODEL_NAME`) | +| `3d_dataset` | Data set external ID for 3D resources (typically `ds_3d_{{default_location}}`) | +| `raw_db`, `raw_table_manual` | RAW database and manual-input table (for scripts / docs) | +| `asset_instance_space`, `cad_node_instance_space` | DM instance spaces (from `.env`: `ASSET_INSTANCE_SPACE`, `CAD_NODE_INSTANCE_SPACE`) | +| `function_space` | Space for Cognite Function code artifacts (from `.env`: `FUNCTION_SPACE`) | +| `default_dm_space`, `dm_ext_id`, `dm_version` | Data model reference (spaces / ext id from `.env` via `${DM_EXT_ID}` / `${DM_VERSION}`) | +| `default_cad_space`, `default_scene_space` | CAD and scene configuration spaces (`DEFAULT_CAD_SPACE` in `.env`) | +| `required_views`, `cad_model_view`, … | View references for DM 3D chain (see dev config) | +| `file_extractor_watch_path` | Local path for file extractor config (if used) | +| `3d_location_group_source_id` | IdP object ID for the merged 3D auth group (see `auth/3d.Group.yaml`) | +| `cicd_clientId`, `cicd_clientSecret` | Optional: schedules / automation | + +Legacy variable **`external_root_id_asset`** may appear in older templates; DM deployments use **`asset_instance_space`** / **`cad_node_instance_space`** in pipeline config instead. + +## Usage + +Copy this module into your project’s `custom_modules` (or reference it from `modules/`), adjust **`variables`** for your project and IdP groups, then add it under **`selected`** in your `config.[env].yaml` (for example `config.dev.yaml`; older template versions use **`selected_modules_and_packages`**) and deploy with the CDF Toolkit. + +See [Using Templates](https://developer.cognite.com/sdks/toolkit/templates). 
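For orientation, a minimal environment file selecting this module might look like the sketch below. The project name and environment are placeholders; the `environment` / `selected` shape mirrors this module's own `default.config.yaml`:

```yaml
# Hypothetical config.dev.yaml fragment; replace project and path with your own.
environment:
  name: dev
  project: my-cdf-project        # your CDF project name
  validation-type: dev
  selected:
    - modules/accelerators/contextualization/cdf_three_dimenssion
```

Run `cdf build` followed by `cdf deploy` whenever you change `variables` or `.env`.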
+ +**Note:** Cognite Functions have **time and memory limits**. Very large asset or node volumes may need batching or a different runtime than a single function invocation. diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/auth/3d.Group.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/auth/3d.Group.yaml new file mode 100644 index 00000000..c6c02d19 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/auth/3d.Group.yaml @@ -0,0 +1,95 @@ +# Single group for all 3D pipeline actors: file extractor, contextualization functions, and read-only users. +# Merged from 3d.extractor.Group, 3d.processing.Group, 3d.read.Group for simplified dev/small-team setup. +name: 'gp_3d_{{default_location}}' +sourceId: '{{3d_location_group_source_id}}' +metadata: + origin: 'cdf-project-templates' +capabilities: + # RAW: read/write (extractor + processing) + - rawAcl: + actions: + - READ + - WRITE + scope: + tableScope: + dbsToTables: + 3d_{{default_location}}_{{source_name}}: [] + + # Files: read/write (extractor uploads, processing reads, users view) + - filesAcl: + actions: + - READ + - WRITE + scope: + datasetScope: + ids: ['ds_3d_{{default_location}}'] + + # 3D models: read (processing + users) + - threedAcl: + actions: + - READ + scope: + datasetScope: + ids: ['ds_3d_{{default_location}}'] + + # Data Model instances: read/write (processing writes AssetExtension nodes) + - dataModelInstancesAcl: + actions: + - READ + - WRITE + scope: + spaceIdScope: + spaceIds: + - '{{data_model_space}}' + - cdf_cdm + + # Extraction pipelines: read configs, read/write runs + - extractionConfigsAcl: + actions: + - READ + - WRITE + scope: + datasetScope: + ids: ['ds_3d_{{default_location}}'] + - extractionRunsAcl: + actions: + - READ + - WRITE + scope: + datasetScope: + ids: ['ds_3d_{{default_location}}'] + + # Sessions: needed for CDF Functions + - sessionsAcl: + actions: + - LIST + - CREATE + - DELETE + scope: + all: {} + + # Entity 
matching + functions (contextualization processing) + - entitymatchingAcl: + actions: + - READ + - WRITE + scope: + all: {} + - functionsAcl: + actions: + - READ + - WRITE + scope: + all: {} + + # Annotations + labels: read (users) + - annotationsAcl: + actions: + - READ + scope: + all: {} + - labelsAcl: + actions: + - READ + scope: + all: {} diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/data_sets/dataset.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/data_sets/dataset.yaml new file mode 100644 index 00000000..eba04b82 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/data_sets/dataset.yaml @@ -0,0 +1,14 @@ +externalId: ds_3d_{{default_location}} +name: 3d:{{default_location}} +description: 3D data for {{default_location}} uploaded from sharepoint +metadata: + consoleSource: + names: + - '{{source_name}}' + rawTables: + - databaseName: 3d_{{default_location}}_{{source_name}} + tableName: 'contextualization_good' + - databaseName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_bad' + - databaseName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_manual_input' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/default.config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/default.config.yaml new file mode 100644 index 00000000..d1bbba41 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/default.config.yaml @@ -0,0 +1,137 @@ +environment: + name: dev + project: shivam-qs + validation-type: dev + selected: + - modules/ + +variables: + modules: + accelerators: + contextualization: + cdf_three_dimenssion: + default_location: ${default_location} + source_name: ${source_name} + raw_db: ${RAW_DB} # raw database name for manual mappings + raw_table_manual: contextualization_manual_input + raw_table_asset_staging: asset_hierarchy_staging + function_space: ${FUNCTION_SPACE} + 
3d_model_name: ${3d_model_name} # 3d_model_name is the "name" property of cognite3dmodel + cad_model_name: cdf_cdm + cad_model_type: CAD + default_cad_space: cdf_cdm + default_dm_space: cdf_cdm + data_model_space: cdf_cdm + dm_ext_id: ${DM_EXT_ID} # data model external id for the project + dm_version: ${DM_VERSION} + default_scene_space: scene + cad_contextualization_batch_size: 100 + required_views: + - space: cdf_cdm + external_id: Cognite3DObject + version: v1 + + - space: cdf_cdm + external_id: CogniteCADNode + version: v1 + + - space: cdf_cdm + external_id: CogniteCADRevision + version: v1 + + - space: cdf_cdm + external_id: CogniteCADModel + version: v1 + + - space: cdf_cdm + external_id: Cognite3DRevision + version: v1 + + - space: cdf_cdm + external_id: Cognite3DModel + version: v1 + + - space: cdf_cdm + external_id: Cognite3DTransformation + version: v1 + + - space: cdf_cdm + external_id: CogniteVisualizable + version: v1 + + - space: cdf_cdm + external_id: Cognite360Image + version: v1 + + - space: cdf_cdm + external_id: Cognite360ImageAnnotation + version: v1 + + - space: cdf_cdm + external_id: CogniteAnnotation + version: v1 + + - space: cdf_cdm + external_id: Cognite360ImageCollection + version: v1 + + - space: cdf_cdm + external_id: Cognite360ImageStation + version: v1 + + - space: cdf_cdm + external_id: CognitePointCloudVolume + version: v1 + + - space: cdf_cdm + external_id: CognitePointCloudRevision + version: v1 + + - space: scene + external_id: SceneConfiguration + version: v1 + + - space: scene + external_id: RevisionProperties + version: v1 + + - space: cdf_3d_schema + external_id: Cdf3dModel + version: '1' + + asset_instance_space: ${ASSET_INSTANCE_SPACE} + cad_node_instance_space: ${CAD_NODE_INSTANCE_SPACE} + asset_view_ext_id: ${ASSET_VIEW_EXT_ID} #view where asset are available for contextualization + asset_view_version: ${ASSET_VIEW_VERSION} #version of the view where asset are available for contextualization + file_extractor_watch_path: 
c:/tmp/files + 3d_location_group_source_id: ${3d_location_group_source_id} #group source id for the project + cicd_clientId: ${cicd_clientId} + cicd_clientSecret: ${cicd_clientSecret} + cicd_tokenUri: ${IDP_TOKEN_URL} + cdf_project_name: ${CDF_PROJECT} + cdf_cluster: ${CDF_CLUSTER} + external_root_id_asset: ${EXTERNAL_ROOT_ID_ASSET} #optional: only used for legacy behavior + cad_model_view: + space: cdf_cdm + external_id: CogniteCADModel + version: v1 + + cad_revision_view: + space: cdf_cdm + external_id: CogniteCADRevision + version: v1 + + scene_config_view: + space: scene + external_id: SceneConfiguration + version: v1 + + scene_model_view: + space: cdf_3d_schema + external_id: Cdf3dModel + version: '1' + + rev_props_view: + space: scene + external_id: RevisionProperties + version: v1 diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/docs/3d_contextualization_dm_only_guide.md b/modules/accelerators/contextualization/cdf_three_dimenssion/docs/3d_contextualization_dm_only_guide.md new file mode 100644 index 00000000..2ab40ed7 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/docs/3d_contextualization_dm_only_guide.md @@ -0,0 +1,453 @@ +# 3D Contextualization in a DM-Only CDF Project — Setup Guide + +> **Project reference:** `cdf-shivam-test` (bluefield cluster) +> **Data model:** `upstream-value-chain / upstream_value_chain / v1` +> **Asset instance space:** `instance_upstream_value_chain` +> **3D instance space:** `rmdm` + +--- + +## Table of Contents + +1. [Confirm Project Type](#1-confirm-project-type) +2. [Full DM 3D Chain](#2-full-dm-3d-chain) +3. [Required Data Model Views](#3-required-data-model-views) +4. [Create the DM 3D Instances](#4-create-the-dm-3d-instances) +5. [Create the Scene Configuration](#5-create-the-scene-configuration) +6. [Space Assignment Rules](#6-space-assignment-rules) +7. [Industrial Tools Location Setup](#7-industrial-tools-location-setup-ui) +8. 
[Verification Checklist](#8-verification-checklist) +9. [Common Mistakes](#9-common-mistakes) + +--- + +## 1. Confirm Project Type + +Before starting, confirm whether the project is **DM-only** or **Hybrid**. + +```python +try: + client.three_d.asset_mappings.list(model_id=MODEL_ID, revision_id=REVISION_ID, limit=1) + print("HYBRID project — classic asset mappings available") +except Exception as e: + if "Not supported in DMS Only project" in str(e): + print("DM-ONLY project — must use DM chain, classic mappings will NEVER work") +``` + +| Project Type | Classic 3D Asset Mappings | DM 3D Chain | IT 3D Preview | +|---|---|---|---| +| DM-only | Not supported | Required | Works via DM | +| Hybrid | Supported | Optional | Works via classic | + +--- + +## 2. Full DM 3D Chain + +All 5 node types must exist and be linked correctly: + +``` +Asset (instance_upstream_value_chain) + └─ object3D ──→ Cognite3DObject (instance_upstream_value_chain) ← SAME space as Asset! + └─ cadNodes (reverse relation) ──→ CADNode (rmdm) + ├─ treeIndexes = [int] ← NOT node_id + ├─ revisions = [→ CADRevision] + └─ object3D = → Cognite3DObject + +CADRevision (rmdm) + ├─ status = "Done" + ├─ published = True + ├─ type = "CAD" + └─ model3D ──→ CADModel (rmdm) + └─ type = "CAD" +``` + +### Critical Rules + +- `Cognite3DObject` **must be in the same instance space as Asset** — Industrial Tools scopes + queries by the asset's instance space. If it is in the 3D space (`rmdm`), IT will not find it. +- `cadNodeReference` must use `str(treeIndex)` — **not** `node_id`. They are different values. +- `treeIndexes` must be `[int]` (a list of integers). The 3D viewer uses this list, not `cadNodeReference`, to match clicked nodes. +- `revisions` must be a direct-relation list pointing to the `CADRevision` node. 
+- `CADRevision` must have data written to **both** `cdf_cdm_3d/Cognite3DRevision` (for status/published) + **and** to `cdf_cdm_3d/Cognite3DModel` on the revision node itself (for `type="CAD"`) to satisfy the view filter. + +--- + +## 3. Required Data Model Views + +When you add `Cognite3DObject/v1` to your data model you **must also add every view type +that its relation fields reference**. Missing target view types cause GraphQL `_UnknownType` +errors which crash Industrial Tools. + +Add **all of the following views** to your data model as a single operation: + +```python +from cognite.client.data_classes.data_modeling import ViewId, DataModelApply + +REQUIRED_3D_VIEWS = [ + # Core 3D chain + ViewId("cdf_cdm", "Cognite3DObject", "v1"), + ViewId("cdf_cdm", "CogniteCADNode", "v1"), + ViewId("cdf_cdm", "CogniteCADRevision", "v1"), + ViewId("cdf_cdm", "CogniteCADModel", "v1"), + ViewId("cdf_cdm", "Cognite3DRevision", "v1"), + ViewId("cdf_cdm", "Cognite3DModel", "v1"), + ViewId("cdf_cdm", "Cognite3DTransformation", "v1"), + ViewId("cdf_cdm", "CogniteVisualizable", "v1"), + # Required by Cognite3DObject.images360 + # (multi-edge connection — needs BOTH target view AND edgeSource view) + ViewId("cdf_cdm", "Cognite360Image", "v1"), + ViewId("cdf_cdm", "Cognite360ImageAnnotation", "v1"), # ← edgeSource! 
+ ViewId("cdf_cdm", "CogniteAnnotation", "v1"), # parent of above + ViewId("cdf_cdm", "Cognite360ImageCollection", "v1"), + ViewId("cdf_cdm", "Cognite360ImageStation", "v1"), + # Required by Cognite3DObject.pointCloudVolumes + ViewId("cdf_cdm", "CognitePointCloudVolume", "v1"), + ViewId("cdf_cdm", "CognitePointCloudRevision", "v1"), + # Required by Industrial Tools 3D scene preview + ViewId("scene", "SceneConfiguration", "v1"), + ViewId("scene", "RevisionProperties", "v1"), + ViewId("cdf_3d_schema", "Cdf3dModel", "1"), +] + +# Add all to the data model +dm = client.data_modeling.data_models.retrieve( + (DM_SPACE, DM_EXT_ID, DM_VERSION), inline_views=False)[0] +existing = {(v.space, v.external_id, v.version) for v in dm.views} +for v in REQUIRED_3D_VIEWS: + if (v.space, v.external_id, v.version) not in existing: + dm.views.append(v) +client.data_modeling.data_models.apply(DataModelApply( + space=DM_SPACE, external_id=DM_EXT_ID, version=DM_VERSION, + name=dm.name, description=dm.description, views=dm.views)) +``` + +### Scanning for Missing View Types + +After adding views, run this scanner to ensure no further types are missing: + +```python +dm = client.data_modeling.data_models.retrieve( + (DM_SPACE, DM_EXT_ID, DM_VERSION), inline_views=True)[0] + +dm_view_ids = {(v.space, v.external_id, v.version) for v in dm.views} +missing = set() +for view in dm.views: + for prop_name, prop in view.properties.items(): + prop_dict = prop.dump() if hasattr(prop, 'dump') else {} + for field in ["source", "edgeSource"]: + src = prop_dict.get(field) or {} + if isinstance(src, dict) and src.get("type") == "view": + key = (src.get("space"), src.get("externalId"), src.get("version")) + if key not in dm_view_ids: + missing.add((key, f"{view.external_id}.{prop_name}[{field}]")) + +if missing: + print("MISSING referenced view types:") + for (space, ext, ver), context in sorted(missing): + print(f" {space}/{ext}/{ver} ← from {context}") +else: + print("All referenced view types are 
present.") +``` + +--- + +## 4. Create the DM 3D Instances + +### Recommended: Use the Dedicated Endpoint (DM-Only Projects Only) + +CDF provides a dedicated endpoint specifically for DM-only projects that handles the full +DM chain creation internally — no need to manually build `Cognite3DObject`, `CADNode`, or +update `Asset.object3D`. + +``` +POST /api/v1/projects/{project}/3d/contextualization/cad +``` + +**NOTE: This endpoint is only available for DataModelOnly projects.** + +```python +import requests + +# Read matches from RAW +matches = list(client.raw.rows.list(RAW_DB, RAW_TABLE, limit=-1)) + +items = [ + { + "asset": {"instanceId": {"space": ASSET_SPACE, "externalId": row.columns["assetId"]}}, + "nodeId": int(row.columns["3DId"]) # classic 3D node_id — API resolves treeIndex internally + } + for row in matches +] + +config = { + "object3DSpace": ASSET_SPACE, # where Cognite3DObject instances are created + "contextualizationSpace": CAD_SPACE, # where CADNode instances are created + "revision": { + "instanceId": {"space": CAD_SPACE, "externalId": REVISION_EXT_ID} + } +} + +# Send in batches of 100 (API limit per request) +url = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/3d/contextualization/cad" +token = client._config.credentials.authorization_header()[1] +headers = {"Authorization": token, "Content-Type": "application/json"} + +for i in range(0, len(items), 100): + batch = items[i:i+100] + resp = requests.post(url, headers=headers, + json={"items": batch, "dmsContextualizationConfig": config}) + resp.raise_for_status() + print(f" Batch {i//100 + 1}: {len(batch)} items → {resp.status_code}") +``` + +This replaces the entire manual `create_cad_node_mappings.py` script (~300 lines) with +~20 lines. 
The API internally handles: +- `nodeId` → `treeIndex` lookup +- `Cognite3DObject` instance creation in `object3DSpace` +- `CADNode` instance creation with `treeIndexes`, `revisions`, `object3D` properties +- `Asset.object3D` property update + +### Alternative: Manual DM Instance Creation (Legacy) + +If the endpoint is unavailable, use `scripts/create_cad_node_mappings.py` to create all instances. + +The script reads from the `contextualization_good` RAW table (output of the entity-matching +pipeline) and creates: + +| What | Where | Count (this project) | +|---|---|---| +| `Cognite3DObject` nodes | `instance_upstream_value_chain` | 523 | +| `CADNode` nodes | `rmdm` | 523 | +| `Asset.object3D` updates | `instance_upstream_value_chain` | 14 | + +```bash +python scripts/create_cad_node_mappings.py +``` + +The script is fully idempotent — all operations are upserts, safe to re-run. + +### Key Variables in the Script + +```python +CAD_INSTANCE_SPACE = "rmdm" +OBJ3D_INSTANCE_SPACE = "instance_upstream_value_chain" # MUST match ASSET_INSTANCE_SPACE +ASSET_INSTANCE_SPACE = "instance_upstream_value_chain" + +CLASSIC_MODEL_ID = 1872804187104968 +CLASSIC_REVISION_ID = 3857338668041659 +``` + +--- + +## 5. Create the Scene Configuration + +Industrial Tools 3D preview requires a `SceneConfiguration` node. **Without it the 3D +preview panel shows "No data available".** + +Run `scripts/create_scene_config.py` (or paste the snippet below): + +```python +from cognite.client.data_classes.data_modeling import NodeApply, NodeOrEdgeData, ViewId, EdgeApply + +SCENE_SPACE = "scene" +CLASSIC_MODEL_ID = 1872804187104968 +CLASSIC_REVISION_ID = 3857338668041659 + +MODEL_EXT_ID = f"clov_3d_model_{CLASSIC_MODEL_ID}" +SCENE_EXT_ID = "clov_navisworks_scene" + +# 1. 
Cdf3dModel node — represents the physical 3D model +client.data_modeling.instances.apply(nodes=[NodeApply( + space=SCENE_SPACE, + external_id=MODEL_EXT_ID, + sources=[NodeOrEdgeData( + source=ViewId("cdf_3d_schema", "Cdf3dModel", "1"), + properties={"name": "deployment_pack"}, + )], +)]) + +# 2. SceneConfiguration node — the scene entry point +client.data_modeling.instances.apply(nodes=[NodeApply( + space=SCENE_SPACE, + external_id=SCENE_EXT_ID, + sources=[NodeOrEdgeData( + source=ViewId("scene", "SceneConfiguration", "v1"), + properties={ + "name": "CLOV Navisworks Scene", + "description": "3D contextualization scene for CLOV Navisworks model", + "cameraTranslationX": 0.0, + "cameraTranslationY": 0.0, + "cameraTranslationZ": 50.0, + "cameraEulerRotationX": 0.0, + "cameraEulerRotationY": 0.0, + "cameraEulerRotationZ": 0.0, + }, + )], +)]) + +# 3. Edge: SceneConfiguration ──(model3ds)──→ Cdf3dModel +# Edge carries RevisionProperties (which revision to load + transform) +client.data_modeling.instances.apply(edges=[EdgeApply( + space=SCENE_SPACE, + external_id=f"{SCENE_EXT_ID}_to_{MODEL_EXT_ID}", + type={"space": "scene", "externalId": "SceneConfiguration.model3ds"}, + start_node={"space": SCENE_SPACE, "externalId": SCENE_EXT_ID}, + end_node={"space": SCENE_SPACE, "externalId": MODEL_EXT_ID}, + sources=[NodeOrEdgeData( + source=ViewId("scene", "RevisionProperties", "v1"), + properties={ + "revisionId": CLASSIC_REVISION_ID, + "translationX": 0.0, + "translationY": 0.0, + "translationZ": 0.0, + "eulerRotationX": 0.0, + "eulerRotationY": 0.0, + "eulerRotationZ": 0.0, + "scaleX": 1.0, + "scaleY": 1.0, + "scaleZ": 1.0, + "defaultVisible": True, + }, + )], +)]) +``` + +--- + +## 6. 
Space Assignment Rules + +| Resource | Instance Space | Reason | +|---|---|---| +| Assets | `instance_` | Main project instance space | +| `Cognite3DObject` | **Same as Assets** | IT queries 3D objects scoped to the asset's space | +| `CADNode` | `rmdm` or a dedicated 3D space | Keeps 3D geometry data separate | +| `CADRevision` / `CADModel` | Same as `CADNode` | Same 3D scope | +| `SceneConfiguration` / `Cdf3dModel` | `scene` | Fixed — IT always reads scenes from this space | + +> **Never** put `Cognite3DObject` in `rmdm`. Industrial Tools will not find it. + +--- + +## 7. Industrial Tools Location Setup (UI) + +This step **cannot be done via API** — it must be done through the browser once per project. + +1. Open Industrial Tools in CDF Fusion +2. Go to **Settings / Admin → Locations** +3. Create a new location (or edit the existing one): + - **Name:** e.g. `upstream-value-chain` + - **Data model space:** `upstream-value-chain` + - **Data model external ID:** `upstream_value_chain` + - **Data model version:** `v1` + - **Instance space:** `instance_upstream_value_chain` +4. Save → the **"incompatible data model"** banner should disappear + +If you see the banner, click **"Reset to preset location"** to apply the configured preset. + +--- + +## 8. 
Verification Checklist + +Run this after completing setup to confirm the full chain is intact: + +```python +from cognite.client.data_classes.data_modeling import ViewId, NodeId + +ASSET_SPACE = "instance_upstream_value_chain" +CAD_SPACE = "rmdm" +TEST_ASSET = "EN0110-BA-201-001" # an asset known to be contextualized + +VISUAL_VIEW = ViewId("cdf_cdm", "CogniteVisualizable", "v1") +OBJ3D_VIEW = ViewId("cdf_cdm", "Cognite3DObject", "v1") +CAD_VIEW = ViewId("sp_enterprise_process_industry", "CADNode", "v1") +REV_VIEW = ViewId("cdf_cdm", "CogniteCADRevision", "v1") +MODEL_VIEW = ViewId("cdf_cdm", "CogniteCADModel", "v1") + +def pdump(node, view_id): + d = node.properties.dump() + return d.get(view_id.space, {}).get(f"{view_id.external_id}/{view_id.version}", {}) + +checks = [] + +# 1. Asset has object3D +r = client.data_modeling.instances.retrieve(nodes=[NodeId(ASSET_SPACE, TEST_ASSET)], sources=[VISUAL_VIEW]) +obj3d = pdump(r.nodes[0], VISUAL_VIEW).get("object3D") +checks.append(("Asset.object3D set", obj3d is not None)) + +# 2. Cognite3DObject exists +obj3d_ext = obj3d.get("externalId") if isinstance(obj3d, dict) else obj3d.external_id +r2 = client.data_modeling.instances.retrieve(nodes=[NodeId(ASSET_SPACE, obj3d_ext)], sources=[OBJ3D_VIEW]) +checks.append(("Cognite3DObject exists", bool(r2.nodes))) + +# 3. CADNode exists with treeIndexes + revisions +tree_idx = obj3d_ext.replace("cog_3dobj_", "") +r3 = client.data_modeling.instances.retrieve(nodes=[NodeId(CAD_SPACE, f"cog_3d_node_{tree_idx}")], sources=[CAD_VIEW]) +if r3.nodes: + cp = pdump(r3.nodes[0], CAD_VIEW) + checks.append(("CADNode.treeIndexes set", bool(cp.get("treeIndexes")))) + checks.append(("CADNode.revisions set", bool(cp.get("revisions")))) + rev = cp.get("revisions", [None])[0] +else: + checks += [("CADNode exists", False), ("CADNode.treeIndexes", False), ("CADNode.revisions", False)] + rev = None + +# 4. 
CADRevision exists +if rev: + rev_ext = rev.external_id if hasattr(rev, "external_id") else rev.get("externalId") + r4 = client.data_modeling.instances.retrieve(nodes=[NodeId(CAD_SPACE, rev_ext)], sources=[REV_VIEW]) + if r4.nodes: + rp = pdump(r4.nodes[0], REV_VIEW) + checks.append(("CADRevision.published=True", rp.get("published") is True)) + checks.append(("CADRevision.status=Done", rp.get("status") == "Done")) + model = rp.get("model3D") + else: + checks += [("CADRevision exists", False)]; model = None +else: + model = None + +# 5. SceneConfiguration exists +r5 = client.data_modeling.instances.retrieve(nodes=[NodeId("scene", "clov_navisworks_scene")]) +checks.append(("SceneConfiguration exists", bool(r5.nodes))) + +print("\n=== 3D Chain Verification ===") +all_ok = True +for label, ok in checks: + status = "OK" if ok else "FAIL" + print(f" [{status}] {label}") + if not ok: + all_ok = False +print(f"\n {'All checks passed!' if all_ok else 'Some checks FAILED — review above.'}") +``` + +--- + +## 9. 
Common Mistakes + +| Mistake | Impact | Correct Approach | +|---|---|---| +| Using `node_id` for `cadNodeReference` | 3D viewer cannot highlight nodes | Use `str(treeIndex)` | +| Using `node_id` in external IDs (`cog_3dobj_{node_id}`) | Mismatched IDs when tree_index ≠ node_id | Use `cog_3dobj_{treeIndex}` | +| Putting `Cognite3DObject` in `rmdm` | IT 3D preview shows "No data" | Put it in the asset instance space | +| Adding only `Cognite3DObject` to data model | App crashes with `_UnknownType` GraphQL error | Add all 18 views listed in Section 3 | +| Omitting `edgeSource` view for edge connections | `_UnknownType` on `.edges` in GraphQL | Check `edgeSource` field in each multi-edge view | +| Skipping `SceneConfiguration` | IT 3D preview shows "No data" | Always create the scene (Section 5) | +| Not configuring IT Location | "Incompatible data model" banner never clears | Configure Location in IT admin (Section 7) | +| Running create script once and assuming success | Silent partial failures leave chain broken | Always run the verification script (Section 8) | + +--- + +## Quick Reference — Script Execution Order + +For a new project, run these scripts in order: + +``` +1. python scripts/upload_asset_hierarchy_dm.py # upload asset hierarchy to DM +2. (run entity matching pipeline / function) # populate RAW contextualization_good +3. python scripts/add_3d_views_to_datamodel.py # add all required views to DM ← do BEFORE step 4 +4. POST /3d/contextualization/cad (see Section 4) # create 3D DM chain via dedicated endpoint + OR python scripts/create_cad_node_mappings.py # manual fallback +5. python scripts/create_scene_config.py # create SceneConfiguration +6. (browser) Configure IT Location # one-time UI step +7. 
python scripts/verify_3d_chain.py # confirm everything is wired up +``` diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/docs/CONFIGURATION_VERIFICATION.md b/modules/accelerators/contextualization/cdf_three_dimenssion/docs/CONFIGURATION_VERIFICATION.md new file mode 100644 index 00000000..e47c850c --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/docs/CONFIGURATION_VERIFICATION.md @@ -0,0 +1,116 @@ +# Configuration verification guide (instance spaces, views, pipeline config) + +This module reads **runtime settings from CDF extraction pipeline config** (and function **env vars** set at deploy). Local `default.config.yaml` / project `config.dev.yaml` / `.env` are inputs to **`cdf build`** only—they must be **deployed** so CDF stores **resolved** values (no literal `${...}` or `{{...}}` left in the live pipeline config). + +Use this checklist when you point the module at **your** data model, **your** instance spaces, and **your** view external ids and versions. + +--- + +## 1. Concepts (DM) + +| Concept | What it is | Example | +|--------|------------|--------| +| **Instance space** | Where **node instances** live (`space` + `externalId` on `NodeApply` / list filters). | `sp_inst_domain_whiptail` | +| **View** | Schema slice for properties: **`ViewId(view_space, external_id, version)`**. The **view space** is usually the **data model / schema** space where the view is defined—not always the same string as the instance space. | `sp_dm_domain_epc`, `Tag`, `0.0.5` | +| **Asset DM space (`assetDmSpace`)** | Used by the main contextualization function to **list asset nodes** in DM (`get_resources.get_assets`). Must match where your **Asset** (or asset view) instances actually exist. | Same as instance space if all assets are in one space | +| **CAD node space (`cadNodeDmSpace`)** | Instance space for **CAD** nodes (`CogniteCADNode`, etc.). 
| Often separate from assets | + | **RAW DB** | `rawdb` must match the deployed RAW database name, typically `3d_{{default_location}}_{{source_name}}`. | `3d_clov_navisworks` | + +See also **[3d_contextualization_dm_only_guide.md](./3d_contextualization_dm_only_guide.md)** (`Cognite3DObject` same space as `Asset` for IT, CAD nodes may differ). + +--- + +## 2. Where configuration is read + +| Component | Source at runtime | +|-----------|-------------------| +| **Main contextualization** (`fn_context_3d_cad_asset_contextualization`) | `client.extraction_pipelines.config.retrieve()` → `config.data.parameters` (camelCase in API). Optional env fallbacks where implemented (e.g. `ASSET_INSTANCE_SPACE`, `ASSET_VIEW_*`). | +| **Upload asset hierarchy** | Same pattern for `ep_ctx_*_upload_asset_hierarchy`; **requires** `assetInstanceSpace` + `assetViewSpace` / `assetViewExternalId` / `assetViewVersion` (or matching env vars). **No** hardcoded defaults for view/instance space. | +| **Upload manual mappings** | Pipeline config for `rawdb`, `rawTableManual`, `fileExternalId`. | +| **Function.yaml `envVars`** | Injected at deploy; must align with `default.config` / `.env` so values are correct **after** `cdf build`. | + +--- + +## 3. Main annotation pipeline (`ctx_3d_oid_fileshare_annotation.config.Config.yaml`) + +Verify these **`parameters`** in the **deployed** pipeline config in CDF (names are **camelCase** in JSON/YAML): + +| Parameter | Purpose | +|-----------|---------| +| `assetDmSpace` | Instance space for listing **asset** nodes used in matching (`get_assets`). | +| `assetViewSpace` | View **space** for `HasData` filter (often data model space; can match `asset_view_space` in your project). | +| `assetViewExtId` | View **external id** (e.g. `AssetExtension`, `Tag`). | +| `assetViewVersion` | View **version** string (must match the view in your data model). | +| `cadNodeDmSpace` | Instance space for CAD / 3D node side. | +| `rawdb` | RAW database name. | +| `threeDModelName` | 3D model name in CDF. 
| +| `threeDDataSetExtId` | Dataset for 3D resources. | + +**Code reference:** `get_resources.get_assets` uses `asset_dm_space`, `asset_view_space`, `asset_view_ext_id`, `asset_view_version` (from pipeline config / env). + +--- + +## 4. Upload asset hierarchy pipeline (`ctx_3d_upload_asset_hierarchy.config.Config.yaml`) + +| Parameter | Purpose | +|-----------|---------| +| `assetInstanceSpace` | **Instance space** for upserted asset nodes. | +| `assetViewSpace` | View space for `NodeApply.sources` (template may use `data_model_space`). | +| `assetViewExternalId` | View external id. | +| `assetViewVersion` | View version. | + +**Code reference:** `fn_context_3d_upload_asset_hierarchy/handler.py` — `_resolve_asset_instance_space`, `_resolve_asset_view`. + +--- + +## 5. Verification checklist (before relying on production runs) + +1. **Build and deploy** + Run `cdf build` and `cdf deploy` after changing `variables` / `.env`. + +2. **Inspect live pipeline config in CDF** + Open the extraction pipeline → configuration. Confirm: + - No unresolved **`${VAR}`** or **`{{var}}`** strings. + - **`assetDmSpace` / `assetView*` / `cadNodeDmSpace`** match your DM project. + - **`rawdb`** matches your RAW database. + +3. **Views exist and versions match** + In **Data modeling**, confirm the view **`(space, externalId, version)`** used in config is **exactly** what your instances implement. A wrong version returns **no rows** in `get_assets` or fails applies. + +4. **Instance space vs view space** + If assets live in `sp_inst_*` but the view is published under `sp_dm_*`, set **`assetViewSpace`** to the **view’s** space and **`assetDmSpace`** to the **instance** space where nodes exist (per your model). + +5. **Functions** + Confirm the annotation function’s schedule/trigger uses the **same** pipeline external id as in config. After renaming **`default_location` / `source_name`**, pipeline external ids change—update schedules, workflows, and any hardcoded ids in scripts. + +6. 
**Smoke test** + - Trigger **upload asset hierarchy** with a tiny CSV and confirm instances appear under the expected **space** and **view**. + - Run **annotation** with a small `readLimit` / debug if available and confirm `get_assets` returns entities (check logs for `Number of DM assets found`). + +--- + +## 6. Common failures + +| Symptom | Likely cause | +|--------|----------------| +| Literal `${3d_model_name}` in config | Pipeline config in CDF not substituted—redeploy from a correct build or patch config in CDF. | +| `0` assets found | Wrong `assetDmSpace`, or view triple wrong, or subtree filters exclude everything. | +| `ValueError` on upload hierarchy | Missing `assetInstanceSpace` or view fields—set pipeline parameters or `ASSET_INSTANCE_SPACE` / `ASSET_VIEW_*` env. | +| Permission errors | Auth group `3d.Group.yaml` scopes must include your spaces and datasets. | + +--- + +## 7. Changing `default_location` or `source_name` + +These drive **many** external ids (`ep_*`, `fn_*` patterns, `rawdb`, etc.). Changing them creates a **new** naming slice in CDF; existing pipelines and RAW DBs under the old names are **not** renamed automatically. Plan migration or accept parallel resources. 
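
Checklist step 2 and the first row of the failure table reduce to one check: no literal `${...}` or `{{...}}` left in the deployed config. That check is easy to script; the helper below is an illustrative sketch (the function name and the example pipeline external id are not part of the module):

```python
import re

# Unresolved build-time placeholders: ${VAR} (env style) or {{ var }} (Toolkit style)
_PLACEHOLDER = re.compile(r"\$\{[^}]+\}|\{\{[^}]+\}\}")


def find_unresolved(config_text: str) -> list[str]:
    """Return every literal placeholder still present in a config string."""
    return _PLACEHOLDER.findall(config_text)


# With a CogniteClient, the live pipeline config can be pulled and checked like this
# (pipeline external id shown is an example):
#   cfg = client.extraction_pipelines.config.retrieve("ep_ctx_3d_clov_navisworks_annotation")
#   leftovers = find_unresolved(cfg.config)
#   assert not leftovers, f"Unresolved placeholders in live config: {leftovers} - redeploy from a correct build"
```

Running this against each `ep_ctx_*` / `ep_src_*` pipeline after `cdf deploy` catches the "literal `${3d_model_name}` in config" failure mode before the functions do.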
+ +--- + +## Related files (reference) + +- `extraction_pipelines/ctx_3d_oid_fileshare_annotation.config.Config.yaml` — main parameters +- `extraction_pipelines/ctx_3d_upload_asset_hierarchy.config.Config.yaml` — upload hierarchy +- `extraction_pipelines/ctx_3d_upload_manual_mappings.config.Config.yaml` — manual CSV → RAW +- `functions/fn_context_3d_cad_asset_contextualization/get_resources.py` — `get_assets` +- `functions/fn_context_3d_upload_asset_hierarchy/handler.py` — instance space + view resolution +- `env.template` — env var names for local/project `.env` diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/env.template b/modules/accelerators/contextualization/cdf_three_dimenssion/env.template new file mode 100644 index 00000000..27b6ff8f --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/env.template @@ -0,0 +1,133 @@ +# ============================================================================ +# 3D CONTEXTUALIZATION PIPELINE - ENVIRONMENT VARIABLES +# ============================================================================ +# Rename this file to .env and fill in your values +# DO NOT commit .env to version control! 
+# ============================================================================ + +# ---------------------------------------------------------------------------- +# CDF CONNECTION (Required) +# ---------------------------------------------------------------------------- +# These are required for connecting to Cognite Data Fusion + +# Your CDF project name (e.g., "my-company-dev", "my-company-prod") +CDF_PROJECT=your-cdf-project-name + +# CDF cluster (e.g., "westeurope-1", "az-eastus-1", "europe-west1-1") +CDF_CLUSTER=westeurope-1 + +# ---------------------------------------------------------------------------- +# IDENTITY PROVIDER / AUTHENTICATION (Required) +# ---------------------------------------------------------------------------- +# OAuth2 credentials for authenticating with CDF + +# Client ID from Azure AD / Identity Provider +IDP_CLIENT_ID=your-client-id-here + +# Client Secret from Azure AD / Identity Provider +IDP_CLIENT_SECRET=your-client-secret-here + +# Token URL for OAuth2 authentication +# Azure AD format: https://login.microsoftonline.com/{tenant-id}/oauth2/v2.0/token +IDP_TOKEN_URL=https://login.microsoftonline.com/your-tenant-id/oauth2/v2.0/token + +# ---------------------------------------------------------------------------- +# CDF TOOLKIT TEMPLATE VARIABLES +# ---------------------------------------------------------------------------- +# These are used in YAML templates ({{variable_name}}) + +# ---------------------------------------------------------------------------- +# DM / 3D — substituted in default.config.yaml (required for cdf build) +# ---------------------------------------------------------------------------- +FUNCTION_SPACE=upstream-value-chain +THREE_D_MODEL_NAME=deployment_pack +CAD_MODEL_NAME=deployment_pack +DEFAULT_CAD_SPACE=rmdm +DEFAULT_DM_SPACE=upstream-value-chain +DATA_MODEL_SPACE=upstream-value-chain +DM_EXT_ID=upstream_value_chain +DM_VERSION=v1 +ASSET_INSTANCE_SPACE=instance_upstream_value_chain +# ViewId for 
AssetExtension (upload_asset_hierarchy / DM asset upserts) +ASSET_VIEW_SPACE=upstream-value-chain +ASSET_VIEW_EXT_ID=AssetExtension +ASSET_VIEW_VERSION=v1 +CAD_NODE_INSTANCE_SPACE=rmdm + +# Location identifier (e.g., "oid", "clov", "valhall") +default_location=clov + +# Source name identifier (e.g., "fileshare", "sharepoint", "navisworks") +source_name=navisworks + +# 3D model name as it appears in CDF +3d_model_name=clov + +# External ID of the root asset to match against +external_root_id_asset=CLV + +# Dataset external ID for 3D data +3d_dataset=ds_3d_clov + +# ---------------------------------------------------------------------------- +# AZURE AD SECURITY GROUPS (Source IDs) +# ---------------------------------------------------------------------------- +# These are the Object/Source IDs from Azure AD for security groups + +# Group for extractors (writing data to CDF) +3d_location_extractor_group_source_id=c32ac88c-4048-4b74-854a-107a0527f22c + +# Group for processing/contextualization functions +3d_location_processing_group_source_id=81678087-872a-41ed-9d93-d9948098eab1 + +# Group for reading 3D data +3d_location_read_group_source_id=788105c1-adbd-4e09-9f85-98e8fb001166 + +# ---------------------------------------------------------------------------- +# CI/CD CREDENTIALS (For Scheduled Functions) +# ---------------------------------------------------------------------------- +# Used when running functions on a schedule + +cicd_clientId=your-cicd-client-id +cicd_clientSecret=your-cicd-client-secret + +# ---------------------------------------------------------------------------- +# EXTRACTION PIPELINE CONFIGURATION +# ---------------------------------------------------------------------------- +# These can be overridden in the extraction pipeline config + +# Extraction Pipeline External ID +EXTRACTION_PIPELINE_EXT_ID=ep_ctx_3d_clov_navisworks_annotation + +# Debug mode (True = dry run, no writes to CDF) +DEBUG=False + +# Run on all 3D nodes or only unmatched 
ones +RUN_ALL=False + +# Matching threshold (0.0 - 1.0) +MATCH_THRESHOLD=0.6 + +# Keep existing mappings or overwrite +KEEP_OLD_MAPPING=False + +# ---------------------------------------------------------------------------- +# RAW DATABASE CONFIGURATION +# ---------------------------------------------------------------------------- +# RAW database and table names for storing results + +RAW_DB=3d_clov_navisworks +RAW_TABLE_GOOD=contextualization_good +RAW_TABLE_BAD=contextualization_bad +RAW_TABLE_ALL=contextualization_all +RAW_TABLE_MANUAL=contextualization_manual_input + +# ---------------------------------------------------------------------------- +# OPTIONAL: BUILD/DEPLOY CONFIGURATION +# ---------------------------------------------------------------------------- + +# Build type (dev, staging, prod) +CDF_BUILD_TYPE=dev + +# Environment name +CDF_ENVIRON=dev diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.ExtractionPipeline.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.ExtractionPipeline.yaml new file mode 100644 index 00000000..0378d622 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.ExtractionPipeline.yaml @@ -0,0 +1,64 @@ +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_annotation' +name: 'ctx:3d:{{default_location}}:{{source_name}}:annotation' +dataSetExternalId: 'ds_3d_{{default_location}}' +description: 'Contextualization of 3D model and assets for {{default_location}}:{{source_name}}' +rawTables: + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_good' + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_bad' + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_manual_input' + - dbName: 
'3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_rule' +source: '{{source_name}}' +documentation: > + # 3D model and assets Contextualization + + The 3D model and assets contextualization runs as a process in CDF Functions. The process triggers on 3D models that are not yet connected to an asset. + The process stages are as follows: + + + ``` + Read configuration and start process by + 1. Read RAW table with manual mappings and extract all rows not contextualized + 2. Apply manual mappings from 3D nodes to Asset - this will overwrite any existing mapping + 3. Read all 3D nodes not matched (or all if runAll is True) + 4. Read all assets + 5. Run ML contextualization to match 3D Nodes -> Assets + 6. Update 3D Nodes with mapping + 7. Write matched (good) and not matched (bad) results to RAW + 8. Output in the good/bad tables can then be used in a workflow to update manual mappings + ``` + + Default configuration is provided as part of the deployment, but can be overwritten by configuration provided in the configuration section below: + + ``` + # if True, run without writing any annotations to CDF + debug: False + # if True, run contextualization on all 3D nodes; if False, only run on 3D nodes that are not yet linked to an asset + runAll: False + # RAW DB used to store input and results for contextualization + rawdb: '3d_{{default_location}}_{{source_name}}' + # RAW table used to store all manual input for mappings that should be used in contextualization + # This table could be populated from earlier mappings that are approved or used for manual override + rawTableManual: 'contextualization_manual_input' + # RAW table used to store all mappings that were automatically approved as good + rawTableGood: 'contextualization_good' + # RAW table used to store all mappings that were NOT automatically approved + rawTableBad: 'contextualization_bad' + # RAW table used to store all mappings regardless of the quality. 
+ rawTableAll: 'contextualization_all' + # ExternalId for root assets to be used for contextualization + assetRootExtIds: {{external_root_id_asset}} + # externalId for the 3D data set + 3dDataSetExtId: 'ds_3d_{{default_location}}' + # Name of the 3D model + 3dModelName: {{3d_model_name}} + # Number between 0 and 1 indicating how strict the matching should be (1 = only exact matches) + matchThreshold: 0.75 + # if True keep old mapping, if False overwrite old mapping + keepOldMapping: False + + ``` + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.config.Config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.config.Config.yaml new file mode 100644 index 00000000..f5bfb693 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation.config.Config.yaml @@ -0,0 +1,27 @@ +# Optional (defaults used when omitted): assetSubtreeExternalIds, nodeNamePrefixes, nodeNameMaxSlashes, nameReplacements, suffixesToStrip, rawTableRule +# Example: assetSubtreeExternalIds: ['CLV/FPSOH', 'CLV/FPSOT'] +# Example: nodeNamePrefixes: ['/6049', '/6A2TE']; nodeNameMaxSlashes: 1 +# Example: nameReplacements: [{"from": "TO-", "to": ""}]; suffixesToStrip: ['CTRL'] +# Example: rawTableRule: 'contextualization_rule' (table with columns regexpEntity, regexpAsset for rule-based mapping before ML) +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_annotation' +config: + data: + parameters: + debug: False + runAll: False + rawdb: '3d_{{default_location}}_{{source_name}}' + rawTableGood: 'contextualization_good' + rawTableBad: 'contextualization_bad' + rawTableAll: 'contextualization_all' + rawTableManual: 'contextualization_manual_input' + assetDmSpace: {{asset_instance_space}} + assetViewSpace: {{asset_view_space}} + assetViewExtId: {{asset_view_ext_id}} + assetViewVersion: 
{{asset_view_version}} + + cadNodeDmSpace: {{cad_node_instance_space}} + threeDDataSetExtId: 'ds_3d_{{default_location}}' + threeDModelName: '{{3d_model_name}}' + matchThreshold: 0.5 + keepOldMapping: False + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.ExtractionPipeline.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.ExtractionPipeline.yaml new file mode 100644 index 00000000..64d6aad1 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.ExtractionPipeline.yaml @@ -0,0 +1,14 @@ +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_annotation_quality_check' +name: 'ctx:3d:{{default_location}}:{{source_name}}:annotation_quality_check' +dataSetExternalId: 'ds_3d_{{default_location}}' +description: 'Quality checking the contextualization of 3D model and assets for {{default_location}}:{{source_name}} after possible manual changes' +rawTables: + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_good' + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_bad' + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_manual_input' + +source: '{{source_name}}' + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.config.Config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.config.Config.yaml new file mode 100644 index 00000000..9d3d3609 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_oid_fileshare_annotation_quality_check.config.Config.yaml @@ -0,0 +1,12 @@ +externalId: 
'ep_ctx_3d_{{default_location}}_{{source_name}}_annotation_quality_check' +config: + data: + debug: False + rawdb: '3d_{{default_location}}_{{source_name}}' + rawTableGood: 'contextualization_good' + rawTableBad: 'contextualization_bad' + rawTableManual: 'contextualization_manual_input' + assetRootExtId: {{external_root_id_asset}} + 3dDataSetExtId: 'ds_3d_{{default_location}}' + 3dModelName: '{{3d_model_name}}' + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.ExtractionPipeline.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.ExtractionPipeline.yaml new file mode 100644 index 00000000..b7821989 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.ExtractionPipeline.yaml @@ -0,0 +1,9 @@ +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_upload_asset_hierarchy' +name: 'ctx:3d:{{default_location}}:{{source_name}}:upload_asset_hierarchy' +dataSetExternalId: 'ds_3d_{{default_location}}' +description: 'Upload Asset hierarchy (DM nodes) from CSV for {{default_location}}:{{source_name}}' +source: '{{source_name}}' +documentation: > + Run this pipeline to create or update Asset instances in CDF Data Modeling from a CSV file. + CSV columns: externalId (required), name, description, parentExternalId, tags, aliases, assetType. + Set fileExternalId (and dataSetExternalId) in config or in the run input to point to the CSV in CDF Files. 
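  A minimal example CSV (the rows are illustrative; `CLV` matches the demo root asset used elsewhere in this module):

  ```csv
  externalId,name,description,parentExternalId,tags,aliases,assetType
  CLV,CLOV field,Root asset,,,,
  CLV/FPSOH,FPSO hull,Hull of the FPSO,CLV,,,
  ```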
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.config.Config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.config.Config.yaml new file mode 100644 index 00000000..97d7ee41 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_asset_hierarchy.config.Config.yaml @@ -0,0 +1,14 @@ +# Config for upload asset hierarchy pipeline. Set fileExternalId (and optionally dataSetExternalId) +# to the CDF File containing the asset CSV, or pass them when triggering the run. +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_upload_asset_hierarchy' +config: + data: + parameters: + assetInstanceSpace: {{asset_instance_space}} + # DM view for asset nodes (same variables as default.config / .env) + assetViewSpace: '{{data_model_space}}' + assetViewExternalId: '{{asset_view_ext_id}}' + assetViewVersion: '{{asset_view_version}}' + # Optional: CDF File external id of the asset hierarchy CSV + # fileExternalId: 'asset_hierarchy.csv' + # dataSetExternalId: 'ds_3d_{{default_location}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.ExtractionPipeline.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.ExtractionPipeline.yaml new file mode 100644 index 00000000..6f3ecd4c --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.ExtractionPipeline.yaml @@ -0,0 +1,15 @@ +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_upload_manual_mappings' +name: 'ctx:3d:{{default_location}}:{{source_name}}:upload_manual_mappings' +dataSetExternalId: 'ds_3d_{{default_location}}' +description: 'Upload manual 3D–asset mappings from CSV to RAW table for 
{{default_location}}:{{source_name}}' +rawTables: + - dbName: '3d_{{default_location}}_{{source_name}}' + tableName: 'contextualization_manual_input' +source: '{{source_name}}' +documentation: > + Run this pipeline to upload manual mapping rows from a CSV file into the + contextualization_manual_input RAW table. The CSV must have columns 3DId and assetId + (and optionally 3dNodeName, assetExternalId, contextualized). + + Either set fileExternalId (and dataSetExternalId) in the pipeline config to a CDF File + containing the CSV, or pass them when triggering the pipeline run. diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.config.Config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.config.Config.yaml new file mode 100644 index 00000000..513283a1 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/ctx_3d_upload_manual_mappings.config.Config.yaml @@ -0,0 +1,12 @@ +# Config for upload manual mappings pipeline. Optional: fileExternalId, dataSetExternalId +# to read CSV from CDF Files; if omitted, trigger the pipeline run with input containing +# fileExternalId (and optionally dataSetExternalId). +externalId: 'ep_ctx_3d_{{default_location}}_{{source_name}}_upload_manual_mappings' +config: + data: + parameters: + rawdb: '3d_{{default_location}}_{{source_name}}' + rawTableManual: 'contextualization_manual_input' + # Optional: CDF File external id of the CSV to upload (e.g. 
manual_mappings.csv) + # fileExternalId: 'manual_mappings.csv' + # dataSetExternalId: 'ds_3d_{{default_location}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.ExtractionPipeline.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.ExtractionPipeline.yaml new file mode 100644 index 00000000..15043e96 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.ExtractionPipeline.yaml @@ -0,0 +1,21 @@ +externalId: 'ep_src_3d_{{default_location}}_{{source_name}}' +name: 'src:3d:{{default_location}}:{{source_name}}' +dataSetExternalId: 'ds_3d_{{default_location}}' +description: 'File source extraction pipeline with configuration for File extractor reading 3D data from {{default_location}}:{{source_name}}' +source: '{{source_name}}' +documentation: > + The Cognite File Extractor is a generic file extractor that can connect to and extract files from a host of different sources. + + + The currently supported sources are: + * Local files + * File Transfer Protocol (FTP) + * Secure File Transfer Protocol (SFTP) + * File Transfer Protocol over TLS (FTPS) + * Documents and files in SharePoint Online + * Files in Google Cloud Storage + + + The extractor is provided with template configuration files for all the mentioned source systems. To set up the file extractor against your source system, pick the config template that fits your setup and adjust parameters as you see fit. 
+ + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.config.Config.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.config.Config.yaml new file mode 100644 index 00000000..f5b0677e --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/extraction_pipelines/source_3d_oid_fileshare.config.Config.yaml @@ -0,0 +1,22 @@ +# 3D file extractor: watches a local folder and uploads files to CDF. Set file_extractor_watch_path in config.dev.yaml. +externalId: 'ep_src_3d_{{default_location}}_{{source_name}}' +config: + logger: + console: + level: INFO + file: + level: INFO + path: "c:/tmp/file_extractor.log" + + cognite: + data-set: + external-id: 'ds_3d_{{default_location}}' + + files: + extensions: + - .zip + - .nwd + - .rvm + file-provider: + type: local + path: '{{file_extractor_watch_path}}' \ No newline at end of file diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check.yaml new file mode 100644 index 00000000..1660e709 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check.yaml @@ -0,0 +1,15 @@ +# Toolkit function definition. The code directory must have the same name as externalId. 
+- name: 'context:3d:cad:asset:contextualization:quality_check' + externalId: 'fn_context_3d_asset_contextualization_quality_check' + owner: 'Anonymous' + description: 'Quality checking the contextualization of 3D model and assets for {{default_location}}:{{source_name}} after possible manual changes' + metadata: + version: '0.1.0' + envVars: + ENV_TYPE: '${CDF_BUILD_TYPE}' + CDF_ENV: '${CDF_ENVIRON}' + cpu: 0.60 + runtime: 'py311' + functionPath: './handler.py' + dataSetExternalId: 'ds_3d_{{default_location}}' + space: '{{function_space}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/__init__.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/config.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/config.py new file mode 100644 index 00000000..6fe88668 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/config.py @@ -0,0 +1,59 @@ +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any + +import yaml +from cognite.client import CogniteClient + + +@dataclass +class ContextConfig: + extraction_pipeline_ext_id: str + debug: bool + rawdb: str + raw_table_good: str + raw_table_bad: str + raw_table_manual: str + three_d_data_set_ext_id: str + three_d_model_name: str + asset_root_ext_id: str + + @classmethod + def load(cls, data: dict[str, Any]) -> ContextConfig: + return cls( + extraction_pipeline_ext_id=data["ExtractionPipelineExtId"], + debug=data["debug"], + rawdb=data["rawdb"], + 
raw_table_good=data["rawTableGood"], + raw_table_bad=data["rawTableBad"], + raw_table_manual=data["rawTableManual"], + three_d_data_set_ext_id=data["3dDataSetExtId"], + three_d_model_name=data["3dModelName"], + asset_root_ext_id=data["assetRootExtId"], + ) + + +def load_config_parameters(client: CogniteClient, function_data: dict[str, Any]) -> ContextConfig: + """ + Retrieves the configuration parameters from the function data and loads the configuration from CDF. + Configuration is loaded from the extraction pipeline configuration and the function data. + + Args: + client: Instance of CogniteClient + function_data: dictionary containing the function input configuration data + + Returns: + ContextConfig object + """ + try: + extraction_pipeline_ext_id = function_data["ExtractionPipelineExtId"] + except KeyError: + raise ValueError("Missing parameter 'ExtractionPipelineExtId' in function data") + + serialized_config = client.extraction_pipelines.config.retrieve(extraction_pipeline_ext_id) + if serialized_config: + data = yaml.safe_load(serialized_config.config)["data"] + data["ExtractionPipelineExtId"] = extraction_pipeline_ext_id + return ContextConfig.load(data) + raise ValueError(f"No configuration found in pipeline {extraction_pipeline_ext_id!r}") diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/get_resources.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/get_resources.py new file mode 100644 index 00000000..eafbc7cb --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/get_resources.py @@ -0,0 +1,46 @@ +from typing import Any + +from cognite.client import CogniteClient +from config import ContextConfig + + +def get_treed_asset_mappings( + client: CogniteClient, model_id: int, revision_id: int +) -> dict[str, 
Any]: + mappings = client.three_d.asset_mappings.list(model_id=model_id, revision_id=revision_id, limit=-1) + existing_matches = {} + # Group mapped 3D node ids per asset id + for mapping in mappings: + existing_matches.setdefault(mapping.asset_id, []).append(mapping.node_id) + return existing_matches + + +def get_3d_model_id_and_revision_id( + client: CogniteClient, config: ContextConfig, three_d_model_name: str +) -> tuple[int, int]: + try: + model_id_list = [ + model.id + for model in client.three_d.models.list(published=True, limit=-1) + if model.name == three_d_model_name + ] + if not model_id_list: + raise ValueError(f"3D model with name {three_d_model_name} not found") + model_id = model_id_list[0] + + revision_list = client.three_d.revisions.list(model_id=model_id, published=True) + if not revision_list: + raise ValueError(f"3D model with name {three_d_model_name} has no published revisions") + revision = revision_list[0] # use the first published revision returned + + print(f"INFO: For Model name: {three_d_model_name} using 3D model ID: {model_id} - revision ID: {revision.id}") + print("If wrong model ID/revision remove other published versions of the model and try again") + + return model_id, revision.id + + except Exception as e: + raise RuntimeError( + f"ERROR: Not able to get entities for 3D nodes in data set: {config.three_d_data_set_ext_id} - error: {e}" + ) from e \ No newline at end of file diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/handler.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/handler.py new file mode 100644 index 00000000..08e1134e --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/handler.py @@ -0,0 +1,65 @@ +from __future__ import annotations + +import os 
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent))
+
+from cognite.client import ClientConfig, CogniteClient
+from cognite.client.credentials import OAuthClientCredentials
+
+from config import load_config_parameters
+from pipeline import run_quality_check
+
+
+def handle(data: dict, client: CogniteClient) -> dict:
+    """
+    Function handler for the 3D contextualization quality check.
+    Note: the entry point must be named `handle` for CDF Functions to invoke it.
+
+    Args:
+        data: dictionary containing the function input configuration data (by default only the ExtractionPipelineExtId)
+        client: Instance of CogniteClient
+
+    Returns:
+        dict with the run status and the function input data
+    """
+    config = load_config_parameters(client, data)
+    run_quality_check(client, config)
+    return {"status": "succeeded", "data": data}
+
+
+def run_local():
+    """Code used for local test & debug."""
+    required_envvars = ("CDF_PROJECT", "CDF_CLUSTER", "IDP_CLIENT_ID", "IDP_CLIENT_SECRET", "IDP_TOKEN_URL")
+    if missing := [envvar for envvar in required_envvars if envvar not in os.environ]:
+        raise ValueError(f"Missing one or more environment variables: {missing}")
+    cdf_project_name = os.environ["CDF_PROJECT"]
+    cdf_cluster = os.environ["CDF_CLUSTER"]
+    client_id = os.environ["IDP_CLIENT_ID"]
+    client_secret = os.environ["IDP_CLIENT_SECRET"]
+    token_uri = os.environ["IDP_TOKEN_URL"]
+
+    base_url = f"https://{cdf_cluster}.cognitedata.com"
+    scopes = f"{base_url}/.default"
+
+    client = CogniteClient(
+        ClientConfig(
+            client_name=cdf_project_name,
+            base_url=base_url,
+            project=cdf_project_name,
+            credentials=OAuthClientCredentials(
+                token_url=token_uri,
+                client_id=client_id,
+                client_secret=client_secret,
+                scopes=[scopes],
+            ),
+        )
+    )
+    data = {"ExtractionPipelineExtId": "ep_ctx_3d_clov_sharepoint_annotation_quality_check"}
+    handle(data, client)
+
+
+if __name__ == "__main__":
+    run_local()
diff --git
a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/pipeline.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/pipeline.py new file mode 100644 index 00000000..442799e5 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/pipeline.py @@ -0,0 +1,70 @@ +from cognite.client import CogniteClient +from cognite.extractorutils.uploader import RawUploadQueue + +from config import ContextConfig + +from get_resources import get_treed_asset_mappings, get_3d_model_id_and_revision_id +from write_resources import write_mapping_to_raw + + +def run_quality_check(client: CogniteClient, config: ContextConfig): + raw_uploader = RawUploadQueue(cdf_client=client, max_queue_size=500_000, trigger_log_level="INFO") + + model_id, revision_id = get_3d_model_id_and_revision_id(client, config, config.three_d_model_name) + mappings = get_treed_asset_mappings(client, model_id=model_id, revision_id=revision_id) + raw_table_good = client.raw.rows.retrieve_dataframe(db_name=config.rawdb, table_name=config.raw_table_good, + limit=-1) + raw_table_bad = client.raw.rows.retrieve_dataframe(db_name=config.rawdb, table_name=config.raw_table_bad, limit=-1) + raw_table_manual = client.raw.rows.retrieve_dataframe(db_name=config.rawdb, table_name=config.raw_table_manual, + limit=-1) + + new_mappings, old_mappings = compare_raw_to_real(client, mappings, raw_table_good, raw_table_bad) + update_raw_tables(client, config, raw_uploader, new_mappings, old_mappings, raw_table_good, raw_table_bad, + raw_table_manual) + + +def compare_raw_to_real(client, mappings, raw_table_good, raw_table_bad): + # Convert raw tables to lists of dictionaries for easier manipulation + good_matches = raw_table_good.to_dict(orient='records') + bad_matches = raw_table_bad.to_dict(orient='records') + + # 
Create sets of (assetId, 3DId) tuples for good and bad mappings
+    good_mappings = {(entry['assetId'], entry['3DId']) for entry in good_matches}
+    bad_mappings = {(entry['assetId'], entry['3DId']) for entry in bad_matches}
+
+    # Flatten the mappings dictionary into a set of (assetId, nodeId) tuples
+    all_mappings = {(asset_id, node_id) for asset_id, node_ids in mappings.items() for node_id in node_ids}
+
+    # New mappings: in the model but not in the good raw table (kept as a set of tuples so an asset with several nodes is not collapsed)
+    new_mappings = all_mappings - good_mappings
+
+    # Old mappings: in the good raw table but no longer in the model
+    old_mappings = good_mappings - all_mappings
+
+    return new_mappings, old_mappings
+
+
+def update_raw_tables(client, config, raw_uploader, new_mappings, old_mappings, raw_table_good, raw_table_bad,
+                      raw_table_manual):
+    # Convert raw tables to lists of dictionaries
+    good_matches = raw_table_good.to_dict(orient='records')
+    bad_matches = raw_table_bad.to_dict(orient='records')
+    manual_entries = raw_table_manual.to_dict(orient='records')
+
+    # Add new mappings to the good matches and record them in manual entries with manualAction: created
+    for asset_id, node_id in new_mappings:
+        new_entry = {'asset name': client.assets.retrieve(id=asset_id).name, 'assetId': asset_id, '3DId': node_id}
+        good_matches.append(new_entry)
+        manual_entries.append({**new_entry, 'manualAction': 'created'})
+        # Remove the mapping from bad matches if it exists
+        bad_matches = [entry for entry in bad_matches if not (entry['assetId'] == asset_id and entry['3DId'] == node_id)]
+
+    # Remove old mappings from the good matches and record them in manual entries with manualAction: deleted
+    for asset_id, node_id in old_mappings:
+        old_entry = next((entry for entry in good_matches if entry['assetId'] == asset_id and entry['3DId'] == node_id), None)
+        if old_entry:
+            
manual_entries.append({**old_entry, 'manualAction': 'deleted'})
+            good_matches = [entry for entry in good_matches if not (entry['assetId'] == asset_id and entry['3DId'] == node_id)]
+
+    write_mapping_to_raw(client, config, raw_uploader, good_matches, bad_matches, manual_entries)
+
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/requirements.txt b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/requirements.txt
new file mode 100644
index 00000000..eb16b83d
--- /dev/null
+++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/requirements.txt
@@ -0,0 +1,3 @@
+cognite-extractor-utils>=7
+cognite-sdk>=7.26
+pyyaml>=6
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/write_resources.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/write_resources.py
new file mode 100644
index 00000000..a85271b0
--- /dev/null
+++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_asset_contextualization_quality_check/write_resources.py
@@ -0,0 +1,62 @@
+from cognite.client import CogniteClient
+from cognite.client.data_classes import Row
+from cognite.client.exceptions import CogniteAPIError
+from cognite.extractorutils.uploader import RawUploadQueue
+from typing import List
+from config import ContextConfig
+import numpy as np
+
+
+def delete_table_if_needed(client: CogniteClient, db: str, tbl: str) -> None:
+    try:
+        client.raw.tables.delete(db, [tbl])
+        print(f"INFO: Deleted table {db}/{tbl}")
+    except CogniteAPIError as e:
+        # Re-raise any error other than "table not found" (404)
+        if e.code != 404:
+            raise
+        else:
+            print(f"INFO: Table {tbl} not found in {db}, nothing to delete.")
+
+
+def clean_data(entries: List[dict]) -> List[dict]:
+    # Replace NaN/inf float values with None so the rows can be serialized for RAW
+    cleaned_entries = []
+    for entry in entries:
+        cleaned_entry = {k: (v if not (isinstance(v, float) and (np.isnan(v) or np.isinf(v))) else None) for k, v in entry.items()}
+        cleaned_entries.append(cleaned_entry)
+    return cleaned_entries
+
+
+def write_mapping_to_raw(client: CogniteClient, config: ContextConfig, raw_uploader: RawUploadQueue,
+                         good_matches: List[dict], bad_matches: List[dict], manual_entries: List[dict]) -> None:
+
+    # Clean data to ensure no NaN or infinite values
+    good_matches = clean_data(good_matches)
+    bad_matches = clean_data(bad_matches)
+    manual_entries = clean_data(manual_entries)
+
+    if good_matches:
+        print(f"INFO: Clean up GOOD table: {config.rawdb}/{config.raw_table_good} before writing new status")
+        delete_table_if_needed(client, config.rawdb, config.raw_table_good)
+        for match in good_matches:
+            raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_good, Row(match["3DId"], match))
+        raw_uploader.upload()
+        print(f"INFO: Added {len(good_matches)} to {config.rawdb}/{config.raw_table_good}")
+
+    if bad_matches:
+        print(f"INFO: Clean up BAD table: {config.rawdb}/{config.raw_table_bad} before writing new status")
+        delete_table_if_needed(client, config.rawdb, config.raw_table_bad)
+        for not_match in bad_matches:
+            raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_bad, Row(not_match["3DId"], not_match))
+        raw_uploader.upload()
+        print(f"INFO: Added {len(bad_matches)} to {config.rawdb}/{config.raw_table_bad}")
+
+    if manual_entries:
+        print(f"INFO: Clean up MANUAL table: {config.rawdb}/{config.raw_table_manual} before writing new status")
+        delete_table_if_needed(client, config.rawdb, config.raw_table_manual)
+        for entry in manual_entries:
+            
raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_manual, Row(entry["3DId"], entry)) + raw_uploader.upload() + print(f"INFO: Added {len(manual_entries)} to {config.rawdb}/{config.raw_table_manual}") + + print("INFO: Upload complete.") diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization.yaml new file mode 100644 index 00000000..ae31014f --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization.yaml @@ -0,0 +1,16 @@ +# Toolkit function definition. The code directory must have the same name as externalId. +# Variables (e.g. {{default_location}}, {{source_name}}) are substituted from config.dev.yaml at build. +- name: 'context:3d:cad:asset:contextualization' + externalId: 'fn_context_3d_cad_asset_contextualization' + owner: 'Anonymous' + description: 'Contextualization of 3d model and asset data for {{default_location}}:{{source_name}}' + metadata: + version: '0.1.0' + envVars: + ENV_TYPE: '${CDF_BUILD_TYPE}' + CDF_ENV: '${CDF_ENVIRON}' + cpu: 0.60 + runtime: 'py311' + functionPath: './handler.py' + dataSetExternalId: 'ds_3d_{{default_location}}' + space: '{{function_space}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/__init__.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/apply_dm_cad_contextualization.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/apply_dm_cad_contextualization.py new file mode 
100644 index 00000000..70744205 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/apply_dm_cad_contextualization.py @@ -0,0 +1,398 @@ +""" +DM-only 3D CAD contextualization via dedicated CDF endpoint. + +Uses POST /api/v1/projects/{project}/3d/contextualization/cad to create the full +DM chain: Asset.object3D → Cognite3DObject ← CADNode → CADRevision. + +Also ensures CADModel/CADRevision and SceneConfiguration exist. +Called from pipeline after writing good matches to RAW (when use_dm_cad_contextualization is True). +""" +from __future__ import annotations + +import time +from typing import Any, Optional + +import requests +from cognite.client import CogniteClient +from cognite.client.data_classes.data_modeling import ( + DataModelApply, + EdgeApply, + NodeApply, + NodeOrEdgeData, + ViewId, +) + +from config import ContextConfig, resolve_dm_cad_contextualization_config +from get_resources import get_3d_model_id_and_revision_id +from logger import log + + +def _get_cad_node_view(required_views: list[ViewId]) -> ViewId: + return next( + (v for v in required_views if v.external_id == "CogniteCADNode"), + ViewId("cdf_cdm", "CogniteCADNode", "v1"), + ) + + +def run( + client: CogniteClient, + config: ContextConfig, + model_id: Optional[int] = None, + revision_id: Optional[int] = None, + *, + ensure_views: bool = True, +) -> None: + """ + Run DM-only CAD contextualization: ensure CAD revision/scene, then apply + mappings from config.rawdb / config.raw_table_good via the dedicated API. + + model_id and revision_id can be passed in (e.g. from the pipeline) or + resolved from config.three_d_model_name when either is None. + Uses config.asset_dm_space and config.cad_node_dm_space (or defaults). 
+ """ + if model_id is None or revision_id is None: + model_id, revision_id = get_3d_model_id_and_revision_id( + client, config, config.three_d_model_name + ) + log.info(f"Resolved model_id={model_id}, revision_id={revision_id} from config.three_d_model_name={config.three_d_model_name!r}") + + resolved = resolve_dm_cad_contextualization_config(config) + asset_space = config.asset_dm_space + ctx_space = asset_space # CADNodes in same space as assets for IT location + raw_db = config.rawdb + raw_table = config.raw_table_good + revision_ext_id = f"cog_3d_revision_{revision_id}" + model_ext_id = f"cog_3d_model_{model_id}" + scene_model_ext = resolved.scene_model_ext_id or f"clov_3d_model_{model_id}" + + if config.debug: + log.info("apply_dm_cad_contextualization: debug=True, skipping DM CAD apply") + return + + # 1) Ensure CADModel + CADRevision + _ensure_cad_revision( + client, resolved.cad_space, model_ext_id, revision_ext_id, revision_id, + resolved.cad_model_name, resolved.cad_model_type, + resolved.views["cad_model_view"], resolved.views["cad_revision_view"], + ) + + # 2) Optionally add required views to data model + if ensure_views: + _ensure_dm_views(client, resolved.dm_space, resolved.dm_ext_id, resolved.dm_version, resolved.required_views) + + # 3) Apply contextualization from RAW via dedicated API + _apply_contextualization( + client, asset_space, ctx_space, resolved.cad_space, revision_ext_id, + raw_db, raw_table, model_id, revision_id, + batch_size=resolved.batch_size, + required_views=resolved.required_views, + ) + + # 4) Ensure SceneConfiguration + _ensure_scene( + client, resolved.scene_space, resolved.scene_ext_id, scene_model_ext, revision_id, resolved.cad_model_name, + resolved.views["scene_config_view"], resolved.views["scene_model_view"], resolved.views["rev_props_view"], + ) + + # 5) Cleanup legacy cog_3d_node_* nodes + cad_node_view = _get_cad_node_view(resolved.required_views) + _cleanup_legacy_cadnodes(client, resolved.cad_space, ctx_space, 
cad_node_view, resolved.batch_size) + + log.info("apply_dm_cad_contextualization: completed successfully") + + +def _ensure_cad_revision( + client: CogniteClient, + cad_space: str, + model_ext_id: str, + revision_ext_id: str, + revision_id: int, + cad_model_name: str, + cad_model_type: str, + cad_model_view: ViewId, + cad_revision_view: ViewId, +) -> None: + client.data_modeling.instances.apply( + nodes=[ + NodeApply( + space=cad_space, + external_id=model_ext_id, + sources=[ + NodeOrEdgeData( + source=cad_model_view, + properties={"name": cad_model_name, "type": cad_model_type}, + ) + ], + ) + ] + ) + log.info(f"CADModel: {cad_space}/{model_ext_id} (name={cad_model_name!r}, type={cad_model_type!r})") + + client.data_modeling.instances.apply( + nodes=[ + NodeApply( + space=cad_space, + external_id=revision_ext_id, + sources=[ + NodeOrEdgeData( + source=cad_revision_view, + properties={ + "revisionId": revision_id, + "published": True, + "status": "Done", + "type": cad_model_type, + "model3D": {"space": cad_space, "externalId": model_ext_id}, + }, + ), + NodeOrEdgeData(source=cad_model_view, properties={"type": cad_model_type}), + ], + ) + ] + ) + log.info(f"CADRevision: {cad_space}/{revision_ext_id} (revisionId={revision_id})") + + +def _ensure_dm_views( + client: CogniteClient, + dm_space: str, + dm_ext_id: str, + dm_version: str, + required_views: list[ViewId], +) -> None: + dms = client.data_modeling.data_models.retrieve( + (dm_space, dm_ext_id, dm_version), inline_views=False + ) + if not dms: + log.warning(f"Data model {dm_space}/{dm_ext_id}/{dm_version} not found — skipping view injection") + return + dm = dms[0] + existing = {(v.space, v.external_id, v.version) for v in dm.views} + added = [v for v in required_views if (v.space, v.external_id, v.version) not in existing] + if added: + dm.views.extend(added) + client.data_modeling.data_models.apply( + DataModelApply( + space=dm_space, + external_id=dm_ext_id, + version=dm_version, + name=dm.name, + 
description=dm.description or "", + views=dm.views, + ) + ) + log.info(f"Added {len(added)} views to data model {dm_space}/{dm_ext_id}/{dm_version}") + else: + log.info(f"All required 3D views already present in data model {dm_space}/{dm_ext_id}/{dm_version}") + + +def _get_cluster_token(client: CogniteClient) -> str: + """Return the cluster-scoped Bearer token from the SDK credentials.""" + return client._config.credentials.authorization_header()[1] + + +def _apply_contextualization( + client: CogniteClient, + asset_space: str, + ctx_space: str, + cad_space: str, + revision_ext_id: str, + raw_db: str, + raw_table: str, + model_id: int, + revision_id: int, + *, + batch_size: int = 100, + required_views: list[ViewId], +) -> None: + from cognite.client.data_classes.data_modeling import NodeId + from cognite.client.data_classes.data_modeling import filters as dm_filters + + cad_view = _get_cad_node_view(required_views) + for sp in {ctx_space, cad_space}: + existing = client.data_modeling.instances.list( + instance_type="node", + space=sp, + filter=dm_filters.HasData(views=[(cad_view.space, cad_view.external_id, cad_view.version)]), + limit=-1, + ) + api_nodes = [ + NodeId(n.space, n.external_id) + for n in existing + if n.external_id.startswith("cog_3d_cadnode_") + ] + if api_nodes: + for i in range(0, len(api_nodes), batch_size): + client.data_modeling.instances.delete(nodes=api_nodes[i : i + batch_size]) + log.info(f"Deleted {len(api_nodes)} existing cog_3d_cadnode_* in {sp}") + + rows = list(client.raw.rows.list(raw_db, raw_table, limit=-1)) + items = [] + seen = set() + skipped_fake = 0 + for r in rows: + aid = r.columns.get("assetId") or r.columns.get("assetExternalId") + nid = r.columns.get("3DId") + if aid is None or nid is None: + continue + nid = int(nid) + # Skip fake/demo nodeIds — real 3D node IDs in CDF are always large numbers. + # Small IDs (< 100_000) are placeholder test data that don't exist in the model. 
+ if nid < 100_000: + skipped_fake += 1 + log.warning(f"Skipping item assetId={aid!r} nodeId={nid} — nodeId looks like a demo/fake ID (< 100,000). Update contextualization_manual_input with real 3D node IDs.") + continue + key = (str(aid), nid) + if key in seen: + continue + seen.add(key) + items.append( + { + "asset": {"instanceId": {"space": asset_space, "externalId": str(aid)}}, + "nodeId": nid, + } + ) + + if skipped_fake: + log.warning(f"Skipped {skipped_fake} items with fake/demo nodeIds. Populate contextualization_manual_input with real 3D node IDs from the model.") + + if not items: + log.info("No valid items to contextualize (all nodeIds were fake/demo data)") + return + + log.info(f"Applying contextualization for {len(items)} valid items from {raw_db}/{raw_table} ({skipped_fake} fake nodeIds skipped)") + + project = client._config.project + api_config = { + "object3DSpace": asset_space, + "contextualizationSpace": ctx_space, + "revision": { + "instanceId": {"space": cad_space, "externalId": revision_ext_id}, + }, + } + + cluster_token = _get_cluster_token(client) + # Extract cluster name from base_url: "https://{cluster}.cognitedata.com" + cluster = client._config.base_url.rstrip("/").removeprefix("https://").removesuffix(".cognitedata.com") + url = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/3d/contextualization/cad" + headers = {"Authorization": cluster_token, "Content-Type": "application/json"} + + log.info(f"Using contextualization URL: {url}") + + for i in range(0, len(items), batch_size): + batch = items[i : i + batch_size] + resp = requests.post( + url, + headers=headers, + json={"items": batch, "dmsContextualizationConfig": api_config}, + timeout=60, + ) + if resp.status_code != 200: + log.error(f"Contextualization API error {resp.status_code}: {resp.text[:500]}") + raise RuntimeError(f"Contextualization API failed ({resp.status_code}): {resp.text}") + log.info(f"Batch {i // batch_size + 1}/{(len(items) + batch_size - 1) // 
batch_size}: {len(batch)} items OK") + time.sleep(0.1) + + log.info(f"Contextualization done: {len(items)} asset-3D links applied") + + +def _ensure_scene( + client: CogniteClient, + scene_space: str, + scene_ext_id: str, + scene_model_ext: str, + revision_id: int, + cad_model_name: str, + scene_config_view: ViewId, + scene_model_view: ViewId, + rev_props_view: ViewId, +) -> None: + client.data_modeling.instances.apply( + nodes=[ + NodeApply( + space=scene_space, + external_id=scene_model_ext, + sources=[NodeOrEdgeData(source=scene_model_view, properties={"name": cad_model_name})], + ) + ] + ) + client.data_modeling.instances.apply( + nodes=[ + NodeApply( + space=scene_space, + external_id=scene_ext_id, + sources=[ + NodeOrEdgeData( + source=scene_config_view, + properties={ + "name": f"{cad_model_name} Scene", + "description": f"3D contextualization scene for {cad_model_name} model", + "cameraTranslationX": 0.0, + "cameraTranslationY": 0.0, + "cameraTranslationZ": 50.0, + "cameraEulerRotationX": 0.0, + "cameraEulerRotationY": 0.0, + "cameraEulerRotationZ": 0.0, + }, + ) + ], + ) + ] + ) + edge_ext = f"{scene_ext_id}_to_{scene_model_ext}" + client.data_modeling.instances.apply( + edges=[ + EdgeApply( + space=scene_space, + external_id=edge_ext, + type={"space": "scene", "externalId": "SceneConfiguration.model3ds"}, + start_node={"space": scene_space, "externalId": scene_ext_id}, + end_node={"space": scene_space, "externalId": scene_model_ext}, + sources=[ + NodeOrEdgeData( + source=rev_props_view, + properties={ + "revisionId": revision_id, + "translationX": 0.0, + "translationY": 0.0, + "translationZ": 0.0, + "eulerRotationX": 0.0, + "eulerRotationY": 0.0, + "eulerRotationZ": 0.0, + "scaleX": 1.0, + "scaleY": 1.0, + "scaleZ": 1.0, + "defaultVisible": True, + }, + ) + ], + ) + ] + ) + log.info(f"Scene: {scene_space}/{scene_ext_id}, model {scene_model_ext} (revisionId={revision_id})") + + +def _cleanup_legacy_cadnodes( + client: CogniteClient, + cad_space: str, + 
ctx_space: str,
+    cad_node_view: ViewId,
+    batch_size: int = 100,
+) -> None:
+    from cognite.client.data_classes.data_modeling import NodeId
+    from cognite.client.data_classes.data_modeling import filters as dm_filters
+
+    to_delete = []
+    for sp in {cad_space, ctx_space}:
+        for n in client.data_modeling.instances.list(
+            instance_type="node",
+            space=sp,
+            filter=dm_filters.HasData(views=[(cad_node_view.space, cad_node_view.external_id, cad_node_view.version)]),
+            limit=-1,
+        ):
+            if n.external_id.startswith("cog_3d_node_"):
+                to_delete.append(NodeId(n.space, n.external_id))
+    if not to_delete:
+        return
+    for i in range(0, len(to_delete), batch_size):
+        client.data_modeling.instances.delete(nodes=to_delete[i : i + batch_size])
+    log.info(f"Cleaned up {len(to_delete)} legacy cog_3d_node_* CADNodes")
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/config.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/config.py
new file mode 100644
index 00000000..a7d42445
--- /dev/null
+++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/config.py
@@ -0,0 +1,272 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any, Optional
+
+import yaml
+from cognite.client import CogniteClient
+from cognite.client.data_classes.data_modeling import ViewId
+from cognite.client.exceptions import CogniteAPIError
+from pydantic import BaseModel
+from pydantic.alias_generators import to_camel
+
+
+class Parameters(BaseModel, alias_generator=to_camel):
+    debug: bool
+    run_all: bool
+    rawdb: str
+    raw_table_good: str
+    raw_table_bad: str
+    raw_table_all: str
+    raw_table_manual: str
+    three_d_data_set_ext_id: str
+    three_d_model_name: str
+    asset_dm_space: str
+    cad_node_dm_space: str = ""
+    match_threshold: float
= 0.6 + keep_old_mapping: bool + node_name_prefixes: Optional[list[str]] = None + node_name_max_slashes: Optional[int] = None + name_replacements: Optional[list[dict[str, str]]] = None + suffixes_to_strip: Optional[list[str]] = None + asset_root_ext_id: Optional[str] = None + asset_subtree_external_ids: Optional[list[str]] = None + threed_from_quantum: bool = False + raw_table_rule: Optional[str] = None + # DM-only: apply via POST /3d/contextualization/cad after writing to RAW + use_dm_cad_contextualization: bool = True + cad_space: Optional[str] = None # e.g. "rmdm"; default used if empty + dm_data_model_space: Optional[str] = None + dm_data_model_ext_id: Optional[str] = None + dm_data_model_version: Optional[str] = None + scene_external_id: Optional[str] = None + scene_model_external_id: Optional[str] = None + cad_model_name: Optional[str] = None # default: three_d_model_name + cad_model_type: Optional[str] = None # default: "CAD" + # Optional list of { space, externalId, version } for DM views to ensure; default in code if omitted + required_views: Optional[list[dict[str, str]]] = None + # Optional overrides (pipeline config; variables from config.dev.yaml at build); all default in code if omitted + default_cad_space: Optional[str] = None + default_dm_space: Optional[str] = None + dm_ext_id: Optional[str] = None + dm_version: Optional[str] = None + default_scene_space: Optional[str] = None + cad_contextualization_batch_size: Optional[int] = None + cad_model_view: Optional[dict[str, str]] = None # { space, externalId, version } + cad_revision_view: Optional[dict[str, str]] = None + scene_config_view: Optional[dict[str, str]] = None + scene_model_view: Optional[dict[str, str]] = None + rev_props_view: Optional[dict[str, str]] = None + asset_view_space: Optional[str] = None + asset_view_ext_id: Optional[str] = None + asset_view_version: Optional[str] = None + + + +class ConfigData(BaseModel, alias_generator=to_camel): + parameters: Parameters + + +class 
Config(BaseModel, alias_generator=to_camel): + data: ConfigData + extraction_pipeline_ext_id: str = "" + + @property + def params(self) -> Parameters: + return self.data.parameters + + def __getattr__(self, name: str) -> Any: + """Expose Parameters fields as top-level attributes for backward compatibility.""" + if "data" in self.__dict__ and self.data is not None: + return getattr(self.data.parameters, name) + raise AttributeError(f"{type(self).__name__!r} object has no attribute {name!r}") + + +# Alias for backward compatibility with pipeline/get_resources/write_resources. +ContextConfig = Config + + +def load_config_parameters(client: CogniteClient, function_data: dict[str, Any]) -> Config: + """ + Retrieves the configuration parameters from the function data and loads the configuration from CDF. + + Args: + client: Instance of CogniteClient + function_data: dictionary containing the function input configuration data + + Returns: + Config object (supports config.rawdb, config.three_d_model_name, etc., and config.extraction_pipeline_ext_id) + """ + if "ExtractionPipelineExtId" not in function_data: + raise ValueError("Missing parameter 'ExtractionPipelineExtId' in function data") + + pipeline_ext_id = function_data["ExtractionPipelineExtId"] + try: + raw_config = client.extraction_pipelines.config.retrieve(pipeline_ext_id) + if raw_config.config is None: + raise ValueError(f"No config found for extraction pipeline: {pipeline_ext_id!r}") + except CogniteAPIError: + raise RuntimeError(f"Not able to retrieve pipeline config for extraction pipeline: {pipeline_ext_id!r}") + + config = Config.model_validate(yaml.safe_load(raw_config.config)) + return config.model_copy(update={"extraction_pipeline_ext_id": pipeline_ext_id}) + + +# --- DM CAD contextualization: resolved config (all defaults applied) --- + +_DEFAULT_REQUIRED_VIEWS = [ + ViewId("cdf_cdm", "Cognite3DObject", "v1"), + ViewId("cdf_cdm", "CogniteCADNode", "v1"), + ViewId("cdf_cdm", "CogniteCADRevision", "v1"), 
+ ViewId("cdf_cdm", "CogniteCADModel", "v1"), + ViewId("cdf_cdm", "Cognite3DRevision", "v1"), + ViewId("cdf_cdm", "Cognite3DModel", "v1"), + ViewId("cdf_cdm", "Cognite3DTransformation", "v1"), + ViewId("cdf_cdm", "CogniteVisualizable", "v1"), + ViewId("cdf_cdm", "Cognite360Image", "v1"), + ViewId("cdf_cdm", "Cognite360ImageAnnotation", "v1"), + ViewId("cdf_cdm", "CogniteAnnotation", "v1"), + ViewId("cdf_cdm", "Cognite360ImageCollection", "v1"), + ViewId("cdf_cdm", "Cognite360ImageStation", "v1"), + ViewId("cdf_cdm", "CognitePointCloudVolume", "v1"), + ViewId("cdf_cdm", "CognitePointCloudRevision", "v1"), + ViewId("scene", "SceneConfiguration", "v1"), + ViewId("scene", "RevisionProperties", "v1"), + ViewId("cdf_3d_schema", "Cdf3dModel", "1"), +] + +_DEFAULT_CAD_VIEWS = { + "cad_model_view": ViewId("cdf_cdm", "CogniteCADModel", "v1"), + "cad_revision_view": ViewId("cdf_cdm", "CogniteCADRevision", "v1"), + "scene_config_view": ViewId("scene", "SceneConfiguration", "v1"), + "scene_model_view": ViewId("cdf_3d_schema", "Cdf3dModel", "1"), + "rev_props_view": ViewId("scene", "RevisionProperties", "v1"), +} + + +def _get_opt(config: Config, name: str, default: Any) -> Any: + return getattr(config, name, default) + + +def view_id_from_dict(d: dict[str, Any]) -> ViewId: + """Build ViewId from dict with space, externalId/external_id, version (from pipeline config variables).""" + space = d.get("space") + ext_id = d.get("externalId") or d.get("external_id") + version = d.get("version", "v1") + if not space or not ext_id: + raise ValueError(f"View dict must have space and externalId/external_id: {d}") + return ViewId(str(space), str(ext_id), str(version)) + + +def resolve_required_views(config: Config) -> list[ViewId]: + """Build list of ViewIds from config.required_views or use built-in default.""" + raw = _get_opt(config, "required_views", None) + if isinstance(raw, list) and len(raw) > 0: + out = [] + for v in raw: + if not isinstance(v, dict): + continue + space = 
v.get("space") + ext_id = v.get("externalId") or v.get("external_id") + version = v.get("version", "v1") + if space and ext_id: + out.append(ViewId(str(space), str(ext_id), str(version))) + if out: + return out + return list(_DEFAULT_REQUIRED_VIEWS) + + +def get_cad_node_view(config: Config) -> ViewId: + """Return CogniteCADNode ViewId from config.required_views or default (for pre_ml_mappings, get_resources, etc.).""" + views = resolve_required_views(config) + return next( + (v for v in views if v.external_id == "CogniteCADNode"), + ViewId("cdf_cdm", "CogniteCADNode", "v1"), + ) + + +def resolve_cad_views(config: Config) -> dict[str, ViewId]: + """Resolve CAD/Scene view ViewIds from pipeline config (variables substituted at build).""" + out = {} + for key, default in _DEFAULT_CAD_VIEWS.items(): + raw = _get_opt(config, key, None) + if isinstance(raw, dict): + out[key] = view_id_from_dict(raw) + else: + out[key] = default + return out + + +@dataclass(frozen=True) +class DMCadContextualizationConfig: + """Resolved config for DM-only CAD contextualization (all defaults applied).""" + + cad_space: str + dm_space: str + dm_ext_id: str + dm_version: str + scene_space: str + scene_ext_id: str + scene_model_ext_id: Optional[str] # None → use f"clov_3d_model_{model_id}" at use site + batch_size: int + cad_model_name: str + cad_model_type: str + views: dict[str, ViewId] # cad_model_view, cad_revision_view, scene_config_view, scene_model_view, rev_props_view + required_views: list[ViewId] + + +def resolve_dm_cad_contextualization_config(config: Config) -> DMCadContextualizationConfig: + """ + Resolve all DM CAD contextualization settings from pipeline config (with defaults). + Use this in apply_dm_cad_contextualization.run() so config reading lives in config.py. 
+ """ + default_cad = _get_opt(config, "default_cad_space", None) or "rmdm" + cad_space = ( + _get_opt(config, "cad_node_dm_space", None) + or _get_opt(config, "cad_space", None) + or default_cad + ) or default_cad + + dm_space = ( + _get_opt(config, "dm_data_model_space", None) + or _get_opt(config, "default_dm_space", None) + or "upstream-value-chain" + ) + dm_ext_id = ( + _get_opt(config, "dm_data_model_ext_id", None) + or _get_opt(config, "dm_ext_id", None) + or "upstream_value_chain" + ) + dm_version = ( + _get_opt(config, "dm_data_model_version", None) + or _get_opt(config, "dm_version", None) + or "v1" + ) + scene_space = ( + _get_opt(config, "default_scene_space", None) + or _get_opt(config, "scene_space", None) + or "scene" + ) + scene_ext_id = _get_opt(config, "scene_external_id", None) or "clov_navisworks_scene" + scene_model_ext_id = _get_opt(config, "scene_model_external_id", None) + batch_size = _get_opt(config, "cad_contextualization_batch_size", None) or 100 + cad_model_name = _get_opt(config, "cad_model_name", None) or config.three_d_model_name + cad_model_type = _get_opt(config, "cad_model_type", None) or "CAD" + + views = resolve_cad_views(config) + required_views = resolve_required_views(config) + + return DMCadContextualizationConfig( + cad_space=cad_space, + dm_space=dm_space, + dm_ext_id=dm_ext_id, + dm_version=dm_version, + scene_space=scene_space, + scene_ext_id=scene_ext_id, + scene_model_ext_id=scene_model_ext_id, + batch_size=batch_size, + cad_model_name=cad_model_name, + cad_model_type=cad_model_type, + views=views, + required_views=required_views, + ) diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/configBkp.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/configBkp.py new file mode 100644 index 00000000..092d67d8 --- /dev/null +++ 
b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/configBkp.py @@ -0,0 +1,67 @@ +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any + +import yaml +from cognite.client import CogniteClient + + +@dataclass +class ContextConfig: + extraction_pipeline_ext_id: str + debug: bool + run_all: bool + rawdb: str + raw_table_good: str + raw_table_bad: str + raw_table_all: str + raw_table_manual: str + three_d_data_set_ext_id: str + three_d_model_name: str + asset_root_ext_id: str + match_threshold: float + keep_old_mapping: bool + + @classmethod + def load(cls, data: dict[str, Any]) -> ContextConfig: + return cls( + extraction_pipeline_ext_id=data["ExtractionPipelineExtId"], + debug=data["debug"], + run_all=False, # data["runAll"], + rawdb=data["rawdb"], + raw_table_good=data["rawTableGood"], + raw_table_bad=data["rawTableBad"], + raw_table_all=data["rawTableAll"], + raw_table_manual=data["rawTableManual"], + three_d_data_set_ext_id=data["3dDataSetExtId"], + three_d_model_name=data["3dModelName"], + asset_root_ext_id=data["assetRootExtId"], + match_threshold=0.6, # data["matchThreshold"], + keep_old_mapping=data["keepOldMapping"], + ) + + +def load_config_parameters(client: CogniteClient, function_data: dict[str, Any]) -> ContextConfig: + """ + Retrieves the configuration parameters from the function data and loads the configuration from CDF. + Configuration is loaded from the extraction pipeline configuration and the function data. 
+ + Args: + client: Instance of CogniteClient + function_data: dictionary containing the function input configuration data + + Returns: + ContextConfig object + """ + try: + extraction_pipeline_ext_id = function_data["ExtractionPipelineExtId"] + except KeyError: + raise ValueError("Missing parameter 'ExtractionPipelineExtId' in function data") + + serialized_config = client.extraction_pipelines.config.retrieve(extraction_pipeline_ext_id) + if serialized_config: + data = yaml.safe_load(serialized_config.config)["data"] + data["ExtractionPipelineExtId"] = extraction_pipeline_ext_id + return ContextConfig.load(data) + raise ValueError(f"No configuration found in pipeline {extraction_pipeline_ext_id!r}") diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/constants.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/constants.py new file mode 100644 index 00000000..7bd43f89 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/constants.py @@ -0,0 +1,15 @@ +ML_MODEL_FEATURE_TYPE = "sourcesubtokens" +COL_MATCH_KEY = "name" + +COL_KEY_MAN_MAPPING_3D_NODE_NAME = "3dNodeName" # Node for 3d node not mapped related to manual mapping +COL_KEY_MAN_MAPPING_3D_NODE_ID = "3dId" # Node for 3d node not mapped related to manual mapping +COL_KEY_MAN_MAPPING_ASSET_EXTID = "assetExternalId" # ExternalID Col name for Asset related to manual mapping +COL_KEY_MAN_MAPPING_ASSET_ID = "assetId" # ID Col name for Asset related to manual mapping +COL_KEY_MAN_CONTEXTUALIZED = "contextualized" # Col name for if mapping is done for manual mapping + +# Rule-based mapping (RAW table columns) +COL_KEY_RULE_REGEXP_ENTITY = "regexpEntity" +COL_KEY_RULE_REGEXP_ASSET = "regexpAsset" + +FUNCTION_NAME = "3D & Asset contextualization" +MAX_MODEL_SIZE_TO_CREATE_MODEL = 100000 # Max number of 
rows to use for training ML model
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/get_resources.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/get_resources.py
new file mode 100644
index 00000000..a72aa0f7
--- /dev/null
+++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/get_resources.py
@@ -0,0 +1,787 @@
+from __future__ import annotations
+
+import io
+import json
+import os
+import re
+import sys
+from pathlib import Path
+from typing import Any
+
+from cognite.client import CogniteClient
+from cognite.client.data_classes import Asset, ContextualizationJob, Row
+from cognite.client.data_classes.data_modeling import NodeApply, NodeOrEdgeData, ViewId
+from cognite.client.data_classes.three_d import ThreeDAssetMapping
+
+sys.path.append(str(Path(__file__).parent))
+
+from config import ContextConfig
+from constants import (
+    COL_KEY_MAN_CONTEXTUALIZED,
+    COL_KEY_MAN_MAPPING_3D_NODE_NAME,
+    COL_KEY_MAN_MAPPING_3D_NODE_ID,
+    COL_KEY_MAN_MAPPING_ASSET_EXTID,
+    COL_KEY_MAN_MAPPING_ASSET_ID,
+    COL_MATCH_KEY,
+    MAX_MODEL_SIZE_TO_CREATE_MODEL,
+    ML_MODEL_FEATURE_TYPE,
+)
+from logger import log
+
+
+def manual_table_exists(client: CogniteClient, config: ContextConfig) -> bool:
+    tables = client.raw.tables.list(config.rawdb, limit=None)
+    return any(tbl.name == config.raw_table_manual for tbl in tables)
+
+
+def read_manual_mappings(client: CogniteClient, config: ContextConfig) -> list[dict[str, Any]]:
+    raw_table_manual = client.raw.rows.retrieve_dataframe(
+        db_name=config.rawdb, table_name=config.raw_table_manual, limit=-1
+    )
+    return [
+        {"targetId": row["assetId"], "sourceId": row["3DId"]}
+        for _, row in raw_table_manual.iterrows()
+    ]
+
+
+def build_cad_node_lookup(client: CogniteClient, instance_space: str) -> dict[int, str]:
+    """
Build a lookup from node3DId (int) to CogniteCADNode external_id in DM. + + Uses filter-only (no sources) to avoid SDK v7.x sources deserialization issue. + node3DId is extracted from the standard externalId pattern cog_3d_cadnode_{node3d_id}. + """ + from cognite.client.data_classes.data_modeling import filters as dm_filters + + cad_node_view = ViewId("cdf_cdm", "CogniteCADNode", "v1") + _cv = f"{cad_node_view.external_id}/{cad_node_view.version}" + + nodes = client.data_modeling.instances.list( + instance_type="node", + space=instance_space, + filter=dm_filters.HasData(views=[(cad_node_view.space, cad_node_view.external_id, cad_node_view.version)]), + limit=-1, + ) + lookup: dict[int, str] = {} + for node in nodes: + # Extract node3DId from the standard externalId pattern: cog_3d_cadnode_{node3d_id} + if node.external_id.startswith("cog_3d_cadnode_"): + try: + node3d_id = int(node.external_id.split("cog_3d_cadnode_")[1]) + lookup[node3d_id] = node.external_id + except ValueError: + pass + log.info(f"Built CogniteCADNode lookup with {len(lookup)} entries from space '{instance_space}'") + return lookup + + +def create_cad_node_mappings( + client: CogniteClient, + cad_node_lookup: dict[int, str], + dm_mappings: list[tuple[int, str, str]], +) -> None: + """Update CogniteCADNode.asset direct relation for each (node3d_id, asset_ext_id, space) mapping.""" + cad_node_view = ViewId("cdf_cdm", "CogniteCADNode", "v1") + nodes_to_update: list[NodeApply] = [] + + for node3d_id, asset_ext_id, instance_space in dm_mappings: + cad_node_ext_id = cad_node_lookup.get(node3d_id) + if not cad_node_ext_id: + log.warning(f"CogniteCADNode not found for node3DId={node3d_id}, skipping") + continue + nodes_to_update.append( + NodeApply( + space=instance_space, + external_id=cad_node_ext_id, + sources=[NodeOrEdgeData( + source=cad_node_view, + properties={"asset": {"space": instance_space, "externalId": asset_ext_id}}, + )], + ) + ) + + _BATCH = 1000 + for i in range(0, 
len(nodes_to_update), _BATCH):
+        client.data_modeling.instances.apply(nodes=nodes_to_update[i : i + _BATCH])
+    log.info(f"Updated {len(nodes_to_update)} CogniteCADNode asset mappings (DM)")
+
+
+def get_asset_id_ext_id_mapping(manual_mappings: list[Row]) -> dict[str, Any]:
+    """
+    Build a lookup from 3D node name to asset IDs from the manual mapping rows.
+    The asset internal ID is used later to attach 3D nodes to assets.
+
+    Args:
+        manual_mappings: list of manual mappings
+
+    Returns:
+        dictionary with 3D node name as key and [asset id, asset external id] as value
+    """
+    try:
+        three_d_node_asset_id = {}
+        for mapping in manual_mappings:
+            three_d_node_asset_id[mapping[COL_KEY_MAN_MAPPING_3D_NODE_NAME]] = [
+                mapping[COL_KEY_MAN_MAPPING_ASSET_ID],
+                mapping[COL_KEY_MAN_MAPPING_ASSET_EXTID],
+            ]
+
+        return three_d_node_asset_id
+
+    except Exception as e:
+        raise Exception(f"ERROR: Not able to read list of assets from {manual_mappings}. Error: {type(e)}({e})")
+
+
+def get_3d_model_id_and_revision_id(
+    client: CogniteClient, config: ContextConfig, three_d_model_name: str
+) -> tuple[int, int]:
+    """
+    Look up 3D model ID and revision ID from DM only (no classic 3D API).
+
+    Uses DM property filters without `sources` to avoid SDK v7.x deserialization issues.
+ Numeric IDs are extracted from Cognite's standard externalId patterns: + model: cog_3d_model_{model_id} + revision: cog_3d_revision_{revision_id} + """ + try: + from cognite.client.data_classes.data_modeling import filters as dm_filters + + model_view = ViewId("cdf_cdm", "CogniteCADModel", "v1") + revision_view = ViewId("cdf_cdm", "CogniteCADRevision", "v1") + _mv = f"{model_view.external_id}/{model_view.version}" + _rv = f"{revision_view.external_id}/{revision_view.version}" + + # 1) Find CogniteCADModel by name — filter only, no sources (avoids SDK v7.x deserialization bug) + model_nodes = client.data_modeling.instances.list( + instance_type="node", + filter=dm_filters.Equals( + property=[model_view.space, _mv, "name"], + value=three_d_model_name, + ), + limit=10, + ) + + model_id: int | None = None + model_ext_id: str | None = None + model_space: str | None = None + + for node in model_nodes: + if node.external_id.startswith("cog_3d_model_"): + model_ext_id = node.external_id + model_id = int(node.external_id.split("cog_3d_model_")[1]) + model_space = node.space + log.info(f"Found CogniteCADModel '{three_d_model_name}' space='{model_space}' ext_id={model_ext_id} id={model_id}") + break + + if model_id is None: + raise ValueError(f"No CogniteCADModel with name='{three_d_model_name}' found in DM") + + # 2) Find CogniteCADRevision for this model — filter by model3D direct relation + revision_nodes = client.data_modeling.instances.list( + instance_type="node", + filter=dm_filters.Equals( + property=[revision_view.space, _rv, "model3D"], + value={"space": model_space, "externalId": model_ext_id}, + ), + limit=10, + ) + + revision_id: int | None = None + for node in revision_nodes: + if node.external_id.startswith("cog_3d_revision_"): + revision_id = int(node.external_id.split("cog_3d_revision_")[1]) + log.info(f"Found CogniteCADRevision space='{node.space}' ext_id={node.external_id} revision_id={revision_id}") + break + + if revision_id is None: + raise ValueError( 
+            f"No CogniteCADRevision found for model '{three_d_model_name}' (model externalId={model_ext_id})"
+        )
+
+        log.info(f"Resolved: model='{three_d_model_name}' model_id={model_id} revision_id={revision_id}")
+        return model_id, revision_id
+
+    except Exception as e:
+        raise Exception(
+            f"ERROR: Not able to get entities for 3D nodes in data set: {config.three_d_data_set_ext_id} - error: {e}"
+        )
+
+
+def get_mapping_to_delete(client: CogniteClient, model_id: int, revision_id: int) -> list[ThreeDAssetMapping]:
+    return client.three_d.asset_mappings.list(model_id=model_id, revision_id=revision_id, limit=-1)
+
+
+def get_treed_asset_mappings(
+    client: CogniteClient, model_id: int, revision_id: int, existing_matches: dict[str, Any]
+) -> dict[str, Any]:
+    mappings = client.three_d.asset_mappings.list(model_id=model_id, revision_id=revision_id, limit=-1)
+
+    for mapping in mappings.data:
+        if mapping.asset_id in existing_matches:
+            existing_matches[mapping.asset_id].append(mapping.node_id)
+        else:
+            existing_matches[mapping.asset_id] = [mapping.node_id]
+
+    return existing_matches
+
+
+def filter_3d_nodes(
+    client: CogniteClient,
+    config: ContextConfig,
+    model_id: int,
+    revision_id: int,
+    manual_mappings: list[Row],
+) -> dict[str, Any]:
+    """
+    Read the 3D nodes referenced by the manual mappings, filtered on node name.
+
+    Args:
+        client: Instance of CogniteClient
+        config: Instance of ContextConfig
+        model_id: 3D model ID
+        revision_id: 3D model revision ID
+        manual_mappings: list of manual mappings
+
+    Returns:
+        dict mapping node name to list of {id, subtree_size, tree_index} entries
+    """
+    tree_d_nodes = {}
+
+    node_names = [mapping[COL_KEY_MAN_MAPPING_3D_NODE_NAME] for mapping in manual_mappings]
+    try:
+        # read 3D nodes from API with filter on node names
+        three_d_nodes = client.three_d.revisions.filter_nodes(
+            model_id=model_id,
+            revision_id=revision_id,
+            properties={"Item": {"Name": node_names}},
+            partitions=10,
+            limit=-1,
+        )
+
+        num_nodes = 0
+        for node in three_d_nodes:
+            if node.name and node.name != "":
+                num_nodes += 1
+
+                if node.name in tree_d_nodes:
+                    tree_d_nodes[node.name].append(
+                        {
+                            "id": node.id,
+                            "subtree_size": node.subtree_size,
+                            "tree_index": node.tree_index,
+                        }
+                    )
+                else:
+                    tree_d_nodes[node.name] = [
+                        {
+                            "id": node.id,
+                            "subtree_size": node.subtree_size,
+                            "tree_index": node.tree_index,
+                        }
+                    ]
+
+        log.info(
+            f"Total number of 3D node names found for manual mapping: {num_nodes} - unique names: {len(tree_d_nodes)}"
+        )
+
+        return tree_d_nodes
+
+    except Exception as e:
+        raise Exception(
+            f"ERROR: Not able to get entities for 3D nodes in data set: {config.three_d_data_set_ext_id} - error: {e}"
+        )
+
+
+def get_3d_nodes(
+    client: CogniteClient,
+    config: ContextConfig,
+    asset_entities: list[dict[str, Any]],
+    model_id: int,
+    revision_id: int,
+    numNodes: int = -1,
+    threed_from_quantum: bool = False,
+) -> tuple[list[dict[str, Any]], dict[str, Any]]:
+    """
+    Read 3D nodes for the model revision and build matching entities.
+
+    Reads all nodes from the 3D API and caches them as a JSON file in CDF, unless a
+    cached node file already exists and runAll is False.
+
+    Args:
+        client: Instance of CogniteClient
+        config: Instance of ContextConfig
+        asset_entities: list of asset entities (reserved for name filtering)
+        model_id: 3D model ID
+        revision_id: 3D model revision ID
+
+    Returns:
+        list of entities and dict of cleaned node name -> list of node info
+    """
+    entities: list[dict[str, Any]] = []
+    cdf_3d_nodes = []
+    three_d_nodes = {}
+    input_three_d_nodes = None
+
+    three_d_model_name = config.three_d_model_name
+    try:
+        quantum_three_d_nodes = []
+        if threed_from_quantum:
+            column_three_d_name = "3DName"
+            quantum_df = client.raw.rows.retrieve_dataframe(
+                db_name="ds_qc", table_name="table:quantum_3d_qc1", limit=None, columns=[column_three_d_name]
+            )
+            quantum_three_d_nodes.extend(quantum_df[column_three_d_name].astype(str).tolist())
+
+        _ds = client.data_sets.retrieve(external_id=config.three_d_data_set_ext_id) if config.three_d_data_set_ext_id else None
+        three_d_data_set_id = _ds.id if _ds else None
+
+        model_file_name = f"3D_nodes_{three_d_model_name}_id_{model_id}_rev_id_{revision_id}.json"
+        if not config.run_all:
+            # Reuse the cached node dump from a previous run, if one exists
+            three_d_file = client.files.retrieve(external_id=model_file_name)
+            if three_d_file:
+                file_content = client.files.download_bytes(external_id=model_file_name)
+                input_three_d_nodes = json.loads(file_content)
+
+        if not input_three_d_nodes:
+            input_three_d_nodes_cdf = []
+
+            nodes = client.three_d.revisions.list_nodes(
+                model_id=model_id,
+                revision_id=revision_id,
+                sort_by_node_id=True,
+                partitions=500,
+                limit=-1,
+            )
+
+            input_three_d_nodes_cdf.extend(nodes)
+
+            for node in input_three_d_nodes_cdf:
+                if node.name and node.name != "":
+                    cdf_3d_nodes.append(node.dump())
+
+            # Cache the node dump in CDF Files for the next run
+            file_content = json.dumps(cdf_3d_nodes)
+            json_bytes = file_content.encode("utf-8")
+            binary_io = io.BytesIO(json_bytes)
+            client.files.upload_bytes(
+                binary_io,
+                external_id=model_file_name,
+                name=model_file_name,
+                overwrite=True,
+                data_set_id=three_d_data_set_id,
+            )
+
+            log.info(f"Uploaded {model_file_name} to CDF.")
+
+            input_three_d_nodes = json.loads(file_content)
+
+        num_nodes = 0
+        if input_three_d_nodes:
+            import pandas as pd
+
+            # Convert the list of dictionaries to a DataFrame
+            df_nodes = pd.DataFrame(input_three_d_nodes)
+
+            # Name normalizer: use config if available, else generic default for 3D paths
+            replacements = getattr(config, "name_replacements", None)
+            suffixes_to_strip = getattr(config, "suffixes_to_strip", None)
+
+            def clean_name(name: str) -> str:
+                original_name = name
+                # Generic 3D path: take from first "/", drop trailing "/-suffix", normalize separators
+                if "/" in name:
+                    name = re.sub(r"/-.+", "", name[name.index("/") :])
+                    name = name.replace("/", "").replace(".", "-")
+                name = _normalize_name_generic(name, replacements=replacements, suffixes_to_strip=suffixes_to_strip)
+                if not name:
+                    name = original_name
+                log.debug(f"{original_name}, {name}")
+                return name
+
+            # Filter nodes: quantum list, or config prefixes/slashes, or no filter (default)
+            if quantum_three_d_nodes:
+                df_nodes = df_nodes[df_nodes["name"].isin(quantum_three_d_nodes)]
+            else:
+                node_prefixes = getattr(config,
"node_name_prefixes", None)
+                max_slashes = getattr(config, "node_name_max_slashes", None)
+                if node_prefixes:
+                    df_nodes = df_nodes[df_nodes["name"].str.startswith(tuple(node_prefixes))]
+                if max_slashes is not None:
+                    df_nodes = df_nodes[df_nodes["name"].str.count("/") <= max_slashes]
+
+            df_nodes["mod_node_name"] = df_nodes["name"].apply(clean_name)
+
+            # Convert the DataFrame back to a list of dictionaries
+            input_three_d_nodes = df_nodes.to_dict(orient="records")
+
+            for node in input_three_d_nodes:
+                if node["name"] and node["name"] != "":
+                    num_nodes += 1
+                    mod_node_name = node["mod_node_name"]
+
+                    node_info = {
+                        "id": node["id"],
+                        "subtree_size": node["subtreeSize"],
+                        "tree_index": node["treeIndex"],
+                    }
+                    if mod_node_name in three_d_nodes:
+                        three_d_nodes[mod_node_name].append(node_info)
+                    else:
+                        three_d_nodes[mod_node_name] = [node_info]
+
+                    entities = get_3d_entities(node, mod_node_name, entities)
+
+        log.info(
+            f"Total number of 3D nodes found: {num_nodes} - unique names to match after name filtering: {len(three_d_nodes)}"
+        )
+
+        return entities, three_d_nodes
+
+    except Exception as e:
+        raise 
Exception(f"ERROR: Not able to get 3D nodes in data set: {config.three_d_data_set_ext_id} - error: {e}")
+
+
+def get_3d_entities(node: dict[str, Any], modNodeName: str, entities: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    """
+    Process 3D node metadata and create an entity used as input to contextualization.
+
+    Args:
+        node: metadata for 3D node
+        modNodeName: modified node name
+        entities: already processed entities
+
+    Returns:
+        list of entities
+    """
+
+    # add an entity used to match between 3D nodes and assets
+    entities.append(
+        {
+            "id": node["id"],
+            "name": modNodeName,
+            "external_id": node["treeIndex"],
+            "org_name": node["name"],
+            "type": "3dNode",
+        }
+    )
+    return entities
+
+
+def replace_3d_entities(node: dict[str, Any], modNodeName: str, entities: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    """
+    Process 3D node metadata and replace the entity with the same modNodeName.
+
+    Args:
+        node: metadata for 3D node
+        modNodeName: modified node name
+        entities: already processed entities
+
+    Returns:
+        list of entities
+    """
+
+    # Create a dictionary for quick lookup by name
+    entity_dict = {entity["name"]: entity for entity in entities}
+
+    # Create the new entity
+    new_entity = {
+        "id": node["id"],
+        "name": modNodeName,
+        "external_id": node["treeIndex"],
+        "org_name": node["name"],
+        "type": "3dNode",
+    }
+
+    # Replace the entity with the same modNodeName
+    entity_dict[modNodeName] = new_entity
+
+    # Convert the dictionary back to a list
+    return list(entity_dict.values())
+
+
+def _normalize_name_generic(
+    name: str,
+    replacements: list[dict[str, str]] | None = None,
+    suffixes_to_strip: list[str] | None = None,
+) -> str:
+    """
+    Generic name normalizer: apply replacements, strip suffixes, then split on [-_] and rejoin.
+    Used for asset and 3D node names when config-driven or default.
+    """
+    if not name:
+        return name
+    replacements = replacements or []
+    suffixes_to_strip = suffixes_to_strip or []
+    for r in replacements:
+        from_val = r.get("from") or r.get("from_val")
+        to_val = r.get("to") or r.get("to_val") or ""
+        if from_val is not None:
+            name = name.replace(from_val, to_val)
+    for suffix in suffixes_to_strip:
+        if name.endswith(suffix):
+            name = name[: -len(suffix)]
+    parts = re.split(r"[-_\s]+", name)
+    return "".join(p for p in parts if p)
+
+
+def tag_is_dummy(asset: Asset) -> bool:
+    custom_description = (asset.metadata or {}).get("Description", "")
+    return "DUMMY TAG" in custom_description.upper()
+
+
+def tag_is_package(asset: Asset) -> bool:
+    if not asset.metadata or "Class" not in asset.metadata:
+        return False
+    return "PKG" in asset.metadata["Class"]
+
+
+def get_assets(
+    client: CogniteClient,
+    config: Any,
+    existing_matches: list[dict[str, Any]],
+    read_limit: int,
+) -> list[dict[str, Any]]:
+    """
+    Get assets from DM (DataModelOnly project; classic Assets API not available).
+
+    Queries AssetExtension nodes from config.asset_dm_space. Optionally restricts to
+    nodes whose externalId starts with one of config.asset_subtree_external_ids or
+    config.asset_root_ext_id (treats the IDs as path prefixes, e.g. 'CLV/').
+ """ + from cognite.client.data_classes.data_modeling import filters as dm_filters + + entities: list[dict[str, Any]] = [] + try: + instance_space = getattr(config, "asset_dm_space", None) or os.getenv("ASSET_INSTANCE_SPACE") + asset_view_space = getattr(config, "asset_view_space", instance_space) + asset_view_ext_id = getattr(config, "asset_view_ext_id",None) or os.getenv("ASSET_VIEW_EXT_ID") + asset_view_version = getattr(config, "asset_view_version",None) or os.getenv("ASSET_VIEW_VERSION") + + nodes = client.data_modeling.instances.list( + instance_type="node", + space=instance_space, + filter=dm_filters.HasData( + views=[(asset_view_space, asset_view_ext_id, asset_view_version)] + ), + limit=read_limit if read_limit > 0 else -1, + ) + + # Optional prefix filtering by asset subtree / root + subtree_ids: list[str] = getattr(config, "asset_subtree_external_ids", None) or [] + root_ext_id: str | None = getattr(config, "asset_root_ext_id", None) + if not subtree_ids and root_ext_id: + subtree_ids = [root_ext_id] + + def _in_subtree(ext_id: str) -> bool: + if not subtree_ids: + return True + return any( + ext_id == prefix or ext_id.startswith(prefix + "/") or ext_id.startswith(prefix + "-") + for prefix in subtree_ids + ) + + replacements = getattr(config, "name_replacements", None) + suffixes_to_strip = getattr(config, "suffixes_to_strip", None) + + for node in nodes: + if not _in_subtree(node.external_id): + continue + + # Derive display name from externalId (last path segment) + raw_name = node.external_id.split("/")[-1] + name = _normalize_name_generic(raw_name, replacements=replacements, suffixes_to_strip=suffixes_to_strip) + if not name or len(name) <= 3: + continue + + entities.append( + { + "id": node.external_id, # DM: use externalId as id (no numeric id) + "name": name, + "external_id": node.external_id, + "org_name": raw_name, + "type": "asset", + } + ) + + log.info(f"Number of DM assets found: {len(entities)} (space='{instance_space}', 
subtree={subtree_ids or 'all'})") + return entities + + except Exception as e: + root = getattr(config, "asset_root_ext_id", "?") + raise Exception( + f"ERROR: Not able to get entities for asset extId root: {root}. Error: {type(e)}({e})" + ) + + +def get_matches( + client: CogniteClient, match_to: list[dict[str, Any]], match_from: list[dict[str, Any]], manual_mappings +) -> list[ContextualizationJob]: + """ + Create / Update entity matching model and run job to get matches + + Args: + client: Instance of CogniteClient + match_to: list of entities to match to (target) + match_from: list of entities to match from (source) + manual_mappings + + Returns: + list of matches + """ + + more_to_match = True + all_matches = [] + match_size = MAX_MODEL_SIZE_TO_CREATE_MODEL + min_match_size = int(MAX_MODEL_SIZE_TO_CREATE_MODEL / 4) + offset = 0 + retry_num = 3 + match_array = [] + + try: + # limit number input nodes to create model + if len(match_from) > MAX_MODEL_SIZE_TO_CREATE_MODEL: + sources = match_from[:MAX_MODEL_SIZE_TO_CREATE_MODEL] + else: + sources = match_from + + if len(match_to) > MAX_MODEL_SIZE_TO_CREATE_MODEL: + targets = match_to[:MAX_MODEL_SIZE_TO_CREATE_MODEL] + else: + targets = match_to + + def transform_dict_to_list_of_dicts(input_dict): + result = [] + for key, value in input_dict.items(): + result.append({"targetId": key, "sourceId": int(value[0])}) + return result + + model = client.entity_matching.fit( + sources=sources, + targets=targets, + match_fields=[(COL_MATCH_KEY, COL_MATCH_KEY)], + feature_type=ML_MODEL_FEATURE_TYPE, + # true_matches=manual_mappings + ) + + while more_to_match: + if len(match_from) < offset + match_size: + more_to_match = False + match_array = match_from[offset:] + else: + match_array = match_from[offset : offset + match_size] + + log.info(f"Run mapping of number of nodes from: {offset} to {offset + len(match_array)}") + + try: + job = model.predict(sources=match_array, targets=targets, num_matches=1) + 
job.wait_for_completion()
+                matches = job.result
+                all_matches = all_matches + matches["items"]
+                offset += match_size
+                retry_num = 3  # reset retry
+            except Exception as e:
+                retry_num -= 1
+                if retry_num < 0:
+                    raise Exception(f"Not able to run mapping job, giving up after retry - error: {e}") from e
+                more_to_match = True
+                if int(match_size / 2) > min_match_size:
+                    match_size = int(match_size / 2)
+                log.error(f"Not able to run mapping job - error: {e}")
+
+        return all_matches
+
+    except Exception as e:
+        raise Exception(f"ERROR: Failed to get matching model and run fit / matching. Error: {type(e)}({e})")
diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/handler.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/handler.py
new file mode 100644
index 00000000..dad5f9a4
--- /dev/null
+++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/handler.py
@@ -0,0 +1,27 @@
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+from cognite.client import CogniteClient
+
+sys.path.append(str(Path(__file__).parent))
+
+from config import load_config_parameters
+from pipeline import annotate_3d_model
+
+
+def handle(data: dict, client: CogniteClient) -> dict:
+    """
+    CDF Function entry point for 3D annotation.
+ + Args: + data: dictionary containing ExtractionPipelineExtId + client: CogniteClient injected by CDF Functions runtime + + Returns: + dict with status and input data + """ + config = load_config_parameters(client, data) + annotate_3d_model(client, config) + return {"status": "succeeded", "data": data} diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/logger.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/logger.py new file mode 100644 index 00000000..69bfbc52 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/logger.py @@ -0,0 +1,37 @@ +import os +from typing import Literal + + +# Logger using print +class CogniteFunctionLogger: + def __init__(self, log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO"): + self.log_level = log_level.upper() + + def _print(self, prefix: str, message: str) -> None: + if "\n" not in message: + print(f"{prefix} {message}") + return + lines = message.split("\n") + print(f"{prefix} {lines[0]}") + prefix_len = len(prefix) + for line in lines[1:]: + print(f"{' ' * prefix_len} {line}") + + def debug(self, message: str) -> None: + if self.log_level == "DEBUG": + self._print("[DEBUG]", message) + + def info(self, message: str) -> None: + if self.log_level in ("DEBUG", "INFO"): + self._print("[INFO]", message) + + def warning(self, message: str) -> None: + if self.log_level in ("DEBUG", "INFO", "WARNING"): + self._print("[WARNING]", message) + + def error(self, message: str) -> None: + self._print("[ERROR]", message) + + +# Default module-level logger (LOG_LEVEL env var optional, e.g. 
DEBUG, INFO, WARNING, ERROR) +log = CogniteFunctionLogger(os.environ.get("LOG_LEVEL", "INFO")) diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pipeline.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pipeline.py new file mode 100644 index 00000000..3a5fda3d --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pipeline.py @@ -0,0 +1,324 @@ +from __future__ import annotations + +import sys +from pathlib import Path +from typing import Any + +from cognite.client import CogniteClient +from cognite.client.data_classes import ContextualizationJob, ExtractionPipelineRun, ThreeDAssetMapping +from cognite.client.utils._text import shorten +from cognite.extractorutils.uploader import RawUploadQueue + +sys.path.append(str(Path(__file__).parent)) + +from config import ContextConfig +from constants import ( + COL_KEY_MAN_CONTEXTUALIZED, + COL_KEY_MAN_MAPPING_3D_NODE_NAME, + COL_KEY_MAN_MAPPING_ASSET_EXTID, +) +from get_resources import ( + filter_3d_nodes, + get_3d_model_id_and_revision_id, + get_3d_nodes, + get_asset_id_ext_id_mapping, + get_assets, + get_mapping_to_delete, + get_matches, + get_treed_asset_mappings, + manual_table_exists, + read_manual_mappings, +) +from pre_ml_mappings import ( + apply_manual_mappings as apply_manual_mappings_pre_ml, + apply_rule_mappings, + rule_table_exists, +) +from write_resources import write_mapping_to_raw +from apply_dm_cad_contextualization import run as run_apply_dm_cad_contextualization +from logger import log + + +def annotate_3d_model(client: CogniteClient, config: ContextConfig) -> None: + """ + Read configuration and start process by + 1. Read RAW table with manual mappings and extract all rows not contextualized + 2. 
Apply manual mappings from 3D nodes to Asset - this will overwrite any existing mapping + 3. Read all 3D nodes not yet matched (or all if runAll is True) + 4. Read all assets + 5. Run ML contextualization to match 3D Nodes -> Assets + 6. Update 3D Nodes with mapping + 7. Write results matched (good) not matched (bad) to RAW + 8. Output in good/bad table can then be used in workflow to update manual mappings + + Args: + client: An instantiated CogniteClient + config: A dataclass containing the configuration for the annotation process + """ + log.info("Initiating 3D annotation process") + + len_good_matches = 0 + len_bad_matches = 0 + existing_matches = {} + mapping_to_delete = None + numAsset = -1 if not config.debug else 10000 + + raw_uploader = RawUploadQueue(cdf_client=client, max_queue_size=500_000, trigger_log_level="INFO") + + try: + # get model id and revision id based on name + model_id, revision_id = get_3d_model_id_and_revision_id(client, config, config.three_d_model_name) + + use_dm_cad = getattr(config, "use_dm_cad_contextualization", True) + if not use_dm_cad: + if config.run_all or not config.keep_old_mapping: + mapping_to_delete = get_mapping_to_delete(client, model_id, revision_id) + client.three_d.asset_mappings.delete(model_id, revision_id, mapping_to_delete) + if config.keep_old_mapping and not config.run_all: + existing_matches = get_treed_asset_mappings(client, model_id, revision_id, existing_matches) + + asset_entities = get_assets(client, config, existing_matches, numAsset) + if not asset_entities: + raise Exception(f"WARNING: No assets found for root asset: {config.asset_root_ext_id}") + + three_d_entities, tree_d_nodes = get_3d_nodes( + client=client, config=config, asset_entities=asset_entities, model_id=model_id, revision_id=revision_id, + threed_from_quantum=getattr(config, "threed_from_quantum", False)) + + good_matches: list[dict[str, Any]] = [] + matched_node_ids: set[int] = set() + + # 1) Manual mappings (before ML) + if
manual_table_exists(client, config): + manual_mappings = read_manual_mappings(client, config) + if manual_mappings: + log.info("Applying manual mappings before ML") + good_manual, matched_manual = apply_manual_mappings_pre_ml( + client, config, manual_mappings, model_id, revision_id, tree_d_nodes, asset_entities + ) + good_matches.extend(good_manual) + matched_node_ids |= matched_manual + + # 2) Rule-based mappings (before ML) + if rule_table_exists(client, config): + log.info("Applying rule-based mappings before ML") + good_rule, matched_rule = apply_rule_mappings( + client, config, tree_d_nodes, asset_entities, model_id, revision_id, matched_node_ids + ) + good_matches.extend(good_rule) + matched_node_ids |= matched_rule + + # 3) ML matching for remaining entities + remaining_entities = [e for e in three_d_entities if e["id"] not in matched_node_ids] + if len(remaining_entities) > 0: + match_results = get_matches(client, asset_entities, remaining_entities, []) + good_ml, bad_matches, existing_matches = select_and_apply_matches( + client, config, match_results, tree_d_nodes, model_id, revision_id, existing_matches, use_dm_cad + ) + good_matches = good_matches + good_ml + else: + bad_matches = [] + + if len(good_matches) > 0 or len(bad_matches) > 0: + write_mapping_to_raw(client, config, raw_uploader, good_matches, bad_matches) + len_good_matches = len(good_matches) + len_bad_matches = len(bad_matches) + + if use_dm_cad and len_good_matches > 0 and not config.debug: + run_apply_dm_cad_contextualization(client, config, model_id, revision_id) + + _asset_ref = ( + getattr(config, "asset_root_ext_id", None) + or getattr(config, "asset_subtree_external_ids", None) + or config.three_d_model_name + ) + msg = ( + f"Contextualization of 3D to asset root: {_asset_ref}, " + f"num 3D nodes contextualized: {len_good_matches}, num 3D nodes NOT contextualized: {len_bad_matches} " + f"(score below {config.match_threshold})" + ) + log.info(msg) + 
client.extraction_pipelines.runs.create( + ExtractionPipelineRun( + extpipe_external_id=config.extraction_pipeline_ext_id, + status="success", + message=msg, + ) + ) + except Exception as e: + _asset_ref = ( + getattr(config, "asset_root_ext_id", None) + or getattr(config, "asset_subtree_external_ids", None) + or getattr(config, "three_d_model_name", "unknown") + ) + msg = f"Contextualization of 3D to root asset: {_asset_ref} failed - Message: {e!s}" + log.error(msg) + client.extraction_pipelines.runs.create( + ExtractionPipelineRun( + extpipe_external_id=config.extraction_pipeline_ext_id, + status="failure", + message=shorten(msg, 1000), + ) + ) + + +def select_and_apply_matches( + client: CogniteClient, + config: ContextConfig, + match_results: list[ContextualizationJob], + tree_d_nodes: dict[str, Any], + model_id: int, + revision_id: int, + existing_matches: dict[str, Any], + use_dm_cad_contextualization: bool = False, +) -> tuple[list[dict], list[dict], dict[str, Any]]: + """ + Select and apply matches based on the filtering threshold. Matches with a score above the threshold are used to + map 3D nodes to assets. When a match is applied, information about the match is recorded so that the node + is marked as contextualized.
+ + Args: + client: Instance of CogniteClient + config: Instance of ContextConfig + match_results: list of matches from entity matching + tree_d_nodes: dictionary mapping cleaned 3D node name to its node instances + model_id: ID of the 3D model + revision_id: ID of the 3D model revision + existing_matches: dictionary of existing asset ID to 3D node ID mappings + use_dm_cad_contextualization: if True, skip classic asset-mapping writes (mappings are written in DM) + + Returns: + list of good matches + list of bad matches + dictionary of updated existing matches + """ + good_matches = [] + bad_matches = [] + mapped_node = [] + asset_mappings = [] + + try: + for match in match_results: + if match["matches"]: + if match["matches"][0]["score"] >= config.match_threshold: + good_matches.append(add_to_dict(match)) + else: + bad_matches.append(add_to_dict(match)) + else: + bad_matches.append(add_to_dict(match)) + + log.info(f"Got {len(good_matches)} matches with score >= {config.match_threshold}") + log.info(f"Got {len(bad_matches)} matches with score < {config.match_threshold}") + + for match in good_matches: + node_str = match["3DNameMatched"] + + if node_str not in mapped_node: + mapped_node.append(node_str) + + asset_id = match["assetId"] + node_ids = tree_d_nodes[node_str] + + for node_id in node_ids: + if asset_id in existing_matches: + # Ensure it's a list before appending + if isinstance(existing_matches[asset_id], list): + existing_matches[asset_id].append(node_id["id"]) + else: + existing_matches[asset_id] = [existing_matches[asset_id], node_id["id"]] + else: + existing_matches[asset_id] = [node_id["id"]] + + asset_mappings.append( + ThreeDAssetMapping( + node_id=node_id["id"], + asset_id=asset_id, + ) + ) + + if len(asset_mappings) > 0 and len(asset_mappings) % 10000 == 0: + if not config.debug and not use_dm_cad_contextualization: + client.three_d.asset_mappings.create( + model_id=model_id, revision_id=revision_id, asset_mapping=asset_mappings + ) + log.info(f"Updated {len(asset_mappings)} 3D mappings") + asset_mappings = [] + + if not config.debug and not use_dm_cad_contextualization and asset_mappings: + client.three_d.asset_mappings.create( + model_id=model_id, revision_id=revision_id, asset_mapping=asset_mappings + ) + log.info(f"Updated {len(asset_mappings)} nodes
with 3D mappings") + + return good_matches, bad_matches, existing_matches + + except Exception as e: + log.error(f"Failed to parse results from entity matching - error: {type(e)}({e})") + raise + + +def add_to_dict(match: dict[str, Any]) -> dict[str, Any]: + """ + Convert one entity-matching result into a flat match dictionary + + Args: + match: dictionary with match information + Returns: + dictionary with match information + """ + + try: + mFrom = match["source"] + + if len(match["matches"]) > 0: + mTo = match["matches"][0]["target"] + score = match["matches"][0]["score"] + asset_name = mTo["name"] + asset_id = mTo["id"] + asset_external_id = mTo["external_id"] + else: + score = 0 + asset_name = "_no_match_" + asset_id = None + asset_external_id = None + + return { + "matchType": "ml", + "score": score, + "3DName": mFrom["org_name"], + "3DNameQC": get_qc_friendly_3d_name(mFrom["org_name"]), + "3DNameMatched": mFrom["name"], + "3DId": mFrom["id"], + "assetName": asset_name, + "assetId": asset_id, + "assetExternalId": asset_external_id + } + except Exception as e: + raise Exception(f"ERROR: Not able to parse return object: {match} - error: {e}") from e + + +def get_qc_friendly_3d_name(s: str) -> str: + splits = s.split("/") + + if len(splits) >= 2: + return splits[1] + else: + return s + + +def remove_old_mappings( + client: CogniteClient, + mapping_to_delete: list[ThreeDAssetMapping], + existing_matches: dict[str, Any], + model_id: int, + revision_id: int, +) -> int: + delete_mapping = [] + + for mapping in mapping_to_delete: + asset_id = mapping.asset_id + + if asset_id not in existing_matches: + delete_mapping.append(mapping) + + if len(delete_mapping) > 0: + client.three_d.asset_mappings.delete(model_id=model_id, revision_id=revision_id, asset_mapping=delete_mapping) + + log.info(f"Deleted {len(delete_mapping)} old mappings") + + return len(delete_mapping) diff --git
a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pre_ml_mappings.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pre_ml_mappings.py new file mode 100644 index 00000000..550ed0a0 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/pre_ml_mappings.py @@ -0,0 +1,194 @@ +""" +Pre-ML mapping: manual and rule-based 3D → CogniteAsset mappings (DM mode). + +Applied before entity matching (ML). Writes mappings by updating the +CogniteCADNode.asset direct relation in the Core Data Model instead of +using the classic three_d.asset_mappings API. +""" +from __future__ import annotations + +import re +from typing import Any + +from cognite.client import CogniteClient +from cognite.client.data_classes.data_modeling import NodeApply, NodeOrEdgeData + +from config import get_cad_node_view +from constants import ( + COL_KEY_RULE_REGEXP_ASSET, + COL_KEY_RULE_REGEXP_ENTITY, +) +from get_resources import manual_table_exists, read_manual_mappings, build_cad_node_lookup, create_cad_node_mappings +from logger import log + + +def rule_table_exists(client: CogniteClient, config: Any) -> bool: + """Return True if the rule-based mapping RAW table exists.""" + raw_table_rule = getattr(config, "raw_table_rule", None) + if not raw_table_rule: + return False + tables = client.raw.tables.list(config.rawdb, limit=None) + return any(tbl.name == raw_table_rule for tbl in tables) + + +def read_rule_mappings(client: CogniteClient, config: Any) -> list[dict[str, str]]: + """ + Read rule-based mappings from RAW table. + Each row: regexpEntity (pattern for 3D node name), regexpAsset (pattern for asset name). 
+ """ + raw_table_rule = getattr(config, "raw_table_rule", None) + if not raw_table_rule: + return [] + try: + df = client.raw.rows.retrieve_dataframe( + db_name=config.rawdb, table_name=raw_table_rule, limit=-1 + ) + rules = [] + for _, row in df.iterrows(): + entity_pat = row.get(COL_KEY_RULE_REGEXP_ENTITY) or row.get("regexpEntity") + asset_pat = row.get(COL_KEY_RULE_REGEXP_ASSET) or row.get("regexpAsset") + if entity_pat and asset_pat: + rules.append({ + "regexp_entity": str(entity_pat).strip(), + "regexp_asset": str(asset_pat).strip(), + }) + log.info(f"Number of rule-based mapping rules: {len(rules)}") + return rules + except Exception as e: + log.warning(f"Could not read rule mappings from {config.rawdb}/{raw_table_rule}: {e}") + return [] + + +def _node_id_to_name(tree_d_nodes: dict[str, Any]) -> dict[int, str]: + """Build node3DId -> cleaned node name from tree_d_nodes.""" + out = {} + for name, node_list in tree_d_nodes.items(): + for n in node_list: + nid = n["id"] if isinstance(n.get("id"), int) else int(n["id"]) + out[nid] = name + return out + + +def _asset_ext_id_to_info(asset_entities: list[dict[str, Any]]) -> dict[str, dict[str, Any]]: + """Build asset_external_id -> {name} from asset_entities (DM: id == external_id).""" + return {a["external_id"]: {"name": a["name"]} for a in asset_entities} + + +def _qc_friendly_3d_name(s: str) -> str: + parts = s.split("/") + return parts[1] if len(parts) >= 2 else s + + +def apply_manual_mappings( + client: CogniteClient, + config: Any, + manual_mappings: list[dict[str, Any]], + model_id: int, + revision_id: int, + tree_d_nodes: dict[str, Any], + asset_entities: list[dict[str, Any]], +) -> tuple[list[dict[str, Any]], set[int]]: + """ + Apply manual mappings: update CogniteCADNode.asset in DM. + Returns good_matches list and set of matched node3DIds. 
+ """ + good_matches: list[dict[str, Any]] = [] + matched_node_ids: set[int] = set() + if not manual_mappings: + return good_matches, matched_node_ids + + instance_space = config.asset_dm_space + node_id_to_name = _node_id_to_name(tree_d_nodes) + asset_info_map = _asset_ext_id_to_info(asset_entities) + cad_node_lookup = build_cad_node_lookup(client, instance_space) + dm_mappings: list[tuple[int, str, str]] = [] + + for m in manual_mappings: + node3d_id = int(m["sourceId"]) + asset_ext_id = str(m["targetId"]) + matched_node_ids.add(node3d_id) + dm_mappings.append((node3d_id, asset_ext_id, instance_space)) + node_name = node_id_to_name.get(node3d_id, "") + asset_info = asset_info_map.get(asset_ext_id, {}) + good_matches.append({ + "matchType": "manual", + "score": 1.0, + "3DName": node_name, + "3DNameQC": _qc_friendly_3d_name(node_name), + "3DNameMatched": node_name, + "3DId": node3d_id, + "assetName": asset_info.get("name", ""), + "assetId": asset_ext_id, + "assetExternalId": asset_ext_id, + }) + + if dm_mappings and not getattr(config, "debug", False): + create_cad_node_mappings(client, cad_node_lookup, dm_mappings) + log.info(f"Applied {len(dm_mappings)} manual 3D mappings (DM)") + + return good_matches, matched_node_ids + + +def apply_rule_mappings( + client: CogniteClient, + config: Any, + tree_d_nodes: dict[str, Any], + asset_entities: list[dict[str, Any]], + model_id: int, + revision_id: int, + already_matched_node_ids: set[int], +) -> tuple[list[dict[str, Any]], set[int]]: + """ + Apply rule-based mappings: for each 3D node name not yet matched, try each rule. + If exactly one asset matches the asset regex, write the mapping via DM. 
+ """ + good_matches: list[dict[str, Any]] = [] + matched_node_ids: set[int] = set() + rules = read_rule_mappings(client, config) + if not rules: + return good_matches, matched_node_ids + + instance_space = config.asset_dm_space + asset_by_name = {a["name"]: a for a in asset_entities} + cad_node_lookup = build_cad_node_lookup(client, instance_space) + dm_mappings: list[tuple[int, str, str]] = [] + + for node_name, node_list in tree_d_nodes.items(): + node_ids = [n["id"] if isinstance(n.get("id"), int) else int(n["id"]) for n in node_list] + if any(nid in already_matched_node_ids for nid in node_ids): + continue + for rule in rules: + try: + entity_pat = re.compile(rule["regexp_entity"]) + asset_pat = re.compile(rule["regexp_asset"]) + except re.error: + log.warning(f"Invalid regex in rule: {rule}") + continue + if not entity_pat.search(node_name): + continue + matching_assets = [a for a in asset_entities if asset_pat.search(a["name"])] + if len(matching_assets) != 1: + continue + asset = matching_assets[0] + asset_ext_id = asset["external_id"] + for node3d_id in node_ids: + matched_node_ids.add(node3d_id) + dm_mappings.append((node3d_id, asset_ext_id, instance_space)) + good_matches.append({ + "matchType": "rule", + "score": 1.0, + "3DName": node_name, + "3DNameQC": _qc_friendly_3d_name(node_name), + "3DNameMatched": node_name, + "3DId": node3d_id, + "assetName": asset["name"], + "assetId": asset_ext_id, + "assetExternalId": asset_ext_id, + }) + break + + if dm_mappings and not getattr(config, "debug", False): + create_cad_node_mappings(client, cad_node_lookup, dm_mappings) + log.info(f"Applied {len(dm_mappings)} rule-based 3D mappings (DM)") + + return good_matches, matched_node_ids diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/requirements.txt b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/requirements.txt new file mode 
100644 index 00000000..a216e546 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/requirements.txt @@ -0,0 +1,4 @@ +cognite-extractor-utils>=7 +cognite-sdk>=7.26 +pyyaml>=6 +requests>=2.28 diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/__init__.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/__init__.py new file mode 100644 index 00000000..b631dfb2 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/__init__.py @@ -0,0 +1 @@ +# Tests for fn_context_3d_cad_asset_contextualization diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/conftest.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/conftest.py new file mode 100644 index 00000000..b22c4de0 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/conftest.py @@ -0,0 +1,10 @@ +"""Pytest fixtures for 3D contextualization tests.""" +from __future__ import annotations + +import sys +from pathlib import Path + +# Ensure function package is importable when running from project root or function dir +_root = Path(__file__).resolve().parent.parent +if str(_root) not in sys.path: + sys.path.insert(0, str(_root)) diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/test_pre_ml_mappings.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/test_pre_ml_mappings.py new file mode 100644 index 00000000..0b0bc0a0 --- /dev/null +++
b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/tests/test_pre_ml_mappings.py @@ -0,0 +1,167 @@ +"""Unit tests for pre-ML mappings (manual and rule-based). No CDF connection required.""" +from __future__ import annotations + +from unittest.mock import MagicMock, patch + +import pytest + +from pre_ml_mappings import ( + _asset_ext_id_to_info, + _node_id_to_name, + _qc_friendly_3d_name, + apply_manual_mappings, + apply_rule_mappings, +) + + +class TestHelpers: + def test_node_id_to_name(self): + tree_d_nodes = { + "/path/node1": [{"id": 101}, {"id": 102}], + "/path/node2": [{"id": 201}], + } + out = _node_id_to_name(tree_d_nodes) + assert out == {101: "/path/node1", 102: "/path/node1", 201: "/path/node2"} + + def test_asset_ext_id_to_info(self): + assets = [ + {"name": "Asset A", "external_id": "ext-a"}, + {"name": "Asset B", "external_id": "ext-b"}, + ] + out = _asset_ext_id_to_info(assets) + assert out["ext-a"] == {"name": "Asset A"} + assert out["ext-b"] == {"name": "Asset B"} + + def test_qc_friendly_3d_name(self): + assert _qc_friendly_3d_name("prefix/short") == "short" + assert _qc_friendly_3d_name("single") == "single" + assert _qc_friendly_3d_name("a/b/c") == "b" + + +class TestApplyManualMappings: + """Test apply_manual_mappings with mock client; debug=True so no DM writes.""" + + @pytest.fixture + def mock_client(self): + return MagicMock() + + @pytest.fixture + def mock_config(self): + config = MagicMock() + config.debug = True # skip create_cad_node_mappings + config.asset_dm_space = "sp_assets" + return config + + @pytest.fixture + def tree_d_nodes(self): + return {"/path/n1": [{"id": 100}], "/path/n2": [{"id": 200}]} + + @pytest.fixture + def asset_entities(self): + return [ + {"id": 500, "name": "Asset One", "external_id": "A1"}, + {"id": 600, "name": "Asset Two", "external_id": "A2"}, + ] + + def test_empty_manual_mappings(self, mock_client, mock_config, tree_d_nodes, asset_entities): + good,
matched = apply_manual_mappings( + mock_client, + mock_config, + [], + model_id=1, + revision_id=2, + tree_d_nodes=tree_d_nodes, + asset_entities=asset_entities, + ) + assert good == [] + assert matched == set() + + def test_manual_mappings_applied(self, mock_client, mock_config, tree_d_nodes, asset_entities): + manual = [ + {"sourceId": 100, "targetId": "A1"}, + {"sourceId": 200, "targetId": "A2"}, + ] + with patch("pre_ml_mappings.build_cad_node_lookup", return_value={}), \ + patch("pre_ml_mappings.create_cad_node_mappings") as mock_create: + good, matched = apply_manual_mappings( + mock_client, + mock_config, + manual, + model_id=1, + revision_id=2, + tree_d_nodes=tree_d_nodes, + asset_entities=asset_entities, + ) + assert len(good) == 2 + assert matched == {100, 200} + assert good[0]["matchType"] == "manual" + assert good[0]["3DId"] == 100 + assert good[0]["assetId"] == "A1" + assert good[0]["assetName"] == "Asset One" + assert good[0]["assetExternalId"] == "A1" + assert good[1]["3DId"] == 200 and good[1]["assetId"] == "A2" + # debug=True so no DM writes + mock_create.assert_not_called() + + def test_manual_mappings_without_debug_writes_dm(self, mock_client, tree_d_nodes, asset_entities): + config = MagicMock() + config.debug = False + config.asset_dm_space = "sp_assets" + manual = [{"sourceId": 100, "targetId": "A1"}] + with patch("pre_ml_mappings.build_cad_node_lookup", return_value={}), \ + patch("pre_ml_mappings.create_cad_node_mappings") as mock_create: + good, matched = apply_manual_mappings( + mock_client, + config, + manual, + model_id=1, + revision_id=2, + tree_d_nodes=tree_d_nodes, + asset_entities=asset_entities, + ) + assert len(good) == 1 and matched == {100} + mock_create.assert_called_once() + _, _, dm_mappings = mock_create.call_args[0] + assert dm_mappings == [(100, "A1", "sp_assets")] + + +class TestApplyRuleMappings: + """Test apply_rule_mappings with mocked RAW and DM helpers (no CDF calls).""" + + def test_no_rules_returns_empty(self): + client =
MagicMock() + config = MagicMock() + config.rawdb = "db" + config.raw_table_rule = "rule_table" + # Patch read_rule_mappings so the test never touches client.raw + with patch("pre_ml_mappings.read_rule_mappings", return_value=[]): + good, matched = apply_rule_mappings( + client, + config, + tree_d_nodes={"/n1": [{"id": 1}]}, + asset_entities=[{"id": 10, "name": "A", "external_id": "a"}], + model_id=1, + revision_id=2, + already_matched_node_ids=set(), + ) + assert good == [] + assert matched == set() + + def test_rule_matches_one_node_one_asset(self): + client = MagicMock() + config = MagicMock() + config.debug = True + config.asset_dm_space = "sp_assets" + rules = [{"regexp_entity": r"^/n1$", "regexp_asset": r"^AssetA$"}] + with patch("pre_ml_mappings.read_rule_mappings", return_value=rules), \ + patch("pre_ml_mappings.build_cad_node_lookup", return_value={}), \ + patch("pre_ml_mappings.create_cad_node_mappings") as mock_create: + good, matched = apply_rule_mappings( + client, + config, + tree_d_nodes={"/n1": [{"id": 1}]}, + asset_entities=[{"id": 10, "name": "AssetA", "external_id": "a"}], + model_id=1, + revision_id=2, + already_matched_node_ids=set(), + ) + assert len(good) == 1 + assert good[0]["matchType"] == "rule" + assert good[0]["3DId"] == 1 and good[0]["assetId"] == "a" + assert matched == {1} + mock_create.assert_not_called() # debug=True, no DM writes diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_entities.pkl b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_entities.pkl new file mode 100644 index 00000000..92ae9fed Binary files /dev/null and b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_entities.pkl differ diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes.pkl
b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes.pkl new file mode 100644 index 00000000..80c17bb6 Binary files /dev/null and b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes.pkl differ diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes_data.pkl b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes_data.pkl new file mode 100644 index 00000000..69c5bfc0 Binary files /dev/null and b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/three_d_nodes_data.pkl differ diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/write_resources.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/write_resources.py new file mode 100644 index 00000000..0bab4347 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_cad_asset_contextualization/write_resources.py @@ -0,0 +1,67 @@ +from __future__ import annotations + +import sys +from pathlib import Path + +from cognite.client import CogniteClient +from cognite.client.data_classes import Row +from cognite.client.exceptions import CogniteAPIError +from cognite.extractorutils.uploader import RawUploadQueue + +sys.path.append(str(Path(__file__).parent)) + +from config import ContextConfig +from logger import log + + +def delete_table(client: CogniteClient, db: str, tbl: str) -> None: + try: + client.raw.tables.delete(db, [tbl]) + except CogniteAPIError as e: + # Any other error than table not found, and we re-raise + if e.code != 404: + raise + + +def write_mapping_to_raw( + client: 
CogniteClient, + config: ContextConfig, + raw_uploader: RawUploadQueue, + good_matches: list[dict], + bad_matches: list[dict], +) -> None: + """ + Write matching results to RAW DB + + Args: + client: Instance of CogniteClient + config: Instance of ContextConfig + raw_uploader: Instance of RawUploadQueue + good_matches: list of good matches + bad_matches: list of bad matches + """ + log.info(f"Clean up result tables in {config.rawdb} before writing new status") + delete_table(client, config.rawdb, config.raw_table_bad) + delete_table(client, config.rawdb, config.raw_table_good) + delete_table(client, config.rawdb, config.raw_table_all) + + # if reset mapping, clean up good matches in table + if config.run_all and not config.debug: + log.info( + f"ResetMapping - Cleaning up GOOD table: {config.rawdb}/{config.raw_table_good} " + "before writing new status" + ) + delete_table(client, config.rawdb, config.raw_table_good) + + for match in good_matches: + # RAW row keys must be strings + raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_good, Row(str(match["3DId"]), match)) + raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_all, Row(str(match["3DId"]), match)) + log.info(f"Added {len(good_matches)} to {config.rawdb}/{config.raw_table_good}") + + for not_match in bad_matches: + raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_bad, Row(str(not_match["3DId"]), not_match)) + raw_uploader.add_to_upload_queue(config.rawdb, config.raw_table_all, Row(str(not_match["3DId"]), not_match)) + log.info(f"Added {len(bad_matches)} to {config.rawdb}/{config.raw_table_bad}") + + # Upload any remaining RAW rows in queue + raw_uploader.upload() diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy.Function.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy.Function.yaml new file mode 100644 index 00000000..9b820a99 --- /dev/null +++
b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy.Function.yaml @@ -0,0 +1,21 @@ +# Pipeline: upload asset hierarchy from CSV to DM. Assign this function to ep_ctx_3d_*_upload_asset_hierarchy. +- name: 'context:3d:upload_asset_hierarchy' + externalId: 'fn_context_3d_upload_asset_hierarchy' + owner: 'Anonymous' + description: 'Upload Asset hierarchy (DM nodes) from CSV in CDF Files' + metadata: + version: '0.1.0' + envVars: + ENV_TYPE: '${CDF_BUILD_TYPE}' + CDF_ENV: '${CDF_ENVIRON}' + ASSET_INSTANCE_SPACE: '{{asset_instance_space}}' + RAW_DB: '{{raw_db}}' + RAW_TABLE: three_dimension_asset + ASSET_VIEW_SPACE: '{{data_model_space}}' + ASSET_VIEW_EXT_ID: '{{asset_view_ext_id}}' + ASSET_VIEW_VERSION: '{{asset_view_version}}' + cpu: 0.25 + runtime: 'py311' + functionPath: './handler.py' + dataSetExternalId: 'ds_3d_{{default_location}}' + space: '{{function_space}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/handler.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/handler.py new file mode 100644 index 00000000..61e125b4 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/handler.py @@ -0,0 +1,174 @@ +""" +Upload Asset hierarchy (DM nodes) from a CSV file in CDF Files to CDF Data Modeling. +Used by the upload_asset_hierarchy extraction pipeline. +CSV columns: externalId (required), name, description, parentExternalId, tags, aliases, assetType. 
+""" +from __future__ import annotations + +import csv +import io +import os +import sys +from pathlib import Path + +import yaml +from cognite.client import CogniteClient +from cognite.client.data_classes.data_modeling import NodeApply, NodeOrEdgeData, ViewId +from cognite.client.exceptions import CogniteAPIError + +sys.path.append(str(Path(__file__).parent)) + +BATCH_SIZE = 1000 + + +def _resolve_asset_view(config: dict) -> ViewId: + """ + View for AssetExtension (or project-specific asset view) from: + 1) extraction pipeline config parameters (assetViewSpace / assetViewExternalId / assetViewVersion) + 2) env vars set at deploy (ASSET_VIEW_SPACE, ASSET_VIEW_EXT_ID, ASSET_VIEW_VERSION from Function.yaml) + 3) no hardcoded fallback: must be explicitly provided + """ + space = ( + config.get("assetViewSpace") + or config.get("asset_view_space") + or os.environ.get("ASSET_VIEW_SPACE") + or os.environ.get("DATA_MODEL_SPACE") + ) + ext_id = ( + config.get("assetViewExternalId") + or config.get("assetViewExtId") + or config.get("asset_view_ext_id") + or os.environ.get("ASSET_VIEW_EXT_ID") + ) + version = ( + config.get("assetViewVersion") + or config.get("asset_view_version") + or os.environ.get("ASSET_VIEW_VERSION") + ) + if not space or not ext_id or not version: + raise ValueError( + "Missing asset view configuration. Provide assetViewSpace/assetViewExternalId/assetViewVersion " + "in extraction pipeline config or ASSET_VIEW_SPACE/ASSET_VIEW_EXT_ID/ASSET_VIEW_VERSION env vars." 
+ ) + return ViewId(str(space), str(ext_id), str(version)) + + +def _resolve_asset_instance_space(config: dict, data: dict) -> str: + """ + Instance space for asset nodes from: + 1) extraction pipeline config parameters (assetInstanceSpace) + 2) function run input (assetInstanceSpace) + 3) env var injected by Function.yaml (ASSET_INSTANCE_SPACE) + """ + space = ( + config.get("assetInstanceSpace") + or config.get("asset_instance_space") + or data.get("assetInstanceSpace") + or data.get("asset_instance_space") + or os.environ.get("ASSET_INSTANCE_SPACE") + ) + if not space: + raise ValueError( + "Missing asset instance space. Provide assetInstanceSpace in extraction pipeline config or run input, " + "or set ASSET_INSTANCE_SPACE env var." + ) + return str(space) + + +def _get_config(client: CogniteClient, pipeline_ext_id: str) -> dict: + """Load pipeline config from CDF.""" + raw = client.extraction_pipelines.config.retrieve(pipeline_ext_id) + if not raw or not raw.config: + return {} + data = yaml.safe_load(raw.config) or {} + params = (data.get("config") or {}).get("data", {}).get("parameters", {}) + if not params: + params = data.get("data", {}).get("parameters", {}) + return params + + +def _split_list(value: str) -> list[str]: + return [v.strip() for v in value.split(";") if v.strip()] if value else [] + + +def _build_nodes(rows: list[dict], space: str, asset_view: ViewId) -> list[NodeApply]: + nodes: list[NodeApply] = [] + for row in rows: + ext_id = (row.get("externalId") or "").strip() + if not ext_id: + continue + name = (row.get("name") or "").strip() or ext_id + description = (row.get("description") or "").strip() or None + parent_id = (row.get("parentExternalId") or "").strip() or None + tags_raw = (row.get("tags") or "").strip() + aliases_raw = (row.get("aliases") or "").strip() + asset_type = (row.get("assetType") or "").strip() or None + + props: dict = {"name": name} + if description: + props["description"] = description + if parent_id: + 
props["parent"] = {"space": space, "externalId": parent_id} + if tags_raw: + props["tags"] = _split_list(tags_raw) + if aliases_raw: + props["aliases"] = _split_list(aliases_raw) + if asset_type: + props["assetType"] = asset_type + + nodes.append( + NodeApply( + space=space, + external_id=ext_id, + sources=[NodeOrEdgeData(source=asset_view, properties=props)], + ) + ) + return nodes + + +def handle(data: dict, client: CogniteClient) -> dict: + """ + Read asset CSV from CDF File (fileExternalId in config or data), parse it, + and upsert Asset instances into the configured instance space. + """ + pipeline_ext_id = data.get("ExtractionPipelineExtId") or os.environ.get( + "EXTRACTION_PIPELINE_EXT_ID", "ep_ctx_3d_clov_navisworks_upload_asset_hierarchy" + ) + config = _get_config(client, pipeline_ext_id) + space = _resolve_asset_instance_space(config, data) + asset_view = _resolve_asset_view(config) + file_ext_id = config.get("fileExternalId") or data.get("fileExternalId") + data_set_ext_id = config.get("dataSetExternalId") or data.get("dataSetExternalId") + + if not file_ext_id: + return { + "status": "skipped", + "message": "No fileExternalId in config or run input. 
Upload asset CSV to CDF Files and set fileExternalId.", + "assetInstanceSpace": space, + } + try: + file_meta = client.files.retrieve(external_id=file_ext_id) + if file_meta is None: + return {"status": "error", "message": f"File not found: externalId={file_ext_id!r}"} + if data_set_ext_id: + data_set = client.data_sets.retrieve(external_id=data_set_ext_id) + if data_set and file_meta.data_set_id != data_set.id: + return {"status": "error", "message": f"File {file_ext_id!r} is not in data set {data_set_ext_id!r}"} + content = client.files.download_bytes(external_id=file_ext_id) + except CogniteAPIError as e: + return {"status": "error", "message": str(e)} + + text = content.decode("utf-8", errors="replace") + rows = list(csv.DictReader(io.StringIO(text))) + nodes = _build_nodes(rows, space, asset_view) + if not nodes: + return {"status": "skipped", "message": "CSV has no valid rows (need externalId)", "assetInstanceSpace": space} + + total = len(nodes) + for i in range(0, total, BATCH_SIZE): + batch = nodes[i : i + BATCH_SIZE] + client.data_modeling.instances.apply(nodes=batch) + + return {"status": "succeeded", "nodesUpserted": total, "assetInstanceSpace": space} + + diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/requirements.txt b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/requirements.txt new file mode 100644 index 00000000..42352de9 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_asset_hierarchy/requirements.txt @@ -0,0 +1,2 @@ +cognite-sdk>=6.0.0 +pyyaml>=6.0 diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings.Function.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings.Function.yaml new file mode 100644 index 00000000..556eca30 --- /dev/null +++
b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings.Function.yaml @@ -0,0 +1,15 @@ +# Pipeline: upload manual mappings CSV to RAW. Assign this function to ep_ctx_3d_*_upload_manual_mappings. +- name: 'context:3d:upload_manual_mappings' + externalId: 'fn_context_3d_upload_manual_mappings' + owner: 'Anonymous' + description: 'Upload manual 3D–asset mappings from CSV (CDF File) to contextualization_manual_input RAW table' + metadata: + version: '0.1.0' + envVars: + ENV_TYPE: '${CDF_BUILD_TYPE}' + CDF_ENV: '${CDF_ENVIRON}' + cpu: 0.25 + runtime: 'py311' + functionPath: './handler.py' + dataSetExternalId: 'ds_3d_{{default_location}}' + space: '{{function_space}}' diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/handler.py b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/handler.py new file mode 100644 index 00000000..5df918e1 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/handler.py @@ -0,0 +1,91 @@ +""" +Upload manual 3D–asset mappings from a CSV file (in CDF Files) to the +contextualization_manual_input RAW table. Used by the upload_manual_mappings extraction pipeline. 
+""" +from __future__ import annotations + +import csv +import io +import os +import sys +from pathlib import Path + +import yaml +from cognite.client import CogniteClient +from cognite.client.data_classes import Row +from cognite.client.exceptions import CogniteAPIError + +sys.path.append(str(Path(__file__).parent)) + + +def _get_config(client: CogniteClient, pipeline_ext_id: str) -> dict: + """Load pipeline config from CDF (toolkit deploys config.data.parameters or data.parameters).""" + raw = client.extraction_pipelines.config.retrieve(pipeline_ext_id) + if not raw or not raw.config: + return {} + data = yaml.safe_load(raw.config) or {} + params = (data.get("config") or {}).get("data", {}).get("parameters", {}) + if not params: + params = data.get("data", {}).get("parameters", {}) + return params + + +def handle(data: dict, client: CogniteClient) -> dict: + """ + Read CSV from CDF File (fileExternalId in config or in data), parse it, + and insert rows into the manual mappings RAW table. + + Config (from pipeline): rawdb, rawTableManual, optional fileExternalId, dataSetExternalId. + Or pass in data when triggering: fileExternalId, (optional) dataSetExternalId. + """ + pipeline_ext_id = data.get("ExtractionPipelineExtId") or os.environ.get( + "EXTRACTION_PIPELINE_EXT_ID", "ep_ctx_3d_clov_navisworks_upload_manual_mappings" + ) + config = _get_config(client, pipeline_ext_id) + raw_db = config.get("rawdb") or data.get("rawdb") or "3d_clov_navisworks" + raw_table = config.get("rawTableManual") or data.get("rawTableManual") or "contextualization_manual_input" + file_ext_id = config.get("fileExternalId") or data.get("fileExternalId") + data_set_ext_id = config.get("dataSetExternalId") or data.get("dataSetExternalId") + + if not file_ext_id: + return { + "status": "skipped", + "message": "No fileExternalId in config or run input. 
Upload a CSV to CDF Files and set fileExternalId, then run again.", + "rawdb": raw_db, + "rawTableManual": raw_table, + } + try: + file_meta = client.files.retrieve(external_id=file_ext_id) + if file_meta is None: + return {"status": "error", "message": f"File not found: externalId={file_ext_id!r}"} + if data_set_ext_id: + data_set = client.data_sets.retrieve(external_id=data_set_ext_id) + if data_set and file_meta.data_set_id != data_set.id: + return {"status": "error", "message": f"File {file_ext_id!r} is not in data set {data_set_ext_id!r}"} + content = client.files.download_bytes(external_id=file_ext_id) + except CogniteAPIError as e: + return {"status": "error", "message": str(e)} + + text = content.decode("utf-8", errors="replace") + reader = csv.DictReader(io.StringIO(text)) + rows = [] + for r in reader: + key = (r.get("3DId") or "").strip() + if not key: + continue + cols = {} + for k, v in r.items(): + if not v: + continue + v = v.strip() + if k in ("3DId", "assetId") and v.isdigit(): + cols[k] = int(v) + else: + cols[k] = v + rows.append(Row(key=key, columns=cols)) + + if not rows: + return {"status": "skipped", "message": "CSV has no valid rows (need a 3DId key)", "rawdb": raw_db, "rawTableManual": raw_table} + + client.raw.rows.insert(db_name=raw_db, table_name=raw_table, row=rows) + return {"status": "succeeded", "rowsInserted": len(rows), "rawdb": raw_db, "rawTableManual": raw_table} diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/requirements.txt b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/requirements.txt new file mode 100644 index 00000000..e8f8f302 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/functions/fn_context_3d_upload_manual_mappings/requirements.txt @@ -0,0 +1,2 @@ +cognite-sdk>=6.0.0 +pyyaml>=6.0 diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/module.toml b/modules/accelerators/contextualization/cdf_three_dimenssion/module.toml new file mode
100644 index 00000000..a02a59d8 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/module.toml @@ -0,0 +1,4 @@ +[module] +title = "Contextualization 3D" +id = "dp:acc:contextualization:cdf_three_dimenssion" +package_id = "dp:contextualization" diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/3d_clov_navisworks.RawDatabase.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/3d_clov_navisworks.RawDatabase.yaml new file mode 100644 index 00000000..afee7313 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/3d_clov_navisworks.RawDatabase.yaml @@ -0,0 +1 @@ +dbName: 3d_{{default_location}}_{{source_name}} diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_bad.Table.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_bad.Table.yaml new file mode 100644 index 00000000..9f5ffa08 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_bad.Table.yaml @@ -0,0 +1,2 @@ +dbName: 3d_{{default_location}}_{{source_name}} +tableName: contextualization_bad diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_good.Table.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_good.Table.yaml new file mode 100644 index 00000000..c0fd41af --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_good.Table.yaml @@ -0,0 +1,2 @@ +dbName: 3d_{{default_location}}_{{source_name}} +tableName: contextualization_good diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.csv b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.csv new file mode 100644 index 00000000..29f47ad5 --- /dev/null +++ 
b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.csv @@ -0,0 +1,9 @@ +3DId,assetId,3dNodeName,assetExternalId,contextualized +1001,2001,PumpA,CLV-PUMP-001,false +1002,2002,ValveB,CLV-VALVE-002,false +1003,2003,CompressorC,CLV-COMP-003,false +1004,2004,TurbineD,CLV-TURB-004,false +1005,2005,HeatExchangerE,CLV-HEX-005,false +1006,2006,PressureVesselF,CLV-PV-006,false +1007,2007,PipelineG,CLV-PIPE-007,false +1008,2008,ControlValveH,CLV-CV-008,false diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.yaml new file mode 100644 index 00000000..a84909d5 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_manual_input.Table.yaml @@ -0,0 +1,2 @@ +dbName: 3d_{{default_location}}_{{source_name}} +tableName: contextualization_manual_input diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_rule.Table.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_rule.Table.yaml new file mode 100644 index 00000000..0be86c7b --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/contextualization_rule.Table.yaml @@ -0,0 +1,2 @@ +dbName: 3d_{{default_location}}_{{source_name}} +tableName: contextualization_rule diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/raw/three_dimension_asset.Table.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/three_dimension_asset.Table.yaml new file mode 100644 index 00000000..1b3c27a8 --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/raw/three_dimension_asset.Table.yaml @@ -0,0 +1,3 @@ +# Reference / staging for 3D node ↔ asset rows. 
Load manually from upload_data/three_dimension_asset.RawRows.csv (see upload_data/three_dimension_asset.Manifest.yaml). +dbName: 3d_{{default_location}}_{{source_name}} +tableName: three_dimension_asset diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.Manifest.yaml b/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.Manifest.yaml new file mode 100644 index 00000000..75b5148f --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.Manifest.yaml @@ -0,0 +1,5 @@ +kind: RawRows +type: rawTable +table: + dbName: 3d_{{default_location}}_{{source_name}} + tableName: three_dimension_asset \ No newline at end of file diff --git a/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.RawRows.csv b/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.RawRows.csv new file mode 100644 index 00000000..5c35ea5c --- /dev/null +++ b/modules/accelerators/contextualization/cdf_three_dimenssion/upload_data/three_dimension_asset.RawRows.csv @@ -0,0 +1,32 @@ +externalId,name,description,parentExternalId,tags,aliases,assetType,threeDNodeId,nodeName,modelName +CLV,CLOV Platform,Root asset — CLOV FPSO platform (TotalEnergies Angola Block 17),,,,,,, +CLV/FPSOH,CLV-FPSOH Hull System,FPSO hull and marine structural systems,CLV,,,,,, +CLV/FPSOT,CLV-FPSOT Topside System,FPSO topside process and utility systems,CLV,,,,,, +CLV/SUBSEA,CLV-SS Subsea Production System,Subsea production and distribution infrastructure (Block 17),CLV,,,,,, +CLV/SUBSEA/HDM-001,CLV-SS-HDM-001 Hydraulic Distribution Module,Subsea hydraulic distribution module — well cluster 001,CLV/SUBSEA,,,,,, +SB0110-SK-001-001,SB0110-SK-001-001 Module Skid Frame,Structural skid frame and mudmat base plate (HDM-001),CLV/SUBSEA/HDM-001,,,,,, +SB0110-MF-001-001,SB0110-MF-001-001 Hydraulic Manifold Block,Central hydraulic manifold
body with 6-port distribution header,CLV/SUBSEA/HDM-001,,,,,, +SB0110-XV-001-001,SB0110-XV-001-001 Isolation Valve Port-1,Hydraulic actuated isolation valve — port 1 (supply),SB0110-MF-001-001,,,,,, +SB0110-XV-001-002,SB0110-XV-001-002 Isolation Valve Port-2,Hydraulic actuated isolation valve — port 2 (supply),SB0110-MF-001-001,,,,,, +SB0110-XV-001-003,SB0110-XV-001-003 Isolation Valve Port-3,Hydraulic actuated isolation valve — port 3 (supply),SB0110-MF-001-001,,,,,, +SB0110-XV-001-004,SB0110-XV-001-004 Isolation Valve Port-4,Hydraulic actuated isolation valve — port 4 (return),SB0110-MF-001-001,,,,,, +SB0110-XV-001-005,SB0110-XV-001-005 Isolation Valve Port-5,Hydraulic actuated isolation valve — port 5 (return),SB0110-MF-001-001,,,,,, +SB0110-XV-001-006,SB0110-XV-001-006 Isolation Valve Port-6,Hydraulic actuated isolation valve — port 6 (return),SB0110-MF-001-001,,,,,, +SB0110-XC-001-001,SB0110-XC-001-001 ROV Stab Connector Port-1,ROV-operable hydraulic stab plate and receptacle — port 1,SB0110-MF-001-001,,,,,, +SB0110-XC-001-002,SB0110-XC-001-002 ROV Stab Connector Port-2,ROV-operable hydraulic stab plate and receptacle — port 2,SB0110-MF-001-001,,,,,, +SB0110-XC-001-003,SB0110-XC-001-003 ROV Stab Connector Port-3,ROV-operable hydraulic stab plate and receptacle — port 3,SB0110-MF-001-001,,,,,, +SB0110-PS-001-001,SB0110-PS-001-001 Supply Pressure Sensor,Subsea pressure transmitter — supply header,SB0110-MF-001-001,,,,,, +SB0110-PS-001-002,SB0110-PS-001-002 Return Pressure Sensor,Subsea pressure transmitter — return header,SB0110-MF-001-001,,,,,, +SB0110-ACC-001-001,SB0110-ACC-001-001 Hydraulic Accumulator,Pre-charged accumulator — HDM-001,CLV/SUBSEA/HDM-001,,,,,, +SB0110-HPU-001-001,SB0110-HPU-001-001 Local HPU Interface,Local hydraulic power unit interface skid,CLV/SUBSEA/HDM-001,,,,,, +CLV/SUBSEA/HDM-002,CLV-SS-HDM-002 Hydraulic Distribution Module,Subsea hydraulic distribution module — well cluster 002,CLV/SUBSEA,,,,,, +SB0110-SK-002-001,SB0110-SK-002-001 
Module Skid Frame HDM-002,Structural skid frame (HDM-002),CLV/SUBSEA/HDM-002,,,,,, +SB0110-MF-002-001,SB0110-MF-002-001 Hydraulic Manifold Block HDM-002,Manifold block — HDM-002,CLV/SUBSEA/HDM-002,,,,,, +CLV-PUMP-001,CLV-PUMP-001 Process Pump A,Example topside pump for contextualization demo,CLV/FPSOT,,,,1001,PumpA, +CLV-VALVE-002,CLV-VALVE-002 Process Valve B,Example topside valve for contextualization demo,CLV/FPSOT,,,,1002,ValveB, +CLV-COMP-003,CLV-COMP-003 Compressor C,Example compressor for contextualization demo,CLV/FPSOT,,,,1003,CompressorC, +CLV-TURB-004,CLV-TURB-004 Turbine D,Example turbine for contextualization demo,CLV/FPSOT,,,,1004,TurbineD, +CLV-HEX-005,CLV-HEX-005 Heat Exchanger E,Example heat exchanger for contextualization demo,CLV/FPSOT,,,,1005,HeatExchangerE, +CLV-PV-006,CLV-PV-006 Pressure Vessel F,Example pressure vessel for contextualization demo,CLV/FPSOT,,,,1006,PressureVesselF, +CLV-PIPE-007,CLV-PIPE-007 Pipeline G,Example pipeline segment for contextualization demo,CLV/FPSOT,,,,1007,PipelineG, +CLV-CV-008,CLV-CV-008 Control Valve H,Example control valve for contextualization demo,CLV/FPSOT,,,,1008,ControlValveH, diff --git a/modules/packages.toml b/modules/packages.toml index 452834d5..65dd25bd 100644 --- a/modules/packages.toml +++ b/modules/packages.toml @@ -13,6 +13,7 @@ modules = [ "accelerators/contextualization/cdf_p_and_id_annotation", "accelerators/contextualization/cdf_file_annotation", "accelerators/contextualization/cdf_entity_matching", + "accelerators/contextualization/cdf_three_dimenssion", ] [packages.infield_quickstart] diff --git a/packages.zip b/packages.zip new file mode 100644 index 00000000..f34e6f45 Binary files /dev/null and b/packages.zip differ
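Note for reviewers: the CSV-to-RAW conversion in `fn_context_3d_upload_manual_mappings/handler.py` can be exercised offline without a CDF client. The sketch below mirrors the handler's parsing rules (row key taken from `3DId`, rows without one skipped, numeric `3DId`/`assetId` coerced to `int`), but emits plain dicts instead of `cognite.client` `Row` objects; `parse_manual_mappings` is a name introduced here for illustration only.

```python
import csv
import io


def parse_manual_mappings(text: str) -> list[dict]:
    """Mirror of the handler's CSV parsing: key = 3DId; numeric 3DId/assetId become ints."""
    rows = []
    for r in csv.DictReader(io.StringIO(text)):
        key = (r.get("3DId") or "").strip()
        if not key:
            # Rows without a 3DId cannot be keyed in RAW, so they are skipped
            continue
        cols = {}
        for k, v in r.items():
            if not v:
                continue
            v = v.strip()
            cols[k] = int(v) if k in ("3DId", "assetId") and v.isdigit() else v
        rows.append({"key": key, "columns": cols})
    return rows


# Sample matching contextualization_manual_input.Table.csv from this module
sample = (
    "3DId,assetId,3dNodeName,assetExternalId,contextualized\n"
    "1001,2001,PumpA,CLV-PUMP-001,false\n"
)
print(parse_manual_mappings(sample))
```

In the deployed function the resulting `key`/`columns` pairs become `Row(key=..., columns=...)` objects passed to `client.raw.rows.insert`, so this sketch is a convenient place to unit-test column coercion before pushing a mappings CSV to CDF Files.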