Fabric Pathfinder

Tenant-wide inventory collection and cross-environment artifact mapping for Microsoft Fabric CI/CD workflows.

STRONGLY RECOMMENDED: Run as a Service Principal (SPN) with Admin API Access

For the most complete and accurate inventory data, run this notebook as a Service Principal with Admin API permissions. This provides:

Tenant-wide visibility: See all workspaces and items, not just those you have access to

Creator metadata: Item creator IDs, names, types, and UPNs

Item state and timestamps: Last updated dates and state information

Capacity information: Which capacity each item belongs to

Without admin access, the notebook falls back to non-admin APIs which only return items in workspaces you can access, and many metadata fields will be NULL.

TL;DR

Import nb_000_fabric_pathfinder.ipynb into a Fabric workspace
Configure SINK_TYPE ("warehouse" or "lakehouse") in the notebook
Run the notebook to populate inventory tables
Execute fabric_pathfinder_views.sql in the warehouse/lakehouse to create environment mapping views
Query vw_env_map_all to get DEV/UAT/PROD artifact mappings for CI/CD parameterization

Why This Exists

Promoting Fabric artifacts between environments (DEV to UAT to PROD) requires replacing environment-specific identifiers: workspace IDs, connection GUIDs, SQL connection strings, and item references. Manually tracking these mappings is error-prone and doesn't scale.

Fabric Pathfinder solves this by:

Inventorying your tenant via Fabric REST APIs (the notebook)
Mapping artifacts across environments using naming conventions (the SQL views)

Vacation and Offboarding Coverage

Beyond CI/CD, the inventory provides critical visibility into artifact ownership. When someone goes on vacation or leaves the organization, you need to know what they own. Microsoft Entra ID's 30-day sign-in requirement can result in account deletion, and with it, orphaned artifacts that nobody knows exist until something breaks.

The workspace_role_assignments table captures who has access to what, enabling proactive ownership transfers before access is lost.

Coming Soon: Fabric Usurp

A companion project using undocumented APIs that enables non-SPN users to take over all objects in a specified workspace in one shot, without the UI, from a notebook. This will streamline bulk ownership transfers during offboarding.

The output supports two CI/CD approaches:

fabric-cicd: Generate parameter files for Azure DevOps deployments
Fabric Deployment Pipelines: Validate Variable Library completeness by comparing mapped artifacts against configured value sets before promotion

What's New

Lakehouse Sink Support

You can now write inventory data to either a Data Warehouse or a Lakehouse:

Warehouse (SINK_TYPE = "warehouse"): Uses the synapsesql connector. Best for SQL-based analytics.
Lakehouse (SINK_TYPE = "lakehouse"): Uses Delta format via abfss paths. Best for Spark-based workflows.

The lakehouse is created with schema support enabled by default (lh_integration).

Non-Admin Fallback

The notebook now automatically detects whether you have admin API access:

Admin mode: Uses tenant-wide Admin APIs for complete visibility
Non-admin mode: Falls back to per-workspace Core APIs if admin access is unavailable

In non-admin mode, the same table schema is used, but certain columns will be NULL (see Admin vs Non-Admin Mode for details).

Prerequisites

For Full Functionality (Recommended)

Admin API Access Required for Complete Data

For the most complete inventory data, use admin API access:

User mode: The executing user must have Fabric tenant administrator rights.

SPN mode (recommended): A Fabric Administrator must enable the following tenant settings:

Service principals can use Fabric APIs (Developer settings)

Service principals can access read-only admin APIs (Admin API settings)

The SPN's security group must be added to both settings.

For Limited Functionality (Non-Admin)

If admin access is unavailable, the notebook will still run but with reduced data:

Only inventories workspaces you have access to
Creator metadata, state, timestamps, and capacity IDs will be NULL
Git connections use per-workspace API (slower but functional)

Additional Requirements

A Fabric workspace with capacity (F or P SKU)
Permission to create a Warehouse or Lakehouse in the target workspace
Artifacts following the naming conventions described below (or willingness to refactor the views)

Files

File	Description
`nb_000_fabric_pathfinder.ipynb`	PySpark notebook that collects tenant inventory via REST APIs and writes to `wh_integration` or `lh_integration`
`fabric_pathfinder_views.sql`	SQL views that map artifacts across DEV/UAT/PROD environments
`claude.md`	Code style guide for contributors

How It Works

Step 1: Notebook Populates Inventory Tables

The notebook calls Fabric REST APIs to collect tenant metadata and writes to these tables:

Table	Admin API	Non-Admin Fallback	Description
`workspace_items`	Full metadata	Limited (no creator, state, timestamps)	All Fabric items with workspace context
`tenant_connections`	-	Same data	Tenant-level connection definitions
`sql_endpoint_connections`	-	Same data	SQL connection strings for Warehouses/Lakehouses
`item_schedules`	-	Same data	Refresh schedules for Dataflows and Semantic Models
`workspace_role_assignments`	-	Same data	User/group/SPN role assignments per workspace
`capacities`	-	Same data	Fabric capacities (F-SKUs) with region and state
`gateways`	-	Same data	On-premises, personal, and VNet gateways
`workspace_git_connections`	Tenant-wide	Per-workspace (slower)	Git integration status per workspace

Step 2: Views Map Artifacts Across Environments

The SQL views read from the inventory tables and correlate artifacts across environments using naming conventions:

workspace_items ─────────────────► vw_env_map_workspace_items ───┐
workspace_items (grouped) ───────► vw_env_map_workspaces ────────┤
tenant_connections ──────────────► vw_env_map_connections ───────┼──► vw_env_map_all
sql_endpoint_connections ────────► vw_env_map_sql_endpoints ─────┘

All views produce a consistent output schema:

Column	Description
`item_name`	Canonical artifact name (environment suffix stripped)
`item_type`	Category: Workspace, Connection, SQL Connection String, or item type
`id_dev`	DEV environment identifier (GUID, connection string, etc.)
`id_uat`	UAT environment identifier (NULL if not found)
`id_prod`	PROD environment identifier (NULL if not found)

DEV is the canonical source. All views LEFT JOIN from DEV, so:

id_dev is never NULL
Artifacts that exist only in UAT or PROD will not appear
NULL values in id_uat or id_prod indicate missing artifacts in those environments

Sink Type Selection

Configure where inventory data is written:

# In the notebook configuration section
SINK_TYPE = "warehouse"  # or "lakehouse"

Warehouse Mode (Default)

SINK_TYPE = "warehouse"
WAREHOUSE_NAME = "wh_integration"

Creates/uses wh_integration warehouse
Writes using synapsesql connector with mode("overwrite")
Best for SQL-based analytics and reporting

Lakehouse Mode

SINK_TYPE = "lakehouse"
LAKEHOUSE_NAME = "lh_integration"

Creates/uses lh_integration lakehouse with schema support enabled
Writes Delta format via abfss path
Removes existing files before writing for clean overwrite
Best for Spark-based data engineering workflows

Admin vs Non-Admin Mode

The notebook automatically detects admin access at startup and displays the mode in the execution summary.

Execution Summary Example

============================================================
EXECUTION SUMMARY
============================================================
API Mode: ADMIN (full metadata available)
Sink: warehouse (wh_integration)
------------------------------------------------------------
workspace_items              [OK] 1234 rows
...

Or for non-admin:

============================================================
EXECUTION SUMMARY
============================================================
API Mode: NON-ADMIN (limited metadata - some columns will be NULL)
Sink: warehouse (wh_integration)
------------------------------------------------------------
workspace_items              [OK] 567 rows
...

Data Differences by Mode

Column	Admin Mode	Non-Admin Mode
`guid`, `name`, `type`, `description`	Available	Available
`workspace_id`, `workspace_name`	Available	Available
`state`	Available	NULL
`last_updated_date`	Available	NULL
`capacity_id`	Available	NULL
`creator_id`	Available	NULL
`creator_name`	Available	NULL
`creator_type`	Available	NULL
`creator_upn`	Available	NULL

Why Non-Admin Mode Exists

Non-admin mode enables teams without tenant admin rights to still benefit from the inventory system. While the data is less complete, it's sufficient for:

Basic artifact tracking within accessible workspaces
CI/CD parameterization for workspaces you manage
Workspace role assignment auditing (uses non-admin API)

Usage

1. Import and Configure the Notebook

Upload nb_000_fabric_pathfinder.ipynb to your Fabric workspace. Configure the authentication and sink sections:

# Authentication mode: "user" for interactive, "spn" for service principal
auth_mode = "user"

# Sink type: "warehouse" or "lakehouse"
SINK_TYPE = "warehouse"

# For SPN mode, configure Key Vault secrets:
KEY_VAULT_NAME = "your-keyvault-name"

2. Run the Notebook

Execute all cells or call refresh_all() directly. The notebook will:

Check admin API access (falls back to non-admin if unavailable)
Create the warehouse/lakehouse if it doesn't exist
Fetch data from Fabric REST APIs
Write to all 8 inventory tables (full overwrite)
Display execution summary with API mode and results

Typical runtime: 2-10 minutes depending on tenant size and API mode.

3. Create the Views

Open the warehouse/lakehouse in Fabric and execute fabric_pathfinder_views.sql. This creates:

vw_env_map_connections
vw_env_map_sql_endpoints
vw_env_map_workspace_items
vw_env_map_workspaces
vw_env_map_all (unified view)

4. Query Environment Mappings

-- All mappings
SELECT * FROM dbo.vw_env_map_all;

-- Only artifacts present in all three environments
SELECT * FROM dbo.vw_env_map_all
WHERE id_uat IS NOT NULL
  AND id_prod IS NOT NULL;

-- Find artifacts missing from PROD
SELECT * FROM dbo.vw_env_map_all
WHERE id_prod IS NULL;

-- Specific item types
SELECT * FROM dbo.vw_env_map_all
WHERE item_type = 'Lakehouse';

Naming Convention Requirements

The views rely on consistent naming conventions to identify environments. If your naming differs, you must modify the views.

Workspaces

Environment	Pattern	Example
DEV	`<Name> - DEV`	`DATA Sales - DEV`
UAT	`<Name> - UAT`	`DATA Sales - UAT`
PROD	`<Name>` (no suffix)	`DATA Sales`

Additionally, workspaces must start with DATA to be included in vw_env_map_workspaces.

Connections

Environment	Pattern	Example
DEV	`conn_<name>_dev`	`conn_sales_dev`
UAT	`conn_<name>_uat`	`conn_sales_uat`
PROD	`conn_<name>_prd`	`conn_sales_prd`

Connections must start with conn_ to be included.

Workspace Items (Lakehouses, Warehouses, etc.)

Items are matched by name and type across workspaces. Environment detection uses the workspace name pattern (same as above).

Example: A Lakehouse named lh_sales in workspace DATA Sales - DEV maps to lh_sales in DATA Sales - UAT and DATA Sales.

SQL Endpoints

SQL endpoint connection strings are matched by item name and type across workspaces, using the same workspace naming pattern.

Customizing the Views

If your naming conventions differ, modify the views in fabric_pathfinder_views.sql.

Changing Workspace Patterns

Find the environment detection logic in each view:

-- Current: workspace ends with ' - DEV'
WHERE UPPER(workspace_name) LIKE '% - DEV'

-- Example: workspace contains '_DEV_'
WHERE UPPER(workspace_name) LIKE '%\_DEV\_%' ESCAPE '\'

Changing Connection Patterns

Find the connection suffix logic in vw_env_map_connections:

-- Current: connection ends with '_dev', '_uat', '_prd'
WHERE UPPER(display_name) LIKE 'CONN\_%' ESCAPE '\'
  AND UPPER(display_name) LIKE '%\_DEV' ESCAPE '\'

-- Example: connection ends with '.dev', '.uat', '.prod'
WHERE UPPER(display_name) LIKE 'CONN\_%' ESCAPE '\'
  AND UPPER(display_name) LIKE '%.DEV'

Changing Included Item Types

Find the item type filter in vw_env_map_workspace_items:

d.type IN (
    'Lakehouse',
    'Warehouse',
    'SQLEndpoint',
    'DataPipeline',
    'Report',
    'SemanticModel',
    'Reflex'
)

Add or remove types as needed.

Changing Exclusions

The vw_env_map_workspace_items view excludes certain items. Modify these patterns:

-- Exclude auto-generated usage metrics reports
AND UPPER(d.name) NOT LIKE '%USAGE METRICS%'

-- Exclude specific sample items by name
AND d.name NOT IN (
    'Potential Outage Detection for Orders',
    'Example'
)

Changing Notebook Inclusion Patterns

Only notebooks matching specific patterns are included:

OR UPPER(d.name) LIKE 'NB\_100%' ESCAPE '\'
OR UPPER(d.name) LIKE 'NB\_400%' ESCAPE '\'

Modify to match your notebook naming convention, or remove to exclude all notebooks.

Authentication

The notebook supports two authentication modes:

User Mode (Default)

auth_mode = "user"

Uses the identity of the user running the notebook. For full functionality, requires Fabric tenant administrator rights.

Service Principal Mode (Recommended for Production)

auth_mode = "spn"

KEY_VAULT_NAME = "your-keyvault-name"
KEY_VAULT_URL = f"https://{KEY_VAULT_NAME}.vault.azure.net/"

TENANT_ID = notebookutils.credentials.getSecret(KEY_VAULT_URL, "aad-tenant-id")
SP_CLIENT_ID = notebookutils.credentials.getSecret(KEY_VAULT_URL, "fabric-spn-client-id")
SP_CLIENT_SECRET = notebookutils.credentials.getSecret(KEY_VAULT_URL, "fabric-spn-client-secret")

Required tenant settings for full admin access:

Service principals can use Fabric APIs
Service principals can access read-only admin APIs

The SPN's security group must be added to both settings.

Why SPN is recommended:

Consistent, automated execution
No dependency on individual user availability
Full admin API access when properly configured
Required for scheduled/automated inventory refreshes

Troubleshooting

Error	Cause	Solution
`401 Unauthorized`	Token expired or insufficient permissions	Check auth_mode and refresh the notebook
`403 Forbidden` on admin API	User/SPN lacks admin rights	Configure admin access or accept non-admin mode
`429 Too Many Requests`	Rate limit exceeded	Reduce `REQUESTS_PER_MINUTE` or wait
Empty `vw_env_map_*` results	Naming conventions don't match	Verify workspace/connection names match expected patterns
NULL values in `id_uat`/`id_prod`	Artifact missing in that environment	Create the artifact or exclude from mapping
NULL creator/state columns	Running in non-admin mode	Configure admin API access for full metadata
Warehouse/Lakehouse creation timeout	Capacity overloaded	Retry or use existing item
`API Mode: NON-ADMIN` in summary	Admin API access unavailable	Check tenant settings or user permissions

Limitations

Full overwrite only: All tables are completely replaced on each run. No incremental updates.
Point-in-time snapshot: Data reflects tenant state at execution time only.
Naming convention dependency: Views require consistent naming to match artifacts across environments.
DEV is canonical: Artifacts existing only in UAT/PROD are not captured.
No secrets storage: Connection strings are stored in plain text. Use Variable Libraries for sensitive values.
Non-admin data gaps: Without admin access, creator metadata, state, and timestamps are unavailable.
Non-admin workspace scope: Without admin access, only workspaces the identity can access are inventoried.

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
README.md		README.md
claude.md		claude.md
fabric_pathfinder_views.sql		fabric_pathfinder_views.sql
nb_000_fabric_pathfinder.ipynb		nb_000_fabric_pathfinder.ipynb

Folders and files

Latest commit

History

Repository files navigation

Fabric Pathfinder

TL;DR

Table of Contents

Why This Exists

Vacation and Offboarding Coverage

What's New

Lakehouse Sink Support

Non-Admin Fallback

Prerequisites

For Full Functionality (Recommended)

For Limited Functionality (Non-Admin)

Additional Requirements

Files

How It Works

Step 1: Notebook Populates Inventory Tables

Step 2: Views Map Artifacts Across Environments

Sink Type Selection

Warehouse Mode (Default)

Lakehouse Mode

Admin vs Non-Admin Mode

Execution Summary Example

Data Differences by Mode

Why Non-Admin Mode Exists

Usage

1. Import and Configure the Notebook

2. Run the Notebook

3. Create the Views

4. Query Environment Mappings

Naming Convention Requirements

Workspaces

Connections

Workspace Items (Lakehouses, Warehouses, etc.)

SQL Endpoints

Customizing the Views

Changing Workspace Patterns

Changing Connection Patterns

Changing Included Item Types

Changing Exclusions

Changing Notebook Inclusion Patterns

Authentication

User Mode (Default)

Service Principal Mode (Recommended for Production)

Troubleshooting

Limitations

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages