LATEST UPDATE v1.7.0 (January 28, 2026):
New Unified Catalog APIs - Analytics & Visualization
- [NEW] List Hierarchy Terms - Interactive tree visualization of glossary structure
- [NEW] Get Term Facets - Statistics and filters for glossary terms
- [NEW] Get CDE Facets - Compliance dashboards (GDPR, HIPAA, SOC2)
- [NEW] Get Data Product Facets - Analytics for data product portfolios
- [NEW] Get Objective Facets - OKR dashboards with health metrics
- [NEW] List Related Entities - Complete relationship exploration
- [IMPROVED] UC API Coverage increased from 81% to 96% (+15 percentage points)
- [ADDED] Rich UI with trees, tables, and color-coded outputs
- [DOCS] Comprehensive guides and API coverage analysis
Full Release Notes v1.7.0 | New APIs Guide | API Coverage Analysis
Previous Update v1.6.2 (January 27, 2026):
Collections Permissions Documentation & Diagnostics
- [NEW] Comprehensive permissions guides (English & French)
- [NEW] Automated diagnostic tools (PowerShell & Python)
- [ADDED] HTTP 403 troubleshooting documentation
PVW CLI v1.7.0 is a modern, full-featured command-line interface and Python library for Microsoft Purview. It enables automation and management of all major Purview APIs with 96% Unified Catalog API coverage (46 of 48 operations).
Unified Catalog (UC) Management - 96% Complete ⭐ NEW
- [NEW] Glossary hierarchy visualization with interactive tree views
- [NEW] Facets & analytics for terms, CDEs, data products, and objectives
- [NEW] Complete relationship exploration for terms
- Complete governance domains, glossary terms, data products, OKRs, CDEs
- Relationships API - Link data products/CDEs/terms to entities and columns
- Query APIs - Advanced OData filtering with multi-criteria search
- Policy Management - Complete CRUD for governance and RBAC policies
- Custom Metadata & Attributes - Extensible business metadata and attributes
Data Operations
- Entity management (create, update, bulk, import/export)
- Lineage operations with interactive creation and CSV import
- Advanced search and discovery with fixed suggest/autocomplete
- Business metadata with proper scope configuration
Collections Management - 100% Spec Compliant
- Full collection CRUD operations with proper API conformance
- Hierarchy and tree operations for collection navigation
- Permission management for collection access control
- Analytics for collection usage and asset tracking
Automation & Scripting
- Bulk Operations - Import/export from CSV/JSON with dry-run support
- Scriptable Output - Multiple formats (table, json, jsonc) for PowerShell/bash
- 80+ usage examples and 15+ comprehensive guides
- PowerShell integration with ConvertFrom-Json support
Legacy API Support
- Account management with full API compatibility
- Data product management (legacy operations)
- Classification, label, and status management
The CLI is designed for data engineers, stewards, architects, and platform teams to automate, scale, and enhance their Microsoft Purview experience.
[NEW] Model Context Protocol (MCP) server enables LLM-powered data governance workflows!
- Natural language interface to Purview catalog
- 20+ tools for AI assistants (Claude, Cline, etc.)
- Automate complex multi-step operations
- See mcp/README.md for setup instructions
For detailed information about previous releases, see the Full Release Archive.
Latest Release: v1.7.0 (January 28, 2026)
Previous Release: v1.6.2 (January 27, 2026)
| Category | Coverage | Count | Status |
|---|---|---|---|
| Glossary Terms | 100% | 9/9 | ✅ Complete |
| Domains | 100% | 5/5 | ✅ Complete |
| Data Products | 100% | 8/8 | ✅ Complete |
| Critical Data Elements (CDE) | 100% | 8/8 | ✅ Complete |
| Objectives (OKRs) | 100% | 6/6 | ✅ Complete |
| Key Results | 100% | 5/5 | ✅ Complete |
| Policies | 100% | 4/4 | ✅ Complete |
| Facets & Analytics | 100% | 4/4 | ✅ Complete |
| Relationships | 100% | 3/3 | ✅ Complete |
| Hierarchy | 100% | 1/1 | ✅ Complete |
| TOTAL UC | 96% | 46/48 | 🎯 Production Ready |
- List Hierarchy Terms (NEW)
# Interactive tree view of glossary hierarchy
pvw uc term hierarchy --output tree
# Filter by domain with max depth control
pvw uc term hierarchy --domain-id <domain-guid> --max-depth 3 --output table
- Get Term Facets (NEW)
# Statistics and filters for glossary terms
pvw uc term facets --output table
# JSON export for automation
pvw uc term facets --output json
- Get CDE Facets (NEW)
# Compliance dashboards (GDPR, HIPAA, SOC2)
pvw uc cde facets --domain-id <domain-guid> --output table
# See color-coded compliance summary
pvw uc cde facets --facet-fields "criticality,compliance_status"
- Get Data Product Facets (NEW)
# Analytics for data product portfolios
pvw uc dataproduct facets --output table
# Filter by domain
pvw uc dataproduct facets --domain-id <domain-guid> --output json
- Get Objective Facets (NEW)
# OKR dashboards with health metrics
pvw uc objective facets --output table
# JSON export for dashboards
pvw uc objective facets --output json
- List Related Entities (NEW)
# Complete relationship exploration for terms
pvw uc term relationships --term-id <term-guid> --output table
# Filter by relationship type (Synonym, Related, Parent)
pvw uc term relationships --term-id <term-guid> --relationship-type "Synonym"
Follow this short flow to get PVW CLI installed and running quickly.
- Install (from PyPI):
pip install pvw-cli
For the bleeding edge or development:
pip install git+https://github.com/Keayoub/Purview_cli.git
# or for editable development
git clone https://github.com/Keayoub/Purview_cli.git
cd Purview_cli
pip install -r requirements.txt
pip install -e .
- Set required environment variables (examples for cmd, PowerShell, and pwsh)
Windows cmd (example):
set PURVIEW_ACCOUNT_NAME=your-purview-account
set PURVIEW_ACCOUNT_ID=your-purview-account-id-guid
set PURVIEW_RESOURCE_GROUP=your-resource-group-name
rem optional
set AZURE_REGION=
PowerShell (Windows PowerShell):
$env:PURVIEW_ACCOUNT_NAME = "your-purview-account"
$env:PURVIEW_ACCOUNT_ID = "your-purview-account-id-guid"
$env:PURVIEW_RESOURCE_GROUP = "your-resource-group-name"
$env:AZURE_REGION = "" # optional
pwsh (PowerShell Core - cross-platform, recommended):
$env:PURVIEW_ACCOUNT_NAME = 'your-purview-account'
$env:PURVIEW_ACCOUNT_ID = 'your-purview-account-id-guid'
$env:PURVIEW_RESOURCE_GROUP = 'your-resource-group-name'
$env:AZURE_REGION = '' # optional
- Authenticate
  - Run az login (recommended), or
  - Provide Service Principal credentials via environment variables.
Important for Legacy Tenants:
Some Azure environments use the legacy Purview service principal (https://purview.azure.net) instead of the current one (https://purview.azure.com). If you encounter authentication errors like:
AADSTS500011: The resource principal named https://purview.azure.com was not found in the tenant
You need to detect and set the correct authentication scope:
Step 1: Detect your tenant's Purview service principal
# Check which service principal your tenant uses
az ad sp show --id "73c2949e-da2d-457a-9607-fcc665198967" --query "servicePrincipalNames" -o json
Look for one of these values:
- https://purview.azure.com or https://purview.azure.com/ → Use .com (default)
- https://purview.azure.net or https://purview.azure.net/ → Use .net (legacy)
Step 2: Set the authentication scope (if using legacy .net)
If your tenant uses the legacy service principal, set this environment variable:
# PowerShell
$env:PURVIEW_AUTH_SCOPE = "https://purview.azure.net/.default"
# Or add to your profile for persistence
Add-Content $PROFILE "`n`$env:PURVIEW_AUTH_SCOPE = 'https://purview.azure.net/.default'"
# Bash/Linux
export PURVIEW_AUTH_SCOPE="https://purview.azure.net/.default"
# Or add to ~/.bashrc for persistence
echo 'export PURVIEW_AUTH_SCOPE="https://purview.azure.net/.default"' >> ~/.bashrc
# Windows CMD
set PURVIEW_AUTH_SCOPE=https://purview.azure.net/.default
Note: Most modern Azure tenants use https://purview.azure.com (default), but some legacy or special environments (test, government clouds) may still use https://purview.azure.net. Always verify using the command above if you encounter authentication issues.
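The detection logic in Steps 1 and 2 can be sketched in Python. This is only an illustration, not part of PVW CLI: the function name `pick_auth_scope` is hypothetical, and it assumes you have already parsed the JSON array printed by the `az ad sp show` command into a list of strings.

```python
from typing import Iterable

COM_SCOPE = "https://purview.azure.com/.default"  # current default
NET_SCOPE = "https://purview.azure.net/.default"  # legacy tenants


def pick_auth_scope(service_principal_names: Iterable[str]) -> str:
    """Choose the value for PURVIEW_AUTH_SCOPE from servicePrincipalNames.

    Hypothetical helper: prefers the .com principal when present and
    falls back to the legacy .net principal otherwise.
    """
    # Trailing slashes and letter case are ignored when comparing.
    names = {name.rstrip("/").lower() for name in service_principal_names}
    if "https://purview.azure.com" in names:
        return COM_SCOPE
    if "https://purview.azure.net" in names:
        return NET_SCOPE
    raise ValueError("No Purview service principal name found in the list")


if __name__ == "__main__":
    # Example input as it might appear for a legacy tenant.
    print(pick_auth_scope(["https://purview.azure.net/"]))
```

If the function returns the `.net` scope, set `PURVIEW_AUTH_SCOPE` as shown in Step 2; otherwise no extra variable is needed.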
- Try a few commands:
# List governance domains
pvw uc domain list
# Search
pvw search query --keywords="customer" --limit=5
# Get help
pvw --help
pvw uc --help
For more advanced usage, see the documentation in doc/ or the project docs: https://pvw-cli.readthedocs.io/
# Create a new collection
pvw collections create \
--name "Data Engineering" \
--friendly-name "Data Engineering Team" \
--description "Collection for DE team assets"
# List collection hierarchy
pvw collections read-hierarchy --collection-name "Data Engineering"
# Update collection
pvw collections update \
--name "Data Engineering" \
--friendly-name "Data Engineering (Updated)"
# Manage collection permissions
pvw collections read-permissions --collection-name "Data Engineering"
# Create column-level lineage
pvw lineage create-column \
--process-name "ETL_Sales_Transform" \
--source-table-guid "9ebbd583-4987-4d1b-b4f5-d8f6f6f60000" \
--target-table-guids "c88126ba-5fb5-4d33-bbe2-5ff6f6f60000" \
--column-mapping "ProductID:ProductID,Name:Name"
# Import lineage from CSV
pvw lineage import samples/csv/lineage_with_columns.csv
# List column lineages
pvw lineage list-column --format table
# Link data product to entity
pvw uc dataproduct link-entity \
--id "dp-sales-2024" \
--entity-id "4fae348b-e960-42f7-834c-38f6f6f60000" \
--type-name "azure_sql_table"
# Link CDE to specific column
pvw uc cde link-entity \
--id "cde-customer-email" \
--entity-id "ea3412c3-7387-4bc1-9923-11f6f6f60000" \
--column-qualified-name "mssql://server/db/schema/table#EmailAddress"
# Query terms by domain
pvw uc term query --domain-ids "finance,sales" --status Approved --top 50
# List all policies
pvw uc policy list
# Create policy
pvw uc policy create --payload-file policy-rbac.json
# Import business metadata
pvw uc custom-metadata import --file business_concept.csv
# Add metadata to entity
pvw uc custom-metadata add \
--guid "4fae348b-e960-42f7-834c-38f6f6f60000" \
--name "BusinessConcept" \
--attributes '{"Department":"Sales"}'
You can install PVW CLI in two ways:
- From PyPI (recommended for most users):
pip install pvw-cli
- Directly from the GitHub repository (for latest/dev version):
pip install git+https://github.com/Keayoub/Purview_cli.git
Or for development (editable install):
git clone https://github.com/Keayoub/Purview_cli.git
cd Purview_cli
pip install -r requirements.txt
pip install -e .
- Python 3.8+
- Azure CLI (az login) or Service Principal credentials
- Microsoft Purview account
- Install
pip install pvw-cli
- Set Required Environment Variables
rem Required for Purview API access
set PURVIEW_ACCOUNT_NAME=your-purview-account
set PURVIEW_ACCOUNT_ID=your-purview-account-id-guid
set PURVIEW_RESOURCE_GROUP=your-resource-group-name
rem Optional (for example 'china' or 'usgov')
set AZURE_REGION=
- Authenticate
  - Azure CLI: az login
  - Or set Service Principal credentials as environment variables
- Run a Command
pvw search query --keywords="customer" --limit=5
- See All Commands
pvw --help
PVW CLI supports multiple authentication methods for connecting to Microsoft Purview, powered by Azure Identity's DefaultAzureCredential. This allows you to use the CLI securely in local development, CI/CD, and production environments.
- Run az login to authenticate interactively with your Azure account.
- The CLI will automatically use your Azure CLI credentials.
Set the following environment variables before running any PVW CLI command:
- AZURE_CLIENT_ID (your Azure AD app registration/client ID)
- AZURE_TENANT_ID (your Azure AD tenant ID)
- AZURE_CLIENT_SECRET (your client secret)
Example (Windows):
set AZURE_CLIENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
set AZURE_TENANT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
set AZURE_CLIENT_SECRET=your-client-secret
Example (Linux/macOS):
export AZURE_CLIENT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export AZURE_TENANT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export AZURE_CLIENT_SECRET=your-client-secret
If running in Azure with a managed identity, no extra configuration is needed. The CLI will use the managed identity automatically.
If you are signed in to Azure in Visual Studio or VS Code, DefaultAzureCredential can use those credentials as a fallback.
Note:
- The CLI will try all supported authentication methods in order. The first one that works will be used.
- For most automation and CI/CD scenarios, service principal authentication is recommended.
- For local development, Azure CLI authentication is easiest.
For more details, see the Azure Identity documentation.
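Because service principal authentication depends on all three `AZURE_*` variables being present, a partially configured environment is an easy mistake to make in CI/CD. The following is a minimal sketch of a pre-flight check. The helper name `missing_sp_vars` is hypothetical and not part of PVW CLI; it only inspects environment variables and does not contact Azure.

```python
import os
from typing import List, Mapping

# The three variables listed above for service principal authentication.
SP_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_sp_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the service principal variables that are unset or empty."""
    return [name for name in SP_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_sp_vars()
    if missing:
        print("Missing service principal settings:", ", ".join(missing))
    else:
        print("All service principal variables are set.")
```

An empty result only means the variables exist; whether the credentials are valid is still decided when the CLI authenticates.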
PVW CLI supports multiple output formats to fit different use cases - from human-readable tables to machine-parseable JSON.
All list commands now support the --output parameter with three formats:
- table (default) - Rich formatted table with colors for human viewing
- json - Plain JSON for scripting with PowerShell, bash, jq, etc.
- jsonc - Colored JSON with syntax highlighting for viewing
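As a rough sketch of what the `json` format makes possible, the same kind of filtering and grouping can be done from Python. The sample text below stands in for output captured from `pvw uc term list --output json`. The array-of-objects shape and the `name` and `status` keys are assumptions taken from the scripting examples in this section, and the term values are invented for illustration.

```python
import json
from collections import Counter

# Stand-in for captured CLI output (assumed shape, illustrative values).
sample_output = """
[
  {"name": "Customer", "status": "Draft"},
  {"name": "Churn Rate", "status": "Published"},
  {"name": "Data Lake", "status": "Draft"}
]
"""

terms = json.loads(sample_output)

# Names of the terms that are still in Draft.
draft_names = [term["name"] for term in terms if term.get("status") == "Draft"]

# Number of terms per status.
status_counts = Counter(term.get("status") for term in terms)

print(f"Found {len(terms)} terms")
print("Draft:", draft_names)
print(dict(status_counts))
```

In a real script the text would come from the CLI's standard output rather than a literal string.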
The --output json format produces plain JSON that works perfectly with PowerShell's ConvertFrom-Json:
# Get all terms as PowerShell objects
$domainId = "59ae27b5-40bc-4c90-abfe-fe1a0638fe3a"
$terms = py -m purviewcli uc term list --domain-id $domainId --output json | ConvertFrom-Json
# Access properties
Write-Host "Found $($terms.Count) terms"
foreach ($term in $terms) {
Write-Host " • $($term.name) - $($term.status)"
}
# Filter and export
$draftTerms = $terms | Where-Object { $_.status -eq "Draft" }
$draftTerms | Export-Csv -Path "draft_terms.csv" -NoTypeInformation
# Group by status
$terms | Group-Object status | Format-Table Count, Name
Use jq for JSON processing in bash:
# Get domain ID
DOMAIN_ID="59ae27b5-40bc-4c90-abfe-fe1a0638fe3a"
# Get term names only
pvw uc term list --domain-id $DOMAIN_ID --output json | jq -r '.[] | .name'
# Count terms
pvw uc term list --domain-id $DOMAIN_ID --output json | jq 'length'
# Filter by status
pvw uc term list --domain-id $DOMAIN_ID --output json | jq '.[] | select(.status == "Draft")'
# Group by status
pvw uc term list --domain-id $DOMAIN_ID --output json | jq 'group_by(.status) | map({status: .[0].status, count: length})'
# Save to file
pvw uc term list --domain-id $DOMAIN_ID --output json > terms.json
# Domains
pvw uc domain list --output json | jq '.[] | .name'
# Terms
pvw uc term list --domain-id "abc-123" --output json
pvw uc term list --domain-id "abc-123" --output table # Default
pvw uc term list --domain-id "abc-123" --output jsonc # Colored for viewing
# Data Products
pvw uc dataproduct list --domain-id "abc-123" --output json
Old (deprecated):
pvw uc term list --domain-id "abc-123" --json
New (recommended):
pvw uc term list --domain-id "abc-123" --output json # Plain JSON for scripting
pvw uc term list --domain-id "abc-123" --output jsonc # Colored JSON (old behavior)
Before using PVW CLI, you need to set three essential environment variables. Here's how to find them:
PURVIEW_ACCOUNT_NAME
- This is your Purview account name as it appears in the Azure Portal
- Example: kaydemopurview
PURVIEW_ACCOUNT_ID
- This is the GUID that identifies your Purview account for Unified Catalog APIs
- Important: For most Purview deployments, this is your Azure Tenant ID
- Method 1 - Get your Tenant ID (recommended):
Bash/Command Prompt:
az account show --query tenantId -o tsv
PowerShell:
az account show --query tenantId -o tsv
# Or store directly in environment variable:
$env:PURVIEW_ACCOUNT_ID = az account show --query tenantId -o tsv
- Method 2 - Azure CLI (extract from Atlas endpoint):
az purview account show --name YOUR_ACCOUNT_NAME --resource-group YOUR_RG --query endpoints.catalog -o tsv
Extract the GUID from the URL (before -api.purview-service.microsoft.com)
- Method 3 - Azure Portal:
  - Go to your Purview account in Azure Portal
  - Navigate to Properties → Atlas endpoint URL
  - Extract GUID from: https://GUID-api.purview-service.microsoft.com/catalog
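Methods 2 and 3 both end with pulling the GUID out of the endpoint URL by hand. A small sketch of that extraction is shown here. The function name `account_id_from_endpoint` is hypothetical, and it assumes the endpoint follows the `https://GUID-api.purview-service.microsoft.com/...` pattern shown above. Endpoints in any other format return `None`.

```python
import re
from typing import Optional

# GUID followed by the "-api.purview-service.microsoft.com" host suffix.
_ENDPOINT_RE = re.compile(
    r"^https://"
    r"([0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12})"
    r"-api\.purview-service\.microsoft\.com(?:/|$)"
)


def account_id_from_endpoint(url: str) -> Optional[str]:
    """Return the GUID embedded in the endpoint URL, or None if absent."""
    match = _ENDPOINT_RE.match(url.strip())
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    example = "https://59ae27b5-40bc-4c90-abfe-fe1a0638fe3a-api.purview-service.microsoft.com/catalog"
    print(account_id_from_endpoint(example))
```

The returned value is what you would assign to `PURVIEW_ACCOUNT_ID`.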
PURVIEW_RESOURCE_GROUP
- The Azure resource group containing your Purview account
- Example: fabric-artifacts
Windows Command Prompt:
set PURVIEW_ACCOUNT_NAME=your-purview-account
set PURVIEW_ACCOUNT_ID=your-purview-account-id
set PURVIEW_RESOURCE_GROUP=your-resource-group
Windows PowerShell:
$env:PURVIEW_ACCOUNT_NAME="your-purview-account"
$env:PURVIEW_ACCOUNT_ID="your-purview-account-id"
$env:PURVIEW_RESOURCE_GROUP="your-resource-group"
Linux/macOS:
export PURVIEW_ACCOUNT_NAME=your-purview-account
export PURVIEW_ACCOUNT_ID=your-purview-account-id
export PURVIEW_RESOURCE_GROUP=your-resource-group
Permanent (Windows Command Prompt):
setx PURVIEW_ACCOUNT_NAME "your-purview-account"
setx PURVIEW_ACCOUNT_ID "your-purview-account-id"
setx PURVIEW_RESOURCE_GROUP "your-resource-group"
Permanent (Windows PowerShell):
[Environment]::SetEnvironmentVariable("PURVIEW_ACCOUNT_NAME", "your-purview-account", "User")
[Environment]::SetEnvironmentVariable("PURVIEW_ACCOUNT_ID", "your-purview-account-id", "User")
[Environment]::SetEnvironmentVariable("PURVIEW_RESOURCE_GROUP", "your-resource-group", "User")
If you experience issues with environment variables between different terminals, use these debug commands:
Command Prompt/Bash:
# Run this to check your current environment
python -c "
import os
print('PURVIEW_ACCOUNT_NAME:', os.getenv('PURVIEW_ACCOUNT_NAME'))
print('PURVIEW_ACCOUNT_ID:', os.getenv('PURVIEW_ACCOUNT_ID'))
print('PURVIEW_RESOURCE_GROUP:', os.getenv('PURVIEW_RESOURCE_GROUP'))
"
PowerShell:
# Check environment variables in PowerShell
python -c "
import os
print('PURVIEW_ACCOUNT_NAME:', os.getenv('PURVIEW_ACCOUNT_NAME'))
print('PURVIEW_ACCOUNT_ID:', os.getenv('PURVIEW_ACCOUNT_ID'))
print('PURVIEW_RESOURCE_GROUP:', os.getenv('PURVIEW_RESOURCE_GROUP'))
"
# Or use PowerShell native commands
Write-Host "PURVIEW_ACCOUNT_NAME: $env:PURVIEW_ACCOUNT_NAME"
Write-Host "PURVIEW_ACCOUNT_ID: $env:PURVIEW_ACCOUNT_ID"
Write-Host "PURVIEW_RESOURCE_GROUP: $env:PURVIEW_RESOURCE_GROUP"
The PVW CLI provides advanced search using the latest Microsoft Purview Discovery Query API:
- Search for assets, tables, files, and more with flexible filters
- Use autocomplete and suggestion endpoints
- Perform faceted, time-based, and entity-type-specific queries
v1.6.2 Enhancements:
- Collections API now 100% conformant with Microsoft Purview specification
- Improved search result caching and performance
- Enhanced error handling and diagnostics
- All search commands validated and working correctly (query, browse, suggest, find-table)
# 1. Table Format (Default) - Quick overview
pvw search query --keywords="customer" --limit=5
# → Clean table with Name, Type, Collection, Classifications, Qualified Name
# 2. Detailed Format - Human-readable with all metadata
pvw search query --keywords="customer" --limit=5 --detailed
# → Rich panels showing full details, timestamps, search scores
# 3. JSON Format - Complete technical details with syntax highlighting (WELL-FORMATTED)
pvw search query --keywords="customer" --limit=5 --json
# → Full JSON response with indentation, line numbers and color coding
# 4. Table with IDs - For entity operations
pvw search query --keywords="customer" --limit=5 --show-ids
# → Table format + entity GUIDs for copy/paste into update commands
# Basic search for assets with keyword 'customer'
pvw search query --keywords="customer" --limit=5
# Advanced search with classification filter
pvw search query --keywords="sales" --classification="PII" --objectType="Tables" --limit=10
# Pagination through large result sets
pvw search query --keywords="SQL" --offset=10 --limit=5
# Autocomplete suggestions for partial keyword
pvw search autocomplete --keywords="ord" --limit=3
# Get search suggestions (fuzzy matching)
pvw search suggest --keywords="prod" --limit=2
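The `--offset` and `--limit` pagination shown above lends itself to a simple loop when scripting. The sketch below only builds the argument lists for each page and does not run the CLI. The function name `page_commands` is hypothetical, and the total result count is assumed to be known by the caller.

```python
from typing import Iterator, List


def page_commands(keywords: str, total: int, page_size: int) -> Iterator[List[str]]:
    """Yield one `pvw search query` argument list per page of results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total, page_size):
        yield [
            "pvw", "search", "query",
            f"--keywords={keywords}",
            f"--offset={offset}",
            f"--limit={page_size}",
        ]


if __name__ == "__main__":
    for command in page_commands("SQL", total=12, page_size=5):
        print(" ".join(command))
```

Each yielded list can be passed to `subprocess.run` as-is, which also avoids the shell quoting pitfalls described next.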
**IMPORTANT - Command Line Quoting:**
```cmd
# [OK] CORRECT - Use quotes around keywords
pvw search query --keywords="customer" --limit=5
# [OK] CORRECT - For wildcard searches, use quotes
pvw search query --keywords="*" --limit=5
# ❌ WRONG - Don't use unquoted * (shell expands to file names)
pvw search query --keywords=* --limit=5
# This causes: "Error: Got unexpected extra arguments (dist doc ...)"
```
# Faceted search with aggregation
pvw search query --keywords="finance" --facetFields="objectType,classification" --limit=5
# Browse entities by type and path
pvw search browse --entityType="Tables" --path="/root/finance" --limit=2
# Time-based search for assets created after a date
pvw search query --keywords="audit" --createdAfter="2024-01-01" --limit=1
# Entity type specific search
pvw search query --keywords="finance" --entityTypes="Files,Tables" --limit=2
- Daily browsing: Use default table format for quick scans
- Understanding assets: Use --detailed for rich information panels
- Technical work: Use --json for complete API data access
- Entity operations: Use --show-ids to get GUIDs for updates
from purviewcli.client._search import Search
search = Search()
args = {"--keywords": "customer", "--limit": 5}
search.searchQuery(args)
print(search.payload) # Shows the constructed search payload
See tests/test_search_examples.py for ready-to-run pytest examples covering all search scenarios:
- Basic query
- Advanced filter
- Autocomplete
- Suggest
- Faceted search
- Browse
- Time-based search
- Entity type search
PVW CLI now includes comprehensive Microsoft Purview Unified Catalog (UC) support with the new uc command group. This provides complete management of modern data governance features including governance domains, glossary terms, data products, objectives (OKRs), and critical data elements.
🎯 Feature Parity: Full compatibility with UnifiedCatalogPy functionality.
See doc/commands/unified-catalog.md for complete documentation and examples.
# List all governance domains
pvw uc domain list
# Create a new governance domain
pvw uc domain create --name "Finance" --description "Financial data governance domain"
# Get domain details
pvw uc domain get --domain-id "abc-123-def-456"
# Update domain information
pvw uc domain update --domain-id "abc-123" --description "Updated financial governance"
# List all terms in a domain
pvw uc term list --domain-id "abc-123"
pvw uc term list --domain-id "abc-123" --output json # Plain JSON for scripting
pvw uc term list --domain-id "abc-123" --output jsonc # Colored JSON for viewing
# Create a single glossary term
pvw uc term create --name "Customer" --domain-id "abc-123" --description "A person or entity that purchases products"
# Get term details
pvw uc term show --term-id "term-456"
# Update term
pvw uc term update --term-id "term-456" --description "Updated description"
# Delete term
pvw uc term delete --term-id "term-456" --confirm
📦 Bulk Import (NEW)
Import multiple terms from CSV or JSON files with validation and progress tracking:
# CSV Import - Preview with dry-run
pvw uc term import-csv --csv-file "samples/csv/uc_terms_bulk_example.csv" --domain-id "abc-123" --dry-run
# CSV Import - Actual import
pvw uc term import-csv --csv-file "samples/csv/uc_terms_bulk_example.csv" --domain-id "abc-123"
# JSON Import - Preview with dry-run
pvw uc term import-json --json-file "samples/json/term/uc_terms_bulk_example.json" --dry-run
# JSON Import - Actual import (domain_id from JSON or override with flag)
pvw uc term import-json --json-file "samples/json/term/uc_terms_bulk_example.json"
pvw uc term import-json --json-file "samples/json/term/uc_terms_bulk_example.json" --domain-id "abc-123"
Bulk Import Features:
- [OK] Import from CSV or JSON files
- [OK] Dry-run mode to preview before importing
- [OK] Support for multiple owners (Entra ID Object IDs), acronyms, and resources
- [OK] Progress tracking with Rich console output
- [OK] Detailed error messages and summary reports
- [OK] Sequential POST requests (no native bulk endpoint available)
CSV Format Example:
name,description,status,acronym,owner_id,resource_name,resource_url
Customer Acquisition Cost,Cost to acquire new customer,Draft,CAC,<guid>,Metrics Guide,https://docs.example.com
Monthly Recurring Revenue,Predictable monthly revenue,Draft,MRR,<guid>,Finance Dashboard,https://finance.example.com
JSON Format Example:
{
"terms": [
{
"name": "Data Lake",
"description": "Centralized repository for structured/unstructured data",
"domain_id": "your-domain-id-here",
"status": "Draft",
"acronyms": ["DL"],
"owner_ids": ["<entra-id-object-id-guid>"],
"resources": [{"name": "Architecture Guide", "url": "https://example.com"}]
}
]
}
Important Notes:
- ⚠️ Owner IDs must be Entra ID Object IDs (GUIDs), not email addresses
- ⚠️ Terms cannot be "Published" in unpublished domains - use "Draft" status
- [OK] Sample files available: samples/csv/uc_terms_bulk_example.csv, samples/json/term/uc_terms_bulk_example.json
- 📖 Complete documentation: doc/commands/unified-catalog/term-bulk-import.md
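The two warnings above can be checked locally before running an import. The sketch below is not the CLI's own validation. The function name `check_term` and the `domain_published` flag are hypothetical, and the record layout follows the JSON format example (`name`, `status`, `owner_ids`).

```python
import re
from typing import Any, Dict, List

GUID_RE = re.compile(r"^[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$")


def check_term(term: Dict[str, Any], domain_published: bool = False) -> List[str]:
    """Return a list of problems found in one term record (empty if none)."""
    problems = []
    if not str(term.get("name", "")).strip():
        problems.append("name is empty")
    # Owner IDs must be Entra ID Object IDs (GUIDs), not email addresses.
    for owner in term.get("owner_ids", []):
        if not GUID_RE.match(str(owner)):
            problems.append(f"owner id is not a GUID: {owner}")
    # Published terms are rejected while the domain itself is unpublished.
    if term.get("status") == "Published" and not domain_published:
        problems.append("status 'Published' requires a published domain; use 'Draft'")
    return problems


if __name__ == "__main__":
    record = {
        "name": "Data Lake",
        "status": "Published",
        "owner_ids": ["someone@example.com"],
    }
    for problem in check_term(record):
        print(problem)
```

Running a check like this alongside `--dry-run` catches formatting mistakes without making any API calls.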
🗑️ Bulk Delete (NEW)
Delete all terms in a domain using PowerShell or Python scripts:
# PowerShell - Delete all terms with confirmation
.\scripts\delete-all-uc-terms.ps1 -DomainId "abc-123"
# PowerShell - Delete without confirmation
.\scripts\delete-all-uc-terms.ps1 -DomainId "abc-123" -Force
# Python - Delete all terms with confirmation
python scripts/delete_all_uc_terms_v2.py --domain-id "abc-123"
# Python - Delete without confirmation
python scripts/delete_all_uc_terms_v2.py --domain-id "abc-123" --force
Bulk Delete Features:
- [OK] Interactive confirmation prompts (type "DELETE" to confirm)
- [OK] Beautiful progress display with colors
- [OK] Success/failure tracking per term
- [OK] Detailed summary reports
- [OK] Rate limiting (200ms delay between deletes)
- [OK] Graceful error handling and Ctrl+C support
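The pattern behind these features (typed confirmation, a pause between deletes, per-term success and failure tracking) can be sketched as follows. This is not the code of the bundled scripts. The function `delete_terms` and its `delete_one` callback are hypothetical, and the callback stands in for whatever actually deletes a term.

```python
import time
from typing import Any, Callable, Dict, Iterable


def delete_terms(
    term_ids: Iterable[str],
    delete_one: Callable[[str], None],
    confirm_text: str,
    delay_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Delete terms one by one, pausing between calls and recording outcomes."""
    # Nothing is deleted unless the caller typed exactly "DELETE".
    if confirm_text != "DELETE":
        return {"deleted": [], "failed": [], "aborted": True}
    deleted, failed = [], []
    for term_id in term_ids:
        try:
            delete_one(term_id)
            deleted.append(term_id)
        except Exception as exc:  # keep going and report the failure later
            failed.append((term_id, str(exc)))
        sleep(delay_seconds)  # simple rate limiting between requests
    return {"deleted": deleted, "failed": failed, "aborted": False}


if __name__ == "__main__":
    summary = delete_terms(["term-1", "term-2"], lambda term_id: None, "DELETE")
    print(summary)
```

Passing `sleep` as a parameter is a design choice that lets the loop be tested without real delays.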
# List all data products in a domain
pvw uc dataproduct list --domain-id "abc-123"
# Create a comprehensive data product
pvw uc dataproduct create \
--name "Customer Analytics Dashboard" \
--domain-id "abc-123" \
--description "360-degree customer analytics with behavioral insights" \
--type Analytical \
--status Draft
# Get detailed data product information
pvw uc dataproduct show --product-id "prod-789"
# Update data product (partial updates supported - only specify fields to change)
pvw uc dataproduct update \
--product-id "prod-789" \
--status Published \
--description "Updated comprehensive customer analytics" \
--endorsed
# Update multiple fields at once
pvw uc dataproduct update \
--product-id "prod-789" \
--status Published \
--update-frequency Monthly \
--endorsed
# Delete a data product (with confirmation)
pvw uc dataproduct delete --product-id "prod-789"
# Delete without confirmation prompt
pvw uc dataproduct delete --product-id "prod-789" --yes
# List objectives for a domain
pvw uc objective list --domain-id "abc-123"
# Create measurable objectives
pvw uc objective create \
--definition "Improve data quality score by 25% within Q4" \
--domain-id "abc-123" \
--target-value "95" \
--measurement-unit "percentage"
# Track objective progress
pvw uc objective update \
--objective-id "obj-456" \
--domain-id "abc-123" \
--current-value "87" \
--status "in-progress"
# List critical data elements
pvw uc cde list --domain-id "abc-123"
# Define critical data elements with governance rules
pvw uc cde create \
--name "Social Security Number" \
--data-type "String" \
--domain-id "abc-123" \
--classification "PII" \
--retention-period "7-years"
# Associate CDEs with data assets
pvw uc cde link \
--cde-id "cde-789" \
--domain-id "abc-123" \
--asset-id "ea3412c3-7387-4bc1-9923-11f6f6f60000"
Monitor governance health and get automated recommendations to improve your data governance posture.
# List all health findings and recommendations
pvw uc health query
# Filter by severity
pvw uc health query --severity High
pvw uc health query --severity Medium
# Filter by status
pvw uc health query --status NotStarted
pvw uc health query --status InProgress
# Get detailed information about a specific health action
pvw uc health show --action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9"
# Update health action status
pvw uc health update \
--action-id "5ea3fc78-6a77-4098-8779-ed81de6f87c9" \
--status InProgress \
--reason "Working on assigning glossary terms to data products"
# Get health summary statistics
pvw uc health summary
# Output health findings in JSON format
pvw uc health query --json
Health Finding Types:
- Missing glossary terms on data products (High)
- Data products without OKRs (Medium)
- Missing data quality scores (Medium)
- Classification gaps on data assets (Medium)
- Description quality issues (Medium)
- Business domains without critical data entities (Medium)
Manage approval workflows and business process automation in Purview.
# List all workflows
pvw workflow list
# Get workflow details
pvw workflow get --workflow-id "workflow-123"
# Create a new workflow (requires JSON definition)
pvw workflow create --workflow-id "approval-flow-1" --payload-file workflow-definition.json
# Execute a workflow
pvw workflow execute --workflow-id "workflow-123"
# List workflow executions
pvw workflow executions --workflow-id "workflow-123"
# View specific execution details
pvw workflow execution-details --workflow-id "workflow-123" --execution-id "exec-456"
# Update workflow configuration
pvw workflow update --workflow-id "workflow-123" --payload-file updated-workflow.json
# Delete a workflow
pvw workflow delete --workflow-id "workflow-123"
# Output workflows in JSON format
pvw workflow list --json
Workflow Use Cases:
- Data access request approvals
- Glossary term certification workflows
- Data product publishing approvals
- Classification review processes
# 1. Discover assets to govern
pvw search query --keywords="customer" --detailed
# 2. Create governance domain for discovered assets
pvw uc domain create --name "Customer Data" --description "Customer information governance"
# 3. Define governance terms
pvw uc term create --name "Customer PII" --domain-id "new-domain-id" --definition "Personal customer information"
# 4. Create data product from discovered assets
pvw uc dataproduct create --name "Customer Master Data" --domain-id "new-domain-id"
# 5. Set governance objectives
pvw uc objective create --definition "Ensure 100% PII classification compliance" --domain-id "new-domain-id"
PVW CLI provides comprehensive entity management capabilities for updating Purview assets like descriptions, classifications, and custom attributes.
# Update table description using GUID
pvw entity update-attribute \
--guid "ece43ce5-ac45-4e50-a4d0-365a64299efc" \
--attribute "description" \
--value "Updated customer data warehouse table with enhanced analytics"
# Update dataset description using qualified name
pvw entity update-attribute \
--qualifiedName "https://app.powerbi.com/groups/abc-123/datasets/def-456" \
--attribute "description" \
--value "Power BI dataset for customer analytics dashboard"
# Read entity details before updating
pvw entity read-by-attribute \
--guid "ea3412c3-7387-4bc1-9923-11f6f6f60000" \
--attribute "description,classifications,customAttributes"
# Update multiple attributes at once
pvw entity update-bulk \
--input-file entities_to_update.json \
--output-file update_results.json
# Update specific column descriptions in a table
pvw entity update-attribute \
--guid "column-guid-123" \
--attribute "description" \
--value "Customer unique identifier - Primary Key"
# Add classifications to sensitive columns
pvw entity add-classification \
--guid "column-guid-456" \
--classification "MICROSOFT.PERSONAL.EMAIL"
# 1. Find assets that need updates
pvw search query --keywords="customer table" --show-ids --limit=10
# 2. Get detailed information about a specific asset
pvw entity read-by-attribute --guid "FOUND_GUID" --attribute "description,classifications"
# 3. Update the asset description
pvw entity update-attribute \
--guid "FOUND_GUID" \
--attribute "description" \
--value "Updated description based on business requirements"
# 4. Verify the update
pvw search query --keywords="FOUND_GUID" --detailed
PVW CLI provides powerful lineage management capabilities including CSV-based bulk import for automating data lineage creation.
Import lineage relationships from CSV files to automate the creation of data flow documentation in Microsoft Purview.
The CSV file uses the following columns. Only the two GUID columns are mandatory.
Required columns:
- source_entity_guid - GUID of the source entity
- target_entity_guid - GUID of the target entity
Optional columns:
- relationship_type - Type of relationship (default: "Process")
- process_name - Name of the transformation process
- description - Description of the transformation
- confidence_score - Confidence score (0-1)
- owner - Process owner
- metadata - Additional JSON metadata
Example CSV:
source_entity_guid,target_entity_guid,relationship_type,process_name,description,confidence_score,owner,metadata
dcfc99ed-c74d-49aa-bd0b-72f6f6f60000,1db9c650-acfb-4914-8bc5-1cf6f6f60000,Process,Transform_Product_Data,Transform product data for analytics,0.95,data-engineering,"{""tool"": ""Azure Data Factory""}"
# Validate CSV format before import (no API calls)
pvw lineage validate lineage_data.csv
# Import lineage relationships from CSV
pvw lineage import lineage_data.csv
# Generate sample CSV file with examples
pvw lineage sample output.csv --num-samples 10 --template detailed
# View available CSV templates
pvw lineage templates
- basic - Minimal columns (source, target, process name)
- detailed - All columns including metadata and confidence scores
- qualified_names - Use qualified names instead of GUIDs
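A lineage file in the detailed layout can also be produced from a script rather than by hand. The following is a minimal sketch, assuming the column order shown in the example CSV above; `build_lineage_csv` is a hypothetical helper for illustration, not part of PVW CLI. It mainly shows that Python's `csv` module applies the doubled-quote escaping the JSON `metadata` column needs.

```python
import csv
import io
import json

# Column order copied from the example CSV in this section.
DETAILED_COLUMNS = [
    "source_entity_guid", "target_entity_guid", "relationship_type",
    "process_name", "description", "confidence_score", "owner", "metadata",
]


def build_lineage_csv(rows):
    """Return CSV text for a list of row dicts (hypothetical helper).

    Missing columns are written as empty cells; a dict in "metadata"
    is serialized to a JSON string before writing.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DETAILED_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        record = {column: row.get(column, "") for column in DETAILED_COLUMNS}
        if isinstance(record["metadata"], dict):
            record["metadata"] = json.dumps(record["metadata"])
        writer.writerow(record)
    return buffer.getvalue()


csv_text = build_lineage_csv([{
    "source_entity_guid": "dcfc99ed-c74d-49aa-bd0b-72f6f6f60000",
    "target_entity_guid": "1db9c650-acfb-4914-8bc5-1cf6f6f60000",
    "relationship_type": "Process",
    "process_name": "Transform_Product_Data",
    "description": "Transform product data for analytics",
    "confidence_score": 0.95,
    "owner": "data-engineering",
    "metadata": {"tool": "Azure Data Factory"},
}])
print(csv_text)
```

Writing the result to a file and running `pvw lineage validate` on it remains the authoritative check of the format.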
# 1. Find entity GUIDs using search
pvw search find-table --name "Product" --schema "dbo" --id-only
# 2. Create CSV file with lineage relationships
# (use the GUIDs from step 1)
# 3. Validate CSV format
pvw lineage validate my_lineage.csv
# Output: SUCCESS: Lineage validation passed (5 rows, 8 columns)
# 4. Import to Purview
pvw lineage import my_lineage.csv
# Output: SUCCESS: Lineage import completed successfully
- GUID Validation: Automatic validation of GUID format with helpful error messages
- Process Entity Creation: Creates intermediate "Process" entities to link source→target relationships
- Metadata Support: Add custom JSON metadata to each lineage relationship
- Dry-Run Validation: Validate CSV format locally before making API calls
For detailed documentation, see: doc/guides/lineage-csv-import.md
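To make the column rules concrete, here is a rough pre-check written against the format described in this section: the two required GUID columns, a `confidence_score` between 0 and 1, and JSON in `metadata`. It is a sketch under those assumptions, not the logic `pvw lineage validate` actually runs, and `check_lineage_csv` is a hypothetical name. Note that `uuid.UUID` is more lenient than a strict 8-4-4-4-12 check (it also accepts braces and unhyphenated hex).

```python
import csv
import io
import json
import uuid

REQUIRED_COLUMNS = ("source_entity_guid", "target_entity_guid")


def check_lineage_csv(text):
    """Return a list of problems found in lineage CSV text (empty means none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in REQUIRED_COLUMNS if name not in header]
    if problems:
        return problems
    for number, row in enumerate(reader, start=2):  # row 1 is the header
        for name in REQUIRED_COLUMNS:
            try:
                uuid.UUID(row[name] or "")
            except ValueError:
                problems.append(f"row {number}: {name} is not a GUID")
        score = row.get("confidence_score") or ""
        if score:
            try:
                in_range = 0.0 <= float(score) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"row {number}: confidence_score must be between 0 and 1")
        metadata = row.get("metadata") or ""
        if metadata:
            try:
                json.loads(metadata)
            except json.JSONDecodeError:
                problems.append(f"row {number}: metadata is not valid JSON")
    return problems


sample = (
    "source_entity_guid,target_entity_guid,confidence_score,metadata\n"
    "dcfc99ed-c74d-49aa-bd0b-72f6f6f60000,1db9c650-acfb-4914-8bc5-1cf6f6f60000,"
    '0.95,"{""tool"": ""Azure Data Factory""}"\n'
    "not-a-guid,1db9c650-acfb-4914-8bc5-1cf6f6f60000,1.5,\n"
)
problems = check_lineage_csv(sample)
for problem in problems:
    print(problem)
```

The first data row passes; the second is reported twice, once for the malformed source GUID and once for the out-of-range score.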
PVW CLI also includes the original data-product command group for backward compatibility with traditional data product lifecycle management.
See doc/commands/data-product.md for full documentation and examples.
# Create a data product
pvw data-product create --qualified-name="product.test.1" --name="Test Product" --description="A test data product"
# Add classification and label
pvw data-product add-classification --qualified-name="product.test.1" --classification="PII"
pvw data-product add-label --qualified-name="product.test.1" --label="gold"
# Link glossary term
pvw data-product link-glossary --qualified-name="product.test.1" --term="Customer"
# Set status and show lineage
pvw data-product set-status --qualified-name="product.test.1" --status="active"
pvw data-product show-lineage --qualified-name="product.test.1"
- Unified Catalog (UC): Complete modern data governance (NEW)
  # Manage governance domains, terms, data products, OKRs, CDEs
  pvw uc domain list
  pvw uc term create --name "Customer" --domain-id "abc-123"
  pvw uc objective create --definition "Improve quality" --domain-id "abc-123"
- Discovery Query/Search: Flexible, advanced search for all catalog assets
- Entity Management: Bulk import/export, update, and validation
- Glossary Management: Import/export terms, assign terms in bulk
  # List all terms in a glossary
  pvw glossary list-terms --glossary-guid "your-glossary-guid"
  # Create and manage glossary terms
  pvw glossary create-term --payload-file term.json
- Lineage Operations: Lineage discovery, CSV-based bulk lineage import/export
  # Import lineage relationships from CSV
  pvw lineage import lineage_data.csv
  # Validate CSV format before import
  pvw lineage validate lineage_data.csv
  # Generate sample CSV file
  pvw lineage sample output.csv --num-samples 10
- Monitoring & Analytics: Real-time dashboards, metrics, and reporting
- Plugin System: Extensible with custom plugins
PVW CLI provides comprehensive automation for all major Microsoft Purview APIs, including the new Unified Catalog APIs for modern data governance.
- Unified Catalog: Complete governance domains, glossary terms, data products, OKRs, CDEs management [OK]
- Health Monitoring: Automated governance health checks and recommendations [OK] NEW
- Workflows: Approval workflows and business process automation [OK] NEW
- Data Map: Full entity and lineage management [OK]
- Discovery: Advanced search, browse, and query capabilities [OK]
- Collections: Collection and account management [OK]
- Management: Administrative operations [OK]
- Scan: Data source scanning and configuration [OK]
- Unified Catalog: Latest UC API endpoints (September 2025)
- Data Map: 2024-03-01-preview (default) or 2023-09-01 (stable)
- Collections: 2019-11-01-preview
- Account: 2019-11-01-preview
- Management: 2021-07-01
- Scan: 2018-12-01-preview
For the latest API documentation and updates, see:
- Microsoft Purview REST API reference
- Atlas 2.2 API documentation
- Azure Updates for new releases
If you need a feature that is not yet implemented, please open an issue or check for updates in future releases.
PVW CLI includes comprehensive sample files and scripts for bulk operations:
- CSV Samples:
  - samples/csv/uc_terms_bulk_example.csv (8 sample terms)
- JSON Samples:
  - samples/json/term/uc_terms_bulk_example.json (8 data management terms)
  - samples/json/term/uc_terms_sample.json (8 business terms)
- Lineage CSV Samples:
  - samples/csv/lineage_example.csv - Multiple lineage relationships with metadata
- Comprehensive Guide:
  - doc/guides/lineage-csv-import.md - Complete lineage CSV import documentation
    - CSV format specification with required/optional columns
    - Command examples for validate, import, sample, templates
    - Workflow recommendations and troubleshooting
    - Advanced scenarios with metadata and multiple transformations
- PowerShell: scripts/delete-all-uc-terms.ps1 - Full-featured with confirmation prompts
- Python: scripts/delete_all_uc_terms_v2.py - Rich progress bars and error handling
- PowerShell: scripts/test-json-output.ps1 - Validates JSON output parsing
- samples/notebooks (plus)/unified_catalog_terms_examples.ipynb - Complete examples including:
  - Examples 10-16: Bulk import demonstrations
  - Code generation for CSV/JSON files
  - Dry-run and actual import examples
  - Term verification workflows
- Main Documentation: doc/README.md
- Unified Catalog: doc/commands/unified-catalog.md
- Bulk Import Guide: doc/commands/unified-catalog/term-bulk-import.md
- Data Products: doc/commands/data-product.md
- API Coverage: All major Purview APIs including Unified Catalog, Data Map, Discovery, Collections
- Authentication: Azure CLI, Service Principal, Managed Identity support
- Output Formats: Table (default), JSON (plain), JSONC (colored)
- Bulk Operations: Import/export terms from CSV/JSON, bulk delete scripts
- Import multiple terms from CSV or JSON files
- Dry-run mode for validation before import
- Support for owners (Entra ID GUIDs), acronyms, resources
- Progress tracking and detailed error reporting
- 100% success rate in testing (8/8 terms)
- New --output parameter with table/json/jsonc formats
- Plain JSON works with PowerShell's ConvertFrom-Json
- Compatible with jq, Python json module, and other tools
- Migration from deprecated --json flag
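Because the plain JSON format is meant for scripting, a Python caller can read it with the standard `json` module. The sketch below is illustrative only: `run_json_command` is a hypothetical helper, and a stand-in command replaces the real CLI so the example runs without a Purview account. Which `pvw` subcommands accept `--output json` is an assumption to confirm with each command's `--help`.

```python
import json
import subprocess
import sys


def run_json_command(argv):
    """Run a command and parse its standard output as JSON (hypothetical helper).

    Raises subprocess.CalledProcessError on a non-zero exit code and
    json.JSONDecodeError if the output is not valid JSON.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Stand-in for a real invocation. With PVW CLI installed, argv would be a
# pvw command followed by "--output", "json" (assumed, not verified here).
demo_argv = [
    sys.executable,
    "-c",
    "import json; print(json.dumps([{'name': 'Customer'}]))",
]
terms = run_json_command(demo_argv)
print(terms[0]["name"])
```

Passing the arguments as a list avoids shell quoting issues with values that contain spaces, such as descriptions.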
- PowerShell script with interactive confirmation ("DELETE" to confirm)
- Python script with Rich progress bars
- Beautiful UI with colored output
- Success/failure tracking per term
- Rate limiting (200ms delay)
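The pattern behind the last two points, per-item success and failure tracking with a fixed pause between requests, can be sketched in a few lines. This is not the code of the shipped scripts: `delete_all` and `delete_one` are hypothetical names, and the actual delete call is left to the caller. The 0.2 second default mirrors the 200 ms delay mentioned above.

```python
import time


def delete_all(term_ids, delete_one, delay_seconds=0.2, sleep=time.sleep):
    """Call delete_one(term_id) for each id, pausing between calls.

    delete_one is a caller-supplied callable (hypothetical; not a PVW CLI
    function). Returns (succeeded, failed), where failed maps each id to
    the error message raised for it.
    """
    succeeded, failed = [], {}
    for index, term_id in enumerate(term_ids):
        if index:
            sleep(delay_seconds)  # pause before every call except the first
        try:
            delete_one(term_id)
            succeeded.append(term_id)
        except Exception as error:  # record the failure and keep going
            failed[term_id] = str(error)
    return succeeded, failed


def fake_delete(term_id):
    """Stand-in for a real delete call; fails for one id to show tracking."""
    if term_id == "term-2":
        raise RuntimeError("HTTP 403")


pauses = []  # collects the requested delays instead of actually sleeping
done, errors = delete_all(["term-1", "term-2", "term-3"], fake_delete, sleep=pauses.append)
print(done, errors)
```

Injecting `sleep` keeps the example fast to run and makes the delay easy to test; in real use the default `time.sleep` applies.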
- Collections API Conformance: 100% alignment with Microsoft Purview specification - all endpoints verified and docstrings updated
- Docstring Accuracy: All collection methods now document correct request/response structures with actual field mappings
- Developer Experience: Enhanced IDE autocomplete with accurate parameter and response documentation
- CSV Import Reliability: Fixed issues with empty header cells in custom attribute parsing (v1.6.1)
- Search Performance: Optimized query execution with improved result caching
- Windows Console Compatibility: All output formats compatible with Windows terminal and PowerShell
- Full CRUD operations for collection lifecycle management
- Hierarchy and tree navigation APIs
- Permission management and access control
- Analytics and usage tracking per collection
- Governance domains, glossary terms, data products
- Objectives & Key Results (OKRs), Critical Data Elements (CDEs)
- Relationships API for linking data assets
- [NEW] Hierarchy visualization - Interactive tree views of glossary structure
- [NEW] Facets & Analytics - Statistics for terms, CDEs, data products, objectives
- [NEW] Impact Analysis - Complete relationship exploration
- Health monitoring and workflow automation
- Full CRUD operations with smart partial updates
- CSV/JSON import with dry-run validation
- PowerShell and Python bulk delete scripts
- Progress tracking and error handling
- Sample files and templates included
- Table format for human viewing (default)
- Plain JSON for PowerShell/bash scripting
- Colored JSON for visual inspection
- Azure CLI, Service Principal, Managed Identity auth
- Works in local development, CI/CD, and production
- Compatible with PowerShell, bash, Python, jq
- MCP Server for AI-powered automation
- Complete API coverage documentation
- Jupyter notebook examples
- Troubleshooting guides
- Sample files and templates
- Documentation: Full Documentation
- New APIs Guide: UC New APIs v1.7.0
- API Coverage Analysis: Complete Coverage Report
- Issue Tracker: GitHub Issues
- Email Support: keayoub@msn.com
- Repository: GitHub - Keayoub/Purview_cli
See LICENSE file for details.
PVW CLI v1.7.0 empowers data engineers, stewards, and architects to automate, scale, and enhance their Microsoft Purview experience with powerful command-line and programmatic capabilities.
Latest in v1.7.0:
- Six new Unified Catalog APIs for analytics and hierarchy visualization
- 96% UC API coverage (46 of 48 operations)
- Rich UI with interactive trees and color-coded tables
- Advanced facets for dashboards and compliance reporting
- Complete relationship exploration for data governance
- Comprehensive documentation and usage guides
- CSV import reliability improvements from v1.6.1
- Bulk operations with comprehensive error handling