rakmohan/agentic-bi-migration

agentic-bi-migration

A proof-of-concept multi-agent platform for migrating Metabase BI dashboards from a legacy normalized PostgreSQL schema to an enriched analytics schema — with semantic field mapping, automated SQL rewriting, equivalence validation, and a human-in-the-loop review portal.


The Problem

Enterprise BI teams accumulate dashboards built directly on top of raw, normalized schemas. When the underlying data model is redesigned — denormalized for performance, enriched with business definitions, restructured for a semantic layer — every dashboard breaks. Migrating them manually is slow, error-prone, and requires both SQL expertise and deep knowledge of the new schema.

The Approach

This PoC automates the migration pipeline with a team of Claude-powered LLM agents and a validation step:

  1. Introspection Agent — connects to the Metabase API, extracts every chart's SQL, and identifies all tables and fields referenced
  2. Migration Agent — maps each legacy field to its enriched equivalent using semantic similarity (vector embeddings) + LLM-assisted review for low-confidence matches, then rewrites the SQL
  3. Validation — runs both the original and rewritten queries side-by-side and compares row counts and aggregates to confirm equivalence
  4. Q&A Agent — answers natural language questions about the migration: why a field was mapped a certain way, what a chart measures, which charts failed and why
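The validation step (3) can be illustrated with a small sketch. This is not the repository's implementation: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table names, column names, and the `summarize` / `compare_queries` helpers are all hypothetical. It only shows the general idea of running the original and rewritten queries and comparing row counts and aggregates.

```python
import sqlite3


def summarize(conn, sql, numeric_cols):
    """Run a query and return its row count plus the sum of each numeric column."""
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    rows = cur.fetchall()
    sums = {}
    for col in numeric_cols:
        idx = names.index(col)
        # Treat NULLs as 0 and round to avoid float noise in the comparison.
        sums[col] = round(sum(r[idx] or 0 for r in rows), 6)
    return {"row_count": len(rows), "sums": sums}


def compare_queries(conn, legacy_sql, rewritten_sql, numeric_cols):
    """Compare two queries on row count and column sums (a coarse equivalence check)."""
    legacy = summarize(conn, legacy_sql, numeric_cols)
    rewritten = summarize(conn, rewritten_sql, numeric_cols)
    return {"equivalent": legacy == rewritten, "legacy": legacy, "rewritten": rewritten}


# Hypothetical legacy (normalized) and enriched (denormalized) tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
CREATE TABLE orders_enriched (
    order_id INTEGER PRIMARY KEY, customer_region TEXT, order_amount REAL
);
INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
INSERT INTO orders_enriched VALUES
    (10, 'EMEA', 100.0), (11, 'EMEA', 50.0), (12, 'APAC', 75.0);
""")

legacy_sql = """
SELECT c.region AS region, SUM(o.amount) AS revenue
FROM orders o JOIN customers c ON c.id = o.customer_id
GROUP BY c.region
"""
rewritten_sql = """
SELECT customer_region AS region, SUM(order_amount) AS revenue
FROM orders_enriched
GROUP BY customer_region
"""
# A deliberately wrong rewrite that drops one region.
broken_sql = """
SELECT customer_region AS region, SUM(order_amount) AS revenue
FROM orders_enriched
WHERE customer_region = 'EMEA'
GROUP BY customer_region
"""

result = compare_queries(conn, legacy_sql, rewritten_sql, ["revenue"])
broken = compare_queries(conn, legacy_sql, broken_sql, ["revenue"])
print(result["equivalent"], broken["equivalent"])
```

Matching row counts and sums is a necessary check rather than a proof of equivalence: two queries can agree on both and still differ row by row, which is one reason the rewritten SQL still goes through human review.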

Human review is central — a Streamlit portal surfaces every mapping decision with confidence scores, lets analysts override incorrect mappings, and shows the original vs. rewritten SQL before anything is applied.
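As a rough illustration of how a mapping decision with a confidence score and an analyst override might be represented, here is a minimal sketch. It is an assumption-laden example, not the project's code: the hand-written three-dimensional vectors stand in for real sentence-transformers embeddings, and `REVIEW_THRESHOLD`, `MappingDecision`, `propose_mapping`, and all field names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional

REVIEW_THRESHOLD = 0.80  # assumed cutoff; below this a mapping is flagged for review


@dataclass
class MappingDecision:
    legacy_field: str
    enriched_field: str
    confidence: float
    needs_review: bool
    override: Optional[str] = None  # set by an analyst in the review portal

    @property
    def final_field(self) -> str:
        """The analyst's override wins over the automated suggestion."""
        return self.override or self.enriched_field


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def propose_mapping(
    legacy_field: str,
    legacy_vec: List[float],
    candidates: Dict[str, List[float]],
) -> MappingDecision:
    """Pick the most similar enriched field and flag low-confidence matches."""
    best_name, best_score = max(
        ((name, cosine(legacy_vec, vec)) for name, vec in candidates.items()),
        key=lambda pair: pair[1],
    )
    return MappingDecision(
        legacy_field=legacy_field,
        enriched_field=best_name,
        confidence=round(best_score, 3),
        needs_review=best_score < REVIEW_THRESHOLD,
    )


# Toy vectors standing in for real field embeddings.
candidates = {
    "orders_enriched.order_amount": [0.9, 0.1, 0.0],
    "orders_enriched.customer_region": [0.0, 0.2, 0.9],
}

high = propose_mapping("orders.amount", [1.0, 0.0, 0.0], candidates)
low = propose_mapping("customers.segment", [0.5, 0.5, 0.5], candidates)

# An analyst rejects the low-confidence suggestion and supplies the correct field.
low.override = "customers_enriched.segment_name"

for decision in (high, low):
    print(decision.legacy_field, "->", decision.final_field, decision.confidence)
```

Keeping the automated suggestion and the override as separate fields preserves what the pipeline originally proposed, so the Q&A step could still explain why a field was mapped a certain way.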


Documentation

| Doc | What it covers |
| --- | --- |
| 1_REQUIREMENTS.md | Functional and non-functional requirements, scope, success criteria |
| 2_ARCHITECTURE.md | System design, tech stack, data model, API endpoints, semantic layer |
| 3_METABASE_DASHBOARD_SETUP.md | How to connect Metabase to PostgreSQL and create the 3 sample dashboards |
| 4_VERIFICATION.md | Step-by-step verification of every component after setup |
| 5_DEMO.md | Scripted 5-minute walkthrough with screenshots |

Architecture

See docs/2_ARCHITECTURE.md for the full system design including Mermaid diagram, data model, API endpoints, semantic layer, and future phases.


Tech Stack

| Layer | Technology |
| --- | --- |
| BI Tool | Metabase (open source) |
| Database | PostgreSQL 15 (legacy, enriched, and migration state schemas) |
| Vector Store | ChromaDB (field embeddings for semantic mapping) |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2), run locally with no API calls |
| SQL Parsing | sqlglot (AST-based field substitution) |
| Agents | Anthropic Python SDK (plain Python, no agent framework) |
| LLM | Claude (claude-sonnet-4-6) for mapping review, SQL rewrite, and Q&A |
| API | FastAPI |
| Portal | Streamlit |
| Containers | Docker Compose (PostgreSQL + Metabase + ChromaDB) |

Prerequisites


Quick Start

git clone https://github.com/rakmohan/agentic-bi-migration.git
cd agentic-bi-migration

# 1. Create environment
conda create -n agentic-bi-migration python=3.11
conda activate agentic-bi-migration
pip install -e .

# 2. Configure
cp .env.example .env
# Edit .env — set ANTHROPIC_API_KEY

# 3. Set up Metabase dashboards (one-time)
# See docs/3_METABASE_DASHBOARD_SETUP.md

# 4. Start everything
./start.sh

start.sh boots Docker, seeds the database, indexes embeddings, and starts the API and portal. Open http://localhost:8501 when it finishes.


Manual Setup

Step-by-step, if you prefer not to use start.sh:

# Start Docker services
cd docker && docker compose up -d && cd ..

# Seed the database
python data/seeds/seed_legacy.py
python data/seeds/transform_enriched.py

# Index schema embeddings
python tests/test_tools.py --tool embedding

# Start the API
python -m uvicorn src.api.app:app --port 8080

# Start the portal (separate terminal)
streamlit run src/portal/app.py --server.port 8501

API docs: http://localhost:8080/docs


Project Structure

agentic-bi-migration/
├── data/
│   ├── schemas/          # SQL schema definitions
│   └── seeds/            # Synthetic data scripts
├── docker/
│   └── docker-compose.yml
├── docs/                 # See Documentation table above
├── src/
│   ├── agents/           # Introspection, Migration, Q&A agents
│   ├── api/              # FastAPI layer
│   ├── portal/           # Streamlit portal
│   ├── state/            # PostgreSQL state repository
│   ├── tools/            # Tool registry
│   └── config.py
├── tests/
├── .env.example
└── pyproject.toml

License

MIT — built as a portfolio demonstration. If you'd like to collaborate on a production implementation, feel free to get in touch.
