InsightX

Conversational Analytics Platform for Data-Driven Insights

🚀 🌐 Check it out live → https://insightxx.vercel.app/

📖 Read the Full Technical Documentation for comprehensive system architecture, API reference, and deployment guides.

🔧 Complete Setup Guide, API Docs & Keys: interactive API documentation with all endpoints, authentication, and environment setup details.

What is InsightX?

InsightX transforms complex data analysis into simple conversations. Upload your CSV files and ask questions in plain English to get instant, data-backed insights with visualizations and statistical analysis. It is built on a multi-agent AI system that routes each query to SQL, Python, or both, for a reported 10-50x performance improvement.

Perfect for business analysts, data scientists, and anyone who needs quick insights from their data without writing code.


✨ Key Features

  • Natural Language Queries - Ask questions in plain English, get data-backed answers
  • Automatic Data Profiling - Instant dataset analysis with schema, patterns, and anomalies
  • Multi-Agent AI System - 5 specialized agents (Orchestrator, SQL, Python, Composer, Explainer)
  • Hybrid SQL→Python Architecture - 27x faster analysis on a 250K-row example by combining DuckDB aggregation with Python
  • Real-Time Streaming - Watch the AI think and process your queries in real time
  • Interactive Visualizations - Automatic chart generation with Recharts
  • Context-Aware Conversations - AI remembers previous questions and builds on them

🛠️ Tech Stack

Frontend: Next.js 16, React 19, TypeScript, Tailwind CSS v4, Zustand
Backend: FastAPI (Python), DuckDB, pandas, scipy
Database: Supabase (PostgreSQL + Storage)
AI/LLM: Claude Sonnet 4.5 via Bytez API
Deployment: Vercel (frontend), Railway (backend)


🚀 Quick Start

Prerequisites

Before you begin, ensure you have the following installed on your system:

  • Node.js 20.x or higher
  • Python 3.11 or higher
  • npm (comes with Node.js)
  • pip (comes with Python)
  • Git

Verify installations:

node --version      # Should show v20.x or higher
npm --version       # Should show 10.x or higher
python --version    # Should show 3.11 or higher
pip --version       # Should show 24.x or higher
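The same checks can be scripted. The sketch below is a hypothetical helper, not part of the InsightX repository: it parses each tool's `--version` output and compares it with the minimums listed above. The function names are illustrative.

```python
import re
import shutil
import subprocess

# Minimum versions copied from the prerequisites and comments above.
MINIMUMS = {"node": (20,), "npm": (10,), "python": (3, 11), "pip": (24,)}


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples, e.g. (3, 11, 4) >= (3, 11)."""
    return tuple(found) >= tuple(minimum)


def check_tool(name):
    """Run `<name> --version` and report whether it satisfies MINIMUMS."""
    executable = shutil.which(name)
    if executable is None:
        return f"{name}: not found on PATH"
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, timeout=15
    )
    found = parse_version(result.stdout or result.stderr)
    verdict = "ok" if meets_minimum(found, MINIMUMS[name]) else "too old"
    return f"{name}: {'.'.join(map(str, found))} ({verdict})"
```

Running `for name in MINIMUMS: print(check_tool(name))` prints one line per tool, which can be easier to scan than four separate commands.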

Step 1: Clone the Repository

git clone https://github.com/NabilThange/insightx.git insightx-app
cd insightx-app

Step 2: Frontend Setup

2.1 Install Frontend Dependencies

Navigate to the project root and install all npm packages (GSAP, Supabase, Recharts, etc.):

npm install

Key packages being installed:

  • gsap - Animation library for smooth UI transitions
  • @supabase/supabase-js - Database and authentication
  • recharts - Interactive chart visualizations
  • next - React framework
  • tailwindcss - Styling framework
  • zustand - State management
  • framer-motion - Advanced animations

2.2 Create Frontend Environment File

Create a .env.local file in the project root:

# Windows (PowerShell)
New-Item -Path ".env.local" -ItemType File

# Mac/Linux
touch .env.local

🔑 Hackathon Judges: See HACKATHON_KEYS.md for pre-filled credentials. Copy the entire Frontend Environment block and paste it into .env.local.

For other users, add the following environment variables to .env.local:

# API Configuration
NEXT_PUBLIC_API_URL=http://localhost:8000/api

# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=https://xvtqbvavwbowyyoevolo.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key-here

# AI/LLM Configuration
NEXT_PUBLIC_BYTEZ_API_KEY=your-bytez-api-key-here

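Note: Replace your-anon-key-here and your-bytez-api-key-here with actual credentials from the Supabase and Bytez dashboards. Next.js inlines every NEXT_PUBLIC_ variable into the browser bundle, so treat these values as publicly visible.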


Step 3: Backend Setup

3.1 Navigate to Backend Directory

cd backend

3.2 Create and Activate Virtual Environment

Create a Python virtual environment to isolate dependencies:

# Windows
python -m venv venv
venv\Scripts\activate

# Mac/Linux
python3 -m venv venv
source venv/bin/activate

Expected output after activation:

(venv) PS C:\Users\YourName\IIT-B-HACKATHON\backend>  # Windows PowerShell
(venv) user@machine backend %                         # Mac/Linux

The (venv) prefix indicates the virtual environment is active. If you don't see it, the venv is not activated.

3.3 Install Backend Dependencies

Install all Python packages (FastAPI, DuckDB, pandas, scipy, Supabase, etc.):

pip install -r requirements.txt

Key packages being installed:

  • fastapi - Web framework for API
  • uvicorn - ASGI server to run FastAPI
  • duckdb - SQL engine for data analysis
  • pandas - Data manipulation library
  • scipy - Statistical analysis
  • supabase - Database client
  • python-dotenv - Environment variable management

3.4 Create Backend Environment File

Create a .env file in the backend/ directory:

# Windows (PowerShell)
New-Item -Path ".env" -ItemType File

# Mac/Linux
touch .env

🔑 Hackathon Judges: See HACKATHON_KEYS.md for pre-filled credentials. Copy the entire Backend Environment block and paste it into backend/.env.

For other users, add the following environment variables to backend/.env:

# Supabase Configuration
SUPABASE_URL=https://xvtqbvavwbowyyoevolo.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key-here

# Bytez API Keys (Primary and Fallback)
BYTEZ_API_KEY_1=your-primary-bytez-key-here
BYTEZ_API_KEY_2=your-fallback-bytez-key-here

Note: Replace with actual credentials from your Supabase and Bytez dashboards.
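A quick way to catch a missing key or a leftover placeholder before starting the server is to scan the file. The sketch below is a hypothetical standalone check, not part of the repository. It assumes the four variable names shown above and a simple KEY=VALUE format; the backend itself loads the file through python-dotenv, which handles more syntax than this parser does.

```python
from pathlib import Path

# Variable names taken from the backend/.env template above.
REQUIRED_KEYS = (
    "SUPABASE_URL",
    "SUPABASE_SERVICE_KEY",
    "BYTEZ_API_KEY_1",
    "BYTEZ_API_KEY_2",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(values):
    """List required keys that are absent, empty, or still placeholders."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith("your-") and value.endswith("-here"):
            problems.append(f"{key} still holds a placeholder value")
    return problems


def check_env_file(path="backend/.env"):
    """Read the env file (path is an assumption) and return its problems."""
    env_path = Path(path)
    if not env_path.exists():
        return [f"{path} does not exist"]
    return find_problems(parse_env(env_path.read_text(encoding="utf-8")))
```

Calling `check_env_file()` from the project root returns an empty list when all four variables are filled in.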


Step 4: Run the Application

Terminal 1: Start the Frontend

# From project root (insightx-app/)
npm run dev

Expected output:

  ▲ Next.js 16.1.6
  - Local:        http://localhost:3000
  - Environments: .env.local

✓ Ready in 3.2s
✓ Compiled client and server successfully

The frontend is now running at http://localhost:3000

Terminal 2: Start the Backend

# From backend directory (insightx-app/backend/)
# Make sure virtual environment is activated first!

uvicorn main:app --reload

Expected output:

INFO:     Will watch for changes in these directories: ['C:\Users\...\backend']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started server process [12345]
INFO:     Application startup complete

The backend API is now running at http://localhost:8000


Step 5: Verify Everything is Running

Open your browser and check:

  1. Frontend: http://localhost:3000

    • You should see the InsightX landing page
    • Navigation menu visible
    • No console errors
  2. Backend API Docs: http://localhost:8000/docs

    • Interactive Swagger UI showing all API endpoints
    • Try out endpoints directly from the browser
  3. Backend Health Check: http://localhost:8000/health

    • Should return: {"status": "healthy", "database": "connected", "storage": "connected"}
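The health check can also be polled from a script, for example while waiting for the backend to finish starting. This is a minimal sketch using only the standard library. It assumes the URL and the three response fields shown above; the function names and retry settings are illustrative.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8000/health"  # endpoint from Step 5 above


def is_healthy(payload):
    """True when every field matches the response documented in Step 5."""
    return (
        payload.get("status") == "healthy"
        and payload.get("database") == "connected"
        and payload.get("storage") == "connected"
    )


def wait_for_backend(url=HEALTH_URL, attempts=10, delay=1.0):
    """Poll the health endpoint until it reports healthy or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                if is_healthy(json.load(response)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # backend not up yet, or the body was not valid JSON
        time.sleep(delay)
    return False
```

`wait_for_backend()` returns False if the server never answers, which usually points back to the virtual environment or the backend/.env file.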

Step 6: Upload Your First Dataset

  1. Navigate to http://localhost:3000/connect
  2. Drag and drop a CSV file (or click to browse)
  3. Wait for the "Data DNA Generation" messages:
    📊 Analyzing dataset structure...
    🔍 Profiling data patterns...
    ⚠️ Detecting anomalies...
    ✓ Data DNA Complete
    
  4. You'll be redirected to the workspace
  5. Start asking questions in natural language!
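To give a feel for what a profiling pass over an uploaded CSV involves, here is a minimal sketch using only the standard library. It is not InsightX's Data DNA implementation: the output fields, the sample columns, and the anomaly rule (distance from the median measured in median absolute deviations) are all assumptions chosen for illustration.

```python
import csv
import io
import statistics


def _as_number(value):
    """Return value as a float, or None when it is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def profile_csv(text, mad_multiplier=5.0):
    """Build a per-column profile: type, missing count, numeric anomalies."""
    rows = list(csv.DictReader(io.StringIO(text)))
    profile = {}
    if not rows:
        return profile
    for column in rows[0]:
        raw = [row[column] for row in rows]
        present = [value for value in raw if value not in (None, "")]
        numbers = [_as_number(value) for value in present]
        numeric = bool(present) and all(n is not None for n in numbers)
        info = {
            "type": "numeric" if numeric else "text",
            "missing": len(raw) - len(present),
        }
        if numeric:
            median = statistics.median(numbers)
            # Median absolute deviation: a spread measure that one extreme
            # value cannot inflate the way it inflates a standard deviation.
            mad = statistics.median(abs(n - median) for n in numbers)
            info["median"] = median
            info["anomalies"] = [
                n for n in numbers
                if mad and abs(n - median) > mad_multiplier * mad
            ]
        profile[column] = info
    return profile


# Hypothetical six-row dataset with one extreme amount and one blank device.
sample = """amount,status,device
100,SUCCESS,Android
105,SUCCESS,iOS
110,FAILED,Android
115,SUCCESS,
120,SUCCESS,Web
5000,SUCCESS,Android
"""
profile = profile_csv(sample)
print(profile["amount"]["anomalies"], profile["device"]["missing"])
```

On the sample data, the amount column is typed as numeric with 5000 flagged as an anomaly, and the device column reports one missing value.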

Troubleshooting

Frontend won't start:

# Clear cache and reinstall (Mac/Linux)
rm -rf node_modules package-lock.json
# Windows (PowerShell): Remove-Item -Recurse -Force node_modules, package-lock.json
npm install
npm run dev

Backend won't start - "ModuleNotFoundError":

# Make sure virtual environment is activated
# Windows: venv\Scripts\activate
# Mac/Linux: source venv/bin/activate

# Check Python version (should be 3.11+)
python --version

# Reinstall dependencies
pip install --upgrade -r requirements.txt

Backend won't start - "Port 8000 already in use":

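# Run on a different port (then point NEXT_PUBLIC_API_URL in .env.local at http://localhost:8001/api and restart the frontend)
uvicorn main:app --reload --port 8001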

Environment variables not loading:

  • Ensure .env.local is in project root (frontend)
  • Ensure backend/.env is in backend directory
  • Restart both servers after creating/modifying env files

Next Steps

✅ Setup complete! You can now:

  • Upload CSV datasets
  • Ask natural language questions
  • Get AI-powered insights with visualizations
  • Explore the multi-agent system in action

For detailed documentation, see INSIGHTX_TECHNICAL_DOCUMENTATION.md


📁 Project Structure

insightx-app/
├── app/                    # Next.js pages (landing, workspace, reports)
├── components/             # React components (chat, data, workspace, ui)
├── lib/                    # Core libraries (agents, api, db, utils)
├── store/                  # Zustand state management
├── backend/                # FastAPI backend
│   ├── routes/            # API endpoints
│   ├── services/          # Business logic (orchestrator, executors)
│   ├── models/            # Pydantic schemas
│   └── main.py            # FastAPI app entry
├── MASTER_CONTEXT/         # Comprehensive documentation
└── package.json

💡 Usage Example

  1. Upload Dataset: Navigate to /connect and drag-drop your CSV file
  2. Automatic Analysis: Wait for Data DNA generation (schema, patterns, anomalies)
  3. Ask Questions: In the workspace, type natural language queries:
    • "What's the success rate?"
    • "Show me transactions over $1000"
    • "What are the top 5 categories by revenue?"
  4. Get Insights: Receive answers with metrics, visualizations, and confidence scores
  5. Follow Up: Ask context-aware follow-up questions to dive deeper

🧠 Sample Query Bank

InsightX is validated against 30 business questions spanning 6 analytics categories. Each query is tagged by execution path, showing exactly how the multi-agent system routes and handles it.

Execution Paths

| Symbol | Path | Description |
| --- | --- | --- |
| (none) | SQL | Pure SQL aggregation via the SQL Agent |
| ⭐ | PYTHON | Statistical analysis via the Python Agent |
| ⭐⭐ | HYBRID | SQL extracts → Python analyzes (InsightX's core differentiator) |

Category 1 β€” Descriptive Analytics

What is happening in the data?

| # | Query | Path |
| --- | --- | --- |
| 1 | What is the overall transaction success rate across all transaction types? | SQL |
| 2 | What is the average transaction amount for each transaction type? | SQL |
| 3 | Which sender bank processes the highest total transaction volume (in ₹)? | SQL |
| 4 | How many transactions were made on weekends vs weekdays? | SQL |
| 5 | What is the most commonly used device type for transactions? | SQL |

Category 2 β€” Comparative Analytics

How do different segments differ?

| # | Query | Path |
| --- | --- | --- |
| 6 | How do failure rates compare between Android, iOS, and Web users? | SQL |
| 7 | Which bank pair (sender → receiver) has the highest failure rate in P2P transfers? | SQL |
| 8 | Do 5G users have a higher average transaction amount than 3G users? | SQL |
| 9 | Compare fraud flag rates between P2P and P2M transactions. | SQL |
| 10 | Are the failure rate differences between device types statistically significant, or could they be due to random chance? | ⭐ PYTHON |
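Query 10 is the kind of question that needs a hypothesis test instead of a plain aggregate. The sketch below shows one standard option, a Pearson chi-square test on failure counts for three device types. It is not InsightX's Python Agent code (the README says that agent uses scipy); this version uses only the standard library, and the counts are made up for illustration.

```python
import math


def chi_square_three_groups(failures, totals):
    """Pearson chi-square test of equal failure rates across exactly 3 groups.

    Returns (statistic, p_value). With 3 groups and 2 outcomes there are
    2 degrees of freedom, where the chi-square survival function has the
    closed form exp(-x / 2), so no statistics library is needed.
    """
    if len(failures) != 3 or len(totals) != 3:
        raise ValueError("this sketch only covers the 3-group (df = 2) case")
    overall_rate = sum(failures) / sum(totals)
    statistic = 0.0
    for failed, total in zip(failures, totals):
        expected_fail = total * overall_rate
        expected_ok = total - expected_fail
        statistic += (failed - expected_fail) ** 2 / expected_fail
        statistic += ((total - failed) - expected_ok) ** 2 / expected_ok
    return statistic, math.exp(-statistic / 2)


# Hypothetical counts: failed transactions out of 1,000 per device type.
failures = [100, 60, 80]        # Android, iOS, Web
totals = [1000, 1000, 1000]
statistic, p_value = chi_square_three_groups(failures, totals)
print(f"chi2 = {statistic:.2f}, p = {p_value:.4f}")
```

For these invented counts the statistic is about 10.87 and the p-value is about 0.004, so the difference in failure rates would be unlikely under random chance alone. For other table shapes, a library routine such as scipy.stats.chi2_contingency is the usual choice.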

Category 3 β€” Temporal Analytics

When do patterns occur?

| # | Query | Path |
| --- | --- | --- |
| 11 | What are the peak transaction hours by volume for the Food merchant category? | SQL |
| 12 | Which day of the week has the highest P2P transfer volume? | SQL |
| 13 | At what hour do transaction failures peak across all transaction types? | SQL |
| 14 | How does average transaction amount differ between weekdays and weekends for the 18-25 age group? | SQL |
| 15 | Is there a statistically significant spike in fraud-flagged transactions during peak hours (8–10 PM) vs off-peak? | ⭐⭐ HYBRID |

Category 4 β€” Segmentation Analytics

Who is doing what?

| # | Query | Path |
| --- | --- | --- |
| 16 | Which age group uses P2P transfers most frequently, and what is their average transfer amount? | SQL |
| 17 | Which Indian state has the highest average transaction value, and which has the lowest? | SQL |
| 18 | How do the 18-25 and 56+ age groups differ in their preferred transaction types? | SQL |
| 19 | Which merchant category is most popular among users aged 36-45? | SQL |
| 20 | Segment all Indian states into high, medium, and low fraud-risk tiers. Which tier does Maharashtra fall into? | ⭐⭐ HYBRID |

Category 5 β€” Correlation Analytics

What drives what?

| # | Query | Path |
| --- | --- | --- |
| 21 | Is there a relationship between network type and transaction success rate? | SQL |
| 22 | Do higher-value transactions have a higher fraud flag rate? | SQL |
| 23 | What is the correlation coefficient between transaction amount and fraud flag status? Is it significant? | ⭐ PYTHON |
| 24 | Does the sender's bank significantly influence the probability of a transaction being flagged? | ⭐ PYTHON |
| 25 | Which combination of network type × device type produces the worst outcomes? Is the interaction effect significant? | ⭐⭐ HYBRID |
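Query 23 asks for both a correlation coefficient and its significance. One library-free way to answer both parts is a Pearson correlation (equivalent to the point-biserial correlation when one variable is a 0/1 flag) plus a permutation test. The sketch below illustrates that approach; it is not taken from InsightX, and the eight transactions are invented.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, shuffles=2000, seed=0):
    """Two-sided p-value: how often does a shuffled pairing look as extreme?"""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point ties count as "as extreme".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (shuffles + 1)


# Hypothetical data: transaction amounts and whether each was fraud-flagged.
amounts = [100, 200, 300, 400, 5000, 6000, 7000, 8000]
fraud_flags = [0, 0, 0, 0, 1, 1, 1, 1]
r = pearson_r(amounts, fraud_flags)
p = permutation_p_value(amounts, fraud_flags)
print(f"r = {r:.3f}, permutation p = {p:.3f}")
```

The permutation test makes no distributional assumption, which suits a skewed amount column, at the cost of recomputing the statistic many times. That cost stays small when SQL has already reduced the data.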

Category 6 β€” Risk Analytics

Where are the anomalies?

| # | Query | Path |
| --- | --- | --- |
| 26 | What percentage of transactions above ₹10,000 are flagged for review? | SQL |
| 27 | Which merchant category has the highest fraud flag rate? Does it also have the highest failure rate? | SQL |
| 28 | Are failed transactions more likely to be flagged for review than successful ones? | SQL |
| 29 | Which states are statistical outliers for fraud flag rates, i.e. significantly above or below the national average? | ⭐ PYTHON |
| 30 | Which combination of age group × transaction type × network type produces the highest fraud flag rate? Is the sample size large enough to trust this finding? | ⭐⭐ HYBRID |

Bonus β€” Conversational Context Chain

Tests multi-turn memory and pronoun resolution

Turn 1 → "What is the failure rate for Android users?"
Turn 2 → "Why is that higher than iOS?"
Turn 3 → "Is this pattern consistent on weekends too?"
Turn 4 → "Now break it down by age group."
Turn 5 → "Which age group should we prioritize fixing first?"

Each turn builds on the last. InsightX retains filters, resolves pronouns ("that", "this"), and accumulates context, so the user never needs to repeat earlier details.
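The carry-over described above can be pictured as a small state object that each turn updates. This is a simplified sketch, not InsightX's implementation: it assumes some earlier step (an LLM, in practice) has already turned each message into a metric, filters, and grouping. The class, function, and column names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class QueryContext:
    """What the conversation has established so far."""
    metric: Optional[str] = None
    filters: dict = field(default_factory=dict)
    group_by: tuple = ()


def apply_turn(previous, metric=None, filters=None, group_by=None):
    """Merge one turn's new details into the context from earlier turns."""
    return QueryContext(
        metric=metric if metric is not None else previous.metric,
        filters={**previous.filters, **(filters or {})},
        group_by=tuple(group_by) if group_by is not None else previous.group_by,
    )


# Turns 1, 3 and 4 from the chain above, with hypothetical column names.
turn1 = apply_turn(QueryContext(), metric="failure_rate",
                   filters={"device": "Android"})
turn3 = apply_turn(turn1, filters={"day_type": "weekend"})
turn4 = apply_turn(turn3, group_by=["age_group"])
print(turn4)
```

By Turn 4 the context still carries the failure-rate metric and the Android filter from Turn 1, even though the user only said "break it down by age group".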


Distribution Summary

| Path | Count |
| --- | --- |
| SQL | 22 |
| ⭐ PYTHON | 4 |
| ⭐⭐ HYBRID | 4 |

SQL handles speed. Python handles statistical rigor. Hybrid handles both, at 10–27x the performance of a Python-only approach on large datasets.


🏗️ Architecture Highlights

Multi-Agent System:

  • Orchestrator Agent - Classifies queries (SQL_ONLY, PY_ONLY, SQL_THEN_PY, EXPLAIN_ONLY)
  • SQL Agent - Generates DuckDB queries for data retrieval
  • Python Agent - Performs statistical analysis with scipy
  • Composer Agent - Synthesizes results into user-friendly responses
  • Explainer Agent - Handles general knowledge questions
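The four classification labels suggest a simple dispatch structure: the Orchestrator picks a route, and the route determines which agents run. The sketch below illustrates that idea with stub agents. The agent order per route, and whether the Composer runs after every data route, are assumptions; the README does not spell them out.

```python
from enum import Enum


class Route(Enum):
    """The four labels the Orchestrator assigns, as listed above."""
    SQL_ONLY = "SQL_ONLY"
    PY_ONLY = "PY_ONLY"
    SQL_THEN_PY = "SQL_THEN_PY"
    EXPLAIN_ONLY = "EXPLAIN_ONLY"


# Assumed agent order per route (not confirmed by the README).
PIPELINES = {
    Route.SQL_ONLY: ("sql", "composer"),
    Route.PY_ONLY: ("python", "composer"),
    Route.SQL_THEN_PY: ("sql", "python", "composer"),
    Route.EXPLAIN_ONLY: ("explainer",),
}


def run_pipeline(route, question, agents):
    """Pass the question through each agent of the route, in order."""
    payload = question
    for name in PIPELINES[route]:
        payload = agents[name](payload)
    return payload


# Stub agents that only record that they ran; real agents would call an
# LLM, DuckDB, or scipy.
stub_agents = {
    name: (lambda payload, name=name: f"{name}({payload})")
    for name in ("sql", "python", "composer", "explainer")
}
print(run_pipeline(Route.SQL_THEN_PY, "question", stub_agents))
```

With the stubs, the SQL_THEN_PY route prints `composer(python(sql(question)))`, which makes the hand-off order visible.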

Performance Innovation:

  • Traditional: Load 250K rows → Python analysis → 30+ seconds
  • InsightX: SQL aggregates 250K rows → 28 rows → Python analysis → ~1 second (27x faster!)
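The pattern behind that comparison is to let the SQL engine collapse the raw rows and hand Python only the small aggregate. The sketch below shows that shape end to end. To stay dependency-free it uses the standard library's sqlite3 as a stand-in for DuckDB and the statistics module in place of scipy. The table, column names, and data are synthetic, and no timing claim is made for this toy example.

```python
import sqlite3
import statistics


def build_demo_table(conn):
    """Create synthetic transactions: 28 states x 500 rows each."""
    conn.execute("CREATE TABLE transactions (state TEXT, fraud_flag INTEGER)")
    rows = []
    for i in range(28):
        flagged = 60 if i == 27 else 10   # one state is deliberately unusual
        rows.extend(
            (f"state_{i:02d}", 1 if n < flagged else 0) for n in range(500)
        )
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)


def aggregate_in_sql(conn):
    """Step 1 (SQL): collapse the raw rows to one row per state."""
    return conn.execute(
        "SELECT state, COUNT(*) AS n, SUM(fraud_flag) AS flagged "
        "FROM transactions GROUP BY state ORDER BY state"
    ).fetchall()


def find_outliers(groups, z_threshold=2.0):
    """Step 2 (Python): z-score each state's flag rate against the rest."""
    rates = {state: flagged / n for state, n, flagged in groups}
    values = list(rates.values())
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []
    return [
        state for state, rate in rates.items()
        if abs(rate - mean) / spread > z_threshold
    ]


conn = sqlite3.connect(":memory:")
build_demo_table(conn)
groups = aggregate_in_sql(conn)      # 14,000 raw rows become 28 rows
outliers = find_outliers(groups)
print(len(groups), outliers)
```

Python only ever sees 28 tuples, so the statistical step costs the same whether the table holds 14,000 rows or 250K. That is where the speedup in the bullets above comes from.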

📚 Documentation

  • Technical Documentation: See INSIGHTX_TECHNICAL_DOCUMENTATION.md for comprehensive system details
  • Master Context: Check MASTER_CONTEXT/ folder for architecture, implementation, and integration docs
  • Backend Setup: See backend/START_HERE.md for detailed backend setup
  • API Reference: Visit http://localhost:8000/docs for interactive API documentation

🧪 Testing

Backend API:

cd backend
python test_api.py
python test_exploration.py

Frontend:

npm run test

🚢 Deployment

Live Application: 🌐 https://insightxx.vercel.app/ (check it out!)

Frontend (Vercel):

npm run build
# Deploy via Vercel dashboard or CLI

Backend (Railway):

  • Push to GitHub
  • Railway auto-deploys from main branch
  • Configure environment variables in Railway dashboard

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

MIT License - see LICENSE file for details


👥 Developers

Team Members:

Built for IIT-B Techfest Hackathon

Project Status: ✅ Production Ready
Version: 1.0.0
Last Updated: March 2, 2026


Links


🆘 Support

For issues or questions:

  • Check MASTER_CONTEXT/ documentation
  • Review API docs at /docs endpoint
  • Open a GitHub issue

Made with ❤️ for data-driven decision making
