IshanGupta09/nl2sql-rag-system

Python Streamlit Groq Llama SQLite FAISS FastAPI


┌─────────────────────────────────────────────────────────────────────────┐
│                                                                         │
│   "Which customer placed the most orders last month?"                   │
│                              ↓                                          │
│   SELECT name, COUNT(*) FROM customers JOIN orders ... LIMIT 1;         │
│                              ↓                                          │
│   "Alice placed the most orders with 12 orders last month."             │
│                                                                         │
│            ⚡ All in ~2 seconds  ·  100% benchmark accuracy             │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

🎥 Demo

[GIF: NL2SQL Demo]

Can't see the GIF? Watch live → nl2sql-rag.streamlit.app


🏆 Benchmark

  • 📊 Benchmark accuracy: 100% (25 / 25 questions)
  • ⚡ Average query time: ~2s (SQL + NL answer)
  • 🗄️ Tables tested: 4 (200 orders · 50 customers)

| Category | Score |
| --- | --- |
| 🔵 Simple SELECT | 5 / 5 |
| 🟣 Filter (WHERE) | 5 / 5 |
| 🟠 Aggregation (SUM · AVG · COUNT) | 5 / 5 |
| 🔴 JOIN (multi-table) | 5 / 5 |
| 🟡 Date Filter | 5 / 5 |
| ⚡ Overall | 25 / 25 🏆 100% |

✨ Features

🧠 Intelligence

  • NL → SQL — plain English to precise SQL
  • NL Answers — results in one clear sentence
  • RAG Pipeline — FAISS + MiniLM context injection
  • Auto Correction — SQL errors trigger retry
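
The auto-correction feature can be pictured as a small retry loop around query execution. The sketch below is not the code in `llm/sql_generator.py`; `run_with_correction` and `fake_fix` are hypothetical names, and `fake_fix` stands in for the LLM call that would normally receive the failing query and the error message.

```python
import sqlite3
from typing import Callable


def run_with_correction(
    conn: sqlite3.Connection,
    sql: str,
    fix_sql: Callable[[str, str], str],
    max_retries: int = 2,
) -> list[tuple]:
    """Execute sql; on a SQLite error, ask fix_sql for a corrected query and retry."""
    for attempt in range(max_retries + 1):
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            # Feed the failing query and the error text back to the generator.
            sql = fix_sql(sql, str(exc))
    raise AssertionError("unreachable")


# Demo on an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Alice'), ('Bob')")


def fake_fix(bad_sql: str, error: str) -> str:
    """Stand-in for the LLM: repairs one known typo in the table name."""
    return bad_sql.replace("custmers", "customers")


rows = run_with_correction(conn, "SELECT name FROM custmers ORDER BY name", fake_fix)
print(rows)
```

Capping the number of retries keeps a persistently wrong query from looping forever and lets the caller fall back to a graceful error answer.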

🗄️ Database

  • Multi-Database — upload any SQLite .db file
  • Schema Discovery — auto-detected, no config
  • ecommerce.db — demo database included
  • CSV Export — download any result table
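
Schema discovery for an uploaded SQLite file can be done entirely with SQLite's own catalog. The following is a minimal sketch under assumptions: `discover_schema` and `schema_as_prompt` are hypothetical helpers, and the two demo tables are invented for illustration, not taken from `ecommerce.db`.

```python
import sqlite3


def discover_schema(conn: sqlite3.Connection) -> dict[str, list[tuple[str, str]]]:
    """Return {table: [(column, declared_type), ...]} for every user table."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        quoted = table.replace('"', '""')  # escape embedded double quotes
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in cols]
    return schema


def schema_as_prompt(schema: dict[str, list[tuple[str, str]]]) -> str:
    """Render the schema as one line per table, ready to paste into an LLM prompt."""
    lines = []
    for table, cols in schema.items():
        col_text = ", ".join(f"{name} {ctype}".strip() for name, ctype in cols)
        lines.append(f"{table}({col_text})")
    return "\n".join(lines)


# Demo with an invented two-table schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
schema = discover_schema(conn)
prompt = schema_as_prompt(schema)
print(prompt)
```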

🎨 Interface

  • Cyberpunk UI — animated dot-grid + scanline
  • Session Counter — 8 query limit with live bar
  • Query History — recent queries in sidebar
  • Feedback — 👍 / 👎 rating per answer
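
The 8-query session counter could be tracked with a single integer in per-session state. This is a sketch only: a plain `dict` stands in for Streamlit's `st.session_state`, and `try_consume_query` and `usage_fraction` are hypothetical names, not functions from `streamlit.py`.

```python
MAX_QUERIES = 8  # per-session limit stated in the feature list


def try_consume_query(state: dict, limit: int = MAX_QUERIES) -> bool:
    """Count one query against the session; return False once the limit is reached."""
    used = state.get("queries_used", 0)
    if used >= limit:
        return False
    state["queries_used"] = used + 1
    return True


def usage_fraction(state: dict, limit: int = MAX_QUERIES) -> float:
    """Fraction of the budget used, in [0.0, 1.0], suitable for a progress bar."""
    return min(state.get("queries_used", 0) / limit, 1.0)


# Demo: ten attempts against an empty session.
session: dict = {}
results = [try_consume_query(session) for _ in range(10)]
print(results.count(True), usage_fraction(session))
```

In a Streamlit app the same two functions could be called with `st.session_state` in place of the dict, and the fraction passed to `st.progress`.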

🔒 Production

  • SQL Safety — blocks all destructive ops
  • Streamlit Secrets — API key never in code
  • Cloud Ready — Streamlit Cloud deployment
  • Error Recovery — graceful fallback answers

🔄 Architecture

[Figure: NL2SQL architecture diagram]

🛠️ Tech Stack

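| Layer | Technology | Purpose |
| --- | --- | --- |
| 🎨 Frontend | Streamlit | Animated cyberpunk UI |
| 🤖 LLM | Groq (Llama 3.3 70B) | SQL generation + NL answers |
| 🔍 RAG | FAISS + MiniLM | Business context retrieval |
| 🗄️ Database | SQLite | Query execution |
| ⚙️ API | FastAPI | REST backend (local/Docker) |
| ☁️ Deploy | Streamlit Cloud | Live hosting |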

📁 Project Structure

nl2sql-rag-system/
│
├── 🚀 streamlit.py              # Main app — standalone, Cloud ready
├── ⚙️  api.py                   # FastAPI backend (local/Docker only)
├── 📋 requirements.txt
├── 🔒 .gitignore
│
├── .streamlit/
│   └── config.toml              # Dark theme config
│
├── assets/
│   └── demo.gif                 # Demo recording
│
├── data/
│   ├── ecommerce.db             # Demo SQLite database
│   └── .gitkeep
│
├── docs/
│   └── business_rules.txt       # Domain context for RAG
│
├── llm/
│   └── sql_generator.py         # Groq — SQL generation + correction
│
├── nlg/
│   └── answer_generator.py      # Groq — rows → plain English
│
├── rag/
│   ├── ingest.py                # Build FAISS vectorstore
│   └── retriever.py             # Retrieve context per question
│
└── eval/
    ├── benchmark.py             # 25-question benchmark
    ├── questions.json           # Test questions
    └── report.py                # Benchmark report

💡 Example Queries

| 💬 You Ask | 🤖 AI Answers |
| --- | --- |
| What is the total revenue? | "The total revenue is $70,071.24." |
| Which customer placed the most orders? | "Alice placed the most orders with 12 orders." |
| List premium customers from the North | Returns matching table |
| How many orders last month? | "There were 18 orders placed last month." |
| Top 5 customers by total spend | Returns ranked table |
| Average order value? | "The average order value is $350.36." |

☁️ Deploy on Streamlit Cloud

Get your own live instance in under 5 minutes, for free.

1 — Fork this repo

2 — Go to share.streamlit.io

  • Create app → select your fork → main file: streamlit.py

3 — Add API key in Secrets

GROQ_API_KEY = "gsk_your_key_here"

Free key at console.groq.com — no credit card.

4 — Deploy 🎉 · Live in ~2 minutes.
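
For readers wondering how the app might pick up the key at runtime, here is a minimal sketch. It assumes the key is looked up first in the environment and then in Streamlit Secrets; `load_groq_api_key` is a hypothetical helper, and the actual lookup order in `streamlit.py` may differ.

```python
import os


def load_groq_api_key() -> str:
    """Return GROQ_API_KEY from the environment, else from Streamlit Secrets."""
    key = os.environ.get("GROQ_API_KEY")
    if key:
        return key
    try:
        import streamlit as st  # imported lazily so the helper works without it

        key = st.secrets["GROQ_API_KEY"]
    except Exception:
        # Streamlit missing, no secrets file, or key not defined there.
        key = None
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it under Secrets on Streamlit Cloud "
            "or export it in your shell."
        )
    return key
```

Failing fast with a clear message is friendlier than letting the first LLM request fail with an authentication error.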


💻 Run Locally

# Clone
git clone https://github.com/IshanGupta09/nl2sql-rag-system.git
cd nl2sql-rag-system

# Setup (create the venv once, then run the activate line for your OS)
python -m venv venv
source venv/bin/activate    # Mac/Linux
venv\Scripts\activate       # Windows

pip install -r requirements.txt
echo "GROQ_API_KEY=gsk_your_key_here" > .env

# First time only
python rag/ingest.py

# Run
streamlit run streamlit.py

🧪 Benchmark

python eval/report.py
  ✅  Simple SELECT     5/5
  ✅  Filter            5/5
  ✅  Aggregation       5/5
  ✅  JOIN              5/5
  ✅  Date Filter       5/5
  ───────────────────────────
  🏆  Overall    25/25  100%
  ⚡  Avg time       1.94s
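
A per-category report like the one above boils down to a tally of pass/fail records. The sketch below assumes each benchmark result is a dict with `category` and `passed` keys; that record shape and the `summarize` helper are assumptions, not the format used by `eval/report.py`, and the sample data is invented (it is not the project's benchmark output).

```python
from collections import defaultdict


def summarize(results):
    """Tally passes per category.

    Returns ({category: (passed, total)}, (passed_overall, total_overall)).
    """
    counts = defaultdict(lambda: [0, 0])
    for record in results:
        bucket = counts[record["category"]]
        bucket[0] += 1 if record["passed"] else 0
        bucket[1] += 1
    per_category = {cat: (p, t) for cat, (p, t) in counts.items()}
    overall = (
        sum(p for p, _ in per_category.values()),
        sum(t for _, t in per_category.values()),
    )
    return per_category, overall


# Invented sample records, including one failure.
sample = [
    {"category": "JOIN", "passed": True},
    {"category": "JOIN", "passed": False},
    {"category": "Filter", "passed": True},
]
per_category, overall = summarize(sample)
for cat, (p, t) in per_category.items():
    print(f"{cat:<12} {p}/{t}")
print(f"Overall      {overall[0]}/{overall[1]}  {100 * overall[0] / overall[1]:.0f}%")
```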

🔒 Security

| 🛡️ Protection | Details |
| --- | --- |
| 🚫 SQL Injection | Blocks DROP, DELETE, UPDATE, INSERT, ALTER, TRUNCATE |
| 📁 Path Traversal | Only the data/ directory is accessible |
| 🔑 API Key | Stored in Streamlit Secrets, never in code or GitHub |
| ⏱️ Rate Limiting | Max 8 queries / session on live demo |
| 👁️ Read-Only | All queries enforced as SELECT only |
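
Two of these protections, the SELECT-only rule and the `data/` restriction, can be sketched in a few lines. This is an illustrative guard, not the project's implementation: `is_safe_select` and `resolve_db_path` are hypothetical names, and the keyword check is deliberately strict (it also rejects a blocked word inside a string literal, and queries that start with `WITH`).

```python
import re
from pathlib import Path

BLOCKED = ("DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE")
_BLOCKED_RE = re.compile(r"\b(" + "|".join(BLOCKED) + r")\b", re.IGNORECASE)


def is_safe_select(sql: str) -> bool:
    """Accept a single SELECT statement that contains no blocked keyword."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not statement.upper().startswith("SELECT"):
        return False
    return _BLOCKED_RE.search(statement) is None


def resolve_db_path(filename: str, base_dir: Path = Path("data")) -> Path:
    """Resolve filename inside base_dir; refuse anything that escapes it."""
    base = base_dir.resolve()
    candidate = (base / filename).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{filename!r} is outside {base_dir}/")
    return candidate


print(is_safe_select("SELECT name FROM customers;"))    # allowed
print(is_safe_select("SELECT 1; DELETE FROM orders"))   # rejected: two statements
print(resolve_db_path("ecommerce.db").name)
```

A keyword filter is only one layer. Opening the database read-only, for example with `sqlite3.connect("file:data/ecommerce.db?mode=ro", uri=True)`, makes SQLite itself refuse writes even if a query slips past the filter.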

🤝 Contributing

git checkout -b feature/your-feature
git commit -m "feat: your change"
git push origin feature/your-feature
# Open a Pull Request ↗

📄 License

MIT License — see LICENSE for details.


👤 Ishan Gupta


Found this useful? Give it a ⭐ — it helps a lot!



Built with ❤️ and ⚡ · Ishan Gupta · 2026

About

Natural language to SQL system with RAG pipeline — 100% benchmark accuracy, Groq Llama 3.3 70B, Streamlit Cloud
