┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ "Which customer placed the most orders last month?" │
│ ↓ │
│ SELECT name, COUNT(*) FROM customers JOIN orders ... LIMIT 1; │
│ ↓ │
│ "Alice placed the most orders with 12 orders last month." │
│ │
│ ⚡ All in ~2 seconds · 100% benchmark accuracy │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Can't see the GIF? Watch live → nl2sql-rag.streamlit.app
|
Benchmark Accuracy 25 / 25 Questions |
Average Query Time SQL + NL Answer |
Tables Tested 200 orders · 50 customers |
| Category | Score | Status |
|---|---|---|
| 🔵 Simple SELECT | 5 / 5 | ✅ |
| 🟣 Filter (WHERE) | 5 / 5 | ✅ |
| 🟠 Aggregation (SUM · AVG · COUNT) | 5 / 5 | ✅ |
| 🔴 JOIN (multi-table) | 5 / 5 | ✅ |
| 🟡 Date Filter | 5 / 5 | ✅ |
| ⚡ Overall | 25 / 25 | 🏆 100% |
|
|
|
|
nl2sql-rag-system/
│
├── 🚀 streamlit.py # Main app — standalone, Cloud ready
├── ⚙️ api.py # FastAPI backend (local/Docker only)
├── 📋 requirements.txt
├── 🔒 .gitignore
│
├── .streamlit/
│ └── config.toml # Dark theme config
│
├── assets/
│ └── demo.gif # Demo recording
│
├── data/
│ ├── ecommerce.db # Demo SQLite database
│ └── .gitkeep
│
├── docs/
│ └── business_rules.txt # Domain context for RAG
│
├── llm/
│ └── sql_generator.py # Groq — SQL generation + correction
│
├── nlg/
│ └── answer_generator.py # Groq — rows → plain English
│
├── rag/
│ ├── ingest.py # Build FAISS vectorstore
│ └── retriever.py # Retrieve context per question
│
└── eval/
├── benchmark.py # 25-question benchmark
├── questions.json # Test questions
└── report.py # Benchmark report
| 💬 You Ask | 🤖 AI Answers |
|---|---|
What is the total revenue? |
"The total revenue is $70,071.24." |
Which customer placed the most orders? |
"Alice placed the most orders with 12 orders." |
List premium customers from the North |
Returns matching table |
How many orders last month? |
"There were 18 orders placed last month." |
Top 5 customers by total spend |
Returns ranked table |
Average order value? |
"The average order value is $350.36." |
Get your own live instance in under 5 minutes, for free.
1 — Fork this repo
2 — Go to share.streamlit.io
- Create app → select your fork → main file:
streamlit.py
3 — Add API key in Secrets
GROQ_API_KEY = "gsk_your_key_here"Free key at console.groq.com — no credit card.
4 — Deploy 🎉 · Live in ~2 minutes.
# Clone
git clone https://github.com/IshanGupta09/nl2sql-rag-system.git
cd nl2sql-rag-system
# Setup
python -m venv venv && source venv/bin/activate # Mac/Linux
python -m venv venv && venv\Scripts\activate # Windows
pip install -r requirements.txt
echo "GROQ_API_KEY=gsk_your_key_here" > .env
# First time only
python rag/ingest.py
# Run
streamlit run streamlit.pypython eval/report.py ✅ Simple SELECT 5/5
✅ Filter 5/5
✅ Aggregation 5/5
✅ JOIN 5/5
✅ Date Filter 5/5
───────────────────────────
🏆 Overall 25/25 100%
⚡ Avg time 1.94s
| 🛡️ | Protection | Details |
|---|---|---|
| 🚫 | SQL Injection | Blocks DROP DELETE UPDATE INSERT ALTER TRUNCATE |
| 📁 | Path Traversal | Only data/ directory is accessible |
| 🔑 | API Key | Streamlit Secrets — never in code or GitHub |
| ⏱️ | Rate Limiting | Max 8 queries / session on live demo |
| 👁️ | Read-Only | All queries enforced as SELECT only |
git checkout -b feature/your-feature
git commit -m "feat: your change"
git push origin feature/your-feature
# Open a Pull Request ↗MIT License — see LICENSE for details.