A powerful Retrieval-Augmented Generation (RAG) system for processing multiple PDFs and delivering accurate, context-aware responses to user queries. Built in Python, it features GPU-accelerated embeddings, a semantic search engine, and the Mistral model via Ollama, wrapped in an intuitive Streamlit interface.
- 📄 Multi-PDF Support – Upload and process several PDFs at once.
- ⚡ GPU-Accelerated Embeddings – Leverage NVIDIA GPUs for fast processing.
- 🔎 Semantic Search – Uses FAISS for efficient context retrieval with source tracking.
- 🌐 Streamlit Interface – Clean and interactive web UI for document upload, chat, and performance monitoring.
- 🧩 Highly Configurable – Adjust chunk size, overlap, batch size, and worker threads.
- 📊 Real-time Metrics – Monitor chunking speed, memory usage, and system stats.
- 💬 Chat History Export – Save and download all chat sessions in JSON format.
| Field | Details |
|---|---|
| Repo Name | Advanced-RAG-Chatbot |
| License | MIT |
| Language | Python |
| Model | Mistral via Ollama |
| Embeddings | intfloat/e5-small (HuggingFace) |
| Interface | Streamlit |
| Vector Store | FAISS |
| PDF Processor | PyMuPDF |
| Status | 🚧 Actively maintained |
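The embedding stack above (`intfloat/e5-small` on an NVIDIA GPU) can be sketched roughly as follows. This is an illustrative snippet, not the repo's exact code; the `passage:` prefix, mean pooling, and normalization are assumptions based on how e5-family models are typically used:

```python
# Illustrative sketch of GPU-accelerated e5-small embeddings
# (assumptions about pooling/prefixes, not the repo's actual code)
import torch
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-small")
model = AutoModel.from_pretrained("intfloat/e5-small").to(device).eval()

def embed(texts):
    # e5 models expect a "passage: " / "query: " prefix on their inputs
    batch = tokenizer(
        [f"passage: {t}" for t in texts],
        padding=True, truncation=True, max_length=512, return_tensors="pt",
    ).to(device)
    with torch.no_grad():
        out = model(**batch)
    # Mean-pool token embeddings, ignoring padding, then L2-normalize
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (out.last_hidden_state * mask).sum(1) / mask.sum(1)
    return torch.nn.functional.normalize(pooled, p=2, dim=1).cpu().numpy()
```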
| Type | Minimum | Recommended |
|---|---|---|
| CPU | 4-core | 8-core |
| RAM | 8 GB | 16 GB |
| Storage | 10 GB | 20 GB |
| GPU (Optional) | ❌ | ✅ NVIDIA (GTX 1060 or higher) |
- Python 3.8+
- Git
- CUDA Toolkit 11.2+ (for GPU acceleration)
- Ollama (≥ 0.1.0)
```bash
# Download Python from https://python.org
python --version  # ✅ Ensure it's ≥ 3.8
```

```bash
# Download from https://git-scm.com
git --version
```

```bash
git clone https://github.com/milind899/Advanced-RAG-Chatbot.git
cd Advanced-RAG-Chatbot
```

```bash
python -m venv venv

# Windows:
venv\Scripts\activate

# macOS/Linux:
source venv/bin/activate
```

```bash
pip install -r requirements.txt
```

Key dependencies:

- `streamlit` – Frontend UI
- `langchain` – RAG orchestration
- `torch` – Model + CUDA acceleration
- `transformers` – Embedding generation
- `faiss-cpu` – Vector similarity search
- `PyMuPDF` – PDF parser
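Since PyMuPDF handles PDF parsing, text extraction in the backend presumably looks something like this minimal sketch (the filename is a placeholder):

```python
# Minimal PyMuPDF text extraction sketch (illustrative; "sample.pdf" is a placeholder)
import fitz  # PyMuPDF's import name

with fitz.open("sample.pdf") as doc:
    pages = [page.get_text() for page in doc]
full_text = "\n".join(pages)
```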
```bash
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull mistral
ollama serve
```

Verify the Ollama server is reachable:

```bash
curl http://localhost:11434/api/tags
```

```bash
# Verify CUDA
nvidia-smi

# Enable GPU
unset OLLAMA_NO_CUDA

# Test with PyTorch
python -c "import torch; print(torch.cuda.is_available())"
```

```bash
streamlit run app.py
```

🔗 Open your browser at: http://localhost:8501
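Under the hood, once `ollama serve` is running, the backend can reach Mistral over Ollama's local HTTP API. A minimal sketch of such a call (the prompt text is just an example):

```python
# Sketch: one-shot generation against a local Ollama server
# (assumes `ollama serve` is running and `mistral` has been pulled)
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Summarize the uploaded document.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])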
1. Open the app in your browser.
2. Drag-and-drop or browse to upload PDFs (see the uploader sketch after this list).
3. Adjust settings from the sidebar:
   - Chunk Size (default: 500)
   - Overlap (default: 50)
   - Workers (default: CPU count)
   - Batch Size (default: 100)
4. Click Process Documents.
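The upload step in `app.py` plausibly relies on Streamlit's multi-file uploader; a hedged sketch (widget labels and the processing step are assumptions):

```python
# Sketch of a multi-PDF upload flow in Streamlit (labels are illustrative)
import streamlit as st

uploaded = st.file_uploader("Upload PDFs", type=["pdf"], accept_multiple_files=True)
if uploaded and st.button("Process Documents"):
    for f in uploaded:
        data = f.read()  # raw bytes, usable via fitz.open(stream=data, filetype="pdf")
        st.write(f"Queued {f.name} ({len(data)} bytes)")
```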
- Go to the Chat with Your Documents section.
- Type a question and hit Send.
- View responses with source links.
- Review chat history at the bottom.
- Export your chats via the Export JSON button.
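The exact export schema isn't documented here, but a chat-history export like the one behind the Export JSON button could be as simple as the following (all field names are hypothetical):

```python
# Hypothetical chat-history export; the real schema may differ
import json

chat_history = [
    {"question": "What is chapter 2 about?",
     "answer": "…",
     "sources": ["report.pdf, p. 14"]},
]
with open("chat_history.json", "w", encoding="utf-8") as f:
    json.dump(chat_history, f, indent=2, ensure_ascii=False)
```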
- View chunks/sec and total processing time under Performance Metrics (a back-of-the-envelope version is sketched after this list).
- The sidebar shows GPU and system usage.
- Troubleshoot with real-time logs.
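Chunks/sec is straightforward to compute; a self-contained sketch using the default 500-character chunks with 50-character overlap (the corpus is a stand-in):

```python
# Illustrative chunks/sec measurement (stand-in corpus, not the app's pipeline)
import time

docs = ["lorem ipsum " * 200] * 50  # stand-in documents
start = time.perf_counter()
# 500-char chunks stepped by 450 chars → 50-char overlap
chunks = [d[i:i + 500] for d in docs for i in range(0, len(d), 450)]
elapsed = time.perf_counter() - start
print(f"{len(chunks)} chunks in {elapsed:.4f}s → {len(chunks) / elapsed:,.0f} chunks/sec")
```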
```
Advanced-RAG-Chatbot/
├── app.py              # Streamlit UI
├── rag_backend.py      # Core RAG logic
├── requirements.txt    # Dependencies
├── LICENSE             # MIT License
├── faiss_index/        # Generated vector store
└── embedding_cache/    # Cached embeddings
```
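The `faiss_index/` folder holds the generated vector store. Building, persisting, and querying such an index with raw FAISS looks roughly like this (paths and data are illustrative; `intfloat/e5-small` produces 384-dimensional vectors):

```python
# Sketch: build, persist, and query a FAISS index (paths/data illustrative)
import os
import faiss
import numpy as np

dim = 384  # intfloat/e5-small embedding size
vectors = np.random.rand(1000, dim).astype("float32")  # stand-in for real embeddings
faiss.normalize_L2(vectors)

index = faiss.IndexFlatIP(dim)  # inner product ≡ cosine on normalized vectors
index.add(vectors)
os.makedirs("faiss_index", exist_ok=True)
faiss.write_index(index, "faiss_index/index.faiss")

# Later: reload and fetch the 5 nearest chunks to a query vector
index = faiss.read_index("faiss_index/index.faiss")
scores, ids = index.search(vectors[:1], 5)
```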
| Parameter | Description | Default |
|---|---|---|
| `chunk_size` | Text chunk size | 500 |
| `chunk_overlap` | Overlap between chunks | 50 |
| `max_workers` | Concurrent threads | CPU count |
| `batch_size` | Documents per embedding batch | 100 |
| `embedding_model` | HuggingFace model | intfloat/e5-small |
| `cache_embeddings` | Use embedding cache | True |
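These knobs map naturally onto a LangChain-style splitter plus a batched embedding loop. A hedged sketch of how the defaults might be applied (how `rag_backend.py` actually wires them is an assumption; newer LangChain versions move the splitter into the `langchain_text_splitters` package):

```python
# Sketch: applying the configuration defaults with LangChain's text splitter
import os
from langchain.text_splitter import RecursiveCharacterTextSplitter

full_text = "example document text " * 500  # stand-in for extracted PDF text

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(full_text)

# Batch the chunks for embedding; workers default to the CPU count
batch_size = 100
max_workers = os.cpu_count()
batches = [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]
print(f"{len(chunks)} chunks → {len(batches)} batches across {max_workers} workers")
```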
| Problem | Fix |
|---|---|
| ❌ Ollama not responding | Ensure `ollama serve` is running and the Mistral model is pulled. |
| ❌ CUDA not available | Install the CUDA Toolkit, check `nvidia-smi`, and enable the GPU with `unset OLLAMA_NO_CUDA`. |
| 💥 Memory crash | Reduce `batch_size` or `chunk_size`, and clear the cache folders (see the snippet below the table). |
| 🐢 Slow performance | Enable the GPU, increase `max_workers`, or tune the chunking strategy. |
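For the memory-crash fix, the generated stores listed in the project structure can be cleared directly; they are regenerated the next time documents are processed:

```python
# Clear generated stores from the project layout (safe: regenerated on next run)
import shutil

shutil.rmtree("faiss_index", ignore_errors=True)
shutil.rmtree("embedding_cache", ignore_errors=True)
```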
- 🔧 Use the Help & Tips section in the app.
- 🐛 File issues or suggestions on GitHub Issues.
- 📄 Refer to `rag_backend.py` for logic and model details.
This project is licensed under the MIT License. See LICENSE for more information.