PaperLens Multimodal RAG Bot is an AI-powered chatbot supporting text and image-based interactions using FastAPI, React, Ollama, Qdrant, and Llama 3.2 Vision. It offers:
- ✅ No-RAG Chat (Regular Chat Mode)
- ✅ File-Based RAG Chat (RAG Mode with Uploaded Files)
- ✅ Search Everything RAG Chat (Retrieval from All Sources)
- ✅ Admin Panel (User Management & Settings)
- ✅ JWT Authentication with SQLite
- ✅ MongoDB for Chat Data Storage
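The three chat modes above differ mainly in what, if anything, is retrieved before the model is prompted. A minimal sketch of that dispatch follows; the names `ChatMode` and `retrieval_scope` are hypothetical and are not taken from the project's code.

```python
from enum import Enum
from typing import Optional


class ChatMode(str, Enum):
    """Hypothetical identifiers for the three chat modes listed above."""
    NO_RAG = "no_rag"          # regular chat, no retrieval
    FILE_RAG = "file_rag"      # retrieval restricted to uploaded files
    SEARCH_ALL = "search_all"  # retrieval across all sources


def retrieval_scope(mode: ChatMode, file_ids: Optional[list[str]] = None) -> Optional[dict]:
    """Describe what to retrieve from, or return None for plain chat."""
    if mode is ChatMode.NO_RAG:
        return None  # the prompt goes to the model without retrieved context
    if mode is ChatMode.FILE_RAG:
        if not file_ids:
            raise ValueError("file-based RAG needs at least one uploaded file")
        return {"scope": "files", "file_ids": list(file_ids)}
    return {"scope": "all"}  # search everything: no file filter applied
```

In a real backend the returned scope would presumably be turned into a vector-store filter; that step is left out here because it depends on how the collections are laid out.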

Tech stack:
- Backend: FastAPI, Ollama, Qdrant, Llama 3.2 Vision
- Frontend: React.js
- Database: MongoDB (Chat Data), SQLite (Auth)
- Auth: JWT-Based Authentication
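To illustrate what JWT-based authentication involves, here is a standard-library-only sketch of signing and verifying an HS256 token. It is not the project's implementation, which may well use a dedicated JWT library; the function names `create_token` and `verify_token` and the secret value are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped before decoding
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: bytes, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest()


def create_token(payload: dict, secret: str) -> str:
    """Build header.payload.signature for an HS256-signed token."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input.encode('ascii'), secret))}"


def verify_token(token: str, secret: str) -> dict:
    """Return the payload if the signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}".encode("ascii"), secret)
    # compare_digest avoids leaking timing information about the signature
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if "exp" in payload and payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload
```

A login route would issue `create_token({"sub": username, "exp": time.time() + 3600}, secret)` and protected routes would call `verify_token` on the bearer token. For production use, a maintained JWT library is the safer choice.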
📦 multimodal-rag-bot
├── backend/ # FastAPI Backend
│ ├── main.py # API Endpoints
│ ├── auth/ # JWT Authentication
│ ├── models/ # SQLite & MongoDB Setup
│ ├── rag_modules/ # RAG and bot Logic
│ ├── schemas/ # Pydantic Models
│ ├── routes/ # API Routes
│ ├── tests/ # Unit tests
│ ├── requirements.txt # Python Dependencies
│ └── services/ # Service Logic
│
├── frontend/ # React.js Frontend
│ ├── src/ # Components & Pages
│ │ ├── components/ # UI Components
│ │ └── pages/ # Page Layouts
│ ├── package.json # Frontend Dependencies
│ └── App.js # Main App Entry
└── README.md # Documentation
git clone https://github.com/Rahul2991/PaperLens_Multimodal_RAG.git
cd PaperLens_Multimodal_RAG

Ensure you have:
- Python 3.11.10 installed in a conda environment
- Docker installed (available from the official Docker website)
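The prerequisites above can be checked with a short script before installing anything. This is a sketch only: `check_prerequisites` is a hypothetical helper, and it compares just the major and minor Python version rather than the exact 3.11.10 patch release.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 11)  # the README asks for 3.11.10; only major.minor is checked


def check_prerequisites(version_info=sys.version_info, which=shutil.which) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if tuple(version_info[:2]) != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in version_info[:3])
        problems.append(f"Python 3.11 expected, found {found}")
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Prerequisites look fine.")
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the real environment.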
cd backend
pip install -r requirements.txt
conda install -c conda-forge tesseract==5.5.0
conda install -c conda-forge poppler==24.12.0

Follow the official PyTorch installation guide to install the appropriate version based on your system and CUDA setup.
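After the backend installation steps, it can help to confirm that the native tools and PyTorch are actually visible from the active conda environment. The sketch below assumes the usual executable names (`tesseract` for Tesseract, `pdftoppm` as one of the Poppler utilities) and the import name `torch`; `missing_dependencies` is a hypothetical helper.

```python
import importlib.util
import shutil

# (label, executable) pairs; pdftoppm ships with Poppler
NATIVE_TOOLS = (("Tesseract", "tesseract"), ("Poppler", "pdftoppm"))


def missing_dependencies(which=shutil.which, find_spec=importlib.util.find_spec) -> list[str]:
    """Return labels of dependencies that cannot be found in this environment."""
    missing = [label for label, exe in NATIVE_TOOLS if which(exe) is None]
    # find_spec returns None when the top-level package is not installed
    if find_spec("torch") is None:
        missing.append("PyTorch")
    return missing


if __name__ == "__main__":
    print("Missing:", ", ".join(missing_dependencies()) or "nothing")
```

This only checks presence, not versions, so it will not catch a Tesseract or Poppler build that differs from the pinned ones.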
Download and install Ollama from the official website. After installation, pull the required model:
ollama pull llama3.2-vision

Follow the official MongoDB installation guide.
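To confirm that the `llama3.2-vision` pull succeeded, the local Ollama server can be asked which models it has. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/tags` listing endpoint; `has_model` and `fetch_tags` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumption: default Ollama address


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from an /api/tags style response."""
    return [model.get("name", "") for model in tags_response.get("models", [])]


def has_model(tags_response: dict, wanted: str = "llama3.2-vision") -> bool:
    # Listed names carry a tag suffix such as ":latest", so compare the part before it
    return any(name.split(":", 1)[0] == wanted for name in model_names(tags_response))


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Query the running Ollama server; raises URLError if it is not reachable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print("llama3.2-vision available:", has_model(fetch_tags()))
```

Running `ollama list` in a terminal gives the same information without any code.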
cd frontend
npm install

cd backend
uvicorn main:app --reload

docker run -p 6333:6333 -p 6334:6334 -v "${PWD}/qdrant_storage:/qdrant/storage" qdrant/qdrant

cd frontend
npm start

- React App: http://localhost:3000
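Once the backend, Qdrant, and the frontend have been started, a quick port probe shows whether each one is listening. This is a sketch under assumptions: ports 6333, 6334, and 3000 come from the commands and URL above, while 8000 is uvicorn's default and is assumed here because the README does not state the backend port.

```python
import socket

SERVICES = {
    "Qdrant REST API": 6333,   # first port published by the docker run command
    "Qdrant gRPC": 6334,       # second port published by the docker run command
    "React dev server": 3000,  # from the React App URL
    "FastAPI backend": 8000,   # assumption: uvicorn default, not stated in the README
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_status(host: str = "localhost", services: dict = SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in service_status().items():
        print(f"{name}: {'up' if up else 'down'}")
```

An open port only shows that something is listening; it does not prove the service behind it is healthy.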
- Users: Can chat in different modes and upload files.
- Admins: Can chat in different modes, manage users and upload files.
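The two roles above can be expressed as a small permission table. The sketch below is illustrative only; the role and action names (`"user"`, `"admin"`, `"chat"`, `"upload_files"`, `"manage_users"`) are hypothetical and may not match the identifiers used in the project.

```python
# Actions each role may perform, following the role descriptions above
PERMISSIONS = {
    "user": frozenset({"chat", "upload_files"}),
    "admin": frozenset({"chat", "upload_files", "manage_users"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role may perform the action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, frozenset())
```

A route guard would typically read the role from the verified token and call `is_allowed(role, "manage_users")` before serving admin endpoints.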
This project is licensed under the MIT License - see the LICENSE file for details.