This repository contains a full-stack Multimodal Retrieval-Augmented Generation (RAG) application powered by FastAPI, PostgreSQL/pgvector, and Google Gemini.
- Document Ingestion: Processes PDF files, extracting text (and, in principle, image) chunks.
- Vector Search: Converts text into embeddings (via `gemini-embedding-001`) and performs fast similarity searches using `pgvector`.
- Chat API: Uses context retrieved from vector searches to answer user queries with `gemini-2.5-flash`.
- Frontend UI: Interactive chat and document uploading, served directly from `index.html`.
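The ingestion stage above can be sketched as a fixed-size chunker with overlap (a minimal illustration only; the repository's actual chunking logic, function names, and parameters may differ):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so retrieval preserves local context.

    chunk_size and overlap are measured in characters; these defaults are
    illustrative, not the project's actual settings.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks

# Each chunk would then be embedded with gemini-embedding-001 and stored
# in a pgvector column for similarity search.
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.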
Multimodel.RAG.mp4
- Python 3.9+
- Docker and Docker Compose (to run the PostgreSQL + pgvector instance)
- A `.env` file containing your valid Google Gemini API key: `GEMINI_API_KEY=your_api_key_here`
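If you prefer not to add a dependency such as `python-dotenv`, a minimal stdlib-only loader for the `.env` file could look like this (a sketch; the project itself may load the key differently):

```python
import os

def load_env(path: str = ".env") -> None:
    """Read KEY=VALUE lines from a .env file into os.environ.

    Skips blank lines and comments; already-set environment variables win.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

# Usage (assumes a .env file in the working directory):
# load_env()
# api_key = os.environ["GEMINI_API_KEY"]
```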
- Install Python Dependencies: Make sure you use a virtual environment (`venv`).

  ```bash
  pip install -r requirements.txt
  ```
- Start the Database: Use the included Docker configuration to boot a pgvector-enabled PostgreSQL container.

  ```bash
  docker-compose up -d
  ```
- Start the FastAPI Backend: Run the backend locally using `uvicorn`.

  ```bash
  uvicorn main:app --reload
  ```
- Access the Frontend: Open `index.html` in your web browser, or serve it via your preferred web server (e.g. VS Code's Live Server extension).
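Once the stack is running, the retrieval step typically reduces to a single SQL query using pgvector's distance operators. A hypothetical query builder (the `chunks` table and `embedding` column are assumptions for illustration, not the repository's actual schema):

```python
def build_similarity_query(table: str = "chunks", top_k: int = 5) -> str:
    """Return a parameterized pgvector similarity query.

    `<=>` is pgvector's cosine-distance operator; the query embedding is
    passed separately as a bind parameter (%s) to avoid SQL injection.
    """
    if not table.isidentifier():
        raise ValueError("invalid table name")
    return (
        f"SELECT id, content, embedding <=> %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT {int(top_k)}"
    )
```

The returned string would be executed with, e.g., `cursor.execute(query, (embedding,))`, where `embedding` is the query vector produced by `gemini-embedding-001`.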
When the repository grows, consider organizing it into a logical structure:

- `/src` - Core backend logic such as `main.py`, `ingest.py`, etc.
- `/frontend` - UI files such as `index.html` and static assets.
- `/tests` - Validation scripts such as `test_dim.py`.
Note: User-supplied PDFs and extracted images are excluded from version control by default via `.gitignore`.
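The relevant `.gitignore` entries might look like this (the directory names below are hypothetical; match them to wherever your ingestion code actually writes uploads and extracted images):

```gitignore
# user-supplied documents and extracted images (hypothetical paths)
uploads/
extracted_images/
*.pdf

# local secrets
.env
```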