A lightweight RAG pipeline built with ChromaDB, HuggingFace Embeddings, and OpenAI GPT. Ask questions about your own documents and get accurate, context-grounded answers.
Your Document (.txt / .pdf)
↓
[Loader] → [Chunker] → [Embedder] → [ChromaDB Vectorstore]
↓
User Question → [Retriever]
↓
[Generator] → Answer
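The chunk-and-retrieve stages above can be sketched in plain Python. This is only an illustration — a word-overlap score stands in for the real HuggingFace embedding similarity, and the function names are not the project's actual API:

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (stand-in for chunker.py)."""
    chunks, start = [], 0
    step = size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        start += step
    return chunks

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by how many words they share with the question.
    The real pipeline ranks by embedding similarity inside ChromaDB instead."""
    q_words = set(question.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return ranked[:k]

doc = "Mixture of Experts (MoE) routes each token to a few expert networks. " * 5
top_chunks = retrieve("What is MoE?", chunk_text(doc))
```

In the actual pipeline, the top chunks are then passed to the generator as context for the OpenAI GPT call.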
RAG/
├── data/ # ← Put YOUR documents here (.txt or .pdf)
├── src/
│ ├── api.py # FastAPI backend
│ ├── loader.py # Document loading (.txt, .pdf)
│ ├── chunker.py # Text splitting into chunks
│ ├── embedder.py # HuggingFace embeddings + ChromaDB
│ ├── retriever.py # Similarity search
│ ├── generator.py # OpenAI GPT answer generation
│ ├── ui.py # Streamlit frontend
│ └── main.py # CLI entry point (for testing)
├── tests/
├── .env # Your API keys (create this yourself)
├── .gitignore
└── requirements.txt
git clone https://github.com/your-username/RAG.git
cd RAG

python3 -m venv venv
source venv/bin/activate # macOS / Linux
venv\Scripts\activate     # Windows

pip install -r requirements.txt

Create a `.env` file in the project root:
OPENAI_API_KEY=sk-...your-key-here...
⚠️ Never commit your `.env` file. It is already listed in `.gitignore`.
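At startup the backend reads this key from the environment. The project lists python-dotenv, so the real code presumably calls load_dotenv() first; the sketch below uses only the standard library to show what is read afterwards:

```python
import os

# Once load_dotenv() has populated the environment from .env,
# the key is available like any other environment variable.
api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key:
    print("warning: OPENAI_API_KEY is not set; create a .env file first")
```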
Place your .txt or .pdf files inside the data/ folder:
data/
└── your_document.txt
Then update the file path in src/api.py (line 9):
docs = load_documents("/absolute/path/to/your/RAG/data/your_document.txt")

cd src
uvicorn api:app --reload

API will be available at: http://127.0.0.1:8000
Swagger UI (for testing): http://127.0.0.1:8000/docs
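To avoid editing an absolute path on every machine, one option is to resolve data/ relative to the module instead. This is a sketch, assuming api.py stays in src/; load_documents is the project's own loader from loader.py:

```python
from pathlib import Path

# src/api.py — resolve ../data relative to this file, so the path
# works no matter which directory the server is started from.
DATA_DIR = Path(__file__).resolve().parent.parent / "data"
DOC_PATH = DATA_DIR / "your_document.txt"
# docs = load_documents(str(DOC_PATH))
```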
# From the project root
streamlit run src/ui.py

UI will be available at: http://localhost:8501
- Open the Streamlit UI at http://localhost:8501
- Type your question in the chat input
- The system retrieves relevant chunks from your document and generates an answer
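The same endpoint can also be called from a script. A minimal sketch using only the standard library — the payload fields match the POST /ask request body shown below, and the helper names here are illustrative, not part of the project:

```python
import json
import urllib.request

def build_payload(question: str, history: str = "") -> bytes:
    """Encode the /ask request body; history is optional per the API."""
    return json.dumps({"question": question, "history": history}).encode("utf-8")

def ask(question: str, base_url: str = "http://127.0.0.1:8000") -> str:
    """POST the question to the running FastAPI backend and return the raw reply."""
    req = urllib.request.Request(
        f"{base_url}/ask",
        data=build_payload(question),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

# ask("What is MoE?")  # requires the uvicorn server from the previous step
```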
Or test directly via Swagger UI at http://127.0.0.1:8000/docs → POST /ask:
{
"question": "What is MoE?",
"history": ""
}

| Component | Technology |
|---|---|
| Backend | FastAPI |
| Frontend | Streamlit |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Vector Store | ChromaDB (in-memory) |
| LLM | OpenAI GPT (via langchain-openai) |
| Doc Loading | LangChain TextLoader / PyPDFLoader |
See requirements.txt. Key dependencies:
fastapi
uvicorn
streamlit
langchain
langchain-community
langchain-openai
chromadb
sentence-transformers
python-dotenv
pypdf
- The vectorstore is in-memory — it rebuilds on every server restart.
- Only one document is loaded at a time. To use a different document, change the path in `api.py` and restart the server.
- The `history` field in the API request is optional — pass an empty string if not needed.
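If you do want to index every file in data/ instead of one, a possible extension (hypothetical — find_documents is not part of the project) is to glob the folder and feed each match to the loader:

```python
from pathlib import Path

def find_documents(data_dir: str) -> list[str]:
    """Collect every .txt and .pdf directly under data/, in a stable order."""
    return sorted(
        str(p) for p in Path(data_dir).iterdir()
        if p.suffix.lower() in {".txt", ".pdf"}
    )

# for path in find_documents("data"):
#     docs.extend(load_documents(path))  # load_documents is the project's loader
```

The vectorstore would still rebuild on every restart, so indexing many large files this way will slow down startup.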
MIT License — feel free to use and modify.