DocChat is a fully local, private, offline PDF chatbot built with RAG (Retrieval Augmented Generation). Upload one or more PDFs and ask questions — only the relevant parts of the document are sent to the model, not the whole thing.
No API keys. No data leaves your machine.
Instead of dumping the entire PDF into the prompt, DocChat uses a two-step RAG pipeline:
**Indexing (on upload)**
- Extract text from PDF using PyMuPDF
- Split text into overlapping chunks
- Embed each chunk into a vector using sentence-transformers (runs locally on CPU)
- Store chunks + vectors in ChromaDB
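The indexing steps above can be sketched in a few lines of Python. This is an illustrative sketch, not DocChat's actual code — the function names, chunk size, and overlap are assumptions:

```python
def split_into_chunks(text, chunk_size=500, overlap=100):
    """Split text into overlapping character chunks (sizes are assumptions)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def index_pdf(path, collection, model):
    """Extract, chunk, embed, and store one PDF in a ChromaDB collection."""
    import fitz  # PyMuPDF; imported lazily so the chunker is usable on its own

    # 1. Extract text from every page with PyMuPDF
    text = "".join(page.get_text() for page in fitz.open(path))
    # 2. Split into overlapping chunks
    chunks = split_into_chunks(text)
    # 3. Embed each chunk locally with sentence-transformers
    vectors = model.encode(chunks).tolist()
    # 4. Store chunks + vectors, tagging each with its source file
    collection.add(
        ids=[f"{path}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=vectors,
        metadatas=[{"source": path}] * len(chunks),
    )
```

Here `model` would be `SentenceTransformer("all-MiniLM-L6-v2")` and `collection` a collection from an in-memory `chromadb.Client()`.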
**Retrieval (on each question)**
- Embed the user's question using the same model
- Query ChromaDB for the 3 most semantically similar chunks
- Send only those chunks to Ollama as context
- Return the answer
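The retrieval side can be sketched similarly. The prompt template and the use of the `ollama` Python client are assumptions for illustration:

```python
def build_prompt(question, chunks):
    """Assemble the final prompt from retrieved chunks (template is an assumption)."""
    context = "\n\n".join(chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"

def answer(question, collection, model, k=3):
    """Embed the question, fetch the top-k chunks, and ask Ollama."""
    import ollama  # local Ollama client; imported lazily for the same reason as above

    # Embed the question with the same model used at index time
    q_vec = model.encode([question]).tolist()
    # Fetch the k most semantically similar chunks from ChromaDB
    hits = collection.query(query_embeddings=q_vec, n_results=k)
    # Send only those chunks to the model as context
    reply = ollama.chat(
        model="qwen3.5:0.8b",
        messages=[{"role": "user",
                   "content": build_prompt(question, hits["documents"][0])}],
    )
    return reply["message"]["content"]
```

Because only `k` small chunks reach the prompt, the context stays bounded no matter how large the uploaded PDFs are.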
This means DocChat works on large documents without overwhelming the model's context window.
- 100% local — no API keys, no internet required after setup
- Multi-document support — upload multiple PDFs and search across all of them
- RAG pipeline built from scratch — no LangChain, no n8n
- Source tracking — knows which chunk came from which file
- Model thinking display — see the model's reasoning process
- Clean, minimal dark UI
| Component | Library |
|---|---|
| UI | Gradio |
| PDF extraction | PyMuPDF |
| Chunking | plain Python |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) |
| Vector storage | ChromaDB |
| LLM inference | Ollama (qwen3.5:0.8b) |
1. Install requirements:

   ```
   pip install -r requirements.txt
   ```

2. Start Ollama and pull the model:

   ```
   ollama serve
   ollama pull qwen3.5:0.8b
   ```

3. Run the app:

   ```
   python ui.py
   ```

4. Open http://127.0.0.1:7860 in your browser, upload a PDF, and start chatting.
```
docker build -t docchat .
docker run -p 7860:7860 --add-host=host.docker.internal:host-gateway docchat
```

Ollama must be running on your host machine.
| File | Purpose |
|---|---|
| ui.py | Gradio interface and app logic |
| extract.py | PDF extraction, chunking, embedding, ChromaDB, Ollama |
| requirements.txt | Python dependencies |
| Dockerfile | Container setup |
- Embedding runs on CPU — no GPU required
- Tested with qwen3.5:0.8b, but any Ollama model works
- ChromaDB stores chunks in memory per session — reloading the app clears the index