This repository contains a fully local Retrieval-Augmented Generation (RAG) chatbot that indexes PDFs and plain-text (.txt) files, retrieves relevant context, and generates grounded answers. Because it runs entirely on your machine, with no cloud services or external APIs, it’s well-suited for private data and air-gapped environments.
RAG-LangChain is a practical framework that combines Retrieval-Augmented Generation (RAG) with LangChain to build applications that are more accurate, context-aware, and reliable. Instead of relying solely on a model’s pre-trained knowledge, it retrieves relevant information from external sources—such as documents, databases, and APIs—at query time and then generates an answer. This retrieval-first workflow improves accuracy, reduces hallucinations, and makes responses easier to trace and explain.
- Documents are preprocessed, chunked, and embedded with an embedding model. The resulting vectors capture the semantic meaning of the text.
- When a user asks a question, the question is embedded into the same vector space and the most relevant chunks are retrieved by semantic similarity.
- The retrieved passages are appended to the prompt, augmenting the user’s query with additional context for the model.
- The large language model (LLM) then generates a response grounded in both the user’s question and the retrieved context.
(images from https://python.langchain.com/docs/tutorials/rag/)
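The sketch below walks through these four steps end to end. It is a minimal illustration assuming the langchain-huggingface and FAISS integrations are installed; the embedding model, chunk sizes, and file path are placeholders, not this project's actual defaults.

```python
# Minimal sketch of the retrieve-then-generate flow described above.
# Model name, chunk sizes, and file path are illustrative placeholders.
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Preprocess and chunk the documents.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(open("documents/example.txt").read())

# 2. Embed each chunk and build a vector index.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(chunks, embeddings)

# 3. Embed the question into the same vector space and retrieve similar chunks.
question = "What does the document say about X?"
hits = store.similarity_search(question, k=4)

# 4. Augment the prompt with the retrieved passages before calling the LLM.
context = "\n\n".join(doc.page_content for doc in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```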
- ~8 GB of VRAM recommended for a ~3B-parameter LLM (24 GB+ for larger models)
- GPU acceleration
- Python 3.11+
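Once the dependencies below are installed, a quick sanity check like this sketch (which assumes PyTorch is available) can confirm the Python version and report the visible GPU and its VRAM:

```python
# Optional environment check: verifies the Python version and reports
# GPU/VRAM visibility. Assumes PyTorch is installed.
import sys
import torch

assert sys.version_info >= (3, 11), "Python 3.11+ is required"
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 2**30:.1f} GiB")
else:
    print("No CUDA GPU detected; inference will be CPU-only and slow.")
```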
git clone https://github.com/gyb357/langchain_chatbot
pip install -r requirements.txt
Browse Hugging Face to find and download models appropriate for your task.
🤗 Hugging Face: https://huggingface.co/models
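If you prefer to fetch models ahead of time, huggingface_hub can download them into the local cache. The repo IDs below are illustrative examples, not project defaults:

```python
# Pre-download example models with huggingface_hub (repo IDs are
# placeholders; pick models suited to your task and hardware).
from huggingface_hub import snapshot_download

snapshot_download(repo_id="sentence-transformers/all-MiniLM-L6-v2")  # embedding model
snapshot_download(repo_id="meta-llama/Llama-3.2-3B-Instruct")        # ~3B LLM (gated repo)
```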
This RAG system runs locally, and both the embedding model and LLM are configured via config.yaml.
example:
  embed_model: "Embed model"
  llm: "LLM name"
  rag_prompt: "Your custom prompt"
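For illustration, a config like this could be read with PyYAML. The top-level `example` key and field names simply mirror the sample block above and are assumptions about the schema, not necessarily what main.py expects:

```python
# Sketch of reading the config above with PyYAML. The "example" key and
# field names mirror the sample block and are assumptions about the schema.
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)["example"]

embed_model_name = cfg["embed_model"]
llm_name = cfg["llm"]
rag_prompt = cfg["rag_prompt"]
```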
Place your source files (PDFs and plain-text .txt files) into the documents/ folder.
The system will automatically index these files, split them into chunks, and build embeddings for efficient retrieval.
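As a sketch of that indexing step, langchain-community's directory loaders can pull in both file types. The loader classes here are one reasonable choice, not necessarily what this repository's code uses:

```python
# Load every PDF and .txt file under documents/ with langchain-community
# loaders. Loader choice is an assumption; the repo's code may differ.
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader

pdf_docs = DirectoryLoader("documents/", glob="**/*.pdf", loader_cls=PyPDFLoader).load()
txt_docs = DirectoryLoader("documents/", glob="**/*.txt", loader_cls=TextLoader).load()
docs = pdf_docs + txt_docs  # ready for chunking and embedding
```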
📝 Note:
This project is designed to run inside an `ipykernel` environment.
To start the chatbot, execute `main.py` manually instead of relying on an auto-run script.
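For example, from the repository root:
python main.py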

