ChatBot

A chatbot with a RAG (Retrieval Augmented Generation) pipeline, FastAPI backend, and Streamlit frontend, all containerized with Docker.

Features

  • FastAPI Backend: Handles business logic, RAG pipeline, and communication with the Ollama LLM.
  • Streamlit Frontend: Provides a user interface for uploading documents, chatting with the RAG model, and searching the knowledge base.
  • RAG Pipeline: Uses FAISS for vector storage and all-MiniLM-L6-v2 for embeddings to retrieve relevant context for the LLM.
  • Ollama Integration: Leverages a local Ollama instance running an LLM (e.g., Llama 3).
  • Dockerized: All services (Ollama, backend, frontend) are containerized for easy setup and deployment using Docker Compose.
  • Document Upload: Supports uploading PDF, TXT, and MD files to populate the knowledge base.
  • FAISS Search: Allows direct searching of the FAISS vector store.
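
The sketch below shows the indexing side of such a pipeline: split text into overlapping chunks, embed them, and store them in FAISS. It is not the repository's langchain_pipe/rag_pipeline.py. Import paths vary between langchain versions (these assume the langchain and langchain-community packages are installed), the chunk parameters shown are placeholders for the values in app/config/config.py, and it assumes embeddings are served by the all-minilm model that the Compose entrypoint pulls into Ollama; a sentence-transformers embedding class would slot in the same way.

# Hedged sketch of the indexing step; not the repository's code.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Use http://ollama:11434 instead when running inside the Compose network.
embeddings = OllamaEmbeddings(base_url="http://localhost:11434", model="all-minilm")

def build_index(raw_text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> FAISS:
    """Split raw text into overlapping chunks and embed them into a FAISS vector store."""
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    chunks = splitter.split_text(raw_text)
    return FAISS.from_texts(chunks, embeddings)

# Persisting and reloading the index (the repo mounts faiss_index/ for this purpose):
# index = build_index(open("some_doc.txt").read())
# index.save_local("faiss_index")
# index = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)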

Project Structure

docker-compose.yml     # Docker Compose configuration
app/                   # FastAPI backend application
├── Dockerfile
├── main.py            # Main FastAPI application logic
├── requirements.txt
├── config/            # Configuration files (Ollama URL, model, etc.)
├── endpoints/         # API endpoint definitions (v1/ui_router.py)
├── langchain_pipe/    # RAG pipeline implementation (rag_pipeline.py)
└── faiss_index/       # (Mounted) Stores the FAISS vector index
frontend/              # Streamlit frontend application
├── Dockerfile
├── main.py            # Streamlit UI logic
└── requirements.txt
faiss_index/           # Local directory for FAISS index persistence (mounted into app container)
README.md              # This file
LICENSE                # Project license (if any)

Prerequisites

  • Docker Desktop installed and running.
  • Sufficient resources allocated to Docker Desktop (especially RAM for Ollama and the LLM).

Getting Started

  1. Clone the repository (if applicable) or ensure you have all project files.

  2. Build and run the services using Docker Compose:

    docker-compose up --build

    This command will:

    • Pull the Ollama image.
    • Start the Ollama service and automatically pull the llama3 and all-minilm models (this might take some time on the first run).
    • Build the Docker images for the app (backend) and frontend services.
    • Start all services.
  3. Access the services:

    • Frontend (Streamlit UI): Open your browser and go to http://localhost:8501
    • Backend (FastAPI): Accessible at http://localhost:8000 (e.g., for API testing with tools like Postman or directly from the frontend).
    • Ollama: Accessible at http://localhost:11434 (though direct interaction is usually not needed as the backend handles it).
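
To sanity-check that all three services came up, a small probe like the one below can be run on the host (it assumes the requests package is installed there). FastAPI serves its interactive docs at /docs by default unless the app disables them, and Ollama answers a short status message at its root URL.

# Quick reachability check for the services started by docker-compose.
import requests

checks = {
    "Frontend (Streamlit)": "http://localhost:8501",
    "Backend (FastAPI docs)": "http://localhost:8000/docs",
    "Ollama": "http://localhost:11434",
}

for name, url in checks.items():
    try:
        status = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {status}")
    except requests.RequestException:
        print(f"{name}: not reachable at {url}")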

Usage

  1. Open the Frontend: Navigate to http://localhost:8501 in your web browser.
  2. Upload Documents: Use the "Загрузка документа" (Document Upload) section to upload PDF, TXT, or MD files. These will be processed, and their content will be added to the FAISS knowledge base.
  3. Add Text to FAISS: Use the "Добавление текста в базу знаний (FAISS)" (Add Text to the Knowledge Base) section to manually input text and add it to the knowledge base.
  4. Search FAISS: Use the "Поиск по базе знаний (FAISS)" (Knowledge Base Search) section to perform a semantic search directly against the FAISS index and view relevant text chunks.
  5. Chat with RAG: Use the "RAG Чат" (RAG Chat) section to ask questions. The system will retrieve relevant context from the knowledge base and use the LLM to generate an answer.
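
Conceptually, the chat step retrieves the top n_retriever_docs chunks from FAISS, packs them into a prompt, and sends that prompt to the Ollama LLM. The sketch below illustrates the flow; it is not the repository's code, the prompt wording is invented, and it reuses an index like the one built in the indexing sketch above. Depending on your langchain version, the Ollama wrapper may live in langchain_community.llms or the separate langchain-ollama package.

# Conceptual sketch of the RAG chat step; names and prompt text are illustrative.
from langchain_community.llms import Ollama

llm = Ollama(base_url="http://localhost:11434", model="llama3")

def answer(question: str, index, n_retriever_docs: int = 4) -> str:
    # `index` is a FAISS vector store like the one built in the indexing sketch.
    docs = index.similarity_search(question, k=n_retriever_docs)
    context = "\n\n".join(doc.page_content for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt)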

Configuration

  • Backend Configuration (app/config/config.py), sketched after this list:
    • ollama_base_url: URL for the Ollama service (defaults to http://ollama:11434 for Docker Compose internal communication).
    • model: The LLM model to use with Ollama (e.g., llama3).
    • chunk_size, chunk_overlap: Parameters for text splitting.
    • n_retriever_docs: Number of documents to retrieve from FAISS for context.
  • Docker Compose (docker-compose.yml):
    • Service definitions, ports, volumes, and environment variables.
    • The ollama service entrypoint pulls llama3 and all-minilm models on startup.
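
The configuration file itself is not reproduced in this README. The sketch below shows one plausible shape for app/config/config.py built from the setting names listed above; the structure and the default values are illustrative, not the repository's.

# Plausible shape of app/config/config.py; defaults are illustrative.
from dataclasses import dataclass

@dataclass
class Settings:
    ollama_base_url: str = "http://ollama:11434"  # Compose-internal hostname for the Ollama service
    model: str = "llama3"                         # LLM served by Ollama
    chunk_size: int = 500                         # characters per text chunk
    chunk_overlap: int = 50                       # overlap between adjacent chunks
    n_retriever_docs: int = 4                     # chunks retrieved from FAISS per query

settings = Settings()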

Development

  • The application code (backend in app/, frontend in frontend/) is mounted as volumes into the Docker containers, so code changes are reflected without rebuilding the images. Some changes still require restarting the affected service, and new dependencies require a rebuild (see the next point).
  • To install new Python dependencies:
    1. Add the dependency to the respective requirements.txt file (app/requirements.txt or frontend/requirements.txt).
    2. Rebuild the images: docker-compose up --build.

Troubleshooting

  • Connection refused for Ollama: Ensure the ollama_base_url in app/config/config.py is set to http://ollama:11434 when running via Docker Compose. If running the backend locally (outside Docker) while Ollama is in Docker, use http://localhost:11434.
  • Ollama model not found: Check the ollama service logs in Docker Compose (docker-compose logs ollama) to ensure the models (llama3, all-minilm) were pulled successfully. The entrypoint in docker-compose.yml for the ollama service is responsible for this; the snippet after this list checks the pulled models directly.
  • Streamlit UI not loading or missing ScriptRunContext: Ensure the CMD in frontend/Dockerfile is streamlit run main.py --server.port=8501 --server.address=0.0.0.0.
  • FAISS index issues: The faiss_index directory is mounted into the app container. Ensure Docker has permissions to write to this directory if it's being created for the first time by the container.
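
To confirm from the host which models the Ollama container actually has, query Ollama's model-list endpoint (GET /api/tags). The snippet below (assuming the requests package is installed) flags whether llama3 and all-minilm are present.

# List the models known to the Ollama container and check for the two the stack needs.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
names = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", names)
for wanted in ("llama3", "all-minilm"):
    ok = any(n.startswith(wanted) for n in names)
    print(f"{wanted}: {'pulled' if ok else 'missing - check docker-compose logs ollama'}")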
