A Python-based Retrieval-Augmented Generation (RAG) application for querying PDF documents. This project demonstrates advanced RAG techniques, including local LLM integration, vector database updates, and response quality testing.
This guide provides step-by-step instructions to set up and use `OllamaEmbeddings` from LangChain to generate text embeddings locally on your machine. This is useful for Retrieval-Augmented Generation (RAG) applications, such as indexing and querying documents in a vector database.
- Visit ollama.com and download the installer for your operating system.
- Run the installer and follow the prompts to set up Ollama.
- Open a terminal or command prompt.
- Run the following command to start the Ollama server:
```shell
ollama serve
```
- This launches a local REST API server at `http://localhost:11434`. Keep the terminal open while using Ollama.
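Before moving on, it can help to confirm the server is actually reachable. A minimal sketch using only the standard library — it assumes the default host and port (`localhost:11434`); the root endpoint of a running Ollama server answers with a short plain-text status:

```python
import urllib.error
import urllib.request


def server_url(host: str = "localhost", port: int = 11434) -> str:
    """Build the base URL for a local Ollama server (default port 11434)."""
    return f"http://{host}:{port}"


def is_server_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the Ollama server answers at its root endpoint."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    url = server_url()
    state = "up" if is_server_up(url) else "not reachable"
    print(f"Ollama server at {url}: {state}")
```

If the check reports "not reachable", make sure `ollama serve` is still running in its terminal.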
- Choose a lightweight embedding model, such as `nomic-embed-text`.
- Download the model by running:
```shell
ollama pull nomic-embed-text
```
- Verify the model is available:
```shell
ollama list
```
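Under the hood, the embedding function talks to the server's REST API. Once the model is pulled, you can exercise that API directly with only the standard library — a sketch assuming Ollama's `/api/embeddings` endpoint, which takes a JSON body with `model` and `prompt` keys and returns `{"embedding": [...]}`; the `embed` call at the bottom requires `ollama serve` to be running:

```python
import json
import urllib.request


def build_embedding_request(model: str, prompt: str,
                            base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Construct a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model: str, prompt: str) -> list[float]:
    """Send the request to the local server and return the embedding vector.

    Requires `ollama serve` to be running and the model to be pulled.
    """
    req = build_embedding_request(model, prompt)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


if __name__ == "__main__":
    vec = embed("nomic-embed-text", "hello world")
    print(f"embedding dimension: {len(vec)}")
```

This is the same endpoint LangChain's wrapper calls for you; using it directly is a quick way to confirm the model responds before wiring up the rest of the pipeline.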
- Install the required Python packages using pip:

```shell
pip install langchain langchain-community
```
- These packages provide the `OllamaEmbeddings` class and ensure compatibility with the Ollama server.
- Create a Python script (e.g., `embeddings.py`) and add the following code to initialize the embedding function:

```python
from langchain_community.embeddings import OllamaEmbeddings

# Initialize the embedding function
embedding_function = OllamaEmbeddings(model="nomic-embed-text")
```
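With the embedding function initialized, a typical next step is to embed a query and a few documents and compare them — this is the core retrieval operation in a RAG pipeline. A sketch: `cosine_similarity` is pure Python, while the `__main__` section assumes a running `ollama serve` with `nomic-embed-text` pulled (the example texts are placeholders):

```python
from math import sqrt


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    from langchain_community.embeddings import OllamaEmbeddings

    embedding_function = OllamaEmbeddings(model="nomic-embed-text")

    # embed_query returns one vector; embed_documents returns one vector per text
    query_vec = embedding_function.embed_query("How do I reset my password?")
    doc_vecs = embedding_function.embed_documents([
        "Password reset instructions",
        "Quarterly sales report",
    ])

    for text, vec in zip(["reset doc", "sales doc"], doc_vecs):
        print(text, round(cosine_similarity(query_vec, vec), 3))
```

A vector database automates this scoring at scale, but computing a similarity by hand like this is a useful sanity check that the embeddings behave as expected (the on-topic document should score higher than the off-topic one).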