This Streamlit-based application performs question-answering on text extracted from PDF documents. It integrates the LangChain framework with various language models and uses the FAISS library for efficient similarity search in large vector spaces. A key feature of this application is its support for local model execution via Ollama, enabling users to process data without relying on external API calls, thus ensuring privacy.
- PDF text extraction and processing
- Text chunking for efficient processing
- Choice of OpenAI models, or local inference with Mistral 7B and embeddings with LLaMA 2 via Ollama
- Persistent vector embeddings and similarity search with FAISS (see the sketch after this list)
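At its core, the retrieval flow is: split the extracted text into chunks, embed each chunk, index the embeddings in FAISS, and run a similarity search at question time. Below is a minimal sketch of that flow, assuming the langchain and langchain-community packages and a local Ollama server; the chunk sizes, model names, and index path are illustrative, not the app's exact configuration.

```python
# Minimal sketch of the chunk -> embed -> index -> search flow.
# Assumptions: `pip install langchain langchain-community faiss-cpu`
# and an Ollama server on localhost; values below are illustrative.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

text = "...full text extracted from the PDF..."

# Split the text into overlapping chunks so context isn't cut mid-passage.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(text)

# Embed each chunk with LLaMA 2 served by the local Ollama instance.
embeddings = OllamaEmbeddings(model="llama2")

# Build a FAISS index over the embeddings and persist it for reuse.
db = FAISS.from_texts(chunks, embeddings)
db.save_local("faiss_index")

# At question time, retrieve the chunks most similar to the query.
docs = db.similarity_search("What is this document about?", k=4)
```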
To run models locally with Ollama, it must first be installed. The most straightforward way to install and start Ollama is via its official Docker image. For comprehensive installation instructions, refer to:
- Ollama's official site for general installation guidelines.
- Ollama Docker Image Guide for specific steps to use the Docker image.
If you have an NVIDIA GPU on your machine, it's highly recommended to leverage it when running Ollama, as in the example below.
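As a starting point, the commands below follow Ollama's Docker documentation; the `--gpus=all` flag assumes the NVIDIA Container Toolkit is installed, and the model pulls match the Mistral 7B / LLaMA 2 pairing mentioned above. Treat this as a sketch and consult the guides listed above for your setup.

```bash
# Start the Ollama server in Docker with GPU access
# (assumes the NVIDIA Container Toolkit is installed).
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull the models the app uses: Mistral 7B for inference, LLaMA 2 for embeddings.
docker exec -it ollama ollama pull mistral
docker exec -it ollama ollama pull llama2
```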
If you opt to use OpenAI models, you must obtain an OPENAI_API_KEY. Please visit OpenAI API Keys to get your key. Once obtained, save it in your .env file as follows:

```
OPENAI_API_KEY=your_openai_api_key_here
```
Clone the repository to your local machine:

```bash
git clone https://github.com/rabee05/ask-a-pdf.git
cd ask-a-pdf
```

To run the project, first create a virtual environment. I recommend using Pipenv for its simplicity and effectiveness in managing project dependencies.
Check whether Pipenv is installed by running `pipenv --version`. If it is not found, install Pipenv with:

```bash
pip install pipenv --user
```
To ensure the virtual environment is created within the project folder, set the following environment variable:

```bash
export PIPENV_VENV_IN_PROJECT=1
```

Now create the virtual environment and install dependencies by running:

```bash
pipenv install
```

If you prefer to use venv for virtual environment management and a requirements.txt file for dependencies, follow these steps:
Create the environment:

```bash
python3 -m venv .venv
```

On macOS and Linux, activate it with:

```bash
source .venv/bin/activate
```

Install the required packages from requirements.txt:

```bash
pip install -r requirements.txt
```

To delete the virtual environment, either manually remove the environment directory (for venv) or run (for Pipenv):

```bash
pipenv --rm
```

If you want to run OpenAI models, copy and rename .env.example to .env:

```bash
cp .env.example .env
```
Open the .env file and include the following information:

```
OPENAI_API_KEY=your_openai_api_key_here
```
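For reference, a Python app typically loads that key at startup roughly as follows; this is a hypothetical sketch assuming the python-dotenv package, not necessarily the exact code in this project:

```python
# Hypothetical sketch: load variables from .env into the environment
# (assumes `pip install python-dotenv`).
import os

from dotenv import load_dotenv

load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")
```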
After setting up your project, be sure to review and update the config/config.py file to suit your environment, particularly the OLLAMA_SERVER settings.
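The exact contents of config/config.py are specific to this project; as a hypothetical sketch, the relevant setting might look like this, pointed at wherever your Ollama server is reachable:

```python
# config/config.py -- hypothetical sketch; the project's actual keys may differ.
# Point this at your Ollama server (the default port is 11434).
OLLAMA_SERVER = "http://localhost:11434"
```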
After installing Ollama, verify that it is running by navigating to http://localhost:11434/, or by using the server's IP address with the default port 11434. You should see "Ollama is running".
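You can also verify this programmatically; here is a small sketch using the requests package, with the URL assuming a default local install:

```python
# Quick health check against a local Ollama server
# (assumes `pip install requests`).
import requests

resp = requests.get("http://localhost:11434/")
resp.raise_for_status()
print(resp.text)  # expected: "Ollama is running"
```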
To run the application, execute the following command from the project root:

```bash
streamlit run app.py
```

Navigate to the URL provided by Streamlit in your browser to interact with the application.

