This is a Python GUI application that demonstrates how to build a custom PDF chatbot using LangChain and GPT-3.5 / Llama 2.
- The application GUI is built using Streamlit
- The application reads text from PDF files and splits it into chunks
- Uses the OpenAI Embeddings API to generate embedding vectors, which are used to find the content most relevant to a user's question
- Builds a conversational retrieval chain using LangChain
- Uses the OpenAI GPT API to generate responses based on the PDF content
- Install the following Python packages:
pip install streamlit pypdf2 langchain python-dotenv faiss-cpu openai sentence_transformers
- Create a .env file in the root directory of the project and add the following environment variable:
OPENAI_API_KEY= # Your OpenAI API key
The code is structured as follows:
app_db.py: The main application file that defines the Streamlit GUI app and the user interface.
- get_pdf_text function: reads text from PDF files
- get_text_chunks function: splits text into chunks
- get_vectorstore function: creates a FAISS vectorstore from text chunks and their embeddings
- get_conversation_chain function: creates a retrieval chain from vectorstore
- handle_userinput function: generates a response from the OpenAI GPT API
- create_connection: connects to the MySQL database
- initialize_db: creates the tables if they do not exist
- create_new_session: creates a new session ID to identify a conversation
- get_previous_sessions: loads previous sessions from the DB
- load_chat_history_for_session: loads a session's chat history from the DB to display in the Streamlit app
- save_message_to_db: saves chat messages to the DB
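The persistence helpers above can be sketched as follows. app_db.py talks to MySQL; this sketch uses the stdlib sqlite3 module instead so it runs standalone, but the function shapes and SQL carry over. The table and column names here are illustrative, not taken from create_chat_history_db.sql.

```python
import sqlite3
import uuid

def create_connection(path=":memory:"):
    # app_db.py would open a MySQL connection here instead
    return sqlite3.connect(path)

def initialize_db(conn):
    # Create the chat-history table if it does not exist
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT,
               role TEXT,
               content TEXT
           )"""
    )
    conn.commit()

def create_new_session():
    # A random UUID identifies one conversation across app reruns
    return str(uuid.uuid4())

def save_message_to_db(conn, session_id, role, content):
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()

def load_chat_history_for_session(conn, session_id):
    cur = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return cur.fetchall()

def get_previous_sessions(conn):
    cur = conn.execute("SELECT DISTINCT session_id FROM messages")
    return [row[0] for row in cur.fetchall()]
```

On each Streamlit rerun the app can reconnect, load the selected session's history, render it, and append new user/assistant messages as they arrive.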
htmlTemplates.py: A module that defines HTML templates for the user interface.
create_chat_history_db.sql: SQL script that creates the chat history DB and tables to store/retrieve chat data.
- Run the application:
streamlit run app_db.py
- Install Python bindings for llama.cpp library
pip install llama-cpp-python
- Download the Llama 2 7B GGML model from https://huggingface.co/TheBloke/LLaMa-7B-GGML/blob/main/llama-7b.ggmlv3.q4_1.bin and place it in the models folder
- Switch the language model to Llama 2, loaded via LlamaCpp
- Switch the embedding model to MiniLM-L6-v2 using HuggingFaceEmbeddings