VedantK1604/MultiPDF-Chatbot

MultiPDF Chatbot

Overview

MultiPDF Chatbot is a Streamlit application for asking questions about multiple PDF documents through a chat interface. It extracts text from the uploaded PDFs with PyPDF2, indexes the text in a FAISS vector store, and answers questions with a langchain conversational retrieval chain.

Features

  • PDF Upload: Users can upload multiple PDF documents.
  • Text Extraction: The application extracts text from the uploaded PDFs.
  • Conversational Interface: Users can ask questions about the content of their PDFs and receive answers in a chat-like interface.
  • Vector Store: Text chunks are embedded and stored in a FAISS vector store, so the chunks most similar to a question can be retrieved.
  • Memory Management: The conversation history is kept so that follow-up questions are answered in context.
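The README does not say how the conversation history is stored. As a minimal sketch of one common approach, the snippet below keeps a sliding window of recent turns and folds them into the next prompt. `ChatMemory`, `max_turns`, and `as_prompt` are hypothetical names used for illustration only, not part of the application or of langchain.

```python
from dataclasses import dataclass, field


@dataclass
class ChatMemory:
    """Hypothetical helper: keeps only the most recent question/answer turns."""
    max_turns: int = 5  # assumed window size, not taken from the app
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))
        # Drop the oldest turns so the prompt stays bounded.
        self.turns = self.turns[-self.max_turns:]

    def as_prompt(self, new_question: str) -> str:
        """Render the stored turns followed by the new question."""
        lines = []
        for question, answer in self.turns:
            lines.append(f"User: {question}")
            lines.append(f"Assistant: {answer}")
        lines.append(f"User: {new_question}")
        return "\n".join(lines)


memory = ChatMemory(max_turns=2)
memory.add("What is the report about?", "Quarterly sales.")
memory.add("Which region grew fastest?", "The northern region.")
memory.add("By how much?", "By 12 percent.")
prompt = memory.as_prompt("Is that above target?")
```

With `max_turns=2`, the first turn has been dropped by the time `prompt` is built, so only the last two exchanges give context to the new question.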

How It Works

  1. PDF Upload: Users upload their PDF documents through the Streamlit interface.
  2. Text Extraction: The text from the PDFs is extracted using the PyPDF2 library.
  3. Text Chunking: The extracted text is split into manageable chunks using the CharacterTextSplitter from the langchain library.
  4. Vector Store: The text chunks are embedded using GoogleGenerativeAIEmbeddings and stored in a vector store using FAISS.
  5. Conversational Chain: A conversational retrieval chain is created using the ConversationalRetrievalChain from the langchain library, which retrieves relevant text chunks based on the user's questions.
  6. User Interaction: Users can ask questions, and the application provides answers based on the retrieved text chunks.
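Steps 3 to 5 can be illustrated without any third-party packages. The sketch below is a simplified stand-in, not the application's code: `split_text` mimics overlapping character chunks, `embed` uses word counts in place of GoogleGenerativeAIEmbeddings, and `retrieve` ranks chunks by cosine similarity in place of a FAISS index. All function names and the `chunk_size`, `overlap`, and `k` values are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "FAISS stores embedding vectors for similarity search.",
    "Streamlit renders the chat interface in the browser.",
    "PyPDF2 extracts text from each PDF page.",
]
top = retrieve("How is text extracted from a PDF?", chunks, k=1)
```

In the application itself, the retrieved chunks are passed to the conversational chain, which uses them to answer the question. The overlap between chunks helps keep a sentence that straddles a boundary intact in at least one chunk.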

Dependencies

The application relies on the following dependencies:

  • streamlit: For creating the web interface.
  • PyPDF2: For extracting text from PDFs.
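  • langchain: For text splitting, the embedding and vector store integrations, and the conversational retrieval chain.
  • faiss: For efficient similarity search over the embedded text chunks (published on PyPI as faiss-cpu or faiss-gpu).
  • dotenv: For loading environment variables from a .env file (published on PyPI as python-dotenv).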
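The README does not list which environment variables the application expects. As a small sketch, a startup check along the following lines could report missing settings before the app runs. `missing_settings` is a hypothetical helper, and `GOOGLE_API_KEY` is an assumption based on the variable that Google Generative AI clients commonly read.

```python
import os


def missing_settings(required: list[str], env=None) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# GOOGLE_API_KEY is an assumed variable name, not confirmed by the README.
missing = missing_settings(["GOOGLE_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.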

Usage

  1. Clone the repository:
    git clone <repository_url>
  2. Install the required dependencies:
    pip install -r requirements.txt
  3. Run the application:
    streamlit run app.py
  4. Upload your PDF documents and start asking questions.
