
---
title: Semantic Search App
emoji: 📄🔗🧠❓🔗🤖
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

Semantic Search App (📄 → 🔗 → 🧠 → ❓ → 🔗 → 🤖)

Upload a PDF, ask questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Google LLMs — all wrapped in a clean Gradio interface.

🔗 Live Demo: Semantic Search App

📖 Read the Full Story

Want to learn more about the journey behind building this project? Check out the full story on Medium:


🚀 Features

  • 📄 Upload and process PDF documents
  • 🔍 Perform semantic search using vector embeddings
  • 🤖 Get answers from powerful LLMs (NVIDIA or Google Gemini)
  • 🧠 Uses LangChain + ChromaDB for retrieval
  • 📈 Integrated with Langfuse for tracing and observability
  • 🧰 Docker-ready and Hugging Face Spaces–compatible

🛠️ Tech Stack

| Component | Purpose |
| --- | --- |
| LangChain | Orchestration of embedding + LLM calls |
| ChromaDB | Vector database for semantic retrieval |
| NVIDIA / Gemini | Embedding + LLM APIs |
| Gradio | Interactive UI |
| Langfuse | Tracing and observability |
| Docker | Containerized deployment |
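
To make the table concrete, here is a minimal sketch of how these pieces typically compose on the query path: Chroma retrieves the most similar chunks, and a Gemini chat model answers from them. The embedding model name, k value, and prompt are illustrative assumptions, not this app's exact settings.

```python
# Minimal sketch of the query path (illustrative settings, not the
# app's exact configuration): retrieve similar chunks from Chroma,
# then ask the LLM to answer from that context.
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # assumed model
vectordb = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

def answer(question: str) -> str:
    # Semantic retrieval: nearest chunks by embedding similarity
    docs = vectordb.similarity_search(question, k=4)
    context = "\n\n".join(d.page_content for d in docs)
    # Grounded generation: answer only from the retrieved context
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm.invoke(prompt).content
```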

📦 Installation

Option 1: Run Locally

```bash
git clone https://github.com/KI-IAN/semantic-search.git
cd semantic-search
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a .env file in the root directory:

```env
GOOGLE_API_KEY=your_google_api_key
NVIDIA_API_KEY=your_nvidia_api_key
CHROMA_DIR=./chroma_db
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
```

Then run:

```bash
python app.py
```
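
For orientation before diving into the code, here is a minimal sketch of the kind of Gradio wiring app.py sets up. The handler names process_pdf and search_query come from the Customization section below; their signatures and stub bodies here are assumptions, not the real implementation.

```python
# Sketch of the Gradio UI wiring (handler bodies are stubs; see app.py
# for the real split/embed/store and retrieval logic).
import gradio as gr

def process_pdf(pdf_file):
    # Real app: parse the PDF, split it, embed chunks, persist to Chroma
    return "Document processed."

def search_query(question):
    # Real app: similarity-search Chroma, then ask the LLM
    return "Answer goes here."

with gr.Blocks(title="Semantic Search App") as demo:
    pdf = gr.File(label="Upload a PDF")
    process_btn = gr.Button("📄 Process Document")
    status = gr.Textbox(label="Status")
    question = gr.Textbox(label="Your question")
    ask_btn = gr.Button("🔍 Ask a Question")
    answer = gr.Textbox(label="Answer")

    process_btn.click(process_pdf, inputs=pdf, outputs=status)
    ask_btn.click(search_query, inputs=question, outputs=answer)

demo.launch()
```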

Option 2: Run with Docker

```bash
# Production run (docker-compose.yml is used automatically)
docker-compose up --build

# Or, if you use the newer Docker Compose plugin:
docker compose up --build
```

Access the app at http://localhost:12100

```bash
# Development run: docker-compose.dev.yml mounts your code so changes
# are reflected without rebuilding the container
docker-compose -f docker-compose.dev.yml up --build

# Or, if you use the newer Docker Compose plugin:
docker compose -f docker-compose.dev.yml up --build
```

Access the app at http://localhost:12100


Option 3: Deploy on Hugging Face Spaces

  1. Create a new Space and choose Gradio as the SDK.
  2. Upload your project files (app.py, Dockerfile, requirements.txt, etc.). Do not upload your .env; Spaces injects secrets as environment variables instead.
  3. Set secrets in the “Secrets” tab:
       • GOOGLE_API_KEY
       • NVIDIA_API_KEY
       • (Optional) CHROMA_DIR (defaults to ./chroma_db)
  4. Hugging Face will auto-detect and launch the app via Gradio.


🔑 Getting API Keys

To use this app, you'll need API keys for both Gemini and NVIDIA NIM. Here's how to obtain them:

🌐 Gemini API Key

Gemini is Google's family of generative AI models. To get an API key:

  1. Visit the Google AI Studio.
  2. Sign in with your Google account.
  3. Click "Create API Key" and copy the key shown.
  4. Use this key in your .env file or configuration as GOOGLE_API_KEY (the variable name this app's .env uses).

Note: Gemini API access may be limited based on region or account eligibility. Check the Gemini API Rate Limits here

🚀 NVIDIA NIM API Key

NIM (NVIDIA Inference Microservices) provides hosted models via REST APIs. To get started:

  1. Go to the NVIDIA API Catalog.
  2. Choose a model (e.g., nim-gemma, nim-mistral, etc.) and click "Get API Key".
  3. Sign in or create an NVIDIA account if prompted.
  4. Copy your key and use it as NVIDIA_API_KEY in your environment (the variable name this app's .env uses).

Tip: You can test NIM endpoints directly in the browser before integrating.


Once you have both keys, store them securely and never commit them to version control.


🧪 How to Use

  1. Upload a PDF: drag and drop your document.
  2. Click “📄 Process Document”: the app splits, embeds, and stores the content (a sketch of this step follows the list).
  3. Enter a query, for example:
       • “What are the key findings?”
       • “Summarize the methodology.”
       • “What does the report say about climate change?”
  4. Click “🔍 Ask a Question” to get semantic search results and an LLM-generated answer.
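
As promised above, here is a minimal sketch of what step 2's split/embed/store stage can look like with this stack. The loader choice, chunk sizes, and embedding model are illustrative assumptions; see process_pdf() in app.py for the real implementation.

```python
# Sketch of "split, embed, store" (illustrative values; the real logic
# lives in process_pdf() in app.py).
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

docs = PyMuPDFLoader("report.pdf").load()                 # parse the PDF
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)                   # split into chunks
vectordb = Chroma.from_documents(                         # embed and persist
    chunks,
    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
    persist_directory="./chroma_db",
)
```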


⚙️ Configuration

All secrets are loaded from the local .env file or, on Spaces, the Hugging Face Secrets tab:

| Variable | Description |
| --- | --- |
| GOOGLE_API_KEY | Gemini LLM API key |
| NVIDIA_API_KEY | NVIDIA LLM API key |
| CHROMA_DIR | Path to store the Chroma vector DB |
| LANGFUSE_PUBLIC_KEY | Langfuse public key for tracing |
| LANGFUSE_SECRET_KEY | Langfuse secret key for tracing |
| LANGFUSE_HOST | Langfuse host URL (e.g., https://cloud.langfuse.com) |
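
As a sketch, config.py might read these values with python-dotenv; this is an assumed pattern, not a copy of the real file.

```python
# Assumed settings loader (see config.py for the real one).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env locally; harmless on Spaces, where Secrets set env vars

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
CHROMA_DIR = os.getenv("CHROMA_DIR", "./chroma_db")  # default noted above
```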

🧩 Customization

  • Switch between NVIDIA and Gemini embeddings in process_pdf()
  • Change the LLM model in search_query() (bytedance/seed-oss-36b-instruct, gemini-2.5-pro, etc.); a sketch of this swap follows the list
  • Tune chunk size and overlap in RecursiveCharacterTextSplitter
  • Add dropdowns to the UI for model selection (optional)
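
A sketch of the LLM swap described above. The model IDs come from this README; the provider-selection helper itself is an illustrative assumption, not the app's code.

```python
# Illustrative LLM switcher (model IDs from this README; the helper
# function is an assumption, not app code).
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_google_genai import ChatGoogleGenerativeAI

def make_llm(provider: str):
    if provider == "nvidia":
        return ChatNVIDIA(model="bytedance/seed-oss-36b-instruct")
    return ChatGoogleGenerativeAI(model="gemini-2.5-pro")

llm = make_llm("nvidia")
print(llm.invoke("Reply with one word: ready?").content)
```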


📁 File Structure

```
semantic-search/
├── .env
├── .github/
├── .gitignore
├── docker-compose.yml
├── docker-compose.dev.yml
├── Dockerfile
├── requirements.txt
├── app.py
└── config.py
```

The .env file is not tracked in git. Use it only for local development, and never commit it if it contains secrets.


📜 License

This project is open-source and distributed under the MIT License. Feel free to use, modify, and distribute it with attribution.


🤝 Acknowledgements

  • LangChain — Powerful framework for orchestrating LLMs, embeddings, and retrieval pipelines.
  • ChromaDB — Fast and flexible open-source vector database for semantic search.
  • NVIDIA AI Endpoints — Hosted LLM and embedding APIs including Seed OSS and NV-Embed.
  • Google Gemini — Robust multimodal LLM platform offering text embeddings and chat models.
  • Gradio — Simple and elegant Python library for building machine learning interfaces.
  • PyMuPDF — Lightweight PDF parser for fast and accurate text extraction.
  • Docker — Containerization platform for reproducible deployment across environments.
  • Hugging Face Spaces — Free hosting platform for ML demos with secret management and GPU support.
  • Langfuse — Observability and tracing tools for LLM applications.
