| title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned |
|---|---|---|---|---|---|---|---|
| Semantic Search App | 📄🔗🧠❓🔗🤖 | indigo | pink | gradio | 5.46.1 | app.py | false |
Upload a PDF, ask questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Google LLMs — all wrapped in a clean Gradio interface.
🔗 Live Demo: Semantic Search App
Want to learn more about the journey behind building this project? Check out the full story on Medium:
- 📄 Upload and process PDF documents
- 🔍 Perform semantic search using vector embeddings
- 🤖 Get answers from powerful LLMs (NVIDIA or Google Gemini)
- 🧠 Uses LangChain + ChromaDB for retrieval
- 📈 Integrated with Langfuse for tracing and observability
- 🧰 Docker-ready and Hugging Face Spaces–compatible
| Component | Purpose |
|---|---|
| LangChain | Orchestration of embedding + LLM calls |
| ChromaDB | Vector database for semantic retrieval |
| NVIDIA / Gemini | Embedding + LLM APIs |
| Gradio | Interactive UI |
| Langfuse | Tracing and Observability |
| Docker | Containerized deployment |
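To make the table above concrete, here is a minimal sketch of the ingestion flow (load, split, embed, store). It is not the code in app.py: the file name, chunk sizes, persistence path, and the choice of Gemini embeddings over NVIDIA ones are illustrative assumptions.

```python
# Illustrative sketch only; app.py may use different loaders, models, and settings.
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# 1. Load the PDF and split it into overlapping chunks
docs = PyMuPDFLoader("report.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)

# 2. Embed the chunks and persist them in a local Chroma collection
vectordb = Chroma.from_documents(
    chunks,
    GoogleGenerativeAIEmbeddings(model="models/embedding-001"),  # requires GOOGLE_API_KEY
    persist_directory="./chroma_db",
)

# 3. Retrieve the chunks most similar to a question
hits = vectordb.similarity_search("What are the key findings?", k=4)
print(hits[0].page_content[:200])
```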
git clone https://github.com/KI-IAN/semantic-search.git
cd semantic-search
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create a .env file in the root directory:
GOOGLE_API_KEY=your_google_api_key
NVIDIA_API_KEY=your_nvidia_api_key
CHROMA_DIR=./chroma_db
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

Then run:
python app.py

# To run in a live environment (this automatically uses docker-compose.yml)
docker-compose up --build
# Or, if you use the newer docker compose command:
docker compose up --build

Access the app at http://localhost:12100

# To run in a local environment, use docker-compose.dev.yml if you want code changes reflected without rebuilding the Docker container
docker-compose -f docker-compose.dev.yml up --build
# Or, if you use the newer docker compose command:
docker compose -f docker-compose.dev.yml up --build
Access the app at http://localhost:12100
1. Create a new Space → choose Gradio as the SDK
2. Upload your project files (including app/, Dockerfile, requirements.txt, .env)
3. Set Secrets in the “Secrets” tab:
   - GOOGLE_API_KEY
   - NVIDIA_API_KEY
   - (Optional) CHROMA_DIR (defaults to ./chroma_db)
4. Hugging Face will auto-detect and launch the app via Gradio
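For orientation, the kind of Gradio entry point Spaces auto-detects looks roughly like the stub below. The layout and handler functions are illustrative placeholders; in the real app.py the process_pdf() and search_query() logic is wired into these callbacks.

```python
# Toy stub of a Gradio app.py that Spaces can auto-launch; not the project's actual UI code.
import gradio as gr

def process_document(pdf_path):
    # In the real app this splits, embeds, and stores the PDF in ChromaDB.
    return "Document processed (stub)."

def ask_question(question):
    # In the real app this runs semantic search and an LLM call.
    return f"Stub answer for: {question}"

with gr.Blocks(title="Semantic Search App") as demo:
    pdf = gr.File(label="Upload a PDF", file_types=[".pdf"])
    status = gr.Textbox(label="Status")
    gr.Button("📄 Process Document").click(process_document, inputs=pdf, outputs=status)

    question = gr.Textbox(label="Your question")
    answer = gr.Textbox(label="Answer")
    gr.Button("🔍 Ask a Question").click(ask_question, inputs=question, outputs=answer)

if __name__ == "__main__":
    demo.launch()
```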
To use this app, you'll need API keys for both Gemini and NVIDIA NIM. Here's how to obtain them:
Gemini is Google's family of generative AI models. To get an API key:
- Visit the Google AI Studio.
- Sign in with your Google account.
- Click "Create API Key" and copy the key shown.
- Use this key in your `.env` file or configuration as `GOOGLE_API_KEY`.
Note: Gemini API access may be limited based on region or account eligibility. Check the Gemini API Rate Limits here
NIM (NVIDIA Inference Microservices) provides hosted models via REST APIs. To get started:
- Go to the NVIDIA API Catalog.
- Choose a model (e.g., `nim-gemma`, `nim-mistral`, etc.) and click "Get API Key".
- Sign in or create an NVIDIA account if prompted.
- Copy your key and use it as `NVIDIA_API_KEY` in your environment.
Tip: You can test NIM endpoints directly in the browser before integrating.
Once you have both keys, store them securely and never commit them to version control.
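One common way to keep the keys out of your code during local development is a small loading pattern like the following (assuming python-dotenv is installed; the variable names match the .env example above):

```python
# Sketch: load secrets from .env locally; on Hugging Face Spaces the same
# names are injected as environment variables from the Secrets tab.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory, if present

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
NVIDIA_API_KEY = os.getenv("NVIDIA_API_KEY")
CHROMA_DIR = os.getenv("CHROMA_DIR", "./chroma_db")  # optional, with the documented default

if not (GOOGLE_API_KEY and NVIDIA_API_KEY):
    raise RuntimeError("Set GOOGLE_API_KEY and NVIDIA_API_KEY before starting the app")
```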
1. Upload a PDF — drag and drop your document
2. Click “📄 Process Document” — the app will split, embed, and store the content
3. Enter a query — ask a question like:
   - “What are the key findings?”
   - “Summarize the methodology.”
   - “What does the report say about climate change?”
4. Click “🔍 Ask a Question” — get semantic search results and an LLM-generated answer
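Under the hood, the “Ask a Question” step amounts to retrieving the best-matching chunks and letting an LLM answer from them. A hedged sketch, reusing the ./chroma_db store from the ingestion example (prompt wording and model choice are illustrative, not copied from app.py):

```python
# Illustrative question-answering step; assumes the PDF was already embedded into ./chroma_db.
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI

vectordb = Chroma(
    persist_directory="./chroma_db",
    embedding_function=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)

question = "What are the key findings?"
context = "\n\n".join(doc.page_content for doc in vectordb.similarity_search(question, k=4))

llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")
answer = llm.invoke(f"Answer using only this context:\n\n{context}\n\nQuestion: {question}")
print(answer.content)
```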
All secrets are loaded from .env or Hugging Face Secrets tab:
| Variable | Description |
|---|---|
| GOOGLE_API_KEY | Gemini LLM API key |
| NVIDIA_API_KEY | NVIDIA LLM API key |
| CHROMA_DIR | Path to store Chroma vector DB |
- Switch between NVIDIA and Gemini embeddings in process_pdf()
- Change the LLM model in search_query() (bytedance/seed-oss-36b-instruct, gemini-2.5-pro, etc.)
- Tune chunk size and overlap in RecursiveCharacterTextSplitter
- Add dropdowns to the UI for model selection (optional)
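The snippet below shows what those knobs typically look like in LangChain code; exactly where they live inside process_pdf() and search_query() may differ from this sketch.

```python
# Hedged examples of the customization points listed above.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_google_genai import ChatGoogleGenerativeAI

# Chunking: larger chunks keep more context per embedding,
# more overlap reduces the risk of cutting an answer in half.
splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=300)

# LLM choice: either an NVIDIA-hosted model or Gemini.
llm = ChatNVIDIA(model="bytedance/seed-oss-36b-instruct")   # requires NVIDIA_API_KEY
# llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")      # requires GOOGLE_API_KEY
```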
semantic-search/
├── .env
├── .github/
├── .gitignore
├── docker-compose.yml
├── docker-compose.dev.yml
├── Dockerfile
├── requirements.txt
├── app.py
├── config.py
The .env file is not tracked in git. Use it only for local development, and never commit it if it contains secrets.
This project is open-source and distributed under the MIT License. Feel free to use, modify, and distribute it with attribution.
- LangChain — Powerful framework for orchestrating LLMs, embeddings, and retrieval pipelines.
- ChromaDB — Fast and flexible open-source vector database for semantic search.
- NVIDIA AI Endpoints — Hosted LLM and embedding APIs including Seed OSS and NV-Embed.
- Google Gemini — Robust multimodal LLM platform offering text embeddings and chat models.
- Gradio — Simple and elegant Python library for building machine learning interfaces.
- PyMuPDF — Lightweight PDF parser for fast and accurate text extraction.
- Docker — Containerization platform for reproducible deployment across environments.
- Hugging Face Spaces — Free hosting platform for ML demos with secret management and GPU support.
- Langfuse — Observability and tracing tools for LLM applications.