LLM-powered search engine with hybrid search and re-ranking capabilities.
- 🔄 Hybrid Search: Combines FAISS vector search with Elasticsearch keyword (BM25) search
- 🤖 LLM Re-ranking: Uses large language models to re-rank search results for better relevance
- ⚡ FastAPI Backend: High-performance REST API
- 📊 Evaluation Metrics: Built-in support for NDCG, MRR, and other IR metrics
- 🧪 Comprehensive Testing: Full test suite with pytest
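As a rough illustration of the hybrid search idea, the sketch below blends keyword and vector scores with a single weight. It assumes min-max normalization followed by a linear blend, with `alpha` = 0.0 meaning BM25 only and 1.0 meaning vector only. The function names (`min_max_normalize`, `fuse_scores`) are hypothetical and not taken from this repository, whose actual fusion method may differ.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(
    bm25: Dict[str, float],
    vector: Dict[str, float],
    alpha: float = 0.5,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Blend two score sets: alpha=0.0 is BM25 only, alpha=1.0 is vector only.

    Hypothetical helper for illustration; a document missing from one
    retriever contributes 0.0 for that retriever.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    bm25_n = min_max_normalize(bm25)
    vector_n = min_max_normalize(vector)
    fused = {
        doc_id: (1.0 - alpha) * bm25_n.get(doc_id, 0.0)
        + alpha * vector_n.get(doc_id, 0.0)
        for doc_id in set(bm25_n) | set(vector_n)
    }
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    bm25_scores = {"a": 10.0, "b": 5.0, "c": 0.0}
    vector_scores = {"b": 0.9, "c": 0.5, "d": 0.1}
    for doc_id, score in fuse_scores(bm25_scores, vector_scores, alpha=0.5):
        print(doc_id, round(score, 3))
```

Normalizing before blending matters because raw BM25 scores are unbounded while vector similarities usually fall in a narrow range; without it, one retriever would dominate regardless of `alpha`.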
SearchGPT/
├── src/
│ ├── api/ # FastAPI application
│ ├── hybrid_search/ # Hybrid search implementation
│ ├── llm_reranking/ # LLM re-ranking logic
│ ├── evaluation/ # Metrics and benchmarks
│ ├── core/ # Utilities (config, logging, cache)
│ └── deployment/ # Docker and deployment files
├── tests/ # Test suite
├── scripts/ # Utility scripts
├── data/ # Data directory (indices, embeddings)
└── resources/ # Research papers and documentation
- Python 3.9+
- UV (recommended) or pip
- Clone the repository
  git clone https://github.com/YourUsername/SearchGPT.git
  cd SearchGPT
- Install dependencies with UV
  uv sync
  Or with pip:
  pip install -e .
- Set up environment variables
  cp .env.example .env  # Edit .env and add your API keys
uv run uvicorn src.api.main:app --reload
uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000
Visit http://localhost:8000/docs for the interactive API documentation.
uv run pytest
pytest
pytest --cov=src --cov-report=html
Example search request:
curl -X POST "http://localhost:8000/api/v1/search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How does hybrid search work?",
    "top_k": 10,
    "use_reranking": true,
    "hybrid_alpha": 0.5
  }'
Example response:
{
  "query": "How does hybrid search work?",
  "results": [
    {
      "id": "doc1",
      "title": "Introduction to Hybrid Search",
      "content": "Hybrid search combines...",
      "score": 0.95,
      "metadata": {}
    }
  ],
  "total": 1,
  "processing_time_ms": 123.45
}
Configuration is managed through environment variables (see .env.example):
- OPENAI_API_KEY: OpenAI API key for embeddings and re-ranking
- DEFAULT_LLM_MODEL: LLM model to use (default: gpt-4o-mini)
- EMBEDDING_MODEL: Embedding model (default: text-embedding-3-small)
- DEFAULT_TOP_K: Number of results to return (default: 10)
- DEFAULT_HYBRID_ALPHA: Balance between BM25 (0.0) and vector (1.0) search
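The variables above could be read along the following lines. This is a minimal sketch, not the project's actual config module: the `Settings` class and `load_settings` function are hypothetical names, the fallback of 0.5 for `DEFAULT_HYBRID_ALPHA` is an assumption borrowed from the example request (no default is documented), and treating a missing `OPENAI_API_KEY` as an empty string is likewise assumed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""

    openai_api_key: str
    default_llm_model: str
    embedding_model: str
    default_top_k: int
    default_hybrid_alpha: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    top_k = int(env.get("DEFAULT_TOP_K", "10"))
    if top_k <= 0:
        raise ValueError("DEFAULT_TOP_K must be a positive integer")

    # Assumption: 0.5 is used when unset; the README states no default.
    alpha = float(env.get("DEFAULT_HYBRID_ALPHA", "0.5"))
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("DEFAULT_HYBRID_ALPHA must be between 0.0 and 1.0")

    return Settings(
        # Assumption: a missing key becomes ""; the real app may reject it.
        openai_api_key=env.get("OPENAI_API_KEY", ""),
        default_llm_model=env.get("DEFAULT_LLM_MODEL", "gpt-4o-mini"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        default_top_k=top_k,
        default_hybrid_alpha=alpha,
    )


if __name__ == "__main__":
    settings = load_settings()
    print(settings.default_llm_model, settings.default_top_k)
```

Validating the values at load time surfaces a bad `.env` entry at startup rather than on the first search request.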
uv run black src tests
uv run ruff check src tests
uv add package-name
uv add --dev package-name
- scripts/setup_indices.py: Initialize search indices
- scripts/run_benchmark.py: Run evaluation benchmarks
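For readers unfamiliar with the metrics named in the feature list, here is a small sketch of MRR and NDCG. It is not the code in `src/evaluation/` or `scripts/run_benchmark.py`; the function names are hypothetical, and the NDCG shown uses the linear-gain variant (gain equals the relevance grade) with a log2 discount, which is one of several common definitions.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gain and log2(position + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids: Sequence[str], relevance: Dict[str, float], k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    grades = {"a": 3.0, "b": 2.0, "c": 1.0}
    print("NDCG@3:", ndcg_at_k(["c", "b", "a"], grades, k=3))
    print("MRR:", mean_reciprocal_rank([(["x", "a"], {"a"}), (["a"], {"a"})]))
```

MRR only rewards the position of the first relevant hit, while NDCG accounts for graded relevance across the whole top-k list, so the two can disagree on which ranking is better.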
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Run the test suite
- Submit a pull request
MIT License - see LICENSE file for details
Research papers and documentation can be found in the resources/ directory.