Scalable vector database deployment for efficient similarity search across multiple nodes
This project demonstrates a scalable, distributed deployment of Qdrant, a high-performance vector database. It showcases how to set up multiple Qdrant nodes in a clustered configuration, enabling efficient vector search operations with high availability and performance.
Qdrant is designed for enterprise-grade vector similarity search, supporting a wide range of use cases:
- Semantic text search: Find documents with similar meaning, not just keywords
- Image similarity: Locate visually similar images
- Recommendation systems: Suggest products, content, or services
- Anomaly detection: Identify outliers in vector spaces
- Chatbot knowledge base: Power semantic retrieval for AI assistants
- Scalable Multi-Node Architecture: Deploy 3+ Qdrant nodes that work as a unified cluster
- Dynamic Sharding: Distribute vector data across nodes with customizable sharding strategies
- High Availability: Built-in replication for fault tolerance and continuous operation
- Monitoring Stack: Integrated Prometheus and Grafana for real-time metrics visualization
- Python Client Integration: Comprehensive demo application showing cluster interaction
- Docker-Based Deployment: Simple setup using Docker Compose
- Detailed Configuration: Extensive options for tuning and optimizing performance
- Docker and Docker Compose
- Python 3.8+
- Git
# Clone the repository
git clone https://github.com/Mohitkr95/qdrant-multi-node-cluster.git
cd qdrant-multi-node-cluster
# Install the package and dependencies
pip install -e .
# Start the Qdrant cluster with Prometheus and Grafana
cd deployments/docker
docker-compose up -d
# Return to the repository root, then run the demonstration
cd ../..
python src/run_demo.py
# Or with custom parameters
python src/run_demo.py --host localhost --port 6333 --points 2000
- Qdrant API: http://localhost:6333
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default login: admin/admin)
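Once the stack is up, a quick way to confirm that the three services respond is to probe each URL. The sketch below uses only the Python standard library. The `probe` helper and the `ENDPOINTS` mapping are illustrative names, not part of the repository, and the check only confirms that an HTTP response comes back, not that the cluster itself is healthy.

```python
import urllib.error
import urllib.request

# Default URLs listed above; adjust them if you changed the port mappings.
ENDPOINTS = {
    "Qdrant API": "http://localhost:6333",
    "Prometheus": "http://localhost:9090",
    "Grafana": "http://localhost:3000",
}


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied (for example with 401 or 404), so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name:<12} {url:<24} {'up' if probe(url) else 'down'}")
```

If a service shows as down, `docker-compose ps` in deployments/docker is a reasonable next step to see whether its container started.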
Comprehensive documentation is available in the docs directory:
- Getting Started Guide
- Architecture Overview
- Configuration Guide
- API Reference
- Performance Tuning
- Troubleshooting
qdrant-multi-node-cluster/
├── config/                     # Configuration files
│   ├── grafana.json            # Grafana dashboard configuration
│   └── prometheus.yml          # Prometheus configuration
├── deployments/                # Deployment files
│   └── docker/                 # Docker-related files
│       └── docker-compose.yml  # Docker Compose configuration
├── docs/                       # Documentation
│   ├── api/                    # API documentation
│   ├── guides/                 # User guides
│   └── images/                 # Documentation images
├── src/                        # Source code
│   ├── qdrant_demo/            # Main package
│   │   ├── config/             # Configuration settings
│   │   ├── core/               # Core functionality
│   │   └── utils/              # Utility functions
│   └── run_demo.py             # Main entry point
├── tests/                      # Test files
├── LICENSE                     # MIT License
├── Makefile                    # Development commands
├── README.md                   # Project overview
├── requirements.txt            # Python dependencies
└── setup.py                    # Package setup file
This project integrates Prometheus for metrics collection and Grafana for visualization, providing real-time insights into your Qdrant cluster's performance.
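Prometheus collects these metrics by scraping a plain-text endpoint (Qdrant serves one at `/metrics` on its API port). As a rough illustration of what that text looks like and how it can be read outside Grafana, the sketch below parses a few lines in the Prometheus exposition format. The `parse_metrics` helper is hypothetical, the sample text is hand-written for illustration rather than captured from a running cluster, and the parser assumes samples carry no trailing timestamp.

```python
# Illustrative sample in Prometheus text format (not real cluster output).
SAMPLE = """\
# HELP collections_total number of collections
# TYPE collections_total gauge
collections_total 1
# HELP cluster_peers_total total number of cluster peers
# TYPE cluster_peers_total gauge
cluster_peers_total 3
app_info{name="qdrant",version="1.6.1"} 1
"""


def parse_metrics(text: str) -> dict:
    """Parse exposition-format lines into {sample_name_with_labels: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token; everything before it
        # is the metric name plus optional {labels}.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
for name, value in metrics.items():
    print(f"{name} = {value}")
```

For anything beyond a quick check, the Prometheus UI at http://localhost:9090 and the bundled Grafana dashboard remain the intended way to explore these values.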
Customize sharding to distribute data efficiently:
# In settings.py
SHARD_NUMBER = 4 # Default shard count
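To see where the shard count ends up, the following sketch builds the JSON body that Qdrant's REST API accepts when creating a collection (`PUT /collections/{name}`). The `collection_body` helper and the `replication_factor` of 2 are assumptions made for illustration; the repository's own code goes through the Python client instead.

```python
import json

SHARD_NUMBER = 4  # mirrors the default quoted above


def collection_body(vector_size: int, shard_number: int = SHARD_NUMBER,
                    replication_factor: int = 1) -> dict:
    """Build the request body for creating a sharded collection."""
    if shard_number < 1 or replication_factor < 1:
        raise ValueError("shard_number and replication_factor must be >= 1")
    return {
        "vectors": {"size": vector_size, "distance": "Cosine"},
        "shard_number": shard_number,
        "replication_factor": replication_factor,
    }


body = collection_body(vector_size=768, replication_factor=2)
print(json.dumps(body, indent=2))
# Total shard replicas the cluster must host: shards x replicas per shard.
print("shard replicas:", body["shard_number"] * body["replication_factor"])
```

With 4 shards and 2 replicas each, the cluster hosts 8 shard replicas in total, so a three-node cluster would carry two or three per node.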
Configure vector dimensions and distance metrics:
# In cluster_demo.py
client.create_collection(
    collection_name=self.collection_name,
    vectors_config=models.VectorParams(
        size=self.vector_size,  # 768 by default
        distance=models.Distance.COSINE
    ),
    # ...other parameters
)
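The distance setting decides how similarity between two vectors is scored. As a plain-Python illustration of what cosine similarity measures (this is not how Qdrant computes it internally, and the `cosine_similarity` helper is only for demonstration), consider:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # orthogonal: 0.0
```

Because the score depends only on direction, scaling a vector does not change its similarity to others, which is why cosine is a common choice for text embeddings.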
Extend the cluster by adding more nodes in docker-compose.yml:
  qdrant_node4:
    image: qdrant/qdrant:v1.6.1
    volumes:
      - ./data/node4:/qdrant/storage
    depends_on:
      - qdrant_node1
    environment:
      QDRANT__CLUSTER__ENABLED: "true"
    command: "./qdrant --bootstrap http://qdrant_node1:6335 --uri http://qdrant_node4:6335"
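When adding several nodes, the entries differ only in the node number. The sketch below renders the same service definition for any index, assuming every extra node follows the naming and port pattern of the qdrant_node4 entry above. The `node_service` helper is hypothetical and not part of the repository; paste its output under `services:` in docker-compose.yml.

```python
def node_service(index: int, bootstrap_node: str = "qdrant_node1",
                 image: str = "qdrant/qdrant:v1.6.1", p2p_port: int = 6335) -> str:
    """Render a docker-compose service entry for an extra Qdrant node."""
    if index < 2:
        raise ValueError("extra nodes start at index 2; node 1 bootstraps the cluster")
    name = f"qdrant_node{index}"
    return "\n".join([
        f"  {name}:",
        f"    image: {image}",
        "    volumes:",
        f"      - ./data/node{index}:/qdrant/storage",
        "    depends_on:",
        f"      - {bootstrap_node}",
        "    environment:",
        '      QDRANT__CLUSTER__ENABLED: "true"',
        # Each node joins via the bootstrap node and advertises its own URI.
        f'    command: "./qdrant --bootstrap http://{bootstrap_node}:{p2p_port} '
        f'--uri http://{name}:{p2p_port}"',
    ])


print(node_service(5))
```

After editing the file, `docker-compose up -d` starts only the services that are new or changed.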
Run the test suite:
# Run all tests
make test
# Or directly with Python
python -m unittest discover -s tests
Contributions are welcome! See our Contributing Guide for details on how to get started.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Qdrant for the excellent vector database
- Prometheus and Grafana for monitoring capabilities
- All contributors who help improve this project
Mohit Kumar - @Mohitkr95
Project Link: https://github.com/Mohitkr95/qdrant-multi-node-cluster