Scalable Qdrant vector database cluster with Docker Compose, monitoring, and comprehensive documentation for high-performance similarity search applications.

Qdrant Multi-Node Cluster

Scalable vector database deployment for efficient similarity search across multiple nodes


📖 Overview

This project demonstrates a distributed deployment of Qdrant, a high-performance vector database. It shows how to run multiple Qdrant nodes as a single cluster, so that collections can be sharded and replicated across nodes for high availability and search performance.

Qdrant is designed for enterprise-grade vector similarity search, supporting a wide range of use cases:

  • Semantic text search: Find documents with similar meaning, not just keywords
  • Image similarity: Locate visually similar images
  • Recommendation systems: Suggest products, content, or services
  • Anomaly detection: Identify outliers in vector spaces
  • Chatbot knowledge base: Power semantic retrieval for AI assistants

✨ Key Features

  • 🔄 Scalable Multi-Node Architecture: Deploy 3+ Qdrant nodes that work as a unified cluster
  • 📊 Dynamic Sharding: Distribute vector data across nodes with customizable sharding strategies
  • 🏠 High Availability: Built-in replication for fault tolerance and continuous operation
  • 📈 Monitoring Stack: Integrated Prometheus and Grafana for real-time metrics visualization
  • 🐍 Python Client Integration: Comprehensive demo application showing cluster interaction
  • 🐳 Docker-Based Deployment: Simple setup using Docker Compose
  • 🔧 Detailed Configuration: Extensive options for tuning and optimizing performance

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • Python 3.8+
  • Git

Installation

# Clone the repository
git clone https://github.com/Mohitkr95/qdrant-multi-node-cluster.git
cd qdrant-multi-node-cluster

# Install the package and dependencies
pip install -e .

Deploy the Cluster

# Start the Qdrant cluster with Prometheus and Grafana
cd deployments/docker
docker-compose up -d
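
Once the containers are running, one way to confirm that the nodes have joined the same cluster is to query Qdrant's `/cluster` REST endpoint on any node. The sketch below uses only the standard library. It assumes a node's REST port is published on localhost:6333 (the port used by the demo command), and the helper names are illustrative rather than part of this repository.

```python
import json
import urllib.request


def summarize_cluster(payload: dict) -> dict:
    """Reduce a /cluster response body to a few fields worth checking."""
    result = payload.get("result", {})
    peers = result.get("peers", {})
    return {
        "enabled": result.get("status") == "enabled",
        "peer_count": len(peers),
        "peer_uris": sorted(p.get("uri", "") for p in peers.values()),
    }


def fetch_cluster_summary(base_url: str = "http://localhost:6333") -> dict:
    """Call GET /cluster on one node (assumed reachable at base_url)."""
    with urllib.request.urlopen(f"{base_url}/cluster", timeout=5) as resp:
        return summarize_cluster(json.load(resp))
```

Calling `fetch_cluster_summary()` against a running node should report cluster mode as enabled, with a peer count that can be compared against the number of nodes defined in docker-compose.yml.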

Run the Demo

# Run the demonstration
python src/run_demo.py

# Or with custom parameters
python src/run_demo.py --host localhost --port 6333 --points 2000
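
For readers unfamiliar with this style of command line, the flags above could be parsed along the following lines. This is a hypothetical sketch, not the contents of run_demo.py: the flag names come from the command shown, 6333 is Qdrant's default REST port, and the default for `--points` is an assumption made purely for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the demo command."""
    parser = argparse.ArgumentParser(description="Illustrative demo options")
    parser.add_argument("--host", default="localhost", help="Qdrant host")
    parser.add_argument("--port", type=int, default=6333, help="Qdrant REST port")
    # The default of 1000 is an assumption; the real script may differ.
    parser.add_argument("--points", type=int, default=1000,
                        help="number of points to insert")
    return parser


# Parse the same arguments as the custom-parameters example.
args = build_parser().parse_args(
    ["--host", "localhost", "--port", "6333", "--points", "2000"]
)
```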

Access Services

📋 Documentation

Comprehensive documentation is available in the docs directory, with API documentation under docs/api and user guides under docs/guides.

🛠️ Project Structure

qdrant-multi-node-cluster/
├── config/                    # Configuration files
│   ├── grafana.json           # Grafana dashboard configuration
│   └── prometheus.yml         # Prometheus configuration
├── deployments/               # Deployment files
│   └── docker/                # Docker-related files
│       └── docker-compose.yml # Docker Compose configuration
├── docs/                      # Documentation
│   ├── api/                   # API documentation
│   ├── guides/                # User guides
│   └── images/                # Documentation images
├── src/                       # Source code
│   ├── qdrant_demo/           # Main package
│   │   ├── config/            # Configuration settings
│   │   ├── core/              # Core functionality
│   │   └── utils/             # Utility functions
│   └── run_demo.py            # Main entry point
├── tests/                     # Test files
├── LICENSE                    # MIT License
├── Makefile                   # Development commands
├── README.md                  # Project overview
├── requirements.txt           # Python dependencies
└── setup.py                   # Package setup file

📊 Monitoring and Visualization

This project integrates Prometheus for metrics collection and Grafana for visualization, providing real-time insights into your Qdrant cluster's performance.
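
Prometheus gathers these metrics by scraping each node's `/metrics` endpoint, which returns plain text in the Prometheus exposition format. As a rough illustration of what that data looks like, the sketch below parses a few sample lines into a dictionary. The sample metric names and values are placeholders, not captured output from this cluster, and the parser only handles simple `name value` lines without timestamps.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus exposition text into {sample_name: float}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. Labels, if
    present, stay part of the key. Timestamps are not handled.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Placeholder input for illustration only.
SAMPLE = """\
# HELP collections_total number of collections
# TYPE collections_total gauge
collections_total 1
cluster_peers_total 3
"""

metrics = parse_metrics(SAMPLE)
```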

Grafana Dashboard

🔍 Advanced Configuration

Sharding Configuration

Customize sharding to distribute data efficiently:

# In settings.py
SHARD_NUMBER = 4  # Default shard count
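
The shard count takes effect when a collection is created. As a sketch of how such a setting reaches Qdrant, the snippet below builds (without sending) the REST request that creates a collection with an explicit `shard_number` and `replication_factor`. The collection name, the replication factor of 2, and the helper name are assumptions for illustration. Only the shard count of 4 and the 768-dimension cosine vectors come from this README.

```python
import json
import urllib.request

SHARD_NUMBER = 4        # value shown in settings.py above
REPLICATION_FACTOR = 2  # assumed for illustration; not taken from the repo


def build_create_collection_request(
    base_url: str, name: str, vector_size: int = 768
) -> urllib.request.Request:
    """Build a PUT /collections/{name} request; the caller decides when to send it."""
    body = {
        "vectors": {"size": vector_size, "distance": "Cosine"},
        "shard_number": SHARD_NUMBER,
        "replication_factor": REPLICATION_FACTOR,
    }
    return urllib.request.Request(
        url=f"{base_url}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "demo_collection" is a hypothetical name.
request = build_create_collection_request("http://localhost:6333", "demo_collection")
```

Passing `request` to `urllib.request.urlopen` would send it to a running node. With 4 shards and a replication factor of 2, the cluster would hold 8 shard replicas in total, spread across the available nodes.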

Vector Parameters

Configure vector dimensions and distance metrics:

# In cluster_demo.py
client.create_collection(
    collection_name=self.collection_name,
    vectors_config=models.VectorParams(
        size=self.vector_size,  # 768 by default 
        distance=models.Distance.COSINE
    ),
    # ...other parameters
)
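
`Distance.COSINE` ranks results by cosine similarity: the dot product of two vectors divided by the product of their lengths. The standard-library sketch below shows the calculation on small vectors. It illustrates the metric only and makes no claim about how Qdrant computes it internally.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score -1, regardless of their magnitudes.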

Adding More Nodes

Extend the cluster by adding more nodes in docker-compose.yml:

qdrant_node4:
  image: qdrant/qdrant:v1.6.1
  volumes:
    - ./data/node4:/qdrant/storage
  depends_on:
    - qdrant_node1
  environment:
    QDRANT__CLUSTER__ENABLED: "true"
  command: "./qdrant --bootstrap http://qdrant_node1:6335 --uri http://qdrant_node4:6335"
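
When several nodes are added, the service entries differ only by node number. The sketch below generates the same fields as the qdrant_node4 example for any node index, as a plain dictionary. The function name is hypothetical, and converting the dictionary to YAML is left out to keep the example dependency-free.

```python
def make_node_service(index: int, image: str = "qdrant/qdrant:v1.6.1") -> dict:
    """Return {service_name: definition} mirroring the qdrant_node4 entry above."""
    if index < 2:
        raise ValueError("node 1 is the bootstrap peer; generate nodes 2 and up")
    name = f"qdrant_node{index}"
    return {
        name: {
            "image": image,
            "volumes": [f"./data/node{index}:/qdrant/storage"],
            "depends_on": ["qdrant_node1"],
            "environment": {"QDRANT__CLUSTER__ENABLED": "true"},
            # Every extra node bootstraps from node 1 and advertises its own URI.
            "command": (
                "./qdrant --bootstrap http://qdrant_node1:6335 "
                f"--uri http://{name}:6335"
            ),
        }
    }


# Example: definitions for a fourth and fifth node.
services = {}
for i in (4, 5):
    services.update(make_node_service(i))
```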

🧪 Testing

Run the test suite:

# Run all tests
make test

# Or directly with Python
python -m unittest discover -s tests

🤝 Contributing

Contributions are welcome! See our Contributing Guide for details on how to get started.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

👏 Acknowledgments

📧 Contact

Mohit Kumar - @Mohitkr95

Project Link: https://github.com/Mohitkr95/qdrant-multi-node-cluster