Fast-Plaid 1.4.1 Release Notes
Overview
Fast-Plaid 1.4.1 introduces incremental index updates with dynamic centroid expansion and a new low memory mode that significantly reduces GPU VRAM usage. This release focuses on making Fast-Plaid more efficient for production workloads with evolving document collections.
Key Features
Incremental Updates with Dynamic Centroid Expansion
The .update() method now supports intelligent centroid management:
- Buffered Updates: New documents are accumulated in a buffer. When the buffer reaches the threshold (default: 100 documents), the system triggers a centroid expansion check.
- Automatic Centroid Expansion: Embeddings far from existing centroids (outliers) are automatically identified and clustered to create new centroids, ensuring the index adapts to new data distributions over time.
- Efficient Small Updates: Small batches below the buffer size are processed immediately without centroid expansion for fast incremental updates.
This replaces the previous behavior where centroids remained fixed after initial index creation, which could lead to accuracy degradation as data distributions shifted.
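The buffering and expansion flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not Fast-Plaid's implementation: the class `BufferedIndex`, the function `expand_centroids`, and the `threshold` parameter are hypothetical names, outliers are detected with a fixed Euclidean distance cutoff, and a simple greedy grouping stands in for the k-means step.

```python
import math


def nearest_distance(vec, centroids):
    """Euclidean distance from vec to its closest centroid."""
    return min(math.dist(vec, c) for c in centroids)


def expand_centroids(centroids, embeddings, threshold):
    """Return the existing centroids plus new ones covering the outliers.

    Hypothetical helper: an embedding is an outlier when it lies farther
    than `threshold` from every existing centroid.
    """
    outliers = [e for e in embeddings if nearest_distance(e, centroids) > threshold]
    groups = []  # each group collects outliers close to its first member
    for e in outliers:
        for group in groups:
            if math.dist(e, group[0]) <= threshold:
                group.append(e)
                break
        else:
            groups.append([e])
    # One new centroid per group: the mean of the group's members.
    new_centroids = [[sum(col) / len(g) for col in zip(*g)] for g in groups]
    return centroids + new_centroids


class BufferedIndex:
    """Toy index that only tracks centroids and a document buffer."""

    def __init__(self, centroids, buffer_size=100, threshold=1.0):
        self.centroids = list(centroids)
        self.buffer_size = buffer_size
        self.threshold = threshold  # assumed outlier cutoff, not a Fast-Plaid parameter
        self.buffer = []

    def update(self, documents):
        """Buffer documents (lists of embeddings); expand once the buffer is full.

        Returns True when a centroid expansion check ran.
        """
        self.buffer.extend(documents)
        if len(self.buffer) < self.buffer_size:
            return False  # small update: no expansion check
        embeddings = [e for doc in self.buffer for e in doc]
        self.centroids = expand_centroids(self.centroids, embeddings, self.threshold)
        self.buffer.clear()
        return True
```

With `buffer_size=2`, a first single-document update returns `False` and leaves the centroids untouched; a second update fills the buffer, and any embeddings far from the existing centroids produce new ones.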
Low Memory Mode
New low_memory parameter (default: True) reduces GPU VRAM usage:
fast_plaid = search.FastPlaid(index="index", device="cuda", low_memory=True)
- Document tensors are kept on CPU and moved to GPU only when needed during search
- Significantly reduces VRAM footprint for large indexes
- Trade-off: Slightly slower search performance
- No effect when device="cpu"
Memory-Optimized K-means
- Eliminates unnecessary numpy conversions during centroid computation
Embedding Reconstruction
New Rust function to reconstruct original embeddings from compressed index data, useful for debugging and analysis.
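To give a sense of what reconstruction involves, here is a minimal Python sketch assuming PLAID-style residual compression, where each embedding is stored as a centroid plus a per-dimension quantized residual. The function names, the bit-packing order, and the `bucket_weights` argument are assumptions for illustration; the actual Rust function and Fast-Plaid's on-disk layout may differ.

```python
def unpack_codes(packed, nbits, dim):
    """Unpack `dim` residual codes of `nbits` bits each from a bytes object.

    Assumes codes are packed most-significant-bits first and that
    `nbits` divides 8.
    """
    codes = []
    per_byte = 8 // nbits
    mask = (1 << nbits) - 1
    for byte in packed:
        for slot in range(per_byte):
            shift = 8 - nbits * (slot + 1)
            codes.append((byte >> shift) & mask)
    return codes[:dim]


def reconstruct(centroid, packed_residual, bucket_weights, nbits):
    """Approximate the original embedding: centroid + dequantized residual.

    `bucket_weights[code]` is the representative residual value for a code.
    """
    codes = unpack_codes(packed_residual, nbits, len(centroid))
    return [c + bucket_weights[code] for c, code in zip(centroid, codes)]
```

The result is an approximation of the original embedding, since quantization discards information; comparing it against the source embedding is one way to inspect compression error.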
Thread-Safe Operations
- File locking mechanism ensures safe concurrent access to indexes
- Prevents corruption during simultaneous read/write operations
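The general idea behind lock-file protection can be sketched with the standard library alone. This is a simplified stand-in, not Fast-Plaid's code: the helper `index_lock` and its parameters are hypothetical, and it relies on atomic exclusive file creation rather than on the `filelock` package that the release depends on.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def index_lock(lock_path, timeout=10.0, poll=0.05):
    """Hold an exclusive lock file for the duration of the with-block.

    Hypothetical helper: O_CREAT | O_EXCL makes file creation fail if the
    lock file already exists, so only one holder can succeed at a time.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {lock_path}")
            time.sleep(poll)  # another process holds the lock; retry
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock for the next holder
```

A writer would wrap index mutations in `with index_lock("index/.lock"):` (the path is illustrative), so that a concurrent reader or writer waits until the block exits. A lock file left behind by a crashed process would block later callers in this simple version, which is one reason to prefer a maintained library in practice.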
Configuration
New Parameters
| Parameter | Default | Description |
|---|---|---|
| low_memory | True | Keep tensors on CPU, move to GPU only when needed |
| buffer_size | 100 | Documents to accumulate before centroid expansion |
| start_from_scratch | 999 | Rebuild index if fewer documents exist |
| max_points_per_centroid | 256 | Maximum points per centroid during expansion |
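One possible reading of how `start_from_scratch` and `buffer_size` interact during an update is sketched below. The function `plan_update` and its return labels are hypothetical, and the ordering of the checks is an assumption drawn from the parameter descriptions rather than from Fast-Plaid's source.

```python
def plan_update(num_indexed, num_buffered, num_incoming,
                buffer_size=100, start_from_scratch=999):
    """Pick an update strategy from the documented thresholds (assumed logic).

    num_indexed:  documents already in the index
    num_buffered: documents waiting in the update buffer
    num_incoming: documents in the current update call
    """
    if num_indexed < start_from_scratch:
        return "rebuild"  # small index: assumed cheaper to rebuild entirely
    if num_buffered + num_incoming >= buffer_size:
        return "expand"   # buffer full: run the centroid expansion check
    return "append"       # small batch: add documents without expansion
```

For example, under these assumptions an index of 5,000 documents with 95 buffered documents would run an expansion check when 10 more arrive, while the same update against an empty buffer would simply append.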
Breaking Changes
- Default batch_size for .create() increased from 25,000 to 50,000
- fastkmeans dependency pinned to version 0.5.0
New Dependencies
- filelock>=3.20.0: For thread-safe index operations
- usearch>=2.21.0: For efficient similarity search during updates on CPU. Used to spot outliers.
Installation
pip install fast-plaid==1.4.1.290 # PyTorch 2.9.0
pip install fast-plaid==1.4.1.280 # PyTorch 2.8.0
pip install fast-plaid==1.4.1.271 # PyTorch 2.7.1
pip install fast-plaid==1.4.1.270 # PyTorch 2.7.0
Upgrade Notes
Existing indexes created with v1.3.x are compatible with v1.4.1. The new centroid expansion features will activate automatically when using .update() on existing indexes.
For users who were experiencing accuracy degradation with frequent updates, this release should significantly improve long-term index quality without requiring full re-indexing.
Contributors
@raphaelsty