1.4.1

@raphaelsty raphaelsty released this 05 Jan 23:34

Fast-Plaid 1.4.1 Release Notes

Overview

Fast-Plaid 1.4.1 introduces incremental index updates with dynamic centroid expansion and a new low memory mode that significantly reduces GPU VRAM usage. This release focuses on making Fast-Plaid more efficient for production workloads with evolving document collections.

Key Features

Incremental Updates with Dynamic Centroid Expansion

The .update() method now manages centroids dynamically:

  • Buffered Updates: New documents are accumulated in a buffer. When the buffer reaches the threshold (default: 100 documents), the system triggers a centroid expansion check.
  • Automatic Centroid Expansion: Embeddings far from existing centroids (outliers) are automatically identified and clustered to create new centroids, ensuring the index adapts to new data distributions over time.
  • Efficient Small Updates: Small batches below the buffer size are processed immediately without centroid expansion for fast incremental updates.
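The outlier step described above can be illustrated with a minimal sketch. It is not Fast-Plaid's implementation: the Euclidean metric, the `threshold` value, and the helper names `nearest_centroid_distance` and `find_outliers` are all assumptions made for illustration, and the clustering of outliers into new centroids is left out.

```python
import math


def nearest_centroid_distance(embedding, centroids):
    """Return the Euclidean distance from one embedding to its closest centroid."""
    return min(math.dist(embedding, centroid) for centroid in centroids)


def find_outliers(embeddings, centroids, threshold):
    """Return indices of embeddings farther than `threshold` from every centroid.

    `threshold` is a hypothetical cut-off; the release notes do not state
    how Fast-Plaid decides that an embedding is "far" from the centroids.
    """
    return [
        i
        for i, embedding in enumerate(embeddings)
        if nearest_centroid_distance(embedding, centroids) > threshold
    ]


# Toy 2-D data: two existing centroids and three new token embeddings.
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_embeddings = [(0.5, 0.2), (9.8, 10.1), (5.0, -6.0)]

# Only the third embedding is far from both centroids.
outliers = find_outliers(new_embeddings, centroids, threshold=2.0)
```

In a real index the flagged embeddings would then be clustered, and the resulting cluster centers appended to the existing centroid set.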

This replaces the previous behavior where centroids remained fixed after initial index creation, which could lead to accuracy degradation as data distributions shifted.

Low Memory Mode

The new low_memory parameter (default: True) reduces GPU VRAM usage:

fast_plaid = search.FastPlaid(index="index", device="cuda", low_memory=True)

  • Document tensors are kept on CPU and moved to GPU only when needed during search
  • Significantly reduces VRAM footprint for large indexes
  • Trade-off: Slightly slower search performance
  • No effect when device="cpu"
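To make the trade-off concrete, here is a toy model of peak device memory for document tensors. It assumes that, in low memory mode, only one chunk of document tensors is resident on the GPU at a time; the release notes do not specify the transfer granularity, so the chunking, the function name `peak_device_bytes`, and the sizes below are hypothetical.

```python
def peak_device_bytes(chunk_bytes, low_memory):
    """Toy estimate of peak GPU memory used by document tensors.

    Assumption: with low_memory=True one chunk is moved to the GPU at a
    time, so the peak is the largest chunk. With low_memory=False every
    chunk stays resident, so the peak is the total.
    """
    if not chunk_bytes:
        return 0
    return max(chunk_bytes) if low_memory else sum(chunk_bytes)


# Hypothetical chunk sizes, in MiB.
chunks = [400, 250, 350]

resident_all = peak_device_bytes(chunks, low_memory=False)  # all chunks on GPU
resident_one = peak_device_bytes(chunks, low_memory=True)   # largest chunk only
```

The saving comes at the cost of a host-to-device copy for each chunk during search, which is consistent with the slightly slower search noted above.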

Memory-Optimized K-means

  • Eliminates unnecessary numpy conversions during centroid computation

Embedding Reconstruction

A new Rust function reconstructs the original embeddings from compressed index data, which is useful for debugging and analysis.

Thread-Safe Operations

  • File locking mechanism ensures safe concurrent access to indexes
  • Prevents corruption during simultaneous read/write operations
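The idea behind lock-based protection can be sketched with the standard library alone. This is a swapped-in technique (an exclusive lock file created with `O_EXCL`), not the filelock-based mechanism the release uses; the name `index_lock` and its parameters are hypothetical.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def index_lock(lock_path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock file while the body of the `with` block runs."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists,
            # so only one caller at a time can get past this line.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release the lock so waiting readers or writers can proceed.
        os.close(fd)
        os.remove(lock_path)
```

A writer would wrap its index modification in `with index_lock("index.lock"):`. This simple scheme leaves a stale lock file behind if the process is killed mid-update, which is one reason a dedicated library such as filelock is generally the safer choice.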

Configuration

New Parameters

Parameter                 Default   Description
low_memory                True      Keep document tensors on CPU, move to GPU only when needed
buffer_size               100       Documents to accumulate before centroid expansion
start_from_scratch        999       Rebuild the index from scratch if it holds fewer documents than this
max_points_per_centroid   256       Maximum points per centroid during expansion
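How these thresholds might interact during an update can be sketched as a small decision function. The comparison operators, the order of the checks, and the names `UpdateConfig` and `plan_update` are assumptions for illustration; only the default values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateConfig:
    buffer_size: int = 100          # default from the table above
    start_from_scratch: int = 999   # default from the table above


def plan_update(n_indexed, n_buffered, n_new, cfg=UpdateConfig()):
    """Pick an update strategy (hypothetical logic, not Fast-Plaid's code).

    n_indexed:  documents already in the index
    n_buffered: documents accumulated since the last expansion check
    n_new:      documents in the incoming batch
    """
    # Small indexes are cheap to rebuild, so start over with fresh centroids.
    if n_indexed < cfg.start_from_scratch:
        return "rebuild"
    # Enough new documents have accumulated: check for outliers and expand.
    if n_buffered + n_new >= cfg.buffer_size:
        return "expand_centroids"
    # Otherwise add the batch right away against the existing centroids.
    return "append"
```

For example, `plan_update(5000, 95, 10)` returns `"expand_centroids"` under these assumptions, because the batch pushes the buffer past 100 documents.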

Breaking Changes

  • Default batch_size for .create() increased from 25,000 to 50,000
  • fastkmeans dependency pinned to version 0.5.0

New Dependencies

  • filelock>=3.20.0 - For thread-safe index operations
  • usearch>=2.21.0 - For efficient similarity search on CPU during updates, used to identify outlier embeddings

Installation

pip install fast-plaid==1.4.1.290  # PyTorch 2.9.0
pip install fast-plaid==1.4.1.280  # PyTorch 2.8.0
pip install fast-plaid==1.4.1.271  # PyTorch 2.7.1
pip install fast-plaid==1.4.1.270  # PyTorch 2.7.0
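The version suffix in the commands above appears to be the PyTorch version with its dots removed. A small helper, hypothetical and not part of Fast-Plaid, can derive the matching pin from an installed PyTorch version string, restricted to the four builds listed here:

```python
# PyTorch versions for which the release notes list a 1.4.1 build.
SUPPORTED_TORCH = ("2.9.0", "2.8.0", "2.7.1", "2.7.0")


def fast_plaid_pin(torch_version, release="1.4.1"):
    """Return the pip requirement matching a PyTorch version string."""
    # Drop a local build tag such as "+cu126" if present.
    base = torch_version.split("+")[0]
    if base not in SUPPORTED_TORCH:
        raise ValueError(f"no {release} build listed for PyTorch {base}")
    # "2.7.1" becomes the suffix "271".
    return f"fast-plaid=={release}.{base.replace('.', '')}"


pin = fast_plaid_pin("2.7.1+cu126")
```

Passing the value of `torch.__version__` to this helper yields the string to hand to `pip install`.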

Upgrade Notes

Existing indexes created with v1.3.x are compatible with v1.4.1. The new centroid expansion features will activate automatically when using .update() on existing indexes.

For users who were experiencing accuracy degradation with frequent updates, this release should significantly improve long-term index quality without requiring full re-indexing.

Contributors: @raphaelsty