Releases: lightonai/fast-plaid
1.4.1
Fast-Plaid 1.4.1 Release Notes
Overview
Fast-Plaid 1.4.1 introduces incremental index updates with dynamic centroid expansion and a new low memory mode that significantly reduces GPU VRAM usage. This release focuses on making Fast-Plaid more efficient for production workloads with evolving document collections.
Key Features
Incremental Updates with Dynamic Centroid Expansion
The .update() method now supports intelligent centroid management:
- Buffered Updates: New documents are accumulated in a buffer. When the buffer reaches the threshold (default: 100 documents), the system triggers a centroid expansion check.
- Automatic Centroid Expansion: Embeddings far from existing centroids (outliers) are automatically identified and clustered to create new centroids, ensuring the index adapts to new data distributions over time.
- Efficient Small Updates: Small batches below the buffer size are processed immediately without centroid expansion for fast incremental updates.
This replaces the previous behavior where centroids remained fixed after initial index creation, which could lead to accuracy degradation as data distributions shifted.
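A usage sketch of the new behavior (assuming `update()` accepts a `documents_embeddings` list of per-document token-embedding tensors; the exact signature may differ):
```python
import torch
from fast_plaid import search

fast_plaid = search.FastPlaid(index="index")  # load an existing index

# 150 hypothetical new documents, each a (tokens, dim) tensor of embeddings.
new_documents = [torch.randn(300, 128) for _ in range(150)]

# Batches smaller than buffer_size (default 100) are applied immediately;
# once the buffer fills, outliers are clustered into new centroids.
fast_plaid.update(documents_embeddings=new_documents)
```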
Low Memory Mode
New low_memory parameter (default: True) reduces GPU VRAM usage:
```python
fast_plaid = search.FastPlaid(index="index", device="cuda", low_memory=True)
```
- Document tensors are kept on CPU and moved to GPU only when needed during search
- Significantly reduces VRAM footprint for large indexes
- Trade-off: Slightly slower search performance
- No effect when `device="cpu"`
Memory-Optimized K-means
- Eliminates unnecessary numpy conversions during centroid computation
Embedding Reconstruction
New Rust function to reconstruct original embeddings from compressed index data, useful for debugging and analysis.
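Conceptually, reconstruction inverts the residual compression used by PLAID-style indexes: each token embedding is stored as a centroid ID plus a quantized residual, so the original vector is approximated as centroid + dequantized residual. A schematic Python sketch (all names and the 8-bit scheme are illustrative, not the actual Rust implementation):
```python
import torch

def reconstruct_embeddings(
    centroids: torch.Tensor,  # (n_centroids, dim) float32 centroids
    codes: torch.Tensor,      # (n_tokens,) centroid ID assigned to each token
    residuals: torch.Tensor,  # (n_tokens, dim) quantized residuals (uint8 here)
    scale: float,             # dequantization scale (illustrative)
) -> torch.Tensor:
    # Approximate embedding = assigned centroid + dequantized residual.
    dequantized = (residuals.float() - 128.0) * scale
    return centroids[codes.long()] + dequantized
```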
Thread-Safe Operations
- File locking mechanism ensures safe concurrent access to indexes
- Prevents corruption during simultaneous read/write operations
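Fast-Plaid handles this locking internally via the new `filelock` dependency; the sketch below only illustrates the underlying pattern (the lock-file path is hypothetical):
```python
from filelock import FileLock

lock = FileLock("index/.fast_plaid.lock")  # hypothetical lock-file path

# All processes acquire the same lock, so a concurrent update() cannot
# interleave with a search() that is reading the index files.
with lock:
    pass  # read or write index files safely here
```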
Configuration
New Parameters
| Parameter | Default | Description |
|---|---|---|
| `low_memory` | `True` | Keep tensors on CPU, move to GPU only when needed |
| `buffer_size` | `100` | Documents to accumulate before centroid expansion |
| `start_from_scratch` | `999` | Rebuild index if fewer documents exist |
| `max_points_per_centroid` | `256` | Maximum points per centroid during expansion |
Breaking Changes
- Default `batch_size` for `.create()` increased from 25,000 to 50,000
- `fastkmeans` dependency pinned to version `0.5.0`
New Dependencies
- `filelock>=3.20.0` - For thread-safe index operations
- `usearch>=2.21.0` - For efficient similarity search during updates on CPU; used to spot outliers
Installation
```bash
pip install fast-plaid==1.4.1.290  # PyTorch 2.9.0
pip install fast-plaid==1.4.1.280  # PyTorch 2.8.0
pip install fast-plaid==1.4.1.271  # PyTorch 2.7.1
pip install fast-plaid==1.4.1.270  # PyTorch 2.7.0
```
Upgrade Notes
Existing indexes created with v1.3.x are compatible with v1.4.1. The new centroid expansion features will activate automatically when using .update() on existing indexes.
For users who were experiencing accuracy degradation with frequent updates, this release should significantly improve long-term index quality without requiring full re-indexing.
Contributors: @raphaelsty
1.3.1
Small release that reduces the memory usage of Fast-Plaid index creation. Getting better one step at a time.
1.3.0
v1.3.0: Memory Optimizations & Architecture Improvements
This release introduces significant reductions in memory usage and improves index management.
Performance & Memory
- Memory-Mapped Loading: Implemented a new loading system with incremental updates and zero-copy validation to prevent loading entire indices into RAM with the `update` method.
- Optimized Tensors: Shifted to smaller integer types (`Uint8`, `Int32`) where appropriate and replaced `torch.quantile` with a custom implementation to bypass Torch limitations.
- Object-Based Management: Replaced the global index cache with direct object passing, allowing Python to fully manage the index lifecycle.
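For context, `torch.quantile` raises an error on inputs larger than roughly 16 million elements, which index-scale tensors easily exceed. One way around the limit is `kthvalue`, sketched here with nearest-rank interpolation (not necessarily the implementation Fast-Plaid ships):
```python
import torch

def large_quantile(x: torch.Tensor, q: float) -> torch.Tensor:
    # torch.kthvalue has no input-size ceiling, unlike torch.quantile.
    flat = x.flatten()
    k = int(round(q * (flat.numel() - 1))) + 1  # nearest rank, 1-indexed
    return flat.kthvalue(k).values
```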
API & Behavior Changes
- Automatic Parallelism: Simplified the API by abstracting multi-device logic. If no device is provided, search now automatically distributes across available GPUs. When `device="cpu"` is specified, the index initializes faster.
- Default Settings: Changed the default search `batch_size` from 25,000 to 2,000.
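In practice (both calls use the constructor shown elsewhere in these notes):
```python
from fast_plaid import search

# No device argument: search distributes across all available GPUs.
fast_plaid = search.FastPlaid(index="index")

# Explicit CPU: single-device execution with faster index start-up.
fast_plaid_cpu = search.FastPlaid(index="index", device="cpu")
```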
1.2.5
FastPlaid 1.2.5: Leaner & Faster
We're excited to release FastPlaid 1.2.5! This version focuses on significant optimizations for indexing, giving you faster search speeds and much more efficient GPU VRAM management.
✨ Highlights
- Drastically Reduced GPU VRAM Usage: The indexing process has been refactored to handle document embeddings in batches (see the sketch after this list), massively reducing GPU VRAM consumption during index creation with no impact on CPU RAM usage or indexing speed.
- Blazing-Fast Search for APIs: Centroids are now pre-loaded into memory by default during indexing and when creating the FastPlaid object, accelerating search on large-scale indexes deployed in API environments. Disable it with `preload_index=False` if you run many replicas of Fast-Plaid indexes; otherwise keep it on.
- Improved Memory Control: The new `batch_size` parameter gives finer control over memory usage during indexing.
- Indexing Progress Bar: Track the status of your index creation.
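The batching idea behind the VRAM reduction is generic: move one chunk of embeddings to the GPU at a time instead of the whole corpus. A conceptual sketch of nearest-centroid assignment done this way (illustrative, not Fast-Plaid's internal code):
```python
import torch

def assign_to_centroids(
    embeddings: list[torch.Tensor],  # per-document (tokens, dim) tensors on CPU
    centroids: torch.Tensor,         # (n_centroids, dim) tensor on GPU
    batch_size: int = 2000,
) -> list[torch.Tensor]:
    # Peak VRAM stays proportional to batch_size, not corpus size.
    codes = []
    for start in range(0, len(embeddings), batch_size):
        chunk = embeddings[start : start + batch_size]
        batch = torch.cat(chunk).to(centroids.device)
        # Nearest centroid by maximum inner product for every token.
        codes.append((batch @ centroids.T).argmax(dim=-1).cpu())
        del batch  # free VRAM before loading the next chunk
    return codes
```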
Housekeeping
- Code Clarity: Several variables have been renamed to improve the overall clarity and readability of the code.
- PyTorch 2.9.0 Support: This release is fully compatible with PyTorch 2.9.0.
- Dependency Note: Support for PyTorch 2.6.0 has been temporarily dropped due to compatibility issues.
Contributors: @raphaelsty @fschlatt
1.2.4
Version 1.2.4 of fast-plaid now supports Python 3.13 and uploads dedicated wheels to PyPI.
1.2.3
The 1.2.3 version of Fast-Plaid enhances index mutability by adding support for deleting specific embeddings.
It also includes a built-in SQLite filtering pipeline.
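One way to combine the two: resolve a metadata filter to document IDs with SQLite, then pass them as the `subset` argument of `search()` (introduced in 1.2.0 below). The metadata table here is hypothetical:
```python
import sqlite3
import torch
from fast_plaid import search

# Hypothetical metadata table mapping document IDs to attributes.
connection = sqlite3.connect("metadata.db")
rows = connection.execute(
    "SELECT doc_id FROM documents WHERE language = ?", ("en",)
).fetchall()
subset = [doc_id for (doc_id,) in rows]

fast_plaid = search.FastPlaid(index="index")
scores = fast_plaid.search(
    queries_embeddings=torch.randn(2, 50, 128),
    top_k=5,
    subset=subset,  # score only the documents matching the SQL filter
)
```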
1.2.1
This new release allows feeding Fast-Plaid with un-padded queries. It also normalizes decompressed embeddings to further improve results, and fixes an issue on small datasets where fast-kmeans would be initialized with more clusters than there are training data points. This version will be integrated into PyLate as the backend for search.
1.2.0
This new release introduces filtering for Fast-Plaid, allowing any system to interoperate with it by providing subset IDs to score.
```python
import torch
from fast_plaid import search

fast_plaid = search.FastPlaid(index="index")  # Load an existing index

# Apply a single filter to all queries
# Search for the top 5 results only within documents [2, 5, 10, 15, 18]
scores = fast_plaid.search(
    queries_embeddings=torch.randn(2, 50, 128),  # 2 queries
    top_k=5,
    subset=[2, 5, 10, 15, 18],
)
print(scores)

# Apply a different filter for each query
# Query 1: search within documents [0, 1, 2, 3, 4]
# Query 2: search within documents [10, 11, 12, 13, 14]
scores = fast_plaid.search(
    queries_embeddings=torch.randn(2, 50, 128),  # 2 queries
    top_k=5,
    subset=[
        [0, 1, 2, 3, 4],
        [10, 11, 12, 13, 14],
    ],
)
print(scores)
```
1.1.0
Introducing mutable indexes with the `update` method.
Also adds a new parameter, `n_samples_kmeans`, which lets you modulate the number of samples used to compute the centroids and reduce memory usage on demand.
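A sketch of both features together (the placement of `n_samples_kmeans` on `create()` is assumed, since these notes do not show the call site):
```python
import torch
from fast_plaid import search

fast_plaid = search.FastPlaid(index="index")

# Cap the number of embeddings sampled for k-means to lower memory usage.
fast_plaid.create(
    documents_embeddings=[torch.randn(300, 128) for _ in range(1000)],
    n_samples_kmeans=100_000,  # assumed parameter placement
)

# Indexes are now mutable: append new documents in place.
fast_plaid.update(documents_embeddings=[torch.randn(300, 128)])
```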
1.0.3
Eased the Torch dependency constraint.