C++ implementation of TurboQuant, a near-optimal online vector quantization algorithm.
Based on the paper *TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate*.
Compresses high-dimensional floating-point vectors to low-bitwidth integers (1-4 bits per coordinate) while preserving inner products and distances. No preprocessing, no training, no codebook learning. Vectors are quantized independently on arrival.
Two quantizers are provided:
- QuantizerMSE minimizes reconstruction error (mean-squared error).
- QuantizerProd provides unbiased inner product estimates with low variance.
Both achieve distortion within a constant factor (~2.7x) of the information-theoretic lower bound.
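The difference between the two quantizers comes down to how values are rounded. As a toy one-dimensional sketch (not the library's actual Lloyd-Max codebooks): round-to-nearest minimizes per-value squared error, while stochastic rounding is unbiased in expectation, which is what keeps inner-product estimates unbiased.

```cpp
#include <cmath>
#include <random>

// Round-to-nearest on a uniform grid: minimizes squared error per value,
// but E[round] != x in general, so estimates built from it are biased.
double round_mse(double x, double step) {
    return std::round(x / step) * step;
}

// Stochastic rounding: pick the upper grid point with probability
// proportional to proximity, so E[round] == x exactly (unbiased).
double round_unbiased(double x, double step, std::mt19937& rng) {
    double lo = std::floor(x / step) * step;
    std::bernoulli_distribution up((x - lo) / step);
    return up(rng) ? lo + step : lo;
}
```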
Depends on Eigen 3.4 (fetched automatically by CMake; no manual install needed). To build:
```sh
mkdir build && cd build
cmake ..
cmake --build .
```
Run the example:
```sh
./turboquant_example
```
```cpp
#include <turboquant/turboquant.h>

#include <random>

using namespace turboquant;

std::mt19937 rng(42);
int dim = 1536;
int bitwidth = 3;

// Input vectors (random here; use your own data in practice)
Vec x = Vec::Random(dim);
Vec y = Vec::Random(dim);

// MSE quantizer: minimizes reconstruction error
QuantizerMSE qmse(dim, bitwidth, rng);
auto compressed_mse = qmse.quantize(x);
Vec reconstructed = qmse.dequantize(compressed_mse);

// Inner product quantizer: unbiased estimate of <x, y>
QuantizerProd qprod(dim, bitwidth, rng);
auto compressed_prod = qprod.quantize(x);
float ip = qprod.estimate_inner_product(y, compressed_prod);
```

For unit-norm vectors at bitwidth b:
| Metric | Upper bound | Lower bound |
|---|---|---|
| MSE | `sqrt(3*pi)/2 * 4^(-b)` | `4^(-b)` |
| Inner product error | `sqrt(3*pi^2) * norm(y)^2 / (d * 4^b)` | `norm(y)^2 / (d * 4^b)` |
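The bounds in the table are easy to evaluate numerically. A small helper for the MSE row (the function names are illustrative, not part of the library API):

```cpp
#include <cmath>

// Upper bound on MSE for a unit-norm vector at bitwidth b (from the table).
double mse_upper(int b) {
    const double pi = std::acos(-1.0);
    return std::sqrt(3.0 * pi) / 2.0 * std::pow(4.0, -b);
}

// Information-theoretic lower bound at bitwidth b.
double mse_lower(int b) {
    return std::pow(4.0, -b);
}
```

At b = 3 the upper bound evaluates to about 0.024, roughly 1.5x the lower bound 4^(-3) ≈ 0.0156; each additional bit divides both bounds by 4.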
- Bitwidths 1-4 only (precomputed codebooks). Extending to higher bitwidths requires solving the Lloyd-Max optimization for the Beta distribution at the desired precision.
- Stores the full d x d random rotation matrix (O(d^2) memory). For large dimensions, a randomized Hadamard transform would reduce this to O(d log d) time with only O(d) extra memory.
- No SIMD optimization yet. The quantization loop is straightforward to vectorize with AVX2/AVX-512.
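As a sketch of the Hadamard direction mentioned above, an in-place fast Walsh-Hadamard transform runs in O(d log d) time (d a power of two) and, combined with a vector of random signs, approximates a random rotation without storing a dense matrix. Illustrative code, not part of the library:

```cpp
#include <cmath>
#include <random>
#include <vector>

// In-place fast Walsh-Hadamard transform with orthonormal scaling:
// O(d log d) time, O(1) extra memory, preserves Euclidean norms.
void fwht(std::vector<float>& v) {
    size_t d = v.size();  // must be a power of two
    for (size_t h = 1; h < d; h *= 2) {
        for (size_t i = 0; i < d; i += 2 * h) {
            for (size_t j = i; j < i + h; ++j) {
                float a = v[j], b = v[j + h];
                v[j] = a + b;
                v[j + h] = a - b;
            }
        }
    }
    float scale = 1.0f / std::sqrt(static_cast<float>(d));
    for (float& x : v) x *= scale;
}

// Randomized rotation: random sign flips followed by the transform.
void randomized_hadamard(std::vector<float>& v, std::mt19937& rng) {
    std::bernoulli_distribution coin(0.5);
    for (float& x : v) if (coin(rng)) x = -x;
    fwht(v);
}
```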