Alusus language bindings for the FAISS library - A library for efficient similarity search and clustering of dense vectors.
This library provides Alusus bindings to FAISS, enabling high-performance vector similarity search and clustering operations in the Alusus programming language.
import "Apm";
Apm.importFile("Alusus/Faiss");
use Faiss;
import "Srl/Console";
import "Srl/Array";
import "Apm";
Apm.importFile("Alusus/Faiss");
use Srl;
use Faiss;
// Create a flat index with 4-dimensional vectors
def index: ref[Index];
Index.new(index, 4, "Flat", MetricType.METRIC_INNER_PRODUCT);
// Add vectors to the index
def xb: Array[Float]({1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0});
index.add(2, xb.buf); // 2 vectors
// Search for nearest neighbors
def xq: Array[Float]({1.5, 2.5, 3.5, 4.5});
def labels: array[Int[64], 3];
def distances: array[Float, 3];
index.search(1, xq.buf, 3, distances, labels); // Request 3 nearest neighbors; only 2 vectors are indexed, so the third label comes back as -1
// Clean up
Index.free(index);
See complete examples in the Examples/ directory.
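To make the search call in the usage snippet concrete, here is a minimal Python sketch of what a flat inner-product search computes on the same data. It is not the binding's code: `flat_ip_search` and `inner_product` are hypothetical helper names, and the padding score for empty slots is a placeholder chosen for this sketch. Only the `-1` padding label follows documented FAISS behavior.

```python
def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def flat_ip_search(database, query, k):
    """Exhaustive search: score every stored vector, keep the k largest scores.

    Slots beyond the number of stored vectors get label -1.
    """
    scored = sorted(
        ((inner_product(vec, query), label) for label, vec in enumerate(database)),
        key=lambda pair: -pair[0],  # larger inner product = more similar
    )
    top = scored[:k]
    missing = k - len(top)
    labels = [label for _, label in top] + [-1] * missing
    # Placeholder score for empty slots (an assumption of this sketch).
    scores = [score for score, _ in top] + [float("-inf")] * missing
    return scores, labels


# Same data as the usage snippet: two 4-dimensional vectors, one query, k = 3.
xb = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
xq = [1.5, 2.5, 3.5, 4.5]
scores, labels = flat_ip_search(xb, xq, 3)
print(labels)  # [1, 0, -1]
print(scores[:2])  # [47.0, 35.0]
```

The second stored vector ranks first because inner-product search treats a larger score as a closer match, which is the opposite ordering from a distance metric such as L2.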
This library wraps the FAISS C API. For detailed documentation of concepts, algorithms, and best practices, please refer to the official FAISS documentation:
- Main Documentation: https://github.com/facebookresearch/faiss/wiki
- C API Reference: https://github.com/facebookresearch/faiss/blob/main/c_api/
- Getting Started Tutorial: https://github.com/facebookresearch/faiss/wiki/Getting-started
- Index Selection Guide: https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index
Index is the main index class for similarity search. See the FAISS C API docs.
Factory method:
- Index.new(obj: ref[ref[Index]], d: Int, description: CharsPtr, metric: Int): Int - Create index using factory string
Key methods:
- train(n: Int[64], x: ref[array[Float]]): Int - Train the index on data
- add(n: Int[64], x: ref[array[Float]]): Int - Add vectors to index
- search(n: Int[64], x: ref[array[Float]], k: Int[64], distances: ref[array[Float]], labels: ref[array[Int[64]]]): Int - Search for k nearest neighbors
- rangeSearch(n: Int[64], x: ref[array[Float]], radius: Float, result: ref[RangeSearchResult]): Int - Range search
- reset(): Int - Remove all vectors from index
- removeIds(sel: ref[IdSelector], nRemoved: ref[ArchWord]): Int - Remove specific vectors
Properties:
- d: Int[64] - Vector dimension
- nTotal: Int[64] - Total number of indexed vectors
- isTrained: Int - Whether index is trained (0 or 1)
- metricType: MetricType - Distance metric being used
- verbose: Int - Verbosity level
Cleanup:
- Index.free(obj: ref[Index]) - Free index memory
IndexFlat is a brute-force index performing exact search. See the FAISS guide.
Creation:
- IndexFlat.new(obj: ref[ref[IndexFlat]]): Int
- IndexFlat.new(obj: ref[ref[IndexFlat]], d: Int[64], metric: MetricType): Int
Additional methods:
- getXb(outXb: ref[ref[array[Float]]], outSize: ref[ArchWord]) - Get stored vectors
- computeDistanceSubset(n: Int[64], x: ref[array[Float]], k: Int[64], outDistances: ref[array[Float]], labels: ref[array[Int[64]]]): Int - Compute distances to subset
Inherits all Index methods.
IndexFlatIp is a flat index specialized for the inner product metric. See the FAISS docs.
Creation:
- IndexFlatIp.new(obj: ref[ref[IndexFlatIp]]): Int
- IndexFlatIp.new(obj: ref[ref[IndexFlatIp]], d: Int[64]): Int
IndexFlatL2 is a flat index specialized for L2 (Euclidean) distance. See the FAISS docs.
Creation:
- IndexFlatL2.new(obj: ref[ref[IndexFlatL2]]): Int
- IndexFlatL2.new(obj: ref[ref[IndexFlatL2]], d: Int[64]): Int
IndexIvf is an inverted file index for faster approximate search. See the FAISS guide.
Additional properties:
- nList: ArchWord - Number of inverted lists (clusters)
- nProbe: ArchWord - Number of clusters to visit during search (tunable)
- quantizer: ref[Index] - Quantizer index
- ownFields: Int - Whether index owns its fields
Additional methods:
- mergeFrom(other: ref[IndexIvf], addId: Int[64]): Int - Merge another IVF index
- copySubsetTo(other: ref[IndexIvf], subsetType: Int, a1: Int[64], a2: Int[64]): Int - Copy subset of vectors
- getListSize(listNo: ArchWord): ArchWord - Get size of inverted list
- makeDirectMap(newMaintainDirectMap: Int): Int - Create direct map for reconstruction
- imbalanceFactor: Float[64] - Get cluster imbalance factor
- printStats() - Print index statistics
Index for binary (Hamming) vectors. See the FAISS guide.
Similar to Index but operates on binary vectors (Word[8] arrays instead of Float arrays).
ParameterSpace manages index parameters for grid search and tuning. See the FAISS C API.
Methods:
- new(parameterSpace: ref[ref[ParameterSpace]]): Int
- setIndexParameter(index: ref[Index], paramName: CharsPtr, val: Float[64]): Int - Set single parameter
- setIndexParameters(index: ref[Index], params: CharsPtr): Int - Set multiple parameters
- addRange(name: CharsPtr, outRange: ref[ref[ParameterRange]]): Int - Add parameter range
SearchParameters holds runtime search parameters. See the FAISS C API.
Methods:
- new(obj: ref[ref[SearchParameters]], sel: ref[IdSelector]): Int
- nProbe: Int - Number of clusters to probe (for IVF indexes)
SearchParametersIvf provides extended search parameters for IVF indexes.
Methods:
- new(obj: ref[ref[SearchParametersIvf]]): Int
- new(obj: ref[ref[SearchParametersIvf]], sel: ref[IdSelector], nprobe: ArchWord, maxCodes: ArchWord): Int
Properties:
- sel: ref[IdSelector] - ID selector
- nProbe: ArchWord - Number of clusters to probe
- maxCodes: ArchWord - Maximum codes to scan
Clustering is a k-means clustering implementation. See the FAISS C API.
Creation:
- new(out: ref[ref[Clustering]], d: Int, k: Int): Int - Create with dimension and k clusters
- new(out: ref[ref[Clustering]], d: Int, k: Int, params: ptr[ClusteringParameters]): Int - Create with parameters
Methods:
- train(n: Int[64], x: ref[Float], index: ref[Index]): Int - Run k-means
- getCentroids(centroids: ref[ref[array[Float]]], size: ref[ArchWord]) - Get cluster centroids
- getIterationStats(stats_out: ref[ref[ClusteringIterationStats]], size: ref[ArchWord]) - Get iteration statistics
Properties:
- niter: Int - Number of iterations
- nredo: Int - Number of k-means restarts
- k: ArchWord - Number of clusters
- d: ArchWord - Vector dimension
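For readers new to k-means, the following Python sketch shows the two steps that each of the `niter` iterations repeats: assign every point to its nearest centroid, then move each centroid to the mean of its members. It is a textbook version of Lloyd's algorithm, not FAISS's implementation. The deterministic start (the first k points) and the names `kmeans` and `squared_l2` are assumptions made for illustration, and restarts (`nredo`) are not modeled.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, niter=10):
    """Textbook Lloyd's algorithm with a deterministic start (first k points)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(niter):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_l2(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


# Two well-separated groups of 2-dimensional points.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, assignment = kmeans(points, k=2, niter=5)
print(centroids)  # [[0.0, 0.5], [10.0, 10.5]]
print(assignment)  # [0, 0, 1, 1]
```

Even with a poor start (both initial centroids come from the first group), the centroids settle on the two groups within a few iterations on this data.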
IdSelector selects subsets of vectors by ID. See the FAISS C API.
Variants:
- IdSelectorBatch - Select specific IDs from a list
- IdSelectorRange - Select IDs in a range
- IdSelectorBitmap - Select using a bitmap
- IdSelectorNot - Invert a selector
- IdSelectorAnd - Combine selectors with AND
- IdSelectorOr - Combine selectors with OR
- IdSelectorXor - Combine selectors with XOR
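The selector variants compose like boolean predicates over vector IDs. The Python sketch below models that idea with plain functions. All `select_*` names are hypothetical and none of this is the binding's API. It assumes the half-open range convention `[imin, imax)` and the bit layout (bit `i & 7` of byte `i >> 3`) used by the corresponding FAISS selectors.

```python
def select_batch(ids):
    """Accept only the listed IDs."""
    wanted = set(ids)
    return lambda i: i in wanted


def select_range(imin, imax):
    """Accept IDs in the half-open interval [imin, imax)."""
    return lambda i: imin <= i < imax


def select_bitmap(bitmap):
    """Accept ID i when bit (i & 7) of byte (i >> 3) is set."""
    return lambda i: (i >> 3) < len(bitmap) and (bitmap[i >> 3] >> (i & 7)) & 1 == 1


def select_not(sel):
    return lambda i: not sel(i)


def select_and(a, b):
    return lambda i: a(i) and b(i)


def select_or(a, b):
    return lambda i: a(i) or b(i)


def select_xor(a, b):
    return lambda i: a(i) != b(i)


# IDs 0..9, except 3 and 4.
sel = select_and(select_range(0, 10), select_not(select_batch([3, 4])))
kept = [i for i in range(12) if sel(i)]
print(kept)  # [0, 1, 2, 5, 6, 7, 8, 9]
```

A composed selector like this one is the kind of object that `removeIds` or the `sel` field of the search parameters would receive in the binding.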
RangeSearchResult holds the results of range search queries. See the FAISS C API.
Methods:
- new(obj: ref[ref[RangeSearchResult]], nq: Int[64]): Int
- doAllocation(): Int - Allocate result buffers
- bufferSize(): ArchWord - Get buffer size
- getLims(outLims: ref[ref[array[ArchWord]]]) - Get result limits array
- getLabels(outLabels: ref[ref[array[Int[64]]]], outDistances: ref[ref[ref[Float]]]) - Get labels and distances
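Range search returns a variable number of hits per query, so the results are stored flat and indexed by the limits array: in FAISS, the hits for query `q` occupy positions `lims[q]` up to (but not including) `lims[q + 1]` of the labels and distances buffers. The Python sketch below decodes that layout. `split_range_results` is a hypothetical helper and the sample values are invented for illustration.

```python
def split_range_results(lims, labels, distances):
    """Group flat range-search output into one list of (label, distance) per query.

    lims has one more entry than there are queries; query q owns the slice
    lims[q]:lims[q + 1] of both flat buffers.
    """
    per_query = []
    for q in range(len(lims) - 1):
        start, end = lims[q], lims[q + 1]
        per_query.append(list(zip(labels[start:end], distances[start:end])))
    return per_query


# Invented sample: 3 queries with 2, 0 and 3 hits respectively.
lims = [0, 2, 2, 5]
labels = [7, 3, 1, 4, 9]
distances = [0.5, 0.9, 0.1, 0.2, 0.3]
results = split_range_results(lims, labels, distances)
print(results[0])  # [(7, 0.5), (3, 0.9)]
print(results[1])  # []
```

A query with no hits within the radius shows up as two equal consecutive entries in the limits array, as with the second query here.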
Computes distances to vectors. See the FAISS C API.
Methods:
- setQuery(x: ref[array[Float]]): Int - Set query vector
- vectorToQueryDis(i: Int[64], qd: ref[array[Float]]): Int - Distance to query
- symmetricDis(i: Int[64], j: Int[64], vd: ref[array[Float]]): Int - Symmetric distance
MetricType enumerates the distance metrics. See the FAISS docs.
- METRIC_INNER_PRODUCT: 0 - Inner product (maximum similarity)
- METRIC_L2: 1 - Euclidean distance (L2 norm)
- METRIC_L1: 2 - Manhattan distance (L1 norm)
- METRIC_LINF: 3 - Infinity norm (Chebyshev distance)
- METRIC_LP: 4 - Lp norm
- METRIC_CANBERRA: 20 - Canberra distance
- METRIC_BRAY_CURTIS: 21 - Bray-Curtis dissimilarity
- METRIC_JENSEN_SHANNON: 22 - Jensen-Shannon divergence
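As a quick reference for the first four metrics, the Python sketch below computes each one for a pair of small vectors. The function names are hypothetical. Note that FAISS reports squared Euclidean distances for METRIC_L2, which is why the sketch omits the square root.

```python
def inner_product(a, b):
    """METRIC_INNER_PRODUCT: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a, b):
    """METRIC_L2 as FAISS reports it: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l1(a, b):
    """METRIC_L1: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def linf(a, b):
    """METRIC_LINF: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 3.0, 4.0, 5.0]
print(inner_product(a, b))  # 40.0
print(l2_squared(a, b))  # 4.0
print(l1(a, b))  # 4.0
print(linf(a, b))  # 1.0
```

Inner product is a similarity (results are ranked from largest to smallest), while the other three are distances (ranked from smallest to largest).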
Return codes from C API functions.
- OK: 0 - Success
- UNKNOWN_EXCEPT: -1 - Unknown exception
- FAISS_EXCEPT: -2 - FAISS exception
- STD_EXCEPT: -4 - Standard library exception
- getLastError(): CharsPtr - Get last error message
- kmeansClustering(d: ArchWord, n: ArchWord, k: ArchWord, x: ref[array[Float]], centroids: ref[array[Float]], q_error: ref[Float]): Int - Standalone k-means
To enable GPU acceleration, set the environment variable before running:
export FAISS_USE_GPU=1
The library will automatically load GPU-enabled binaries when available. See the FAISS GPU documentation for details.
The Index.new factory method accepts strings to create different index types:
- "Flat" - Exact search (brute force)
- "IVFn,Flat" - IVF with n centroids, flat encoding
- "IVFn,PQm" - IVF with n centroids, PQ with m subquantizers
- "HNSW32" - Hierarchical navigable small world with 32 neighbors
- "IVFn,HNSW32" - Combined IVF and HNSW
See the index factory documentation for all available options and combinations.
Complete working examples are in the Examples/ directory:
- example.alusus - Basic flat index with inner product search
- example2.alusus - IVF index with parameter tuning
- Index Selection:
  - Use IndexFlat for exact search on datasets <1M vectors
  - Use IndexIvf for approximate search on larger datasets
  - See the index selection guide
- Training: IVF and other approximate indexes require training before adding vectors
- nprobe Parameter: For IVF indexes, a higher nprobe improves accuracy but slows search
- GPU Acceleration: Enable GPU for operations on >10M vectors
- Memory: Flat indexes store all vectors in memory; use compression for large datasets
See FAISS performance guidelines for detailed recommendations.
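The nprobe trade-off in the list above can be seen in a toy inverted-file index. The Python sketch below is a simplified model, not FAISS's implementation: the centroids are fixed by hand instead of being trained, and `build_ivf` and `ivf_search` are hypothetical names. Each vector is filed under its nearest centroid, and a query scans only the `nprobe` lists whose centroids are closest to it.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """File each vector ID under the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe):
    """Return the closest vector ID among the nprobe nearest lists (-1 if none)."""
    order = sorted(range(len(centroids)), key=lambda c: squared_l2(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    return min(candidates, key=lambda vid: squared_l2(query, vectors[vid]), default=-1)


# Hand-picked centroids (an assumption; a real index learns them in training).
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = [[0.0, 0.0], [5.5, 0.0], [10.0, 0.0]]
lists = build_ivf(vectors, centroids)  # [[0], [1, 2]]

query = [4.0, 0.0]
print(ivf_search(query, vectors, centroids, lists, nprobe=1))  # 0 (approximate)
print(ivf_search(query, vectors, centroids, lists, nprobe=2))  # 1 (exact)
```

With `nprobe=1` the search misses the true nearest neighbor because it sits just across the cluster boundary. Probing both lists finds it, at the cost of scanning every vector.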
- FAISS GitHub: https://github.com/facebookresearch/faiss
- FAISS Wiki: https://github.com/facebookresearch/faiss/wiki
- Research Paper: Billion-scale similarity search with GPUs
- Alusus Language: https://alusus.org
This binding follows the FAISS license (MIT). See the license file for details.