6 changes: 3 additions & 3 deletions docs/articles/Vector-Indexes.md
@@ -8,7 +8,7 @@ Running AI applications depends on vectors, often called [embeddings](https://su

![What is a vector index](../assets/use_cases/vector_indexes/vector_index1.png)

- Vector indexing, by creating groups of matching elements, speeds up similarity search - which calculate vector closeness using metrics like Euclidean or Jacobian distance. (In small datasets where accuracy is more important than efficiency, you can use K-Nearest Neighbors to pinpoint your query's closest near neighbors. As datasets get bigger and efficiency becomes an issue, an [Approximate Nearest Neighbor](https://superlinked.com/vectorhub/building-blocks/vector-search/nearest-neighbor-algorithms) (ANN) approach will *very quickly* return accurate-enough results.)
+ Vector indexing, by creating groups of matching elements, speeds up similarity search - which calculates vector closeness using metrics like Euclidean or Jaccard distance. (In small datasets where accuracy is more important than efficiency, you can use K-Nearest Neighbors to pinpoint your query's nearest neighbors. As datasets get bigger and efficiency becomes an issue, an [Approximate Nearest Neighbor](https://superlinked.com/vectorhub/building-blocks/vector-search/nearest-neighbor-algorithms) (ANN) approach will *very quickly* return accurate-enough results.)

Vector indexes are crucial to efficient, relevant, and accurate search in various common applications, including Retrieval Augmented Generation ([RAG](https://superlinked.com/vectorhub/articles/advanced-retrieval-augmented-generation)), [semantic search in image databases](https://superlinked.com/vectorhub/articles/retrieval-from-image-text-modalities) (e.g., in smartphones), large text documents, advanced e-commerce websites, and so on.
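
To make the metric distinction concrete, here is a minimal sketch (ours, not code from the article): Euclidean distance suits dense embeddings, while Jaccard distance suits set-like or binary data.

```python
# A minimal sketch of the two distance metrics named above; values are
# invented for illustration.
import numpy as np

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

def jaccard_distance(a: set, b: set) -> float:
    return 1.0 - len(a & b) / len(a | b)  # 1 - |A intersect B| / |A union B|

print(euclidean_distance(np.array([1.0, 2.0]), np.array([4.0, 6.0])))  # 5.0
print(round(jaccard_distance({"wifi", "pool"}, {"wifi", "gym"}), 3))   # 0.667
```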

@@ -77,9 +77,9 @@ IVF_SQ makes sense when dealing with medium to large datasets where memory effic

### DiskANN

- Most ANN algorithms - including those above - are designed for in-memory computation. But when you're dealing with *big data*, in-memory computation can be a bottleneck. Disk-based ANN ([DiskANN](https://suhasjs.github.io/files/diskann_neurips19.pdf)) is built to leverage Solid-State Drives' (SSDs') large memory and high-speed capabilities. DiskANN indexes vectors using the Vamana algorithm, a graph-based indexing structure that minimizes the number of sequential disk reads required during, by creating a graph with a smaller search "diameter" - the max distance between any two nodes (representing vectors), measured as the least number of hops (edges) to get from one to the other. This makes the search process more efficient, especially for the kind of large-scale datasets that are stored on SSDs.
+ Most ANN algorithms - including those above - are designed for in-memory computation. But when you're dealing with *big data*, in-memory computation can be a bottleneck. Disk-based ANN ([DiskANN](https://suhasjs.github.io/files/diskann_neurips19.pdf)) is built to leverage Solid-State Drives' (SSDs') large storage capacity and high-speed access. DiskANN indexes vectors using the Vamana algorithm, a graph-based indexing structure that minimizes the number of sequential disk reads required, by creating a graph with a smaller search "diameter" - the max distance between any two nodes (representing vectors), measured as the least number of hops (edges) to get from one to the other. This makes the search process more efficient, especially for the kind of large-scale datasets that are stored on SSDs.

- By using a SSD to store and search its graph index, DiskANN can be cost-effective, scalable, and efficient.
+ By using an SSD to store and search its graph index, DiskANN can be cost-effective, scalable, and efficient.
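
The traversal at the heart of Vamana-style indexes can be sketched as a greedy walk; this is an illustrative sketch (names like `graph` and `entry` are ours, not the paper's implementation):

```python
# Greedy best-first walk over a proximity graph: keep hopping to a strictly
# closer neighbor until no neighbor improves on the current node.
import numpy as np

def greedy_search(graph: dict[int, list[int]], vectors: np.ndarray,
                  entry: int, query: np.ndarray) -> int:
    current = entry
    dist = np.linalg.norm(vectors[current] - query)
    improved = True
    while improved:
        improved = False
        for n in graph[current]:
            d = np.linalg.norm(vectors[n] - query)
            if d < dist:  # strictly closer: take the hop
                current, dist, improved = n, d, True
    return current
```

A smaller graph diameter bounds the number of such hops, which on an SSD-resident graph translates directly into fewer disk reads.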

### SPTAG-based Approximate Nearest Neighbor Search (SPANN)

6 changes: 3 additions & 3 deletions docs/articles/advanced_retrieval_augmented_generation.md
@@ -109,7 +109,7 @@ embed_model = HuggingFaceEmbedding(model_name="mixedbread-ai/mxbai-embed-large-v
Settings.embed_model = embed_model
```

- Specifically, we selected "mixedbread-ai/mxbai-embed-large-v1", a model that strikes a balance between retrieval accuracy and computational efficiency, according to recent performance evaluations in the Huggingface [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
+ Specifically, we selected "mixedbread-ai/mxbai-embed-large-v1", a model that strikes a balance between retrieval accuracy and computational efficiency, according to recent performance evaluations in the Hugging Face [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
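
For a quick sanity check that the model is wired in (a hypothetical probe, not part of the article's pipeline; `get_text_embedding` is LlamaIndex's embedding API):

```python
# Embed one string with the configured model and inspect the output size.
vec = embed_model.get_text_embedding("vector search with embeddings")
print(len(vec))  # 1024 dimensions for mxbai-embed-large-v1
```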

### Indexing

@@ -164,7 +164,7 @@ Another way to enhance retrieval accuracy is through [hybrid search](https://sup

This hybrid approach captures both the semantic richness of embeddings and the direct match precision of keyword search, leading to improved relevance in retrieved documents.
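
One common way to merge the keyword and vector result lists is reciprocal rank fusion (RRF); a minimal sketch, not necessarily the fusion method used in this pipeline:

```python
# RRF scores documents by their rank in each list, so items ranked highly by
# both keyword and vector search rise to the top. k=60 is the usual constant.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf([["doc2", "doc1", "doc3"], ["doc1", "doc4", "doc2"]]))  # doc1 first
```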

- So far we've seen how careful preretrieval (data preparation, chunking, embedding, indexing) and retrieval (hybrid search) can help improve RAG retrieval results. What about _after_ we've done our retrieval?
+ So far we've seen how careful pre-retrieval (data preparation, chunking, embedding, indexing) and retrieval (hybrid search) can help improve RAG retrieval results. What about _after_ we've done our retrieval?

## Post-retrieval

@@ -259,7 +259,7 @@ display(Markdown(f"<b>{response}</b>"))

"Based on the context provided, the dangers of hallucinations in the context of machine learning and natural language processing are that they can lead to inaccurate or incorrect results, particularly in customer support and content creation. These hallucinations, which are false pieces of information generated by a generative model, can have disastrous consequences in use cases where there's more at stake than simple internet searches. In short, machine hallucinations can be dangerous because they can lead to false information being presented as fact, which can have serious consequences in real-world applications."

- Our advanced RAG pipeline result appears to be relatively precise, avoid hallucinations, and effectively integrate retrieved context into generated output. Note: generation is not a fully deterministic process, so if you run this code yourself, you may receive slightly different output.
+ Our advanced RAG pipeline result appears to be relatively precise, avoids hallucinations, and effectively integrates retrieved context into generated output. Note: generation is not a fully deterministic process, so if you run this code yourself, you may receive slightly different output.

## Conclusion

12 changes: 6 additions & 6 deletions docs/articles/airbnb-search-benchmarking.md
@@ -2,7 +2,7 @@

## Introduction & Motivation

- Imagine you are searching for the ideal Airbnb for a weekend getaway. You open the website and adjust sliders and checkboxes but still encounter lists of options that nearly match your need but never are never truly what you are looking for. Although it is straightforward to specify a filter such as: "price less than two hundred dollars", rigid tags and thresholds for more complex search queries, make it a much more difficult task to figure out what the user is looking for.
+ Imagine you are searching for the ideal Airbnb for a weekend getaway. You open the website and adjust sliders and checkboxes but still encounter lists of options that nearly match your need but are never truly what you are looking for. Although it is straightforward to specify a filter such as "price less than two hundred dollars", rigid tags and thresholds make it much more difficult to capture what the user is looking for in more complex search queries.

Converting a mental image of a luxury apartment near the city's finest cafés or an affordable business-ready suite with good reviews into numerical filters often proves frustrating. Natural language is inherently unstructured and must be transformed into numerical representations to uncover user intent. At the same time, the rich structured data associated with each listing must also be encoded numerically to reveal relationships between location, comfort, price, and reviews.

@@ -49,7 +49,7 @@ def create_text_description(row):
"""Create a unified text description from listing attributes."""
text = f"{row['listing_name']} is a {row['accommodation_type']} "
text += f"For {row['max_guests']} guests. "
text += f"It costs ${row['price']} per night with a rating of {row['rating']} with {row['review_count']} nymber of reviews. "
text += f"It costs ${row['price']} per night with a rating of {row['rating']} with {row['review_count']} number of reviews. "
text += f"Description: {row['description']} "
text += f"Amenities include: {', '.join(row['amenities_list'])}"
return text
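
For illustration, applying the function to one made-up listing row (column names follow the snippet above; the values are invented):

```python
# Hypothetical listing row, shaped like the dataset's columns.
row = {
    "listing_name": "Sunny Canal Loft", "accommodation_type": "private loft.",
    "max_guests": 4, "price": 180, "rating": 4.8, "review_count": 212,
    "description": "Bright loft near the old town.",
    "amenities_list": ["wifi", "kitchen", "washer"],
}
print(create_text_description(row))
```
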
@@ -259,7 +259,7 @@ If neither of the two approaches produces satisfactory results on structured dat
<figcaption>Figure 10: Hybrid search results for "luxury places with good reviews"</figcaption>
</figure>

- The results indicate that hybrid search effectively balances semantic understanding with keyword precision. By combining vector search's ability to grasp concepts like "luxury" with BM25's strength in finding exact term matches, the hybrid approach delivers more comprehensive results. However, the fundamental limitations remain: the system still cannot reliably interpret numerical constraints (Figure 11) or make sophisticated judgments about what constitutes "good reviews" in terms of both rating quality and quantity. Additionaly, finding the optimal alpha value for the weighted combination requires careful tuning and may need adjustment based on specific use cases or datasets. Implementing hybrid search also requires maintaining two separate index structures and ensuring proper score normalization and fusion. This suggests that while hybrid search improves upon its component approaches, we need a more advanced solution to truly understand structured data attributes and their relationships.
+ The results indicate that hybrid search effectively balances semantic understanding with keyword precision. By combining vector search's ability to grasp concepts like "luxury" with BM25's strength in finding exact term matches, the hybrid approach delivers more comprehensive results. However, the fundamental limitations remain: the system still cannot reliably interpret numerical constraints (Figure 11) or make sophisticated judgments about what constitutes "good reviews" in terms of both rating quality and quantity. Additionally, finding the optimal alpha value for the weighted combination requires careful tuning and may need adjustment based on specific use cases or datasets. Implementing hybrid search also requires maintaining two separate index structures and ensuring proper score normalization and fusion. This suggests that while hybrid search improves upon its component approaches, we need a more advanced solution to truly understand structured data attributes and their relationships.
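
The alpha-weighted combination mentioned above can be sketched as follows (illustrative only; real implementations differ in how they normalize and combine scores):

```python
# Min-max normalize each score dict, then mix with weight alpha on the dense
# (vector) side; as noted above, alpha needs tuning per dataset.
def hybrid_scores(bm25: dict[str, float], dense: dict[str, float],
                  alpha: float = 0.5) -> dict[str, float]:
    def norm(s: dict[str, float]) -> dict[str, float]:
        lo, hi = min(s.values()), max(s.values())
        return {k: (v - lo) / (hi - lo or 1.0) for k, v in s.items()}
    b, d = norm(bm25), norm(dense)
    return {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
            for doc in set(b) | set(d)}
```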

<figure style="text-align: center; margin: 20px 0;">
<img src="../assets/use_cases/airbnb_search/hybrid_filter.png" alt="Hybrid Search Results" style="width: 100%;">
@@ -324,7 +324,7 @@ The cross-encoder reranking results demonstrate a notable improvement in result
Most impressively, for the numerical constraints query, the cross-encoder makes progress in understanding specific requirements. Despite the first result exceeding the price constraint (2632 > 2000), the reranking correctly identifies more listings matching the "5 guests" requirement and prioritizes them appropriately. This shows the effectiveness of using cross-encoders, since they re-calculate the similarity between the query and the documents after the initial retrieval based on vector search. In other words, the model can make finer distinctions when examining query-document pairs together rather than separately. However, the cross-encoder still does not perfectly understand all numerical constraints. Additionally, despite the improvements, cross-encoder reranking has significant computational drawbacks. It requires evaluating each query-document pair individually through a transformer-based model, which increases latency and resource requirements, especially as the candidate pool grows, and makes the search challenging to scale for large datasets or real-time applications with strict performance requirements. These takeaways suggest that while this approach represents a significant improvement, a more structured approach to handling multi-attribute data could yield better results.

<figure style="text-align: center; margin: 20px 0;">
- <img <img src="../assets/use_cases/airbnb_search/cross_filter.png" alt="Cross-Encoder Results for Numerical Query" style="width: 100%;">
+ <img src="../assets/use_cases/airbnb_search/cross_filter.png" alt="Cross-Encoder Results for Numerical Query" style="width: 100%;">
<figcaption>Figure 15: Cross-encoder results for numerical constraints query</figcaption>
</figure>
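
For reference, the reranking step itself can be sketched with the sentence-transformers `CrossEncoder` (the checkpoint below is a common public one, an assumption rather than the article's exact model):

```python
# Score each (query, listing) pair jointly with a cross-encoder, then keep the
# top-k. Scoring every pair is what makes this step costly as the pool grows.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], top_k: int = 5) -> list[str]:
    scores = reranker.predict([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```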

@@ -349,7 +349,7 @@ During offline indexing, each listing is passed through a BERT model to produce
<figcaption>Figure 17: ColBERT Multi-Vector Retrieval</figcaption>
</figure>
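
The late-interaction scoring at the core of this approach reduces to a MaxSim operation; here is a small standalone sketch (ours, not the `ColBERTSearch` class below):

```python
# MaxSim: each query token takes its best cosine match among the document's
# token embeddings, and the per-token maxima are summed into one score.
import numpy as np

def maxsim(query_toks: np.ndarray, doc_toks: np.ndarray) -> float:
    """query_toks: (q, d), doc_toks: (n, d); rows assumed L2-normalized."""
    sim = query_toks @ doc_toks.T        # (q, n) cosine similarity matrix
    return float(sim.max(axis=1).sum())  # best doc token per query token
```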

- Here is how we implment the multi-vecotr search by ColBERT:
+ Here is how we implement multi-vector search with ColBERT:

```python
class ColBERTSearch:
@@ -462,7 +462,7 @@ At query time, Superlinked uses a large language model to interpret the user’s

To ensure that non-negotiable constraints are respected, Superlinked first applies hard filters to eliminate listings that do not meet specific criteria, such as guest capacity or maximum price. Only the listings that pass these filters are considered in the final ranking stage. The system then performs a weighted nearest neighbors search, comparing the multi-attribute embeddings of these candidates against the weighted query representation to rank them by overall relevance. This combination of modality-aware encoding, constraint filtering, and weighted ranking allows Superlinked to produce accurate, context-aware results that reflect both the structure of the underlying data and the nuanced preferences of the user.
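
Conceptually, that two-stage flow looks like this (an illustrative sketch with made-up field names and helpers, not Superlinked's actual API):

```python
# Stage 1: hard filters drop listings that violate non-negotiable constraints.
# Stage 2: weighted similarity across per-attribute embeddings ranks the rest.
import numpy as np

def search(listings: list[dict], query_vecs: dict[str, np.ndarray],
           weights: dict[str, float], max_price: float, min_guests: int,
           k: int = 5) -> list[dict]:
    pool = [l for l in listings
            if l["price"] <= max_price and l["max_guests"] >= min_guests]
    def score(l: dict) -> float:
        return sum(w * float(l["embeddings"][name] @ query_vecs[name])
                   for name, w in weights.items())
    return sorted(pool, key=score, reverse=True)[:k]
```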

- Here is how we implment the Superlinked for our Airbnb search:
+ Here is how we implement Superlinked for our Airbnb search:

We first need to define a schema that captures the structure of our dataset. The schema outlines both the fields we'll use for embedding and those we'll use for filtering:
