@RoRoJ RoRoJ commented Nov 26, 2025

Add doc for querying rerank models

@RoRoJ RoRoJ added labels Nov 26, 2025:
- type: new content (New pages or categories)
- do not merge (PR that shouldn't be merged before a specific date, e.g. release)
- priority: low (Maintenance PRs that are not critical)
Co-authored-by: Benedikt Rollik <brollik@scaleway.com>

For example: a query to a fast (but imprecise) model may return a list of 100 documents. A specialized reranking model can then evaluate these documents more deeply, score each on how well it matches the query, and return only the 10 most relevant documents to the first model to be used in answering the query.

This approach takes advantage of the strengths of each model: one that is fast but not specialized, which can generate candidates quickly, and another than is slow but specialized, to refine these candidates. It can result in reduced context windows with therefore improved relevance, and faster overall query processing time.

Suggested change

Original:
This approach takes advantage of the strengths of each model: one that is fast but not specialized, which can generate candidates quickly, and another than is slow but specialized, to refine these candidates. It can result in reduced context windows with therefore improved relevance, and faster overall query processing time.

Suggested:
This approach takes advantage of the strengths of each model: one that is fast but not specialized, which can generate candidates quickly, and another that is slow but specialized, to refine these candidates. It can result in reduced context windows with therefore improved relevance, and faster overall query processing time.
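
The two-stage flow described in the excerpt under review (a fast model proposes candidates, a slower specialized model rescores them and keeps the best few) can be sketched roughly as follows. This is only an illustration under stated assumptions: `fast_retrieve`, `rerank`, and `toy_relevance` are hypothetical names, and the word-overlap scoring stands in for real models. None of it reflects the API documented in this PR.

```python
from typing import Callable, List, Tuple


def fast_retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Cheap first stage (hypothetical): rank documents by shared-word count."""
    q_terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def toy_relevance(query: str, doc: str) -> float:
    """Stand-in for a reranking model: Jaccard similarity of the word sets."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Second stage: score every candidate and keep the top_n best matches."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

In the example from the excerpt, `k` would be 100 and `top_n` would be 10; only the `top_n` survivors are passed on as context for answering the query.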

Comment on lines +33 to +34
- Query vector: `qv = embedding(query`)
- Document vector: `dv = embedding(document content)`

Suggested change

Original:
- Query vector: `qv = embedding(query`)
- Document vector: `dv = embedding(document content)`

Suggested:
- Query vector: `qv = embedding` (query)
- Document vector: `dv = embedding` (document content)

- Document vector: `dv = embedding(document content)`
- Relevance score: `score = (qv, dv)` (dot product)
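
As a rough illustration of the scoring scheme in the list above, the sketch below computes a query vector, document vectors, and their dot product. The `embedding` function here is a toy hashed bag-of-words stand-in, not a real embedding model, and the L2 normalization and the `DIM` value are assumptions made for this example. `score_documents` is likewise a hypothetical helper name.

```python
import hashlib
import math
from typing import Dict, List, Tuple

DIM = 64  # arbitrary vector size chosen for this sketch


def embedding(text: str) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def dot(a: List[float], b: List[float]) -> float:
    """Relevance score as the dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def score_documents(
    query: str, doc_vectors: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Score precomputed document vectors against one query, best first."""
    qv = embedding(query)
    scored = [(doc_id, dot(qv, dv)) for doc_id, dv in doc_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

Because `score_documents` takes vectors that were computed ahead of time, each new query needs only one call to `embedding` plus a dot product per document.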

Therefore, if you're performing repeated relevance scoring, you can streamline your workflow as follows:

Suggested change

Original:
Therefore, if you're performing repeated relevance scoring, you can streamline your workflow as follows:

Suggested:
Therefore, if you are performing repeated relevance scoring, you can streamline your workflow as follows:

@fpagny fpagny commented Nov 28, 2025

Ok for me 👍
The API is not yet ready to be deployed in production; we still need a quick technical fix.
I'll update when it's ready to be merged.

5 participants