feat(mcp): expose skipRerank and candidateLimit in query tool by DmitryPogodaev · Pull Request #435 · tobi/qmd

DmitryPogodaev · 2026-03-18T14:02:37Z

Problem

On CPU-only servers (no GPU), the LLM reranker model (Qwen3-Reranker-0.6B) takes ~2 seconds per document to score. A typical query with 20 candidates takes 30-40 seconds — far exceeding the 1-2s timeouts used by automated RAG hooks.
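The arithmetic behind those figures can be sketched as below. This is only a back-of-the-envelope model: the function names are hypothetical, and the 2 s per document default is the ballpark reported above for one CPU-only machine, not a property of the reranker.

```typescript
// Rough latency model for LLM reranking on CPU.
// perDocSeconds defaults to the ~2 s/document figure quoted in this PR;
// treat it as a measured ballpark, not a constant.
function estimateRerankSeconds(candidates: number, perDocSeconds = 2): number {
  return candidates * perDocSeconds;
}

// Largest candidate count that fits a time budget (0 if not even one fits).
function maxCandidatesWithin(budgetSeconds: number, perDocSeconds = 2): number {
  return Math.max(0, Math.floor(budgetSeconds / perDocSeconds));
}

console.log(estimateRerankSeconds(20)); // 40
console.log(maxCandidatesWithin(2));    // 1
console.log(maxCandidatesWithin(1));    // 0
```

Under this model a 1-2s budget leaves room for at most one reranked document, so lowering the candidate count alone cannot meet such timeouts.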

The internal structuredSearch already supports skipRerank and candidateLimit, but neither is exposed through the MCP query tool.

Changes

- skipRerank (boolean, optional): added to the MCP query tool schema. When true, returns results scored by RRF fusion only, with no LLM rerank. Queries complete in 30-50ms instead of 30-40s.
- candidateLimit: was declared in the MCP schema but never forwarded to store.search(). Now passed through.
- Added candidateLimit to the SearchOptions interface.
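The forwarding fix can be illustrated with a minimal sketch. Only the option names skipRerank and candidateLimit and the SearchOptions interface name come from this PR; the QueryToolArgs and Store shapes, the handleQuery function, and the stub store are hypothetical stand-ins, not QMD's actual code.

```typescript
// Hypothetical shapes for illustration; only the option names are from the PR.
interface SearchOptions {
  limit?: number;
  skipRerank?: boolean;    // true: return RRF-fused scores, skip the LLM reranker
  candidateLimit?: number; // assumed to cap the candidates considered for ranking
}

interface QueryToolArgs {
  query: string;
  limit?: number;
  skipRerank?: boolean;
  candidateLimit?: number;
}

interface Store {
  search(query: string, options: SearchOptions): string[];
}

// Forward every declared tool argument to the store. The bug described above
// is the case where an argument exists in the schema but is dropped here.
function handleQuery(store: Store, args: QueryToolArgs): string[] {
  const { query, limit, skipRerank, candidateLimit } = args;
  return store.search(query, { limit, skipRerank, candidateLimit });
}

// Stub store that records the options it received.
let seen: SearchOptions = {};
const stubStore: Store = {
  search(_query, options) {
    seen = options;
    return [];
  },
};

handleQuery(stubStore, { query: "rerank latency", skipRerank: true, candidateLimit: 10 });
console.log(seen.skipRerank, seen.candidateLimit); // true 10
```

Destructuring the arguments once and passing them as a single options object makes a dropped parameter easier to spot in review.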

Use case

Automated RAG hooks (e.g. Telegram bot preprocessing) on VPS without GPU, where the reranker model is prohibitively slow. skipRerank: true gives fast approximate results; the LLM reranker remains available for interactive / CLI use.
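A hook with a hard deadline might wrap its call roughly as follows. This is a sketch under stated assumptions: withDeadline and fakeQuery are hypothetical names, and fakeQuery is a stub with made-up delays standing in for a real MCP client call to the query tool.

```typescript
// Reject if the wrapped promise does not settle within deadlineMs.
function withDeadline<T>(work: Promise<T>, deadlineMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`query exceeded ${deadlineMs} ms`)),
      deadlineMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Stub for the query tool: fast when skipRerank is true, slow otherwise.
// The 5 ms and 200 ms delays are illustrative, not measurements.
function fakeQuery(args: { query: string; skipRerank: boolean }): Promise<string[]> {
  const delayMs = args.skipRerank ? 5 : 200;
  return new Promise((resolve) =>
    setTimeout(() => resolve([`hit for ${args.query}`]), delayMs),
  );
}

// The RRF-only path settles well inside a 50 ms deadline in this stub.
withDeadline(fakeQuery({ query: "deploy notes", skipRerank: true }), 50)
  .then((hits) => console.log(hits))
  .catch((err) => console.error(String(err)));
```

With skipRerank set to false the same stub call would be rejected by the deadline, which mirrors the timeout behaviour described in the Problem section.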

Performance

| State | With rerank | skipRerank=true |
| --- | --- | --- |
| Cold (model load) | 30-40s | 400-500ms |
| Warm | 30-40s (rerank dominates) | 30-50ms |

Tested on AMD EPYC 8-core (no GPU), QMD 2.0.1, node-llama-cpp 3.18.1.

On CPU-only servers, LLM reranking (0.6B model) takes ~2s per document, making the query tool unusable with timeouts under 30s.

This commit:
- Adds `skipRerank` boolean parameter to the MCP `query` tool schema. When true, returns results scored by RRF fusion only (no LLM rerank).
- Passes `candidateLimit` through to structuredSearch (was declared in schema but never forwarded to the store).

Use case: automated RAG hooks with 1-2s timeouts on VPS without GPU. With skipRerank=true, queries complete in 30-50ms instead of 30-40s.
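For readers unfamiliar with the "RRF fusion only" path, here is a sketch of textbook reciprocal rank fusion. It is the standard formulation, not necessarily QMD's exact scoring: the rrfFuse name, the k = 60 default, and the sample file names are assumptions made for illustration.

```typescript
// Textbook reciprocal rank fusion: each ranked list contributes
// 1 / (k + rank) per document, with rank starting at 1.
// k = 60 is a commonly used default, assumed here.
function rrfFuse(rankedLists: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((docId, index) => {
      const rank = index + 1;
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Two hypothetical result lists, e.g. one keyword-based and one vector-based.
const keywordHits = ["a.md", "b.md", "c.md"];
const vectorHits = ["b.md", "d.md", "a.md"];

const fused = rrfFuse([keywordHits, vectorHits]);
console.log(fused.map(([id]) => id)); // order: b.md, a.md, d.md, c.md
```

Fusion of this kind needs only the ranks from each list and no model inference, which is consistent with the millisecond-level timings reported when reranking is skipped.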

fxstein · 2026-03-18T14:46:07Z

Was on my todo list as well. Should now make the mcp server pretty complete.

DmitryPogodaev wants to merge 1 commit into tobi:main from DmitryPogodaev:feat/skip-rerank-mcp
