feat: optional temporal relevance boost in search scoring #367
Description
Use Case
QMD is increasingly used for conversational memory — session transcripts, journals, meeting notes — where document age is a meaningful relevance signal. Currently, search scoring is purely text-based: a document from yesterday and one from three months ago rank identically if their BM25/vector/reranker scores match.
For knowledge bases and static docs, this is correct. But for temporal corpora (journals, meeting notes, chat transcripts), recency is often the tiebreaker a user expects. "What did we discuss about the API?" should naturally favor last week's meeting over January's.
Proposal
Add an optional recency parameter to the search API that applies a time-decay multiplier to final scores:
// SDK
const results = await store.search({
  query: "API design",
  recency: { halfLife: 30, weight: 0.15 }
})
// MCP query tool
// recency_days: 30 (shorthand, applies default weight)
Scoring: final_score = text_score * (1 - weight + weight * decay(age, halfLife))
Where decay is exponential: 2^(-age_days / halfLife). This means:
- A document from today scores text_score * 1.0
- At half-life (e.g., 30 days), it scores text_score * (1 - weight/2), a gentle 7.5% reduction with default weight
- Very old documents approach text_score * (1 - weight), an asymptotic 15% reduction, never zeroed out
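To make the numbers above concrete, here is a minimal sketch of the multiplier. RecencyOptions is the interface name proposed in this issue; the helper names decay and recencyMultiplier are hypothetical and not existing QMD functions.

```typescript
// Sketch only: field names follow the SDK example in this issue.
interface RecencyOptions {
  halfLife: number; // days until the decay term halves
  weight: number;   // maximum fraction of the score that age can remove (0..1)
}

// Exponential decay: 2^(-age_days / halfLife). Negative ages are clamped to 0
// (an assumption, so future-dated documents are not boosted above 1.0).
function decay(ageDays: number, halfLife: number): number {
  return Math.pow(2, -Math.max(0, ageDays) / halfLife);
}

// Multiplier applied to text_score: 1 - weight + weight * decay(age, halfLife).
function recencyMultiplier(ageDays: number, opts: RecencyOptions): number {
  return 1 - opts.weight + opts.weight * decay(ageDays, opts.halfLife);
}

const opts: RecencyOptions = { halfLife: 30, weight: 0.15 };
console.log(recencyMultiplier(0, opts));    // about 1.0 (today)
console.log(recencyMultiplier(30, opts));   // about 0.925 (at half-life)
console.log(recencyMultiplier(3650, opts)); // approaches 0.85 (very old)
```

The multiplier never drops below 1 - weight, which is what keeps old documents reachable.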
Key design points:
- Opt-in. No recency parameter = current behavior, zero change to existing users
- Boost, not filter. Old documents still surface — they just need stronger text relevance to beat recent ones
- Configurable per query. Different corpora need different half-lives (journals: 14d, docs: 180d, meetings: 30d)
- Applied after RRF + reranker blend. Doesn't interfere with the existing pipeline — it's a final multiplier
- Uses the existing modified_at timestamp from the documents table, no schema changes
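A rough sketch of how these points could fit together as a final pass over already-blended results. The ScoredDoc shape and the applyRecency function are assumptions made for illustration; they are not taken from store.ts.

```typescript
// Hypothetical result shape: `score` is the blended RRF + reranker score,
// `modifiedAt` comes from the documents table's modified_at column.
interface ScoredDoc {
  id: string;
  score: number;
  modifiedAt: Date;
}

interface RecencyOptions {
  halfLife: number; // days
  weight: number;   // 0..1
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Final multiplier stage. Without `recency`, results pass through untouched (opt-in).
function applyRecency(
  results: ScoredDoc[],
  recency?: RecencyOptions,
  now: Date = new Date(),
): ScoredDoc[] {
  if (!recency) return results;
  return results
    .map((doc) => {
      const ageDays = Math.max(0, (now.getTime() - doc.modifiedAt.getTime()) / MS_PER_DAY);
      const multiplier =
        1 - recency.weight + recency.weight * Math.pow(2, -ageDays / recency.halfLife);
      return { ...doc, score: doc.score * multiplier };
    })
    .sort((a, b) => b.score - a.score); // re-rank: a boost, not a filter
}
```

Every input document is still returned; only the order and the scores change.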
Why Not Consumer-Side?
The caller could re-sort results by date, but that throws away the text relevance signal entirely. The value is in blending temporal and textual relevance — "this slightly less relevant document from yesterday beats that slightly more relevant one from two months ago." That blending belongs in the scoring pipeline.
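A small worked example of the difference, using made-up scores and ages with halfLife 30 and weight 0.15. The document names and values are illustrative only.

```typescript
// Illustrative data: text score from the existing pipeline plus document age in days.
interface Hit {
  id: string;
  textScore: number;
  ageDays: number;
}

const hits: Hit[] = [
  { id: "A", textScore: 0.80, ageDays: 1 },  // slightly less relevant, from yesterday
  { id: "B", textScore: 0.82, ageDays: 60 }, // slightly more relevant, two months old
  { id: "C", textScore: 0.30, ageDays: 0 },  // barely relevant, from today
];

// Consumer-side re-sort by date: newest first, text relevance ignored.
const byDate = [...hits].sort((a, b) => a.ageDays - b.ageDays).map((h) => h.id);

// Blended score: text_score * (1 - weight + weight * 2^(-age_days / halfLife)).
const halfLife = 30;
const weight = 0.15;
const blended = (h: Hit): number =>
  h.textScore * (1 - weight + weight * Math.pow(2, -h.ageDays / halfLife));
const byBlend = [...hits].sort((a, b) => blended(b) - blended(a)).map((h) => h.id);

console.log(byDate);  // C ranks first despite its weak text match
console.log(byBlend); // A edges out B, and C stays last
```

Sorting by date promotes the barely relevant document C, while the blended score only lets recency decide between the close matches A and B.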
Implementation
~100-150 lines touching store.ts (score multiplier after final blend), types.ts (RecencyOptions interface), and the MCP/CLI layers to expose the parameter. Happy to implement and PR if the approach looks right.