Skip to content

feat: optional temporal relevance boost in search scoring #367

@sudo25o1

Description

@sudo25o1

Use Case

QMD is increasingly used for conversational memory — session transcripts, journals, meeting notes — where document age is a meaningful relevance signal. Currently, search scoring is purely text-based: a document from yesterday and one from three months ago rank identically if their BM25/vector/reranker scores match.

For knowledge bases and static docs, this is correct. But for temporal corpora (journals, meeting notes, chat transcripts), recency is often the tiebreaker a user expects. "What did we discuss about the API?" should naturally favor last week's meeting over January's.

Proposal

Add an optional recency parameter to the search API that applies a time-decay multiplier to final scores:

// SDK
const results = await store.search({
  query: "API design",
  recency: { halfLife: 30, weight: 0.15 }
})

// MCP query tool
// recency_days: 30 (shorthand — applies default weight)

Scoring: final_score = text_score * (1 - weight + weight * decay(age, halfLife))

Where decay is exponential: 2^(-age_days / halfLife). This means:

  • A document from today scores text_score * 1.0
  • At half-life (e.g., 30 days), it scores text_score * (1 - weight/2) — a gentle 7.5% reduction with default weight
  • Very old documents approach text_score * (1 - weight) — an asymptotic 15% reduction, never zeroed out

Key design points:

  • Opt-in. No recency parameter = current behavior, zero change to existing users
  • Boost, not filter. Old documents still surface — they just need stronger text relevance to beat recent ones
  • Configurable per query. Different corpora need different half-lives (journals: 14d, docs: 180d, meetings: 30d)
  • Applied after RRF + reranker blend. Doesn't interfere with the existing pipeline — it's a final multiplier
  • Uses existing modified_at timestamp from the documents table, no schema changes

Why Not Consumer-Side?

The caller could re-sort results by date, but that throws away the text relevance signal entirely. The value is in blending temporal and textual relevance — "this slightly less relevant document from yesterday beats that slightly more relevant one from two months ago." That blending belongs in the scoring pipeline.

Implementation

~100-150 lines touching store.ts (score multiplier after final blend), types.ts (RecencyOptions interface), and the MCP/CLI layers to expose the parameter. Happy to implement and PR if the approach looks right.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions