feat: epistemic_type ranking + cold-start boost (#70, #71) by zonk1024 · Pull Request #580 · ourochronos/valence

zonk1024 · 2026-02-28T18:09:15Z

Summary

Implements two related improvements to multi_signal_rank in ranking.py.

#70 — Epistemic Type Awareness

- Added detect_query_intent(query: str) -> str that classifies queries as "procedural", "episodic", or "general" based on prefix/keyword matching and date patterns
- Added query_intent: str | None = None parameter to multi_signal_rank
- Applies 1.3× multiplier when article epistemic_type matches query intent
- Applies 0.85× penalty when there is a procedural↔episodic conflict
- "general" intent or "semantic" epistemic_type → no adjustment
- Wired detect_query_intent through retrieval.py retrieve path
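
The rules listed for #70 can be sketched roughly as follows. This is an illustrative re-implementation, not the code from the PR: the procedural prefix list and the date regex are assumptions (only part of the episodic keyword lists is visible in the diff), and `epistemic_multiplier` is a hypothetical helper name.

```python
import re

# Assumed keyword lists; the PR's real lists are only partly visible in the diff.
_PROCEDURAL_PREFIXES = ("how to", "how do i", "steps to")  # hypothetical
_EPISODIC_PREFIXES = ("what happened", "when did")
_EPISODIC_CONTAINS = ("session", "last time")
_DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # assumed ISO-date pattern


def detect_query_intent(query: str) -> str:
    """Classify a query as 'procedural', 'episodic', or 'general'."""
    q = query.strip().lower()
    if q.startswith(_PROCEDURAL_PREFIXES):
        return "procedural"
    if (
        q.startswith(_EPISODIC_PREFIXES)
        or any(phrase in q for phrase in _EPISODIC_CONTAINS)
        or _DATE_PATTERN.search(q)
    ):
        return "episodic"
    return "general"


def epistemic_multiplier(query_intent, epistemic_type) -> float:
    """Return the score multiplier described in the PR summary."""
    # "general" intent or "semantic" type: no adjustment.
    if query_intent in (None, "general") or epistemic_type in (None, "semantic"):
        return 1.0
    if epistemic_type == query_intent:
        return 1.3  # article type matches the query intent
    if {query_intent, epistemic_type} == {"procedural", "episodic"}:
        return 0.85  # procedural/episodic conflict
    return 1.0
```

Under these assumptions, `epistemic_multiplier(detect_query_intent("how to rotate api keys"), "procedural")` yields the 1.3 boost, while the same query against an "episodic" article yields the 0.85 penalty.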

#71 — Cold-Start Mitigation

- Added cold_start_boost: bool = True parameter to multi_signal_rank
- Fresh articles (created within 48h) with confidence ≥ 0.7 receive a score floor:
  - 0–24h: floor = 0.30
  - 24–48h: floor = 0.15
  - >48h: no floor
- Prevents new high-quality articles from being buried by established articles with usage_score > 0
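
To make the effect of the floor concrete, here is a small worked sketch. It assumes the floor is applied as `max(score, floor)`, which is the usual meaning of a score floor but is not spelled out in the description; `floor_for` and `apply_floor` are hypothetical names.

```python
from typing import Optional


def floor_for(age_hours: float, confidence: float) -> Optional[float]:
    """Floor tiers from the PR description; None means no floor applies."""
    if confidence < 0.7:
        return None
    if age_hours <= 24:
        return 0.30
    if age_hours <= 48:
        return 0.15
    return None


def apply_floor(score: float, floor: Optional[float]) -> float:
    """Assumed semantics: the floor can only raise a score, never lower it."""
    return score if floor is None else max(score, floor)


# A fresh, high-confidence article with a weak ranking score of 0.12:
fresh = apply_floor(0.12, floor_for(age_hours=3, confidence=0.8))
day_old = apply_floor(0.12, floor_for(age_hours=30, confidence=0.8))
stale = apply_floor(0.12, floor_for(age_hours=72, confidence=0.8))
low_conf = apply_floor(0.12, floor_for(age_hours=3, confidence=0.5))
strong = apply_floor(0.55, floor_for(age_hours=3, confidence=0.9))
```

With these inputs the weak score is lifted to the tier floor while the article is fresh, and a score already above the floor is left unchanged.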

Tests

29 new tests in tests/core/test_ranking_improvements.py covering all cases including combined scenarios.

Checklist

- Backward-compatible (new params have defaults)
- All 1666 existing tests pass
- ruff format + lint clean

Copilot

Pull request overview

This PR implements two ranking improvements to multi_signal_rank in ranking.py: (1) epistemic type awareness that boosts/penalises articles based on whether their epistemic_type matches the detected query intent, and (2) a cold-start floor that prevents fresh high-confidence articles from being buried behind established ones. A new detect_query_intent function classifies queries as procedural, episodic, or general.

Changes:

- ranking.py: Added detect_query_intent, _cold_start_floor, and updated multi_signal_rank with query_intent and cold_start_boost parameters
- retrieval.py: Wired detect_query_intent into the main _retrieve_sync retrieval path
- tests/core/test_ranking_improvements.py: 29 new tests covering all new behaviours

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `src/valence/core/ranking.py` | Adds epistemic type detection, cold-start floor helper, and extends `multi_signal_rank` with two new optional parameters |
| `src/valence/core/retrieval.py` | Integrates `detect_query_intent` into the main retrieval path and passes `query_intent` to `multi_signal_rank` |
| `tests/core/test_ranking_improvements.py` | New test file with 29 tests covering `detect_query_intent`, epistemic ranking, cold-start boost, and combined scenarios |

Comments suppressed due to low confidence (1)

src/valence/core/ranking.py:337

The score_breakdown in explain mode stores final_score (which includes the epistemic multiplier and/or cold-start floor) as "final", but each signal's "contribution" still reflects only the pre-adjustment weighted score. When a query intent match or conflict multiplies final_score by 1.3 or 0.85, or the cold-start floor raises it, the existing invariant sum(contributions) == final is violated. The existing tests in tests/core/test_ranking.py (lines 196–197) and tests/cli/test_cli.py (lines 303–304) pass only because the test articles are created with days_ago=0 and confidence_overall=0.7 (so the cold-start floor doesn't change anything) and have no epistemic_type. Any call to multi_signal_rank(explain=True, query_intent="procedural") on a procedural article will produce a score_breakdown where sum(contributions) != final. Consider adding epistemic_multiplier and cold_start_floor fields to the breakdown dict so callers can inspect and audit the full scoring.

        if explain:
            r["score_breakdown"] = {
                "semantic": {
                    "value": semantic,
                    "weight": semantic_weight,
                    "contribution": semantic_weight * semantic,
                },
                "confidence": {
                    "value": confidence,
                    "weight": confidence_weight,
                    "contribution": confidence_weight * confidence,
                },
                "recency": {
                    "value": recency,
                    "weight": recency_weight,
                    "contribution": recency_weight * recency,
                },
                "final": final_score,
            }
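
One possible shape for the extended breakdown suggested above is sketched below. It is an assumption-laden illustration: `build_breakdown` is a hypothetical helper, the `base` field is an addition not mentioned in the comment, the default weights are copied from the retrieval.py call, and the order of operations (multiplier first, then floor) is assumed rather than confirmed by the diff.

```python
def build_breakdown(
    semantic,
    confidence,
    recency,
    *,
    semantic_weight=0.50,
    confidence_weight=0.35,
    recency_weight=0.15,
    epistemic_multiplier=1.0,
    cold_start_floor=None,
):
    """Return a breakdown whose fields fully explain the final score."""
    signals = {
        "semantic": (semantic, semantic_weight),
        "confidence": (confidence, confidence_weight),
        "recency": (recency, recency_weight),
    }
    breakdown = {
        name: {"value": value, "weight": weight, "contribution": weight * value}
        for name, (value, weight) in signals.items()
    }
    # Pre-adjustment weighted sum; this is what the contributions add up to.
    base = sum(entry["contribution"] for entry in breakdown.values())
    # Assumed order: apply the multiplier, then raise to the floor if one applies.
    adjusted = base * epistemic_multiplier
    final = adjusted if cold_start_floor is None else max(adjusted, cold_start_floor)
    breakdown.update(
        base=base,
        epistemic_multiplier=epistemic_multiplier,
        cold_start_floor=cold_start_floor,
        final=final,
    )
    return breakdown


boosted = build_breakdown(0.8, 0.9, 0.5, epistemic_multiplier=1.3)
floored = build_breakdown(0.1, 0.7, 0.0, cold_start_floor=0.30)
```

With fields like these, a caller can check `final` against `max(base * epistemic_multiplier, cold_start_floor)` instead of relying on sum(contributions) == final.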

Copilot · 2026-02-28T18:14:47Z

src/valence/core/retrieval.py

    for r in all_results:
        freshness_days = r.get("freshness", 90.0)
        r["_original_created_at"] = r.get("created_at")
        r["created_at"] = (now - timedelta(days=freshness_days)).isoformat()

+    query_intent = detect_query_intent(query)
    ranked = multi_signal_rank(
        all_results,
        semantic_weight=0.50,
        confidence_weight=0.35,
        recency_weight=0.15,
+        query_intent=query_intent,
    )


The cold-start boost is silently broken in _retrieve_sync. Before calling multi_signal_rank, the code replaces each article's created_at with a fake date derived from freshness_days (line 551). Since _cold_start_floor reads created_at to determine article age, it will evaluate the synthetic date rather than the actual creation timestamp. A newly created article with freshness = 90.0 (the default for unknown age) would have its created_at shifted to 90 days ago, making it ineligible for the cold-start floor entirely.

The fix is to store the real created_at separately before the substitution (e.g. as _original_created_at, which already happens) and pass the real creation time into _cold_start_floor, or alternatively apply the cold-start evaluation after restoring created_at rather than during multi_signal_rank. As written, the cold-start boost works only when multi_signal_rank is called directly (e.g. in tests), but not via the main retrieve() code path.
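
A minimal sketch of the first option, assuming the preserved `_original_created_at` key is still present when the floor is evaluated. `creation_time_for_cold_start` and `age_hours` are hypothetical helpers, not functions from the PR.

```python
from datetime import datetime, timedelta, timezone


def creation_time_for_cold_start(article: dict):
    """Prefer the preserved real timestamp over the synthetic created_at."""
    if "_original_created_at" in article:
        return article["_original_created_at"]
    return article.get("created_at")


def age_hours(timestamp: str, now: datetime) -> float:
    """Age of an ISO-8601 timestamp in hours relative to `now`."""
    created = datetime.fromisoformat(timestamp)
    return (now - created).total_seconds() / 3600.0


now = datetime.now(timezone.utc)
article = {"created_at": (now - timedelta(hours=2)).isoformat(), "freshness": 90.0}

# Same substitution as in the retrieval.py hunk above.
article["_original_created_at"] = article.get("created_at")
article["created_at"] = (now - timedelta(days=article["freshness"])).isoformat()

synthetic_age = age_hours(article["created_at"], now)  # what the floor sees today
real_age = age_hours(creation_time_for_cold_start(article), now)  # what it should see
```

Here the synthetic age is 90 days, far outside the 48h window, while the real age is 2 hours and would qualify for the 0.30 floor.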

Copilot · 2026-02-28T18:14:47Z

tests/core/test_ranking_improvements.py

+    def test_general_default(self):
+        assert detect_query_intent("python async patterns") == "general"
+
+    def test_general_empty(self):


The test name test_general_empty is misleading — the query string "database indexing strategies" is not empty. The test is actually checking that a generic, non-procedural and non-episodic query returns "general". A more accurate name would be test_general_generic_query or test_general_non_matching_query.

Suggested change

-    def test_general_empty(self):
+    def test_general_generic_query(self):

Copilot · 2026-02-28T18:14:47Z

src/valence/core/ranking.py

    decay_rate: float = 0.01,
    min_confidence: float | None = None,
    explain: bool = False,
+    query_intent: str | None = None,
+    cold_start_boost: bool = True,
 ) -> list[dict]:
    """Apply multi-signal ranking to query results.



The multi_signal_rank docstring on line 248 still documents the formula as only the three-signal weighted sum, but the function now also conditionally applies an epistemic type multiplier (×1.3 or ×0.85) and a cold-start score floor. The docstring should be updated to describe the extended formula including these adjustments.

Copilot · 2026-02-28T18:14:47Z

tests/core/test_ranking_improvements.py

+import pytest
+


The pytest module is imported on line 10 but never used in the test file — no pytest.mark, pytest.raises, pytest.fixture, or similar constructs are present. This unused import should be removed.

Suggested change

-import pytest

Copilot · 2026-02-28T18:14:48Z

src/valence/core/ranking.py

+    "what happened",
+    "when did",
+)
+_EPISODIC_CONTAINS = ("session", "last time")


The term "session" in _EPISODIC_CONTAINS is a broad substring match that will produce false positives for legitimate procedural/semantic queries such as "configure session timeout", "manage user sessions", or "HTTP session handling". These have nothing to do with episodic (time-anchored) recollections. Consider using a more specific phrase like "last session" or "previous session" instead of the bare word "session".

Suggested change

-_EPISODIC_CONTAINS = ("session", "last time")
+_EPISODIC_CONTAINS = ("last session", "previous session", "last time")
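
The false positives are easy to reproduce with a plain substring check. The snippet below is a sketch that assumes the detector lowercases the query and tests `phrase in query`; `matches` is a hypothetical helper, and the two tuples mirror the current and suggested values.

```python
BROAD = ("session", "last time")  # current value in the diff
NARROW = ("last session", "previous session", "last time")  # suggested value


def matches(query: str, phrases: tuple) -> bool:
    """True if any phrase occurs as a substring of the lowercased query."""
    q = query.lower()
    return any(phrase in q for phrase in phrases)


queries = [
    "configure session timeout",
    "manage user sessions",
    "HTTP session handling",
    "what did we decide last session",
]

# (query, matched by broad list, matched by narrow list)
report = [(q, matches(q, BROAD), matches(q, NARROW)) for q in queries]
for row in report:
    print(row)
```

Under that assumption, the first three queries match only the broad list, and the genuinely episodic fourth query matches both.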

Copilot · 2026-02-28T18:14:48Z

src/valence/core/ranking.py

+# ---------------------------------------------------------------------------
+# Query intent detection (#70)
+# ---------------------------------------------------------------------------


The PR title and description reference issues #70 and #71, but the actual issues #70 and #71 are about "Implement propagate() API with restriction composition" and "Add strip_on_forward field redaction" respectively — both related to privacy primitives. They have no connection to epistemic type ranking or cold-start boost features being implemented here. The issue references are therefore incorrect.

Copilot · 2026-02-28T18:14:48Z

src/valence/core/ranking.py

+_COLD_START_WINDOW_HOURS = 48
+_COLD_START_FLOOR_FULL = 0.3  # floor at creation -> 24h
+_COLD_START_FLOOR_HALF = 0.15  # floor at 24h -> 48h
+_COLD_START_MIN_CONFIDENCE = 0.7
+
+
+def _cold_start_floor(article: dict, confidence: float) -> float | None:
+    """Return the cold-start score floor for a qualifying article, or None."""
+    if confidence < _COLD_START_MIN_CONFIDENCE:
+        return None
+
+    created_at = article.get("created_at")
+    if created_at is None:
+        return None
+
+    if isinstance(created_at, str):
+        try:
+            created_at = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
+        except (ValueError, TypeError):
+            return None
+
+    if isinstance(created_at, datetime):
+        if created_at.tzinfo is None:
+            created_at = created_at.replace(tzinfo=UTC)
+        now = datetime.now(UTC)
+        age_hours = (now - created_at).total_seconds() / 3600.0
+        if age_hours <= 24:
+            return _COLD_START_FLOOR_FULL
+        elif age_hours <= 48:
+            return _COLD_START_FLOOR_HALF


The constant _COLD_START_WINDOW_HOURS = 48 is defined but never referenced in _cold_start_floor. The 24 and 48 hour thresholds are hardcoded as magic numbers directly in the function body. _COLD_START_WINDOW_HOURS should be used in the elif age_hours <= _COLD_START_WINDOW_HOURS comparison, and a companion _COLD_START_MID_WINDOW_HOURS = 24 constant (or similar) should guard the first branch — consistent with how _COLD_START_FLOOR_FULL, _COLD_START_FLOOR_HALF, and _COLD_START_MIN_CONFIDENCE are used.

- Add detect_query_intent() to classify queries as procedural/episodic/general
- Apply 1.3x boost when article epistemic_type matches query intent
- Apply 0.85x penalty when procedural↔episodic conflict
- Add cold_start_boost: fresh articles (≤48h, confidence≥0.7) get score floor
  - 0–24h: floor = 0.30; 24–48h: floor = 0.15; >48h: no floor
- Wire detect_query_intent into retrieval.py retrieve path
- 29 tests covering all cases

Copilot AI review requested due to automatic review settings February 28, 2026 18:09

Copilot started reviewing on behalf of zonk1024 · February 28, 2026 18:09

Copilot AI reviewed Feb 28, 2026

zonk1024 force-pushed the ranking-improvements branch from 8b11a7a to 9b88c14 · February 28, 2026 18:41

zonk1024 added 2 commits February 28, 2026 10:42

style: fix unused import

6281fdf

zonk1024 force-pushed the ranking-improvements branch from 9b88c14 to 6281fdf · February 28, 2026 18:42

zonk1024 merged commit d03546c into main Feb 28, 2026
3 checks passed

zonk1024 merged 2 commits into main from ranking-improvements
