feat(recall): add proof_count boost to combined scoring #821

Open
Abdulkadirklc wants to merge 6 commits into vectorize-io:main from Abdulkadirklc:feature/proof-count-boost

Conversation

@Abdulkadirklc

Observations with more supporting evidence now rank slightly higher in recall results. proof_count is threaded through the retrieval pipeline and applied as a multiplicative boost in reranking:

  • types.py: add proof_count field to RetrievalResult
  • retrieval.py: include proof_count in SELECT columns
  • reranking.py: add log1p-normalized proof_count boost (alpha=0.1)

The boost uses the same multiplicative pattern as the recency and temporal signals. proof_count=1 is neutral, and proof_count=50 gives roughly a +5% boost. Non-observation fact types are unaffected: they receive the neutral normalized value of 0.5.
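
A minimal sketch of the multiplicative pattern described above, not the code from this PR. The names `proof_norm`, `apply_proof_boost`, and `REFERENCE_COUNT` are hypothetical, and the exact normalization is an assumption: the description says log1p-normalized, while a review comment further down notes that the code uses `math.log`. This sketch uses `math.log` so that proof_count=1 maps exactly to the neutral value 0.5, and it assumes 100 as the count that maps to 1.0.

```python
import math

ALPHA = 0.1            # boost weight quoted in the PR description
REFERENCE_COUNT = 100  # assumed normalizer; hypothetical name


def proof_norm(proof_count):
    """Map proof_count to a normalized value where 0.5 is neutral (assumed mapping)."""
    if proof_count is None or proof_count < 1:
        # Missing counts (e.g. non-observation fact types) stay neutral.
        return 0.5
    # log(1) == 0, so proof_count=1 is neutral; REFERENCE_COUNT maps to 1.0.
    return 0.5 + 0.5 * math.log(proof_count) / math.log(REFERENCE_COUNT)


def apply_proof_boost(score, proof_count, alpha=ALPHA):
    """Scale score by 1 + alpha * (norm - 0.5), so a neutral norm leaves it unchanged."""
    return score * (1.0 + alpha * (proof_norm(proof_count) - 0.5))
```

Under these assumptions the multiplier is 1.0 at proof_count=1 and 1.05 at proof_count=100. This sketch does not clamp the normalized value, so counts above the reference would push the multiplier past 1.05.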

@nicoloboschi
Collaborator

Nice approach! A couple of things to address:

  1. Missing tests — this needs unit tests for the new proof_count boost in apply_combined_scoring. Cover at least: neutral when proof_count is None, neutral when proof_count=1, increasing boost with higher counts, and clamping at 100+.

  2. Other retrieval paths — the proof_count column is only added to the semantic/BM25 query in retrieval.py. Graph retrieval (graph_retrieval.py) and temporal retrieval also build RetrievalResult objects — those paths should populate proof_count too, otherwise graph/temporal-retrieved observations will always get a neutral boost even when they have proof evidence.
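
One way to keep the retrieval paths consistent is to build results through a single helper, so every path populates proof_count the same way. The sketch below is illustrative only: the `RetrievalResult` fields other than `proof_count`, the row keys, and the `result_from_row` helper are assumptions, not the project's actual definitions.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class RetrievalResult:
    # Hypothetical minimal shape; the real class carries more fields.
    id: str
    text: str
    fact_type: str
    proof_count: Optional[int] = None  # None means "no evidence count available"


def result_from_row(row: Mapping[str, Any]) -> RetrievalResult:
    """Build a result from a query row, whichever retrieval path produced it."""
    return RetrievalResult(
        id=str(row["id"]),
        text=row["text"],
        fact_type=row["fact_type"],
        # .get() keeps the field neutral (None) if a path does not select the column.
        proof_count=row.get("proof_count"),
    )
```

With a shared constructor like this, a path that forgets to select the column degrades to the neutral case instead of failing, which makes the gap easy to miss. A test per path that asserts a non-None proof_count for observations would catch it.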

@nicoloboschi
Collaborator

The log1p(100) normalizer is hardcoded — any observation with proof_count > 100 gets clamped to the same max boost. This feels like a magic number that will silently break for banks with heavily-reinforced observations. I'd remove the hardcoded cap and let the normalization scale naturally, or at least make it configurable.
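
A sketch of the "make it configurable" option, under stated assumptions: `ProofBoostConfig` and `max_proof_count` are hypothetical names, and the mapping (neutral 0.5 at proof_count=1, using `math.log`) is assumed rather than taken from the PR.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProofBoostConfig:
    # Hypothetical settings object; defaults mirror the values discussed in this PR.
    alpha: float = 0.1
    max_proof_count: int = 100  # count at which the boost saturates; must be > 1


def proof_norm(proof_count, config=ProofBoostConfig()):
    """Normalize proof_count to [0.5, 1.0], saturating at config.max_proof_count."""
    if proof_count is None or proof_count < 1:
        return 0.5  # neutral
    capped = min(proof_count, config.max_proof_count)
    return 0.5 + 0.5 * math.log(capped) / math.log(config.max_proof_count)
```

A bank with heavily reinforced observations could then raise the cap, for example `ProofBoostConfig(max_proof_count=1000)`, so that counts between 100 and 1000 still produce distinct boosts.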

@nicoloboschi left a comment
Collaborator

A few things to fix:

  • Remove the BFSGraphRetriever class from graph_retrieval.py (looks like it slipped in from a merge conflict)
  • Clamp proof_norm to [0, 1] — currently unbounded, so extreme proof counts can exceed the documented ±5% range
  • Fix misleading comment in test_proof_count_neutral_at_one — says log1p(1) but code uses math.log
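
The clamping request can be sketched as follows. `clamp01` and `boost_multiplier` are hypothetical names, and the `1 + alpha * (norm - 0.5)` form is an assumption inferred from the neutral value 0.5 and the ±5% range at alpha=0.1.

```python
def clamp01(value: float) -> float:
    """Restrict value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def boost_multiplier(proof_norm: float, alpha: float = 0.1) -> float:
    """Multiplier in [1 - alpha/2, 1 + alpha/2] once proof_norm is clamped."""
    return 1.0 + alpha * (clamp01(proof_norm) - 0.5)
```

With the clamp in place, an out-of-range normalized value such as 1.5 yields the same multiplier as 1.0, so the multiplier stays within the documented range regardless of how large proof_count grows.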

@Abdulkadirklc Abdulkadirklc force-pushed the feature/proof-count-boost branch from e94c225 to ad425c2 Compare April 2, 2026 16:44

2 participants