Add optional Ollama-backed semantic sidecar reranking (lexical-first, additive) #29

Merged
HiveForensicsAI merged 3 commits into main from codex/implement-optional-local-semantic-reranking
Mar 18, 2026

Conversation

@HiveForensicsAI
Contributor

Motivation

  • Provide an optional local semantic reranking layer backed by Ollama embeddings while preserving Knolo's deterministic lexical-first retrieval model.
  • Keep semantic logic additive, explainable, and local-first without introducing a vector DB or changing the .knolo pack format.
  • Use a sidecar index to enable fast, portable reranking over only lexical top-N candidates and allow gating by lexical confidence.

Description

  • Added a semantic core under packages/core/src/semantic/ with types.ts, cosine.ts, sidecar.ts, provider.ts, and rerank.ts implementing provider interfaces, normalization/cosine utilities, sidecar serialization/validation, and lexical-topN reranking/blending.
  • Extended query() API and validation to accept semantic options (sidecar/provider/topN/minLexConfidence/blend/etc.) and attached per-hit evidence fields (retrieval, lexicalScore, semanticScore, blendedScore, modelId) while preserving the default lexical-first behavior.
  • Implemented an Ollama adapter in a new package @knolo/semantic-ollama that implements EmbeddingProvider with batching, timeouts, and clear errors, and added CLI sidecar commands in the knolo binary: semantic:index, semantic:inspect, and semantic:validate.
  • Added tests and README docs: sidecar rerank + validation tests, cosine helper tests, CLI semantic validation tests, and README usage/troubleshooting for the sidecar workflow.
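
The rerank-and-blend step described above can be sketched roughly as follows. This is a minimal illustration, not the code in packages/core/src/semantic/: the type names, the `rerankTopN` signature, the max-normalization of lexical scores, and the linear blend formula are all assumptions made for the example, and a plain `Map` stands in for the sidecar index.

```typescript
// Hypothetical shapes for illustration; not the actual @knolo/core types.
interface LexicalHit {
  blockId: number;
  score: number;
}

interface RerankedHit {
  blockId: number;
  lexicalScore: number;
  semanticScore: number;
  blendedScore: number;
}

// Cosine similarity; returns 0 for empty, mismatched, or zero-norm vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Rerank only the lexical top-N candidates. `hits` is assumed to be sorted by
// lexical score, descending. Candidates beyond topN are not touched here.
function rerankTopN(
  hits: LexicalHit[],
  queryEmbedding: number[],
  blockEmbeddings: Map<number, number[]>, // stand-in for the sidecar index
  topN: number,
  blend: number, // 0 = lexical only, 1 = semantic only (assumed convention)
): RerankedHit[] {
  const head = hits.slice(0, topN);
  const maxLex = Math.max(0, ...head.map((h) => h.score));
  const reranked = head.map((h) => {
    const embedding = blockEmbeddings.get(h.blockId);
    const semanticScore = embedding ? cosine(queryEmbedding, embedding) : 0;
    // Assumed normalization: scale lexical scores into [0, 1] by the top score.
    const lexNorm = maxLex > 0 ? h.score / maxLex : 0;
    return {
      blockId: h.blockId,
      lexicalScore: h.score, // original lexical score is kept as evidence
      semanticScore,
      blendedScore: (1 - blend) * lexNorm + blend * semanticScore,
    };
  });
  reranked.sort((x, y) => y.blendedScore - x.blendedScore);
  return reranked;
}
```

Because only the lexical head is embedded and compared, the cost of the semantic pass is bounded by topN rather than by pack size, which is the design choice the motivation section describes.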

Testing

  • Built core and adapter packages with npm run build --workspace @knolo/core and npm run build --workspace @knolo/semantic-ollama and both builds succeeded.
  • Ran core unit tests with npm run test --workspace @knolo/core and all tests passed (including testSemanticSidecarRerankAndValidation and testCosineHelpers).
  • Ran CLI tests with npm run test --workspace @knolo/cli and all CLI tests passed (including semantic:validate success/failure cases).
  • Verified knolo semantic:index / semantic:inspect / semantic:validate flows via unit tests that exercise sidecar creation/inspection/validation.

Codex Task

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8ac727f6d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

function shouldRerankWithSemantic(pack: Pack, opts: ResolvedSemanticOpts, confidence: number): boolean {
  if (!opts.enabled || opts.mode !== "rerank") return false;
  if (!pack.semantic) return false;
  if (!pack.semantic && !opts.sidecar) return false;

P2: Wire semantic.sidecarPath into rerank eligibility

query() accepts semantic.sidecarPath, but semantic rerank is gated by if (!pack.semantic && !opts.sidecar) return false and the function never resolves sidecarPath into opts.sidecar. In practice, callers that pass only sidecarPath (with semantic.enabled, queryEmbedding, and even force) silently fall back to lexical-only retrieval, so the new option does not actually enable sidecar reranking.
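
One way to address this, sketched under assumptions: resolve sidecarPath into an in-memory sidecar before the eligibility check runs. The interfaces, the `resolveSidecar` and `isRerankEligible` names, and the injected `load` callback are hypothetical and only illustrate the shape of the fix; they are not the code merged in this PR, and the real sidecar format is not modeled here.

```typescript
// Hypothetical, simplified shapes; the real Pack and option types differ.
interface SemanticSidecar {
  modelId: string;
  vectors: number[][];
}

interface SemanticOpts {
  enabled: boolean;
  mode: "rerank" | "off";
  sidecar?: SemanticSidecar;
  sidecarPath?: string;
}

interface PackLike {
  semantic?: unknown;
}

// Resolve sidecarPath into opts.sidecar once, up front. The loader is injected
// so this sketch does not assume any particular file format or I/O API.
function resolveSidecar(
  opts: SemanticOpts,
  load: (path: string) => SemanticSidecar,
): SemanticOpts {
  if (opts.sidecar || !opts.sidecarPath) return opts;
  return { ...opts, sidecar: load(opts.sidecarPath) };
}

// Eligibility check that accepts either pack-embedded semantic data or a
// resolved sidecar.
function isRerankEligible(pack: PackLike, opts: SemanticOpts): boolean {
  if (!opts.enabled || opts.mode !== "rerank") return false;
  return Boolean(pack.semantic) || Boolean(opts.sidecar);
}
```

With this ordering, a caller that passes only sidecarPath becomes eligible after resolution instead of silently falling back to lexical-only retrieval.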

namespace: pack.namespaces?.[r.blockId] ?? undefined,
evidence: {
  retrieval: retrievalMode,
  lexicalScore: r.score,

P2: Preserve pre-rerank score in evidence.lexicalScore

When semantic reranking runs, prelim is replaced with blended/semantic scores, but evidence.lexicalScore is still assigned from r.score here. That means hybrid results report reranked scores as "lexical" scores, which corrupts explainability/telemetry and any downstream logic that uses lexical confidence when semantic rerank is enabled.
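
A minimal sketch of the suggested behavior, assuming the lexical scores are captured by block ID before reranking replaces them. The `Scored` and `Evidence` shapes, the `buildEvidence` helper, and the "lexical" / "hybrid" retrieval labels are hypothetical names chosen for illustration, not the identifiers used in the PR.

```typescript
// Hypothetical shapes for illustration only.
interface Scored {
  blockId: number;
  score: number;
}

interface Evidence {
  retrieval: "lexical" | "hybrid";
  lexicalScore: number;
  blendedScore?: number;
}

// Build per-hit evidence. `prelim` holds the original lexical results;
// `reranked` holds the post-rerank results, or null if reranking did not run.
function buildEvidence(prelim: Scored[], reranked: Scored[] | null): Map<number, Evidence> {
  // Snapshot lexical scores before anything overwrites them.
  const lexicalById = new Map(prelim.map((r) => [r.blockId, r.score] as const));
  const finalHits = reranked ?? prelim;
  const evidence = new Map<number, Evidence>();
  for (const r of finalHits) {
    // Fall back to r.score only if the block was never seen lexically.
    const lexicalScore = lexicalById.get(r.blockId) ?? r.score;
    evidence.set(
      r.blockId,
      reranked
        ? { retrieval: "hybrid", lexicalScore, blendedScore: r.score }
        : { retrieval: "lexical", lexicalScore },
    );
  }
  return evidence;
}
```

Keeping the snapshot separate from the reranked list means downstream consumers of lexicalScore see the same value whether or not semantic reranking ran.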

…sues-in-semantic-rerank-feature

Fix semantic sidecarPath rerank eligibility and evidence lexical score correctness
@HiveForensicsAI HiveForensicsAI merged commit 2ab4149 into main Mar 18, 2026
0 of 2 checks passed
@HiveForensicsAI HiveForensicsAI deleted the codex/implement-optional-local-semantic-reranking branch March 18, 2026 16:16
@HiveForensicsAI HiveForensicsAI added the enhancement label and removed the codex label Mar 18, 2026
Labels

enhancement New feature or request

2 participants