Full spec: docs/hardening-roadmap-2026-04-16.md#h-9
Description
grounding-validator.ts:69 reduces confidence proportionally to invalid-ID fraction but leaves score untouched. An LLM that hallucinates 1/10 IDs keeps full score with 10% lower confidence. Policy decision required.
Current State
- Only confidence is scaled; score is invariant to hallucination rate.
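For reference, a minimal sketch of the behavior described above. The real code at grounding-validator.ts:69 is not reproduced in this issue, so the `DimensionResult` shape, its field names, and `applyCurrentGrounding` are hypothetical stand-ins.

```typescript
// Hypothetical shape; the actual types in grounding-validator.ts may differ.
interface DimensionResult {
  score: number;      // dimension score produced by the LLM
  confidence: number; // 0..1
  citedIds: string[]; // IDs the LLM cites as evidence
}

// Sketch of the current behavior: confidence shrinks with the invalid-ID
// fraction, while score passes through unchanged.
function applyCurrentGrounding(
  result: DimensionResult,
  knownIds: ReadonlySet<string>,
): DimensionResult {
  const total = result.citedIds.length;
  if (total === 0) return result; // nothing cited, so nothing to validate here
  const invalid = result.citedIds.filter((id) => !knownIds.has(id)).length;
  const validRatio = (total - invalid) / total;
  return { ...result, confidence: result.confidence * validRatio };
}
```

With 10 cited IDs of which 1 is unknown, a result with score 8 and confidence 1 comes back with score 8 and confidence 0.9, which is the case the description calls out.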
Suggested Fix (needs policy choice first)
Options:
- Strict: hallucinationRate > 20% → dimension score = 0 (forces re-extraction).
- Soft: scale score by sqrt(ratio), gentler than linear, nonzero.
- Bifurcated: keep score + confidence, add a groundingQuality: 'clean' | 'partial' | 'poor' field for tier assignment to consider.
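To make the trade-offs concrete, here is a rough sketch of the three options side by side. All function and type names are hypothetical. It assumes that `ratio` means the valid-ID fraction (1 minus the hallucination rate), and the cut-offs used for `groundingQuality` are illustrative only, since the issue does not define them.

```typescript
type GroundingQuality = 'clean' | 'partial' | 'poor';

// Hypothetical result shape for illustration.
interface Scored {
  score: number;
  confidence: number;
  groundingQuality?: GroundingQuality;
}

// Strict: a hallucination rate above 20% zeroes the dimension score.
function strictPolicy(r: Scored, hallucinationRate: number): Scored {
  return hallucinationRate > 0.2 ? { ...r, score: 0 } : r;
}

// Soft: scale score by sqrt(ratio). Assumes ratio is the valid-ID fraction,
// clamped to [0, 1] so Math.sqrt never sees a negative number.
function softPolicy(r: Scored, hallucinationRate: number): Scored {
  const ratio = Math.min(1, Math.max(0, 1 - hallucinationRate));
  return { ...r, score: r.score * Math.sqrt(ratio) };
}

// Bifurcated: leave score and confidence alone and attach a quality label.
// The 0 and 0.2 cut-offs are assumptions, not taken from the spec.
function bifurcatedPolicy(r: Scored, hallucinationRate: number): Scored {
  const groundingQuality: GroundingQuality =
    hallucinationRate === 0 ? 'clean' : hallucinationRate <= 0.2 ? 'partial' : 'poor';
  return { ...r, groundingQuality };
}
```

For the 1-in-10 case from the description, the soft policy multiplies score by sqrt(0.9), roughly 0.949, so the penalty is about 5% rather than the 10% a linear scale would give. The strict policy leaves that case untouched because 10% is below the 20% threshold.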
Implementation once chosen:
- SearchConfig.groundingStrictness: 'strict' | 'soft' (default 'soft').
- ExtractedSignals schema + all readers.
Verification
- pnpm build passes
- pnpm test passes
- pnpm typecheck clean
Automation Hints
scope: packages/scoring/src/grounding-validator.ts, packages/core/src/scoring.ts
do-not-touch: adapters
approach: refactor-types
risk: medium
max-files-changed: 6
blocked-by: none
bail-if: grounding tests fail
Priority
Medium — blocks H-12 (scoring test coverage)