This project implements a corpus-grounded question-answering system that generates natural language responses strictly limited to an ingested corpus.
It is deliberately not deterministic, and deliberately not a general-purpose assistant.
The system is designed to answer only what the corpus explicitly supports, cite its sources, and refuse to speculate when evidence is missing.
## Project Status
Stable research implementation.
Not intended as a production knowledge system.
## Goals

- Generate natural language answers without hallucination
- Enforce strict evidence grounding
- Fail closed instead of guessing
- Produce outputs that are human-readable and auditable
- Demonstrate the practical limits of LLM inference under hard constraints
## Core Principle

Evidence precedes generation. The language model may summarize, paraphrase, or explain, but it may do so only using retrieved corpus evidence. If the corpus does not support the question, the system must return NOT_FOUND.
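The fail-closed flow can be sketched as follows. This is a minimal illustration, not the project's actual API: `Passage`, `answer`, and the callback signatures are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Passage:
    doc_id: str    # hypothetical schema; the real passage fields are not specified here
    locator: str
    text: str

NOT_FOUND = "NOT_FOUND"

def answer(question: str,
           retrieve: Callable[[str], List[Passage]],
           generate: Callable[[str, List[Passage]], str]) -> str:
    """Fail closed: generation runs only when retrieval returns evidence."""
    passages = retrieve(question)
    if not passages:
        return NOT_FOUND  # no evidence, no answer: never guess
    # The model sees only `passages`; background knowledge is out of scope.
    return generate(question, passages)
```

The point of the sketch is the ordering: retrieval gates generation, so an empty retrieval short-circuits before the model is ever invoked.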
## Non-Goals

This project is not:
- A deterministic query engine
- A theology engine
- A chatbot
- A knowledge completion system
- A model benchmark or leaderboard exercise
Apparent intelligence derives solely from generation over retrieved corpus text, not inference beyond it.
## Evidence Gating
- The model only sees retrieved corpus passages.
- Background knowledge and training data recall are explicitly disallowed.
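One way to enforce this gate is at prompt assembly: the prompt contains nothing but the retrieved passages and the question, so training-data recall has nothing sanctioned to attach to. A sketch, where the function name, prompt wording, and `(doc_id, locator, text)` triple are all illustrative assumptions:

```python
def build_prompt(question: str, passages: list) -> str:
    """Assemble a prompt from retrieved (doc_id, locator, text) triples only.

    No system knowledge, no chat history, no world facts: just the evidence.
    """
    evidence = "\n".join(
        f"[E{i}] ({doc_id} {locator}) {text}"
        for i, (doc_id, locator, text) in enumerate(passages, start=1)
    )
    return (
        "Answer using ONLY the evidence below. Cite [En] for every sentence.\n"
        "If the evidence does not support the question, reply NOT_FOUND.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )
```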
## Citation Enforcement
- Every sentence in an answer must be supported by at least one citation.
- Citations refer to concrete corpus locations.
- When evidence is insufficient or absent, the system outputs NOT_FOUND.
- No attempt is made to “fill gaps” using inference or common knowledge.
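The per-sentence citation rule is mechanically checkable. A minimal validator sketch, assuming citations use the `[En]` tags shown below and appear inside each sentence; the sentence splitter is deliberately naive:

```python
import re

def validate_citations(answer_text: str, n_evidence: int) -> bool:
    """Reject any answer containing a sentence without a valid [En] citation."""
    text = answer_text.strip()
    if text == "NOT_FOUND":
        return True  # a refusal needs no citations
    # Naive split on sentence-ending punctuation; real text needs sturdier handling.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    for sentence in sentences:
        tags = [int(m) for m in re.findall(r"\[E(\d+)\]", sentence)]
        if not tags:
            return False  # uncited sentence: fail closed
        if any(t < 1 or t > n_evidence for t in tags):
            return False  # citation points at evidence that was never retrieved
    return True
```

A check like this turns "every sentence must be supported" from a prompt instruction into a hard post-generation gate: answers that fail it can be rejected or downgraded to NOT_FOUND.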
## Output Format

All responses follow this structure:

Verdict: ANSWERED | NOT_FOUND
Answer: Natural language text with inline citations [E1], [E2], etc.
Evidence used:
  E1 — doc_id — locator — "verbatim quote"
  E2 — ...
This format is human-first, auditable, and intentionally resistant to silent failure.
JSON is not a safety mechanism. Evidence gating is.
This project prioritizes clarity and inspection over machine convenience. Internal tooling may use structure, but external output is designed for direct reading and review.
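Because the external format is plain and regular, internal tooling can recover structure from it by parsing rather than by emitting JSON in the first place. A sketch of such a parser; the function name, dict fields, and regexes are illustrative, not part of the project:

```python
import re

def parse_response(raw: str) -> dict:
    """Parse the Verdict / Answer / Evidence-used layout into a dict for audit scripts."""
    verdict = re.search(r"^Verdict:\s*(ANSWERED|NOT_FOUND)", raw, re.M)
    answer = re.search(r"^Answer:\s*(.*?)(?=^Evidence used:|\Z)", raw, re.M | re.S)
    evidence = re.findall(r'^\s*(E\d+) — (\S+) — (\S+) — "([^"]*)"', raw, re.M)
    return {
        "verdict": verdict.group(1) if verdict else None,
        "answer": answer.group(1).strip() if answer else "",
        "evidence": [
            {"tag": t, "doc_id": d, "locator": loc, "quote": q}
            for t, d, loc, q in evidence
        ],
    }
```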
## Intended Uses

- Research and study
- Corpus-based religious or textual analysis
- Demonstrating failure modes of LLMs
- Whitepapers, audits, and reproducible experiments
## Out of Scope

- World knowledge completion
- Cross-tradition harmonization
- Implicit timelines or inferred facts
- “Helpful” extrapolation beyond text
- Conversational UX optimization
Copyright © 2025 Right Business Pte Ltd. All rights reserved.
See LICENSE for details.