RightBusiness/rb-corpus-bound-answer-system

Corpus-Bound Answer Generator

This project implements a corpus-grounded question-answering system that generates natural language responses strictly limited to an ingested corpus.

It is deliberately not deterministic, and deliberately not a general-purpose assistant.

The system is designed to answer only what the corpus explicitly supports, cite its sources, and refuse to speculate when evidence is missing.

Project Status

Stable research implementation.
Not intended as a production knowledge system.


Design Goals

  • Generate natural language answers without hallucination
  • Enforce strict evidence grounding
  • Fail closed instead of guessing
  • Produce outputs that are human-readable and auditable
  • Demonstrate the practical limits of LLM inference under hard constraints

Core Principle

Evidence precedes generation.

The language model may summarize, paraphrase, or explain, but it may only do so using retrieved corpus evidence. If the corpus does not support the question, the system must return NOT_FOUND.
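The principle can be sketched as a minimal pipeline. This is an illustrative sketch, not the project's actual code: `retrieve` and `generate` are hypothetical caller-supplied hooks standing in for whatever retriever and model the system wires in.

```python
def answer_question(question, retrieve, generate):
    """Evidence-precedes-generation sketch.

    Hypothetical hooks (assumptions, not this project's API):
      retrieve(question)            -> list of corpus passages
      generate(question, passages)  -> natural-language draft

    Generation is never invoked without evidence in hand.
    """
    passages = retrieve(question)
    if not passages:
        # Fail closed: no supporting evidence means no generation at all.
        return "NOT_FOUND"
    return generate(question, passages)
```

The key property is the order of operations: retrieval gates generation, so an empty evidence set short-circuits before the model is ever called.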


What This Is Not

This project is not:

  • A deterministic query engine
  • A theology engine
  • A chatbot
  • A knowledge completion system
  • A model benchmark or leaderboard exercise

Apparent intelligence derives solely from generation over retrieved corpus text, not from inference beyond it.


System Behavior

Evidence Gating

  • The model only sees retrieved corpus passages.
  • Background knowledge and training data recall are explicitly disallowed.
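One way to enforce this gating is to construct the model's prompt exclusively from retrieved passages. The sketch below is an assumption about how such a prompt could be built; the field names and instruction wording are illustrative, not this project's actual prompt.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    """A retrieved corpus passage with a stable citation label."""
    label: str    # e.g. "E1"
    doc_id: str
    locator: str  # hypothetical locator format, e.g. "ch3:para12"
    text: str

def build_gated_prompt(question: str, passages: list[Passage]) -> str:
    """Compose a prompt whose only evidence is the retrieved passages.

    The instruction block forbids background knowledge and tells the
    model to emit NOT_FOUND when the passages do not support an answer.
    """
    evidence = "\n".join(
        f'[{p.label}] ({p.doc_id}, {p.locator}) "{p.text}"' for p in passages
    )
    return (
        "Answer using ONLY the evidence passages below. "
        "Do not use background knowledge. Cite each sentence with [E#]. "
        "If the evidence does not support an answer, reply exactly "
        "NOT_FOUND.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}\nAnswer:"
    )
```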

Citation Enforcement

  • Every sentence in an answer must be supported by at least one citation.
  • Citations refer to concrete corpus locations.
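Sentence-level citation coverage can be checked mechanically. A minimal validator sketch, assuming citations use the `[E#]` style shown later in the output contract and using a naive punctuation-based sentence split:

```python
import re

CITATION = re.compile(r"\[E\d+\]")

def uncited_sentences(answer: str) -> list[str]:
    """Return the sentences of an answer that carry no [E#] citation.

    Splitting on terminal punctuation is deliberately naive; it is
    enough for a validator sketch, not production sentence detection.
    """
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", answer)
        if s.strip()
    ]
    return [s for s in sentences if not CITATION.search(s)]
```

An answer passes citation enforcement only when this list comes back empty.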

Fail-Closed Output

  • When evidence is insufficient or absent, the system outputs NOT_FOUND.
  • No attempt is made to “fill gaps” using inference or common knowledge.
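The fail-closed rule can be expressed as a final verdict check: a draft is only ever promoted to ANSWERED, never patched into one. A sketch, assuming `[E#]` citation labels and a known set of retrieved evidence labels:

```python
import re

def verdict(answer_draft: str, evidence_labels: set[str]) -> str:
    """Fail closed: emit ANSWERED only when the draft is non-empty,
    is not itself NOT_FOUND, and cites only labels that actually
    exist in the retrieved evidence set. Everything else collapses
    to NOT_FOUND rather than being repaired."""
    draft = answer_draft.strip()
    if not draft or draft == "NOT_FOUND":
        return "NOT_FOUND"
    cited = set(re.findall(r"\[(E\d+)\]", draft))
    if not cited or not cited <= evidence_labels:
        # Uncited or out-of-evidence claims are rejected, not fixed.
        return "NOT_FOUND"
    return "ANSWERED"
```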

Output Contract (No JSON)

All responses follow this structure:

Verdict: ANSWERED | NOT_FOUND

Answer: Natural language text with inline citations [E1], [E2], etc.

Evidence used:

E1 — doc_id — locator — "verbatim quote"
E2 — ...

This format is human-first, auditable, and intentionally resistant to silent failure.
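The contract above can be rendered from structured internals with a few lines. This is a sketch of one possible renderer, assuming evidence is held as (label, doc_id, locator, quote) tuples; the function name is illustrative:

```python
def render_response(verdict: str, answer: str, evidence: list[tuple]) -> str:
    """Render the plain-text output contract: verdict, answer with
    inline [E#] citations, then one evidence line per tuple."""
    lines = [
        f"Verdict: {verdict}",
        "",
        f"Answer: {answer}",
        "",
        "Evidence used:",
    ]
    for label, doc_id, locator, quote in evidence:
        lines.append(f'{label} — {doc_id} — {locator} — "{quote}"')
    return "\n".join(lines)
```

Because the output is plain text, a malformed response is visibly malformed to a human reader instead of failing silently inside a parser.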


Why No JSON

JSON is not a safety mechanism. Evidence gating is.

This project prioritizes clarity and inspection over machine convenience. Internal tooling may use structure, but external output is designed for direct reading and review.


Intended Use

  • Research and study
  • Corpus-based religious or textual analysis
  • Demonstrating failure modes of LLMs
  • Whitepapers, audits, and reproducible experiments

Out of Scope by Design

  • World knowledge completion
  • Cross-tradition harmonization
  • Implicit timelines or inferred facts
  • “Helpful” extrapolation beyond text
  • Conversational UX optimization

License

Copyright © 2025 Right Business Pte Ltd. All rights reserved.

See LICENSE for details.
