100% Accuracy | 0% Hallucinations | Industry-Leading Performance
A state-of-the-art document retrieval system that preserves hierarchical structure for superior RAG performance. Combines PageIndex, Recursive Language Models (RLM), Knowledge Graphs, and Tree of Thoughts navigation.
RNSR is the only document context retrieval system to achieve 100% accuracy on FinanceBench - the industry-standard benchmark for financial document Q&A. This represents a breakthrough in grounded document retrieval.
Head-to-head comparison on financial document Q&A (Workers' Compensation Act):
| Method | Relevance | Correctness | Hallucination | Avg Time |
|---|---|---|---|---|
| RNSR | 100% | 100% | 0% | 10.73s |
| Long Context LLM | 88% | 75% | 0% | 2.12s |
| Naive RAG | 75% | 50% | 50% | 3.24s |
RNSR correctness is 2x better than Naive RAG and reduces hallucination by 100%.
| Metric | RNSR | GPT-4 RAG | Claude RAG | Industry Avg |
|---|---|---|---|---|
| Accuracy | 100% | ~60% | ~65% | ~55% |
| Hallucination Rate | 0% | ~15% | ~12% | ~20% |
| Grounded Responses | 100% | ~80% | ~85% | ~75% |
Evaluates RNSR's ability to extract chronological events from legal and project documents:
| Document | Events Found | Recall | Order Accuracy | Date Parse |
|---|---|---|---|---|
| Meridian Project History | 15/15 | 100% | 100% | 100% |
| Baxter v Thornton (legal) | 11/11 | 100% | 100% | 100% |
| Average | 26/26 | 100% | 100% | 100% |
Timeline extraction uses regex-based date pre-scanning and post-extraction grounding to prevent hallucinated dates (see Determinism & Grounding below).
Evaluates RNSR's ability to detect conflicting claims within and across documents:
| Scenario | Known Contradictions | Detected | Recall | Precision | F1 |
|---|---|---|---|---|---|
| Single-doc (Greenfield Annual Report) | 5 | 5/5 | 100% | 22% | 36% |
| Cross-doc (Expert Reports + Incident) | 6 | 6/6 | 100% | 15% | 26% |
100% recall means RNSR never misses a real contradiction. Lower precision reflects the system's conservative approach - it flags potential contradictions for human review rather than risking a miss. All 5 single-doc contradictions (revenue, profit, headcount, offices, product sales) and all 6 cross-doc contradictions (diagnosis, speed, admission, GAF score, treatment, fitness) were correctly identified.
RNSR ships with loaders and evaluation harnesses for established academic benchmarks:
| Benchmark | Domain | Task | RNSR Accuracy | Key Metric |
|---|---|---|---|---|
| FinanceBench | Finance | 10-K/10-Q Q&A | 100% | Correctness |
| TAT-QA | Finance | Table + text reasoning | 67%* | EM, F1 |
| QASPER | Scientific papers | Long-document QA | 67%* | F1 |
| DocVQA | Visual documents | QA over images | 67%* | ANLS |
| MultiHiertt | Finance | Multi-step hierarchical tables | -- | EM, F1 |
*Evaluated on 3-sample subset. Failures are attributable to multi-span formatting (TAT-QA), abstractive summarization style (QASPER), and OCR quality (DocVQA) rather than retrieval accuracy. Per-type breakdown: span-type questions score 100% on TAT-QA, extractive questions score 100% on QASPER.
from rnsr.benchmarks import MultiHierttLoader, TATQALoader, QASPERLoader, DocVQALoader
# Load any benchmark dataset
samples = MultiHierttLoader(max_samples=50).load()
for s in samples:
print(f"Q: {s.question} A: {s.expected_answer}")# Comparison benchmark: RNSR vs Naive RAG vs Long Context
make benchmark-compare
# Timeline extraction benchmark
make benchmark-timeline
# Contradiction detection benchmark
make benchmark-contradiction
# All feature benchmarks (timeline + contradiction)
make benchmark-features
# Full academic benchmark suite
python run_all_benchmarks.py
# Specific benchmarks
python run_all_benchmarks.py --benchmarks financebench tatqa qasper docvqa
# Quick smoke test (3 samples per benchmark)
python run_all_benchmarks.py --max-samples 3RNSR employs a multi-layered strategy to minimize LLM non-determinism and prevent hallucinations:
| Layer | Technique | Description |
|---|---|---|
| 1. Sampling Controls | temperature=0.0 + seed=42 |
All LLM calls use zero temperature. OpenAI and Gemini also receive a deterministic seed (RNSR_LLM_SEED env var). |
| 2. Response Caching | CachedLLM wrapper |
When RNSR_LLM_CACHE=1 is set, LLM responses are cached to disk keyed by prompt hash. Identical prompts always return identical results. |
| 3. Structured Output | Provider-native JSON mode | OpenAI uses response_format=json_object, Gemini uses response_mime_type=application/json. All extractors call complete_json() for reliable parsing. |
| 4. Source Grounding | Regex pre-scan + post-validation | Timeline extraction pre-scans text for dates via regex, injects them into the prompt, and post-validates every extracted date against the source. Ungrounded dates are discarded. Entity extraction uses _text_is_grounded() to verify entities exist in source text. |
These layers work together so that repeated benchmark runs produce consistent results.
FinanceBench is a challenging benchmark that tests:
- Complex financial document understanding
- Multi-step reasoning over 10-K/10-Q filings
- Numerical extraction and calculation
- Cross-reference resolution
RNSR's 100% score on this benchmark demonstrates that accurate, hallucination-free document Q&A is achievable with the right architecture.
Unlike traditional RAG systems that chunk documents and lose context, RNSR:
- Preserves Document Structure - Maintains hierarchical relationships between sections
- Knowledge Graph Grounding - Extracts entities (companies, amounts, dates) and verifies relationships
- RLM Navigation - LLM writes code to navigate the document tree, finding relevant sections deterministically
- Cross-Doc KG Disambiguation - When multiple documents give conflicting answers, entity relationships and document context from the Knowledge Graph resolve which answer is authoritative
- Unified Atomic Storage - All document data lives in a single WAL-mode SQLite database per workspace, eliminating the file-locking and corruption issues that plague multi-file stores
- Provenance Tracking - Every answer includes exact citations to source text
- Source Grounding - Regex pre-scanning and post-validation ensure extracted facts exist in the source text
- No Guessing - If information isn't found, RNSR says so rather than hallucinating
RNSR combines neural and symbolic approaches to achieve accurate document understanding:
- Font Histogram Algorithm - Automatically detects document hierarchy from font sizes (no training required)
- Skeleton Index Pattern - Lightweight summaries with KV store for efficient retrieval
- Tree-of-Thoughts Navigation - LLM reasons about document structure to find answers
- RLM Unified Extraction - LLM writes extraction code, grounded in actual text
- Knowledge Graph - Entity and relationship storage for cross-document linking
- Self-Reflection Loop - Iterative answer improvement through self-critique
- Adaptive Learning - System learns from your document workload over time
| Feature | Description |
|---|---|
| 🏆 100% FinanceBench | Only retrieval system to achieve perfect accuracy on the industry benchmark |
| Zero Hallucinations | Grounded answers with provenance - if not found, says so |
| Multi-Format Ingestion | Ingest PDF, DOCX, XLSX, CSV, MSG, and image files — not just PDFs |
| VLM OCR | Scanned/image-only PDFs are transcribed by Gemini/Anthropic/OpenAI vision models instead of tesseract, with automatic provider fallback |
| Unified Store (StoreDB) | Single SQLite database per workspace with WAL mode, atomic transactions, and automatic migration from legacy multi-file stores |
| Hierarchical Extraction | Preserves document structure (sections, subsections, paragraphs) |
| Knowledge Graph | LLM-driven entity & relationship extraction with adaptive type learning and parallel processing |
| Persistent KG | File-backed knowledge graphs that survive across sessions and documents |
| Multi-Document Workspace | Upload multiple documents, build a workspace-wide KG, and query across all of them |
| Cross-Doc KG Disambiguation | When documents disagree, entity relationships and document titles are fed into synthesis prompts so the LLM resolves conflicts using KG context rather than frequency |
| Cross-Document Entity Linking | Automatically discovers that "G. Sorenssen" in Doc A is "GeoV William Sorenssen" in Doc B |
| Timeline Extraction | Automatically builds chronological timelines of events from the knowledge graph |
| Contradiction Detection | Six-strategy detection: KG relationships, subject-gated heuristics, LLM semantic analysis, structure-parallel section matching, entity-centric comparison, and relationship divergence |
| Bring Your Own Data (BYOD) | Pass in pre-built skeleton indexes, KV stores, and knowledge graphs |
| RLM Navigation | LLM writes code to navigate documents - deterministic and reproducible |
| SQL-like Table Queries | SELECT, WHERE, ORDER BY, SUM, AVG over detected tables |
| Provenance System | Every answer traces back to exact document citations |
| LLM Response Cache | Semantic-aware caching for 10x cost/speed improvement |
| Self-Reflection | Iterative self-correction improves answer quality |
| Multi-Document Detection | Automatically splits bundled PDFs |
# Clone the repository
git clone https://github.com/theeufj/RNSR.git
cd RNSR
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install with all LLM providers
pip install -e ".[all]"
# Or install with specific provider
pip install -e ".[openai]" # OpenAI only
pip install -e ".[anthropic]" # Anthropic only
pip install -e ".[gemini]" # Google Gemini only
# With vision features (LayoutLM, torch, torchvision)
pip install -e ".[vision]"Create a .env file:
cp .env.example .env
# Edit .env with your API keys# Choose your preferred LLM provider
OPENAI_API_KEY=sk-...
# or
ANTHROPIC_API_KEY=sk-ant-...
# or
GOOGLE_API_KEY=AI...
# Optional: Override default models
LLM_PROVIDER=anthropic
SUMMARY_MODEL=claude-sonnet-4-5
# Optional: Use a fast, cheap model for entity extraction
RNSR_EXTRACTION_MODEL=gemini-2.5-flash
# RNSR_EXTRACTION_PROVIDER=gemini # if different from your primary providerfrom rnsr import RNSRClient
# Option A: auto-detect provider from env vars / .env file
client = RNSRClient()
# Option B: pass API key directly (recommended for PyPI installs)
client = RNSRClient(api_key="your-key", llm_provider="gemini")
# Option C: explicit provider + model, key from env
client = RNSRClient(llm_provider="anthropic", llm_model="claude-sonnet-4-5")
# Simple one-line Q&A
answer = client.ask("contract.pdf", "What are the payment terms?")
print(answer)
# Advanced navigation with Knowledge Graph (recommended for best accuracy)
result = client.ask_advanced(
"complex_report.pdf",
"Compare liability clauses in sections 5 and 8",
use_knowledge_graph=True, # Entity extraction for better accuracy
enable_verification=False, # Set True for strict mode
)
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']}")make demo
# Open http://localhost:7860 in your browserThe demo includes tabs for Chat, Document Structure, Tables, Knowledge Graph, Timeline, Contradictions, and Multi-Document workspace.
The RNSR benchmark (make benchmark-compare) achieves zero hallucinations and high accuracy. Here's how to replicate this performance in your own application:
The benchmark uses three key components that work together:
-
Knowledge Graph with LLM-Driven Entity Extraction - Uses the
RLMUnifiedExtractorto discover entities and relationships directly from the text. The extractor is adaptive -- it learns new entity types from your documents and persists them to~/.rnsr/learned_entity_types.json. No hardcoded patterns; the LLM writes extraction code grounded in the actual document content. -
Parallel Extraction - Entity extraction runs across skeleton nodes in parallel using a thread pool (default 8 workers), reducing wall-clock time by up to 8x for large documents.
-
Cached LLM Instance - Reuses a single LLM instance across queries for consistency and reduced latency
-
RLMNavigator with Entity Awareness - The navigator can query the knowledge graph to understand relationships between entities in the document
Use ask_advanced() with knowledge graph enabled (the default):
from rnsr import RNSRClient
# Create client with caching (recommended for production)
client = RNSRClient(cache_dir="./rnsr_cache")
# Ask questions with knowledge graph (matches benchmark performance)
result = client.ask_advanced(
"document.pdf",
"What are the total compensation amounts?",
use_knowledge_graph=True, # Enables entity extraction
enable_verification=False, # Set True for strict mode
)
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']}")
# Multiple queries on the same document reuse cached index + knowledge graph
result2 = client.ask_advanced(
"document.pdf",
"Who are the parties mentioned?",
)For maximum control (as used in benchmarks), access the navigator directly:
from rnsr.agent.rlm_navigator import RLMNavigator, RLMConfig
from rnsr.indexing import load_index
from rnsr.indexing.knowledge_graph import KnowledgeGraph
# Load pre-built index
skeleton, kv_store = load_index("./cache/my_document")
# Build knowledge graph with entities
kg = KnowledgeGraph(":memory:")
# ... add entities from your extraction logic ...
# Create navigator with all components
config = RLMConfig(
max_recursion_depth=3,
enable_pre_filtering=True,
enable_verification=False,
)
navigator = RLMNavigator(
skeleton=skeleton,
kv_store=kv_store,
knowledge_graph=kg,
config=config,
)
# Run queries
result = navigator.navigate("What is the contract value?")| Parameter | Default | Description |
|---|---|---|
use_rlm |
True |
Use RLM Navigator (vs. simpler navigator) |
use_knowledge_graph |
True |
Extract entities/relationships in parallel and build knowledge graph |
enable_pre_filtering |
True |
Filter nodes by keywords before LLM calls |
enable_verification |
False |
Enable strict critic loop (can reject valid answers) |
max_recursion_depth |
3 |
Maximum depth for recursive sub-LLM calls |
- Always use
cache_dir- Avoids re-indexing documents on every query - Keep
use_knowledge_graph=True- This is key to benchmark-level accuracy - Set
enable_verification=Falsefor most cases - The critic can be too aggressive - Reuse the same client instance - The navigator and knowledge graph are cached
- Parallel extraction is automatic - Knowledge graph building runs up to 8 extraction threads in parallel. Tune
max_workerson the_get_or_create_knowledge_graphcall if you hit API rate limits
Every answer includes traceable citations:
from rnsr.agent import ProvenanceTracker, format_citations_for_display
tracker = ProvenanceTracker(kv_store=kv_store, skeleton=skeleton)
record = tracker.create_provenance_record(
answer="The payment terms are net 30.",
question="What are the payment terms?",
variables=navigation_variables,
)
print(f"Confidence: {record.aggregate_confidence:.0%}")
print(format_citations_for_display(record.citations))
# Output:
# **Sources:**
# 1. [contract.pdf] Section: Payment Terms, Page 5: "Payment shall be due within 30 days..."Automatic caching reduces costs and latency:
from rnsr.agent import wrap_llm_with_cache, get_global_cache
# Wrap any LLM function with caching
cached_llm = wrap_llm_with_cache(llm.complete, ttl_seconds=3600)
# Use cached LLM - repeated prompts hit cache
response = cached_llm("What is 2+2?") # Calls LLM
response = cached_llm("What is 2+2?") # Returns cached (instant)
# Check cache stats
print(get_global_cache().get_stats())
# {'entries': 150, 'hits': 89, 'hit_rate': 0.59}Answers are automatically critiqued and improved:
from rnsr.agent import SelfReflectionEngine, reflect_on_answer
# Quick one-liner
result = reflect_on_answer(
answer="The contract expires in 2024.",
question="When does the contract expire?",
evidence="Contract dated 2023, 2-year term...",
)
print(f"Improved: {result.improved}")
print(f"Final answer: {result.final_answer}")
print(f"Iterations: {result.total_iterations}")The system learns from successful queries:
from rnsr.agent import get_reasoning_memory, find_similar_chains
# Find similar past queries
matches = find_similar_chains("What is the liability cap?")
for match in matches:
print(f"Similar query: {match.chain.query}")
print(f"Similarity: {match.similarity:.0%}")
print(f"Past answer: {match.chain.answer}")RNSR automatically detects tables during document ingestion and provides SQL-like query capabilities:
from rnsr import RNSRClient
client = RNSRClient()
# List all tables in a document
tables = client.list_tables("financial_report.pdf")
for t in tables:
print(f"{t['id']}: {t['title']} ({t['num_rows']} rows)")
# SQL-like queries with filtering and sorting
results = client.query_table(
"financial_report.pdf",
table_id="table_001",
columns=["Description", "Amount"],
where={"Amount": {"op": ">=", "value": 10000}},
order_by="-Amount", # Descending
limit=10,
)
# Aggregations
total = client.aggregate_table(
"financial_report.pdf",
table_id="table_001",
column="Revenue",
operation="sum", # sum, avg, count, min, max
)
print(f"Total Revenue: ${total:,.2f}")The RLM Navigator can also query tables during navigation using list_tables(), query_table(), and aggregate_table() functions in the REPL environment.
Handle ambiguous queries gracefully:
from rnsr.agent import QueryClarifier, needs_clarification
# Check if query needs clarification
is_ambiguous, analysis = needs_clarification(
"What does it say about the clause?"
)
if is_ambiguous:
print(f"Ambiguity: {analysis.ambiguity_type}")
print(f"Clarifying question: {analysis.suggested_clarification}")
# "What does 'it' refer to in your question?"Manage multiple documents, build a workspace-wide knowledge graph, and ask questions that span across them:
from rnsr import DocumentStore
# Create or open a document store (backed by a single StoreDB SQLite file)
store = DocumentStore("./my_documents/")
# Add documents — PDF, DOCX, XLSX, CSV, MSG, and images are all supported
store.add_document("contract_a.pdf")
store.add_document("contract_b.docx", metadata={"year": 2024})
# Build workspace knowledge graph & link entities across documents
kg = store.build_workspace_kg()
links = store.link_entities_across_documents()
print(f"Found {len(links)} cross-document entity links")
# Query across all documents
result = store.query_cross_document("What are the payment terms in each contract?")
print(result["answer"])
print(f"Documents used: {result['documents_used']}")How cross-document disambiguation works: When documents give conflicting answers, the CrossDocNavigator enriches its synthesis prompt with:
- Document titles — human-readable names instead of opaque hashes, so the LLM can reason about document types (e.g. "Costs Agreement" vs "Invoice Cover Letter").
- Knowledge Graph context — entity relationships, entity-document mappings, and cross-document links are injected directly into the prompt. The synthesis rules instruct the LLM to pick the most contextually relevant answer rather than the most frequent one.
All workspace data — skeletons, KV content, knowledge graphs, and the catalog — is persisted in a single WAL-mode SQLite database (store.db) per workspace via StoreDB, providing atomic transactions and eliminating the file-locking issues of legacy multi-file stores.
The demo UI includes a Multi-Document tab where you can upload multiple documents, build the workspace KG, and run cross-document queries interactively.
Ingest an entire folder of documents (or a list of files) into a DocumentStore in one call:
from rnsr import DocumentStore
store = DocumentStore("./my_store/")
# Ingest all PDFs in a folder
result = store.batch_ingest("./contracts/")
# Recurse into subdirectories
result = store.batch_ingest("./contracts/", recursive=True)
# Ingest a specific list of files
result = store.batch_ingest([
"report_q1.pdf",
"report_q2.pdf",
"report_q3.pdf",
])
# Parallel ingestion with KG build
result = store.batch_ingest(
"./contracts/",
recursive=True,
max_workers=4,
build_kg=True, # build workspace KG + entity linking after ingestion
skip_existing=True, # skip files already in the catalog
)
print(f"{result.succeeded}/{result.total} ingested in {result.elapsed_seconds:.1f}s")
print(f"Skipped: {result.skipped}, Failed: {result.failed}")The same functionality is available from the command line:
# Flat folder
python -m rnsr batch-ingest ./docs/
# Recursive with parallel workers
python -m rnsr batch-ingest ./docs/ --recursive --workers 4
# Explicit file list
python -m rnsr batch-ingest file1.pdf file2.pdf file3.pdf
# Custom store path, glob pattern, and KG build
python -m rnsr batch-ingest ./docs/ -s ./my_store/ -g "*.pdf" --build-kgFor maximum flexibility, you can build indexes externally and pass them into RNSR:
from rnsr import RNSRClient
client = RNSRClient()
# Build indexes once
skeleton, kv_store = client.build_index("document.pdf")
kg = client.build_knowledge_graph(skeleton, kv_store, doc_id="my_doc")
# Query with pre-built data (no re-indexing)
result = client.query(
"What are the key findings?",
skeleton=skeleton,
kv_store=kv_store,
knowledge_graph=kg,
)
print(result["answer"])
# Or pass pre-built data into ask() / ask_advanced()
answer = client.ask(
"document.pdf",
"Who is the primary applicant?",
skeleton=skeleton,
kv_store=kv_store,
knowledge_graph=kg,
)You can also import the building blocks directly:
from rnsr import SkeletonNode, KnowledgeGraph, SQLiteKVStore, InMemoryKVStoreAutomatically build chronological timelines from the knowledge graph:
from rnsr.extraction.timeline_extractor import extract_timeline, format_timeline
# Extract timeline from any knowledge graph (single doc or workspace)
events = extract_timeline(kg)
# Pretty-print
print(format_timeline(events))
# 1. [15 Mar 2019] Contract signed — Entities: Acme Corp, John Smith
# 2. [01 Jun 2023] Amendment filed — Entities: Acme Corp
# 3. [10 Dec 2024] Renewal deadline — Entities: Acme Corp
# Access structured data
for event in events:
print(f"{event.date_str} — {event.description}")
print(f" Parsed: {event.date_parsed}")
print(f" Entities: {event.entities_involved}")
print(f" Source doc: {event.doc_id}")Flag conflicting claims within a single document or across multiple documents:
from rnsr.analysis import detect_document_contradictions, detect_cross_document_contradictions
# Single-document contradictions
contradictions = detect_document_contradictions(
kg=knowledge_graph,
skeleton=skeleton,
kv_store=kv_store,
)
for c in contradictions:
print(f"[{c.type}] {c.confidence:.0%} confidence")
print(f" Claim 1 ({c.source_1}): {c.claim_1}")
print(f" Claim 2 ({c.source_2}): {c.claim_2}")
print(f" {c.explanation}")
# Cross-document contradictions (compares claims from different docs)
# Pass an llm_fn for highest-quality results (strategies 3-5 use it)
from rnsr.llm import get_llm
llm = get_llm()
llm_fn = lambda prompt: str(llm.complete(prompt))
store = DocumentStore("./docs")
kg = store.get_workspace_kg()
doc_tuples = [
(doc_id, *store.get_document(doc_id))
for doc_id in store
]
cross_contradictions = detect_cross_document_contradictions(
kg, doc_tuples, llm_fn=llm_fn
)Cross-document detection uses six complementary strategies:
| # | Strategy | How it works | Signal quality |
|---|---|---|---|
| 1 | KG CONTRADICTS | Looks for explicit CONTRADICTS relationships already in the knowledge graph |
High (pre-extracted) |
| 2 | Subject-Gated Heuristic | Negation detection ("was granted" vs "was denied") and numeric conflicts, but only between claims that share meaningful content words. Dates, reference codes, and section numbers are stripped before comparison | Medium |
| 3 | LLM Semantic | Broad LLM scan of top claims across documents | High (requires llm_fn) |
| 4 | Structure-Parallel | Matches sections with similar headers across documents (e.g. "Diagnosis" in two expert reports) using SequenceMatcher, then compares their content via LLM or heuristic fallback |
High |
| 5 | Entity-Centric | Uses the KG + EntityLinker to find entities spanning multiple documents, gathers all passages mentioning each entity, groups by document, and asks the LLM to find conflicts about the same entity |
Highest |
| 6 | Relationship Divergence | Walks the KG relationship graph for linked entities across documents, detecting contradictory patterns (e.g. SUPPORTS in one doc but CONTRADICTS in another, or same relationship type with conflicting evidence) |
High |
Strategies 4 and 5 exploit the document tree structure (parallel section headers) and cross-document entity mapping (KG entity linking) to compare only what should be compared, eliminating the false positives that plague naive pairwise approaches.
RNSR learns from your document workload. All learned data persists in ~/.rnsr/:
~/.rnsr/
├── learned_entity_types.json # New entity types discovered
├── learned_relationship_types.json # New relationship types
├── learned_normalization.json # Title/suffix patterns
├── learned_stop_words.json # Domain-specific stop words
├── learned_header_thresholds.json # Document-type font thresholds
├── learned_query_patterns.json # Successful query patterns
├── reasoning_chains.json # Successful reasoning chains
└── llm_cache.db # LLM response cache
The more you use RNSR, the better it gets at understanding your domain.
graph LR
PDF["📄 PDF Document"]
ING["🔍 Ingestion"]
TREE["🌳 Hierarchical Tree"]
SKEL["📋 Skeleton Index"]
KG["🧠 Knowledge Graph"]
NAV["🧭 RLM Navigator"]
ANS["✅ Grounded Answer"]
PDF --> ING
ING --> TREE
TREE --> SKEL
TREE --> KG
SKEL --> NAV
KG --> NAV
NAV --> ANS
style PDF fill:#e1f5fe
style KG fill:#f3e5f5
style ANS fill:#e8f5e9
RNSR ingests PDFs, DOCX, XLSX, CSV, MSG, and image files through a unified pipeline:
flowchart TD
INPUT["📄 Document Input"] --> FMT{"File Format?"}
FMT -->|PDF| EXTRACT{"Has extractable text?"}
FMT -->|DOCX/XLSX/CSV/MSG| TEXT["Extract Text"]
FMT -->|"Image (PNG/JPG/...)"| VLM_IMG["VLM Transcription"]
EXTRACT -->|Yes| T1["Tier 1: Font Histogram"]
EXTRACT -->|No| T3
T1 -->|Success| TREE["Build Hierarchical Tree"]
T1 -->|Fail| T2["Tier 2: Semantic Splitter"]
T2 -->|Success| TREE
T2 -->|Fail| T3["Tier 3: VLM OCR"]
T3 --> TREE
TEXT --> TREE
VLM_IMG --> TREE
TREE --> SKEL["Skeleton Index"]
TREE --> KV["Unified Store (StoreDB)"]
TREE --> TBL["Table Detection"]
Tier 3 (VLM OCR) renders each PDF page to a 300 DPI image with PyMuPDF, then transcribes via Gemini/Anthropic/OpenAI vision with automatic provider fallback. Tesseract is kept as a legacy fallback only.
flowchart LR
Q["❓ Question"] --> CL["Clarify<br>ambiguity?"]
CL --> PF["Pre-Filter<br>(keyword scan)"]
PF --> NAV["RLM Tree<br>Navigation"]
NAV --> SYN["Synthesise<br>Answer"]
SYN --> SR["Self-Reflect<br>& Critique"]
SR --> VER["Verify<br>(optional)"]
VER --> A["✅ Answer +<br>Provenance"]
NAV -->|"complex query"| SUB["Sub-LLM<br>Recursion"]
SUB --> NAV
style Q fill:#e1f5fe
style A fill:#e8f5e9
style NAV fill:#fff3e0
The extractor receives ancestor context from the skeleton tree so it always knows whose data it is extracting (e.g. the primary applicant's passport).
flowchart TD
DOC["🌳 Document Tree"] --> SPLIT["Split into<br>Skeleton Nodes"]
SPLIT --> CTX["Build Ancestor Context<br>per Node"]
CTX --> POOL["ThreadPool<br>(8 workers)"]
subgraph PER_NODE ["Per-Node Extraction"]
direction TB
ANC["📍 Ancestor Breadcrumb<br>+ Subject Hint"] --> LLM["LLM Writes<br>Extraction Code"]
LLM --> EXEC["Execute on<br>DOC_VAR"]
EXEC --> TOT["ToT Validation<br>(probability scores)"]
TOT --> ENT["Entities &<br>Relationships"]
end
POOL --> PER_NODE
PER_NODE --> MERGE["Merge Results"]
MERGE --> KG["🧠 Knowledge Graph"]
MERGE --> LEARN["📚 Learn New Types<br>(~/.rnsr/)"]
style DOC fill:#e1f5fe
style KG fill:#f3e5f5
style LEARN fill:#fce4ec
style ANC fill:#fff9c4
Ancestor context example — when extracting Identity Documents (a child of PRIMARY APPLICANT DETAILS), the prompt receives:
Document path: Form 80 > PRIMARY APPLICANT DETAILS > Identity Documents
Subject context: Title: Mr | Family Name: Sorenssen | Given Names: GeoV William | ...
This lets the LLM produce Passport PA1234567 → BELONGS_TO → GeoV William Sorenssen
instead of the meaningless Passport → MENTIONS → PA1234567.
Relationship types that the LLM discovers but don't match a canonical type are
persisted to ~/.rnsr/learned_relationship_types.json. On future documents the
learned types are injected back into the extraction prompt, creating a feedback
loop that improves with use.
flowchart LR
EXT["Extraction<br>Result"] --> CHK{"Type matches<br>canonical?"}
CHK -->|Yes| KG["Knowledge Graph"]
CHK -->|No → OTHER| REC["Record in<br>Registry"]
REC --> AUTO["Auto-Suggest<br>Canonical Mapping"]
AUTO --> JSON["💾 ~/.rnsr/<br>learned_*.json"]
JSON -->|"Next extraction"| PROMPT["Inject into<br>LLM Prompt"]
PROMPT --> EXT
style JSON fill:#fce4ec
style KG fill:#e8f5e9
style PROMPT fill:#fff9c4
RNSR uses a unique combination of Tree of Thoughts (ToT) reasoning and a REPL (Read-Eval-Print Loop) environment for document navigation. This is what sets RNSR apart from naive RAG approaches.
The Problem with Naive RAG: Traditional RAG splits documents into chunks, embeds them, and retrieves based on similarity. This loses hierarchical structure and often retrieves irrelevant chunks for complex queries.
RNSR's RLM Navigation Solution:
flowchart TD
Q["❓ Query"] --> REPL["NavigationREPL<br>(document as environment)"]
subgraph LOOP ["Iterative Code-Generation Loop"]
direction TB
REPL --> GEN["LLM Generates<br>Python Code"]
GEN --> RUN["Execute Code<br>(search_tree, navigate_to, …)"]
RUN --> FIND["Store Findings"]
FIND -->|"Need more info"| REPL
FIND -->|"ready_to_synthesize()"| VAL["ToT Validation<br>(probability scores)"]
end
VAL --> ANS["✅ Grounded Answer<br>+ Citations"]
style Q fill:#e1f5fe
style ANS fill:#e8f5e9
style GEN fill:#fff3e0
How it works:
-
Document as Environment: The document tree is exposed as a programmable environment through
NavigationREPL. The LLM can write Python code to search, navigate, and extract information. -
Code Generation Navigation: Instead of keyword matching, the LLM writes code like:
# LLM-generated code to find CEO salary results = search_tree(r"CEO|chief executive|compensation|salary") for match in results[:3]: navigate_to(match.node_id) content = get_node_content(match.node_id) if "salary" in content.lower(): store_finding("ceo_salary", content, match.node_id) ready_to_synthesize()
-
Iterative Search: The LLM can execute multiple rounds of code, drilling deeper into promising sections, just like a human would browse a document.
-
ToT Validation: Findings are validated using Tree of Thoughts - each potential answer gets a probability score based on how well it matches the query and document evidence.
-
Grounded Answers: All answers are tied to specific document sections. If the LLM can't find reliable information, it honestly reports "Unable to find reliable information" rather than hallucinating.
Available NavigationREPL Functions:
| Function | Description |
|---|---|
search_content(pattern) |
Regex search within current node |
search_children(pattern) |
Search direct children |
search_tree(pattern) |
Search entire subtree with relevance scoring |
navigate_to(node_id) |
Move to a specific section |
go_back() |
Return to previous section |
go_to_root() |
Return to document root |
get_node_content(node_id) |
Get full text of a section |
store_finding(key, content, node_id) |
Save relevant information |
ready_to_synthesize() |
Signal that enough info has been gathered |
Why This Outperforms Naive RAG:
- Hierarchical Understanding: RNSR understands that "Section 42" might contain the CEO salary even if the query doesn't mention "Section 42"
- Multi-hop Reasoning: Can navigate from a table of contents to a specific subsection to find buried information
- Document Length Agnostic: Works equally well on 10-page and 1000-page documents - the LLM navigates to relevant sections rather than trying to fit everything in context
- No Hallucination: If information isn't found through code execution, the system admits it rather than making up answers
graph TD
CLIENT["client.py<br>High-Level API"]
DS["document_store.py<br>Multi-Doc Workspace"]
subgraph INGESTION ["ingestion/"]
P["pipeline.py<br>Multi-Format Orchestrator"]
FH["font_histogram.py"]
HC["header_classifier.py"]
TB["tree_builder.py"]
TP["table_parser.py"]
CP["chart_parser.py"]
OCR["ocr_fallback.py<br>VLM OCR"]
end
subgraph INDEXING ["indexing/"]
SDB["store_db.py<br>Unified SQLite Store"]
SI["skeleton_index.py"]
KV["kv_store.py"]
KGR["knowledge_graph.py"]
SS["semantic_search.py"]
CS["collection_skeleton.py"]
ES["expandable_skeleton.py"]
end
subgraph EXTRACTION ["extraction/"]
RUE["rlm_unified_extractor.py"]
LT["learned_types.py"]
EL["entity_linker.py"]
TL["timeline_extractor.py"]
MOD["models.py"]
end
subgraph ANALYSIS ["analysis/"]
CD["contradiction_detector.py"]
end
subgraph AGENT ["agent/"]
RN["rlm_navigator.py"]
CDN["cross_doc_navigator.py"]
NR["nav_repl.py"]
PROV["provenance.py"]
LC["llm_cache.py"]
SR["self_reflection.py"]
RM["reasoning_memory.py"]
QC["query_clarifier.py"]
end
LLM["llm.py<br>Multi-Provider Abstraction"]
CLIENT --> INGESTION
CLIENT --> INDEXING
CLIENT --> EXTRACTION
CLIENT --> AGENT
DS --> CLIENT
DS --> INDEXING
DS --> EXTRACTION
ANALYSIS --> EXTRACTION
AGENT --> LLM
EXTRACTION --> LLM
INGESTION --> INDEXING
style CLIENT fill:#e1f5fe
style DS fill:#e1f5fe
style LLM fill:#fff3e0
style ANALYSIS fill:#fce4ec
File tree (plain text)
rnsr/
├── agent/ # Query processing
│ ├── rlm_navigator.py # Main navigation agent (RLM + ToT)
│ ├── cross_doc_navigator.py # Cross-document query orchestrator
│ ├── nav_repl.py # NavigationREPL for code-based navigation
│ ├── repl_env.py # Base REPL environment
│ ├── provenance.py # Citation tracking
│ ├── llm_cache.py # Response caching
│ ├── self_reflection.py # Answer improvement
│ ├── reasoning_memory.py # Chain memory
│ ├── query_clarifier.py # Ambiguity handling
│ ├── graph.py # LangGraph workflow
│ └── variable_store.py # Context management
├── analysis/ # Higher-level analysis tools
│ └── contradiction_detector.py # Within- and cross-document contradiction detection
├── extraction/ # Entity/relationship extraction
│ ├── rlm_unified_extractor.py # Unified extractor (RLM + ToT)
│ ├── learned_types.py # Adaptive type learning
│ ├── entity_linker.py # Cross-document entity linking
│ ├── timeline_extractor.py # Chronological timeline extraction
│ └── models.py # Entity/Relationship models
├── indexing/ # Index construction
│ ├── store_db.py # Unified WAL-mode SQLite store per workspace
│ ├── skeleton_index.py # Summary generation
│ ├── collection_skeleton.py # Collection-level skeleton builder
│ ├── expandable_skeleton.py # Lazy skeleton expansion
│ ├── knowledge_graph.py # Entity/relationship storage (SQLite-backed)
│ ├── kv_store.py # SQLite/in-memory storage
│ └── semantic_search.py # Optional vector search
├── ingestion/ # Document processing
│ ├── pipeline.py # Multi-format ingestion orchestrator (PDF, DOCX, XLSX, CSV, MSG, images)
│ ├── font_histogram.py # Font-based structure detection
│ ├── header_classifier.py # H1/H2/H3 classification
│ ├── ocr_fallback.py # VLM OCR via Gemini/Anthropic/OpenAI vision (tesseract as legacy fallback)
│ ├── table_parser.py # Table extraction
│ ├── chart_parser.py # Chart interpretation
│ └── tree_builder.py # Hierarchical tree construction
├── document_store.py # Multi-document workspace management
├── llm.py # Multi-provider LLM abstraction
├── client.py # High-level API (incl. BYOD + cross-doc)
└── models.py # Data structures
from rnsr import RNSRClient
# Auto-detect provider from environment variables or .env file
client = RNSRClient()
# Explicit provider + API key (recommended for PyPI installs)
client = RNSRClient(
api_key="your-key",
llm_provider="gemini", # "openai", "anthropic", or "gemini"
llm_model="gemini-2.5-flash", # optional model override
cache_dir="./rnsr_cache", # optional index cache
)
# Simple query
answer = client.ask("document.pdf", "What is the main topic?")
# Vision mode (for scanned docs)
answer = client.ask_vision("scanned.pdf", "What does the chart show?")| Parameter | Type | Default | Description |
|---|---|---|---|
cache_dir |
str | Path | None |
None |
Directory for caching indexes. Persists and reuses indexes when set. |
llm_provider |
str | None |
None |
LLM provider ("openai", "anthropic", "gemini"). Auto-detected from available API keys when omitted. |
llm_model |
str | None |
None |
Model name override. Uses the provider's default when omitted. |
api_key |
str | None |
None |
API key for the LLM provider. When llm_provider is also set, the key is injected only for that provider; otherwise it is set for all three. |
from rnsr import (
ingest_document,
build_skeleton_index,
run_rlm_navigator,
SQLiteKVStore
)
from rnsr.extraction import RLMUnifiedExtractor
from rnsr.agent import ProvenanceTracker, SelfReflectionEngine
# Step 1: Ingest document
result = ingest_document("document.pdf")
print(f"Extracted {result.tree.total_nodes} nodes")
# Step 2: Build index
kv_store = SQLiteKVStore("./data/index.db")
skeleton = build_skeleton_index(result.tree, kv_store)
# Step 3: Extract entities (grounded, no hallucination)
extractor = RLMUnifiedExtractor()
extraction = extractor.extract(
node_id="section_1",
doc_id="document",
header="Introduction",
content="..."
)
# Step 4: Query with provenance
answer = run_rlm_navigator(
question="What are the key findings?",
skeleton=skeleton,
kv_store=kv_store
)
# Step 5: Get citations
tracker = ProvenanceTracker(kv_store=kv_store)
record = tracker.create_provenance_record(answer, question, variables)RNSR supports three configuration methods (highest priority first):
- Programmatic — pass
api_key,llm_provider, andllm_modeldirectly toRNSRClient()orDocumentStore(). .envfile — place a.envfile in your working directory (or the project root for dev checkouts). RNSR loads it automatically viapython-dotenv.- System environment variables — export the variables in your shell.
| Variable | Description | Default |
|---|---|---|
LLM_PROVIDER |
Primary LLM provider | auto (detect from keys) |
SUMMARY_MODEL |
Model for summarization | Provider default |
AGENT_MODEL |
Model for navigation | Provider default |
EMBEDDING_MODEL |
Embedding model | text-embedding-3-small |
KV_STORE_PATH |
SQLite database path | ./data/kv_store.db |
LOG_LEVEL |
Logging verbosity | INFO |
RNSR_EXTRACTION_MODEL |
Model for entity extraction (e.g. gemini-2.5-flash) |
Same as primary LLM |
RNSR_EXTRACTION_PROVIDER |
Provider for entity extraction (openai, anthropic, gemini) |
Same as primary provider |
RNSR_LLM_CACHE_PATH |
Custom cache location | ~/.rnsr/llm_cache.db |
RNSR_LLM_SEED |
Deterministic seed for OpenAI/Gemini | 42 |
RNSR_LLM_CACHE |
Enable disk-based LLM response caching (1 to enable) |
Off |
RNSR_REQUIRE_GROUNDING |
Discard entities not found in source text (1 to enable) |
Off |
RNSR_REASONING_MEMORY_PATH |
Custom memory location | ~/.rnsr/reasoning_chains.json |
| Provider | Models |
|---|---|
| OpenAI | gpt-5.2, gpt-5-mini, gpt-5-nano, gpt-4.1, gpt-4o-mini |
| Anthropic | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5 |
| Gemini | gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash |
RNSR is designed for complex document understanding tasks:
- Multi-document PDFs - Automatically detects and separates bundled documents
- Hierarchical queries - "Compare section 3.2 with section 5.1"
- Cross-reference questions - "What does the appendix say about the claim in section 2?"
- Entity extraction - Grounded extraction with ToT validation (no hallucination)
- Table queries - "What is the total for Q4 2024?"
RNSR includes sample documents for testing and demonstration:
| File | Type | Features Demonstrated |
|---|---|---|
sample_contract.md |
Legal Contract | Entities (people, orgs), relationships, payment tables, legal terms |
sample_financial_report.md |
Financial Report | Financial tables, metrics, executive names, quarterly data |
sample_research_paper.md |
Academic Paper | Citations, hierarchical sections, technical content, tables |
Legal documents from the Djokovic visa case (public court records) for testing with actual PDFs:
- Affidavits and court applications
- Legal submissions and orders
- Interview transcripts
from pathlib import Path
from rnsr.ingestion import TableParser
from rnsr.extraction import CandidateExtractor
# Parse a sample document
sample = Path("samples/sample_contract.md").read_text()
# Extract tables
parser = TableParser()
tables = parser.parse_from_text(sample)
print(f"Found {len(tables)} tables")
# Extract entities
extractor = CandidateExtractor()
candidates = extractor.extract_candidates(sample)
print(f"Found {len(candidates)} entity candidates")RNSR has comprehensive test coverage with 281+ tests:
# Run all tests
pytest tests/ -v
# Run specific feature tests
pytest tests/test_provenance.py tests/test_llm_cache.py -v
# Run end-to-end workflow tests
pytest tests/test_e2e_workflow.py -v
# Run with coverage
pytest tests/ --cov=rnsr --cov-report=html| Test File | Tests | Coverage |
|---|---|---|
test_e2e_workflow.py |
18 | Full pipeline: ingestion → extraction → KG → query → provenance |
test_provenance.py |
17 | Citations, contradictions, provenance records |
test_llm_cache.py |
17 | Cache get/set, TTL, persistence |
test_self_reflection.py |
13 | Critique, refinement, iteration limits |
test_reasoning_memory.py |
15 | Chain storage, similarity matching |
test_query_clarifier.py |
19 | Ambiguity detection, clarification |
test_table_parser.py |
26 | Markdown/ASCII tables, SQL-like queries |
test_chart_parser.py |
16 | Chart detection, trend analysis |
test_rlm_unified.py |
13 | REPL execution, code cleaning |
test_learned_types.py |
13 | Adaptive learning registries |
The test_e2e_workflow.py demonstrates the complete pipeline:
# Tests cover:
# 1. Document Ingestion - Parse structure and tables
# 2. Entity Extraction - Pattern-based grounded extraction
# 3. Knowledge Graph - Store entities and relationships
# 4. Query Processing - Ambiguity detection, table queries
# 5. Provenance - Citations and evidence tracking
# 6. Self-Reflection - Answer improvement loop
# 7. Reasoning Memory - Learn from successful queries
# 8. LLM Cache - Response caching
# 9. Adaptive Learning - Type discovery
# 10. Full Workflow - Contract and financial analysis# Install dev dependencies
pip install -e ".[dev]"
# Run linting
ruff check .
# Type checking
mypy rnsr/
# Switch between feature branches (interactive picker)
make switchFor testers trying out new features, make switch provides an interactive numbered menu of up to 10 branches sorted by most recent commit:
$ make switch
🔀 Available branches:
1) feature/byod-multi-doc
2) main (current)
Enter branch number (1-10): 1
Switching to: feature/byod-multi-doc
✅ Now on branch: feature/byod-multi-doc
- Python 3.9+
- At least one LLM API key (OpenAI, Anthropic, or Gemini)
MIT License - see LICENSE for details.
See CONTRIBUTING.md for guidelines.
RNSR is inspired by:
- Hybrid Document Retrieval System Design - Core architecture and design principles
- PageIndex (VectifyAI) - Vectorless reasoning-based tree search
- Recursive Language Models - REPL environment with recursive sub-LLM calls
- Tree of Thoughts - LLM-based decision making with probabilities
- Self-Refine / Reflexion - Iterative self-correction patterns