CaseStack

Turn any document dump into a searchable evidence database.

Built by the team behind epstein-data.com, where we turned the 218GB DOJ Epstein file release into a fully searchable, entity-linked, citation-backed research database.

License

This project is licensed under the PolyForm Noncommercial License 1.0.0.

Noncommercial use is allowed under the terms in LICENSE.
Commercial use requires a separate commercial license from the project owner.
Required attribution notice is in NOTICE.
Third-party dependency notices are in THIRD_PARTY_NOTICES.md.

Install

pip install -e ".[pymupdf,nlp]"
python -m spacy download en_core_web_sm

Quickstart

# Point at a folder of PDFs, get a searchable database
casestack ingest ./my-documents --name "City Council FOIA"

# Serve it locally
casestack serve

# Check status
casestack status

Configuration

Copy case.yaml.example to case.yaml and customize. See the example for all options.

How It Works

OCR — Extract text from PDFs (Docling or PyMuPDF)
Entity Extraction — Find people, orgs, dates, money, phone numbers (spaCy NER)
Deduplication — Identify duplicate documents (content hash + fuzzy matching)
Export — SQLite database with FTS5 full-text search
Serve — Datasette web interface with search, filtering, and AI Q&A
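
To make the Deduplication and Export steps above more concrete, here is a minimal sketch using only the Python standard library. It is not CaseStack's actual code: the function names (normalize, content_hash, deduplicate, export_to_sqlite, search), the `documents` table layout, and the 0.9 similarity threshold are all assumptions made for illustration, and difflib stands in for whatever fuzzy matcher the project really uses. It also assumes your Python's sqlite3 module was built with FTS5 enabled, which is common but not guaranteed.

```python
import hashlib
import sqlite3
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def content_hash(text: str) -> str:
    """SHA-256 of the normalized text, used to spot exact duplicates."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def deduplicate(docs: dict[str, str], threshold: float = 0.9) -> dict[str, str]:
    """Keep the first copy of each document; drop exact and near duplicates.

    The threshold is a hypothetical default, not a CaseStack setting.
    """
    kept: dict[str, str] = {}
    seen_hashes: set[str] = set()
    for doc_id, text in docs.items():
        digest = content_hash(text)
        if digest in seen_hashes:
            continue  # exact duplicate after normalization
        norm = normalize(text)
        if any(
            SequenceMatcher(None, norm, normalize(other)).ratio() >= threshold
            for other in kept.values()
        ):
            continue  # near duplicate of a document we already kept
        seen_hashes.add(digest)
        kept[doc_id] = text
    return kept


def export_to_sqlite(docs: dict[str, str], path: str = ":memory:") -> sqlite3.Connection:
    """Write documents into an FTS5 virtual table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE VIRTUAL TABLE documents USING fts5(doc_id UNINDEXED, body)")
    conn.executemany(
        "INSERT INTO documents (doc_id, body) VALUES (?, ?)", list(docs.items())
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return matching doc_ids, best match first."""
    rows = conn.execute(
        "SELECT doc_id FROM documents WHERE documents MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    sample = {
        "a.pdf": "The council approved the budget on March 3.",
        "b.pdf": "The  council approved the budget on March 3.\n",
        "c.pdf": "The council approved the budget on March 4.",
        "d.pdf": "Invoice for road repairs totaling $12,000.",
    }
    unique = deduplicate(sample)
    db = export_to_sqlite(unique)
    print(sorted(unique))
    print(search(db, "budget"))
```

The quadratic pairwise comparison is fine for a handful of files but would not scale to a release with over a million PDFs; a real pipeline would likely bucket candidates first (for example with MinHash or shingling) before any fuzzy comparison.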

Case Presets

Pre-configured case files for known document sets:

presets/epstein.yaml — DOJ Jeffrey Epstein File Release (218GB, 1.38M PDFs)

