Disclaimer: I am a data scientist, not an investigative journalist. All content within this repo must be taken with a grain of salt: it has not been independently verified by a human. It is primarily the result of Claude Opus 4.6 reviewing the searchable database I have established at: https://github.com/rhowardstone/Epstein-research-data
Remember, there is a reason why journalists and investigators are so heavily trained and trusted. They follow extensive protocols to ensure the information they put out is vetted. The average person does not have these skills. Do not reach any conclusions based solely on the content in this repo. The underlying data is available, and all source documents should be linked directly. If you see any inaccuracies, broken links, or anything else fishy, please don't hesitate to reach out by opening an issue.
Similarly, please remember to take care of yourself if you choose to review these things. Be very careful when looking through this stuff. People who do this for a living go through extensive training to make sure they properly self-care, decompress, and take breaks. It is very easy to get carried away, very easy to have emotional reactions. For example, data scientists who read CSAM-adjacent material are subject to mandatory maximum shift lengths, mandatory group therapy and debrief sessions with others who are going through the same data. Be good to yourself, be with friends, and do not just pore through it without taking solid breaks.
The content is intended only for people 18+, and only at your own peril. I am hoping a structured, indexed, organized set of documents makes the "click-bait jump scares" less traumatizing than those we are seeing on social media. Second-hand PTSD is a real phenomenon, and you do not want to live with some of the descriptions and images within this dataset in your head for the rest of time.
This repository contains 150+ forensic analysis reports derived from the U.S. Department of Justice's release of Jeffrey Epstein investigation files -- 194.5 GB across 12 datasets, comprising 1,380,937 PDFs (2,731,785 pages, 3.18 billion characters of text), plus 3,226 non-PDF native files (video, audio, spreadsheets). The analysis involved full-text extraction and FTS5 indexing of every page, 2,587,102 redaction records, 1,530 audio/video transcripts, a 1,536-person entity registry, and cross-referenced entity relationships (524 entities, 2,096 connections).
Every factual claim in these reports traces back to specific EFTA document numbers (Epstein Files Task Force Archive). Click any linked EFTA number to attempt to view the original PDF on justice.gov.
This is not opinion or speculation. These reports synthesize what the documents themselves say, with sourcing. Where conclusions are drawn, the supporting evidence chain is cited. Where evidence is ambiguous, that is noted. If you notice any problems, please raise an ISSUE on this repository and it will be attended to promptly.
| Document | Description |
|---|---|
| CONGRESSIONAL_READING_GUIDE.md | Prioritized document list for the DOJ reading room (original 90 documents) |
| CONGRESS_RAW_EFTA_LIST.md | Complete EFTA document numbers with descriptions |
| CONGRESSIONAL_FOLLOWUP_NEW_FINDINGS.md | NEW: 60 additional critical documents from the full-corpus revisit (DS9-12) — includes FBI FD-1023, current government officials, co-conspirator list, Wexner deposition evidence |
| CONGRESSIONAL_ADDENDUM.md | Supplement: key corrections from the 225-issue factual accuracy audit, DS9 MCC death investigation documents, and March-July 2025 FBI evidence review conclusions |
| congressional_priority_list.md | DOJ reading room priority list: documents ranked by "predator name reveal" score to maximize identification of redacted perpetrators during limited in-person review time |
Every EFTA######## number is a unique DOJ document identifier. Throughout these reports, EFTA numbers are hyperlinked to the DOJ's original PDF hosting location:
https://www.justice.gov/epstein/files/DataSet%20{N}/EFTA{########}.pdf
See the EFTA Dataset Mapping table at the bottom of this file to determine which dataset contains a given EFTA number.
Note: These reports were written in January–February 2026 during the first weeks of the investigation. They remain accurate but do not reflect the full scope of findings from the 150+ reports that followed. For the complete catalog, browse the sections below or visit the full report index.
| Report | Description |
|---|---|
| FINAL_INVESTIGATION_REPORT | Early investigation synthesis (Feb 7, 2026). 400+ EFTA citations, $755M traced, 30+ named individuals. Covers first two weeks of findings only — see later reports for deeper analysis. |
| INSTITUTIONAL_FAILURE_NARRATIVE | "The Architecture of Impunity" -- 7-chapter prosecutorial failure narrative, 1996-2024, 80+ EFTA citations. |
| MASTER_REPORT | Early findings (Feb 5, 2026) from initial redaction and text layer analysis of the 627MB text corpus. |
| PHASE1_GAP_DETECTION | Gap detection and counterfactual analysis -- identifying what's missing from the record. |
| PHASE2_LEVER_TRACEBACK | Who had the power to shield whom. Agency failures, financial concealment, academic legitimization. |
| PHASE3_HIDDEN_DOMAINS | Hidden domain connections. 90+ queries across 1.8M+ redaction records, DS10, knowledge graph, and OCR text records. |
| PHASE4_BRIEFING_KIT | Congressional briefing kit. Prepared for staff use, based on 3.4M redaction records. |
| SESSION9_MASTER_FINDINGS | Supplemental findings: art forensics, trafficking routes, device forensics, prosecution failures, CBP corruption, 4chan/online evidence. |
| ANALYSIS_SUMMARY | First-look findings (Jan 2026) from the initial DOJ release, anchored to SDNY prosecution memo. |
| UNEXPLORED_DOCUMENT_MINING | Deep-search of under-examined areas: camera-in-clock, T-160 VHS tapes, MCC DVR, crypto network, 48 diamonds, CSAM found 2023. |
| Report | Description |
|---|---|
| FORENSIC_ACCT_1_HAZE_DRAWDOWN | Tracing the Haze Trust $41.7M drawdown from $49.5M to ~$7.7M (June 2018 - February 2019). |
| FORENSIC_ACCT_2_MONEY_SOURCES | Tracing the sources of Epstein's wealth. No legitimate source for $500M+ identified in the record. |
| FORENSIC_ACCT_3_INTER_ENTITY_FLOWS | Inter-entity fund flows across the Epstein shell company network. |
| FORENSIC_ACCT_4_JABWCPA_INSTITUTION1 | Identification of JABWCPA (Jeanne Anne Brennan Wiebracht, CPA — de-redacted via DS9) and Institution-1 (Deutsche Bank). Richard Kahn confirmed as rkahn email. |
| FORENSIC_ACCT_5_CALENDAR_CORRELATION | Cross-referencing meeting/calendar data with financial transactions. |
| FORENSIC_ACCT_6_POST_DEATH_ASSETS | Post-death disposition of $600M+ estate: 14 entities, Indyke/Kahn as co-executors. |
| SHELL_ENTITY_MAP | Complete map of 95+ Epstein shell entities across 10 categories under RM CODE 82289. |
| SHELL_ENTITY_DARK_MONEY_INVESTIGATION | 57 additional entities beyond the 95+ baseline map: JEEPERS INC, ELLMAX LLC, Rothschild pipeline. |
| TRANSACTION_CHAIN_AUCTION_TO_DESTINATION | Complete forensic trace: $30.5M in Sotheby's/Christie's auction proceeds through Haze Trust to Valar, Honeycomb, Boothbay, Plan D. |
| TRANSACTION_CHAIN_BLACK_ART_MACHINE | Prosecutorial narrative: 15 chains tracing $168M Black-to-Epstein, art machine / trafficking machine structural unity. |
| TRANSACTION_CHAIN_THIRD_PARTY_ART | Third-party art-related money flows: Prytanee LLC (corrected: Etienne Pierre Jean Binant, not Jack Lang), Rothschild $25M, Tudor $13.5M, Gratitude America, David Mitchell $526K. |
| INVESTIGATION_2_DB_KYC_BREACH | Deutsche Bank KYC breach timeline for Southern Financial LLC / Epstein. |
| INVESTIGATION_3_HAZE_TRUST_AML | Haze Trust AML inquiry -- Deutsche Bank's anti-money laundering process for Epstein's largest trust vehicle. |
| INVESTIGATION_4_2018_WIRE_RECIPIENTS | November/December 2018 wire recipients -- post-Miami Herald payments. |
| INVESTIGATION_7_BARRETT_REPORTS | Paul Barrett's weekly reports as Deutsche Bank relationship manager on the Epstein account. |
| DILORIO_APOLLO_WHISTLEBLOWER | Christopher J. DiLorio SEC whistleblower complaint -- Apollo/Epstein/Kushner connections, ESWW shell company. |
| WECHSLER_BLACK_TRUST_INVESTIGATION | Brad Wechsler (Elysium Management), J BLACK Trust identified as Leon Black discretionary gift trust (created April 2014 at Epstein's direction), $30.5M BV70 circular loan structure. DS9 yielded complete trust agreement chain. |
| LUXURY_PURCHASES_ANALYSIS | Luxury purchases, lifestyle spending, and high-value acquisitions analysis. |
| WOW_GOLD_IGE_BANNON_SEARCH | NEGATIVE: Zero evidence of WoW gold / IGE / virtual currency money laundering across 3.5M+ records. |
| Report | Description |
|---|---|
| LEON_BLACK_PROSECUTION_FAILURE | Complete prosecution failure timeline: SDNY + Manhattan DA failed to charge despite 4+ victims, FBI 302s, $62.5M USVI settlement. |
| LUTNICK_DUBIN_INVESTIGATION | Howard Lutnick (single NTOC tip, financial only) and Glen Dubin (54 documents, "lent out" testimony, Eva described as present, 34+ flights). |
| WILLIAM_BARR_INVESTIGATION | 55+ documents: NTOC tip, father hired Epstein at Dalton, Kirkland & Ellis conflict, split recusal, death investigation oversight. Corrected: OIG did publish its report in June 2023 (125+ pages located in DS9). Evan Barr (Fried Frank) distinguished from AG William Barr — separate individual with direct Epstein attorney-client relationship. |
| RUEMMLER_DEEP_DIVE | Former Obama White House Counsel: 29 documents, "Clinton Obama unnecessary implication" warning, career broker relationship through May 2019. |
| SENATOR_MITCHELL_INVESTIGATION | Former Senate Majority Leader: 4 evidentiary pillars, 2 independent victims, Groff/State Dept call, Mitchell's own admission. |
| MITCHELL_CASCADE_INVESTIGATION | David J. Mitchell (estate co-executor): $580.5K wires, fragmentation pattern, "Cascade" code name, Mandelson connection. Separate from Senator Mitchell. |
| ROTHSCHILD_INVESTIGATION | Ariane de Rothschild's untraceable aderfam.ch channel. $25M in 2 wires bracketing EdR $45M DOJ penalty. Both $25M principals now dead. |
| JUNKERMANN_MC2_INVESTIGATION | Nicole Junkermann: 4,182 docs (expanded from 10+ in DS1-8), 10+ year relationship, Leon Black intro brokered, Jan 2019 island trip. MC2 stranding Russian girls in Milan, recruiting ages 13-20. |
| MARCINKOVA_INVESTIGATION | Nadia Marcinkova: near-zero results for full name in redaction databases (1 hit in DS10; may reflect effective redactions, first-name usage, or tool limitations). $100K Aviloop wire 2 days after Miami Herald. 124 flights. NPA protected. |
| INVESTIGATION_1_BARR_NTOC | William Barr NTOC filing deep dive -- forensic analysis of the tip and associated evidence. |
| INVESTIGATION_5_MAXWELL_SSN | Maxwell NYPD firearms permit anomalies: CT-prefix SSN, military/criminal record flags. |
| INVESTIGATION_6_LEON_BLACK | Leon Black: 47 EFTA docs, NTOC filing, HT Subject Referral, "DANY do not doubt her allegations." |
| INVESTIGATION_8_UNEXPLORED_NAMES | 18 previously under-examined names: comprehensive forensic analysis. |
| ALEXANDER_WANDTKE_NSALEM_INVESTIGATION | Alexander brothers (currently on trial SDNY Jan 2026 for sex trafficking 60+ women), Max Wandtke (ghost), North Salem wedding evidence. |
| RUSSIAN_WOMAN_SCOTT_IDENTIFICATION | Identification attempt: woman likely Uzbek (WIUT), possibly "Nina K." (25+ docs). "Scott" unidentified. |
| UNNAMED_PERSONS_INVESTIGATION | Foreign president (Ehud Barak), AOL cluster, 34 journal names mapped, Wigdor corroboration. |
| DUBAI_SULAYEM_INVESTIGATION | Sultan bin Sulayem directed DP World SVP to contact Epstein's USVI attorney re "port of St Croix." Victim names "Sultan from Dubai." 40+ docs. |
| KHANNA_SIX_NAMES_INVESTIGATION | The six men whose names were redacted by the FBI and read onto the House floor by Rep. Ro Khanna on Feb 10, 2026: Nuara, Mikeladze, Leonov, Caputo, Sultan bin Sulayem, and Wexner. 20-person co-conspirator list analysis. |
| JACQUI_SAFRA_INVESTIGATION | Jacob "Jacqui" Safra: Swiss billionaire (~$5B), Safra banking dynasty. 100+ EFTA docs via Brockman/Edge Foundation pipeline. Jerusalem "Sherover House" property negotiation, financial opportunity discussions. No criminal nexus. |
| Report | Description |
|---|---|
| FRENCH_CONNECTION_INVESTIGATION | Epstein's operations in France: Jean-Luc Brunel/MC2/Karin Models recruitment pipeline, 22 Avenue Foch (SCI JEP), Jack Lang & Prytanee LLC $5.2M art vehicle, UN diplomat Fabrice Aidan brokering Gulf royalty meetings, conductor Frederic Chaslin, Deutsche Bank wires through Societe Generale and BNP Paribas. Written as a guide for French investigators. |
| Report | Description |
|---|---|
| ALLRED_VICTIM_INTERVIEW | Complete 30-page FBI evidence package: victim met Epstein at 17, 4 assaults before 18, 2 rapes, harem ideology, Brunel companion. |
| VICTIM_CENSUS | Minimum 60-80 individually identified victims, likely 200+, USVI civil suit says "hundreds." |
| VICTIM_LEADS_VERIFICATION | Re-verification of Leads 7-12 including major correction on flight log modification claim. |
| TRAFFICKING_ROUTES_INVESTIGATION | Aircraft fleet, weekly cycling routes, MC2 recruitment ages 13-20, corrected: pilots did not retroactively add names to logs (explaining incomplete records), victim availability tracking, CBP bypass. |
| Report | Description |
|---|---|
| DEVICE_FORENSICS_COMPLETE | 70+ devices, 2005 computer forensic image never examined by federal authorities (PBCSO may have examined the original), DVR failure 12 days pre-death, 6 machines unexported Oct 2020. |
| PLIST_FORENSIC_SEARCH | 460+ Apple Mail PLIST metadata documents, 2 email accounts, 9-year date range (2009-2018). |
| PLIST_REDACTED_EMAILS_DEEP_DIVE | 12 failed redaction overlays exposing PLIST XML: Russian/Uzbek woman, neuroscience dinner, Groff calling State Dept for Mitchell. |
| PLIST_TIMESTAMP_TRANSACTION_CORRELATION | 420 timestamps vs financial dates: Tudor $13.5M strongest correlation. Note: The "99-day blackout" originally reported here was corrected — DS9 shows continuous email activity. |
| [EFTA00004800_DEEP_DIVE](evidence/EFTA00004800_DEEP_DIVE.md) | FBI "Book 17" evidence binder: 98 pages of CDs/DVDs, "grapes" files blacked out alongside CSAM, ~50+ unscanned media items. |
| BLACKOUT_PERIOD_INVESTIGATION | Investigation of the Nov 2018 - Feb 2019 period. Originally reported as 99-day email silence, corrected: DS9 shows continuous email activity throughout. The gap was a PLIST extraction artifact. Epstein flew 8+ flights, paid $100K/$250K during this period. |
| MAXWELL_FIREARMS_LICENSE_INVESTIGATION | Maxwell NYPD firearms license application investigation. |
| EVIDENCE_COMPILATION | Master evidence table: named individuals with documented victim interactions and legal status. |
| CORRUPTED_PDF_FORENSICS | NEW: Byte-level recovery of 5 "corrupted" DS9 PDFs. Apple Address Book with 8 contacts (Epstein attorney, known associates, Senegalese political figure, PI firm) + iPhone 5s photo from Little Saint James, Aug 2014. No prior public reporting. Recovered files |
| GABRIELLA_RICO_JIMENEZ_INVESTIGATION | NO connection found to Epstein. Jimenez incident Aug 2009 in Monterrey. ZERO hits across all document collections. |
| FOUR_CHAN_PARAMEDIC_INVESTIGATION | 4chan death leak: hard drives removed from SHU at 10:15 PM, guard DPAs then charges dismissed, FBI captured 8+ screenshots. |
| ONLINE_EVIDENCE_INVESTIGATION | r/maxwellhill screenshot in FBI case serial, social media led to Maxwell via Borgerson-Angara-Tidewood shells. |
| Report | Description |
|---|---|
| SCIENCE_NETWORK_COMPREHENSIVE_AUDIT | NEW: Full-corpus audit of 35+ scientists with Epstein connections. 10,000+ documents across uninvestigated individuals. FBI FinCEN investigation of Robert Trivers, 8 Nobel laureates, Biden's Science Advisor (Eric Lander), Global Viral/Metabiota founder (Nathan Wolfe), and mega-event guest lists. |
| BIOTECH_SCIENCE_NETWORK_INVESTIGATION | Original biotech/science report: Brockman/Edge pipeline, Nowak ($6.5M), Chomsky (trust management), Hillis (Zorro Ranch), Krauss, Church, Lloyd, Ito, Boyden, Marsh. |
| DAVID_SHAW_INVESTIGATION | D.E. Shaw & Co.: Limited exposure, proposed as dinner guest only. Science dinner network architecture mapped. |
Key document: EFTA00090314 — FBI Confidential Human Source report (FD-1023 format), filed under intelligence product case numbers ("INTELPRODS"). The CHS states that Epstein "belonged to both U.S. and allied intelligence services," "trained as a spy under" former Israeli PM Ehud Barak, and that "Mossad would then call Dershowitz to debrief" after Epstein-related calls. This is an unverified CHS report — not an FBI conclusion — but it is an official DOJ document preserved in the FBI case file. See ISRAEL_DEEP_DIVE_V2 Section B for full analysis.
| Report | Description |
|---|---|
| ISRAEL_DEEP_DIVE_V2 | Definitive Israel report: Barak 3,756 docs, Carbyne 50 docs, Reporty 324 docs, 301 E 66th nexus, Kohn letters. FBI CHS FD-1023 (EFTA00090314) states Epstein "belonged to both U.S. and allied intelligence services" and "trained as a spy under" Barak. Epstein-Barak exchange (Dec 2018): Epstein writes "you should make clear that i dont work for mossad :)" / "unfortunately, not" — Barak: "You or I?" |
| ISRAELI_INTELLIGENCE_DEEP_DIVE | Initial Israeli intelligence connections investigation across all document collections. Revised: FD-1023 intelligence claims now documented; original "zero hits for Mossad" corrected. |
| POWER_OVERLAP_SEALED_FILINGS_INVESTIGATION | Power overlap, sealed filings, and evidence suppression patterns. 100+ searches across 4 document collections. Section 9.3 corrected: intelligence connection documented via CHS FD-1023 but unverified at FBI conclusion level. |
| Report | Description |
|---|---|
| PROSECUTION_FAILURES_ANALYSIS | Comprehensive documentation of failed prosecutions: NPA architecture, blanket co-conspirator immunity expansion, Acosta deposition, CVRA violations, 15+ named individuals. |
| CBP_CORRUPTION_INVESTIGATION | CBP officer Timothy Routch (Badge #CAS03223, de-redacted via DS9) self-incriminated, 7+ years clearing Epstein's aircraft at St. Thomas. FBI proffer sessions Oct-Nov 2020. |
| CBP_RUEMMLER_REMAINING_LEADS | CBP officer expanded investigation, Ruemmler full 15-email trail, remaining unidentified leads. |
| MIDNIGHT_911_CALL_INVESTIGATION | CAD record anomalies at 358 El Brillo Way: Aug 21, 2002 midnight 911 call logged as "SICK PERSON/FIRST AID" — no narrative, no EMS, no disposition code. Five copies across four datasets all consistent in what they lack. |
| Report | Description |
|---|---|
| ART_INVESTIGATION_COMPLETE | Unified art investigation: $30.5M auction proceeds, Leon Black $2.7B collection, 54 named art world figures, 100+ EFTA citations. 80KB, 72 sections. |
| ART_INVESTIGATION_OCR_IMAGES | Sub-report: 53 searches across OCR text and image records for art-related evidence. |
| ART_INVESTIGATION_REDACTIONS | Sub-report: 165 queries across 3.4M redaction records for art-related content. |
| ART_INVESTIGATION_WEB_RESEARCH | Sub-report: 40+ web sources, 16 sections of open-source intelligence on Epstein art connections. |
| Report | Description |
|---|---|
| CORPUS_INVENTORY | Complete evidence chain: 1,380,937 PDFs, 2,731,785 pages, 194.5 GB across 12 datasets. Per-dataset accounting, derived databases, media inventory, processing pipeline, verification instructions. Start here to understand the source material. |
| MISSING_EFTA_ANALYSIS | Page-based gap detection across all 12 datasets. Exploits the EFTA numbering system (each page = one EFTA number) to account for every document. 31 gaps recovered from DOJ server or forensic carving; 4 confirmed as pages within multi-page PDFs via concordance files; 1 recovered from Wayback Machine after DOJ deleted it Dec 23, 2025. All 2,731,783 EFTA page-numbers accounted for — 100% complete. |
| DATA_QUALITY_AUDIT | Audit of "bad_overlay" redaction records -- confirmed ~98% are OCR noise from degraded scans. |
| EVIDENCE_RELIABILITY_AUDIT | Impact assessment: how "bad_overlay" OCR noise affects investigation report reliability. |
| REDACTION_ASYMMETRY_ANALYSIS | 179,139 redactions analyzed: victim names properly redacted, powerful associates frequently recoverable under bad overlays. |
| DEFECTIVE_REDACTION_FINDINGS | Defective redactions in 2022 Virgin Islands civil case Exhibit 1 — text visible beneath incompetent overlay. |
| REDACTION_TEXT_LAYER_ANALYSIS | Definitive proof: Dec 19 PDFs use invisible OCR text layer (Tr=3). Black boxes are baked JPEG pixels, not PDF overlays. |
| DEC2025_REDACTION_COMPARISON | Comparison of December 19, 2025 DOJ release redactions between original and re-released versions. |
| HIDDEN_TEXT_COMPLETE_REVIEW | Complete review of text extracted from document text layers across the entire EFTA corpus. |
| LEAD_VERIFICATION_REPORT | Deep forensic review via direct PDF reading, cross-referenced with document text records. |
| LEAD_VERIFICATION_PART1 | Leads 1-6 verified. Key corrections: CBP proffer was cooperating witness, not the officer; Austrian passport agent never called back. |
| LEAD_VERIFICATION_PART2 | Leads 7-12 verified. Major correction: flight log modification claim was INVERTED in prior notes. |
| FACTUAL_ACCURACY_AUDIT | 225-issue factual accuracy audit across ~50 report files in 43 phases. Key corrections: Barrack acquittal (was listed as convicted), 99-day blackout disproved (DS9 shows continuous email), flight log modification claim inverted, legal conclusion language corrected throughout. |
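
The page-based gap detection summarized for MISSING_EFTA_ANALYSIS can be illustrated with a small sketch. It assumes, as that row states, that each page consumes one EFTA number, and additionally assumes that a multi-page PDF is identified by the number of its first page. The function name `find_gaps` and the `(start, page_count)` input shape are hypothetical and are not taken from the repository's tools.

```python
from typing import List, Tuple

def find_gaps(docs: List[Tuple[int, int]], first: int, last: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of EFTA numbers in [first, last]
    that are not covered by any document.

    docs holds (starting EFTA number, page count) pairs. A document with
    N pages is assumed to occupy N consecutive EFTA numbers.
    """
    gaps: List[Tuple[int, int]] = []
    expected = first  # lowest number not yet accounted for
    for start, pages in sorted(docs):
        end = start + pages - 1
        if start > expected:
            # Numbers from expected up to start - 1 have no covering document.
            gaps.append((expected, min(start - 1, last)))
        expected = max(expected, end + 1)
        if expected > last:
            break
    if expected <= last:
        # The tail of the range is not covered by any document.
        gaps.append((expected, last))
    return gaps
```

Under these assumptions, a number that looks missing from a file listing may simply be an interior page of a multi-page PDF, which is why page counts are needed before a gap can be called real.
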
| Report | Description |
|---|---|
| CONSPIRACY_THEORY_SEARCH_MISC | Exhaustive search for miscellaneous internet theories across 218GB, 519,438 PDFs. |
| CONSPIRACY_THEORY_SEARCH_OCCULT | NEGATIVE: Zero evidence of satanic rituals, adrenochrome, or blood drinking across 3.5M+ records. |
| CONSPIRACY_THEORY_SEARCH_PIZZAGATE | NEGATIVE: Zero evidence supporting Pizzagate or related theories across 519,438 PDFs. |
Full-corpus search of all 537 current members of Congress (119th Congress), 77 executive branch officials, and 503 Article III federal judges (sourced from the Federal Judicial Center's biographical directory). Every name was searched as an exact quoted phrase across all 1,380,937 documents. Officials with significant hit counts received deeper context analysis to distinguish genuine connections from news mentions.
| Report | Description |
|---|---|
| DEMOCRAT_HOUSE | 216 Democratic House members: 2 DIRECT connections (Plaskett, DeGette), 1 INVESTIGATION, 2 MIXED, 1 FALSE POSITIVE. |
| DEMOCRAT_SENATE | 45 Democratic Senators: 2 MIXED (Schumer donation/return, Warner media project list). |
| REPUBLICAN_HOUSE | 221 Republican House members: 1 MIXED (Loudermilk — different person), 2 FALSE POSITIVES (John James, Scott Fitzgerald). |
| REPUBLICAN_SENATE | 53 Republican Senators: 2 MIXED (McConnell declined donation, Rick Scott routine govt doc), 1 FALSE POSITIVE (Jim Justice). |
| INDEPENDENT_SENATE | 2 Independent Senators (Sanders, King): news coverage only. |
| EXECUTIVE_BRANCH | 77 officials: 8 DIRECT connections (Musk, Bannon, Lutnick, Burns, Trump, Kushner, Rice, Monaco). Key finding: William Burns (future CIA Director) had direct email exchanges with Epstein in 2014 (EFTA00869068, EFTA01748726, EFTA01001666); Epstein brokered a Burns-Thiel introduction (EFTA02370150) and mentioned "bill gates in on the 5th" in a message to Burns (EFTA01001666). |
| JUDICIAL_BRANCH | 503 federal judges (all SCOTUS + 301 circuit + 191 key district): No SCOTUS justice has direct Epstein connection. Elena Kagan (58 docs) linked via Harvard poetry project Epstein funded. Stephanie Thacker (now 4th Circuit) was former CEOS deputy who formally criticized DOJ's Epstein case handling. |
| Report | Description |
|---|---|
| DS10_COMPLETE_FINDINGS | Dataset 10 complete scan: 503,154 PDFs, 1,629,776 redaction rows. FBI "Prominent Names" briefing recovered here. |
| DS10_COMPREHENSIVE_NAME_SEARCH | Comprehensive name search across the DS10 document text records. |
| DS10_ENTITY_EXTRACTION_REPORT | 107,422 entities extracted from 529,061 text records: 49K names, 36K dates, 8K addresses. |
| DS10_FORENSIC_ANALYSIS | Deep forensic analysis of the full 1,808,915-row document text corpus. |
| DS10_INTERIM_FINDINGS | Interim findings from Dataset 10 scan (~15% complete at time of writing). |
| DS10_KEY_DOCUMENTS_DEEP_DIVE | Document-by-document analysis of recovered hidden text from key DS10 documents. |
| DS10_RECONSTRUCTED_PAGES | 39,588 reconstructed pages from spatially-ordered redaction fragments; 4M+ characters recovered. |
| DS8_CONTENT_SURVEY | Dataset 8 comprehensive content survey: 10,594 PDFs across 11 subdirectories. |
| DS8_MEDIA_CATALOG | Dataset 8 media catalog: 11,034 files, 419 surveillance MP4s (412.5 hours), 438 native files. |
| DS8_NEW_LEADS | New leads and investigative threads extracted from Dataset 8. |
| DS8_VERIFICATION | Source document cross-referencing: verifying redaction analysis claims against original PDFs. |
Structured analysis of 257 grand jury subpoenas, decomposed into 2,018 individual demand clauses, matched against production records, and scored for fulfillment. Built from prosecutorial_query_graph.db. See the index for methodology and summary statistics.
| Dossier | Description |
|---|---|
| 00_INDEX | Master index: 257 subpoenas, 2,018 demand clauses, 779 investigative gaps. 48.2% of subpoenas have no identifiable return in the corpus. |
| 01_TEMPORAL_BLACKOUT | The 524-day subpoena gap: why did the grand jury stop issuing subpoenas for 17 months (July 2017 - December 2018)? |
| 02_REDACTED_TARGETS | The 27 fully-redacted subpoena targets: who are the entities behind the blacked-out names? |
| 03_TECH_COMPANY_GAPS | 21 subpoenas to technology companies; only 5 matched to identifiable returns. |
| 04_TRAVEL_RECORDS_GAP | Travel records production gaps: airlines, FBOs, and charter companies. |
| 05_DEUTSCHE_BANK_COMPLIANCE | Deutsche Bank production analysis: what was demanded vs. what was produced. |
| 06_FINANCIAL_NO_RETURNS | Financial institutions subpoenaed with no identifiable returns in the corpus. |
| 07_INDIVIDUAL_SUBPOENAS | Individuals under subpoena: named persons served with grand jury demands. |
| 08_CRYPTO_DEAD_END | The cryptocurrency gap: subpoenas to crypto exchanges with no matched returns. |
| 09_CORRECTIONAL_DEATH_INVESTIGATION | Correctional records gaps and the death investigation: BOP, MCC, and medical examiner subpoenas. |
| 10_SCOPE_EVOLUTION | Prosecutorial scope evolution: how the investigation's focus shifted over time. |
Use this table to determine which DOJ dataset contains a given EFTA number, or to construct the DOJ URL manually.
| Dataset | EFTA Range Start | EFTA Range End | URL Pattern |
|---|---|---|---|
| 1 | EFTA00000001 | EFTA00003158 | DataSet%201/EFTA{########}.pdf |
| 2 | EFTA00003159 | EFTA00003857 | DataSet%202/EFTA{########}.pdf |
| 3 | EFTA00003858 | EFTA00005586 | DataSet%203/EFTA{########}.pdf |
| 4 | EFTA00005705 | EFTA00008320 | DataSet%204/EFTA{########}.pdf |
| 5 | EFTA00008409 | EFTA00008528 | DataSet%205/EFTA{########}.pdf |
| 6 | EFTA00008529 | EFTA00008998 | DataSet%206/EFTA{########}.pdf |
| 7 | EFTA00009016 | EFTA00009664 | DataSet%207/EFTA{########}.pdf |
| 8 | EFTA00009676 | EFTA00039023 | DataSet%208/EFTA{########}.pdf |
| 9 | EFTA00039025 | EFTA01262781 | DataSet%209/EFTA{########}.pdf |
| 10 | EFTA01262782 | EFTA02205654 | DataSet%2010/EFTA{########}.pdf |
| 11 | EFTA02205655 | EFTA02730264 | DataSet%2011/EFTA{########}.pdf |
| 12 | EFTA02730265 | EFTA02731783 | DataSet%2012/EFTA{########}.pdf |
Base URL: https://www.justice.gov/epstein/files/
Note: There are small gaps between some datasets (e.g., Dataset 4 ends at 8320, Dataset 5 starts at 8409). EFTA numbers falling in a gap are mapped to the preceding (lower-numbered) dataset.
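
To construct a DOJ URL manually, the mapping table and gap rule above can be expressed as a short lookup. This is a minimal sketch: the range values are copied from the table, while the names `DATASET_RANGES`, `dataset_for`, and `efta_url` are illustrative and not part of the repository's tooling. It does not check whether the resulting URL actually resolves on justice.gov.

```python
# (dataset, first EFTA number, last EFTA number), copied from the table above.
DATASET_RANGES = [
    (1, 1, 3158),
    (2, 3159, 3857),
    (3, 3858, 5586),
    (4, 5705, 8320),
    (5, 8409, 8528),
    (6, 8529, 8998),
    (7, 9016, 9664),
    (8, 9676, 39023),
    (9, 39025, 1262781),
    (10, 1262782, 2205654),
    (11, 2205655, 2730264),
    (12, 2730265, 2731783),
]
BASE_URL = "https://www.justice.gov/epstein/files/"

def dataset_for(efta: int) -> int:
    """Return the dataset for an EFTA number.

    Numbers that fall in a gap between datasets map to the lower dataset,
    i.e. the dataset with the largest range start that is <= efta.
    """
    if efta < DATASET_RANGES[0][1] or efta > DATASET_RANGES[-1][2]:
        raise ValueError(f"EFTA number out of range: {efta}")
    chosen = DATASET_RANGES[0][0]
    for dataset, start, _end in DATASET_RANGES:
        if efta >= start:
            chosen = dataset
    return chosen

def efta_url(efta: int) -> str:
    """Build the DOJ URL for an EFTA number, zero-padded to 8 digits."""
    return f"{BASE_URL}DataSet%20{dataset_for(efta)}/EFTA{efta:08d}.pdf"
```

For example, `efta_url(4800)` yields a URL under `DataSet%203`, and `dataset_for(8350)` returns 4 because 8350 sits in the gap between Datasets 4 and 5.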
All analysis was performed locally against databases derived from the raw PDF corpus. No documents were uploaded to cloud services or third-party APIs.
| Database | Size | Records | Contents |
|---|---|---|---|
| full_text_corpus.db | 6.08 GB | 1,380,937 docs / 2,731,785 pages | Full text of every page of every document (PyMuPDF extraction + invisible OCR text layers) |
| redaction_analysis_v2.db | 0.95 GB | 2,587,102 redactions / 638,416 docs | Spatial redaction analysis with text at each redaction's coordinates |
| transcripts.db | 2.5 MB | 1,530 entries (375 with speech) | GPU-transcribed audio/video (faster-whisper large-v3 on A100) |
| persons_registry.json | — | 1,536 persons | Unified entity registry from 3 sources |
| knowledge_graph.db | — | 524 entities / 2,096 connections | Cross-referenced entity relationships |
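
As an illustration of how an exact-phrase search over an FTS5 index works, the sketch below builds a tiny in-memory index and queries it with a quoted phrase. The table name `pages` and the columns `efta` and `page_text` are assumptions made for this example; the actual schema of full_text_corpus.db may differ, so check it before adapting the query. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3
from typing import Iterable, List, Tuple

def build_demo_index(pages: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table from (efta, page_text) rows.

    The schema here is hypothetical. efta is stored but not indexed,
    so MATCH queries only search page_text.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(efta UNINDEXED, page_text)")
    conn.executemany("INSERT INTO pages (efta, page_text) VALUES (?, ?)", pages)
    return conn

def phrase_search(conn: sqlite3.Connection, phrase: str) -> List[str]:
    """Return EFTA ids whose text contains the phrase as consecutive tokens."""
    # Double quotes make this an FTS5 phrase query; embedded quotes are doubled.
    quoted = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT efta FROM pages WHERE pages MATCH ? ORDER BY efta", (quoted,)
    )
    return [row[0] for row in rows]
```

With the default tokenizer, matching is case-insensitive and requires the words to appear adjacent and in order, so a page containing both words separately does not count as a hit.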
NATIVE_FILES_CATALOG.csv — Complete inventory of all 3,226 non-PDF native files (video, audio, spreadsheets, images) across the DOJ release. Each row includes: EFTA number, dataset, file extension, size, DOJ PDF link, transcription status, word count, duration, and description. Open in Excel or any spreadsheet application. Every non-PDF file has a corresponding PDF placeholder on DOJ; native files include 419 MCC surveillance videos (412+ hours), grand jury audio, prison phone calls, FBI interview recordings, and financial spreadsheets.
For the complete evidence chain -- what data exists, where it came from, and how to verify any finding -- see CORPUS_INVENTORY. For methodology details, data quality assessment, and reliability audits, see the Methodology & Data Quality section above.
Processed data collection: https://github.com/rhowardstone/Epstein-research-data
See COMMUNITY_PLATFORMS.md for a comprehensive directory of 78+ platforms, tools, and resources for searching and analyzing the Epstein files. Includes government sources, search platforms, network visualization tools, AI/RAG tools, datasets, and community hubs.
All processing scripts (36+ Python tools) used to build the databases live in the data repo: Epstein-research-data/tools/
These reports constitute independent forensic analysis of publicly released government documents. They are not legal advice, not government publications, and not affiliated with any law enforcement agency. All findings are derived from documents released by the U.S. Department of Justice and are cited to specific EFTA document numbers that can be independently verified.
Where the evidence is ambiguous or inconclusive, that is stated explicitly. Where claims from prior reporting were found to be incorrect upon verification, corrections are documented (see Lead Verification reports). Negative findings (searches that returned zero results) are reported with equal rigor to positive findings.
This repository does not contain any original source documents, victim-identifying information, or classified material. It contains only analysis and citations.