Explore your exported message data like a personal time machine.
Inbox Archeology transforms your Google Takeout .mbox into a local, visual archive so you can explore relationship patterns over time.
Analyze your Gmail history and uncover patterns in your communication.
- Parses Gmail Takeout mailbox exports
- Builds relationship-level tables through a multi-step pipeline
- Renders a Streamlit dashboard with:
  - relationship graph
  - timeline views
  - reciprocity analysis
  - lifecycle metrics
  - CORE relationship density by year
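To give a feel for the parsing step, here is a minimal sketch that reads an .mbox with Python's standard `mailbox` module and pulls out sender, recipients, and date. It is not the project's actual parser: the function name `extract_metadata` and the returned field names are assumptions made for illustration.

```python
import mailbox
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(mbox_path):
    """Yield one dict per message (hypothetical field names)."""
    # create=False: fail loudly instead of creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # getaddresses handles "Name <addr>" forms and comma-separated lists.
            senders = getaddresses([str(v) for v in message.get_all("From", [])])
            recipient_headers = message.get_all("To", []) + message.get_all("Cc", [])
            recipients = getaddresses([str(v) for v in recipient_headers])
            raw_date = message.get("Date")
            try:
                sent_at = parsedate_to_datetime(str(raw_date)) if raw_date else None
            except (TypeError, ValueError):
                sent_at = None  # tolerate malformed Date headers
            yield {
                "from": senders[0][1].lower() if senders else "",
                "to": [addr.lower() for _, addr in recipients if addr],
                "date": sent_at,
            }
    finally:
        box.close()
```

Because the function is a generator, a large export can be processed one message at a time rather than loaded into memory at once.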
Everything runs on your machine.
- Python 3.10+
- A Gmail Takeout .mbox file (for real analysis)
Install dependencies:
python -m venv .venv
Activate:
- macOS / Linux: source .venv/bin/activate
- Windows (cmd): .venv\Scripts\activate.bat
- Windows (PowerShell): .venv\Scripts\Activate.ps1
Then:
pip install -r requirements.txt
streamlit run app.py
The app creates/uses local folders in this repo:
- input/ for source .mbox files
- workspaces/ for per-run outputs
Recommended path (for real exports):
- Download your Gmail Takeout export
- Locate the mailbox file (often All Mail.mbox)
- Copy the .mbox into input/
- Click Refresh list in the app
Optional path:
- Upload a small test .mbox directly in the app
- Intended for small files only (about 200 MB browser upload limit)
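If you are unsure which path suits a given file, a quick check like the one below can help. It is only a sketch: `list_mbox_files` is a hypothetical helper, not part of the app, and the 200 MB threshold is taken from the approximate limit mentioned above.

```python
from pathlib import Path

# Approximate browser upload limit mentioned above; the exact value is an assumption.
UPLOAD_LIMIT_BYTES = 200 * 1000 * 1000


def list_mbox_files(input_dir="input"):
    """Return the .mbox files in input_dir with their sizes (hypothetical helper)."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []
    found = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() == ".mbox":
            size = path.stat().st_size
            found.append(
                {
                    "name": path.name,
                    "size_bytes": size,
                    # Small enough for the in-app upload; otherwise copy into input/.
                    "fits_browser_upload": size <= UPLOAD_LIMIT_BYTES,
                }
            )
    return found
```

Files flagged as too large for upload are the ones to copy into input/ as described in the recommended path.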
- Pick the detected .mbox
- Set a run name (used for workspaces/<run-name>/)
- Reopen previous completed runs from Open Existing Workspace
- Click Run Inbox Archeology
- Progress is shown live as pipeline steps complete
- The dashboard opens automatically (toggle in sidebar), or you can open it manually after the run
For each run:
- Workspace: workspaces/<run-name>/
- Output files: workspaces/<run-name>/output/
Typical generated artifacts:
- inbox_metadata.csv
- relationships_raw.csv
- relationships_filtered.csv
- relationships_clean.csv
- core_timeline.csv
- core_timeline.png
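To confirm that a run produced the files listed above, something like the following sketch can be used. The helper name `missing_artifacts` is hypothetical, and since the list describes typical artifacts, a run that lacks one of them is not necessarily broken.

```python
from pathlib import Path

# File names taken from the list of typical artifacts above.
EXPECTED_ARTIFACTS = [
    "inbox_metadata.csv",
    "relationships_raw.csv",
    "relationships_filtered.csv",
    "relationships_clean.csv",
    "core_timeline.csv",
    "core_timeline.png",
]


def missing_artifacts(run_name, workspaces_dir="workspaces"):
    """Return the typical artifacts not found in a run's output folder."""
    output_dir = Path(workspaces_dir) / run_name / "output"
    return [name for name in EXPECTED_ARTIFACTS if not (output_dir / name).is_file()]
```

An empty list means every typical artifact is present in workspaces/<run-name>/output/.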
Create a .env file in the project root to customize relationship extraction.
SELF_EMAILS=you@gmail.com,alias@gmail.com,you@company.com
This is important for accurate sent/received attribution.
AUTOMATED_DOMAINS=facebookmail.com,google.com,linkedin.com,substack.com
AUTOMATED_PREFIXES=no-reply@,noreply@,notifications@,donotreply@
- AUTOMATED_DOMAINS: filters addresses ending in @domain
- AUTOMATED_PREFIXES: filters addresses starting with those prefixes
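The sketch below illustrates how these three settings could be applied to a single address. It follows the rules as described here (suffix match on @domain, prefix match on the address) but is not the project's actual code: `parse_list`, `is_automated`, and `direction` are hypothetical names, and how the app loads .env is not shown.

```python
def parse_list(value):
    """Split a comma-separated .env value into lowercase, trimmed items."""
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def is_automated(address, domains, prefixes):
    """True if the address matches an automated domain or prefix."""
    address = address.strip().lower()
    if any(address.startswith(prefix) for prefix in prefixes):
        return True
    # "ending in @domain": an exact domain match, so subdomains do not match.
    return any(address.endswith("@" + domain) for domain in domains)


def direction(sender, self_emails):
    """Label a message 'sent' if the sender is one of your own addresses."""
    return "sent" if sender.strip().lower() in self_emails else "received"


domains = parse_list("facebookmail.com,google.com,linkedin.com,substack.com")
prefixes = parse_list("no-reply@,noreply@,notifications@,donotreply@")
self_emails = parse_list("you@gmail.com,alias@gmail.com,you@company.com")
```

With these values, `is_automated("calendar@google.com", domains, prefixes)` is true, while a message from `alias@gmail.com` is labeled as sent. Leaving an alias out of SELF_EMAILS would cause your own messages from that address to be counted as received.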
After changing .env, rerun analysis for affected workspaces.
Inbox Archeology is local-first:
- no cloud processing in this app flow
- no external API requirement for analysis
- no mailbox upload to third-party services by default
Your archive stays on your machine unless you move it.
- app.py - Streamlit UI and run workflow
- pipeline.py - orchestrates step scripts
- steps/ - modular processing steps
- dashboard.py - dashboard rendering
- input/ - source mailbox files
- workspaces/ - saved run outputs
See CONTRIBUTING.md.
MIT
