Operational automation for achieving inbox zero across Gmail and Office 365 mailboxes, then extracting high-value email content into an Obsidian-based second brain.
Two standalone CLI toolchains — one for Gmail, one for O365 — that share identical output schemas. Each uses AI-powered sender classification, batch human review via Google Sheets, and safe resumable execution to clean up even very large mailboxes (200k+ messages).
Designed for agent-assisted workflows. These tools work best when run collaboratively with a terminal-based AI agent (Claude Code, Codex, etc.) that guides you through each phase, explains classifications, and helps you make decisions.
Gmail                       O365
  │                           │
  ▼                           ▼
pull ────────────────────── pull
  │                           │
enrich ──────────────────── enrich
  │                           │
review (Google Sheets) ──── review (Google Sheets)
  │                           │
execute ─────────────────── execute
  │                           │
  ▼                           ▼
sender-state.v1.json ────── sender-state.v1.json
decision-log.json ───────── decision-log.json
contacts.json               batch-*.json
  │                           │
  └─────────────┬─────────────┘
                │
         identity graph
                │
         Obsidian vault
         (People notes,
          knowledge notes)
For each mailbox, the pipeline:
- Pulls all message metadata locally (resumable, checkpointed)
- Classifies every sender as human / company / newsletter / automated
- Presents senders in Google Sheets for human review (keep / filter / unsubscribe)
- Executes decisions: archives noise, creates filters/rules, preserves signal
- Produces structured JSON that downstream tools (Obsidian agents, identity graphs) can consume
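
As a rough illustration of the classification step, the sketch below sorts a sender into one of the four categories using simple header heuristics. The `SenderSummary` shape, its field names, and the rules themselves are assumptions made for this example; the pipelines use their own (optionally LLM-backed) classifier, and the real schema is described in each pipeline's DATA-HANDOVER.template.md.

```typescript
// Hypothetical types and rules for illustration only; not the pipelines' real schema.
type SenderCategory = "human" | "company" | "newsletter" | "automated";

interface SenderSummary {
  email: string;               // sender address
  hasListUnsubscribe: boolean; // any message carried a List-Unsubscribe header
  repliedTo: boolean;          // the mailbox owner has replied at least once
}

// Local parts that usually indicate machine-generated mail (assumed list).
const AUTOMATED_LOCAL_PARTS =
  /^(no-?reply|do-?not-?reply|notifications?|alerts?|mailer-daemon)$/i;

function classifySender(sender: SenderSummary): SenderCategory {
  const localPart = sender.email.split("@")[0] ?? "";
  if (AUTOMATED_LOCAL_PARTS.test(localPart)) return "automated";
  if (sender.hasListUnsubscribe) return "newsletter";
  if (sender.repliedTo) return "human";
  // Fallback bucket for senders that match none of the rules above.
  return "company";
}
```

Rules are checked from most to least specific, so a no-reply address that also carries an unsubscribe header is treated as automated rather than as a newsletter.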
# Gmail pipeline
cd inbox-zero
npm install
cp .env.example .env # configure credentials (see README.md inside)
npm run cli -- pull --dry-run
# O365 pipeline
cd inbox-zero-o365
npm install
cp .env.example .env # configure credentials (see README.md inside)
npm run cli -- smoke-test

Each pipeline has its own detailed README with step-by-step auth setup:
- Gmail: inbox-zero/README.md covers the GCP service account, domain-wide delegation, and the Gmail API
- O365: inbox-zero-o365/README.md covers Azure Entra ID app registration and the Microsoft Graph API
| Credential | Used By | How to Get |
|---|---|---|
| GCP service account | Gmail pipeline + Google Sheets | GCP Console → Service Accounts → Create key |
| Domain-wide delegation | Gmail pipeline (Workspace only) | Admin Console → API controls → Domain-wide delegation |
| Azure Entra ID app | O365 pipeline | Azure Portal → Entra ID → App registrations |
| Anthropic API key | LLM classification (optional) | console.anthropic.com → API Keys |
See each pipeline's README for detailed setup instructions.
These tools are built to be used with an AI coding agent in the terminal. The recommended workflow:
- Point the agent at the docs. Tell it to read the pipeline's CLAUDE.md (agent instructions) and DATA-HANDOVER.template.md (data schema reference).
- Run the pipeline together. The agent monitors progress, handles errors, and explains what each step does. The pull phase can take time on large mailboxes, so the agent tracks checkpoint progress.
- Review collaboratively. The agent can explain sender classifications, suggest which senders to keep vs. filter, and help you build review batches. Decisions happen in Google Sheets where you have full visibility.
- Execute safely. The agent runs --dry-run first, shows you what will happen, then executes only after your approval. Every action is logged to the decision log.
- Extract to Obsidian. After cleanup, the agent uses sender-state and contacts data to generate People notes and knowledge notes for your Obsidian vault.
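
The "execute safely" step can be pictured as a planner that turns review decisions into actions, plus a dry-run switch that returns the plan without applying it. This is a minimal sketch, not the pipelines' code: the decision names come from the review step (keep / filter / unsubscribe), while `PlannedAction`, `planActions`, `execute`, and the assumption that every non-keep decision also archives existing mail are hypothetical.

```typescript
// Illustrative only: action shapes and function names are assumptions.
type Decision = "keep" | "filter" | "unsubscribe";

interface ReviewedSender {
  email: string;
  decision: Decision;
}

interface PlannedAction {
  email: string;
  kind: "archive" | "create-filter" | "unsubscribe";
}

function planActions(reviewed: ReviewedSender[]): PlannedAction[] {
  const actions: PlannedAction[] = [];
  for (const { email, decision } of reviewed) {
    if (decision === "keep") continue; // signal is left untouched
    // Assumed here: both non-keep decisions archive existing mail first.
    actions.push({ email, kind: "archive" });
    actions.push({ email, kind: decision === "filter" ? "create-filter" : "unsubscribe" });
  }
  return actions;
}

function execute(
  reviewed: ReviewedSender[],
  apply: (action: PlannedAction) => void,
  options: { dryRun: boolean },
): { planned: PlannedAction[]; applied: number } {
  const planned = planActions(reviewed);
  if (options.dryRun) return { planned, applied: 0 }; // report only, change nothing
  for (const action of planned) apply(action);
  return { planned, applied: planned.length };
}
```

Because the plan is computed the same way in both modes, what the dry run shows is exactly what a real run would apply.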
| File | Purpose |
|---|---|
| CLAUDE.md | Agent instructions: CLI commands, data schemas, patterns |
| DATA-HANDOVER.template.md | Schema reference: what data exists, how to query it |
| data/sender-state.v1.json | Canonical sender database |
| data/decision-log.json | All review decisions |
| data/contacts.json | Google Contacts export (Gmail only) |
The pipeline output is designed to seed an Obsidian vault with structured knowledge from your email. This is a separate initiative — the inbox-zero tools produce the clean data, and a downstream process converts it to Obsidian notes.
- People notes: generated from contacts + sender-state + relationship context
- Knowledge notes: extracted from high-value email threads (flagged as extractionCandidate during enrichment)
- Meeting index: cross-referenced with meeting transcripts already in your vault
- Identity graph: links people across email, phone, WhatsApp, meetings
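
One way to think about the identity graph is as a union-find structure: each identifier (an email address, a phone number) starts as its own person, and every confirmed link merges two groups. The sketch below is an assumption about how such a graph could be built, not a description of the downstream tooling; the class name and the `channel:value` identifier format are hypothetical.

```typescript
// Minimal union-find over identifier strings such as "email:jane@example.com"
// or "phone:+447912345678". The identifier format is an assumption for this sketch.
class IdentityGraph {
  private parent = new Map<string, string>();

  private find(id: string): string {
    if (!this.parent.has(id)) this.parent.set(id, id);
    let root = id;
    while (this.parent.get(root) !== root) root = this.parent.get(root)!;
    // Path compression: point every visited node straight at the root.
    let current = id;
    while (current !== root) {
      const next = this.parent.get(current)!;
      this.parent.set(current, root);
      current = next;
    }
    return root;
  }

  /** Record that two identifiers belong to the same person. */
  link(a: string, b: string): void {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA !== rootB) this.parent.set(rootB, rootA);
  }

  /** True when both identifiers resolve to the same person. */
  samePerson(a: string, b: string): boolean {
    return this.find(a) === this.find(b);
  }
}

const graph = new IdentityGraph();
graph.link("email:jane@example.com", "email:jane@company.com");
graph.link("email:jane@company.com", "phone:+447912345678");
const linked = graph.samePerson("email:jane@example.com", "phone:+447912345678"); // true
```

Links are transitive, so the personal email and the phone number end up in the same group even though they were never linked directly.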
vault/
  00-Inbox/      # Raw imports land here for triage
  01-Notes/      # Processed notes (flat, found by metadata not folders)
  02-Templates/  # Note templates (People, Knowledge, Meeting, etc.)
  03-Assets/     # Attachments, images
  04-Archive/    # Completed/retired notes
Notes use YAML frontmatter for queryable metadata:
---
type: person
name: Jane Doe
emails: [jane@example.com, jane@company.com]
phones: ["+447912345678"]
organization: Acme Corp
relationship: client
source: inbox-zero
last_email: 2026-03-10
email_count: 150
---

Turning pipeline output into People notes follows these steps:

- Match contacts.json emails against sender-state senders
- Enrich with relationship notes captured during review
- Generate People notes with frontmatter linking to business entities
- Bridge to messaging (phone numbers → WhatsApp identity)
- Cross-reference with meeting transcripts for interaction history
The pipeline's DATA-HANDOVER.template.md has detailed instructions for each step.
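
The first and third steps above (matching contacts to senders, then generating frontmatter) might look roughly like this. The `Contact` and `SenderRecord` shapes are hypothetical stand-ins for the real schemas in DATA-HANDOVER.template.md; the frontmatter keys mirror the example note shown earlier.

```typescript
// Hypothetical record shapes; the real schemas live in DATA-HANDOVER.template.md.
interface Contact {
  name: string;
  emails: string[];
  organization?: string;
}

interface SenderRecord {
  email: string;
  messageCount: number;
  lastMessageDate: string; // ISO date, e.g. "2026-03-10"
}

const normalize = (email: string): string => email.trim().toLowerCase();

/** Build People-note frontmatter for a contact, or null if no sender matches. */
function personFrontmatter(contact: Contact, senders: SenderRecord[]): string | null {
  const wanted = new Set(contact.emails.map(normalize));
  const matches = senders.filter((s) => wanted.has(normalize(s.email)));
  if (matches.length === 0) return null;

  const emailCount = matches.reduce((sum, s) => sum + s.messageCount, 0);
  // ISO dates sort lexicographically, so the last element is the latest date.
  const dates = matches.map((s) => s.lastMessageDate).sort();
  const lastEmail = dates[dates.length - 1];

  // Note: values are not YAML-escaped in this sketch.
  const lines = [
    "---",
    "type: person",
    `name: ${contact.name}`,
    `emails: [${contact.emails.map(normalize).join(", ")}]`,
    ...(contact.organization ? [`organization: ${contact.organization}`] : []),
    "source: inbox-zero",
    `last_email: ${lastEmail}`,
    `email_count: ${emailCount}`,
    "---",
  ];
  return lines.join("\n");
}
```

Addresses are lowercased and trimmed before comparison so that the same mailbox written with different casing in contacts.json and sender-state still matches.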
inbox-zero/        # Gmail pipeline: TypeScript CLI
inbox-zero-o365/   # O365 pipeline: standalone sibling
docs/
  specs/           # Design documents (what and why)
  plans/           # Implementation plans (how and when)
  research/        # Research findings
  decisions/       # Decision log (why X over Y)
HANDOVER.md        # Latest execution context (gitignored for public)
- Safety over cleverness. Every destructive action requires a manifest on disk and supports --dry-run. Atomic writes prevent data corruption.
- Resumable everything. Long-running operations checkpoint progress and continue from where they left off.
- Human in the loop. No bulk actions without a Google Sheets review step. The pipeline presents, you decide.
- Durable state. Canonical data lives in local JSON files, not in the cloud. Google Sheets are projections, not sources of truth.
- Agent-friendly. CLAUDE.md files, structured schemas, and deterministic CLI commands make it easy for AI agents to reason about and operate the pipeline.
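
To make the "atomic writes" and "resumable everything" principles concrete, here is a minimal sketch of a checkpoint that is written via a temp file and rename, then reloaded on the next run. It uses only Node's built-in `node:fs`, `node:os`, and `node:path` modules; the `Checkpoint` shape, file name, and function names are assumptions for illustration and may not match the pipelines' actual checkpoint format.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical checkpoint shape; the pipelines' real format may differ.
interface Checkpoint {
  nextPageToken: string | null;
  messagesPulled: number;
}

/** Write JSON to a temp file next to the target, then rename it into place. */
function writeJsonAtomic(path: string, value: unknown): void {
  const tempPath = `${path}.${process.pid}.tmp`;
  writeFileSync(tempPath, JSON.stringify(value, null, 2));
  // On POSIX systems, rename within one filesystem replaces the target atomically,
  // so readers see either the old file or the new one, never a partial write.
  renameSync(tempPath, path);
}

/** Load the last checkpoint, or start from scratch if none exists. */
function loadCheckpoint(path: string): Checkpoint {
  if (!existsSync(path)) return { nextPageToken: null, messagesPulled: 0 };
  return JSON.parse(readFileSync(path, "utf8")) as Checkpoint;
}

// Usage: a first run starts empty, saves progress, and a later run resumes from it.
const demoDir = mkdtempSync(join(tmpdir(), "checkpoint-demo-"));
const checkpointPath = join(demoDir, "pull-checkpoint.json");
const fresh = loadCheckpoint(checkpointPath);
writeJsonAtomic(checkpointPath, {
  nextPageToken: "page-2",
  messagesPulled: fresh.messagesPulled + 500,
});
const resumed = loadCheckpoint(checkpointPath);
```

Keeping the temp file in the same directory as the target matters: a rename across filesystems falls back to copy semantics and loses the atomicity guarantee.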
Both pipelines have comprehensive test suites:
(cd inbox-zero && npm test)        # 785+ tests
(cd inbox-zero-o365 && npm test)

TypeScript strict mode, Zod schema validation, Biome linting.
License: ISC