rania-run/Work-Summarizer

Work Summarizer

A local, terminal-based CLI that pulls your activity from GitHub, Linear, Notion, and Slack, then generates a concise AI-powered work summary for any timeframe — today, yesterday, this week, or this month.

No SaaS. No cloud sync. Everything runs on your machine.


Architecture

fetch_*.py  ──────────────────────────────────────────────┐  parallel
  fetch_github.py   gh CLI → commits, PRs opened, reviews │
  fetch_linear.py   GraphQL → issues assigned/created      │
  fetch_notion.py   REST Search → pages created or edited  │
  fetch_slack.py    search.messages → messages sent        │
                                                           ▼
                                                     raw JSON files

enrich_*.py  ─────────────────────────────────────────────┐  parallel, optional
  analyze_diff.py   commit diffs → semantic summary        │  each makes targeted
  extract_thread.py Slack threads → decision/noise         │  claude -p sub-calls
  link_tickets.py   ticket+PR pairs → confirmed links      │  on isolated chunks
  classify_review.py PR reviews → depth + focus            │
                                                           ▼
                                                   enrichment JSON files

aggregate.py  ─────────────────────────────────────────────
  merge & deduplicate all sources
  apply PR categorization (deterministic)
  apply enrichment results (if present)
  detect weekday gaps + work-stretching hints
                                                           ▼
                                                       context.txt

claude -p  ─────────────────────────────────────────────────
  one summarization call, reads context.txt
                                                           ▼
                                              printed summary + .last_context.txt

Two LLM tiers:

  1. Enrichments (optional) — small, targeted sub-calls during the pipeline. Each operates on one isolated chunk (one diff, one thread, one ticket-PR pair). All default to false and are non-blocking.
  2. Summarization — exactly one claude -p call at the very end.
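The fetch tier can be sketched in Python. This is illustrative only — the real orchestration lives in summarize.sh; the fetcher script names mirror the repo layout, but the --timeframe flag and the stdout protocol are assumptions:

```python
import concurrent.futures
import subprocess

# Fetcher scripts from the repo layout (the CLI contract below is assumed).
FETCHERS = ["fetch_github.py", "fetch_linear.py", "fetch_notion.py", "fetch_slack.py"]

def run_fetcher(script: str, timeframe: str) -> str:
    """Run one fetcher as a subprocess; it emits raw JSON only, no prose."""
    result = subprocess.run(
        ["python3", script, "--timeframe", timeframe],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def fetch_all(timeframe: str) -> dict:
    """Run every fetcher in parallel and collect the raw JSON per source."""
    raw = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(run_fetcher, f, timeframe): f for f in FETCHERS}
        for fut in concurrent.futures.as_completed(futures):
            raw[futures[fut]] = fut.result()
    return raw
```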

File Structure

work-summarizer/
├── summarize.sh                              # CLI entry point & orchestrator
├── fetch_github.py                           # GitHub: commits, PRs, reviews (gh CLI)
├── fetch_linear.py                           # Linear: issues, comments (GraphQL)
├── fetch_notion.py                           # Notion: pages created or edited (Search API)
├── fetch_slack.py                            # Slack: messages sent (search.messages)
├── analyze_diff.py                           # Enrichment: commit diff → semantic summary
├── extract_thread.py                         # Enrichment: Slack thread → decision/blocker/noise
├── link_tickets.py                           # Enrichment: ticket+PR pair → confirmed link
├── classify_review.py                        # Enrichment: PR review → depth + focus
├── aggregate.py                              # Merge, categorize, apply enrichments, detect gaps
├── config.example.json                       # Template — copy to internal/config.json
├── requirements.txt                          # Python deps: requests only
├── work-summarizer.postman_collection.json   # All API endpoints with docs
└── work-summarizer.postman_environment.json  # Postman env (fill in secrets)

Setup

1. Prerequisites

Tool       Purpose                               Install
gh         GitHub CLI                            brew install gh && gh auth login
jq         JSON parsing in shell                 brew install jq
python3    Fetchers, enrichments & aggregator    system python3 or brew install python3
claude     AI summarization                      Claude Code

Then install the Python dependencies:

pip install -r requirements.txt

2. Create your config

cp config.example.json internal/config.json

Open internal/config.json and fill in your values. The _instructions field explains where to find each credential.

3. Get API credentials

GitHub — handled by gh auth login. No extra token needed.

Linear

  1. Settings → API → Personal API keys → Create key
  2. To find your user ID: use the Get Viewer request in the Postman collection
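If you prefer code over Postman, the same viewer lookup is one GraphQL call with requests (the repo's one Python dependency). Linear's GraphQL endpoint takes the personal API key directly in the Authorization header:

```python
import requests

LINEAR_API = "https://api.linear.app/graphql"
VIEWER_QUERY = "{ viewer { id name email } }"

def get_linear_user_id(api_key: str) -> str:
    """Ask Linear who the key belongs to and return that user's ID."""
    resp = requests.post(
        LINEAR_API,
        json={"query": VIEWER_QUERY},
        headers={"Authorization": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["data"]["viewer"]["id"]
```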

Notion

  1. notion.com/my-integrations → New integration → copy the Internal Integration Token (ntn_...)
  2. Share each workspace page with the integration: page → Connections → your integration
  3. To find your user ID: use the Get Current User request in the Postman collection
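The equivalent call in code hits Notion's users/me endpoint, which returns the bot user behind your integration token. Note that Notion requires a Notion-Version header on every request:

```python
import requests

NOTION_API = "https://api.notion.com/v1/users/me"
NOTION_VERSION = "2022-06-28"  # Notion rejects requests without a version header

def get_notion_user_id(token: str) -> str:
    """Fetch the bot user behind an ntn_... integration token."""
    resp = requests.get(
        NOTION_API,
        headers={
            "Authorization": f"Bearer {token}",
            "Notion-Version": NOTION_VERSION,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["id"]
```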

Slack

  1. api.slack.com/apps → Create App → From scratch
  2. OAuth & Permissions → User Token Scopes → add search:read
  3. For thread extraction, also add: channels:history, groups:history
  4. Install App to workspace → copy the User OAuth Token (xoxp-...)
  5. Your user ID: Slack → click your avatar → Profile → More (⋯) → Copy member ID
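Alternatively, Slack's auth.test method echoes back the user behind an xoxp- token, so you can fetch the member ID programmatically once the token is set up:

```python
import requests

def get_slack_user_id(user_token: str) -> str:
    """auth.test identifies the user behind the token; no extra scope needed."""
    resp = requests.post(
        "https://slack.com/api/auth.test",
        headers={"Authorization": f"Bearer {user_token}"},
        timeout=10,
    )
    data = resp.json()
    if not data.get("ok"):
        raise RuntimeError(data.get("error", "unknown Slack error"))
    return data["user_id"]
```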

4. Toggle services

"enabled_services": {
  "github": true,
  "linear": false,
  "notion": true,
  "slack": true
}
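A sketch of how the pipeline might honor these toggles — load the config, then run only the fetchers whose service is enabled (the helper name is hypothetical; the real gating happens in summarize.sh):

```python
import json

def enabled_fetchers(config_path: str = "internal/config.json") -> list:
    """Map enabled_services toggles to the fetcher scripts that should run."""
    with open(config_path) as f:
        config = json.load(f)
    services = config.get("enabled_services", {})
    return [f"fetch_{name}.py" for name, on in services.items() if on]
```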

Usage

./summarize.sh today
./summarize.sh yesterday
./summarize.sh week
./summarize.sh month

./summarize.sh --timeframe week   # explicit flag
./summarize.sh                    # interactive prompt
./summarize.sh --help

Enrichments

Enrichments are optional AI sub-calls that run between the fetch and aggregate steps. They add semantic depth that deterministic scripts cannot produce — reading actual code diffs, classifying thread signal, confirming ticket-PR links.

All enrichments are disabled by default. Enable them one at a time in internal/config.json and check each one's token cost before leaving it on permanently.

"enabled_enrichments": {
  "analyze_diff":    false,
  "extract_thread":  false,
  "link_tickets":    false,
  "classify_review": false
}

What each enrichment does

analyze_diff — requires github service enabled

Fetches the actual code diff for each (date, repo) commit group via gh api. Replaces meaningless commit messages like "wip" or "fix stuff" with a 1–2 sentence semantic description of what the code actually does.

Before:  GitHub   | Commits       | [owner/repo] 4 commit(s). e.g. "wip"
After:   GitHub   | Commits [refactor] | [owner/repo] 4 commit(s) — Extracted auth token validation into a reusable middleware; removed direct session table dependency. [refactor]
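The diff fetch itself is one gh api call per commit, using GitHub's standard diff media type (the Python wrapper here is a sketch, not the script's actual code):

```python
import subprocess

def diff_command(owner: str, repo: str, sha: str) -> list:
    """gh api invocation that asks GitHub for the raw diff of one commit."""
    return [
        "gh", "api", f"repos/{owner}/{repo}/commits/{sha}",
        "-H", "Accept: application/vnd.github.diff",
    ]

def fetch_commit_diff(owner: str, repo: str, sha: str) -> str:
    """Run the gh CLI and return the unified diff text."""
    result = subprocess.run(
        diff_command(owner, repo, sha),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```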

extract_thread — requires slack service enabled + channels:history scope

Fetches full Slack threads via conversations.replies. Classifies each as decision, blocker, discussion, or noise. Noise threads are dropped silently; only signal threads reach the final context.

Before:  Slack    | Messages Sent  | [#engineering] 12 msg(s). e.g. "sounds good"
After:   Slack    | Thread [decision] | [#engineering] Decision: Team aligned on Postgres for job queue over Redis. → Action: Spike Postgres job queue implementation
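Thread fetching uses Slack's conversations.replies method (hence the extra history scopes), and the noise filter afterwards is a simple membership check on the label Claude assigns:

```python
import requests

SIGNAL = {"decision", "blocker", "discussion"}  # "noise" is dropped silently

def fetch_thread(token: str, channel: str, thread_ts: str) -> list:
    """Pull all replies in one thread (needs channels:history / groups:history)."""
    resp = requests.get(
        "https://slack.com/api/conversations.replies",
        params={"channel": channel, "ts": thread_ts},
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    data = resp.json()
    if not data.get("ok"):
        raise RuntimeError(data.get("error", "unknown Slack error"))
    return data["messages"]

def keep_thread(classification: str) -> bool:
    """Only signal threads reach the final context."""
    return classification in SIGNAL
```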

link_tickets — requires both linear and github services enabled

For each (Linear ticket, GitHub PR) pair within 7 days of each other, asks Claude whether the PR actually implements the ticket. Upgrades gap detection hints from proximity guesses to confirmed links.

Before (gap hint):  2024-01-08 — Likely local development for ticket "Add JWT auth" (created 2024-01-06...)
After  (gap hint):  2024-01-08 — Local development for "Add JWT auth" — confirmed (high confidence) to implement https://github.com/.../pull/42
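The candidate pairing before the Claude call is deterministic: cross every ticket with every PR and keep the pairs whose creation dates fall within the 7-day window (field names here are illustrative):

```python
from datetime import datetime
from itertools import product

WINDOW_DAYS = 7

def candidate_pairs(tickets: list, prs: list) -> list:
    """Pair each Linear ticket with each GitHub PR created within 7 days of it.
    Each surviving pair is then sent to Claude for confirmation."""
    pairs = []
    for ticket, pr in product(tickets, prs):
        t = datetime.fromisoformat(ticket["created_at"])
        p = datetime.fromisoformat(pr["created_at"])
        if abs((p - t).days) <= WINDOW_DAYS:
            pairs.append((ticket, pr))
    return pairs
```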

classify_review — requires github service enabled

Fetches all review comments for each PR review via gh api. Classifies depth (superficial, moderate, or deep) and focus area. Replaces the blank "Code Review" label with a one-sentence description of actual contribution.

Before:  GitHub   | Code Review        | [owner/repo] Reviewed @alice's PR: Refactor auth...
After:   GitHub   | Code Review [deep] | [owner/repo] Reviewed @alice's PR: Refactor auth... | Flagged missing error handling in 3 async functions and proposed extracting auth into shared middleware

Non-blocking by design

If an enrichment fails (API error, Claude timeout, malformed output), the pipeline continues without it. The final summary is produced from whatever data is available.
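That failure isolation boils down to one guarded subprocess call per enrichment — any error path returns None, and aggregation treats a missing result the same as a disabled enrichment (a sketch; the real wrapper lives in summarize.sh):

```python
import json
import subprocess

def run_enrichment(script: str, chunk_path: str, timeout_s: int = 120):
    """Run one enrichment; on any failure return None so the pipeline continues."""
    try:
        result = subprocess.run(
            ["python3", script, chunk_path],
            capture_output=True, text=True, check=True, timeout=timeout_s,
        )
        return json.loads(result.stdout)  # malformed output also lands in except
    except (subprocess.SubprocessError, json.JSONDecodeError, OSError):
        return None
```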


PR Categorization

Applied deterministically in aggregate.py — not left to the LLM.

Signal Label
PR where author == you Created PR
PR where reviewed-by == you AND author != you Code Review
Commits pushed to your own open PR Updated / Refined PR

The classify_review enrichment annotates the depth and focus on top of this — it does not change the category.
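The table above is a pure function of the event's author/reviewer fields — roughly this, with a hypothetical event shape:

```python
def categorize_pr_event(event: dict, me: str):
    """Deterministic mapping from the table above; no LLM involved."""
    author = event.get("author")
    if event["type"] == "pr_opened" and author == me:
        return "Created PR"
    if event["type"] == "review" and event.get("reviewer") == me and author != me:
        return "Code Review"
    if event["type"] == "push" and author == me and event.get("own_open_pr"):
        return "Updated / Refined PR"
    return None  # not attributable to you; dropped from the summary
```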


Work Stretching (Gap Detection)

When aggregate.py detects a weekday with no recorded activity between a Linear ticket's creation date and its next state change, it annotates the gap in context.txt. If link_tickets is enabled and confirms the corresponding PR, the hint is upgraded from "likely" to "confirmed".

The final summarization prompt bridges gaps with explicit deduction language — "likely", "inferred from context" — rather than skipping the day or fabricating claims.
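The gap scan itself needs no AI — it walks the weekdays strictly between the ticket's creation and its next state change and flags any with no recorded activity from any source (a sketch of the logic in aggregate.py):

```python
from datetime import date, timedelta

def weekday_gaps(start: date, end: date, active_days: set) -> list:
    """Weekdays strictly between start and end with no recorded activity."""
    gaps = []
    day = start + timedelta(days=1)
    while day < end:
        if day.weekday() < 5 and day not in active_days:  # Mon=0 .. Fri=4
            gaps.append(day)
        day += timedelta(days=1)
    return gaps
```

With a ticket created Saturday 2024-01-06 and its next state change on 2024-01-10, an otherwise quiet Monday 2024-01-08 would be flagged — matching the gap-hint example above.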


Token Efficiency

  • Fetchers output raw JSON only — no prose, no summaries.
  • Each enrichment operates on one isolated chunk at a time and outputs 1–2 sentences.
  • aggregate.py compresses everything into a single flat text file before the final LLM call.
  • The summarization LLM is called exactly once per run.
  • All fetchers run in parallel; all enabled enrichments run in parallel.

Contributing

PRs welcome for:

  • Additional data sources (Google Calendar, Jira, GitHub Discussions)
  • Output formats (JSON, Markdown file export)
  • Multi-user support
