[recipes] Thought enrichment pipeline#9

Open
alanshurafa wants to merge 33 commits into main from contrib/alanshurafa/thought-enrichment

Conversation

@alanshurafa
Owner

Summary

  • LLM-powered enrichment that classifies existing thoughts with type, importance, quality, sensitivity, topics, tags, people, action items
  • Supports OpenRouter (default) and Anthropic API providers
  • Includes companion scripts: backfill-type.mjs and backfill-sensitivity.mjs
  • Batching, retry, checkpoint/resume
  • Part 2 of 12 in OB1 Alpha Milestone

Test plan

  • Run enrich-thoughts.mjs --dry-run --batch-size 5
  • Run enrich-thoughts.mjs --apply --provider openrouter --batch-size 10
  • Verify enriched thoughts have populated columns
  • Confirm OB1 PR Gate passes

@github-actions github-actions Bot added the recipe label Apr 6, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c1afad96a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

```js
totalUpdated += updates.length;
}

offset += rows.length;
```

P1: Paginate type backfill by stable cursor

Advancing offset after mutating rows causes this backfill to skip records in --apply mode. fetchBatch() reads only type=eq.reference, but each successful update changes type away from reference, shrinking the result set before the next page; then offset += rows.length jumps past still-unprocessed rows. This means some eligible thoughts never get visited whenever a batch performs updates.


```js
}
}

offset += data.length;
```

P1: Page sensitivity backfill with id cursor

This loop has the same shrinking-set pagination bug: it queries only rows with null/standard/empty sensitivity_tier, updates many of them to personal/restricted, and then increments offset. Because updated rows drop out of the filtered set, subsequent pages skip remaining candidates, so the script can finish with a large subset never scanned in --apply mode.


@github-actions github-actions Bot added the documentation and integration labels Apr 6, 2026
justfinethanku and others added 23 commits April 12, 2026 21:57
* [recipes] Add repo learning coach recipe

* [recipes] Harden repo learning coach sync and reads
…es-Projects#146)

* [dashboards] Add Workflow kanban board with drag-and-drop, mobile support, and MCP progress_task tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [dashboards] Mobile UX fixes: modal centering, landscape layout, touch drag-and-drop

- Fix modal positioning with createPortal to escape DnD transform context
- Add phone landscape CSS to hide sidebar and show mobile topbar
- Switch to MouseSensor + TouchSensor for proper mobile drag delay
- Add touchAction pan-y for scroll + drag coexistence
- Add allowedDevOrigins for mobile dev testing
- Add suppressHydrationWarning for browser extension compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [dashboards] Allow pinch-to-zoom on kanban cards

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [schemas] Add workflow status tracking columns for kanban board

Adds status and status_updated_at columns to the thoughts table,
enabling kanban-style workflow management for task and idea types.
Includes migration SQL, backfill for existing thoughts, and partial index.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [dashboards] Add Workflow kanban board with drag-and-drop and mobile support

Adds a full kanban board interface for managing task and idea thoughts:
- Drag-and-drop between status columns (New/Planning/Active/Review/Done)
- Touch-friendly with 200ms hold delay, pinch-to-zoom enabled
- Collapsible columns with localStorage persistence
- Inline edit modal for status, priority, type, and content
- Dashboard summary widget showing active workflow items
- Mobile-first responsive layout with full-screen edit on small screens
- @dnd-kit for accessible drag-and-drop (mouse + touch sensors)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [dashboards] Add delete button to kanban card edit modal

Adds a Delete button in the kanban card modal footer with a confirmation
banner before permanently deleting the thought. Wires up a new
/api/kanban/delete route and optimistic removal from the board.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [dashboards] Make delete confirmation a separate popup dialog

Replace the inline banner with a standalone centered dialog that
overlays on top of the edit modal, with clear title, description,
and Cancel/Delete buttons.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [dashboards] Fix deleteThought parsing empty response body

The REST API returns an empty body on DELETE, but apiFetch always
called res.json() causing a parse error. Inline the fetch so it
skips JSON parsing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Ivan <ivan@openbrain.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ng (NateBJones-Projects#141)

Syncs Claude Code's local memory saves to Open Brain via
mcp__open-brain__capture_thought so memories are accessible
from ChatGPT, Claude Desktop, Codex, and any MCP-connected client.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fix skill divergence (NateBJones-Projects#135)

* [recipes] Update life-engine schema: user_id TEXT, add weekly_review/cron_state types

- Changed user_id from UUID to TEXT across all 5 tables (supports
  Telegram chat_id as identifier without UUID padding hacks)
- Added weekly_review and cron_state to briefing_type check constraint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Clean up Life Engine: add state table, simplify loop timing, fix skill divergence

- Add life_engine_state key-value table for runtime state (cron job ID,
  sleep schedule) instead of overloading briefing log with cron_state type
- Remove cron_state from briefing_type CHECK constraint
- Simplify Dynamic Loop Timing from 6 tiers to 4 (15m/30m/60m/one-shot)
- Replace duplicate embedded skill in README with pointer to life-engine-skill.md
- Add user_responded update logic to Rule 7 for self-improvement engagement tracking
- Add timezone note to skill time windows
- Fix platform references to include Discord alongside Telegram
- Add RLS comment explaining why no row policies are needed
- Update metadata.json date

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Harden Life Engine permissions: lead with settings.json allowlist, scope MCP tools

- Restructure Step 6 to recommend settings.json allowlist as default (Option A)
- Replace broad mcp__open-brain__* and mcp__supabase__* wildcards with
  specific tool names (search_thoughts, list_thoughts, execute_sql, etc.)
- Include CronCreate and CronDelete in the default allowlist
- Demote --dangerously-skip-permissions to Option D (testing only)
- Update Quick Setup and Step 7 launch commands to use settings.json approach
- Addresses HIGH finding from security audit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Add rain forecast to Life Engine morning briefing via Open-Meteo

- Add Weather section to skill with Open-Meteo API call (free, no API key)
- Include rain windows with time ranges and probability in morning briefing
- Default coordinates: Portland, OR (45.52, -122.68), configurable via life_engine_state
- Only show rain line when precipitation_probability >= 30%
- Update schema comment to document latitude/longitude state keys

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Add Daily Capture, portable customizations, and manual sync rule to Life Engine

Backport portable customizations from installed SKILL.md into the recipe:
date anchor, database note, user identity, valid briefing types, proactive
chat_id, rules 9-14. Add Daily Capture prompt in evening window with
capture_thought integration. Add Rule 14 requiring manual sync between
recipe and installed skill files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Fix hallucinated column name: briefings table uses 'content' not 'summary'

Add explicit column reference note to prevent the LLM from hallucinating
a 'summary' column on life_engine_briefings — the correct column is 'content'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Address PR review: Discord support, migration steps, permission docs

Fixes all issues from PR NateBJones-Projects#135 review:
- P1: Add Bash(date/curl) and capture_thought to README allowlist examples
- P1: Make channel event handling platform-agnostic (Telegram + Discord)
  in skill Rules 7, 10, 11 and Channel Tools section
- P1: Add upgrade migration steps to schema.sql for user_id UUID→TEXT
- P2: Add CHECK constraint on delivered_via ('telegram', 'discord')
- P2: Add single-user assumption comment on life_engine_state table
- Bump version to 1.1.0, update date to 2026-04-01

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [recipes] Broaden Bash permission to Bash(*) — scoped patterns are fragile

Scoped Bash patterns like Bash(date *) and Bash(curl -s *api.open-meteo.com*)
break when the LLM varies its exact command syntax between runs, causing
silent permission blocks during unattended operation. Replace with Bash(*)
since Life Engine only uses benign read-only commands (date, curl) and
Rule 11 prevents dangerous execution from external triggers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…teBJones-Projects#125)

Replaces the empty stub with a working zero-infrastructure approach
using Claude Code scheduled tasks + Open Brain MCP + Gmail MCP.
Preserves the Edge Function approach as a planned future option.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…es-Projects#37)

* [recipes] Vercel + Neon + Telegram alternative architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [fix] Replace local MCP pattern with custom connectors (PR review feedback)

Replace claude_desktop_config.json + mcp-remote bridge instructions with
Claude Desktop custom connectors UI approach in both Step 8 and the
Troubleshooting section, aligning with CONTRIBUTING.md Rule #14.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…BJones-Projects#171)

* [recipes] ChatGPT import v2: multi-thought knowledge extraction

Replace 1-3 sentence summarization with structured knowledge extraction
that produces 2-5 typed thoughts per conversation (decision, preference,
learning, context, brainstorm, reference) with enriched metadata.

Key changes to import-chatgpt.py:
- Branch resolution via current_node parent-pointer walk
- Content type dispatch for 14 export message formats (voice, reasoning, web search, code)
- Signal-based filtering replaces regex title matching
- Session boundary detection for multi-day conversations
- Semantic deduplication via match_thoughts RPC
- Re-import handling with update_time/content_hash detection
- Embed thought content, not [ChatGPT: title] prefix
- --store-conversations for optional conversation history with pyramid summaries
- --focus flag with presets (tech, strategy, personal, creative) and custom text
- --openrouter-model flag for model selection
- --max-words flag to skip oversized conversations (default: 50000)
- Robust JSON parsing for non-OpenAI models (Anthropic, Ollama)
- Accurate progress display with percentage and skip counts

New files:
- chatgpt_parser.py: parsing, content dispatch, filtering, session detection
- schema.sql: chatgpt_conversations table with pyramid summaries and indexes

All existing CLI flags preserved (--dry-run, --model ollama, --after/--before,
--limit, --report, --verbose, --raw, --ingest-endpoint).

* [recipes] Fix ChatGPT import filtering defaults

---------

Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
NateBJones-Projects#160)

* [recipes] Local Ollama embeddings — zero-cost alternative to OpenRouter

Generate embeddings locally via Ollama and insert into Supabase.
Keeps the existing OB1 architecture, only swaps the embedding provider.

Five models tested including gte-qwen2-1.5b (1536-dim) which is
drop-in compatible with the default Open Brain schema.

Includes quality benchmarks comparing discrimination power across
all five models.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix markdown lint errors in README

Add blank lines around fenced code blocks (MD031) and merge
consecutive blockquotes (MD028).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [recipes] Fix local Ollama env loading docs

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
…s-Projects#150)

* [docs] Fix MD028 blank line between blockquotes in getting-started guide

Removes blank line between WARNING and IMPORTANT blockquotes that was
failing markdownlint across all PRs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix claudeception recipe: convert multi-line YAML descriptions to single-line

Multi-line descriptions (description: |) break agent routing silently.
Nate's March 2026 Skills Standard requires single-line YAML descriptions
for reliable semantic matching. Fixed 3 instances: the recipe's own
description and 2 template examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [recipes] Clean up Claudeception docs formatting

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
…NateBJones-Projects#148)

* fix(professional-crm): remove Accept header patch causing SSE reconnect loop

The Accept: text/event-stream header patch forced StreamableHTTPTransport
into SSE mode on every request. Since Supabase edge functions are stateless,
the SSE stream terminates immediately after each response — causing the MCP
client to reconnect every ~2 seconds (~43k invocations/day).

StreamableHTTPTransport is request/response by design. Removing the patch
lets it respond with plain JSON, eliminating the reconnect loop entirely.

* fix(professional-crm): force JSON-only Accept header to prevent SSE reconnect loop

Removing text/event-stream from the Accept header before it reaches
StreamableHTTPTransport prevents it from opening SSE streams. MCP clients
send Accept: application/json, text/event-stream per spec; this is what
triggers SSE mode even without the original workaround.

JSON-only responses close cleanly, eliminating the boot/shutdown cycle.
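The header fix described above can be sketched as a small filter; the function name and plain-string signature here are illustrative assumptions, not the PR's actual code.

```javascript
// Illustrative sketch: drop text/event-stream from an incoming Accept header
// so StreamableHTTPTransport answers with plain JSON instead of opening an
// SSE stream. Name and signature are assumptions, not the PR's identifiers.
function forceJsonAccept(accept) {
  if (!accept) return "application/json";
  const kept = accept
    .split(",")
    .map((part) => part.trim())
    .filter((part) => !part.startsWith("text/event-stream"));
  // If the client asked only for SSE, fall back to JSON.
  return kept.length > 0 ? kept.join(", ") : "application/json";
}
```

With this in place, the spec-mandated `Accept: application/json, text/event-stream` from MCP clients collapses to `application/json`, so responses close cleanly instead of cycling.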
…ateBJones-Projects#139)

* recipes: add adaptive capture classification with confidence gating

* recipes: address review — fix author, OB1 types, add TypeScript implementation

* recipes: incorporate GitHub edits to README, classifier prompt, and metadata

* [recipes] Tighten adaptive capture setup and threshold updates

---------

Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
…ateBJones-Projects#133)

* Add update_professional_contact tool to CRM extension

Adds the ability to update existing contact fields (name, company,
title, email, phone, tags, notes, follow_up_date, etc.) which was
proposed in NateBJones-Projects#93 but never implemented. Only provided fields are
updated, and the existing updated_at trigger handles timestamping.

* Allow clearing follow_up_date by passing null or empty string

Fixes the case where a follow-up date, once set, could never be
cleared — leaving contacts permanently stuck in get_follow_ups_due.

* [extensions] Document contact update tool

---------

Co-authored-by: Matt Hallett <matthallett@gmail.com>
Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
…es-Projects#161)

* Fix pre-existing markdownlint errors across 15 files

Add blank lines around headings (MD022), fenced code blocks (MD031),
and between adjacent blockquotes (MD028). Fix broken link fragment
(MD051) and remove extra blank line (MD012). No content changes.

These errors were blocking CI on all open PRs since the lint check
runs repo-wide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [docs] Preserve README links during markdown cleanup

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
…raphics (NateBJones-Projects#85)

* [recipes] Infographic Generator: turn research docs into visual infographics

Second recipe from @jaredirish. Part of the Open Brain Flywheel
(capture-process-visualize loop, see Issue NateBJones-Projects#84).

Takes any markdown doc or Open Brain thought cluster and generates
professional infographic images via Gemini's free-tier API.
Auto-chunks content, writes verbose prompts (300+ words each),
generates PNGs with specific colors/layout/typography.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [recipes] Fix broken relative links in infographic-generator README

../brain-dump-processor/ → ../panning-for-gold/
../auto-capture-protocol/ → ../auto-capture/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [recipes] Address review feedback on infographic generator

- Sync generate.py with working local version (cleaner error handling,
  fix --redo display counter bug)
- Fix auto-capture link: directory doesn't exist until PR NateBJones-Projects#42 merges,
  so link to the PR instead of a non-existent directory

Note: part.as_image() and gemini-2.5-flash-image are both valid per
the official google-genai SDK docs. Reviewer concerns on those were
based on outdated information.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [recipes] Fix infographic redo progress output

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
* [recipes] Add OB-Graph knowledge graph layer

Adds graph database functionality for Open Brain using PostgreSQL nodes + edges
with recursive CTE traversal. Includes schema, MCP server with 10 tools, and
setup documentation.

https://claude.ai/code/session_015Z8wCeokTMTdrVMthqzGKJ

* [recipes] Clarify OB-Graph deployment setup

---------

Co-authored-by: Claude <noreply@anthropic.com>
* [docs] Fix Cursor MCP connection — use native url field, not mcp-remote

mcp-remote@latest now attempts OAuth client registration before sending
custom headers, which breaks against Open Brain's simple key-based auth.
Cursor supports remote MCP servers natively via the url field, so
mcp-remote is unnecessary.

Changes:
- Add dedicated Cursor section to getting-started guide (7.5) and
  remote-mcp primitive with native url config
- Update mcp-remote examples to pass key via ?key= query parameter
  instead of --header to avoid OAuth discovery issues
- Clarify x-brain-key (core) vs x-access-key (extensions) in
  troubleshooting guides

Made-with: Cursor

* [primitives] Bring remote MCP docs in line with repo format

---------

Co-authored-by: Jonathan Edwards <justfinethanku@gmail.com>
* [skills] Add weekly signal diff skill pack

* [skills] Fix markdownlint numbering in weekly signal diff
…rojects#181)

* [recipes] Add Bring Your Own Context recipe

* [recipes] Fix markdownlint regression in activation README
* [repo] Sweep fix-now backlog issues

* [docs] Fix setup-guide markdownlint regression
Add LLM-powered enrichment recipe that retroactively classifies existing
thoughts with type, importance, quality score, sensitivity tier, and
metadata (topics, tags, people, action items). Supports OpenRouter and
Anthropic providers with batching, retry, and checkpoint/resume.
Part 2 of 12 in the OB1 alpha milestone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alanshurafa alanshurafa force-pushed the contrib/alanshurafa/thought-enrichment branch from d7c1195 to 6a98273 on April 18, 2026 at 00:48
Why: `.*` between variable-length `\d{8,17}` and a literal keyword
alternation (account|routing|iban) creates catastrophic backtracking
on digit-heavy inputs with no keyword match. A single long thought
without the keyword could freeze backfill-sensitivity.mjs for
seconds. Replace with a bounded `[^\n]{0,80}` proximity window across
all three patterns that used the `.*` shape (bank_account x2,
health_measurement, financial_detail). A 30 KB pathological input now
scans in ~0 ms.
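The shape change can be sketched as follows; the keyword alternations are illustrative stand-ins, and the real patterns in backfill-sensitivity.mjs may differ.

```javascript
// Illustrative patterns only, not the script's actual regexes.
// Unbounded gap between a variable-length digit run and a keyword
// alternation can backtrack badly on digit-heavy non-matching input:
const risky = /\d{8,17}.*(account|routing|iban)/i;
// Bounded non-newline proximity window removes the blowup:
const bounded = /\d{8,17}[^\n]{0,80}(account|routing|iban)/i;

// Matches only when the keyword appears within 80 chars of the digits.
const near = bounded.test("12345678 account"); // true
const far = bounded.test("12345678" + "x".repeat(100) + "account"); // false
```

The bounded form trades a small recall loss (keywords more than 80 characters away) for predictable linear scanning.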
Why: Node 18's undici global fetch has no body/read timeout — a stuck
upstream (OpenRouter proxying a stalled backend, or a flaky proxy in
front of Supabase) will hang a worker indefinitely with no signal to
the user. Concurrency 20 plus one stuck call = zero forward progress
but a live event loop, which also blocks clean shutdown.

Mitigation:
- Added `fetchWithTimeout` + default constants + `resolveTimeoutMs`
  helper to `lib/memory-core.mjs` so all three scripts share one
  implementation.
- Wrapped every fetch() call in enrich-thoughts.mjs,
  backfill-sensitivity.mjs, and backfill-type.mjs. LLM calls use 60s,
  Supabase calls use 30s. FETCH_TIMEOUT_MS in .env.local (or env)
  overrides both.
- Extended enrich-thoughts.mjs `withRetry` to treat AbortError /
  "Timeout after" / "aborted" as transient so timeouts retry once
  before failing.
- backfill-type.mjs now catches fetch() throws (AbortError etc.)
  inside fetchBatch/updateRow and folds them into existing exponential
  backoff instead of bubbling as fatal.
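A minimal sketch of the shared helper, assuming Node 18+ global fetch and AbortController; the exact signatures in lib/memory-core.mjs may differ.

```javascript
// Illustrative defaults matching the commit description: 60s for LLM calls,
// 30s for Supabase calls.
const DEFAULT_LLM_TIMEOUT_MS = 60_000;
const DEFAULT_DB_TIMEOUT_MS = 30_000;

// FETCH_TIMEOUT_MS in the environment overrides both defaults.
function resolveTimeoutMs(fallbackMs, env = process.env) {
  const raw = Number(env.FETCH_TIMEOUT_MS);
  return Number.isFinite(raw) && raw > 0 ? raw : fallbackMs;
}

async function fetchWithTimeout(url, options = {}, timeoutMs = DEFAULT_DB_TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // A stalled upstream now surfaces as an AbortError instead of hanging forever.
    return await fetch(url, { ...options, signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}
```

Because the abort fires on a wall-clock timer rather than a connect timeout, it also covers the stalled-body case undici's defaults miss.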
…puts

Why: Thought content is pasted verbatim into the classifier with no
delimiter, no framing, and no length cap on free-form output fields.
A hostile import (shared ChatGPT export, scraped feed, captured
phishing message) can inject instructions that land as attacker-
controlled strings in `metadata.summary`, `people`, `tags`, or
`action_items` — which OB1 then treats as trustworthy memory.

Mitigation:
- Wrap content in `<thought_content>...</thought_content>` tags and
  escape any literal tag occurrences in the content so an attacker
  cannot close the block.
- Updated system prompt tells the model the tagged block is untrusted
  data and to classify it, never follow instructions inside.
- Added `response_format: { type: "json_object" }` to the OpenRouter
  call so supporting models refuse to emit prose-around-JSON at all.
- New `sanitizeString` / `sanitizeStringArray` helpers apply length
  caps (summary 500, each array item 80-300 chars depending on field,
  20 items max per array) and strip control characters before any
  free-form text reaches the Supabase metadata payload.
- README gains a "Security notes" section documenting both the
  injection threat model and the Bearer-token-on-wire concern.
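Illustrative versions of the helpers named above; the exact caps and escaping in enrich-thoughts.mjs may differ.

```javascript
// Escape literal tag occurrences so untrusted content cannot close the block.
function wrapThoughtContent(content) {
  const escaped = String(content)
    .replace(/<thought_content>/g, "&lt;thought_content&gt;")
    .replace(/<\/thought_content>/g, "&lt;/thought_content&gt;");
  return `<thought_content>${escaped}</thought_content>`;
}

// Strip control characters, then cap the length (summary default: 500).
function sanitizeString(value, maxLen = 500) {
  if (typeof value !== "string") return "";
  return value.replace(/[\u0000-\u001f\u007f]/g, "").slice(0, maxLen);
}

// Cap item count and per-item length for tags/people/action_items arrays.
function sanitizeStringArray(values, maxItems = 20, maxLen = 80) {
  if (!Array.isArray(values)) return [];
  return values
    .slice(0, maxItems)
    .map((v) => sanitizeString(v, maxLen))
    .filter((v) => v.length > 0);
}
```

The wrapper plus the system-prompt instruction turns injected text into inert data; the sanitizers bound what any successful injection could still smuggle into the Supabase metadata payload.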
Why: Normal-mode loop only terminates on `--limit` or an empty page.
A shell typo (dropped `--limit`, wrong `--model`, or a misbehaving
provider that still bills for its own failures) against a 50k-row
table can burn $50-$200 with no circuit breaker. The previous
"approximately $1-2 per 1k" guidance in the README was a forecast,
not a guard.

Mitigation: add a `--max-calls <n>` CLI flag (default 10000, pass 0
to disable). Track `budget.calls` across both the normal-mode and
retry-failed loops, incremented inside classifyAndUpdate immediately
before each LLM provider call (empty-content rows are not counted
since they skip the LLM). Check the budget at the top of every loop
iteration and every concurrency chunk; on exceed, set `budgetExceeded`,
break cleanly, and flip the final banner to "ABORTED (--max-calls
reached)". Final summary always prints "LLM calls made: N / cap"
so users can tune the cap on their next run. README documents the
flag, the default, and the zero-disables semantics. Also accepts
ENRICH_MAX_CALLS from .env.local as an implicit default.
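The budget bookkeeping can be sketched as a small tracker; the names here are illustrative, since the commit describes only a `budget.calls` counter.

```javascript
// Illustrative call-budget tracker matching the --max-calls semantics above:
// default 10000, and 0 disables the cap entirely.
function createCallBudget(maxCalls = 10000) {
  let calls = 0;
  return {
    record() { calls += 1; },                                  // just before each LLM call
    exceeded() { return maxCalls > 0 && calls >= maxCalls; },  // top of each loop/chunk
    summary() {
      return `LLM calls made: ${calls} / ${maxCalls > 0 ? maxCalls : "unlimited"}`;
    },
  };
}
```

Checking at the top of every loop iteration and concurrency chunk (rather than after) guarantees the run aborts before spending the first over-budget call.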
Why: Offset pagination while mutating filtered rows is broken. Every
successful PATCH removes a row from the WHERE clause, so the next
`offset += BATCH_SIZE` step skips that many UNPROCESSED rows. For
backfill-sensitivity this could silently lose up to half of sensitive
thoughts on dense batches; for backfill-type every PATCH drops the
row out of the `type=eq.reference` filter, compounding the skip.
The user sees "scanned 12,000" and assumes the run is complete when
thousands of rows were never seen.

Fix: switch both scripts to cursor (keyset) pagination on id:
`id=gt.${afterId} ORDER BY id ASC LIMIT N`, advance afterId to the
last id in each page. Stable under mutation because PATCHing does
not change the id of the remaining rows.

While here:
- backfill-sensitivity progress reporter now uses a counter-based
  threshold (LOW-6) so partial batches still emit progress.
- backfill-type now sends `Prefer: count=exact` only on the first
  page (LOW-7), avoiding a full COUNT(*) scan on every page.

Both scripts' completion summaries were updated to use the cursor
semantics (`processedRows` instead of `offset`).
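A minimal sketch of the keyset loop; `fetchPage` and `processRow` are assumed shapes, not the scripts' real function names.

```javascript
// Illustrative keyset-pagination loop. fetchPage(afterId, limit) stands in
// for a PostgREST query like id=gt.{afterId}&order=id.asc&limit={limit}.
async function scanAll(fetchPage, processRow, limit = 100) {
  let afterId = 0;
  let processedRows = 0;
  for (;;) {
    const rows = await fetchPage(afterId, limit);
    if (rows.length === 0) break;
    for (const row of rows) {
      await processRow(row); // may PATCH the row out of the filtered set
      processedRows += 1;
    }
    // Stable under mutation: a PATCH never changes the id of remaining rows.
    afterId = rows[rows.length - 1].id;
  }
  return processedRows;
}
```

Unlike `offset += rows.length`, the cursor only ever moves past rows that were actually returned, so rows dropping out of the filter cannot cause skips.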
Why: `state.lastProcessedId` was written to enrichment-state.json
after every chunk but never read back on startup. Resume only worked
as a side-effect of the `enriched=eq.false` filter — fine for the
normal case, but a run with `--skip N` that crashed mid-way would
start over from offset N of a different filtered set, silently
re-processing rows. README advertised "checkpoint/resume" as a
first-class feature; it wasn't.

Fix:
- On startup, if `state.lastProcessedId != null` and neither `--skip`
  nor `--reset-state` was passed, seed `fetchCursor.afterId` from
  `state.lastProcessedId`. Print an explicit "Resuming from id > N"
  banner so the user sees it.
- Add `--reset-state` flag to explicitly wipe the checkpoint when
  the user wants a clean run.
- Document real resume semantics in the README (cursor + DB filter
  as defense-in-depth).

Respects explicit user intent: passing `--skip` overrides the resume
because the user clearly wants to restart from a specific point.
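The precedence rule can be sketched as follows; flag and field names follow the description above but are otherwise assumptions.

```javascript
// Illustrative resume-precedence helper: explicit --skip or --reset-state
// overrides the checkpoint; otherwise seed the cursor from lastProcessedId.
function resolveStartCursor(state, flags) {
  if (flags.resetState) return { afterId: 0, resumed: false };
  if (flags.skip != null) return { afterId: 0, resumed: false }; // user chose a restart point
  if (state.lastProcessedId != null) {
    return { afterId: state.lastProcessedId, resumed: true };    // print "Resuming from id > N"
  }
  return { afterId: 0, resumed: false };
}
```

The `resumed` flag is what drives the explicit banner, so a silent resume can never surprise the user.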
Why: Previous patchThought retried exactly once on any non-ok
response, sleeping 2s in between. Against a 4xx (400 column-doesn't-
exist, 401 bad key, 403 RLS denial, 404 row gone, 422 constraint),
this wasted 2s + a round trip per row without ever succeeding and
masked the real error behind the retry-error path. Worst case:
`--apply --limit 10000` with the enhanced-thoughts schema not
applied burns 10,000 LLM calls (real money already spent) + 2s per
PATCH before the user figures out what's wrong.

Fix: rewrite patchThought as an exponential-backoff loop (same shape
as backfill-type.mjs's updateRow) that retries only on 429 / 500 /
502 / 503 / 504, plus AbortError/network errors as transient up to
`retries` times. On any 4xx, throw immediately with the response
body (300 chars) so the caller sees `FAIL #{id}: column "enriched"
does not exist` on row 1.
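The retry policy can be sketched like this; `doPatch` and `sleep` are injected for testability here, whereas the real patchThought presumably builds and issues the PATCH itself.

```javascript
// Illustrative retry loop: exponential backoff on transient statuses only,
// fail-fast on any other 4xx with the first 300 chars of the body attached.
const RETRYABLE_STATUSES = new Set([429, 500, 502, 503, 504]);

async function patchWithRetry(
  doPatch,
  retries = 3,
  baseDelayMs = 1000,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
) {
  for (let attempt = 0; ; attempt += 1) {
    let res;
    try {
      res = await doPatch();
    } catch (err) {
      // Network failures and AbortError count as transient.
      if (attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
      continue;
    }
    if (res.ok) return res;
    if (!RETRYABLE_STATUSES.has(res.status)) {
      const body = (await res.text()).slice(0, 300);
      throw new Error(`PATCH failed (${res.status}): ${body}`);
    }
    if (attempt >= retries) {
      throw new Error(`PATCH failed after ${retries + 1} attempts (${res.status})`);
    }
    await sleep(baseDelayMs * 2 ** attempt);
  }
}
```

Failing fast on 4xx is what surfaces a schema mismatch on row 1 instead of after thousands of billed LLM calls.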
Why: `state.failedIds` had no upper bound. A catastrophic run against
a flaky provider could accumulate tens of thousands of IDs, all
persisted to disk, then reloaded on the next `--retry-failed`. That
path builds a single `id=in.(...)` URL that blows past PostgREST's
8KB cap and any proxy-level URL length limits, producing silent 414
URI Too Long or truncated result sets.

Mitigation:
- Cap `state.failedIds` at 1000 entries with FIFO eviction in
  `addFailedId`. Warn exactly once per run when the cap is first hit
  so the user knows older failures will no longer surface.
- `fetchByIds` now splits input IDs into chunks bounded by both
  count (50 IDs per request) and URL character length (6000 chars of
  comma-joined IDs). Each chunk is a separate fetch and the results
  are concatenated. 50 small IDs fit in one request; very large IDs
  split sooner on URL length; empty input yields no requests.
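The double-bounded chunking can be sketched as a pure helper; the name and exact limits mirror the description above but are otherwise assumptions.

```javascript
// Illustrative chunker: split IDs by both count (50 per request) and the
// character length of the comma-joined list (6000 chars), so the resulting
// id=in.(...) URL stays under PostgREST/proxy limits.
function chunkIds(ids, maxCount = 50, maxChars = 6000) {
  const chunks = [];
  let current = [];
  let chars = 0;
  for (const id of ids) {
    const s = String(id);
    const added = current.length > 0 ? s.length + 1 : s.length; // +1 for joining comma
    if (current.length > 0 && (current.length >= maxCount || chars + added > maxChars)) {
      chunks.push(current);
      current = [];
      chars = 0;
    }
    current.push(s);
    chars += current.length > 1 ? s.length + 1 : s.length;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk then becomes one request, e.g. `id=in.(${chunk.join(",")})`, and the results are concatenated.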
Why: One-line fixes picked up while addressing blockers, per the
review brief's "trivial fixes are fine" exception.

- LOW-1: record `enriched_provider: config.provider` alongside
  `enriched_model` so a model string (e.g. `claude-3-5-haiku-*`)
  can be disambiguated between OpenRouter and direct Anthropic for
  historical audit.
- LOW-5: validate `--limit` as a positive integer. Previously
  `--limit 0` and `--limit foo` both coerced to 0, which the main
  loop interprets as "unlimited" — the same compound risk
  BLOCKER-1 exists to guard against. Now exits 1 with a clear
  message on non-positive or non-integer input.
- MEDIUM-8: metadata.json `services` now lists "Anthropic API"
  alongside OpenRouter, and `tags` includes "anthropic" so catalog
  search surfaces this recipe for Anthropic users.
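The LOW-5 validation amounts to a strict positive-integer parse; a sketch (the function name is illustrative, and the caller is what exits 1 with a message on `null`):

```javascript
// Illustrative strict parser: rejects 0, negatives, floats, and non-numeric
// strings instead of silently coercing them to 0 ("unlimited").
function parsePositiveInt(raw) {
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : null;
}
```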
…tion

Why: The CI `lint` job fails because markdownlint-cli2 flags
MD029/ol-prefix in recipes/thought-enrichment/README.md — the three
script sub-sections share continuous 3..10 numbering instead of each
restarting at 1, and the "Recommended execution order" list continues
from 11..14. MD029 expects each list to start at 1. The review
flagged the same issue as MEDIUM-6 (readability), now also blocking
the lint CI gate.

Fix: restart numbering at 1 for each sub-section (enrich-thoughts,
backfill-type, backfill-sensitivity) and at 1 for the "Recommended
execution order" list. Total step count unchanged (still 14 numbered
items per the gate rule 9 check), but the reader no longer needs to
track a global counter across sections.
alanshurafa pushed a commit that referenced this pull request Apr 19, 2026
…roduction sessions

Updates from 6 days of daily production use across 4 projects:
- Phase 0.5: Speaker consolidation (voice transcription creates 3-5x
  more labels than actual speakers, must clean before extraction)
- Phase 3.5: Auto-capture to Open Brain with per-thread granularity
- DOT flowchart diagram replacing ASCII flow
- 6 lessons log entries from real failures
- Common mistakes #9-10: speaker label trust, extraction stinginess
- "Don't be stingy" rule: 80+ threads for 1-hour conversation is normal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>