[recipes] Add wiki-synthesis — autobiography + email-thread wikis by alanshurafa · Pull Request #17 · alanshurafa/OB1

alanshurafa · 2026-04-21T20:28:52Z

Summary

New recipe under recipes/wiki-synthesis/ that ports ExoCortex's wiki-synthesis work into OB1:

scripts/synthesize-wiki.mjs — topic-scoped synthesizer with an autobiography mode that groups thoughts by year and asks an OpenAI-compatible Chat Completions endpoint for second-person biographical prose per year. Extensible catalogue (drop in more topics: career, travel, relationships, etc.).
scripts/backfill-gmail-wikis.mjs — resume-safe per-thread wiki generator for Gmail-imported thoughts. Groups by metadata.gmail.thread_id, filters by word-count + message/atom thresholds, writes gmail_wiki thoughts with derived_from edges back to source atoms. Prefers an upsert_thought RPC when present; plain insert fallback.
dashboard-snippets/ — optional Next.js Server-Action components for a /wiki index + /wiki/[slug] detail view. Users copy into their own dashboard and wire in auth.

How it differs from `entity-wiki`

entity-wiki (separate PR, branch contrib/alanshurafa/entity-wiki) synthesizes one page per entity and needs the entity-extraction schema.
wiki-synthesis (this PR) synthesizes one page per corpus slice (year, topic, email thread) and only requires the core thoughts table, plus optional thought_edges for email-thread provenance.

Both recipes are documented to cross-reference each other.

What it requires

Open Brain setup, Node.js 18+, any Chat-Completions-compatible LLM endpoint.
Autobiography mode: only the core thoughts table.
Email-thread mode: thoughts imported via recipes/email-history-import/ (or compatible importer that populates metadata.gmail.thread_id + metadata.gmail.gmail_id), plus a public.thought_edges table from the Knowledge Graph schema (upstream PR [schemas] Knowledge graph tables and extraction trigger #5).

README calls these prerequisites out explicitly so users know which OB1 layers they need before running the pipeline.

Generalization notes (from the ExoCortex origin)

Replaced MCP edge-function client (x-brain-key / open-brain-rest) with direct PostgREST + service role — matches the entity-wiki recipe's env-var contract (OPEN_BRAIN_URL, OPEN_BRAIN_SERVICE_KEY, LLM_BASE_URL, LLM_API_KEY, LLM_MODEL).
Swapped Anthropic-direct HTTP calls for the OpenAI-compatible Chat Completions pattern so users can point at OpenRouter/OpenAI/local LLMs.
Dropped the claude-cli provider branch from backfill-gmail-wikis.mjs (that was a local-compute workaround, not appropriate for a shared recipe).
New env knobs: SUBJECT_NAME (narrator voice), SOURCE_TYPE_FILTER (narrow autobiography corpus), WIKI_OUTPUT_DIR.
Stripped hardcoded personal data; the only remaining personal reference is the author metadata, as expected.

Test plan

Dry-run the autobiography synthesizer on my own Open Brain (--dry-run --scope year=2024) and confirm the year bucketing works against core thoughts.
Generate a single-year autobiography end-to-end and inspect output/wiki/autobiography-2024.md.
Dry-run backfill-gmail-wikis.mjs against my imported Gmail corpus and confirm eligibility counts match expectation.
Full backfill on a small subset (--limit=5), verify wiki thoughts land with source_type='gmail_wiki' and that thought_edges has derived_from rows back to source atoms.
Copy dashboard snippets into my dashboard fork, confirm the /wiki index renders and the Server Action round-trips to the script.

Pre-review fork PR — this is the intake PR for cross-AI review before any upstream PR to NateBJones-Projects/OB1.

…from thoughts Adds a new recipe that ships two Node scripts for wiki-style synthesis over the core `thoughts` table, plus optional Next.js dashboard snippets. - `scripts/synthesize-wiki.mjs` — topic-scoped synthesizer with a built-in `autobiography` mode that groups thoughts by year and asks an OpenAI-compatible Chat Completions endpoint to produce second-person biographical prose per year. Extend the catalogue to add more topics. - `scripts/backfill-gmail-wikis.mjs` — resume-safe per-thread wiki generator for Gmail-imported thoughts. Groups by `metadata.gmail.thread_id`, filters by word-count + message/atom thresholds, writes wiki thoughts with `derived_from` edges to their source atoms. Prefers an `upsert_thought` RPC when present, falls back to plain inserts. - `dashboard-snippets/` — optional Next.js components (Server Action + `/wiki` index + `/wiki/[slug]` detail) to copy into a dashboard. Generalized from ExoCortex origin: replaced the MCP edge-function client with direct PostgREST + service role, swapped the Anthropic-direct call for OpenAI-compatible Chat Completions, removed personal data references, introduced SUBJECT_NAME and SOURCE_TYPE_FILTER env knobs. Complements the in-flight entity-wiki recipe — this one does corpus-slice synthesis and only requires the core `thoughts` table. README documents the thought_edges / upsert_thought prerequisites so users know which OB1 layers they need before running the email-thread pipeline.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 181f48e289

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-21T20:33:56Z

+    const thoughtId = active.id as number;
+    const newStatus = over.id as string;
+


Resolve status from drop container, not hovered card

handleDragEnd treats over.id as the new workflow status, but with SortableContext each card is also a droppable target, so over.id is often a thought ID rather than a column key. In those common drops, this sends values like "123" to /api/kanban/update, which fails VALID_STATUSES validation and reverts the move, making drag-and-drop unreliable unless the user drops in empty column space.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-21T20:33:56Z

+UPDATE thoughts
+SET status = 'new', status_updated_at = now()
+WHERE metadata->>'type' IN ('task', 'idea') AND status IS NULL;


Backfill workflow status from type column

The migration backfill filters on metadata->>'type', but workflow task classification is stored in the top-level type field (and even this schema's README uses WHERE type IN ('task', 'idea')). On existing databases this leaves pre-existing task/idea rows with status IS NULL, so they are excluded from status-filtered workflow queries until users run a manual corrective update.

Useful? React with 👍 / 👎.

…tion Wrap raw thought content in <entries>...</entries> delimiters and tell the system+user prompt explicitly to ignore instructions inside it. Captured thoughts are user data and can contain adversarial text that overrides the biographer task.

Wrap raw email-thread content in <thread>...</thread> delimiters and tell the system prompt to treat it as untrusted data. Email bodies arrive from external senders and may contain adversarial instructions that override the summarization task.

…idate year Server Action was passing the host's entire process.env to the synthesize-wiki child process, leaking every unrelated secret. Pass only the OB1/LLM-related vars plus the minimal system env the child needs. Also server-side-validate scope_year against /^(19|20)\d{2}$/ (P3). Auth guard stays as a placeholder so the snippet remains optional reference code — the README already warns to add one.

…rrow RPC fallback P1-2: Wrap the wiki-delete PostgREST filter id in encodeURIComponent so non-numeric (e.g., uuid) ids with reserved chars don't break the URL. P1-3: Coerce thrown non-Error values in the retry loop — 'throw 'msg'' used to crash the catch and skip the state-log append. P2: Narrow the upsert_thought RPC fallback trigger to the specific rpc/upsert_thought 404 signal, not any string with 'not found', so auth/permission errors surface instead of silently falling through to a direct insert. Also cap the LLM error body at 500 chars to avoid leaking long provider diagnostics to logs.

…rereqs Escape SUBJECT_NAME in autobiography frontmatter so names containing ':', quotes, or YAML-reserved chars don't break the frontmatter parse the dashboard snippet does. Expand the README's thought_edges prerequisite with column types and the UNIQUE index needed for the ignore-duplicates edge-upsert header. Also spell out what users observe when the schema is missing.

alanshurafa · 2026-04-22T16:54:22Z

Refreshing checks after markdownlint cleanup merged into fork main.

alanshurafa · 2026-04-22T16:57:30Z

Refreshing checks after fork markdownlint workflow fix.

github-actions Bot added dashboard documentation Improvements or additions to documentation extension integration primitive recipe labels Apr 21, 2026

chatgpt-codex-connector Bot reviewed Apr 21, 2026

View reviewed changes

alanshurafa added 5 commits April 21, 2026 16:50

alanshurafa mentioned this pull request Apr 21, 2026

[recipes] Wiki synthesis + autobiography pipeline NateBJones-Projects/OB1#222

Merged

alanshurafa closed this Apr 22, 2026

alanshurafa reopened this Apr 22, 2026

alanshurafa closed this Apr 22, 2026

alanshurafa reopened this Apr 22, 2026

[docs] Fix pre-existing markdownlint errors across 8 files

da0cf24

github-actions Bot added the schema label Apr 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[recipes] Add wiki-synthesis — autobiography + email-thread wikis#17

[recipes] Add wiki-synthesis — autobiography + email-thread wikis#17
alanshurafa wants to merge 7 commits intomainfrom
contrib/alanshurafa/wiki-synthesis

alanshurafa commented Apr 21, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 21, 2026

Uh oh!

chatgpt-codex-connector Bot Apr 21, 2026

Uh oh!

alanshurafa commented Apr 22, 2026

Uh oh!

alanshurafa commented Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		const thoughtId = active.id as number;
		const newStatus = over.id as string;

Conversation

alanshurafa commented Apr 21, 2026

Summary

How it differs from entity-wiki

What it requires

Generalization notes (from the ExoCortex origin)

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 21, 2026

Choose a reason for hiding this comment

Uh oh!

alanshurafa commented Apr 22, 2026

Uh oh!

alanshurafa commented Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

How it differs from `entity-wiki`