garrytan · Apr 18, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,81 @@
 
 All notable changes to GBrain will be documented in this file.
 
+## [0.12.2] - 2026-04-19
+
+## **Postgres frontmatter queries actually work now.**
+## **Wiki articles stop disappearing when you import them.**
+
+This is a data-correctness hotfix for Postgres-backed brains on `v0.12.0` and earlier. If you run gbrain on Postgres or Supabase, you've been losing data without knowing it. PGLite users were unaffected. Upgrading auto-repairs your existing rows. This release lands on top of v0.12.1 (extract N+1 fix + migration timeout fix), so running `gbrain upgrade` gets you both.
+
+### What was broken
+
+**Frontmatter columns were silently stored as quoted strings, not JSON.** Every `put_page` wrote `frontmatter` to Postgres via `${JSON.stringify(value)}::jsonb` — postgres.js v3 stringified again on the wire, so the column ended up holding `"\"{\\\"author\\\":\\\"garry\\\"}\""` instead of `{"author":"garry"}`. Every `frontmatter->>'key'` query returned NULL. GIN indexes on JSONB were inert. Same bug on `raw_data.data`, `ingest_log.pages_updated`, `files.metadata`, and `page_versions.frontmatter`. PGLite hid this entirely (different driver path) — which is exactly why it slipped past the existing test suite.
+
+**Wiki articles got truncated by 83% on import.** `splitBody` treated *any* standalone `---` line in body content as a timeline separator. Discovered by @knee5 migrating a 1,991-article wiki where a 23,887-byte article landed in the DB as 593 bytes (4,856 of 6,680 wikilinks lost).
+
+**`/wiki/` subdirectories silently typed as `concept`.** Articles under `/wiki/analysis/`, `/wiki/guides/`, `/wiki/hardware/`, `/wiki/architecture/`, and `/writing/` defaulted to `type='concept'` — type-filtered queries lost everything in those buckets.
+
+**pgvector embeddings sometimes returned as strings → NaN search scores.** Discovered by @leonardsellem on Supabase, where `getEmbeddingsByChunkIds` returned `"[0.1,0.2,…]"` instead of `Float32Array`, producing `[NaN]` query scores.
+
+### What you can do now that you couldn't before
+
+- **`frontmatter->>'author'` returns `garry`, not NULL.** GIN indexes work. Postgres queries by frontmatter key actually retrieve pages.
+- **Wiki articles round-trip intact.** Markdown horizontal rules in body text are horizontal rules, not timeline separators.
+- **Recover already-truncated pages with `gbrain sync --full`.** Re-import from your source-of-truth markdown rebuilds `compiled_truth` correctly.
+- **Search scores stop going `NaN` on Supabase.** Cosine rescoring sees real `Float32Array` embeddings.
+- **Type-filtered queries find your wiki articles.** `/wiki/analysis/` becomes type `analysis`, `/writing/` becomes `writing`, etc.
+
+### How to upgrade
+
+```bash
+gbrain upgrade
+```
+
+The `v0.12.2` orchestrator runs automatically: applies any schema changes, then `gbrain repair-jsonb` rewrites every double-encoded row in place using `jsonb_typeof = 'string'` as the guard. Idempotent — re-running is a no-op. PGLite engines short-circuit cleanly. Batches well on large brains.
+
+If you want to recover pages that were truncated by the splitBody bug:
+
+```bash
+gbrain sync --full
+```
+
+That re-imports every page from disk, so the new `splitBody` rebuilds the full `compiled_truth` correctly.
+
+### What's new under the hood
+
+- **`gbrain repair-jsonb`** — standalone command for the JSONB fix. Run it manually if needed; the migration runs it automatically. `--dry-run` shows what would be repaired without touching data. `--json` for scripting.
+- **CI grep guard** at `scripts/check-jsonb-pattern.sh` — fails the build if anyone reintroduces the `${JSON.stringify(x)}::jsonb` interpolation pattern. Wired into `bun test` so it runs on every CI invocation.
+- **New E2E regression test** at `test/e2e/postgres-jsonb.test.ts` — round-trips all four JSONB write sites against real Postgres and asserts `jsonb_typeof = 'object'` plus `->>` returns the expected scalar. The test that should have caught the original bug.
+- **Wikilink extraction** — `[[page]]` and `[[page|Display Text]]` syntaxes now extracted alongside standard `[text](page.md)` markdown links. Includes ancestor-search resolution for wiki KBs where authors omit one or more leading `../`.
+
+### Migration scope
+
+The repair touches five JSONB columns:
+- `pages.frontmatter`
+- `raw_data.data`
+- `ingest_log.pages_updated`
+- `files.metadata`
+- `page_versions.frontmatter` (downstream of `pages.frontmatter` via INSERT...SELECT)
+
+Other JSONB columns in the schema (`minion_jobs.{data,result,progress,stacktrace}`, `minion_inbox.payload`) were always written via the parameterized `$N::jsonb` form so they were never affected.
+
+### Behavior changes (read this if you upgrade)
+
+`splitBody` now requires an explicit sentinel for timeline content. Recognized markers (in priority order):
+1. `<!-- timeline -->` (preferred — what `serializeMarkdown` emits)
+2. `--- timeline ---` (decorated separator)
+3. `---` directly before `## Timeline` or `## History` heading (backward-compat fallback)
+
+If you intentionally used a plain `---` to mark your timeline section in source markdown, add `<!-- timeline -->` above it manually. The fallback covers the common case (`---` followed by `## Timeline`).
+
+### Attribution
+
+Built from community PRs #187 (@knee5) and #175 (@leonardsellem). The original PRs reported the bugs and proposed the fixes; this release re-implements them on top of the v0.12.0 knowledge graph release with expanded migration scope, schema audit (all 5 affected columns vs the 3 originally reported), engine-aware behavior, CI grep guard, and an E2E regression test that should have caught this in the first place. Codex outside-voice review during planning surfaced the missed `page_versions.frontmatter` propagation path and the noisy-truncated-diagnostic anti-pattern that was dropped from this scope. Thanks for finding the bugs and providing the recovery path — both PRs left work to do but the foundation was right.
+
+Co-Authored-By: @knee5 (PR #187 — splitBody, inferType wiki, JSONB triple-fix)
+Co-Authored-By: @leonardsellem (PR #175 — parseEmbedding, getEmbeddingsByChunkIds fix)
+
 ## [0.12.1] - 2026-04-19
 
 ## **Extract no longer hangs on large brains.**

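Review note on the `splitBody` behavior change: the sentinel priority listed under "Behavior changes" can be sketched roughly as below. This is a minimal illustration, not the code in `src/core/markdown.ts`. The name `splitBodySketch`, the `SplitResult` shape, and the reading of "directly before" as "the next non-blank line" are all assumptions, and the input is assumed to be the body with frontmatter already removed.

```typescript
// Hypothetical sketch of the sentinel rules; not gbrain's actual `splitBody`.
interface SplitResult {
  compiledTruth: string;
  timeline: string;
}

function splitBodySketch(body: string): SplitResult {
  const lines = body.split("\n");
  const trimmed = lines.map((line) => line.trim());

  // Priority 1 and 2: explicit sentinels, matched as whole lines.
  let index = trimmed.indexOf("<!-- timeline -->");
  if (index === -1) index = trimmed.indexOf("--- timeline ---");

  // Priority 3 (assumed reading): a bare `---` whose next non-blank line
  // is a `## Timeline` or `## History` heading.
  if (index === -1) {
    index = trimmed.findIndex((line, i) => {
      if (line !== "---") return false;
      const next = trimmed.slice(i + 1).find((l) => l !== "");
      return next !== undefined && /^## (Timeline|History)\b/.test(next);
    });
  }

  if (index === -1) {
    // No sentinel found: every `---` stays an ordinary horizontal rule.
    return { compiledTruth: body.trim(), timeline: "" };
  }
  return {
    compiledTruth: lines.slice(0, index).join("\n").trim(),
    timeline: lines.slice(index + 1).join("\n").trim(),
  };
}
```

Under these assumptions a body with a bare `---` and no heading after it comes back whole, which is the property the truncation fix depends on.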
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -61,7 +61,10 @@ strict behavior when unset.
 - `src/mcp/server.ts` — MCP stdio server (generated from operations)
 - `src/commands/auth.ts` — Standalone token management (create/list/revoke/test)
 - `src/commands/upgrade.ts` — Self-update CLI. `runPostUpgrade()` enumerates migrations from the TS registry (src/commands/migrations/index.ts) and tail-calls `runApplyMigrations(['--yes', '--non-interactive'])` so the mechanical side of every outstanding migration runs unconditionally.
-- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). All orchestrators are idempotent and resumable from `partial` status.
+- `src/commands/migrations/` — TS migration registry (compiled into the binary; no filesystem walk of `skills/migrations/*.md` needed at runtime). `index.ts` lists migrations in semver order. `v0_11_0.ts` = Minions adoption orchestrator (8 phases). `v0_12_0.ts` = Knowledge Graph auto-wire orchestrator (5 phases: schema → config check → backfill links → backfill timeline → verify). `phaseASchema` has a 600s timeout (bumped from 60s in v0.12.1 for duplicate-heavy brains). `v0_12_2.ts` = JSONB double-encode repair orchestrator (4 phases: schema → repair-jsonb → verify → record). All orchestrators are idempotent and resumable from `partial` status.
+- `src/commands/repair-jsonb.ts` — `gbrain repair-jsonb [--dry-run] [--json]`: rewrites `jsonb_typeof='string'` rows in place across 5 affected columns (pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata, page_versions.frontmatter). Fixes v0.12.0 double-encode bug on Postgres; PGLite no-ops. Idempotent.
+- `src/core/markdown.ts` — Frontmatter parsing + body splitter. `splitBody` requires an explicit timeline sentinel (`<!-- timeline -->`, `--- timeline ---`, or `---` immediately before `## Timeline`/`## History`). Plain `---` in body text is a markdown horizontal rule, not a separator. `inferType` auto-types `/wiki/analysis/` → analysis, `/wiki/guides/` → guide, `/wiki/hardware/` → hardware, `/wiki/architecture/` → architecture, `/writing/` → writing (plus the existing people/companies/deals/etc heuristics).
+- `scripts/check-jsonb-pattern.sh` — CI grep guard. Fails the build if anyone reintroduces the `${JSON.stringify(x)}::jsonb` interpolation pattern (which postgres.js v3 double-encodes). Wired into `bun test`.
 - `docs/UPGRADING_DOWNSTREAM_AGENTS.md` — Patches for downstream agent skill forks (Wintermute etc.) to apply when upgrading. Each release appends a new section. v0.10.3 includes diffs for brain-ops, meeting-ingestion, signal-detector, enrich.
 - `src/core/schema-embedded.ts` — AUTO-GENERATED from schema.sql (run `bun run build:schema`)
 - `src/schema.sql` — Full Postgres + pgvector DDL (source of truth, generates schema-embedded.ts)
@@ -129,6 +132,9 @@ Key commands added for Minions (job queue):
 - `gbrain jobs stats` — job health dashboard
 - `gbrain jobs work [--queue Q] [--concurrency N]` — start worker daemon (Postgres only)
 
+Key commands added in v0.12.2:
+- `gbrain repair-jsonb [--dry-run] [--json]` — repair double-encoded JSONB rows left over from v0.12.0-and-earlier Postgres writes. Idempotent; PGLite no-ops. The `v0_12_2` migration runs this automatically on `gbrain upgrade`.
+
 ## Testing
 
 `bun test` runs all tests. After the v0.12.1 release: ~75 unit test files + 8 E2E test files (1412 unit pass, 119 E2E when `DATABASE_URL` is set — skip gracefully otherwise). Unit tests run
@@ -172,12 +178,16 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
 `test/features.test.ts` (feature scanning, brain_score calculation, CLI routing, persistence),
 `test/file-upload-security.test.ts` (symlink traversal, cwd confinement, slug + filename allowlists, remote vs local trust),
 `test/query-sanitization.test.ts` (prompt-injection stripping, output sanitization, structural boundary),
-`test/search-limit.test.ts` (clampSearchLimit default/cap behavior across list_pages and get_ingest_log).
+`test/search-limit.test.ts` (clampSearchLimit default/cap behavior across list_pages and get_ingest_log),
+`test/repair-jsonb.test.ts` (v0.12.2 JSONB repair: TARGETS list, idempotency, engine-awareness),
+`test/migrations-v0_12_2.test.ts` (v0.12.2 orchestrator phases: schema → repair → verify → record),
+`test/markdown.test.ts` (splitBody sentinel precedence, horizontal-rule preservation, inferType wiki subtypes).
 
 E2E tests (`test/e2e/`): Run against real Postgres+pgvector. Require `DATABASE_URL`.
 - `bun run test:e2e` runs Tier 1 (mechanical, all operations, no API keys). Includes 9 dedicated cases for the postgres-engine `addLinksBatch` / `addTimelineEntriesBatch` bind path — postgres-js's `unnest()` binding is structurally different from PGLite's and gets its own coverage.
 - `test/e2e/search-quality.test.ts` runs search quality E2E against PGLite (no API keys, in-memory)
 - `test/e2e/graph-quality.test.ts` runs the v0.10.3 knowledge graph pipeline (auto-link via put_page, reconciliation, traversePaths) against PGLite in-memory
+- `test/e2e/postgres-jsonb.test.ts` — v0.12.2 regression test. Round-trips all 5 JSONB write sites (pages.frontmatter, raw_data.data, ingest_log.pages_updated, files.metadata, page_versions.frontmatter) against real Postgres and asserts `jsonb_typeof='object'` plus `->>'key'` returns the expected scalar. The test that should have caught the original double-encode bug.
 - `test/e2e/upgrade.test.ts` runs check-update E2E against real GitHub API (network required)
 - Tier 2 (`skills.test.ts`) requires OpenClaw + API keys, runs nightly in CI
 - If `.env.testing` doesn't exist in this directory, check sibling worktrees for one:

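Review note on the CI grep guard: the actual guard is the shell script `scripts/check-jsonb-pattern.sh`, whose contents are not shown in this diff. The TypeScript below is only a hedged rendition of what such a check might look for. The regular expression and the helper name `findJsonbInterpolation` are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real guard is a shell script not shown in this diff.
// Flags a `${JSON.stringify(...)}::jsonb` interpolation that sits on one line.
const DOUBLE_ENCODE_PATTERN = /\$\{\s*JSON\.stringify\([^\n]*?\)\s*\}\s*::jsonb/;

// Returns the 1-based line numbers that contain the risky interpolation.
function findJsonbInterpolation(source: string): number[] {
  return source
    .split("\n")
    .map((line, i) => (DOUBLE_ENCODE_PATTERN.test(line) ? i + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}
```

A line-based check like this would miss an interpolation that spans several lines, so it should be read as a floor on detection rather than a complete audit.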
diff --git a/INSTALL_FOR_AGENTS.md b/INSTALL_FOR_AGENTS.md
@@ -127,7 +127,7 @@ Verify: `gbrain integrations doctor` (after at least one is configured)
 
 ## Step 9: Verify
 
-Read `docs/GBRAIN_VERIFY.md` and run all 6 verification checks. Check #4 (live sync
+Read `docs/GBRAIN_VERIFY.md` and run all 8 verification checks. Check #4 (live sync
 actually works) is the most important.
 
 ## Upgrade
@@ -145,3 +145,10 @@ this is how features ship in the binary but stay dormant in the user's brain.
 For v0.12.0+ specifically: if your brain was created before v0.12.0, run
 `gbrain extract links --source db && gbrain extract timeline --source db` to
 backfill the new graph layer (see Step 4.5 above).
+
+For v0.12.2+ specifically: if your brain is Postgres- or Supabase-backed and
+predates v0.12.2, the `v0_12_2` migration runs `gbrain repair-jsonb`
+automatically during `gbrain post-upgrade` to fix the double-encoded JSONB
+columns. PGLite brains no-op. If wiki-style imports were truncated by the old
+`splitBody` bug, run `gbrain sync --full` after upgrading to rebuild
+`compiled_truth` from source markdown.
diff --git a/README.md b/README.md
@@ -536,6 +536,7 @@ ADMIN
   gbrain integrations                   Integration recipe dashboard
   gbrain check-backlinks check|fix      Back-link enforcement
   gbrain lint [--fix]                   LLM artifact detection
+  gbrain repair-jsonb [--dry-run]       Repair v0.12.0 double-encoded JSONB (Postgres)
   gbrain transcribe <audio>             Transcribe audio (Groq Whisper)
   gbrain research init <name>           Scaffold a data-research recipe
   gbrain research list                  Show available recipes

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-0.12.1
+0.12.2
diff --git a/docs/GBRAIN_VERIFY.md b/docs/GBRAIN_VERIFY.md
@@ -224,6 +224,43 @@ heuristics won't find them — file an issue with a sample page.
 
 ---
 
+## 8. JSONB Frontmatter Integrity (v0.12.2)
+
+Postgres-backed brains created before v0.12.2 had double-encoded JSONB columns
+(`frontmatter->>'key'` returned NULL, GIN indexes were inert). `gbrain upgrade`
+runs `gbrain repair-jsonb` automatically via the `v0_12_2` orchestrator.
+Verify the repair succeeded.
+
+**Command:**
+
+```bash
+gbrain repair-jsonb --dry-run --json
+```
+
+**Expected:** `totalRepaired: 0` across all 5 columns (`pages.frontmatter`,
+`raw_data.data`, `ingest_log.pages_updated`, `files.metadata`,
+`page_versions.frontmatter`). A zero count means every row holds properly typed
+JSON, not string-encoded JSON.
+
+**If the count is > 0:** The repair didn't run or was interrupted. Re-run
+without `--dry-run`:
+
+```bash
+gbrain repair-jsonb
+```
+
+Idempotent. PGLite brains always report 0 (unaffected by the original bug).
+
+**Bonus check** — frontmatter-keyed queries actually resolve:
+
+```bash
+gbrain call list_pages '{"frontmatterKey": "type", "frontmatterValue": "person"}'
+```
+
+If this returns rows on a brain with person pages, the JSONB path is healthy.
+
+---
+
 ## Quick Verification (all checks in one pass)
 
 ```bash
@@ -247,7 +284,10 @@ gbrain check-update --json
 
 # 7. Knowledge graph populated (links + timeline > 0)
 gbrain stats | grep -E 'links|timeline'
+
+# 8. JSONB integrity (v0.12.2 — Postgres only, PGLite always 0)
+gbrain repair-jsonb --dry-run --json
 ```
 
-If all seven return successfully, the installation is healthy. For the full
+If all eight return successfully, the installation is healthy. For the full
 end-to-end sync test (4c), push a real change and verify it appears in search.
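
Review note on check 8: the difference between "properly typed JSON" and "string-encoded JSON" can be reproduced without a database. The sketch below assumes the bug amounts to a second `JSON.stringify` over already-stringified text, and models the `jsonb_typeof = 'string'` guard as "the stored document parses to a string". The helpers `doubleEncode`, `needsRepair`, and `repairValue` are hypothetical names; the real repair runs in SQL inside `gbrain repair-jsonb`.

```typescript
// Hypothetical illustration; none of these helpers are gbrain's actual code.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simulates the bug: the caller stringifies, then the driver stringifies again.
function doubleEncode(value: Json): string {
  const callerSide = JSON.stringify(value); // JSON text for the object
  return JSON.stringify(callerSide);        // a JSON *string* wrapping that text
}

// Mirrors the idea of the `jsonb_typeof = 'string'` guard: the stored
// document parses to a string rather than to an object or array.
function needsRepair(stored: string): boolean {
  return typeof JSON.parse(stored) === "string";
}

// Unwraps string layers until a non-string value (or unparseable text) remains.
function repairValue(stored: string): Json {
  let current: Json = JSON.parse(stored);
  while (typeof current === "string") {
    try {
      current = JSON.parse(current);
    } catch {
      break; // a genuine string that is not encoded JSON; keep it as-is
    }
  }
  return current;
}
```

Running `repairValue` on an already-correct value returns it unchanged, which is the same idempotency property the migration claims. One caveat of this simplified version: a legitimately stored string such as `"123"` would be unwrapped to a number, so it only suits columns that are expected to hold objects or arrays.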
diff --git a/docs/UPGRADING_DOWNSTREAM_AGENTS.md b/docs/UPGRADING_DOWNSTREAM_AGENTS.md
@@ -177,6 +177,74 @@ Timeline entries still need explicit `gbrain timeline-add` calls.
    ```
    Should return an indented tree of typed edges.
 
+---
+
+## v0.12.2 hotfix (data-correctness, no skill edits)
+
+v0.12.2 is a Postgres data-correctness hotfix. No forked skill files need to
+change — the skill contracts are unchanged. But you DO need to run the migration,
+and you should know about one behavior change in markdown parsing.
+
+### 1. Run the migration (Postgres-backed brains)
+
+```bash
+gbrain upgrade
+```
+
+The `v0_12_2` orchestrator runs `gbrain repair-jsonb` automatically. It rewrites
+rows where `jsonb_typeof = 'string'` across `pages.frontmatter`, `raw_data.data`,
+`ingest_log.pages_updated`, `files.metadata`, and `page_versions.frontmatter`.
+Idempotent, safe to re-run. PGLite brains no-op cleanly.
+
+Verify after upgrade:
+
+```bash
+gbrain repair-jsonb --dry-run --json    # expect totalRepaired: 0
+```
+
+### 2. Recover any truncated wiki articles
+
+If your brain imported wiki-style markdown before v0.12.2, some pages were
+silently truncated (any standalone `---` in body content was treated as a
+timeline separator). Re-import from source:
+
+```bash
+gbrain sync --full
+```
+
+The new `splitBody` rebuilds `compiled_truth` correctly.
+
+### 3. Know the splitBody contract going forward
+
+`splitBody` now requires an explicit timeline sentinel. Recognized markers
+(priority order):
+
+1. `<!-- timeline -->` (preferred — what `serializeMarkdown` emits)
+2. `--- timeline ---` (decorated separator)
+3. `---` directly before `## Timeline` or `## History` heading (backward-compat)
+
+A bare `---` in body text is now a markdown horizontal rule, not a timeline
+separator. If your agent writes pages with a bare `---` delimiter, migrate to
+`<!-- timeline -->` — the `serializeMarkdown` helper already does this.
+
+### 4. Wiki subtypes now auto-typed
+
+`inferType` now auto-detects five additional directory patterns as their own
+page types (previously they all defaulted to `concept`):
+
+| Path pattern           | New type       |
+|------------------------|----------------|
+| `/wiki/analysis/`      | `analysis`     |
+| `/wiki/guides/`        | `guide`        |
+| `/wiki/hardware/`      | `hardware`     |
+| `/wiki/architecture/`  | `architecture` |
+| `/writing/`            | `writing`      |
+
+If your skills or queries filter by `type=concept` and expect wiki content in
+that bucket, update them to include the new types.
+
+---
+
 ## Future versions
 
 When gbrain ships a new version, this doc will be updated with the diffs for that

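Review note on section 4: the path-to-type table can be read as a simple first-match lookup with `concept` as the default. The sketch below covers only the five patterns in the table. `inferWikiType` and `WIKI_TYPE_RULES` are hypothetical names, substring matching is an assumption, and the existing people/companies/deals heuristics of the real `inferType` are left out.

```typescript
// Hypothetical sketch; the real `inferType` also handles people, companies, deals, etc.
const WIKI_TYPE_RULES: ReadonlyArray<[string, string]> = [
  ["/wiki/analysis/", "analysis"],
  ["/wiki/guides/", "guide"],
  ["/wiki/hardware/", "hardware"],
  ["/wiki/architecture/", "architecture"],
  ["/writing/", "writing"],
];

function inferWikiType(path: string): string {
  // Normalize separators and guarantee a leading slash so patterns match
  // both "wiki/guides/x.md" and "/wiki/guides/x.md".
  const normalized = "/" + path.replace(/\\/g, "/").replace(/^\/+/, "");
  for (const [pattern, type] of WIKI_TYPE_RULES) {
    if (normalized.includes(pattern)) return type;
  }
  return "concept"; // the pre-v0.12.2 default for all of these paths
}
```

A downstream query that previously filtered on `concept` alone would, under this mapping, need to list the five new types as well to see the same pages.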
diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "gbrain",
-  "version": "0.12.0",
+  "version": "0.12.2",
   "description": "Postgres-native personal knowledge brain with hybrid RAG search",
   "type": "module",
   "main": "src/core/index.ts",
@@ -20,8 +20,9 @@
     "build": "bun build --compile --outfile bin/gbrain src/cli.ts",
     "build:all": "bun build --compile --target=bun-darwin-arm64 --outfile bin/gbrain-darwin-arm64 src/cli.ts && bun build --compile --target=bun-linux-x64 --outfile bin/gbrain-linux-x64 src/cli.ts",
     "build:schema": "bash scripts/build-schema.sh",
-    "test": "bun test",
+    "test": "scripts/check-jsonb-pattern.sh && bun test",
     "test:e2e": "bun test test/e2e/",
+    "check:jsonb": "scripts/check-jsonb-pattern.sh",
     "postinstall": "gbrain --version >/dev/null 2>&1 && gbrain apply-migrations --yes --non-interactive 2>/dev/null || true",
     "prepublish:clawhub": "bun run build:all",
     "publish:clawhub": "clawhub package publish . --family bundle-plugin"