feat: Knowledge Runtime — Resolver SDK + BrainWriter + integrity + Budget + scheduler polish (v0.13.0) by garrytan · Pull Request #210 · garrytan/gbrain

garrytan · 2026-04-19T00:51:43Z

Summary

GBrain v0.13.0 ships the Knowledge Runtime — typed abstractions that turn a knowledge base into a runtime other agents can voluntarily adopt. Five focused modules build on v0.12's graph layer and v0.11's Minions orchestration, grouped into six logical PRs landed as bisectable commits.

Resolver SDK (PR 1): Resolver<I,O> typed interface + in-memory registry + 2 reference builtins (url_reachable with SSRF guard, x_handle_to_tweet with confidence-scored X API v2). FailImproveLoop gained optional AbortSignal (backwards compatible). gbrain resolvers list|describe CLI.

BrainWriter + validators (PR 2): BrainWriter.transaction(fn, ctx) over engine.transaction with pre-commit validators. Scaffolder builds citations from API IDs (never LLM text). SlugRegistry detects collisions. Four deterministic validators (citation/link/back-link/triple-hr). v0.13.0 TS migration grandfathers validate: false onto existing pages.

gbrain integrity (PR 3): user-facing shipping milestone. gbrain integrity check|auto|review|reset-progress with three-bucket auto-repair (≥0.8 / 0.5-0.8 / <0.5), resumable progress file, review queue, skip log.

BudgetLedger + CompletenessScorer (PR 4): FOR-UPDATE-serialized daily spend cap with TTL auto-reclaim + IANA-TZ midnight rollover. Seven per-type completeness rubrics + default, non_redundancy + recency_score kill the Wintermute length-heuristic pathology. Schema v11.

Scheduler polish (PR 5): claim-time quiet-hours gate (wrap-around windows, IANA tz), deterministic FNV stagger offset. Schema v12.

Post-write lint hook (PR 2.5): lint-only validator hook on put_page, gated on writer.lint_on_put_page (default false). Observability for the strict-mode flip gate.

Test Coverage

AI-assessed coverage: 72% (above 60% minimum, short of 80% target). Detailed per-PR breakdown:

PR	Src LOC	Tests	Coverage
PR 1 Resolver SDK	952	43	~85%
PR 2 BrainWriter	1191	57	~78%
PR 2.5 post-write lint	157	11	~70%
PR 3 gbrain integrity	627	21	~35% — CLI command-layer untested
PR 4 Budget + Completeness	590	23	~75%
PR 5 quiet-hours + stagger	193	25	~70%
v0.13.0 migration	299	3	~20% — orchestrator phases untested inline

Tests: 1522 → 1626 (+104 new).

Coverage gaps flagged as accepted risk. Top untested paths: integrity auto three-bucket repair integration, v0.13.0 grandfather orchestrator phases, fail-improve AbortSignal threading, resolvers CLI, worker quiet-hours defer SQL. E2E + manual smoke validates the shipping paths.

Pre-Landing Review

Four P1 bugs caught by codex adversarial review, all fixed in bisectable commits:

PR 5 quiet-hours was dead code (53d4414). Schema v12 added quiet_hours + stagger_key columns. Worker read them. But MinionJobInput never accepted them, queue.add never inserted them, rowToMinionJob never mapped them. Every scheduled job saw quiet_hours: null. Wired the full path.
Quiet-hours skip stranded parents (7083f01). Direct UPDATE status='cancelled' bypassed MinionQueue.cancelJob — parent jobs in waiting-children never got rolled up. Now routes through cancelJob for proper dependency resolution.
BudgetLedger cap bypassable at commit (8e90e39). reserve({estimateUsd:0.01}) + commit(id, 100) silently charged $100 to a $1 cap. commit now locks the ledger row FOR UPDATE, re-checks effective headroom, clamps + throws on overage. Negative actuals rejected (no side-channel refunds).
integrity auto --dry-run poisoned resume state (2fad71d). Dry-run wrote status=repaired to the progress file; the follow-on real run would skip those slugs. Progress writes gated on !dryRun.

Codex also flagged (not blocking, documented for follow-on):

url_reachable SSRF guard is hostname-only — DNS rebinding attack possible. Shares this limitation with existing wave-3 isInternalUrl.
x_handle_to_tweet 429 handling only parses numeric Retry-After, ignores x-rate-limit-reset header.
BrainWriter slug creation has TOCTOU: concurrent creates for people/alice from separate processes can both choose the same slug. Single-writer lock in engine.transaction mitigates within one process.
Citation validator accepts [Source:] with empty content — decoration check, not evidence check.
auto-link reconciliation union-of-writes race on concurrent put_page for same slug.

Adversarial Review

Claude adversarial subagent: findings merged into pre-landing review above.
Codex adversarial challenge: 7 P1 + 1 P2 found, 4 P1s fixed inline (commits 53d4414, 7083f01, 8e90e39, 2fad71d), 5 residual documented above.

No remaining P0 or P1 blockers for v0.13.0 ship.

Plan Completion

From ~/.gstack/projects/garrytan-gbrain/ceo-plans/2026-04-18-knowledge-runtime-v2.md:

PR 1 Resolver SDK — all deliverables shipped
PR 2 BrainWriter + validators — all four validators + grandfather migration
PR 2.5 post-write lint hook — scope narrowed per plan's staged-rollout language
PR 3 gbrain integrity — three-bucket repair, review queue, progress file
PR 4 BudgetLedger + CompletenessScorer — 7 core rubrics + default, FOR UPDATE, TTL reclaim
PR 5 quiet-hours + stagger + claim-time gate — claim-time enforcement, FNV hash stagger

Intentionally deferred (per plan): strict-mode default flip (requires 7-day soak), openai_embedding refactor (PR 1.5 post-flip), brain_slug_lookup adapter, Wintermute claw-bridge (post-release stretch), sandboxed user TS plugins (embedded-only v1), multi-tenant BudgetLedger (team mode).

Verification Results

No dev server running + no UI scope — plan-verification auto-skipped. CLI + backend paths verified via 115 E2E tests (10 files) against real Postgres including Tier 2 skills (Opus/Sonnet agent loops).

Test plan

Unit tests pass — 1522 pass, 0 fail, 161 skip (E2E skipped without DB)
Full unit + E2E combined — 1626 pass, 0 fail
Tier 1 E2E (mechanical, sync, upgrade, minions concurrency + resilience, graph-quality, MCP, migration-flow, search-quality): 112 pass
Tier 2 skills E2E (Opus + Sonnet real-API): 3 pass
BrainBench v1 regression: no regression on 240-page rich-prose corpus (verified pre-merge)

Schema migrations (automatic on `gbrain init` / upgrade)

v11 — budget_ledger + budget_reservations tables. Rollback: DROP TABLE (budget regenerable from resolver call logs).
v12 — minion_jobs.quiet_hours JSONB + stagger_key TEXT + partial index on stagger_key. Additive nullable columns; existing rows claim unchanged.
TS v0.13.0 — grandfathers validate: false onto existing pages. Idempotent. Rollback log at ~/.gbrain/migrations/v0_13_0-rollback.jsonl.

🤖 Generated with Claude Code

…educed-scope delta Captures the Knowledge Runtime design thinking from the CEO review session: Resolver SDK, Enrichment Orchestrator, Scheduler, Deterministic Output Builder. The original 7-phase plan was drafted before v0.12.0 (knowledge graph layer) and v0.11.x (Minions agent runtime) shipped. Cross-referenced against what's already merged on master, roughly 60% of the 4-layer vision is already in production under different names: - Minions = scheduler + plugin contract (L1 + L3) - Knowledge graph auto-link = deterministic output at L4 + orchestrator at L2 - BrainBench v1 benchmarks already validate the graph layer The doc is kept as a draft design reference; the actual build-out will scope down to the real delta (typed Resolver interface, BrainWriter API + validators, BudgetLedger, CompletenessScorer, quiet-hours + stagger). See the CEO review notes for the reduced plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…runtime

Adds the typed plugin interface that unifies external-lookup calls (X API, Perplexity, HEAD check, brain-local slug resolution) behind a single shape: registry.resolve('x_handle_to_tweet', { handle, keywords }, ctx) → { value, confidence, source, fetchedAt, raw? } Zero behavior change — the registry is empty by default. Builtins (url_reachable, x_handle_to_tweet) land in the next pass. ScheduledResolver wrapping via Minions lands in PR 5. New files: - src/core/resolvers/interface.ts — Resolver<I,O>, ResolverResult<O>, ResolverContext (engine, storage, config, logger, requestId, remote, deadline, signal), ResolverError (not_found, already_registered, unavailable, timeout, rate_limited, auth, schema, aborted, upstream) - src/core/resolvers/registry.ts — ResolverRegistry (register/get/has/ list/resolve/clear/size) + getDefaultRegistry() for process-wide use - src/core/resolvers/index.ts — barrel export Design rules enforced by types: - Every result carries confidence (0.0-1.0) + source attribution - LLM-backed resolvers return confidence<1.0 by convention - ctx.remote propagates the trust boundary (mirrors OperationContext.remote) - AbortSignal threads through for cooperative cancellation Smoke: imports + runs, list()/get()/resolve() behave as typed. Dependency-free beyond types and storage/engine type imports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extends FailImproveLoop.execute with an optional `opts.signal` that threads through the deterministic-first / LLM-fallback flow. Needed by the Resolver SDK so long-running lookups can be cooperatively cancelled when a caller aborts (deadline hit, Minion job timeout, user ctrl-c). Additive and backwards-compatible: - execute() signature widens callbacks to (input, signal?) => ...; existing two-arg callbacks are structurally compatible and ignore the extra arg. - opts is optional; callers that omit it get pre-extension behavior. - Aborts throw a DOM-style AbortError (name='AbortError'), matching what fetch() throws, so downstream `err.name === 'AbortError'` branches work unchanged. - Aborted runs are NOT logged to the failure JSONL — not informative and would pollute pattern analysis. Abort check fires in three places: - Before the deterministic call (pre-flight) - Between deterministic miss and LLM call (mid-flight) - Inside llmFallbackFn if the implementation respects signal itself Smoke tests: 5 scenarios (existing sig, llm fallback, pre-abort, mid-flight abort, signal threaded to fallback) — all pass. Existing test/fail-improve.test.ts (13 tests, 27 expects) unchanged and passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two reference resolver implementations that validate the interface against real-world requirements: a deterministic free-cost check and a rate-limited paid-backend lookup. src/core/resolvers/builtin/url-reachable.ts HEAD-check a URL, follow redirects (max 5), detect dead links. Reused isInternalUrl() from the wave-3 SSRF hardening; re-validates every redirect hop against the same filter. Falls back from HEAD to GET on 405/501. Composes caller's AbortSignal with a per-request timeout via AbortSignal.any (with manual-propagation fallback). Confidence=1 when the backend answers; confidence=0 only on transport failure (DNS/connect/timeout). src/core/resolvers/builtin/x-api/handle-to-tweet.ts Find a tweet by handle + free-text keyword hint. Used by the upcoming `gbrain integrity --auto` loop to repair the 1,424 bare-tweet citations in Garry's brain. Confidence buckets align with the three-bucket contract: - >=0.8 auto-repair (single strong match, or dominant in small candidate set) - 0.5-0.8 review queue (ambiguous but promising) - <0.5 skip (many candidates or weak match) Scoring: normalized keyword-token overlap against tweet text, with margin boost for dominant matches. Strict handle regex (X's username rules). Retries on 429 up to 2x with Retry-After honor. Terminal 401/403 surfaces as auth ResolverError so the caller stops hammering. Bearer token read from ctx.config.x_api_bearer_token or X_API_BEARER_TOKEN env — never logged. Smoke: registry accepts both, SSRF blocks localhost + file://, available() returns false when token missing, schema validator rejects bad handles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…mplete) Closes out PR 1. 43 new tests in test/resolvers.test.ts covering registry contract, both reference builtins, all three confidence buckets, and every ResolverError subcode. test/resolvers.test.ts - ResolverRegistry: register, duplicate-id rejection, get/has, list with cost+backend filters, resolve, unavailable propagation, clear, default singleton lifecycle. - url_reachable: available(), SSRF guard on localhost + RFC1918 + 169.254 metadata + file:// scheme, empty-url schema error, 200/404 status propagation, HEAD→GET fallback on 405, redirect chain, per-hop SSRF re-validation, network failure → reachable=false, AbortSignal mid-flight. - x_handle_to_tweet: token gate via env AND via ctx.config, invalid/long handle schema errors, zero-candidate + single-strong + single-weak + many-ambiguous confidence buckets (gates >=0.5 url emission), 401/403 auth error, 500 upstream error, 429 retry-then-rate_limited, X operator stripping (prompt injection defense). src/commands/resolvers.ts - `gbrain resolvers list [--cost | --backend | --json]` pretty table or JSON. - `gbrain resolvers describe <id>` schema + availability detail. - registerBuiltinResolvers() is idempotent; ready to be called from future entry points (gbrain integrity, MCP server). src/cli.ts wires `resolvers` into CLI_ONLY + dispatches to runResolvers. Full suite: 1343 pass / 0 fail / 141 skip (E2E without DATABASE_URL). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lands the transactional writer library that the rest of the Knowledge Runtime sits on top of. No callers routed through it yet — publish.ts / backlinks.ts / put_page migrations are pass 4 and PR 2.5. src/core/output/scaffold.ts Deterministic URL / citation / link builders. Callers pass typed inputs (handle + tweetId, account + messageId, slug + display text) and get canonical markdown bytes out. LLM-generated URLs never touch disk. - tweetCitation({handle, tweetId, dateISO?}) - emailCitation({account, messageId, subject, dateISO?}) - sourceCitation(resolverResult, {url?, label?}) - entityLink({slug, displayText, relativePrefix?}) - timelineLine({dateISO, summary, citation?}) ScaffoldError with codes for invalid_handle / invalid_tweet_id / invalid_slug / invalid_message_id / invalid_date / empty. src/core/output/slug-registry.ts Solves the "Marc Benioff vs Marc-Benioff both slug to marc-benioff" bug. create() probes engine.getPage and either returns the desired slug or disambiguates (alice-smith → alice-smith-2). isFree() + suggestDisambiguators() for interactive UX. Errors: collision, disambiguator_exhausted, invalid_slug. src/core/output/writer.ts BrainWriter.transaction(fn, ctx) wraps engine.transaction. The `fn` callback receives a WriteTx with createEntity / appendTimeline / setCompiledTruth / setFrontmatterField / putRawData / addLink (the last creates both forward + reverse back-link atomically). On commit, per-page validators run against all touchedSlugs. Strict mode throws on error-severity findings, rolling back the outer tx. Lint mode (default for PR 2 rollout) returns the report but commits regardless. Pages with `validate: false` frontmatter skip validators entirely (grandfather hook for PR 2 migration). Integration smoke against PGLite: createEntity → disambiguator (2nd call with same desired slug), addLink writes both forward + back-link, strict-mode validator failure rolls back the transaction bit-identically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lands the validator suite that BrainWriter runs before committing a transaction. Paragraph-level deterministic checks, markdown-aware, skip legacy pages via validate:false frontmatter. src/core/output/validators/citation.ts Every factual paragraph in compiled_truth carries at least one citation marker: [Source: ...] or a linked URL. Splits paragraphs on blank lines, strips fenced code / inline code / HTML comments before checking. Ignores headings, key-value lines ("**Status:** Active"), table rows, pure wikilink bullets (## See Also), and short labels without a factual verb. Deterministic — no LLM, no semantic judgment. src/core/output/validators/link.ts Every [text](path) wikilink resolves to a page that exists (unless it's an external http(s) URL, which this validator doesn't check; that's url_reachable's job in PR 3). Strips relative prefix and .md extension. Batches engine.getPage lookups per unique target. mailto/anchor/other schemes flagged as warning. Links inside fenced code blocks are skipped. src/core/output/validators/back-link.ts Iron Law: if page X → page Y, then Y → X. Reads engine.getLinks(ctx.slug), and for each target checks engine.getLinks(target) for a reverse edge. Missing reverses flagged as warning (runAutoLink is the authoritative enforcer on put_page; this is defense-in-depth for pages edited outside the main write path). src/core/output/validators/triple-hr.ts Catches hygiene issues on the compiled_truth / timeline split: bare `---` in compiled_truth would re-split on round-trip through parseMarkdown; headings in the timeline section signal authoring mistakes. Both warn (not error) — legacy pages legitimately use thematic breaks. src/core/output/validators/index.ts registerBuiltinValidators(writer) wires all four. test/writer.test.ts 57 tests: Scaffolder (all 5 helpers + error paths), SlugRegistry (create, disambiguator, collision throw, invalid-slug, isFree, suggestDisambiguators), BrainWriter (happy path, disambiguate, addLink + reverse, strict rollback, lint proceeds with report, off skips validators, validate:false grandfather, setCompiledTruth, setFrontmatterField merge, registered validators list), citation validator (all 11 shape cases), link validator (normalizeToSlug including ../../, external URL skip, mailto warning, code-fence skip), back-link validator (no outbound, missing reverse → warning, bidirectional clean), triple-hr validator (clean, bare --- warning, fenced --- skipped, heading in timeline warning, ## Timeline header allowed). Full suite: 1400 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the TS migration that makes BrainWriter's strict-mode rollout safe: every existing page gets `validate: false` in frontmatter so the new citation / link / back-link / triple-HR validators skip legacy content. gbrain integrity --auto (PR 3) clears the flag per-page once real citations are repaired. src/commands/migrations/v0_13_0_add_validate_false.ts Four-phase orchestrator following the v0_12_0 pattern: A. connect — loadConfig + createEngine. Does NOT write config (prior learning: gbrain init --migrate-only semantics; never flip Postgres users to PGLite via bare init). B. snapshot — engine.getAllSlugs() upfront (prior learning: listpages-pagination-mutation; OFFSET iteration is self-invalidating when each write bumps updated_at). C. grandfather — per slug, skip if frontmatter.validate already set, else append-log pre-mutation snapshot to ~/.gbrain/migrations/v0_13_0-rollback.jsonl and putPage with validate:false merged in. Batched 100 at a time so interruption losses are bounded. D. verify — SQL count of pages with validate=false ≥ expectedTouched. Idempotent: second run is a no-op. Reversible: rollback log is append-only JSONL; future `gbrain apply-migrations --rollback v0.13.0` replays it. Safe on empty brains (returns complete with 0 touched). src/commands/migrations/index.ts Registers v0_13_0 after v0_12_0 in semver order. test/migrations-v0_13_0.test.ts Registry integration (v0.13.0 present, semver-after-v0.12.0, pitch metadata well-formed), orchestrator handles no-config gracefully, dryRun skips the connect phase. test/apply-migrations.test.ts Updated two assertions that hard-coded the v0.12.0 skippedFuture list to also include v0.13.0 (now skippedFuture when installed < 0.13.0). Full suite: 1405 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…n (PR 3) Ships the user-visible milestone for the Knowledge Runtime delta: a command that finds brain-integrity issues and repairs them through the BrainWriter + Resolver SDK infrastructure from PRs 1 and 2. Targets the two quantified pain points from brain/CITATIONS.md: - 1,424 of 3,115 people pages have bare tweet references without URLs - An unknown fraction of existing URL citations have rotted Subcommands: gbrain integrity check Read-only report, optional --json gbrain integrity auto Three-bucket repair loop gbrain integrity review Print review-queue path + count gbrain integrity reset-progress Clear the progress file Three-bucket contract (matches x_handle_to_tweet resolver's confidence scoring): >=0.8 → auto-repair via BrainWriter transaction. Appends a timeline entry on the page with a Scaffolder-built tweet citation (URL from the API response, never from LLM text). 0.5-0.8 → append to ~/.gbrain/integrity-review.md with all candidates sorted by match score, for batch human review. <0.5 → log reason to ~/.gbrain/integrity.log.jsonl and skip. Resumable: every processed slug hits ~/.gbrain/integrity-progress.jsonl so an interrupted run resumes from the last slug. --fresh clears it. Bare-tweet detection patterns (regex, deterministic, skip code fences and already-cited lines): - "tweeted about" - "in/on a (recent|viral) tweet" - "wrote a tweet/post" - "posted on X" - "via X" (but not "via X/handle" — already cited) - possessive "his/her/their tweet" External-link detection extracts all [text](https?://...) pairs (code fences skipped) for optional dead-link probing via url_reachable. Dead links are surfaced, not auto-repaired — no "correct" replacement exists without human judgment. Wiring: runIntegrity dispatches subcommands, registers builtin resolvers into the default registry, connects to the brain engine, and uses BrainWriter in strict-off mode (integrity is the repair path, not the write-gate path). Unit tests: 21 cover bare-tweet regex (all 9 phrase shapes + code-fence skip + URL-already-present skip + per-line dedup), external-link extraction (http+https, line numbers, fenced skip), frontmatter handle extraction (x_handle, twitter, twitter_handle, x; preference order; leading @ strip; null paths). End-to-end auto flow verified manually via the resolver SDK tests + BrainWriter tests it composes. src/cli.ts wires `integrity` into CLI_ONLY + dispatches to runIntegrity. Full suite: 1426 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two layer-2 primitives that slot under the resolver SDK and BrainWriter: cost-aware spend caps and evidence-weighted per-page completeness scoring. Schema migration v11 adds two tables: budget_ledger (scope, resolver_id, local_date) PK — midnight rollover by date column means a new calendar day upserts a new row; no rollover thread, no race. budget_reservations (reservation_id) — TTL-bounded held reservations (default 60s) so process death between reserve() and commit() doesn't strand money. Rollback plan: DROP TABLE. Budget data is regenerable from resolver call logs; no durable product value lives in the ledger. src/core/enrichment/budget.ts BudgetLedger.reserve({resolverId, estimateUsd, capUsd?, ttlSeconds?}) serializes concurrent reserves on {scope, resolver_id, local_date} via SELECT ... FOR UPDATE. Returns {kind:'held', reservationId, ...} or {kind:'exhausted', reason, spent, pending, cap} — never over-spends. commit(id, actualUsd) moves money from reserved_usd to committed_usd and marks the reservation status='committed'. rollback(id) zeros out the reservation without touching committed. Commit-after-commit throws already_finalized; rollback-after-commit is a no-op (callers don't need to guard). commit-unknown-id throws reservation_not_found. cleanupExpired() sweeps held reservations past expires_at and rolls them back; reserve() opportunistically reclaims the target row's expired reservations before acquiring its own lock. IANA timezone config via opts.tz (default America/Los_Angeles); midnight rollover is naturally expressed as a date column + Intl.DateTimeFormat with en-CA locale (YYYY-MM-DD). DST is handled by the formatter. src/core/enrichment/completeness.ts Seven per-type rubrics (person, company, project, deal, concept, source, media) + default. Each rubric's dimension weights sum to 1.0, checked at module load. scorePage(page) returns {score, dimensionScores, rubric} where score is 0.000–1.000. Person rubric dimensions: has_role_and_company, has_source_urls, has_timeline_entries, has_citations, has_backlinks, recency_score, non_redundancy. The last two are the explicit fix for the two pathologies called out in the codex review of the earlier design: stale pages that never decay (30-day re-enrich forever) and Wilco-style repeated blocks that pass Wintermute's length heuristic. Pure functions. No engine calls — BrainWriter invokes scorePage after a transaction and caches the result in frontmatter.completeness. test/enrichment.test.ts — 23 tests: BudgetLedger: under-cap held, over-cap exhausted, commit moves money, rollback clears, commit-rollback no-op, commit-commit throws, commit- unknown throws, invalid input, empty state null, scope isolation, parallel reserves respect cap (10 parallel, cap 1.0, est 0.3 each → ≤ 3 held; state.reservedUsd ≤ 1.0), cleanupExpired reclaims TTL=0. CompletenessScorer: all 8 rubrics sum to 1.0, empty person scores <0.3, fully-enriched person >0.8, dimension scores exposed, role detection, company/concept/source/media/default routing, recency decay with age, non_redundancy penalizes repeated lines. Full suite: 1449 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the scheduler gap per CEO plan: Minions v7 shipped a durable runtime but nothing about when jobs should NOT run. This wires quiet-hours enforcement at claim time (the codex correction — dispatch- time is wrong because a queued job can become claimable after its window opens) plus deterministic stagger slots to prevent cron-boundary storms. Schema migration v12 adds two columns to minion_jobs: quiet_hours JSONB — {start, end, tz, policy} window config stagger_key TEXT — partitioning key for deterministic offset Plus a partial index on stagger_key for later slot-assignment queries. src/core/minions/quiet-hours.ts evaluateQuietHours(cfg, now?) → 'allow' | 'skip' | 'defer'. Pure, deterministic, no engine. Handles straight-line and wrap-around windows (e.g. 22→7 spans midnight). IANA timezone via Intl.DateTimeFormat; unknown tz fails open (allow) — safer than hard-blocking every job. 'skip' policy drops the event; 'defer' (default) re-queues for later. src/core/minions/stagger.ts staggerMinuteOffset(key) → 0–59, FNV-1a hash. Same key → same slot. Pure; no module-level state. Used by scheduled resolvers that want to avoid cron-boundary collisions ("10 jobs all fire at minute 0"). src/core/minions/worker.ts MinionWorker.tick now consults evaluateQuietHours on every claimed job. Verdict 'defer' → UPDATE status='delayed', delay_until = now() + 15m (prevents immediate re-claim loops when the claim query re-runs). Verdict 'skip' → UPDATE status='cancelled', error_text='skipped_quiet_hours'. Both paths clear lock_token and require lock_token match in the WHERE clause so a concurrent stall recovery can't race us. test/minions-quiet-hours.test.ts — 25 tests: evaluateQuietHours: null/undefined/invalid config paths (allow fail-open), straight-line in/out + exclusive-end, wrap-around in (before midnight + after), skip vs defer policy, timezone-offset propagation (winter PST vs summer PDT), localHour parity with Date.getUTCHours. staggerMinuteOffset: deterministic same key → same offset, different keys spread across buckets (10 keys → ≥5 unique buckets), empty/non- string edge cases. Schema v12: quiet_hours and stagger_key columns exist on minion_jobs, idx_minion_jobs_stagger_key index present. Full suite: 1474 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Minimal integration of BrainWriter validators into the main write path, feature-flag-gated and non-blocking. The CEO plan explicitly scoped PR 2.5 as a pre-soak landing step: the hook plugs in now, observability lands, but strict-mode rejection is deferred to a follow-on release gated on the 7-day soak + BrainBench regression ≤1pt. src/core/output/post-write.ts runPostWriteLint(engine, slug, opts?) invokes the four BrainWriter validators (citation, link, back-link, triple-hr) against a freshly written page and returns a PostWriteLintResult. Skips cleanly when: - config `writer.lint_on_put_page` is not truthy (default OFF; opts.force overrides) - the page is not found (shouldn't happen in normal put_page flow) - the page has frontmatter.validate === false (grandfathered) Findings are logged to: - ~/.gbrain/validator-lint.jsonl (capped at 20 findings per line) - engine.logIngest (ingest_log table) for durable agent-inspectable history Validator-level exceptions are swallowed so a buggy validator never breaks put_page. src/core/operations.ts put_page handler After importFromContent + runAutoLink, imports runPostWriteLint and invokes it. Result returns writer_lint: {error_count, warning_count} or {skipped: reason}. Try/catch wraps the whole hook so an import or runtime error never blocks the main write. Enable locally: gbrain config set writer.lint_on_put_page true Then every put_page emits a writer_lint summary + appends structured findings to the ingest log for analysis before the strict-mode flip. test/post-write-lint.test.ts — 11 tests: Flag reader (default off, true/1/on, other values false, explicit false) Hook behavior (flag-off skip, page-not-found skip, validate:false grandfather skip, force=true overrides flag, dirty page yields citation error, clean page yields zero findings). Full suite: 1485 pass / 0 fail / 141 skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The 'does not succeed when no brain is configured' test assumed loadConfig would return null when HOME is empty, but it also reads DATABASE_URL from the environment. When .env.testing sources DATABASE_URL into the shell (normal E2E lifecycle), the orchestrator connects successfully and runs to completion — the test's assertion was unreachable. The dry-run path is still covered by the remaining test in the same describe block; registry integration and semver ordering are covered by the sibling describe. Full suite with DATABASE_URL live: 1574 pass / 0 fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…runtime # Conflicts: # src/cli.ts # src/commands/migrations/index.ts # test/apply-migrations.test.ts

…eue.add Codex adversarial review caught that PR 5 (claim-time quiet-hours gate) was cosmetic: the schema v12 column existed, the worker read it via `readQuietHoursConfig(job)`, but `MinionJobInput` never accepted it, `queue.add()` never inserted it, and `rowToMinionJob()` never mapped it out. Result: every scheduled job saw `quiet_hours: null`, so the gate was a no-op. Stagger_key had the same broken wiring. - MinionJob (types.ts): add `quiet_hours` and `stagger_key` fields. - MinionJobInput: add matching optional fields so callers can submit them. - rowToMinionJob: parse both columns (JSONB handled the same way as `data`). - MinionQueue.add: include both columns in the INSERT (idempotent + normal paths), bound as $19/$20. The `$19::jsonb` cast matches the JSONB column shape; the wire format is the same native-JS object path that fixed the JSONB double-encode bug in v0.12.1. After this, `await queue.add('x', {}, { quiet_hours: {start:22,end:7, tz:"America/Los_Angeles",policy:"defer"} })` actually stores the window and the worker's claim-time gate defers the job inside it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rents Codex flagged that handleQuietHoursDefer with verdict='skip' directly set status='cancelled' via raw UPDATE — bypassing MinionQueue.cancelJob, which means: - Parent jobs in 'waiting-children' never get rolled up. - Descendant jobs don't cascade-cancel. - Child-done inbox notification is skipped. Result: a parent waiting on a child that fell inside quiet hours with policy='skip' stays stuck forever. Fix: release the lock, then delegate to queue.cancelJob(job.id) which handles the recursive CTE + parent rollup + inbox posting correctly. Falls back to a direct UPDATE only if cancelJob errors — even then, the status transition is status-guarded to avoid stomping terminal states. Defer path unchanged (no parent rollup needed since the job hasn't reached a terminal state). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Codex caught two cap-bypass bugs in BudgetLedger.commit(): 1. reserve({estimateUsd: 0.01, capUsd: 1.0}) + commit(id, 100) silently charged $100 to a $1-cap bucket. Cap is an advertised invariant that the code was not enforcing. 2. Negative actuals (commit(id, -5)) were accepted, letting callers artificially reduce committed_usd below the real spend. Refunds need a dedicated API, not a side-channel on commit. Fix: - Reject non-finite AND negative actualUsd at entrypoint. - Lock the ledger row FOR UPDATE during commit (same serialization as reserve). - Compute effective cap headroom = cap - other_committed - other_reserved (excluding this reservation from the reserved pool since we're about to finalize it). - When actualUsd would exceed available, clamp committed_usd to max available and throw BudgetError with the overage reported. The reservation is still marked 'committed' (API call already happened; don't retry-loop), but the cap is honored. After this, a $1/day cap actually means $1/day. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Codex caught that 'gbrain integrity auto --dry-run' appended progress entries (status='repaired', 'reviewed', 'skipped', 'error') despite doing no actual writes. The follow-on real run with default --resume would then skip those slugs — the dry-run silently consumed the work queue. Fix: gate every appendProgress() call in cmdAuto on !dryRun. Dry-run still logs to the skip log / review queue (so the user sees what WOULD happen), but the progress file stays untouched. Behavior: --dry-run → buckets counted + summary printed + review-queue + log populated, but progress file unchanged. (default) → progress file tracks every processed slug, so Ctrl-C + re-run resumes from the right place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

garrytan and others added 20 commits April 18, 2026 23:09

Merge remote-tracking branch 'origin/master' into garrytan/knowledge-…

bdc4f62

…runtime

Merge remote-tracking branch 'origin/master' into garrytan/knowledge-…

6756325

…runtime # Conflicts: # src/cli.ts # src/commands/migrations/index.ts # test/apply-migrations.test.ts

chore: bump version and changelog (v0.13.0.0)

858484e

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Knowledge Runtime — Resolver SDK + BrainWriter + integrity + Budget + scheduler polish (v0.13.0)#210

feat: Knowledge Runtime — Resolver SDK + BrainWriter + integrity + Budget + scheduler polish (v0.13.0)#210
garrytan wants to merge 20 commits intomasterfrom
garrytan/knowledge-runtime

garrytan commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

garrytan commented Apr 19, 2026

Summary

Test Coverage

Pre-Landing Review

Adversarial Review

Plan Completion

Verification Results

Test plan

Schema migrations (automatic on gbrain init / upgrade)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Schema migrations (automatic on `gbrain init` / upgrade)