feat: Knowledge Runtime — Resolver SDK + BrainWriter + integrity + Budget + scheduler polish (v0.13.0)#210
…educed-scope delta
Captures the Knowledge Runtime design thinking from the CEO review session: Resolver SDK, Enrichment Orchestrator, Scheduler, Deterministic Output Builder.
The original 7-phase plan was drafted before v0.12.0 (knowledge graph layer) and v0.11.x (Minions agent runtime) shipped. Cross-referenced against what's already merged on master, roughly 60% of the 4-layer vision is already in production under different names:
- Minions = scheduler + plugin contract (L1 + L3)
- Knowledge graph auto-link = deterministic output at L4 + orchestrator at L2
- BrainBench v1 benchmarks already validate the graph layer
The doc is kept as a draft design reference; the actual build-out will scope down to the real delta (typed Resolver interface, BrainWriter API + validators, BudgetLedger, CompletenessScorer, quiet-hours + stagger). See the CEO review notes for the reduced plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the typed plugin interface that unifies external-lookup calls (X API,
Perplexity, HEAD check, brain-local slug resolution) behind a single shape:
registry.resolve('x_handle_to_tweet', { handle, keywords }, ctx)
→ { value, confidence, source, fetchedAt, raw? }
Zero behavior change — the registry is empty by default. Builtins
(url_reachable, x_handle_to_tweet) land in the next pass. ScheduledResolver
wrapping via Minions lands in PR 5.
New files:
- src/core/resolvers/interface.ts — Resolver<I,O>, ResolverResult<O>,
ResolverContext (engine, storage, config, logger, requestId, remote,
deadline, signal), ResolverError (not_found, already_registered,
unavailable, timeout, rate_limited, auth, schema, aborted, upstream)
- src/core/resolvers/registry.ts — ResolverRegistry (register/get/has/
list/resolve/clear/size) + getDefaultRegistry() for process-wide use
- src/core/resolvers/index.ts — barrel export
Design rules enforced by types:
- Every result carries confidence (0.0-1.0) + source attribution
- LLM-backed resolvers return confidence<1.0 by convention
- ctx.remote propagates the trust boundary (mirrors OperationContext.remote)
- AbortSignal threads through for cooperative cancellation
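A minimal sketch of the shapes this commit describes, assembled from the field lists above. Field order, exact generics, and the toy registry internals are assumptions; the real interface.ts is authoritative.

```typescript
// Hypothetical sketch of the Resolver SDK shapes; names follow the commit
// message, exact signatures are assumptions.
type ResolverErrorCode =
  | 'not_found' | 'already_registered' | 'unavailable' | 'timeout'
  | 'rate_limited' | 'auth' | 'schema' | 'aborted' | 'upstream';

interface ResolverResult<O> {
  value: O;
  confidence: number;   // 0.0-1.0; LLM-backed resolvers stay < 1.0 by convention
  source: string;       // attribution, e.g. which backend answered
  fetchedAt: string;    // ISO timestamp
  raw?: unknown;
}

interface ResolverContext {
  remote: boolean;      // trust boundary, mirrors OperationContext.remote
  deadline?: number;
  signal?: AbortSignal; // cooperative cancellation
  // engine, storage, config, logger, requestId elided in this sketch
}

interface Resolver<I, O> {
  id: string;
  resolve(input: I, ctx: ResolverContext): Promise<ResolverResult<O>>;
}

// Toy registry demonstrating register/resolve; the real ResolverRegistry
// also exposes get/has/list/clear/size plus a process-wide default.
class MiniRegistry {
  private map = new Map<string, Resolver<any, any>>();
  register(r: Resolver<any, any>): void {
    if (this.map.has(r.id)) throw new Error('already_registered');
    this.map.set(r.id, r);
  }
  resolve<I, O>(id: string, input: I, ctx: ResolverContext): Promise<ResolverResult<O>> {
    const r = this.map.get(id);
    if (!r) throw new Error('not_found');
    return r.resolve(input, ctx);
  }
}
```
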
Smoke: imports + runs, list()/get()/resolve() behave as typed.
Dependency-free beyond types and storage/engine type imports.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends FailImproveLoop.execute with an optional `opts.signal` that threads through the deterministic-first / LLM-fallback flow. Needed by the Resolver SDK so long-running lookups can be cooperatively cancelled when a caller aborts (deadline hit, Minion job timeout, user ctrl-c).
Additive and backwards-compatible:
- execute() signature widens callbacks to (input, signal?) => ...; existing two-arg callbacks are structurally compatible and ignore the extra arg.
- opts is optional; callers that omit it get pre-extension behavior.
- Aborts throw a DOM-style AbortError (name='AbortError'), matching what fetch() throws, so downstream `err.name === 'AbortError'` branches work unchanged.
- Aborted runs are NOT logged to the failure JSONL — not informative and would pollute pattern analysis.
Abort check fires in three places:
- Before the deterministic call (pre-flight)
- Between deterministic miss and LLM call (mid-flight)
- Inside llmFallbackFn if the implementation respects signal itself
Smoke tests: 5 scenarios (existing sig, llm fallback, pre-abort, mid-flight abort, signal threaded to fallback) — all pass. Existing test/fail-improve.test.ts (13 tests, 27 expects) unchanged and passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
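A sketch of the cooperative-abort pattern this commit describes: a DOM-style AbortError checked at each of the three points above. The function names here are illustrative, not the real FailImproveLoop API.

```typescript
// Throw a fetch()-compatible AbortError when the signal has fired.
function throwIfAborted(signal?: AbortSignal): void {
  if (signal?.aborted) {
    const err = new Error('The operation was aborted');
    err.name = 'AbortError'; // so `err.name === 'AbortError'` branches work
    throw err;
  }
}

// Hypothetical deterministic-first / LLM-fallback flow with the three
// abort check points called out above.
async function executeWithSignal<T>(
  deterministicFn: (signal?: AbortSignal) => Promise<T | null>,
  llmFallbackFn: (signal?: AbortSignal) => Promise<T>,
  signal?: AbortSignal,
): Promise<T> {
  throwIfAborted(signal);                 // 1. pre-flight
  const hit = await deterministicFn(signal);
  if (hit !== null) return hit;
  throwIfAborted(signal);                 // 2. mid-flight, before the LLM call
  return llmFallbackFn(signal);           // 3. fallback may respect signal itself
}
```
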
Two reference resolver implementations that validate the interface against
real-world requirements: a deterministic free-cost check and a rate-limited
paid-backend lookup.
src/core/resolvers/builtin/url-reachable.ts
HEAD-check a URL, follow redirects (max 5), detect dead links. Reused
isInternalUrl() from the wave-3 SSRF hardening; re-validates every redirect
hop against the same filter. Falls back from HEAD to GET on 405/501.
Composes caller's AbortSignal with a per-request timeout via
AbortSignal.any (with manual-propagation fallback). Confidence=1 when the
backend answers; confidence=0 only on transport failure (DNS/connect/timeout).
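The signal composition described above can be sketched roughly like this, assuming Node's standard AbortSignal API; the helper name is invented and the real builtin may differ in details.

```typescript
// Compose the caller's AbortSignal with a per-request timeout, preferring
// AbortSignal.any (Node >= 20) with a manual-propagation fallback.
function composeSignals(caller: AbortSignal | undefined, timeoutMs: number): AbortSignal {
  const timeout = AbortSignal.timeout(timeoutMs);
  const inputs = caller ? [caller, timeout] : [timeout];
  if (typeof (AbortSignal as any).any === 'function') {
    return AbortSignal.any(inputs);
  }
  // Fallback: abort a combined controller when any input fires.
  const combined = new AbortController();
  for (const s of inputs) {
    if (s.aborted) { combined.abort(s.reason); break; }
    s.addEventListener('abort', () => combined.abort(s.reason), { once: true });
  }
  return combined.signal;
}
```
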
src/core/resolvers/builtin/x-api/handle-to-tweet.ts
Find a tweet by handle + free-text keyword hint. Used by the upcoming
`gbrain integrity --auto` loop to repair the 1,424 bare-tweet citations
in Garry's brain. Confidence buckets align with the three-bucket contract:
- >=0.8 auto-repair (single strong match, or dominant in small candidate set)
- 0.5-0.8 review queue (ambiguous but promising)
- <0.5 skip (many candidates or weak match)
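The three-bucket contract above reduces to a small routing function; this sketch uses the thresholds from the commit message, with illustrative bucket names.

```typescript
type Bucket = 'auto_repair' | 'review_queue' | 'skip';

// Route a resolver confidence score to one of the three repair buckets.
function bucketForConfidence(confidence: number): Bucket {
  if (confidence >= 0.8) return 'auto_repair';   // single strong / dominant match
  if (confidence >= 0.5) return 'review_queue';  // ambiguous but promising
  return 'skip';                                 // many candidates or weak match
}
```
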
Scoring: normalized keyword-token overlap against tweet text, with margin
boost for dominant matches. Strict handle regex (X's username rules).
Retries on 429 up to 2x, honoring Retry-After. Terminal 401/403 surfaces
as auth ResolverError so the caller stops hammering. Bearer token read
from ctx.config.x_api_bearer_token or X_API_BEARER_TOKEN env — never logged.
Smoke: registry accepts both, SSRF blocks localhost + file://, available()
returns false when token missing, schema validator rejects bad handles.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mplete)
Closes out PR 1. 43 new tests in test/resolvers.test.ts covering registry
contract, both reference builtins, all three confidence buckets, and every
ResolverError subcode.
test/resolvers.test.ts
- ResolverRegistry: register, duplicate-id rejection, get/has, list with
cost+backend filters, resolve, unavailable propagation, clear, default
singleton lifecycle.
- url_reachable: available(), SSRF guard on localhost + RFC1918 + 169.254
metadata + file:// scheme, empty-url schema error, 200/404 status
propagation, HEAD→GET fallback on 405, redirect chain, per-hop SSRF
re-validation, network failure → reachable=false, AbortSignal mid-flight.
- x_handle_to_tweet: token gate via env AND via ctx.config, invalid/long
handle schema errors, zero-candidate + single-strong + single-weak +
many-ambiguous confidence buckets (gates >=0.5 url emission), 401/403
auth error, 500 upstream error, 429 retry-then-rate_limited, X operator
stripping (prompt injection defense).
src/commands/resolvers.ts
- `gbrain resolvers list [--cost | --backend | --json]` pretty table
or JSON.
- `gbrain resolvers describe <id>` schema + availability detail.
- registerBuiltinResolvers() is idempotent; ready to be called from
future entry points (gbrain integrity, MCP server).
src/cli.ts wires `resolvers` into CLI_ONLY + dispatches to runResolvers.
Full suite: 1343 pass / 0 fail / 141 skip (E2E without DATABASE_URL).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the transactional writer library that the rest of the Knowledge
Runtime sits on top of. No callers routed through it yet — publish.ts /
backlinks.ts / put_page migrations are pass 4 and PR 2.5.
src/core/output/scaffold.ts
Deterministic URL / citation / link builders. Callers pass typed inputs
(handle + tweetId, account + messageId, slug + display text) and get
canonical markdown bytes out. LLM-generated URLs never touch disk.
- tweetCitation({handle, tweetId, dateISO?})
- emailCitation({account, messageId, subject, dateISO?})
- sourceCitation(resolverResult, {url?, label?})
- entityLink({slug, displayText, relativePrefix?})
- timelineLine({dateISO, summary, citation?})
ScaffoldError with codes for invalid_handle / invalid_tweet_id /
invalid_slug / invalid_message_id / invalid_date / empty.
src/core/output/slug-registry.ts
Solves the "Marc Benioff vs Marc-Benioff both slug to marc-benioff" bug.
create() probes engine.getPage and either returns the desired slug or
disambiguates (alice-smith → alice-smith-2). isFree() + suggestDisambiguators()
for interactive UX. Errors: collision, disambiguator_exhausted, invalid_slug.
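A toy version of the disambiguation probe described above: return the desired slug if free, otherwise append -2, -3, and so on. The real SlugRegistry probes engine.getPage; here a Set stands in for the engine.

```typescript
// Disambiguate a slug against a set of already-taken slugs.
function disambiguate(desired: string, taken: Set<string>, maxSuffix = 99): string {
  if (!taken.has(desired)) return desired;
  for (let n = 2; n <= maxSuffix; n++) {
    const candidate = `${desired}-${n}`;
    if (!taken.has(candidate)) return candidate;
  }
  throw new Error('disambiguator_exhausted');
}
```
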
src/core/output/writer.ts
BrainWriter.transaction(fn, ctx) wraps engine.transaction. The `fn`
callback receives a WriteTx with createEntity / appendTimeline /
setCompiledTruth / setFrontmatterField / putRawData / addLink (the last
creates both forward + reverse back-link atomically). On commit, per-page
validators run against all touchedSlugs. Strict mode throws on
error-severity findings, rolling back the outer tx. Lint mode (default for
PR 2 rollout) returns the report but commits regardless. Pages with
`validate: false` frontmatter skip validators entirely (grandfather hook
for PR 2 migration).
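The addLink contract above (one call records both directions) can be illustrated with a minimal in-memory stand-in; the real WriteTx runs inside engine.transaction and persists to the page store.

```typescript
// In-memory sketch: addLink writes the forward edge and the reverse
// back-link together, so commit-time validators see a bidirectional graph.
class MiniWriteTx {
  links = new Map<string, Set<string>>();
  private edge(from: string, to: string): void {
    if (!this.links.has(from)) this.links.set(from, new Set());
    this.links.get(from)!.add(to);
  }
  addLink(from: string, to: string): void {
    this.edge(from, to); // forward link
    this.edge(to, from); // reverse back-link, recorded in the same call
  }
}
```
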
Integration smoke against PGLite: createEntity → disambiguator (2nd call
with same desired slug), addLink writes both forward + back-link,
strict-mode validator failure rolls back the transaction bit-identically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the validator suite that BrainWriter runs before committing a
transaction. Paragraph-level deterministic checks, markdown-aware, skip
legacy pages via validate:false frontmatter.
src/core/output/validators/citation.ts
Every factual paragraph in compiled_truth carries at least one citation
marker: [Source: ...] or a linked URL. Splits paragraphs on blank lines,
strips fenced code / inline code / HTML comments before checking.
Ignores headings, key-value lines ("**Status:** Active"), table rows,
pure wikilink bullets (## See Also), and short labels without a factual
verb. Deterministic — no LLM, no semantic judgment.
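A simplified sketch of the deterministic check described above: split on blank lines, skip headings, key-value lines, table rows, and fenced code, then require a [Source: ...] marker or a linked URL per remaining paragraph. The real validator's skip heuristics are richer than this.

```typescript
// Return the paragraphs that carry no citation marker.
function paragraphsMissingCitation(markdown: string): string[] {
  // Strip fenced code blocks before checking.
  const withoutFences = markdown.replace(/```[\s\S]*?```/g, '');
  const missing: string[] = [];
  for (const para of withoutFences.split(/\n\s*\n/)) {
    const text = para.trim();
    if (!text) continue;
    if (text.startsWith('#')) continue;          // heading
    if (/^\*\*[^*]+:\*\*/.test(text)) continue;  // key-value line ("**Status:** Active")
    if (text.startsWith('|')) continue;          // table row
    const cited =
      /\[Source:[^\]]*\]/.test(text) ||          // [Source: ...] marker
      /\[[^\]]+\]\(https?:\/\//.test(text);      // linked URL
    if (!cited) missing.push(text);
  }
  return missing;
}
```
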
src/core/output/validators/link.ts
Every [text](path) wikilink resolves to a page that exists (unless it's
an external http(s) URL, which this validator doesn't check; that's
url_reachable's job in PR 3). Strips relative prefix and .md extension.
Batches engine.getPage lookups per unique target. mailto/anchor/other
schemes flagged as warning. Links inside fenced code blocks are skipped.
src/core/output/validators/back-link.ts
Iron Law: if page X → page Y, then Y → X. Reads engine.getLinks(ctx.slug),
and for each target checks engine.getLinks(target) for a reverse edge.
Missing reverses flagged as warning (runAutoLink is the authoritative
enforcer on put_page; this is defense-in-depth for pages edited outside
the main write path).
src/core/output/validators/triple-hr.ts
Catches hygiene issues on the compiled_truth / timeline split: bare `---`
in compiled_truth would re-split on round-trip through parseMarkdown;
headings in the timeline section signal authoring mistakes. Both warn
(not error) — legacy pages legitimately use thematic breaks.
src/core/output/validators/index.ts
registerBuiltinValidators(writer) wires all four.
test/writer.test.ts
57 tests: Scaffolder (all 5 helpers + error paths), SlugRegistry (create,
disambiguator, collision throw, invalid-slug, isFree, suggestDisambiguators),
BrainWriter (happy path, disambiguate, addLink + reverse, strict rollback,
lint proceeds with report, off skips validators, validate:false grandfather,
setCompiledTruth, setFrontmatterField merge, registered validators list),
citation validator (all 11 shape cases), link validator (normalizeToSlug
including ../../, external URL skip, mailto warning, code-fence skip),
back-link validator (no outbound, missing reverse → warning, bidirectional
clean), triple-hr validator (clean, bare --- warning, fenced --- skipped,
heading in timeline warning, ## Timeline header allowed).
Full suite: 1400 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the TS migration that makes BrainWriter's strict-mode rollout safe:
every existing page gets `validate: false` in frontmatter so the new
citation / link / back-link / triple-HR validators skip legacy content.
gbrain integrity --auto (PR 3) clears the flag per-page once real citations
are repaired.
src/commands/migrations/v0_13_0_add_validate_false.ts
Four-phase orchestrator following the v0_12_0 pattern:
A. connect — loadConfig + createEngine. Does NOT write config (prior
learning: gbrain init --migrate-only semantics; never
flip Postgres users to PGLite via bare init).
B. snapshot — engine.getAllSlugs() upfront (prior learning:
listpages-pagination-mutation; OFFSET iteration is
self-invalidating when each write bumps updated_at).
C. grandfather — per slug, skip if frontmatter.validate already set,
else append-log pre-mutation snapshot to
~/.gbrain/migrations/v0_13_0-rollback.jsonl and
putPage with validate:false merged in. Batched 100
at a time so interruption losses are bounded.
D. verify — SQL count of pages with validate=false ≥ expectedTouched.
Idempotent: second run is a no-op. Reversible: rollback log is
append-only JSONL; future `gbrain apply-migrations --rollback v0.13.0`
replays it. Safe on empty brains (returns complete with 0 touched).
src/commands/migrations/index.ts
Registers v0_13_0 after v0_12_0 in semver order.
test/migrations-v0_13_0.test.ts
Registry integration (v0.13.0 present, semver-after-v0.12.0, pitch
metadata well-formed), orchestrator handles no-config gracefully,
dryRun skips the connect phase.
test/apply-migrations.test.ts
Updated two assertions that hard-coded the v0.12.0 skippedFuture list
to also include v0.13.0 (now skippedFuture when installed < 0.13.0).
Full suite: 1405 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n (PR 3)
Ships the user-visible milestone for the Knowledge Runtime delta: a
command that finds brain-integrity issues and repairs them through the
BrainWriter + Resolver SDK infrastructure from PRs 1 and 2.
Targets the two quantified pain points from brain/CITATIONS.md:
- 1,424 of 3,115 people pages have bare tweet references without URLs
- An unknown fraction of existing URL citations have rotted
Subcommands:
gbrain integrity check Read-only report, optional --json
gbrain integrity auto Three-bucket repair loop
gbrain integrity review Print review-queue path + count
gbrain integrity reset-progress Clear the progress file
Three-bucket contract (matches x_handle_to_tweet resolver's confidence
scoring):
>=0.8 → auto-repair via BrainWriter transaction. Appends a timeline
entry on the page with a Scaffolder-built tweet citation (URL
from the API response, never from LLM text).
0.5-0.8 → append to ~/.gbrain/integrity-review.md with all candidates
sorted by match score, for batch human review.
<0.5 → log reason to ~/.gbrain/integrity.log.jsonl and skip.
Resumable: every processed slug hits ~/.gbrain/integrity-progress.jsonl
so an interrupted run resumes from the last slug. --fresh clears it.
Bare-tweet detection patterns (regex, deterministic, skip code fences
and already-cited lines):
- "tweeted about"
- "in/on a (recent|viral) tweet"
- "wrote a tweet/post"
- "posted on X"
- "via X" (but not "via X/handle" — already cited)
- possessive "his/her/their tweet"
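An illustrative subset of the deterministic patterns listed above; the real detector also skips code fences and handles per-line dedup, and its exact regexes may differ.

```typescript
// Subset of the bare-tweet phrase patterns from the list above.
const BARE_TWEET_PATTERNS: RegExp[] = [
  /\btweeted about\b/i,
  /\b(in|on) a (recent |viral )?tweet\b/i,
  /\bwrote a (tweet|post)\b/i,
  /\bposted on X\b/,
  /\b(his|her|their) tweet\b/i,
];

// A line with a URL is already cited, so it is never flagged.
function lineHasBareTweet(line: string): boolean {
  if (/https?:\/\//.test(line)) return false;
  return BARE_TWEET_PATTERNS.some((re) => re.test(line));
}
```
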
External-link detection extracts all [text](https?://...) pairs (code
fences skipped) for optional dead-link probing via url_reachable.
Dead links are surfaced, not auto-repaired — no "correct" replacement
exists without human judgment.
Wiring: runIntegrity dispatches subcommands, registers builtin resolvers
into the default registry, connects to the brain engine, and uses
BrainWriter in strict-off mode (integrity is the repair path, not the
write-gate path).
Unit tests: 21 cover bare-tweet regex (all 9 phrase shapes + code-fence
skip + URL-already-present skip + per-line dedup), external-link
extraction (http+https, line numbers, fenced skip), frontmatter handle
extraction (x_handle, twitter, twitter_handle, x; preference order;
leading @ strip; null paths). End-to-end auto flow verified manually
via the resolver SDK tests + BrainWriter tests it composes.
src/cli.ts wires `integrity` into CLI_ONLY + dispatches to runIntegrity.
Full suite: 1426 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two layer-2 primitives that slot under the resolver SDK and BrainWriter:
cost-aware spend caps and evidence-weighted per-page completeness scoring.
Schema migration v11 adds two tables:
budget_ledger (scope, resolver_id, local_date) PK — midnight rollover by
date column means a new calendar day upserts a new row; no rollover
thread, no race.
budget_reservations (reservation_id) — TTL-bounded held reservations
(default 60s) so process death between reserve() and commit() doesn't
strand money.
Rollback plan: DROP TABLE. Budget data is regenerable from resolver call
logs; no durable product value lives in the ledger.
src/core/enrichment/budget.ts
BudgetLedger.reserve({resolverId, estimateUsd, capUsd?, ttlSeconds?})
serializes concurrent reserves on {scope, resolver_id, local_date} via
SELECT ... FOR UPDATE. Returns {kind:'held', reservationId, ...} or
{kind:'exhausted', reason, spent, pending, cap} — never over-spends.
commit(id, actualUsd) moves money from reserved_usd to committed_usd and
marks the reservation status='committed'. rollback(id) zeros out the
reservation without touching committed. Commit-after-commit throws
already_finalized; rollback-after-commit is a no-op (callers don't need
to guard). commit-unknown-id throws reservation_not_found.
cleanupExpired() sweeps held reservations past expires_at and rolls them
back; reserve() opportunistically reclaims the target row's expired
reservations before acquiring its own lock.
IANA timezone config via opts.tz (default America/Los_Angeles); midnight
rollover is naturally expressed as a date column + Intl.DateTimeFormat
with en-CA locale (YYYY-MM-DD). DST is handled by the formatter.
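The local-date key above can be sketched in a few lines: the en-CA locale formats as YYYY-MM-DD, so a new calendar day in the configured timezone naturally produces a new primary-key value, with DST handled by the formatter.

```typescript
// Compute the local calendar date (YYYY-MM-DD) for an IANA timezone.
function localDate(tz: string, now: Date = new Date()): string {
  return new Intl.DateTimeFormat('en-CA', {
    timeZone: tz,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).format(now);
}
```
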
src/core/enrichment/completeness.ts
Seven per-type rubrics (person, company, project, deal, concept, source,
media) + default. Each rubric's dimension weights sum to 1.0, checked at
module load. scorePage(page) returns {score, dimensionScores, rubric}
where score is 0.000–1.000.
Person rubric dimensions: has_role_and_company, has_source_urls,
has_timeline_entries, has_citations, has_backlinks, recency_score,
non_redundancy. The last two are the explicit fix for the two pathologies
called out in the codex review of the earlier design: stale pages that
never decay (30-day re-enrich forever) and Wilco-style repeated blocks
that pass Wintermute's length heuristic.
Pure functions. No engine calls — BrainWriter invokes scorePage after a
transaction and caches the result in frontmatter.completeness.
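The load-time invariant above (each rubric's dimension weights sum to 1.0) is a one-liner to check. Dimension names below follow the person rubric in the commit message; the weight values themselves are invented for the example.

```typescript
type Rubric = Record<string, number>;

// Example weights only — the real per-dimension values are not public here.
const personRubric: Rubric = {
  has_role_and_company: 0.2,
  has_source_urls: 0.15,
  has_timeline_entries: 0.15,
  has_citations: 0.15,
  has_backlinks: 0.1,
  recency_score: 0.15,
  non_redundancy: 0.1,
};

// Module-load check: weights must sum to 1.0 (within float tolerance).
function assertWeightsSumToOne(rubric: Rubric): void {
  const sum = Object.values(rubric).reduce((a, b) => a + b, 0);
  if (Math.abs(sum - 1.0) > 1e-9) {
    throw new Error(`rubric weights sum to ${sum}, expected 1.0`);
  }
}
```
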
test/enrichment.test.ts — 23 tests:
BudgetLedger: under-cap held, over-cap exhausted, commit moves money,
rollback clears, commit-rollback no-op, commit-commit throws, commit-
unknown throws, invalid input, empty state null, scope isolation,
parallel reserves respect cap (10 parallel, cap 1.0, est 0.3 each →
≤ 3 held; state.reservedUsd ≤ 1.0), cleanupExpired reclaims TTL=0.
CompletenessScorer: all 8 rubrics sum to 1.0, empty person scores <0.3,
fully-enriched person >0.8, dimension scores exposed, role detection,
company/concept/source/media/default routing, recency decay with age,
non_redundancy penalizes repeated lines.
Full suite: 1449 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the scheduler gap per CEO plan: Minions v7 shipped a durable
runtime but nothing about when jobs should NOT run. This wires
quiet-hours enforcement at claim time (the codex correction — dispatch-
time is wrong because a queued job can become claimable after its window
opens) plus deterministic stagger slots to prevent cron-boundary storms.
Schema migration v12 adds two columns to minion_jobs:
quiet_hours JSONB — {start, end, tz, policy} window config
stagger_key TEXT — partitioning key for deterministic offset
Plus a partial index on stagger_key for later slot-assignment queries.
src/core/minions/quiet-hours.ts
evaluateQuietHours(cfg, now?) → 'allow' | 'skip' | 'defer'. Pure,
deterministic, no engine. Handles straight-line and wrap-around windows
(e.g. 22→7 spans midnight). IANA timezone via Intl.DateTimeFormat;
unknown tz fails open (allow) — safer than hard-blocking every job.
'skip' policy drops the event; 'defer' (default) re-queues for later.
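The wrap-around window logic above reduces to one comparison once the local hour is known; timezone resolution via Intl is elided in this sketch, and the exclusive end matches the tests' contract.

```typescript
// Is localHour inside the quiet window [start, end)? A window like
// {start: 22, end: 7} spans midnight, so in-window means hour >= start
// OR hour < end.
function inQuietWindow(localHour: number, start: number, end: number): boolean {
  if (start === end) return false; // zero-length window
  if (start < end) return localHour >= start && localHour < end; // straight-line
  return localHour >= start || localHour < end;                  // wraps midnight
}
```
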
src/core/minions/stagger.ts
staggerMinuteOffset(key) → 0–59, FNV-1a hash. Same key → same slot.
Pure; no module-level state. Used by scheduled resolvers that want to
avoid cron-boundary collisions ("10 jobs all fire at minute 0").
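The deterministic slot above is a textbook 32-bit FNV-1a reduced mod 60; the constants are the standard FNV parameters.

```typescript
// Map a stagger key to a stable minute slot (0-59) via 32-bit FNV-1a.
function staggerMinuteOffset(key: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) % 60; // same key -> same slot
}
```
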
src/core/minions/worker.ts
MinionWorker.tick now consults evaluateQuietHours on every claimed job.
Verdict 'defer' → UPDATE status='delayed', delay_until = now() + 15m
(prevents immediate re-claim loops when the claim query re-runs).
Verdict 'skip' → UPDATE status='cancelled', error_text='skipped_quiet_hours'.
Both paths clear lock_token and require lock_token match in the WHERE
clause so a concurrent stall recovery can't race us.
test/minions-quiet-hours.test.ts — 25 tests:
evaluateQuietHours: null/undefined/invalid config paths (allow fail-open),
straight-line in/out + exclusive-end, wrap-around in (before midnight +
after), skip vs defer policy, timezone-offset propagation (winter PST
vs summer PDT), localHour parity with Date.getUTCHours.
staggerMinuteOffset: deterministic same key → same offset, different
keys spread across buckets (10 keys → ≥5 unique buckets), empty/non-
string edge cases.
Schema v12: quiet_hours and stagger_key columns exist on minion_jobs,
idx_minion_jobs_stagger_key index present.
Full suite: 1474 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Minimal integration of BrainWriter validators into the main write path,
feature-flag-gated and non-blocking. The CEO plan explicitly scoped PR 2.5
as a pre-soak landing step: the hook plugs in now, observability lands,
but strict-mode rejection is deferred to a follow-on release gated on the
7-day soak + BrainBench regression ≤1pt.
src/core/output/post-write.ts
runPostWriteLint(engine, slug, opts?) invokes the four BrainWriter
validators (citation, link, back-link, triple-hr) against a freshly
written page and returns a PostWriteLintResult. Skips cleanly when:
- config `writer.lint_on_put_page` is not truthy (default OFF; opts.force overrides)
- the page is not found (shouldn't happen in normal put_page flow)
- the page has frontmatter.validate === false (grandfathered)
Findings are logged to:
- ~/.gbrain/validator-lint.jsonl (capped at 20 findings per line)
- engine.logIngest (ingest_log table) for durable agent-inspectable history
Validator-level exceptions are swallowed so a buggy validator never
breaks put_page.
src/core/operations.ts put_page handler
After importFromContent + runAutoLink, imports runPostWriteLint and
invokes it. Result returns writer_lint: {error_count, warning_count} or
{skipped: reason}. Try/catch wraps the whole hook so an import or
runtime error never blocks the main write.
Enable locally:
gbrain config set writer.lint_on_put_page true
Then every put_page emits a writer_lint summary + appends structured
findings to the ingest log for analysis before the strict-mode flip.
test/post-write-lint.test.ts — 11 tests:
Flag reader (default off, true/1/on, other values false, explicit false)
Hook behavior (flag-off skip, page-not-found skip, validate:false
grandfather skip, force=true overrides flag, dirty page yields citation
error, clean page yields zero findings).
Full suite: 1485 pass / 0 fail / 141 skip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 'does not succeed when no brain is configured' test assumed loadConfig would return null when HOME is empty, but it also reads DATABASE_URL from the environment. When .env.testing sources DATABASE_URL into the shell (normal E2E lifecycle), the orchestrator connects successfully and runs to completion — the test's assertion was unreachable.
The dry-run path is still covered by the remaining test in the same describe block; registry integration and semver ordering are covered by the sibling describe.
Full suite with DATABASE_URL live: 1574 pass / 0 fail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…runtime
# Conflicts:
#	src/cli.ts
#	src/commands/migrations/index.ts
#	test/apply-migrations.test.ts
…eue.add
Codex adversarial review caught that PR 5 (claim-time quiet-hours gate) was
cosmetic: the schema v12 column existed, the worker read it via
`readQuietHoursConfig(job)`, but `MinionJobInput` never accepted it,
`queue.add()` never inserted it, and `rowToMinionJob()` never mapped it out.
Result: every scheduled job saw `quiet_hours: null`, so the gate was a
no-op. Stagger_key had the same broken wiring.
- MinionJob (types.ts): add `quiet_hours` and `stagger_key` fields.
- MinionJobInput: add matching optional fields so callers can submit them.
- rowToMinionJob: parse both columns (JSONB handled the same way as `data`).
- MinionQueue.add: include both columns in the INSERT (idempotent + normal
paths), bound as $19/$20. The `$19::jsonb` cast matches the JSONB column
shape; the wire format is the same native-JS object path that fixed the
JSONB double-encode bug in v0.12.1.
After this, `await queue.add('x', {}, { quiet_hours: {start:22,end:7,
tz:"America/Los_Angeles",policy:"defer"} })` actually stores the window
and the worker's claim-time gate defers the job inside it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rents
Codex flagged that handleQuietHoursDefer with verdict='skip' directly set status='cancelled' via raw UPDATE — bypassing MinionQueue.cancelJob, which means:
- Parent jobs in 'waiting-children' never get rolled up.
- Descendant jobs don't cascade-cancel.
- Child-done inbox notification is skipped.
Result: a parent waiting on a child that fell inside quiet hours with policy='skip' stays stuck forever.
Fix: release the lock, then delegate to queue.cancelJob(job.id) which handles the recursive CTE + parent rollup + inbox posting correctly. Falls back to a direct UPDATE only if cancelJob errors — even then, the status transition is status-guarded to avoid stomping terminal states. Defer path unchanged (no parent rollup needed since the job hasn't reached a terminal state).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex caught two cap-bypass bugs in BudgetLedger.commit():
1. reserve({estimateUsd: 0.01, capUsd: 1.0}) + commit(id, 100) silently
charged $100 to a $1-cap bucket. Cap is an advertised invariant that
the code was not enforcing.
2. Negative actuals (commit(id, -5)) were accepted, letting callers
artificially reduce committed_usd below the real spend. Refunds need
a dedicated API, not a side-channel on commit.
Fix:
- Reject non-finite AND negative actualUsd at entrypoint.
- Lock the ledger row FOR UPDATE during commit (same serialization as
reserve).
- Compute effective cap headroom = cap - other_committed - other_reserved
(excluding this reservation from the reserved pool since we're about to
finalize it).
- When actualUsd would exceed available, clamp committed_usd to max
available and throw BudgetError with the overage reported. The
reservation is still marked 'committed' (API call already happened;
don't retry-loop), but the cap is honored.
After this, a $1/day cap actually means $1/day.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex caught that 'gbrain integrity auto --dry-run' appended progress
entries (status='repaired', 'reviewed', 'skipped', 'error') despite doing
no actual writes. The follow-on real run with default --resume would then
skip those slugs — the dry-run silently consumed the work queue.
Fix: gate every appendProgress() call in cmdAuto on !dryRun. Dry-run
still logs to the skip log / review queue (so the user sees what WOULD
happen), but the progress file stays untouched.
Behavior:
--dry-run → buckets counted + summary printed + review-queue
+ log populated, but progress file unchanged.
(default) → progress file tracks every processed slug, so
Ctrl-C + re-run resumes from the right place.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
GBrain v0.13.0 ships the Knowledge Runtime — typed abstractions that turn a knowledge base into a runtime other agents can voluntarily adopt. Five focused modules build on v0.12's graph layer and v0.11's Minions orchestration, grouped into six logical PRs landed as bisectable commits.
Resolver SDK (PR 1): `Resolver<I,O>` typed interface + in-memory registry + 2 reference builtins (`url_reachable` with SSRF guard, `x_handle_to_tweet` with confidence-scored X API v2). `FailImproveLoop` gained optional `AbortSignal` (backwards compatible). `gbrain resolvers list|describe` CLI.
BrainWriter + validators (PR 2): `BrainWriter.transaction(fn, ctx)` over engine.transaction with pre-commit validators. `Scaffolder` builds citations from API IDs (never LLM text). `SlugRegistry` detects collisions. Four deterministic validators (citation/link/back-link/triple-hr). v0.13.0 TS migration grandfathers `validate: false` onto existing pages.
gbrain integrity (PR 3): user-facing shipping milestone. `gbrain integrity check|auto|review|reset-progress` with three-bucket auto-repair (≥0.8 / 0.5-0.8 / <0.5), resumable progress file, review queue, skip log.
BudgetLedger + CompletenessScorer (PR 4): FOR-UPDATE-serialized daily spend cap with TTL auto-reclaim + IANA-TZ midnight rollover. Seven per-type completeness rubrics + default; `non_redundancy` + `recency_score` kill the Wintermute length-heuristic pathology. Schema v11.
Scheduler polish (PR 5): claim-time quiet-hours gate (wrap-around windows, IANA tz), deterministic FNV stagger offset. Schema v12.
Post-write lint hook (PR 2.5): lint-only validator hook on `put_page`, gated on `writer.lint_on_put_page` (default false). Observability for the strict-mode flip gate.
Test Coverage
AI-assessed coverage: 72% (above 60% minimum, short of 80% target). Detailed per-PR breakdown:
Tests: 1522 → 1626 (+104 new).
Coverage gaps flagged as accepted risk. Top untested paths:
`integrity auto` three-bucket repair integration, v0.13.0 grandfather orchestrator phases, `fail-improve` AbortSignal threading, `resolvers` CLI, worker quiet-hours defer SQL. E2E + manual smoke validates the shipping paths.
Pre-Landing Review
Four P1 bugs caught by codex adversarial review, all fixed in bisectable commits:
- Quiet-hours/stagger wiring was a no-op (53d4414). Schema v12 added `quiet_hours` + `stagger_key` columns. Worker read them. But `MinionJobInput` never accepted them, `queue.add` never inserted them, `rowToMinionJob` never mapped them. Every scheduled job saw `quiet_hours: null`. Wired the full path.
- Quiet-hours `skip` stranded parents (7083f01). Direct `UPDATE status='cancelled'` bypassed `MinionQueue.cancelJob` — parent jobs in `waiting-children` never got rolled up. Now routes through `cancelJob` for proper dependency resolution.
- BudgetLedger commit cap bypass (8e90e39). `reserve({estimateUsd:0.01})` + `commit(id, 100)` silently charged $100 to a $1 cap. `commit` now locks the ledger row FOR UPDATE, re-checks effective headroom, clamps + throws on overage. Negative actuals rejected (no side-channel refunds).
- `integrity auto --dry-run` poisoned resume state (2fad71d). Dry-run wrote `status=repaired` to the progress file; the follow-on real run would skip those slugs. Progress writes gated on `!dryRun`.
Codex also flagged (not blocking, documented for follow-on):
- `url_reachable` SSRF guard is hostname-only — DNS rebinding attack possible. Shares this limitation with existing wave-3 `isInternalUrl`.
- `x_handle_to_tweet` 429 handling only parses numeric `Retry-After`, ignores `x-rate-limit-reset` header.
- Concurrent creates of `people/alice` from separate processes can both choose the same slug. Single-writer lock in `engine.transaction` mitigates within one process.
- Citation validator accepts `[Source:]` with empty content — decoration check, not evidence check.
- `auto-link` reconciliation union-of-writes race on concurrent `put_page` for the same slug.
Adversarial Review
No remaining P0 or P1 blockers for v0.13.0 ship.
Plan Completion
From `~/.gstack/projects/garrytan-gbrain/ceo-plans/2026-04-18-knowledge-runtime-v2.md`:
Intentionally deferred (per plan): strict-mode default flip (requires 7-day soak), `openai_embedding` refactor (PR 1.5 post-flip), `brain_slug_lookup` adapter, Wintermute `claw-bridge` (post-release stretch), sandboxed user TS plugins (embedded-only v1), multi-tenant BudgetLedger (team mode).
Verification Results
No dev server running + no UI scope — plan-verification auto-skipped. CLI + backend paths verified via 115 E2E tests (10 files) against real Postgres including Tier 2 skills (Opus/Sonnet agent loops).
Test plan
Schema migrations (automatic on `gbrain init` / upgrade):
- v11: `budget_ledger` + `budget_reservations` tables. Rollback: DROP TABLE (budget regenerable from resolver call logs).
- v12: `minion_jobs.quiet_hours JSONB` + `stagger_key TEXT` + partial index on stagger_key. Additive nullable columns; existing rows claim unchanged.
- v0.13.0 TS migration: grandfathers `validate: false` onto existing pages. Idempotent. Rollback log at `~/.gbrain/migrations/v0_13_0-rollback.jsonl`.