Skip to content

fix(migrations): bump v0.12.0 backfill timeout from 10min to 30min#201

Open
orendi84 wants to merge 1 commit intogarrytan:masterfrom
orendi84:fix/v0-12-0-backfill-timeout
Open

fix(migrations): bump v0.12.0 backfill timeout from 10min to 30min#201
orendi84 wants to merge 1 commit intogarrytan:masterfrom
orendi84:fix/v0-12-0-backfill-timeout

Conversation

@orendi84
Copy link
Copy Markdown
Contributor

Problem

phaseCBackfillLinks and phaseDBackfillTimeline in src/commands/migrations/v0_12_0.ts each wrap gbrain extract {links,timeline} --source db in an execSync call with timeout: 600_000 (10 minutes).

On a brain of non-trivial size that is too tight. On my 13,002-page brain, each extraction takes ~11 minutes - right past the cutoff. Both phases return failed, the orchestrator marks the migration partial, and the daily post-upgrade hook keeps retrying the same thing and failing the same way every day.

Fix

Bump both timeouts from 600_000 (10 min) to 1_800_000 (30 min).

At the observed rate of ~50 sec/1k pages this covers brains up to ~35k pages; larger brains will need a proportional extension, but 30 min is a far safer default than 10 min for any non-trivial corpus.

Verification

On a 13,002-page brain (127,093 chunks, all embedded):

Before the fix: migration consistently finished as partial, both backfill phases timed out at ~10 min with no output written.

After the fix: migration completes cleanly in ~22 min total. Both backfill phases run to completion (~11 min each), produce their summary lines, and verify phase runs after. Migration status: complete.

Scope

One-line change to two lines in one file. No new imports, no test changes needed (the existing test/migrate.test.ts doesn't exercise real execSync against prod-sized data so the timeout path isn't covered by tests today - consider that a follow-up).

Phase C (backfill_links) and D (backfill_timeline) both shelled out with a
10-minute timeout. On a 13k-page brain, each extraction takes ~11 minutes,
so both phases returned 'failed' via execSync timeout and the orchestrator
marked the migration 'partial'. Schema + verify phases were unaffected.

Bump both timeouts to 30 minutes (1_800_000 ms). At the observed rate of
~50 sec/1k pages this covers brains up to ~35k pages; larger brains will
need a proportional extension.

Verified by re-running on a 13002-page brain: both phases complete cleanly
in ~11 min each, migration marks 'complete'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant