fix(astro): add batch-size override and ECONNRESET retry to incremental search indexer #133
Merged
chris-c-thomas merged 1 commit into main on Apr 25, 2026
…al search indexer

Meilisearch can silently restart mid-task under memory pressure (observed ~60s crash cycles during FR bulk upserts on the 7.6 GiB VPS), causing ECONNRESET on either the addDocuments POST or the waitForTask polling that follows. The previous indexer died outright on the first failure even though the submitted task was already persisted in LMDB and would typically resume on server recovery.

Changes (apps/astro/scripts/index-search-incremental.ts):

- New flushWithRetry() in BatchIndexer waits for /health to return "available" (up to 180s) and retries the wait on the original taskUid rather than resubmitting the batch. Up to 5 attempts per flush.
- New --batch-size <n> CLI flag and MEILI_BATCH_SIZE env var override the default of 500 docs/batch. Smaller batches reduce per-flush Meilisearch memory and let crash recovery happen between batches instead of inside one.
- New --verbose-batches flag prints the first/last doc ID of every flushed batch, with stdout force-flushed so the last logged ID is durable through a crash. Combined with --batch-size 1 this isolates poison documents.

apps/astro/CLAUDE.md already documents these flags; this commit brings the code in line with the documentation.

The full-reindex sibling (index-search.ts) has the same OOM-vulnerable pattern and should get the same treatment in a follow-up. It is scoped out of this PR because only the incremental script was field-validated on the VPS.
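The retry behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the `FlushDeps` and `FlushOptions` types, the `waitForHealth` helper, and the injected `submit`, `isHealthy`, and `sleep` functions are hypothetical names chosen so the sketch runs without a Meilisearch server. It also assumes that any error is retryable, and that a batch is resubmitted only when no `taskUid` was ever obtained; the PR may handle both cases differently.

```typescript
// Hypothetical sketch of "retry the wait on the original taskUid".
// All I/O is injected so the control flow can be exercised in isolation.
type TaskUid = number;

interface FlushDeps {
  submit: () => Promise<TaskUid>;               // assumed to wrap the addDocuments POST
  waitForTask: (uid: TaskUid) => Promise<void>; // assumed to wrap waitForTask polling
  isHealthy: () => Promise<boolean>;            // assumed to check /health for "available"
  sleep: (ms: number) => Promise<void>;
}

interface FlushOptions {
  maxAttempts: number;     // the PR uses up to 5 attempts per flush
  healthTimeoutMs: number; // the PR waits up to 180s
  pollIntervalMs: number;  // assumed polling interval, must be > 0
}

// Poll the health check until it succeeds or the time budget is used up.
async function waitForHealth(deps: FlushDeps, opts: FlushOptions): Promise<void> {
  for (let waited = 0; waited <= opts.healthTimeoutMs; waited += opts.pollIntervalMs) {
    try {
      if (await deps.isHealthy()) return;
    } catch {
      // Server still restarting; keep polling.
    }
    await deps.sleep(opts.pollIntervalMs);
  }
  throw new Error(`server not healthy after ${opts.healthTimeoutMs}ms`);
}

async function flushWithRetry(deps: FlushDeps, opts: FlushOptions): Promise<TaskUid> {
  let taskUid: TaskUid | undefined;
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      // Submit only if no task was accepted yet; otherwise reuse the original uid.
      if (taskUid === undefined) taskUid = await deps.submit();
      await deps.waitForTask(taskUid);
      return taskUid;
    } catch (err) {
      lastError = err;
      if (attempt === opts.maxAttempts) break;
      await waitForHealth(deps, opts); // wait for the server to come back before retrying
    }
  }
  throw lastError;
}
```

Keeping the `taskUid` across attempts is what avoids duplicate work: once the server has accepted the batch, the loop only re-polls the same task after each recovery.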



Summary
- Add `--batch-size <n>` CLI flag and `MEILI_BATCH_SIZE` env var to override the default 500 docs/batch in `index-search-incremental.ts`
- Add `--verbose-batches` flag that logs the first/last doc ID of every flushed batch (with stdout force-flush so the last logged ID survives a crash); pair with `--batch-size 1` to bisect poison docs
- Add `flushWithRetry()` to `BatchIndexer` that handles ECONNRESET during Meilisearch OOM crashes by waiting for `/health` and retrying on the original `taskUid` rather than resubmitting

Why
Meilisearch can silently restart mid-task under memory pressure. On the 7.6 GiB VPS we observed ~60s crash cycles during FR bulk upserts, and the indexer would die outright on the first ECONNRESET even though the submitted task is persisted in LMDB and would typically resume on server recovery. The new retry waits for `/health` to return `available` (up to 180s) and resumes waiting on the original task instead of giving up.

This work was field-tested on the production VPS but never made it back to `main`; `apps/astro/CLAUDE.md:161` already documents these flags as if they exist. This PR brings the code in line with the docs.

Scope
Only `apps/astro/scripts/index-search-incremental.ts` is modified. The full-reindex sibling `apps/astro/scripts/index-search.ts` has the same `BatchIndexer` shape and the same OOM-vulnerable `addDocuments` + `waitForTask` pattern; it should get the same treatment as a follow-up. It is scoped out here because only the incremental script was field-validated.

Test plan
- `npx tsx scripts/index-search-incremental.ts --batch-size abc` rejects non-integer
- `MEILI_BATCH_SIZE=xyz npx tsx scripts/index-search-incremental.ts` rejects non-integer
- `--batch-size 1 --verbose-batches --set-checkpoint --source fr` parses flags and writes checkpoint
- … `gray-matter cache` errors are unrelated and documented)
- `--source fr --batch-size 100 --verbose-batches` after merge to confirm retry path on next deploy
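The first two test-plan items rely on the batch size being validated before indexing starts. One way that validation might look is sketched below. The function names `parseBatchSize` and `resolveBatchSize` are hypothetical, and two behaviours are assumptions rather than facts from the PR: the CLI flag taking precedence over the env var, and values below 1 being rejected along with non-integers.

```typescript
// Hypothetical sketch of resolving the batch size from the CLI flag or env var.
const DEFAULT_BATCH_SIZE = 500; // default stated in the PR

// Accept only strings made of decimal digits that form a positive integer.
function parseBatchSize(raw: string, source: string): number {
  const trimmed = raw.trim();
  if (!/^[1-9]\d*$/.test(trimmed)) {
    throw new Error(`${source} must be a positive integer, got "${raw}"`);
  }
  return Number(trimmed);
}

function resolveBatchSize(
  argv: string[],
  env: Record<string, string | undefined>,
): number {
  // Assumption: the flag wins over the env var when both are present.
  const flagIndex = argv.indexOf("--batch-size");
  if (flagIndex !== -1) {
    const value = argv[flagIndex + 1];
    if (value === undefined) throw new Error("--batch-size requires a value");
    return parseBatchSize(value, "--batch-size");
  }
  const fromEnv = env["MEILI_BATCH_SIZE"];
  if (fromEnv !== undefined && fromEnv !== "") {
    return parseBatchSize(fromEnv, "MEILI_BATCH_SIZE");
  }
  return DEFAULT_BATCH_SIZE;
}
```

In a script this would typically be called as `resolveBatchSize(process.argv.slice(2), process.env)`. A strict regex is used instead of `parseInt` because `parseInt("12abc", 10)` returns 12 and would let malformed input through.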