fix: Defer index creation on empty/insufficient tables #71

Merged
tryweb merged 1 commit into main from fix/vector-index-empty-table
Apr 8, 2026

Conversation

Owner

@tryweb tryweb commented Apr 7, 2026

Summary

Fixes #70

Add row count guard before attempting vector/FTS index creation to prevent noisy error messages when LanceDB tables have insufficient training data (< 256 rows for IVF/PQ indices).

Problem

When starting with an empty LanceDB table, the system produces noisy error messages:

[store] Vector index creation failed (attempt 1/3): lance error: Not supported: Creating empty vector indices with train=False is not yet implemented. Retrying in 512ms...
[store] Vector index creation failed (attempt 2/3): lance error: Not supported: Creating empty vector indices with train=False is not yet implemented. Retrying in 1045ms...
[store] Vector index creation failed (attempt 3/3): lance error: Not supported: Creating empty vector indices with train=False is not yet implemented
[store] Vector index creation failed after 3 attempts: lance error: Not supported: Creating empty vector indices with train=False is not yet implemented. Falling back to in-memory search.
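The log lines above suggest a retry loop with growing delays. Because the failure on an empty table is deterministic, every attempt fails the same way and retrying only adds noise. The sketch below illustrates that pattern; `retryWithBackoff`, its parameters, and the delay formula are hypothetical and are not taken from `src/store.ts`, and the log format is simplified.

```typescript
// Hypothetical sketch of a retry loop with exponential backoff plus jitter.
// Names, delays, and log wording are illustrative assumptions only.
type Sleep = (ms: number) => Promise<void>;

async function retryWithBackoff(
  op: () => Promise<void>,
  maxAttempts: number,
  log: (msg: string) => void,
  sleep: Sleep,
  baseDelayMs = 500,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await op();
      return true; // operation succeeded, stop retrying
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      if (attempt === maxAttempts) {
        log(`failed after ${maxAttempts} attempts: ${reason}`);
        return false;
      }
      // Double the delay on each attempt and add a small random jitter.
      const delay = Math.round(baseDelayMs * 2 ** (attempt - 1) + Math.random() * 50);
      log(`failed (attempt ${attempt}/${maxAttempts}): ${reason}. Retrying in ${delay}ms...`);
      await sleep(delay);
    }
  }
  return false;
}
```

If `op` always throws, as index creation does on an empty table, the loop logs once per attempt and then gives up, which is why a check before the first attempt is the cheaper fix.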

Solution

  • Add MIN_ROWS_FOR_INDEX = 256 constant (LanceDB IVF/PQ training requirement)
  • Check table.countRows() before attempting index creation
  • Silently defer with info log: [store] Deferring vector index creation: 0 rows found (need ≥ 256)
  • No 3x retry errors for empty tables
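The guard described in the list above could look roughly like the following. This is a minimal sketch, not the code merged in `src/store.ts`: the `IndexableTable` interface and the `createIndexIfEnoughRows` function are hypothetical names introduced for illustration, and only `MIN_ROWS_FOR_INDEX = 256`, the `countRows()` check, and the log message come from the PR description.

```typescript
// Minimal structural type with only the two methods the guard needs.
// Hypothetical: the real LanceTable type in src/store.ts may differ.
interface IndexableTable {
  countRows(): Promise<number>;
  createIndex(column: string): Promise<void>;
}

// Threshold taken from the PR description (IVF/PQ training requirement).
const MIN_ROWS_FOR_INDEX = 256;

// Returns true if index creation was attempted, false if it was deferred.
async function createIndexIfEnoughRows(
  table: IndexableTable,
  column: string,
  log: (msg: string) => void = console.log,
): Promise<boolean> {
  const rows = await table.countRows();
  if (rows < MIN_ROWS_FOR_INDEX) {
    // Too few rows to train an index: log once and skip the retry path.
    log(`[store] Deferring vector index creation: ${rows} rows found (need ≥ ${MIN_ROWS_FOR_INDEX})`);
    return false;
  }
  await table.createIndex(column);
  return true;
}
```

Returning a boolean rather than throwing keeps the deferral out of any retry logic, so an empty table yields one info line instead of repeated errors.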

Changes

| File | Change |
| --- | --- |
| src/store.ts | +17 lines: row count guard, countRows type, MIN_ROWS_FOR_INDEX constant |
| test/unit/index-race-condition.test.ts | +84 lines: 6 new test cases for empty/insufficient row scenarios |

Test Results

✓ createVectorIndexWithRetry: empty table defers index creation silently
✓ createVectorIndexWithRetry: insufficient rows defers index creation
✓ createVectorIndexWithRetry: sufficient rows attempts index creation
✓ createFtsIndexWithRetry: empty table defers index creation with error message
✓ createFtsIndexWithRetry: insufficient rows defers index creation
✓ All 11 existing tests still pass

Impact

| Before | After |
| --- | --- |
| 3x retry errors on empty table | Single info log, silent deferral |
| Scary error messages | Clear, informative message |
| Functional (fallback works) | Functional (same behavior, cleaner UX) |

Closes #70

Add row count guard before attempting vector/FTS index creation
to prevent noisy error messages when LanceDB tables have insufficient
training data (< 256 rows for IVF/PQ indices).

Changes:
- Add MIN_ROWS_FOR_INDEX constant (256) for LanceDB training data requirement
- Add countRows() check before createIndex() in both createVectorIndexWithRetry and createFtsIndexWithRetry
- Silently defer index creation with info log instead of 3 retry errors
- Add countRows to LanceTable type definition
- Add 6 test cases for empty/insufficient row scenarios

Fixes #70
@tryweb tryweb merged commit 0a0f223 into main Apr 8, 2026
9 checks passed
@tryweb tryweb deleted the fix/vector-index-empty-table branch April 8, 2026 00:13

Development

Successfully merging this pull request may close these issues.

[Bug] Noisy error on startup: Vector index creation fails on empty table