perf(zql): O(N+M) batch merge path for view change application #5604

Draft
Karavil wants to merge 1 commit into rocicorp:main from Karavil:goblins/view-apply-batching

Karavil commented Feb 25, 2026

Problem

applyChanges loops over the changes sequentially, calling applyChange once per row change. Each call does a binary search plus an array copy (immutable mode) or a splice (mutable mode), so a bulk update costs O(N*M), where N is the view size and M is the change count. At 50K changes this takes over 1 second.
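
To make the cost concrete, here is a minimal sketch of a sequential immutable apply loop. It is not the actual zql code: the `Row` and `Change` shapes, `compareRows`, `lowerBound`, `applyOne`, and `applySequential` are all hypothetical names, and `slice().concat()` stands in for whatever copy the real implementation performs.

```typescript
// Hypothetical row shape and comparator; real view entries are richer.
type Row = { id: number };
type Change = { type: 'add' | 'remove'; row: Row };

const compareRows = (a: Row, b: Row): number => a.id - b.id;

// Binary search for the first index whose row is not less than `row`: O(log N).
function lowerBound(view: readonly Row[], row: Row): number {
  let lo = 0;
  let hi = view.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareRows(view[mid], row) < 0) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Immutable single-change apply: every call copies the whole array, O(N).
function applyOne(view: readonly Row[], change: Change): Row[] {
  const i = lowerBound(view, change.row);
  if (change.type === 'add') {
    return view.slice(0, i).concat([change.row], view.slice(i));
  }
  // Remove: drop the entry at index i if it matches, otherwise copy unchanged.
  const matches = i < view.length && compareRows(view[i], change.row) === 0;
  return matches ? view.slice(0, i).concat(view.slice(i + 1)) : view.slice();
}

// M changes, each costing O(N) for the copy: O(N*M) overall.
function applySequential(view: readonly Row[], changes: readonly Change[]): Row[] {
  let current: Row[] = view.slice();
  for (const change of changes) {
    current = applyOne(current, change);
  }
  return current;
}
```

The binary search itself is cheap; the per-change copy is what dominates as N and M grow together.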

Additionally, ArrayView#hydrate() applies each initial row one at a time, creating N intermediate array copies during initial load.

Solution

O(N+M) merge-sort batch path in applyChanges:

  1. Classify all changes as add/remove/edit row changes; bail out to the sequential path for unsupported types
  2. Sort row changes by comparator (stable: preserves original order within the same row key)
  3. Walk the existing view array and the sorted changes simultaneously (as in the merge step of merge sort), producing a new merged array in a single pass
  4. Group consecutive changes for the same row key and apply them atomically (handles refCount for duplicate adds, add-then-remove, etc.)
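
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `Entry`, `RowChange`, `compareRows`, `applyGroup`, and `batchApply` are hypothetical names, rows are keyed by a numeric `id`, and classification (step 1) is assumed to have already happened. Note that the sort in this sketch costs O(M log M); only the merge walk is linear.

```typescript
// Hypothetical shapes; the real view entries and change types differ.
type Row = { id: number; name: string };
type RowChange = { type: 'add' | 'remove' | 'edit'; row: Row };
type Entry = { row: Row; refCount: number };

const compareRows = (a: Row, b: Row): number => a.id - b.id;

// Step 4: fold every change for one row key into a single resulting entry.
function applyGroup(
  existing: Entry | undefined,
  group: readonly RowChange[],
): Entry | undefined {
  let entry = existing;
  for (const change of group) {
    if (change.type === 'add') {
      // A duplicate add only bumps the reference count.
      entry = entry
        ? { row: entry.row, refCount: entry.refCount + 1 }
        : { row: change.row, refCount: 1 };
    } else if (change.type === 'remove') {
      if (!entry) throw new Error('remove of a row that is not in the view');
      entry =
        entry.refCount > 1
          ? { row: entry.row, refCount: entry.refCount - 1 }
          : undefined;
    } else {
      if (!entry) throw new Error('edit of a row that is not in the view');
      entry = { row: change.row, refCount: entry.refCount };
    }
  }
  return entry;
}

function batchApply(view: readonly Entry[], changes: readonly RowChange[]): Entry[] {
  // Step 2: Array.prototype.sort is stable, so changes to the same row
  // keep their original relative order.
  const sorted = changes.slice().sort((a, b) => compareRows(a.row, b.row));
  const out: Entry[] = [];
  let v = 0;
  let c = 0;
  // Step 3: walk the view and the sorted changes together.
  while (c < sorted.length) {
    const key = sorted[c].row;
    // Copy view entries that sort strictly before the current change key.
    while (v < view.length && compareRows(view[v].row, key) < 0) {
      out.push(view[v++]);
    }
    // Find the end of the run of changes that share this row key.
    let end = c;
    while (end < sorted.length && compareRows(sorted[end].row, key) === 0) {
      end++;
    }
    // Consume the matching view entry, if there is one.
    let existing: Entry | undefined;
    if (v < view.length && compareRows(view[v].row, key) === 0) {
      existing = view[v++];
    }
    const merged = applyGroup(existing, sorted.slice(c, end));
    if (merged) out.push(merged);
    c = end;
  }
  // Copy whatever remains of the view after the last change.
  while (v < view.length) {
    out.push(view[v++]);
  }
  return out;
}
```

Because the output is always a new array and untouched entries are copied by reference, the input view is never mutated.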

Falls back to sequential applyChange loop for:

  • Child changes (require recursive descent into nested relationships)
  • Sort-key-changing edits (require positional remove + reinsert)
  • Singular format (single entry, not an array)
  • Hidden schema (junction table collapsing)
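
A rough sketch of how such a fallback check might look. Everything here is an assumption made for illustration: the `Change` union, the `ViewContext` flags, and the `canBatch` name are hypothetical and do not reflect the actual zql types.

```typescript
// Hypothetical shapes for illustration only.
type Row = Record<string, unknown>;
type Change =
  | { type: 'add'; row: Row }
  | { type: 'remove'; row: Row }
  | { type: 'edit'; oldRow: Row; row: Row }
  | { type: 'child'; row: Row };

type ViewContext = {
  singular: boolean; // single entry rather than an array
  hiddenSchema: boolean; // junction table that gets collapsed
  sortKeys: readonly string[]; // columns the view comparator orders by
};

// Returns true only if every change can go through the batch merge path.
function canBatch(changes: readonly Change[], ctx: ViewContext): boolean {
  if (ctx.singular || ctx.hiddenSchema) {
    return false;
  }
  for (const change of changes) {
    if (change.type === 'child') {
      // Needs recursive descent into a nested relationship.
      return false;
    }
    if (change.type === 'edit') {
      const { oldRow, row } = change;
      // An edit that moves the row needs a positional remove plus reinsert.
      if (ctx.sortKeys.some(key => oldRow[key] !== row[key])) {
        return false;
      }
    }
  }
  return true;
}
```

A caller would run this check once per batch and use the sequential loop whenever it returns false.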

ArrayView hydration batching: collects all initial adds into a single applyChanges call instead of per-row applyChange.

Works with the immutable applyChange API: the batch path constructs the merged array from scratch, so it never mutates the existing array and immutability is handled naturally.

No threshold gate

Benchmarks show the batch path is faster at every batch size, even 2 changes. There is no threshold: we always attempt the batch path first and fall back only if the change types are unsupported.

Stack overflow prevention

The sequential path in immutable mode creates an intermediate array via toSpliced() for each change. The batch path avoids this entirely by building one flat array with push(). It also avoids array.push(...spread), which would blow the stack at more than 125K elements.
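
For context, spreading a very large array into a function call passes each element as a separate argument, and JavaScript engines can throw a RangeError once the argument count gets too large (the exact limit depends on the engine). A minimal sketch of the safer pattern, using a hypothetical `appendAll` helper that is not part of this PR:

```typescript
// Append every element of `source` to `target` with one push() per element.
// Unlike target.push(...source), this never passes more than one argument
// per call, so it does not depend on the engine's argument-count limit.
function appendAll<T>(target: T[], source: readonly T[]): void {
  for (let i = 0; i < source.length; i++) {
    target.push(source[i]);
  }
}

// Usage: appending 200,000 elements, well past the size mentioned above.
const big: number[] = Array.from({ length: 200_000 }, (_, i) => i);
const merged: number[] = [-1];
appendAll(merged, big);
```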

Benchmarks

| Changes | Sequential | Batch | Speedup |
|---|---|---|---|
| 5 | 5.1 us | 3.9 us | 1.3x |
| 10 | 8.4 us | 4.8 us | 1.7x |
| 50 | 29 us | 14 us | 2.2x |
| 100 | 12 us | 4.9 us | 2.5x |
| 500 | 85 us | 24 us | 3.5x |
| 1,000 | 219 us | 49 us | 4.5x |
| 5,000 | 3.1 ms | 241 us | 13x |
| 10,000 | 10.1 ms | 482 us | 21x |
| 50,000 | 1,307 ms | 3.0 ms | 433x |

Test plan

  • All 30 existing applyChange tests pass (both zql and zqlite-zql-test)
  • All 19 existing ArrayView tests pass
  • 15 new applyChanges batch tests covering:
    • Batch adds (correct sorted order)
    • Batch removes
    • Batch in-place edits (sort key unchanged)
    • Mixed add/remove/edit batches (identical to sequential)
    • Batch with existing view entries (interleaved adds)
    • Duplicate adds (refCount handling)
    • Add-then-remove same row (net zero)
    • Empty changes (identity)
    • Single change
    • Immutability (new parent entry returned)
    • withIDs support
    • Fallback for child changes
    • Fallback for singular format
    • Fallback for sort-key-changing edits

vercel bot commented Feb 25, 2026

Someone is attempting to deploy a commit to the Rocicorp Team on Vercel.

A member of the Team first needs to authorize it.

Karavil force-pushed the goblins/view-apply-batching branch from d99c12c to e921d15 February 25, 2026 01:47
Karavil changed the title from "perf(zql): batch view-apply changes for O(N+M) merge path" to "perf(zql): O(N+M) batch merge path for view-apply changes" Feb 25, 2026
Karavil changed the title from "perf(zql): O(N+M) batch merge path for view-apply changes" to "perf(zql): batch merge path for view change application" Feb 25, 2026
Karavil changed the title from "perf(zql): batch merge path for view change application" to "perf(zql): O(N+M) batch merge path for view change application" Feb 25, 2026
Problem: applyChanges loops sequentially, calling applyChange per row.
Each applyChange does a binary search + splice, making bulk updates O(N*M)
where N is view size and M is change count. At 50K changes this takes >1s.

Solution: O(N+M) merge-sort batch path that walks the existing sorted view
array and sorted changes simultaneously, producing a new merged array in a
single pass. Always attempted first (no threshold gate); falls back to
sequential for unsupported change types (child changes, sort-key-changing
edits, singular format, hidden schemas).

Also batches ArrayView hydration: collects all initial adds into a single
applyChanges call instead of per-row applyChange.

Works with the immutable applyChange API (returns new Entry, respects
mutate parameter). The batch path always builds a fresh array since we
construct it from scratch via merge, so immutability is naturally handled.

Benchmarks (batch vs sequential):
  5 changes:     1.3x faster
  10 changes:    1.7x faster
  50 changes:    2.2x faster
  100 changes:   2.5x faster
  500 changes:   3.5x faster
  1,000 changes: 4.5x faster
  5,000 changes: 13x faster
  10,000 changes: 21x faster
  50,000 changes: 433x faster

Stack overflow prevention: the old sequential path used array.splice() which
could blow the call stack at >125K elements. The batch path builds a flat
array with push(), avoiding this entirely.
Karavil force-pushed the goblins/view-apply-batching branch from e90b29e to 8db97d4 February 25, 2026 03:05