perf(zql): fuse fetch pipeline, add PK fast path, reduce allocations #5612

Draft
Karavil wants to merge 6 commits into rocicorp:main from goblinshq:perf/fetch-pipeline

Karavil commented Feb 25, 2026

Note: This PR is part of an upstream contribution effort from the Goblins team (@goblinshq). Co-authored with Claude by Anthropic.

Summary

Fuse the MemorySource #fetch generator pipeline, add a direct PK lookup fast path, and add overlay fast paths.

Motivation

MemorySource #fetch is the entry point for every data scan in the IVM pipeline. The current implementation chains 5 generators together: generateRows -> generateWithOverlay -> generateWithStart -> generateWithConstraint -> generateWithFilter. Each generator adds a suspend/resume frame per row.

In a workload with 135 IVM pipelines, each fetching ~200 rows, the generator frame overhead compounds: 5 frames x 200 rows x 135 pipelines = 135,000 generator suspend/resume cycles per page render. This was the single largest contributor to CPU time in our profiling.
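
As a rough illustration of that arithmetic, the sketch below stacks five pass-through generator stages and counts how many times a row is yielded downstream. The `stage` helper, the `Row` shape, and the counter are hypothetical and only mirror the shape of the chain, not the MemorySource logic.

```typescript
type Row = { id: number };

// Counts every yield across all stages (one suspend/resume each).
let resumes = 0;

// A pass-through stage standing in for one link of the generator chain.
function* stage(input: Iterable<Row>): Generator<Row> {
  for (const row of input) {
    resumes++;
    yield row;
  }
}

// Stack five stages, as in the chain described above.
function chainedFetch(source: Row[]): Iterable<Row> {
  let it: Iterable<Row> = source;
  for (let i = 0; i < 5; i++) {
    it = stage(it);
  }
  return it;
}

const sampleRows: Row[] = Array.from({ length: 200 }, (_, i) => ({ id: i }));
const fetched = [...chainedFetch(sampleRows)];

const perPipeline = resumes;          // 5 stages x 200 rows
const perRender = perPipeline * 135;  // scaled to 135 pipelines
console.log(fetched.length, perPipeline, perRender);
```

Each row crosses every stage once, so the count grows with stages times rows, which is the cost the fused paths aim to cut.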

Additionally, many fetches are single-row PK lookups (e.g., fetching a specific assignment by ID) that still go through the full 5-generator pipeline despite only ever returning 0 or 1 rows.

Changes

Generator fusion

  • No-overlay path (generateFetchDirect): Replaces the 5-generator chain with a single generator that handles start position, constraint matching, and filter predicate in one loop. Eliminates 4 generator frame suspend/resume costs per row.
  • With-overlay path (generatePostOverlayFused): After overlay interleaving, fuses start + constraint + filter into a single generator. Reduces from 4 post-overlay generators to 1.
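
A minimal sketch of the fusion idea, assuming a sorted row source: start position, constraint matching, and the filter predicate are all checked inside one loop of one generator. The `FetchOptions` shape and the `fetchFused` name are illustrative assumptions, not the signatures used in this PR.

```typescript
type Row = Record<string, string | number>;

// Hypothetical option shape for the sketch.
interface FetchOptions {
  start?: { row: Row; inclusive: boolean };
  constraint?: Row;
  filter?: (row: Row) => boolean;
}

function* fetchFused(
  sorted: Iterable<Row>,
  compare: (a: Row, b: Row) => number,
  options: FetchOptions,
): Generator<Row> {
  const { start, constraint, filter } = options;
  const constraintEntries = constraint ? Object.entries(constraint) : [];
  let started = start === undefined;

  for (const row of sorted) {
    // 1. Start position: skip rows before (or at, if exclusive) the start row.
    if (!started && start !== undefined) {
      const c = compare(row, start.row);
      if (c < 0 || (c === 0 && !start.inclusive)) continue;
      started = true;
    }
    // 2. Constraint: every constrained column must match exactly.
    let matches = true;
    for (const [key, value] of constraintEntries) {
      if (row[key] !== value) {
        matches = false;
        break;
      }
    }
    if (!matches) continue;
    // 3. Filter predicate.
    if (filter !== undefined && !filter(row)) continue;
    yield row; // the only suspend point per emitted row
  }
}

const byId = (a: Row, b: Row): number => Number(a.id) - Number(b.id);
const table: Row[] = [
  { id: 1, owner: "a", score: 10 },
  { id: 2, owner: "b", score: 20 },
  { id: 3, owner: "a", score: 30 },
  { id: 4, owner: "a", score: 5 },
  { id: 5, owner: "b", score: 50 },
  { id: 6, owner: "a", score: 60 },
];

const fusedIds = [
  ...fetchFused(table, byId, {
    start: { row: { id: 3 }, inclusive: false },
    constraint: { owner: "a" },
    filter: (row) => Number(row.score) >= 5,
  }),
].map((row) => row.id);
console.log(fusedIds);
```

Rows rejected by any check never cross a generator boundary, and accepted rows cross exactly one.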

PK fast path

  • When filters constrain to a single primary key value and no overlay is active, perform a direct BTree.get() O(log n) lookup instead of scanning the full index. Returns a single-element array or empty array, bypassing the generator pipeline entirely.
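
The fast path can be sketched roughly as follows. A `Map` stands in for the BTree primary index purely to keep the example self-contained (the PR uses `BTree.get()`), and `fetchRows`, `scanWithFilter`, and the `Row` shape are hypothetical names.

```typescript
type Row = { id: string; title: string };

// Stand-in for the primary index; the real index is a BTree.
const primaryIndex = new Map<string, Row>([
  ["a1", { id: "a1", title: "Essay" }],
  ["a2", { id: "a2", title: "Quiz" }],
]);

let scanCalls = 0;

// Slow path: a generator that walks the whole index.
function* scanWithFilter(
  index: Map<string, Row>,
  keep: (row: Row) => boolean,
): Generator<Row> {
  scanCalls++;
  for (const row of index.values()) {
    if (keep(row)) yield row;
  }
}

function fetchRows(
  index: Map<string, Row>,
  pkValue: string | undefined,
  overlayActive: boolean,
): Iterable<Row> {
  // Fast path: single PK value and no overlay, so a direct lookup suffices.
  if (pkValue !== undefined && !overlayActive) {
    const row = index.get(pkValue);
    return row === undefined ? [] : [row];
  }
  // Otherwise fall back to the scanning pipeline.
  return scanWithFilter(index, (row) => pkValue === undefined || row.id === pkValue);
}

const hit = [...fetchRows(primaryIndex, "a1", false)];
const miss = [...fetchRows(primaryIndex, "zz", false)];
const scansAfterFastPath = scanCalls;
const viaScan = [...fetchRows(primaryIndex, "a1", true)];
console.log(hit.length, miss.length, scansAfterFastPath, viaScan.length);
```

Returning a plain array means the caller's `for...of` iterates without any generator frame on the fast path.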

Non-generator #fetch

  • Convert *#fetch() from a generator function to a regular function returning Iterable<Node | 'yield'>. The callers already consume it via for...of, so this removes one generator frame with no behavioral change.
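
To illustrate the conversion, here is a small sketch contrasting a generator wrapper with a regular function that returns the inner iterable directly. `FetchNode` stands in for the PR's `Node` type, and `scanNodes`, `fetchAsGenerator`, and `fetchAsFunction` are hypothetical names.

```typescript
type FetchNode = { row: { id: number } };
type FetchResult = Iterable<FetchNode | "yield">;

// Inner pipeline, still a generator.
function* scanNodes(nodes: readonly FetchNode[]): Generator<FetchNode | "yield"> {
  for (const node of nodes) yield node;
}

// Before: the outer function is itself a generator, adding one frame per row.
function* fetchAsGenerator(nodes: readonly FetchNode[]): Generator<FetchNode | "yield"> {
  yield* scanNodes(nodes);
}

// After: a regular function hands back whichever iterable applies.
function fetchAsFunction(nodes: readonly FetchNode[]): FetchResult {
  if (nodes.length === 0) return []; // arrays are iterables too
  return scanNodes(nodes);
}

// Callers consume either form the same way.
function collectIds(result: FetchResult): number[] {
  const ids: number[] = [];
  for (const item of result) {
    if (item !== "yield") ids.push(item.row.id);
  }
  return ids;
}

const nodes: FetchNode[] = [{ row: { id: 1 } }, { row: { id: 2 } }];
const idsBefore = collectIds(fetchAsGenerator(nodes));
const idsAfter = collectIds(fetchAsFunction(nodes));
console.log(idsBefore, idsAfter);
```

One general JavaScript detail to keep in mind with this pattern: a regular function runs its body at call time, whereas a generator body runs on the first `next()`.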

Overlay fast paths

  • Skip overlay processing entirely when no overlay is active or when the overlay doesn't affect the current fetch (add and remove both undefined after constraint/filter application)
  • connectionComparator: use compareRows directly for non-reverse case, avoiding closure wrapping
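
Both fast paths can be sketched in a few lines. The `Overlay` shape and the helper names `makeConnectionComparator` and `overlayAffectsFetch` are assumptions made for illustration; only `compareRows` and `connectionComparator` are named in this PR.

```typescript
type Row = { id: number };
type Comparator = (a: Row, b: Row) => number;

const compareRows: Comparator = (a, b) => a.id - b.id;

// Non-reverse case: return the comparator itself, with no wrapper closure.
function makeConnectionComparator(compare: Comparator, reverse: boolean): Comparator {
  if (!reverse) return compare;
  return (a, b) => compare(b, a);
}

// Hypothetical overlay shape after constraint/filter application.
interface Overlay {
  add?: Row;
  remove?: Row;
}

// Overlay processing is only needed when something is left to add or remove.
function overlayAffectsFetch(overlay: Overlay | undefined): boolean {
  return overlay !== undefined && (overlay.add !== undefined || overlay.remove !== undefined);
}

const forward = makeConnectionComparator(compareRows, false);
const backward = makeConnectionComparator(compareRows, true);
const descending = [{ id: 1 }, { id: 3 }, { id: 2 }].sort(backward).map((r) => r.id);
console.log(forward === compareRows, descending);
```

Returning `compareRows` unchanged saves one extra function call per comparison in the common forward case.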

Code cleanup

  • Use typed locals (as Value, as string) instead of repeated as casts in comparator functions
  • Remove dead code (generateWithConstraint, generateWithFilter, generateRows) superseded by fused generators

Expected Performance Impact

The generator fusion is the single biggest win. Across the 135 IVM pipelines, each fetch previously passed every row through 5 generator frames, each with its own suspend/resume cost. The fused paths reduce this to 1 generator (no overlay) or 2 generators (with overlay), eliminating ~80% of per-row generator frames on the no-overlay path and ~60% on the overlay path.

The PK fast path provides O(log n) direct lookup for single-row fetches, avoiding the entire generator pipeline. This is particularly impactful for join child fetches that look up individual rows by foreign key.

Combined with the full optimization series, these changes contributed to reducing page freeze from ~7.7s to <1s in a production scenario (45 parent rows x ~200 related rows, 135 IVM pipelines).

Testing

  • All 1238 existing IVM tests pass (memory-source, source, join, filter, exists, take, skip, yield, fan-out, union, etc.)
  • TypeScript compilation passes with zero errors

Stack Order

This PR is part of a stacked series of IVM performance optimizations. Merge in order:

  1. perf(zql): reduce allocations with frozen sentinel and object reuse #5609 - Allocation reduction
  2. perf(zql): cache primary index key, pkConstraint, and index lookups #5610 - Index caching
  3. perf(zql): comparator fast paths + compareBounds null fix #5611 - Comparator fast paths
  4. perf(zql): fuse fetch pipeline, add PK fast path, reduce allocations #5612 (this PR) - Fetch pipeline fusion

Independent PRs (no conflicts): #5607 (BTree iterators), #5608 (Join optimizations)

vercel bot commented Feb 25, 2026

Someone is attempting to deploy a commit to the Rocicorp Team on Vercel.

A member of the Team first needs to authorize it.

Alp added 6 commits February 25, 2026 06:10
…euse

Two allocation reduction optimizations for the IVM push hot path:

1. Shared EMPTY_RELATIONSHIPS sentinel: Replace per-node {} allocation
   with a frozen shared object, reducing GC pressure during fetch and push.

2. Reuse outputChange objects in genPush: Pre-allocate reusable objects
   and mutate row fields before yielding, instead of creating new objects
   per connection.

Object reuse is safe because filterPush consumers are synchronous within
the generator chain.
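
A minimal sketch of both ideas, assuming simplified types: `Relationships`, `makeNode`, `OutputChange`, and `genPushReusing` are hypothetical stand-ins, and only `EMPTY_RELATIONSHIPS` is a name taken from the commit message.

```typescript
type Row = { id: number };
type Relationships = Readonly<Record<string, unknown>>;

// One frozen object shared by every node that has no relationships.
const EMPTY_RELATIONSHIPS: Relationships = Object.freeze({});

function makeNode(row: Row): { row: Row; relationships: Relationships } {
  return { row, relationships: EMPTY_RELATIONSHIPS };
}

type OutputChange = { type: "add"; node: { row: Row; relationships: Relationships } };

// Reuses a single change object, mutating its row before each yield.
function* genPushReusing(rows: readonly Row[]): Generator<OutputChange> {
  const node = { row: { id: -1 }, relationships: EMPTY_RELATIONSHIPS };
  const change: OutputChange = { type: "add", node };
  for (const row of rows) {
    node.row = row;
    yield change;
  }
}

const input: Row[] = [{ id: 1 }, { id: 2 }, { id: 3 }];

// Safe: the consumer reads each change synchronously before the next yield.
const seenIds: number[] = [];
for (const change of genPushReusing(input)) {
  seenIds.push(change.node.row.id);
}

// Unsafe: retaining the yielded objects keeps three references to one object.
const retainedIds = [...genPushReusing(input)].map((c) => c.node.row.id);
console.log(seenIds, retainedIds);
```

The second loop shows why the reuse depends on synchronous consumers: anything that holds on to a yielded change sees only the last row.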

Cache frequently recomputed values to avoid repeated JSON.stringify
and map lookups on hot paths:
- Cache #primaryIndexKey in constructor (avoid JSON.stringify per call)
- Cache pkConstraint on Connection (avoid recomputing from filters)
- Cache #getOrCreateIndex results per connection (avoid repeated lookups)

Part of IVM pipeline perf optimizations that reduced page freeze from
~7.7s to <1s in a production app.
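
The constructor-time caching can be sketched like this. `SourceSketch`, its key format, and the call counter are hypothetical and exist only to show that the `JSON.stringify` call happens once rather than per lookup.

```typescript
class SourceSketch {
  // Computed once in the constructor and reused on every call.
  readonly primaryIndexKey: string;
  stringifyCalls = 0;

  constructor(primaryKey: readonly string[]) {
    this.primaryIndexKey = this.keyFor(primaryKey);
  }

  // Hypothetical key format: a JSON-encoded list of [column, direction] pairs.
  keyFor(columns: readonly string[]): string {
    this.stringifyCalls++;
    return JSON.stringify(columns.map((column) => [column, "asc"]));
  }

  getPrimaryIndexKey(): string {
    return this.primaryIndexKey; // no stringify on the hot path
  }
}

const source = new SourceSketch(["id"]);
let lastKey = "";
for (let i = 0; i < 1000; i++) {
  lastKey = source.getPrimaryIndexKey();
}
console.log(lastKey, source.stringifyCalls);
```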

…imization

Optimize hot comparison paths in the IVM pipeline:

* Add compareStringUTF8Fast for ASCII-fast string comparison with UTF-8 fallback
* Reorder compareValues to check strings before nulls (most common type)
* Add single-key fast path in makeComparator avoiding loop overhead
* Add single-key fast path in makeBoundComparator with fully inlined comparison
* Fix compareBounds null handling for nullable database columns
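
As a sketch of the single-key fast path, the comparator factory below returns a loop-free closure when the ordering has exactly one column. The `Value`, `Row`, and `Ordering` types and the simplified `compareValues` (nulls first, no UTF-8 handling) are assumptions for illustration and do not reproduce the PR's implementations.

```typescript
type Value = string | number | null;
type Row = Record<string, Value>;
type Ordering = readonly (readonly [string, "asc" | "desc"])[];

// Simplified comparison: nulls sort first, then natural ordering.
function compareValues(a: Value, b: Value): number {
  if (a === b) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a < b ? -1 : 1;
}

function makeComparator(order: Ordering): (a: Row, b: Row) => number {
  const first = order[0];
  // Fast path: one sort key, so no loop and no per-call destructuring.
  if (order.length === 1 && first !== undefined) {
    const [field, dir] = first;
    const sign = dir === "asc" ? 1 : -1;
    return (a, b) => sign * compareValues(a[field] ?? null, b[field] ?? null);
  }
  // General path: compare key by key until one differs.
  return (a, b) => {
    for (const [field, dir] of order) {
      const c = compareValues(a[field] ?? null, b[field] ?? null);
      if (c !== 0) return dir === "asc" ? c : -c;
    }
    return 0;
  };
}

const people: Row[] = [
  { id: 2, owner: "b" },
  { id: 3, owner: "a" },
  { id: 1, owner: "a" },
];

const byIdDesc = [...people].sort(makeComparator([["id", "desc"]])).map((r) => r.id);
const byOwnerThenId = [...people]
  .sort(makeComparator([["owner", "asc"], ["id", "desc"]]))
  .map((r) => r.id);
console.log(byIdDesc, byOwnerThenId);
```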

Karavil force-pushed the perf/fetch-pipeline branch from 332d920 to be6b85b on February 25, 2026 11:14