perf(p99): skip ForcePublish for filter lazy loads (v2, no engine clone)#198
Merged
perf(p99): skip ForcePublish for filter lazy loads (v2, no engine clone)#198
Conversation
… internal RwLock Previous attempt (PR #197, reverted) tried to clone + publish the entire InnerEngine from the query thread, which was O(all fields) and caused a 1.85s flush regression. New approach: FilterField::load_values() and load_field_complete() take &self and use internal RwLock — they can be called directly on the published snapshot's fields without cloning the engine or publishing. Loaded bitmaps become visible to all readers immediately through the shared Arc<FilterField>. Key changes: - Filter loads (load_field_complete, load_values): apply directly to the current snapshot, skip ForcePublish entirely. Fire-and-forget send to flush thread for its staging copy. - Sort loads (load_layers needs &mut self): still use ForcePublish round-trip to flush thread. This is safe because: - FilterField::bitmaps is RwLock<HashMap> — internal mutation is sound - The flush thread also applies the same data from lazy_tx (idempotent) - No engine clone, no ArcSwap::store race Expected: filter lazy_load_us drops from 1-5s (ForcePublish block) to 10-100ms (actual disk read time). Sort lazy loads still block but are much rarer than filter loads. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces reverted PR #197 with a safe approach that doesn't clone InnerEngine.
Filter lazy loads are the most common per_value_lazy path (postId, modelVersionIds, tagIds). The old path blocked on ForcePublish (~431ms) waiting for the flush thread. PR #197 tried to clone + publish the engine (caused 1.85s regression).
New approach:
FilterField::load_values()takes&selfwith internal RwLock — call it directly on the published snapshot's fields. No engine clone, no ArcSwap::store, no ForcePublish. Loaded bitmaps are immediately visible to all readers.Sort loads still use ForcePublish (rare —
load_layersneeds&mut self).Why this is safe
FilterField::bitmapsisRwLock<HashMap<u64, VersionedBitmap>>— designed for concurrent accesslazy_tx(idempotent, applies on next cycle)store()What failed in PR #197
(*current).clone()deep-cloned InnerEngine on the query thread — O(all fields), 1.85s on 109M records. Caused flush to spike from 431ms to 1.85s.Expected impact
Filter
lazy_load_usdrops from 1-5s → actual disk read time (10-100ms).Test plan
cargo check --features serverpasses🤖 Generated with Claude Code