feat: doc semaphore as runtime PATCH + per-trace cache hit/miss #206

Merged
JustMaier merged 1 commit into main from ivy/doc-semaphore-patchable
Apr 14, 2026

@JustMaier
Contributor

Summary

  • Semaphore permits as PATCH (doc_disk_read_permits) — default 64, atomic swap via ArcSwap, not persisted
  • QueryTrace gains docs_cache_hits + docs_cache_misses per query for correlation
  • Restores the OOM protection that v1.0.216 lost when it dropped the semaphore, now live-tunable to find the sweet spot

Problem

Session history:

  • 16 permits (orig): 14.5s convoy under cold cache (too few)
  • unlimited (v1.0.216): OOM at 16GB+ RSS (too many, uncapped page faults)
  • 64 permits (this PR): a guess between the two, tunable via PATCH

Traces have docs_us + docs_count total but no per-query cache hit/miss, so slow queries can't be correlated with their cache behavior.

Changes

  • src/server.rs: DOC_DISK_READ_SEMAPHORE as ArcSwap<Semaphore>, acquire via load_full. PATCH handler swaps atomically. Per-query AtomicU64 hits/misses populated in both inline and spawn_blocking paths.
  • src/query_metrics.rs: Two new fields on QueryTrace, #[serde(default)] for back-compat.
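The swap behaviour described for src/server.rs can be illustrated with a std-only sketch. This is not the PR's code: the PR uses ArcSwap<Semaphore>, whereas here an RwLock<Arc<_>> stands in for ArcSwap and a small blocking Mutex/Condvar semaphore stands in for the real one. The names SwappableSemaphore, Permit and set_permits are hypothetical. The point it shows is that each permit keeps an Arc to the semaphore it was taken from, so permits acquired before a swap are returned to the old semaphore while new acquires see the new one.

```rust
use std::sync::{Arc, Condvar, Mutex, RwLock};

/// Minimal blocking counting semaphore (illustrative stand-in only).
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is free, then takes it.
    pub fn acquire(self: &Arc<Self>) -> Permit {
        let mut free = self.permits.lock().unwrap();
        while *free == 0 {
            free = self.available.wait(free).unwrap();
        }
        *free -= 1;
        Permit { sem: Arc::clone(self) }
    }

    pub fn available_permits(&self) -> usize {
        *self.permits.lock().unwrap()
    }
}

/// RAII guard. It owns an Arc to its source semaphore, so dropping it
/// returns the permit there even if the live semaphore has been swapped.
pub struct Permit {
    sem: Arc<Semaphore>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        *self.sem.permits.lock().unwrap() += 1;
        self.sem.available.notify_one();
    }
}

/// Hypothetical holder for the live semaphore (RwLock in place of ArcSwap).
pub struct SwappableSemaphore {
    current: RwLock<Arc<Semaphore>>,
}

impl SwappableSemaphore {
    pub fn new(permits: usize) -> Self {
        Self { current: RwLock::new(Arc::new(Semaphore::new(permits))) }
    }

    /// Snapshot of the current semaphore, analogous to a load_full call.
    pub fn load_full(&self) -> Arc<Semaphore> {
        Arc::clone(&*self.current.read().unwrap())
    }

    /// What a PATCH handler might do: install a fresh semaphore.
    pub fn set_permits(&self, permits: usize) {
        *self.current.write().unwrap() = Arc::new(Semaphore::new(permits));
    }
}
```

One consequence of replacing the semaphore rather than resizing it: for a short window after a swap, the number of concurrent reads can exceed the new limit by however many permits are still held on the old semaphore.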

Test plan

  • Compiles clean (release)
  • Deploy, confirm default 64 holds RSS stable
  • curl -X PATCH /indexes/civitai/config -d '{"doc_disk_read_permits": 32}' — tune down under memory pressure
  • Look at trace UI — new fields populate correctly
  • Correlate slow queries with their docs_cache_misses count

🤖 Generated with Claude Code

Two changes for live-tuning the miss-path doc read bottleneck:

1. Semaphore permits tunable via PATCH doc_disk_read_permits.
   Default 64 (the semaphore was removed to fix the 14.5s convoy, but its removal caused OOM).
   Uses ArcSwap to atomically replace the semaphore — in-flight
   permits on the old one drain naturally; new acquires hit the new.
   Not persisted — experiment lever.

2. QueryTrace gains docs_cache_hits + docs_cache_misses fields.
   Populated from both the inline path (all hits → hits=batch_size)
   and the spawn_blocking path (hits=total-miss_count).
   Enables per-query correlation of doc cache behavior with latency.
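The hit/miss accounting in item 2 can be sketched as follows. This is an illustration under assumptions, not the code in src/server.rs: DocCacheCounters and its methods are hypothetical names, and QueryTrace is reduced to the two new fields (the real struct has more, and carries #[serde(default)] on these fields, which is omitted here to keep the sketch std-only).

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical per-query counters shared by the inline and blocking paths.
#[derive(Default)]
pub struct DocCacheCounters {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl DocCacheCounters {
    /// Inline path: every doc in the batch came from cache, so hits = batch_size.
    pub fn record_all_hits(&self, batch_size: u64) {
        self.hits.fetch_add(batch_size, Ordering::Relaxed);
    }

    /// Blocking path: `miss_count` docs were read from disk, so
    /// hits = total - miss_count.
    pub fn record_batch(&self, total: u64, miss_count: u64) {
        debug_assert!(miss_count <= total);
        self.hits.fetch_add(total.saturating_sub(miss_count), Ordering::Relaxed);
        self.misses.fetch_add(miss_count, Ordering::Relaxed);
    }
}

/// Only the two fields named in this PR; all other trace fields are omitted.
#[derive(Debug, Default, PartialEq)]
pub struct QueryTrace {
    pub docs_cache_hits: u64,
    pub docs_cache_misses: u64,
}

impl From<&DocCacheCounters> for QueryTrace {
    /// Copies the counter values into the trace once the query has finished.
    fn from(c: &DocCacheCounters) -> Self {
        QueryTrace {
            docs_cache_hits: c.hits.load(Ordering::Relaxed),
            docs_cache_misses: c.misses.load(Ordering::Relaxed),
        }
    }
}
```

Relaxed ordering is enough in this sketch because the counters are independent tallies that are only read after the work completes; whether the PR uses the same ordering is not stated.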

Together these let us tune the convoy/OOM tradeoff live and measure
the doc cache hit rate distribution per query shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JustMaier JustMaier merged commit 913251e into main Apr 14, 2026
1 check failed