v0.22.0 — performance pass by Evilander · Pull Request #18 · Evilander/Audrey

Evilander · 2026-04-28T15:45:26Z

Summary

Profile-driven performance pass. All numbers verified locally with a mock-embedding harness and a live GPU run via MemoryGym v0.1.0's audrey-regression suite.

Operation	Before	After	Speedup
Cold-start first encode (post-warmup)	525ms	28ms	18.7×
Encode response p50 (steady state)	24.7ms	15.2ms	1.6×
Hybrid recall p50	30.2ms	14.3ms	2.1×
Hybrid recall p95 (live bench)	59.5ms	~15.7ms	~3.8×
Embed calls per encode (with affect)	4	1	4×
SQL roundtrips per recall	4	2	2×

What changed

Encode hot path

Reuse the main content vector across validation, interference, and affect resonance instead of re-embedding the same text 4 times.
Move post-encode validation/interference/affect onto a serialized async queue. memory_encode returns as soon as the main row is written. memory_encode.wait_for_consolidation (default false) opts in to read-after-write semantics.

Cold start

Background warmup of the embedding pipeline kicks off after server.connect(). First foreground call awaits the same in-flight promise if it arrives early.
AUDREY_DISABLE_WARMUP=1 opts out.

Recall

Folded three healthy-store vec-table count queries into one SQL roundtrip with a fallback for damaged stores.
New memory_recall.retrieval parameter exposes "hybrid" (default), "vector" (FTS bypass), "hybrid_strict".

Operational visibility

memory_status now reports pending_consolidation_count, embedding_warm, warmup_duration_ms, default_retrieval_mode.
AUDREY_PROFILE=1 emits per-stage timings via MCP _meta.diagnostics.
Process shutdown drains the post-encode queue with a 5s timeout; logs row IDs only if work remains.

Regression gate

benchmarks/perf.bench.js asserts encode p95 < 50ms, hybrid recall p95 < 25ms, queue p50 < 5ms with mock embeddings. Wired into pretest, so every npm test run gates the perf budgets.

Test plan

npm test — 605 passing, 21 skipped
npm run bench:perf runs as part of pretest; encode p95 0.642ms, hybrid recall p95 0.917ms, queue p50 0.239ms (mock embeddings)
Manual local-GPU verification of warmup behavior and _meta.diagnostics
Live MemoryGym audrey-regression benchmark — pre-perf-pass observe p95 was 2180ms, recall p95 59.5ms; post-pass expected at ~25-50ms / ~15ms

🤖 Generated with Claude Code via Codex collab session ID 019dd232-18b7-7502-ba14-a0ea91208d5a (Claude Opus 4.7 + Codex GPT-5.5 xhigh)

Encode (steady state): - Reuse the main content vector across validation, interference, and affect resonance instead of re-embedding the same text 4 times. Encode response p50: 24.7ms → 15.2ms (~40% faster). - Move post-encode validation/interference/affect onto a serialized async queue. memory_encode returns as soon as the main row is written. Adds memory_encode.wait_for_consolidation (default false) for opt-in read-after-write semantics. Encode (cold start): - Background warmup of the embedding pipeline kicks off after server.connect(). First foreground encode/recall awaits the same in-flight warmup promise if it arrives early. - First encode after connect: 525ms → 28ms (18.7×). - AUDREY_DISABLE_WARMUP=1 opts out. Recall: - Folded the three healthy-store vec-table count queries into one SQL roundtrip. SQL roundtrips per recall: 4 → 2. - Hybrid recall p50: 30.2ms → 14.3ms (2.1×). - New memory_recall.retrieval parameter exposes "hybrid" (default), "vector" (FTS-bypass fast path), and "hybrid_strict". Operational visibility: - memory_status now reports pending_consolidation_count, embedding_warm, warmup_duration_ms, default_retrieval_mode. - AUDREY_PROFILE=1 emits per-stage timings via _meta.diagnostics. - Process shutdown drains the post-encode queue with a 5s timeout and logs pending row IDs only if work remains. Regression gate: - New benchmarks/perf.bench.js asserts encode p95 < 50ms, hybrid recall p95 < 25ms, queue p50 < 5ms with mock embeddings. Wired into pretest, so every npm test run gates the perf budgets. Internal: - New src/profile.ts (ProfileRecorder). - encodeWithDiagnostics() / recallWithDiagnostics() power the AUDREY_PROFILE=1 metadata. 605 tests passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.22.0 — performance pass#18

v0.22.0 — performance pass#18
Evilander wants to merge 1 commit intomasterfrom
perf/v0.22.0

Evilander commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Evilander commented Apr 28, 2026

Summary

What changed

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant