v8.0: Phases 15, 16, 19 — File Watcher, Embedding Cache, Plugin Updates by RichardHightower · Pull Request #116 · SpillwaveSolutions/agent-brain

RichardHightower · 2026-03-12T22:24:35Z

Summary

Milestone v8.0 (Performance & DX) ships Phases 15, 16, and 19: 46 commits across 88 files (+13k/-1.7k lines).

Phase 15: File Watcher & Background Incremental Reindex

- FileWatcherService using watchfiles for real-time filesystem monitoring
- Background incremental reindex on file changes (debounced, dedup-aware)
- CLI --watch flag for agent-brain index and folder management
- Health endpoint reports watcher status
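
For reviewers unfamiliar with the debounce behavior, here is a minimal stdlib-only sketch of the idea: a burst of change events for one folder collapses into a single enqueue after a quiet period. `FolderDebouncer`, `notify`, and the `enqueue` callback are hypothetical names for illustration and are not taken from FileWatcherService, which uses watchfiles and may handle this differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class FolderDebouncer:
    """Collapse bursts of change events per folder into one callback call."""

    def __init__(self, delay_seconds: float,
                 on_quiet: Callable[[str], Awaitable[None]]) -> None:
        self._delay = delay_seconds
        self._on_quiet = on_quiet
        self._timers: Dict[str, asyncio.Task] = {}

    def notify(self, folder: str) -> None:
        # A new event restarts the quiet-period timer for that folder.
        timer = self._timers.get(folder)
        if timer is not None and not timer.done():
            timer.cancel()
        self._timers[folder] = asyncio.create_task(self._fire_later(folder))

    async def _fire_later(self, folder: str) -> None:
        await asyncio.sleep(self._delay)
        # The quiet period elapsed: forget the timer, then run the callback,
        # so a later notify() cannot cancel a callback that already started.
        self._timers.pop(folder, None)
        await self._on_quiet(folder)


async def demo() -> List[str]:
    enqueued: List[str] = []

    async def enqueue(folder: str) -> None:  # stand-in for a job queue
        enqueued.append(folder)

    debouncer = FolderDebouncer(0.05, enqueue)
    for _ in range(5):  # five rapid events for the same folder
        debouncer.notify("/docs")
        await asyncio.sleep(0)
    await asyncio.sleep(0.3)  # wait past the quiet period
    return enqueued
```

Running `asyncio.run(demo())` enqueues the folder once rather than five times.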

Phase 16: Embedding Cache (2-Layer: LRU Memory + aiosqlite Disk)

- EmbeddingCacheService with in-memory LRU (1k entries) + persistent SQLite (WAL mode)
- Cache-intercepted embed_text() and embed_texts() with batch put_many()
- Provider/model change auto-wipe on startup
- CLI cache status and cache clear commands
- Event-loop starvation fixes: all CPU-heavy pipeline stages wrapped in asyncio.to_thread()
  - ChromaDB upsert, BM25 build, graph index build, content injection, document loading, chunking
- 13/13 UAT tests passing

Phase 19: Plugin & Skill Updates for Embedding Cache

- New agent-brain-cache slash command for cache management
- Updated help text, API reference, and agent instructions
- Cache-aware skill and plugin documentation

Infrastructure

- UAT tester agent (uat-tester.md) with pre-granted permissions for E2E testing
- Upgraded release agent (release_agent.md) with comprehensive permissions
- Planning docs and research artifacts

Test plan

- task before-push passes (894 server + 156 CLI tests)
- Server coverage: 77%, CLI coverage: 60%
- Phase 16 UAT: 13/13 tests passing (including cache clear < 10s during active indexing)
- Phase 15 UAT: 14/15 tests passing (1 minor env-specific issue)
- CI pipeline passes

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Research synthesis for Agent Brain v8.0 Performance & Developer Experience. Covers stack (watchfiles, aiosqlite, cachetools), features (embedding cache, query cache, file watcher, background incremental, UDS transport), architecture (injection-first pattern, dual Uvicorn server), and 10 critical pitfalls. SUMMARY.md includes 4-phase roadmap implications and confidence assessment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…dates Verified watchfiles 1.1.1 awatch() per-folder task pattern in project venv. Confirmed anyio.Event stop_event works in asyncio context. Documented DefaultFilter built-in exclusions, debounce millisecond conversion, and backward-compatible FolderRecord/JobRecord extensions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…emental updates
- Add watch_mode, watch_debounce_seconds, include_code fields to FolderRecord
- Backward-compatible _load_jsonl using .get() for missing watch fields
- Add source field to JobRecord, JobSummary, JobDetailResponse (default='manual')
- Add source parameter to enqueue_job() (default='manual')
- Add watch_mode, watch_debounce_seconds to IndexRequest
- Add watch_mode, watch_debounce_seconds to FolderInfo API response
- Add AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30 to Settings
- 13 new tests covering all model changes and backward compat
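
The backward-compatible load can be pictured roughly as follows. This is a sketch under assumptions: the three watch-related field names come from the commit message, but `folder_path`, the default values, and the function shape are hypothetical and may not match the real `_load_jsonl`.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FolderRecord:
    folder_path: str                      # hypothetical field name
    watch_mode: str = "off"               # assumed default
    watch_debounce_seconds: Optional[int] = None
    include_code: bool = False            # assumed default


def load_jsonl(text: str) -> List[FolderRecord]:
    """Parse one JSON object per line, tolerating records written before
    the watch fields existed."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        raw = json.loads(line)
        records.append(
            FolderRecord(
                folder_path=raw["folder_path"],
                # .get() supplies a default when an older record lacks the key.
                watch_mode=raw.get("watch_mode", "off"),
                watch_debounce_seconds=raw.get("watch_debounce_seconds"),
                include_code=raw.get("include_code", False),
            )
        )
    return records


old_and_new = "\n".join([
    '{"folder_path": "/docs"}',
    '{"folder_path": "/src", "watch_mode": "auto", "watch_debounce_seconds": 30}',
])
loaded = load_jsonl(old_and_new)
```

The first record has no watch fields and still loads with defaults; the second keeps its stored values.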

…h status
- New FileWatcherService with per-folder asyncio tasks using watchfiles.awatch()
- AgentBrainWatchFilter extends DefaultFilter with dist/build/.next/coverage dirs
- start()/stop() lifecycle via anyio.Event for clean shutdown
- add_folder_watch()/remove_folder_watch() for dynamic registration
- _enqueue_for_folder() routes via job queue with source='auto', force=False
- Wired into lifespan: starts after JobWorker, stops before JobWorker
- /health/status includes file_watcher.running and file_watcher.watched_folders
- IndexingStatus model extended with file_watcher field
- 18 unit tests covering all service behaviors
- task before-push exits 0 (860 passed, 77% coverage)

- Add watch_mode/watch_debounce_seconds to JobRecord for job-level tracking
- Index router passes watch fields from IndexRequest to enqueue_job
- JobWorker._apply_watch_config() updates FolderRecord and notifies FileWatcherService after successful job completion
- CLI folders add: --watch auto/off and --debounce flags
- CLI folders list: Watch column showing auto/off per folder
- CLI jobs: Source column showing manual/auto per job
- CLI client: watch_mode and watch_debounce_seconds params on index()
- IndexingService passes include_code to folder_manager.add_folder()
- 10 server tests + 6 CLI tests for watch integration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ction
- api_reference.md: folders add/list/remove examples, Watch column, Source column, File Watcher section with debounce and exclusion docs
- agent-brain-index.md: --watch and --debounce parameters, watch auto examples, file watching notes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…emental detection _verify_collection_delta read job.eviction_summary which was only set after verification passed — a catch-22 that prevented zero-change incremental runs from passing. Now receives eviction_result directly from the pipeline as a parameter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Round 2 after eviction_result verification fix. Blocker resolved. Only remaining issue is #4 (Watch column not observable in live run due to test environment having no indexable documents — code confirmed correct via CLI unit tests). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two plans in 2 waves covering ECACHE-01 through ECACHE-06:
- Plan 01 (Wave 1): EmbeddingCacheService, EmbeddingGenerator integration, API endpoints, settings
- Plan 02 (Wave 2): CLI cache commands, status display, health endpoint integration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Create EmbeddingCacheService with two-layer cache (OrderedDict LRU + aiosqlite WAL)
- Cache key: SHA-256(content):provider:model:dimensions (ECACHE-01)
- Provider fingerprint auto-wipe on mismatch at startup (ECACHE-04)
- get_batch() batch SQL lookup for embed_texts() efficiency
- float32 BLOB storage via struct.pack for ~12 KB/entry at 3072 dims
- Add EMBEDDING_CACHE_MAX_DISK_MB/MAX_MEM_ENTRIES/PERSIST_STATS to Settings
- Add embedding_cache subdirectory to SUBDIRECTORIES and resolve_storage_paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
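
To make the key format and the size estimate concrete, here is a small sketch. It assumes the SHA-256 digest covers only the content and that vectors are packed little-endian; the commit message does not state either detail, and the function names and the example provider/model strings are hypothetical.

```python
import hashlib
import struct
from typing import List


def cache_key(content: str, provider: str, model: str, dimensions: int) -> str:
    """Build a key shaped like SHA-256(content):provider:model:dimensions."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"{digest}:{provider}:{model}:{dimensions}"


def pack_embedding(vector: List[float]) -> bytes:
    """Serialize a vector as consecutive float32 values (4 bytes each)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob: bytes) -> List[float]:
    """Inverse of pack_embedding."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


key = cache_key("hello world", "openai", "text-embedding-3-large", 3072)
blob = pack_embedding([0.0] * 3072)
blob_size = len(blob)  # 3072 values * 4 bytes = 12288 bytes, i.e. 12 KiB
```

Because provider, model, and dimensions are part of the key, an entry written under one model cannot be returned for another. float32 halves storage relative to float64 at the cost of precision, so values that are not exactly representable in float32 will not round-trip bit-for-bit.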

…ests
- Intercept embed_text/embed_texts with lazy-import cache check (breaks circular import)
- embed_texts uses get_batch() for single-query batch lookup, calls provider only for misses
- embed_query gets caching for free via embed_text delegation
- Initialize EmbeddingCacheService in lifespan BEFORE IndexingService (ECACHE-07)
- Add _build_provider_fingerprint() helper to api/main.py
- Register cache_router at /index/cache with GET (status) and DELETE (clear) endpoints
- Add embedding_cache field to IndexingStatus; populate in health.py when entry_count > 0
- Write 22 unit tests: all 8 required test cases + coverage for stats/singleton
- Update test_storage_paths.py to include embedding_cache in expected keys
- task before-push: 893 passed, 23 skipped, 0 failed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
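
The "provider only for misses" flow can be sketched as below. A plain dict stands in for the cache service, the dict key is the raw text rather than the real cache key, and `embed_texts_cached` and `fake_provider` are hypothetical names; the actual interception inside EmbeddingGenerator may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Sequence

Vector = List[float]


async def embed_texts_cached(
    texts: Sequence[str],
    cache: Dict[str, Vector],
    embed_batch: Callable[[List[str]], Awaitable[List[Vector]]],
) -> List[Vector]:
    """Return one vector per text, calling the provider only for misses."""
    results = [cache.get(text) for text in texts]  # one batch-style lookup
    miss_positions = [i for i, hit in enumerate(results) if hit is None]
    if miss_positions:
        fresh = await embed_batch([texts[i] for i in miss_positions])
        for i, vector in zip(miss_positions, fresh):
            results[i] = vector
            cache[texts[i]] = vector  # store so the next call is a hit
    return results


async def demo():
    calls: List[List[str]] = []

    async def fake_provider(batch: List[str]) -> List[Vector]:
        calls.append(list(batch))
        return [[float(len(text))] for text in batch]

    cache: Dict[str, Vector] = {"alpha": [5.0]}
    vectors = await embed_texts_cached(["alpha", "be", "gamma"], cache, fake_provider)
    return vectors, calls
```

In the demo, "alpha" is served from the cache and the provider receives a single batch containing only the two misses, with results returned in input order.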

- Add DocServeClient.cache_status() GET /index/cache/status
- Add DocServeClient.clear_cache() DELETE /index/cache
- Create commands/cache.py with cache_group (status + clear subcommands)
- cache status: Rich table with entry_count, hit_rate, hits, misses, mem_entries, size
- cache clear: confirmation prompt with entry count; --yes/-y to skip
- Register cache_group in commands/__init__.py and cli.py
- Update CLI help text to include Cache Commands section

…nd tests
- Add embedding_cache field to IndexingStatus dataclass in api_client.py
- Populate embedding_cache from /health/status response in DocServeClient.status()
- agent-brain status now shows embedding cache line: N entries, X% hit rate (H hits, M misses)
- status --json includes embedding_cache dict (or null for fresh installs)
- Fix test_status_json_output: set mock_status.embedding_cache=None (JSON serializable)
- New test_cache_command.py: 12 tests covering help, status, clear (confirmation + --yes)
- [Rule 1 - Bug] Fixed MagicMock JSON serialization error in existing test

…rompt default
- Add "source" key to metadata dict in document_loader.py load_from_folder() and load_single_file() so indexing_service manifest diffing can find file paths (was reading doc.metadata["source"] which was never set)
- Add trailing slash to DELETE /index/cache/ in api_client.py to prevent 307 redirect that caused JSONDecodeError on cache clear
- Pass default=False to Rich Confirm.ask so prompt shows [y/N] instead of [y/n]
- Update 16-UAT.md with full round 2 results (6 pass, 7 issues)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…che routes
Fix 13: Belt-and-suspenders for metadata["source"] in indexing_service.py with fallback chain + invariant guard for documents > 0 with no paths.
Fix 13: Early return in query_service.py when index is empty (avoids 500).
Fix 7: Cache router rewritten with shared impl + no-slash aliases.
Fix 6: Confirm.ask default=False + regression test for prompt default.
Fix 11: Narrowly omit embedding_cache key when None in health endpoint.
Fix 10: Add --verbose/-v flag to status command for cache detail.
All 916 server + 156 CLI tests pass. task before-push clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…locking
Fix 11: Remove response_model=IndexingStatus from /health/status route. Always return model_dump dict so the embedding_cache pop actually works (response_model was re-serializing through Pydantic, re-adding null).
Fix 8: cache clear() no longer acquires self._lock. Uses its own DB connection with PRAGMA busy_timeout=5000 so it never blocks behind a long embedding write stream. SQLite WAL lets readers proceed alongside a writer; writers still serialize at the database level, and busy_timeout makes clear() wait up to 5 seconds for the write lock instead of failing immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
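
The "own connection plus busy_timeout" approach from Fix 8 looks roughly like this. The sketch uses the synchronous stdlib sqlite3 module rather than aiosqlite to stay self-contained, and the table name, column names, and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_cache_db(path: str) -> sqlite3.Connection:
    """Open the long-lived cache connection in WAL mode."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings (key TEXT PRIMARY KEY, vec BLOB)"
    )
    conn.commit()
    return conn


def clear_cache(path: str) -> None:
    """Delete all entries through a short-lived, separate connection."""
    conn = sqlite3.connect(path)
    try:
        # Wait up to 5 s for a competing writer instead of failing at once.
        conn.execute("PRAGMA busy_timeout=5000")
        conn.execute("DELETE FROM embeddings")
        conn.commit()
    finally:
        conn.close()


def demo() -> int:
    path = os.path.join(tempfile.mkdtemp(), "cache.db")
    writer = open_cache_db(path)
    writer.executemany(
        "INSERT INTO embeddings VALUES (?, ?)", [("a", b"x"), ("b", b"y")]
    )
    writer.commit()
    clear_cache(path)  # does not touch the writer's connection object
    remaining = writer.execute("SELECT COUNT(*) FROM embeddings").fetchone()[0]
    writer.close()
    return remaining
```

The separate connection avoids contending on an application-level lock; at the SQLite level the delete still takes the single write lock, which is what the timeout is for.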

Fix 10: /health/status now prefers storage_backend.get_count() over legacy vector_store, fixing total_chunks: 0 when the two Chroma instances diverge.
Fix 8: Add asyncio.sleep(0) yield every 10 docs in chunking loops (ContextAwareChunker.chunk_documents + code chunking loop) so HTTP requests aren't starved during long indexing runs.
Fix 11 test: Add integration test asserting /health/status omits embedding_cache key (not null) when cache is empty.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
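
The cooperative-yield pattern in Fix 8 can be illustrated with a toy chunking loop. The whitespace split is a stand-in for real chunking, and `chunk_documents` here is a hypothetical free function, not the ContextAwareChunker method.

```python
import asyncio
from typing import List, Tuple


async def chunk_documents(docs: List[str], yield_every: int = 10) -> List[str]:
    """Chunk documents, handing control back to the event loop periodically."""
    chunks: List[str] = []
    for index, doc in enumerate(docs, start=1):
        chunks.extend(doc.split())  # stand-in for the real chunking work
        if index % yield_every == 0:
            # sleep(0) suspends this coroutine for one loop iteration so
            # other ready tasks (for example HTTP handlers) get a turn.
            await asyncio.sleep(0)
    return chunks


async def demo() -> Tuple[int, int]:
    beats = 0

    async def heartbeat() -> None:  # simulates other work on the loop
        nonlocal beats
        while True:
            beats += 1
            await asyncio.sleep(0)

    task = asyncio.create_task(heartbeat())
    chunks = await chunk_documents(["a b c"] * 50)
    task.cancel()
    return len(chunks), beats
```

Without the periodic `sleep(0)` the heartbeat task would not run at all until chunking finished. Yielding only helps between iterations; a single slow iteration still blocks the loop.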

Fix 8 (round 6): The 29.4s cache clear timeout was caused by two things compounding:
1. embed_texts() miss loop calls cache.put() per miss with no yield, starving HTTP request processing during large indexing runs. Fix: asyncio.sleep(0) every 10 cache writes.
2. clear() ran VACUUM inline, which can take 10-20s on a large DB and needs an exclusive lock. Fix: VACUUM scheduled as fire-and-forget background task via asyncio.create_task(), so DELETE + response returns immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…veness
- Wrap document post-processing (stat, language detect) in asyncio.to_thread()
- Wrap chunk_single_document text splitting in asyncio.to_thread()
- Wrap chunk_code_document tree-sitter parsing in asyncio.to_thread()
- Replace fire-and-forget VACUUM with PRAGMA wal_checkpoint(TRUNCATE)
- Add put_many() batch cache writes (single transaction, one lock)
- Use put_many() in embed_texts() instead of per-miss put() loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
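
A rough picture of the batched write and the WAL checkpoint, again using stdlib sqlite3 instead of aiosqlite. The schema and the function signatures are assumptions; only the names put_many and wal_checkpoint(TRUNCATE) come from the commit message.

```python
import os
import sqlite3
import tempfile
from typing import Iterable, Tuple


def put_many(conn: sqlite3.Connection, entries: Iterable[Tuple[str, bytes]]) -> None:
    """Write all entries in one transaction instead of one commit per entry."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO embeddings (key, vec) VALUES (?, ?)", entries
        )


def truncate_wal(conn: sqlite3.Connection) -> int:
    """Checkpoint the WAL and truncate it; returns SQLite's 'busy' flag."""
    busy, _log_frames, _checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)"
    ).fetchone()
    return busy


def demo() -> Tuple[int, int]:
    path = os.path.join(tempfile.mkdtemp(), "cache.db")
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE embeddings (key TEXT PRIMARY KEY, vec BLOB)")
    conn.commit()
    put_many(conn, [("a", b"1"), ("b", b"2"), ("c", b"3")])
    count = conn.execute("SELECT COUNT(*) FROM embeddings").fetchone()[0]
    busy = truncate_wal(conn)
    conn.close()
    return count, busy
```

Unlike VACUUM, the checkpoint does not rewrite the whole database file; it moves WAL contents into the main file and resets the log, so it reclaims WAL space but not free pages inside the database.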

…thread()
Root cause: previous fixes only wrapped document loading and chunking in asyncio.to_thread(), but the pipeline's heaviest sync operations were still blocking the event loop:
- ChromaDB collection.upsert() with up to 40k items per batch
- BM25Retriever.from_defaults() + persist() (tokenization + disk I/O)
- GraphIndexManager.build_from_documents() (entity extraction)
- ContentInjector.apply_to_chunks() (metadata enrichment)
- os.stat() during manifest save
All sync-heavy operations in the indexing pipeline now run in threads, keeping the event loop free for concurrent HTTP requests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
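
The effect of moving a blocking call into a thread can be seen in a small experiment. `build_index_blocking` is a hypothetical stand-in that just sleeps; it is not one of the pipeline stages listed above.

```python
import asyncio
import time
from typing import List, Tuple


def build_index_blocking(items: List[int]) -> int:
    """Stand-in for a synchronous stage such as an upsert or index build."""
    time.sleep(0.2)  # blocks the calling thread
    return len(items)


async def demo() -> Tuple[int, int]:
    ticks = 0

    async def ticker() -> None:  # simulates concurrent request handling
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    # The blocking function runs in a worker thread; the event loop keeps
    # scheduling ticker() while it waits for the result.
    count = await asyncio.to_thread(build_index_blocking, list(range(1000)))
    task.cancel()
    return count, ticks
```

Calling `build_index_blocking` directly inside the coroutine would freeze the ticker for the full 0.2 seconds. One caveat worth keeping in mind: threads free the event loop most effectively for I/O and for native code that releases the GIL; pure-Python CPU work in a thread still competes with the loop for the interpreter.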

Phase 16 (Embedding Cache) UAT fully passed after 8 rounds of fixes. Key final fix: wrapping all CPU-heavy indexing pipeline stages in asyncio.to_thread() to prevent event-loop starvation. Also adds uat-tester agent and uat-testing skill for permission-free UAT testing in future phases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Phase 16 delivers a two-layer embedding cache:
- Layer 1: In-memory LRU (OrderedDict, 1000 entries)
- Layer 2: Persistent aiosqlite disk cache (WAL mode, 500MB limit)
Key features:
- Cache-intercepted embed_text/embed_texts with batch lookup
- Provider fingerprint auto-wipe on model change (ECACHE-04)
- CLI: agent-brain cache status, agent-brain cache clear --yes
- /index/cache/status and DELETE /index/cache/ API endpoints
- Health endpoint includes cache metrics when populated
- All CPU-heavy indexing stages run in asyncio.to_thread() for non-blocking API responsiveness during indexing
UAT: 13/13 tests passing across 8 rounds of validation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
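
The two-layer lookup can be sketched as follows. A plain dict stands in for the aiosqlite disk layer, and `TwoLayerCache` with its method names is hypothetical; the real EmbeddingCacheService is async and also tracks statistics and size limits that are omitted here.

```python
from collections import OrderedDict
from typing import Dict, List, Optional

Vector = List[float]


class TwoLayerCache:
    """In-memory LRU in front of a slower persistent layer."""

    def __init__(self, max_mem_entries: int = 1000) -> None:
        self._max = max_mem_entries
        self._memory: "OrderedDict[str, Vector]" = OrderedDict()
        self._disk: Dict[str, Vector] = {}  # stand-in for the SQLite layer

    def get(self, key: str) -> Optional[Vector]:
        if key in self._memory:
            self._memory.move_to_end(key)  # mark as most recently used
            return self._memory[key]
        value = self._disk.get(key)
        if value is not None:
            self._remember(key, value)  # promote a disk hit into memory
        return value

    def put(self, key: str, value: Vector) -> None:
        self._disk[key] = value
        self._remember(key, value)

    def _remember(self, key: str, value: Vector) -> None:
        self._memory[key] = value
        self._memory.move_to_end(key)
        while len(self._memory) > self._max:
            self._memory.popitem(last=False)  # evict least recently used

    def memory_keys(self) -> List[str]:
        return list(self._memory)


cache = TwoLayerCache(max_mem_entries=2)
cache.put("a", [1.0])
cache.put("b", [2.0])
cache.get("a")          # "a" becomes most recently used
cache.put("c", [3.0])   # evicts "b" from memory only
```

After these calls the memory layer holds "a" and "c", while "b" is still retrievable from the persistent layer and is promoted back into memory on its next read.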

…ference
- Create agent-brain-plugin/commands/agent-brain-cache.md with status and clear subcommands
- Add CACHE COMMANDS category to agent-brain-help.md display section
- Add agent-brain-cache row to Command Reference table in agent-brain-help.md
- Add GET /index/cache and DELETE /index/cache sections to api_reference.md with response schemas
- Add agent-brain cache status/clear commands to CLI Commands Reference section

- Add cache trigger phrases to using-agent-brain SKILL.md YAML description
- Add Cache Management section to using-agent-brain SKILL.md (when to check, when to clear)
- Add Cache Management to Contents table of contents
- Add cache performance trigger pattern to search-assistant.md
- Add cache performance check step (step 6) to search-assistant.md assistance flow
- Add EMBEDDING_CACHE_MAX_MEM_ENTRIES and EMBEDDING_CACHE_MAX_DISK_MB to configuring-agent-brain env vars table
- Add Embedding Cache Tuning note to configuring-agent-brain SKILL.md

…nagement
- 19-01-SUMMARY.md: 6 plugin files updated with cache management surface
- STATE.md: session progress recorded, plan advanced
- ROADMAP.md: Phase 19 marked complete (1/1 plans)
- REQUIREMENTS.md: XCUT-03 marked complete

- Expanded release_agent.md with comprehensive pre-granted permissions (build, quality gates, git, gh CLI, shell utilities)
- Added Phase 19 planning scaffold and research docs
- Added UAT test plans and phase 15 smoke validation plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

RichardHightower and others added 30 commits March 6, 2026 12:11

docs: start milestone v8.0 Performance & Developer Experience

82744fb

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: define milestone v8.0 requirements (28 requirements across 6 ca…

18d4b2d

…tegories) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: create milestone v8.0 roadmap (4 phases, DX-first order)

b9ab688

Phases 15-18: File Watcher → Embedding Cache → Query Cache → UDS Transport 28 requirements mapped across 4 phases with 100% coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(15): capture phase context

4d931f2

docs(15): create phase plan — File Watcher & Background Incremental U…

b809453

…pdates Two plans in 2 waves covering FileWatcherService, data model extensions, CLI watch flags, job source tracking, and plugin documentation updates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(15-01): complete Phase 15 Plan 01 execution — FileWatcherService…

56757be

… shipped
- 15-01-SUMMARY.md: documents FileWatcherService, model extensions, 31 tests
- STATE.md: advance to Plan 02, add v8.0 Phase 15 decisions
- ROADMAP.md: mark 15-01-PLAN.md complete [x]

docs(15-02): complete Phase 15 Plan 02 execution — CLI/plugin watch_m…

4a86fcc

…ode integration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(16): capture phase context for embedding cache

0421723

docs(state): record phase 16 context session

9499b1e

docs(16): research phase — embedding cache (aiosqlite, LRU, float32 B…

5fb33aa

…LOB) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(16-01): complete EmbeddingCacheService plan — two-layer cache wi…

47d1758

…red into EmbeddingGenerator Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs(16-02): complete cache CLI commands plan — SUMMARY + STATE updated

9eb51f2

- 16-02-SUMMARY.md: cache group commands + metrics in status + 12 tests
- STATE.md: Phase 16 plan 2/2 complete; decisions recorded

docs(16-02): mark Phase 16 complete in ROADMAP.md

391c149

- 16-02-PLAN.md checkbox checked
- Phase 16 progress table: 2/2, Complete, 2026-03-10

fix(16): correct cache status endpoint URL in CLI client

7fea667

Client was calling GET /index/cache/status but server mounts at GET /index/cache/. Fixed URL to match server router. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(phase-16): complete phase execution — embedding cache shipped

774aba4

9/9 must-haves verified, all 6 ECACHE requirements satisfied. Gap fix: CLI cache status endpoint URL corrected (7fea667). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

RichardHightower and others added 16 commits March 12, 2026 00:29

docs(19): generate context from gap analysis

b239352

docs(19): research phase — plugin and skill updates for embedding cac…

eba687b

…he management

docs(19): add validation strategy

175097a

docs(19): create phase plan for plugin cache docs

a5045e2

docs(phase-19): complete phase execution

7cfcf9a

RichardHightower temporarily deployed to ci-testing March 12, 2026 22:24 — with GitHub Actions Inactive

RichardHightower merged commit a34c769 into main Mar 12, 2026
3 checks passed

RichardHightower deleted the v8.0/phases-15-16-19 branch March 12, 2026 23:03

RichardHightower merged 46 commits into main from v8.0/phases-15-16-19