v8.0: Phases 15, 16, 19 — File Watcher, Embedding Cache, Plugin Updates#116
Merged
RichardHightower merged 46 commits into main on Mar 12, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Research synthesis for Agent Brain v8.0 Performance & Developer Experience. Covers stack (watchfiles, aiosqlite, cachetools), features (embedding cache, query cache, file watcher, background incremental, UDS transport), architecture (injection-first pattern, dual Uvicorn server), and 10 critical pitfalls. SUMMARY.md includes 4-phase roadmap implications and confidence assessment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tegories) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phases 15-18: File Watcher → Embedding Cache → Query Cache → UDS Transport 28 requirements mapped across 4 phases with 100% coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dates Verified watchfiles 1.1.1 awatch() per-folder task pattern in project venv. Confirmed anyio.Event stop_event works in asyncio context. Documented DefaultFilter built-in exclusions, debounce millisecond conversion, and backward-compatible FolderRecord/JobRecord extensions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pdates Two plans in 2 waves covering FileWatcherService, data model extensions, CLI watch flags, job source tracking, and plugin documentation updates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…emental updates
- Add watch_mode, watch_debounce_seconds, include_code fields to FolderRecord
- Backward-compatible _load_jsonl using .get() for missing watch fields
- Add source field to JobRecord, JobSummary, JobDetailResponse (default='manual')
- Add source parameter to enqueue_job() (default='manual')
- Add watch_mode, watch_debounce_seconds to IndexRequest
- Add watch_mode, watch_debounce_seconds to FolderInfo API response
- Add AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30 to Settings
- 13 new tests covering all model changes and backward compat

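The backward-compatible load mentioned in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `folder_path` key, the field types, and the `"off"` / `None` / `False` defaults are assumptions. Only the watch field names and the use of `.get()` come from the commit message.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FolderRecord:
    # Field names follow the commit message; types and defaults are assumed.
    folder_path: str
    watch_mode: str = "off"
    watch_debounce_seconds: Optional[int] = None
    include_code: bool = False


def load_jsonl(text: str) -> List[FolderRecord]:
    """Parse one JSON object per line, tolerating records written before v8.0."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        raw = json.loads(line)
        records.append(
            FolderRecord(
                folder_path=raw["folder_path"],
                # .get() supplies a default when an older record lacks the key
                watch_mode=raw.get("watch_mode", "off"),
                watch_debounce_seconds=raw.get("watch_debounce_seconds"),
                include_code=raw.get("include_code", False),
            )
        )
    return records
```

Because every new field is read with `.get()`, a manifest written by an earlier release loads without a migration step, and the new fields simply take their defaults.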
…h status
- New FileWatcherService with per-folder asyncio tasks using watchfiles.awatch()
- AgentBrainWatchFilter extends DefaultFilter with dist/build/.next/coverage dirs
- start()/stop() lifecycle via anyio.Event for clean shutdown
- add_folder_watch()/remove_folder_watch() for dynamic registration
- _enqueue_for_folder() routes via job queue with source='auto', force=False
- Wired into lifespan: starts after JobWorker, stops before JobWorker
- /health/status includes file_watcher.running and file_watcher.watched_folders
- IndexingStatus model extended with file_watcher field
- 18 unit tests covering all service behaviors
- task before-push exits 0 (860 passed, 77% coverage)

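The per-folder task lifecycle listed in this commit might look something like the sketch below. It uses only the standard library: the `watch_fn` coroutine is a hypothetical stand-in for a loop over `watchfiles.awatch()`, `asyncio.Event` replaces the `anyio.Event` the commit mentions, and the class name `FolderWatchRegistry` is invented for illustration.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict, List

# Hypothetical signature: watch one folder until the shared stop event is set.
WatchFn = Callable[[str, asyncio.Event], Coroutine[Any, Any, None]]


class FolderWatchRegistry:
    """One asyncio task per watched folder, all sharing a single stop event."""

    def __init__(self, watch_fn: WatchFn) -> None:
        self._watch_fn = watch_fn
        self._stop = asyncio.Event()
        self._tasks: Dict[str, asyncio.Task] = {}

    def add_folder_watch(self, folder: str) -> None:
        # Registering the same folder twice is a no-op.
        if folder not in self._tasks:
            self._tasks[folder] = asyncio.create_task(
                self._watch_fn(folder, self._stop)
            )

    async def remove_folder_watch(self, folder: str) -> None:
        task = self._tasks.pop(folder, None)
        if task is not None:
            task.cancel()
            # Collect the cancellation so it does not surface later.
            await asyncio.gather(task, return_exceptions=True)

    async def stop(self) -> None:
        # Signal every watcher, then wait for all tasks to finish.
        self._stop.set()
        await asyncio.gather(*self._tasks.values(), return_exceptions=True)
        self._tasks.clear()

    @property
    def watched_folders(self) -> List[str]:
        return sorted(self._tasks)
```

The registry should be constructed and used from inside a running event loop, since `add_folder_watch()` calls `asyncio.create_task()`. Keeping one task per folder means a single folder can be dropped without disturbing the others.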
… shipped
- 15-01-SUMMARY.md: documents FileWatcherService, model extensions, 31 tests
- STATE.md: advance to Plan 02, add v8.0 Phase 15 decisions
- ROADMAP.md: mark 15-01-PLAN.md complete [x]

- Add watch_mode/watch_debounce_seconds to JobRecord for job-level tracking
- Index router passes watch fields from IndexRequest to enqueue_job
- JobWorker._apply_watch_config() updates FolderRecord and notifies FileWatcherService after successful job completion
- CLI folders add: --watch auto/off and --debounce flags
- CLI folders list: Watch column showing auto/off per folder
- CLI jobs: Source column showing manual/auto per job
- CLI client: watch_mode and watch_debounce_seconds params on index()
- IndexingService passes include_code to folder_manager.add_folder()
- 10 server tests + 6 CLI tests for watch integration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ction
- api_reference.md: folders add/list/remove examples, Watch column, Source column, File Watcher section with debounce and exclusion docs
- agent-brain-index.md: --watch and --debounce parameters, watch auto examples, file watching notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ode integration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…emental detection _verify_collection_delta read job.eviction_summary which was only set after verification passed — a catch-22 that prevented zero-change incremental runs from passing. Now receives eviction_result directly from the pipeline as a parameter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Round 2 after eviction_result verification fix. Blocker resolved. Only remaining issue is #4 (Watch column not observable in live run due to test environment having no indexable documents — code confirmed correct via CLI unit tests). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…LOB) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two plans in 2 waves covering ECACHE-01 through ECACHE-06:
- Plan 01 (Wave 1): EmbeddingCacheService, EmbeddingGenerator integration, API endpoints, settings
- Plan 02 (Wave 2): CLI cache commands, status display, health endpoint integration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Create EmbeddingCacheService with two-layer cache (OrderedDict LRU + aiosqlite WAL)
- Cache key: SHA-256(content):provider:model:dimensions (ECACHE-01)
- Provider fingerprint auto-wipe on mismatch at startup (ECACHE-04)
- get_batch() batch SQL lookup for embed_texts() efficiency
- float32 BLOB storage via struct.pack for ~12 KB/entry at 3072 dims
- Add EMBEDDING_CACHE_MAX_DISK_MB/MAX_MEM_ENTRIES/PERSIST_STATS to Settings
- Add embedding_cache subdirectory to SUBDIRECTORIES and resolve_storage_paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
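The cache key layout and the float32 BLOB sizing in this commit can be checked with a few lines of standard-library Python. The function names, the hex encoding of the digest, and the little-endian byte order are assumptions made for this sketch. The key format and the use of `struct.pack` come from the commit message.

```python
import hashlib
import struct
from typing import List


def cache_key(content: str, provider: str, model: str, dimensions: int) -> str:
    """Build a key of the form SHA-256(content):provider:model:dimensions."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"{digest}:{provider}:{model}:{dimensions}"


def pack_embedding(vector: List[float]) -> bytes:
    """Serialize a vector as packed float32 values (4 bytes each)."""
    # "<" selects little-endian with no padding; the byte order is assumed.
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob: bytes) -> List[float]:
    """Inverse of pack_embedding."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))
```

At 3072 dimensions the packed vector is 3072 × 4 = 12,288 bytes, which matches the "~12 KB/entry" figure. Including provider, model, and dimensions in the key means that identical text embedded by a different model produces a different cache entry.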
…ests
- Intercept embed_text/embed_texts with lazy-import cache check (breaks circular import)
- embed_texts uses get_batch() for single-query batch lookup, calls provider only for misses
- embed_query gets caching for free via embed_text delegation
- Initialize EmbeddingCacheService in lifespan BEFORE IndexingService (ECACHE-07)
- Add _build_provider_fingerprint() helper to api/main.py
- Register cache_router at /index/cache with GET (status) and DELETE (clear) endpoints
- Add embedding_cache field to IndexingStatus; populate in health.py when entry_count > 0
- Write 22 unit tests: all 8 required test cases + coverage for stats/singleton
- Update test_storage_paths.py to include embedding_cache in expected keys
- task before-push: 893 passed, 23 skipped, 0 failed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
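The "batch lookup, provider only for misses" flow can be illustrated with a small sketch. Everything here is hypothetical scaffolding: a plain `dict` stands in for the cache service and its `get_batch()` call, `embed_missing` stands in for the provider, and texts are used directly as keys rather than content hashes.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

Vector = List[float]


async def embed_texts_cached(
    texts: List[str],
    cache: Dict[str, Vector],
    embed_missing: Callable[[List[str]], Awaitable[List[Vector]]],
) -> List[Vector]:
    """Return one embedding per text, calling the provider only for misses."""
    # One pass over the cache; a real service would issue one batched query.
    results: List[Optional[Vector]] = [cache.get(t) for t in texts]
    miss_idx = [i for i, vec in enumerate(results) if vec is None]
    if miss_idx:
        # A single provider call covers every miss.
        fresh = await embed_missing([texts[i] for i in miss_idx])
        for i, vec in zip(miss_idx, fresh):
            results[i] = vec
            cache[texts[i]] = vec  # store so the next call is a hit
    return results  # every slot is filled at this point
```

Results are written back by index, so the output order matches the input order even when hits and misses are interleaved.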
…red into EmbeddingGenerator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add DocServeClient.cache_status() GET /index/cache/status
- Add DocServeClient.clear_cache() DELETE /index/cache
- Create commands/cache.py with cache_group (status + clear subcommands)
- cache status: Rich table with entry_count, hit_rate, hits, misses, mem_entries, size
- cache clear: confirmation prompt with entry count; --yes/-y to skip
- Register cache_group in commands/__init__.py and cli.py
- Update CLI help text to include Cache Commands section

…nd tests
- Add embedding_cache field to IndexingStatus dataclass in api_client.py
- Populate embedding_cache from /health/status response in DocServeClient.status()
- agent-brain status now shows embedding cache line: N entries, X% hit rate (H hits, M misses)
- status --json includes embedding_cache dict (or null for fresh installs)
- Fix test_status_json_output: set mock_status.embedding_cache=None (JSON serializable)
- New test_cache_command.py: 12 tests covering help, status, clear (confirmation + --yes)
- [Rule 1 - Bug] Fixed MagicMock JSON serialization error in existing test

- 16-02-SUMMARY.md: cache group commands + metrics in status + 12 tests
- STATE.md: Phase 16 plan 2/2 complete; decisions recorded

- 16-02-PLAN.md checkbox checked
- Phase 16 progress table: 2/2, Complete, 2026-03-10

Client was calling GET /index/cache/status but server mounts at GET /index/cache/. Fixed URL to match server router. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9/9 must-haves verified, all 6 ECACHE requirements satisfied. Gap fix: CLI cache status endpoint URL corrected (7fea667). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rompt default
- Add "source" key to metadata dict in document_loader.py load_from_folder() and load_single_file() so indexing_service manifest diffing can find file paths (was reading doc.metadata["source"] which was never set)
- Add trailing slash to DELETE /index/cache/ in api_client.py to prevent 307 redirect that caused JSONDecodeError on cache clear
- Pass default=False to Rich Confirm.ask so prompt shows [y/N] instead of [y/n]
- Update 16-UAT.md with full round 2 results (6 pass, 7 issues)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…che routes
Fix 13: Belt-and-suspenders for metadata["source"] in indexing_service.py with fallback chain + invariant guard for documents > 0 with no paths.
Fix 13: Early return in query_service.py when index is empty (avoids 500).
Fix 7: Cache router rewritten with shared impl + no-slash aliases.
Fix 6: Confirm.ask default=False + regression test for prompt default.
Fix 11: Narrowly omit embedding_cache key when None in health endpoint.
Fix 10: Add --verbose/-v flag to status command for cache detail.

All 916 server + 156 CLI tests pass. task before-push clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…locking
Fix 11: Remove response_model=IndexingStatus from /health/status route. Always return model_dump dict so the embedding_cache pop actually works (response_model was re-serializing through Pydantic, re-adding null).
Fix 8: cache clear() no longer acquires self._lock. Uses its own DB connection with PRAGMA busy_timeout=5000 so it does not queue behind the in-process lock held during a long embedding write stream. SQLite WAL mode allows readers to proceed alongside the single active writer; a second writer waits up to busy_timeout for the write lock.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
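A dedicated-connection clear with a busy timeout, as described in Fix 8, could be sketched like this. The synchronous `sqlite3` module is used here instead of aiosqlite to keep the example self-contained, and the `embeddings` table name and `clear_cache` function are assumptions.

```python
import sqlite3


def clear_cache(db_path: str) -> int:
    """Delete every cached row over a dedicated connection; return rows removed."""
    conn = sqlite3.connect(db_path)
    try:
        # Wait up to 5 s for a competing writer instead of failing at once
        # with "database is locked".
        conn.execute("PRAGMA busy_timeout=5000")
        # Count first; the figure is approximate if another writer is active.
        removed = conn.execute("SELECT COUNT(*) FROM embeddings").fetchone()[0]
        conn.execute("DELETE FROM embeddings")
        conn.commit()
        return removed
    finally:
        conn.close()
```

Opening a separate connection keeps the clear independent of any application-level lock around the main connection, while `busy_timeout` bounds how long it waits for SQLite's own write lock.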
Fix 10: /health/status now prefers storage_backend.get_count() over legacy vector_store, fixing total_chunks: 0 when the two Chroma instances diverge.
Fix 8: Add asyncio.sleep(0) yield every 10 docs in chunking loops (ContextAwareChunker.chunk_documents + code chunking loop) so HTTP requests aren't starved during long indexing runs.
Fix 11 test: Add integration test asserting /health/status omits embedding_cache key (not null) when cache is empty.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
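The cooperative-yield pattern from Fix 8 is easy to show in isolation. In this sketch `chunk_one` is a hypothetical synchronous chunking callable and `YIELD_EVERY` is an invented constant name. The interval of 10 and the use of `asyncio.sleep(0)` come from the commit message.

```python
import asyncio
from typing import Callable, List

YIELD_EVERY = 10  # interval named in the commit message


async def chunk_documents(
    docs: List[str], chunk_one: Callable[[str], List[str]]
) -> List[str]:
    """Chunk documents in a loop, yielding to the event loop periodically."""
    chunks: List[str] = []
    for i, doc in enumerate(docs, start=1):
        chunks.extend(chunk_one(doc))
        if i % YIELD_EVERY == 0:
            # sleep(0) suspends this coroutine for one loop iteration so
            # other ready tasks (for example HTTP handlers) get a turn.
            await asyncio.sleep(0)
    return chunks
```

This only helps when each individual `chunk_one` call is short. A single long synchronous call still blocks the loop for its full duration, which is the limitation the later `asyncio.to_thread()` commits address.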
Fix 8 (round 6): The 29.4s cache clear timeout was caused by two things compounding:
1. embed_texts() miss loop calls cache.put() per miss with no yield, starving HTTP request processing during large indexing runs. Fix: asyncio.sleep(0) every 10 cache writes.
2. clear() ran VACUUM inline, which can take 10-20s on a large DB and needs an exclusive lock. Fix: VACUUM scheduled as fire-and-forget background task via asyncio.create_task(), so DELETE + response returns immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…veness
- Wrap document post-processing (stat, language detect) in asyncio.to_thread()
- Wrap chunk_single_document text splitting in asyncio.to_thread()
- Wrap chunk_code_document tree-sitter parsing in asyncio.to_thread()
- Replace fire-and-forget VACUUM with PRAGMA wal_checkpoint(TRUNCATE)
- Add put_many() batch cache writes (single transaction, one lock)
- Use put_many() in embed_texts() instead of per-miss put() loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
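The batched write and the checkpoint-instead-of-VACUUM change can be sketched with the standard `sqlite3` module. This is an approximation of the idea only: the project uses aiosqlite, and the `embeddings` table with `key` and `vector` columns is an assumed schema.

```python
import sqlite3
from typing import List, Tuple


def put_many(conn: sqlite3.Connection, entries: List[Tuple[str, bytes]]) -> None:
    """Write all entries in one transaction instead of one commit per entry."""
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO embeddings (key, vector) VALUES (?, ?)",
            entries,
        )


def clear(conn: sqlite3.Connection) -> None:
    """Delete all rows, then truncate the WAL file rather than running VACUUM."""
    with conn:
        conn.execute("DELETE FROM embeddings")
    # Checkpoint the WAL into the main database file and reset the WAL to
    # zero length; this avoids VACUUM's full rewrite of the database.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
```

A checkpoint does not shrink the main database file the way `VACUUM` does. It trades reclaimed disk space for a much shorter operation, which fits the responsiveness goal stated in these commits.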
…thread()
Root cause: previous fixes only wrapped document loading and chunking in asyncio.to_thread(), but the pipeline's heaviest sync operations were still blocking the event loop:
- ChromaDB collection.upsert() with up to 40k items per batch
- BM25Retriever.from_defaults() + persist() (tokenization + disk I/O)
- GraphIndexManager.build_from_documents() (entity extraction)
- ContentInjector.apply_to_chunks() (metadata enrichment)
- os.stat() during manifest save

All sync-heavy operations in the indexing pipeline now run in threads, keeping the event loop free for concurrent HTTP requests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
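The effect of moving a blocking call into a thread can be demonstrated with a short sketch. Here `build_index` is a hypothetical stand-in for any of the synchronous operations listed above, and `time.sleep` simulates its blocking work.

```python
import asyncio
import time
from typing import List


def build_index(items: List[int]) -> int:
    """Stand-in for a blocking call such as a large upsert (illustrative only)."""
    time.sleep(0.05)  # simulates synchronous work
    return len(items)


async def index_without_blocking(items: List[int]) -> int:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(build_index, items)


async def main() -> None:
    beats = 0

    async def heartbeat() -> None:
        # Represents other work on the loop, such as serving HTTP requests.
        nonlocal beats
        while True:
            beats += 1
            await asyncio.sleep(0.005)

    hb = asyncio.create_task(heartbeat())
    count = await index_without_blocking([1, 2, 3])
    hb.cancel()
    print(f"indexed {count} items; heartbeat ran {beats} times meanwhile")


if __name__ == "__main__":
    asyncio.run(main())
```

Threads help most when the blocking call releases the GIL, as I/O and many C extensions do. Pure-Python CPU work in a thread still competes with the event loop for the interpreter, although the loop does get scheduled between thread switches.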
Phase 16 (Embedding Cache) UAT fully passed after 8 rounds of fixes. Key final fix: wrapping all CPU-heavy indexing pipeline stages in asyncio.to_thread() to prevent event-loop starvation. Also adds uat-tester agent and uat-testing skill for permission-free UAT testing in future phases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 16 delivers a two-layer embedding cache:
- Layer 1: In-memory LRU (OrderedDict, 1000 entries)
- Layer 2: Persistent aiosqlite disk cache (WAL mode, 500MB limit)

Key features:
- Cache-intercepted embed_text/embed_texts with batch lookup
- Provider fingerprint auto-wipe on model change (ECACHE-04)
- CLI: agent-brain cache status, agent-brain cache clear --yes
- /index/cache/status and DELETE /index/cache/ API endpoints
- Health endpoint includes cache metrics when populated
- All CPU-heavy indexing stages run in asyncio.to_thread() for non-blocking API responsiveness during indexing

UAT: 13/13 tests passing across 8 rounds of validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
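The two layers described above can be approximated in a few dozen lines. This is a simplified sketch under stated assumptions: it uses the synchronous `sqlite3` module rather than aiosqlite, omits the disk size limit, statistics, and provider fingerprint, and the class name and `embeddings` schema are invented for illustration.

```python
import sqlite3
from collections import OrderedDict
from typing import Optional


class TwoLayerCache:
    """In-memory LRU in front of a persistent SQLite table."""

    def __init__(self, db_path: str, max_mem_entries: int = 1000) -> None:
        self._mem: "OrderedDict[str, bytes]" = OrderedDict()
        self._max = max_mem_entries
        self._db = sqlite3.connect(db_path)
        self._db.execute("PRAGMA journal_mode=WAL")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS embeddings "
            "(key TEXT PRIMARY KEY, vector BLOB)"
        )
        self._db.commit()

    def _remember(self, key: str, blob: bytes) -> None:
        # Insert or refresh the key, then evict least recently used entries.
        self._mem[key] = blob
        self._mem.move_to_end(key)
        while len(self._mem) > self._max:
            self._mem.popitem(last=False)

    def put(self, key: str, blob: bytes) -> None:
        with self._db:  # one transaction per write
            self._db.execute(
                "INSERT OR REPLACE INTO embeddings (key, vector) VALUES (?, ?)",
                (key, blob),
            )
        self._remember(key, blob)

    def get(self, key: str) -> Optional[bytes]:
        if key in self._mem:  # layer 1 hit
            self._mem.move_to_end(key)
            return self._mem[key]
        row = self._db.execute(
            "SELECT vector FROM embeddings WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None  # miss in both layers
        self._remember(key, row[0])  # promote the disk hit into memory
        return row[0]
```

Entries evicted from memory remain on disk, so a later `get()` for the same key is still served without calling the embedding provider.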
…ference
- Create agent-brain-plugin/commands/agent-brain-cache.md with status and clear subcommands
- Add CACHE COMMANDS category to agent-brain-help.md display section
- Add agent-brain-cache row to Command Reference table in agent-brain-help.md
- Add GET /index/cache and DELETE /index/cache sections to api_reference.md with response schemas
- Add agent-brain cache status/clear commands to CLI Commands Reference section

- Add cache trigger phrases to using-agent-brain SKILL.md YAML description
- Add Cache Management section to using-agent-brain SKILL.md (when to check, when to clear)
- Add Cache Management to Contents table of contents
- Add cache performance trigger pattern to search-assistant.md
- Add cache performance check step (step 6) to search-assistant.md assistance flow
- Add EMBEDDING_CACHE_MAX_MEM_ENTRIES and EMBEDDING_CACHE_MAX_DISK_MB to configuring-agent-brain env vars table
- Add Embedding Cache Tuning note to configuring-agent-brain SKILL.md

…nagement
- 19-01-SUMMARY.md: 6 plugin files updated with cache management surface
- STATE.md: session progress recorded, plan advanced
- ROADMAP.md: Phase 19 marked complete (1/1 plans)
- REQUIREMENTS.md: XCUT-03 marked complete

- Expanded release_agent.md with comprehensive pre-granted permissions (build, quality gates, git, gh CLI, shell utilities)
- Added Phase 19 planning scaffold and research docs
- Added UAT test plans and phase 15 smoke validation plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Milestone v8.0 (Performance & DX) — ships Phases 15, 16, and 19 with 46 commits across 88 files (+13k/-1.7k lines).
Phase 15: File Watcher & Background Incremental Reindex
- FileWatcherService using watchfiles for real-time filesystem monitoring
- --watch flag for agent-brain index and folder management

Phase 16: Embedding Cache (2-Layer: LRU Memory + aiosqlite Disk)
- EmbeddingCacheService with in-memory LRU (1k entries) + persistent SQLite (WAL mode)
- embed_text() and embed_texts() with batch put_many()
- cache status and cache clear commands
- asyncio.to_thread()

Phase 19: Plugin & Skill Updates for Embedding Cache
- agent-brain-cache slash command for cache management

Infrastructure
- …(uat-tester.md) with pre-granted permissions for E2E testing
- …(release_agent.md) with comprehensive permissions

Test plan
- task before-push passes (894 server + 156 CLI tests)

🤖 Generated with Claude Code