feat: BM25 hybrid retrieval, learn_document(), config wiring, MABench resume #30

Merged
emson merged 5 commits into main from feature/elfmem-core-retrieval on Apr 11, 2026

emson commented Apr 11, 2026

Summary

  • BM25 in core retrieval pipeline — stage 2b in hybrid_retrieve() discovers blocks with strong keyword overlap that vector search misses. Soft dependency on rank_bm25 (pip install elfmem[bm25]); silently skipped when not installed.
  • learn_document() — chunk, learn, auto-consolidate in one call. Accepts optional chunker callback (e.g. nltk.sent_tokenize). Returns LearnDocumentResult.
  • dream(skip_llm=, skip_contradictions=) — forwards to consolidate(), enabling fast-path consolidation without bypassing policy tracking.
  • Config wiring fix: contradiction_threshold, near_dup_exact_threshold, near_dup_near_threshold existed in MemoryConfig but were never passed through api.py to consolidate().
  • MABench adapter simplified — 319→160 lines. Removed _BM25Index, _rrf_merge, chunk_text, manual session/consolidation management.
  • MABench resume — atomic per-example result writes + --resume flag skips completed examples.
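
For readers unfamiliar with the soft-dependency pattern in the first bullet, here is a minimal sketch of how a keyword stage can be skipped when rank_bm25 is absent. The helper names (`bm25_candidates`, `merge_candidates`), the whitespace tokenizer, and the simple union merge are assumptions for illustration; the PR does not show how hybrid_retrieve() combines keyword and vector hits.

```python
from typing import Sequence

try:
    # Soft dependency: only available with `pip install elfmem[bm25]`.
    from rank_bm25 import BM25Okapi
    HAS_BM25 = True
except ImportError:
    BM25Okapi = None
    HAS_BM25 = False


def bm25_candidates(query: str, blocks: Sequence[str], top_k: int = 5) -> list[int]:
    """Return indices of up to top_k blocks with a positive BM25 score.

    Returns an empty list when rank_bm25 is not installed, so the caller
    can fall back to vector-only retrieval without special casing.
    """
    if not HAS_BM25 or not blocks:
        return []
    # Hypothetical tokenizer: lowercase whitespace split.
    index = BM25Okapi([block.lower().split() for block in blocks])
    scores = index.get_scores(query.lower().split())
    ranked = sorted(range(len(blocks)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked[:top_k] if scores[i] > 0]


def merge_candidates(vector_hits: list[int], keyword_hits: list[int]) -> list[int]:
    """Union of both candidate lists: vector hits first, duplicates dropped."""
    seen: set[int] = set()
    merged: list[int] = []
    for i in [*vector_hits, *keyword_hits]:
        if i not in seen:
            seen.add(i)
            merged.append(i)
    return merged
```

With this shape, a block found only by keyword overlap is appended to the candidate set, and an environment without rank_bm25 simply contributes no keyword hits.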

Test plan

  • 572 tests passing (30 new, 0 regressions)
  • ruff check clean on all changed files
  • Smoke test: python -m benchmarks.memoryagentbench.runner --test
  • Verify --resume skips completed examples on re-run
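
The resume behaviour being verified can be sketched as follows. This is an illustration under assumptions, not the runner's actual code: the one-JSON-file-per-example layout and the names `write_result_atomic` and `run_examples` are hypothetical. The atomic write uses the common write-to-temp-then-`os.replace` approach, so an interrupted run never leaves a half-written result that a later `--resume` would mistake for a completed example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_result_atomic(results_dir: Path, example_id: str, result: dict) -> Path:
    """Write one example's result so the final file is either complete or absent."""
    results_dir.mkdir(parents=True, exist_ok=True)
    target = results_dir / f"{example_id}.json"
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=results_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(result, fh)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def run_examples(example_ids, run_one, results_dir: Path, resume: bool = False) -> list[str]:
    """Run each example, skipping those with an existing result when resume is set."""
    executed = []
    for example_id in example_ids:
        if resume and (results_dir / f"{example_id}.json").exists():
            continue
        write_result_atomic(results_dir, example_id, run_one(example_id))
        executed.append(example_id)
    return executed
```

Re-running with `resume=True` after a partial run executes only the examples that have no result file yet.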

🤖 Generated with Claude Code

emson and others added 5 commits April 11, 2026 09:07
… resume

Move BM25 keyword search, document chunking, and auto-consolidation
into elfmem core so benchmark adapters stay thin. Fix three config
fields that existed in MemoryConfig but were never wired through to
consolidate(). Add resume support to MABench runner.
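
A minimal sketch of what "wired through" means here, assuming a simplified shape for the API. The field names and the `skip_llm` / `skip_contradictions` keywords come from this PR; the default values, the `Memory` class, and the stand-in `consolidate()` body are hypothetical and only echo what they receive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    # Field names are from the PR; the default values are placeholders.
    contradiction_threshold: float = 0.8
    near_dup_exact_threshold: float = 0.95
    near_dup_near_threshold: float = 0.9


def consolidate(
    *,
    contradiction_threshold: float,
    near_dup_exact_threshold: float,
    near_dup_near_threshold: float,
    skip_llm: bool = False,
    skip_contradictions: bool = False,
) -> dict:
    """Stand-in for the real consolidate(): reports the settings it was given."""
    return {
        "contradiction_threshold": contradiction_threshold,
        "near_dup_exact_threshold": near_dup_exact_threshold,
        "near_dup_near_threshold": near_dup_near_threshold,
        "skip_llm": skip_llm,
        "skip_contradictions": skip_contradictions,
    }


class Memory:
    """Hypothetical API layer that owns the config."""

    def __init__(self, config: MemoryConfig) -> None:
        self.config = config

    def dream(self, *, skip_llm: bool = False, skip_contradictions: bool = False) -> dict:
        cfg = self.config
        # The bug class being fixed: omitting these keyword arguments would
        # silently fall back to consolidate()'s own defaults.
        return consolidate(
            contradiction_threshold=cfg.contradiction_threshold,
            near_dup_exact_threshold=cfg.near_dup_exact_threshold,
            near_dup_near_threshold=cfg.near_dup_near_threshold,
            skip_llm=skip_llm,
            skip_contradictions=skip_contradictions,
        )
```

Making the thresholds required keyword-only arguments in the sketch means a forgotten field fails loudly with a `TypeError` instead of being ignored.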

- BM25 stage 2b in hybrid_retrieve() (soft dep on rank_bm25)
- learn_document() with auto-dream at inbox_threshold intervals
- dream(skip_llm=, skip_contradictions=) forwarding
- Wire contradiction_threshold, near_dup_*_threshold from config
- MABench adapter: 319→160 lines (removed BM25/RRF/chunking)
- MABench runner: atomic per-example writes + --resume
- 30 new tests (572 total)
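
The learn_document() flow with auto-dream at inbox_threshold intervals might look roughly like the toy below. Only the names `learn_document`, `dream`, `inbox_threshold`, and `LearnDocumentResult` come from the PR; the `ToyMemory` class, the result fields, the regex sentence splitter, and the final flush of leftover chunks are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LearnDocumentResult:
    # Result type name is from the PR; these two fields are assumed.
    chunks_learned: int
    consolidations: int


def default_chunker(text: str) -> list[str]:
    """Naive sentence splitter used when no chunker callback is supplied."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class ToyMemory:
    """Hypothetical stand-in for the memory system."""

    def __init__(self, inbox_threshold: int = 3) -> None:
        self.inbox_threshold = inbox_threshold
        self.inbox: list[str] = []
        self.consolidated: list[str] = []

    def learn(self, chunk: str) -> None:
        self.inbox.append(chunk)

    def dream(self) -> None:
        # Consolidation here just moves inbox items to long-term storage.
        self.consolidated.extend(self.inbox)
        self.inbox.clear()

    def learn_document(
        self, text: str, chunker: Optional[Callable[[str], list[str]]] = None
    ) -> LearnDocumentResult:
        chunks = (chunker or default_chunker)(text)
        dreams = 0
        for chunk in chunks:
            self.learn(chunk)
            if len(self.inbox) >= self.inbox_threshold:
                self.dream()
                dreams += 1
        if self.inbox:  # assumed: flush whatever is left at the end
            self.dream()
            dreams += 1
        return LearnDocumentResult(chunks_learned=len(chunks), consolidations=dreams)
```

A caller could pass `nltk.sent_tokenize` as the `chunker` argument, as the PR description suggests, without the core depending on nltk.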

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CI doesn't have nltk installed. The adapter imports nltk at module
level, so the test file needs the same importorskip guard used by
the other benchmark test files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

rank_bm25 is an optional dependency not installed in CI. The direct
unit tests on _stage_2b_bm25_search need the same importorskip
guard so they skip gracefully when the package is absent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove tests on private functions (_stage_2b_bm25_search,
_default_chunker, _assemble_chunks) and internal state (_HAS_BM25).
All behaviour is now tested through the public API (recall(),
learn_document(), dream(), consolidate()). No importorskip guards
needed since tests no longer depend on optional packages directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
emson merged commit 4bc032d into main on Apr 11, 2026
6 checks passed
emson deleted the feature/elfmem-core-retrieval branch on Apr 11, 2026 07:25