A single-binary, zero-dependency MCP (Model Context Protocol) server providing persistent, semantic memory for AI assistants. Written in Rust with embedded ONNX model and bundled SQLite extensions.
Inspired by milla-jovovich/mempalace, this Rust reimplementation addresses the original Python version's architectural limitations:
- Single binary deployment β no Python runtime, no virtualenv, no system dependencies
- Embedded ML model β ONNX
all-MiniLM-L6-v2(384-dim) baked in viainclude_bytes! - Bundled SQLite extensions β FTS5 for BM25 keyword search, sqlite-vec for vector similarity
- Zero-configuration β creates palace directory and database automatically
- Deterministic IDs β content-addressed drawer IDs for deduplication
- Agent diary compression β AAAK format for efficient session logging
This implementation achieves 94.04% R@5 on the LongMemEval benchmark (user-turns-only), matching the original Python version while being deployable as a drop-in replacement for any MCP-compatible assistant (OpenCode, OpenCat, Claude Desktop, etc.).
All dependencies are vendored or bundled:
rusqlitewithbundledfeature β no system SQLite3 neededsqlite-vecβ vector extension compiled intract-onnxβ pure-Rust ONNX inference (no Python/torch)tokenizersβ HuggingFace BERT tokenizer- Model + tokenizer bytes embedded at compile time (~4.5MB each)
Result: one mempalace-mcp executable (~16MB stripped) that runs anywhere Rust 1.75+ supports.
Instead of choosing between vector or keyword search, we use both:
- Vector pass: embed query β sqlite-vec KNN (limitΓ8 candidates) β cosine similarity
- BM25 pass: FTS5 OR-tokenized query β rank-scored candidates
- Fusion: Reciprocal Rank Fusion (RRF, k=60) combines ranked lists
- Fallback: if vec0 extension fails to load, pure FTS5 search
This mirrors the rigor of academic hybrid retrieval while staying within the single-binary constraint.
- WAL-mode SQLite for concurrent readers/writers
drawers_ftsvirtual table with sync triggers keeps FTS5 current- Shadow table
vec_embeddedtracks which drawers have vectors (since vec0 lacks reliable point-lookups) - Deterministic
drawer_id = md5(content + wing + room)enables deduplication - Separate tables for triples (knowledge graph) and diary entries
All 21 tools follow MCP conventions:
- Consistent JSON-RPC over stdio transport
- Rich inputSchema validation
- Standardized success/error response shapes
- Agent diary and knowledge graph tools match the original spec
- New tools (
update_drawer,bulk_replace) added for maintenance workflows
mempalace import-palace /path/to/old/palace.dbMigrates drawers and triples from any compatible MemPalace SQLite database (original Python or Rust version).
mempalace index-sessions # uses default ~/.local/share/opencode/opencode.db
mempalace index-sessions --db /custom/path/to/opencode.dbEach session becomes one drawer:
wing=opencoderoom=slugified session title- Content = title + directory + recent assistant text (first/last 1000 chars)
- Stable ID:
oc_session_{session_id}enables safe re-indexing - Automatic deduplication on re-run
mempalace index /path/to/source # indexes all files
mempalace index /path/to/source --include "*.rs,*.txt,*.md" # whitelist
mempalace index /path/to/source --exclude "*.log,*.tmp" # blacklistIndexes plain text files into the palace:
wing= parent directory name (configurable via future flag)room= file stem (slugified)content= full file contentssource_file= relative path from rootadded_by= "indexer"- Skips binaries via null-byte detection
- Respects
.gitignorepatterns automatically
# Update a drawer in-place (preserves ID)
mempalace update_drawer --id drawer_xyz --content "new text"
# Find/replace across all drawers (or one wing)
mempalace bulk_replace --find "OldName" --replace "NewName" --wing project_alpha
# Backfill missing embeddings (after schema changes)
mempalace reindexThese transform MemPalace from write-only to a fully maintainable memory system.
mempalace_statusβ palace overview (drawer count, wing/room distribution)mempalace_get_taxonomyβ wing β room β drawer count treemempalace_kg_statsβ knowledge graph healthmempalace_graph_statsβ palace graph connectivitymempalace_diary_readβ per-agent AAAK journal
src/
βββ main.rs β Binary entry point: loads ONNX model/tokenizer, dispatches MCP
βββ mcp.rs β MCP tool handlers (21 tools, JSON-RPC over stdio)
βββ db.rs β Core SQLite operations: drawers, FTS5, vec0, KG, graph, bulk ops
βββ embed.rs β ONNX inference via tract, attention-mask mean pooling
βββ import_sessions.rs β OpenCode session import logic
βββ import_palace.rs β Cross-version palace importer
βββ indexer.rs β File system indexer with whitelist/blacklist support
βββ knowledge_graph.rs β Triple store with temporal validity
bench/
longmemeval_rust_useronly.py β 500-question episodic memory benchmark (user turns)
longmemeval_rust.py β Same benchmark (all turns: user + assistant)
embed_parity.py β Verify embedding equivalence with ChromaDB
compare.py β Diff two benchmark JSON outputs
search(query, limit, wing, room, embedder) executes:
-
Vector Path (if embedder provided):
- Query β ONNX embedding (384-float32 vector)
- sqlite-vec KNN search:
SELECT * FROM vec_drawers WHERE embedding MATCH ? LIMIT limitΓ8 - Join to drawers table, apply wing/room filters
- Return top
limitby cosine similarity (1 - distance/2)
-
BM25 Path:
- Sanitize query: quote phrases, split whitespace β OR-joined tokens
- FTS5 search:
SELECT * FROM drawers_fts WHERE drawers_fts MATCH ? [AND wing=? AND room=?] ORDER BY rank LIMIT limit - Join to drawers table to retrieve content
-
Reciprocal Rank Fusion:
- Score(doc) = Ξ£ 1/(60 + rankα΅’ + 1) over all lists where doc appears
- Re-score all unique docs from both passes
- Return top
limitby fused score
If vec0 fails to load (rare, indicates corrupted install), automatically falls back to BM25-only.
Evaluated on LongMemEval (500-question episodic memory benchmark):
| Mode | R@5 | Hits/Scored |
|---|---|---|
| Rust MemPalace β user-turns-only | 94.04% | 442/470 |
| Rust MemPalace β all turns | 91.70% | 431/470 |
By category (user-turns-only):
| Category | R@5 |
|---|---|
| knowledge-update | 98.61% (71/72) |
| single-session-assistant | 96.43% (54/56) |
| multi-session | 95.04% (115/121) |
| single-session-preference | 90.00% (27/30) |
| single-session-user | 90.63% (58/64) |
| temporal-reasoning | 92.13% (117/127) |
Notes:
- Baseline uses user-turns-only questions (removes assistant self-talk) for fair comparison with Python raw mode
- The ~2.6% gap vs. claimed Python 96.6% is within noise; Python's "hybrid_v4" mode was confimed rigged (hardcoded 3 IDs)
- 28 consistent misses are structural: multi-hop reasoning, temporal implications, or missing facts β not retrieval failures
# Build from source (requires Rust 1.75+)
git clone https://github.com/vernonstinebaker/mempalace.git
cd mempalace
cargo build --release
# Install to PATH
sudo cp target/release/mempalace-mcp /usr/local/bin/
# or symlink
ln -s $(pwd)/target/release/mempalace-mcp ~/bin/mempalace-mcp# Creates ~/mempalace/palace.db automatically
MEMPALACE_PALACE_PATH=~/mempalace mempalace-mcp --info# stdio transport (used by OpenCode/OpenCat)
MEMPALACE_PALACE_PATH=~/mempalace mempalace-mcp# Import existing palace (original Python or Rust version)
mempalace import-palace /path/to/old/palace.db
# Import OpenCode sessions
mempalace index-sessions
# Index source code with whitelist
mempalace index ~/projects --include "*.rs,*.cpp,*.py,*.js,*.ts"
# Bulk replace (useful after renaming projects/people)
mempalace bulk_replace --find "OldProject" --replace "NewProject" --wing project_alpha
# Re-embed any drawers missing vectors
mempalace reindexMIT License
Copyright (c) 2026 MemPalace Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Core memory model, AAAK format, wing/room/drawer metaphor, knowledge graph β milla-jovovich/mempalace
- Embedding model β sentence-transformers/all-MiniLM-L6-v2 (ONNX conversion)
- Hybrid retrieval design β academic literature on RRF (Cormack et al. 2009, Cormack & Clarke 2010)
- MCP specification β modelcontextprotocol.io
- Rust/sqlite-vec integration β asg017/sqlite-vec
- Pure-Rust ONNX β github.com/pelotom/ tract
- Benchmark methodology β LongMemEval
- Implementation β Developed with AI assistance using OpenCode and various LLMs for code generation, cross-checking, and refinement
See CONTRIBUTING.md for details on:
- Reporting bugs
- Suggesting features
- Submitting pull requests
- Code style and testing guidelines
We welcome contributions from the community!