This project is a local MCP (Model Context Protocol) server that exposes a small set of tools:
- `memory.search` – semantic search over stored memories
- `memory.save` – store a new memory
- `memory.supersede` – mark an old memory as superseded
- `memory.delete` – permanently remove a memory by id
- `memory.ping` – sanity check / version output

- Keep durable coding context across chat sessions (decisions, preferences, gotchas, API contracts).
- Retrieve relevant past context semantically (not only keyword matching).
- Scope memory per project using `WORKSPACE_KEY` while keeping one shared local database.
- Correct memory over time by superseding outdated entries or deleting irrelevant ones.
- Run everything locally (no external vector DB required).
- User asks a question in chat.
- Agent calls `memory.search` to fetch relevant context.
- Agent answers using retrieved memory + current codebase context.
- New durable insight is stored via `memory.save`.
- Old memory is updated via `memory.supersede` or removed via `memory.delete`.
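
The lifecycle above (save, supersede, delete, and only active memories coming back) can be sketched with a small in-memory model. This is an illustration only, not the project's code: the class name `MemoryLifecycleSketch`, the `supersededBy` field, and the choice that `supersede` links an old id to an already saved replacement are all assumptions made for the example.

```typescript
// Hypothetical in-memory model of the memory lifecycle (no vectors, no Zvec).
interface MemoryRecord {
  id: number;
  workspaceKey: string;
  type: string;
  summary: string;
  supersededBy?: number; // assumed field: id of the memory that replaced this one
}

class MemoryLifecycleSketch {
  private records = new Map<number, MemoryRecord>();
  private nextId = 1;

  // Stores a new memory and returns its id.
  save(workspaceKey: string, type: string, summary: string): number {
    const id = this.nextId++;
    this.records.set(id, { id, workspaceKey, type, summary });
    return id;
  }

  // Marks an old memory as superseded by a newer one; the old record is kept.
  supersede(oldId: number, newId: number): void {
    const old = this.records.get(oldId);
    if (!old) throw new Error(`No memory with id ${oldId}`);
    if (!this.records.has(newId)) throw new Error(`No memory with id ${newId}`);
    old.supersededBy = newId;
  }

  // Permanently removes a memory; returns whether anything was removed.
  delete(id: number): boolean {
    return this.records.delete(id);
  }

  // Memories a search would consider: same workspace and not superseded.
  active(workspaceKey: string): MemoryRecord[] {
    return Array.from(this.records.values()).filter(
      (r) => r.workspaceKey === workspaceKey && r.supersededBy === undefined,
    );
  }
}
```

In this model, superseding keeps the old record for history while hiding it from retrieval, whereas deleting removes it entirely.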
It uses:
- Bun + TypeScript
- Zvec (`@zvec/zvec`) as an embedded in-process vector database (docs: https://zvec.org/en/docs/)
- Ollama `/api/embed` with embeddinggemma for embeddings (docs: https://docs.ollama.com/capabilities/embeddings, model: https://ollama.com/library/embeddinggemma)
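
As a rough sketch of the embedding step, the snippet below posts text to Ollama's `/api/embed` endpoint and converts the returned `embeddings` arrays into `Float32Array` values. The function names `parseEmbedResponse` and `embed` are hypothetical and not taken from this repo; the defaults mirror the documented environment variable defaults.

```typescript
// Shape of the part of the Ollama /api/embed response used here.
interface EmbedResponse {
  embeddings: number[][];
}

// Hypothetical helper: validates the payload and converts each vector.
function parseEmbedResponse(payload: unknown, expectedDim: number): Float32Array[] {
  const embeddings = (payload as Partial<EmbedResponse> | null)?.embeddings;
  if (!Array.isArray(embeddings) || embeddings.length === 0) {
    throw new Error("Ollama response contains no embeddings");
  }
  return embeddings.map((vector, i) => {
    if (!Array.isArray(vector) || vector.length !== expectedDim) {
      throw new Error(`Embedding ${i} does not have dimension ${expectedDim}`);
    }
    return Float32Array.from(vector);
  });
}

// Hypothetical helper: requests embeddings for one or more texts.
async function embed(
  texts: string[],
  baseUrl = "http://localhost:11434",
  model = "embeddinggemma",
  dim = 768,
): Promise<Float32Array[]> {
  const res = await fetch(`${baseUrl}/api/embed`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input: texts }),
  });
  if (!res.ok) {
    throw new Error(`Ollama embed request failed: HTTP ${res.status}`);
  }
  return parseEmbedResponse(await res.json(), dim);
}
```

Checking the vector length against the expected dimension surfaces a model/`EMBEDDING_DIM` mismatch early instead of at insert or query time.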
- Bun installed
- Ollama installed and running locally
Pull the embedding model:

    ollama pull embeddinggemma

Install dependencies:

    bun install

Start the server:

    bun run start

This runs an MCP server over stdio.

Run the tests:

    bun run test

Current tests include:
- `tests/embed.test.ts` – validates Ollama embedding response parsing and error handling
- `tests/memory-db.test.ts` – validates `save`, `search`, `supersede`, and `delete` on the Zvec-backed store

Configuration is read from environment variables:
- `MEMORY_DB_PATH` (default `./data/memory.zvec`)
- `OLLAMA_BASE_URL` (default `http://localhost:11434`)
- `OLLAMA_EMBED_MODEL` (default `embeddinggemma`)
- `EMBEDDING_DIM` (default `768`, must match your embedding model)
- `WORKSPACE_KEY` (default `default`)

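One way these variables and their defaults could be resolved is sketched below. The `MemoryConfig` interface and `loadConfig` function are hypothetical names used for illustration; only the variable names and default values come from the list above.

```typescript
// Hypothetical config shape; field names are assumptions.
interface MemoryConfig {
  memoryDbPath: string;
  ollamaBaseUrl: string;
  ollamaEmbedModel: string;
  embeddingDim: number;
  workspaceKey: string;
}

// Resolves settings from an environment-like object, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): MemoryConfig {
  const embeddingDim = Number(env["EMBEDDING_DIM"] ?? "768");
  if (!Number.isInteger(embeddingDim) || embeddingDim <= 0) {
    throw new Error("EMBEDDING_DIM must be a positive integer");
  }
  return {
    memoryDbPath: env["MEMORY_DB_PATH"] ?? "./data/memory.zvec",
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://localhost:11434",
    ollamaEmbedModel: env["OLLAMA_EMBED_MODEL"] ?? "embeddinggemma",
    embeddingDim,
    workspaceKey: env["WORKSPACE_KEY"] ?? "default",
  };
}
```

Under Bun this could be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.
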
This repo includes .vscode/mcp.json that registers this server:
- command: `bun`
- args: `run start`

You can adjust environment variables in that file.
If you want this MCP server available in all workspaces, add it to your User MCP configuration instead of only .vscode/mcp.json:
- Open Command Palette: `MCP: Open User Configuration`
- Add a server entry that starts this repo from a fixed directory.

Example (Linux):

    {
      "servers": {
        "local-memory-mcp": {
          "type": "stdio",
          "command": "bun",
          "args": ["--cwd", "/path/to/local-memory-mcp", "run", "start"],
          "env": {
            "MEMORY_DB_PATH": "/path/to/local-memory-mcp/data/memory.zvec",
            "OLLAMA_BASE_URL": "http://localhost:11434",
            "OLLAMA_EMBED_MODEL": "embeddinggemma",
            "EMBEDDING_DIM": "768",
            "WORKSPACE_KEY": "${workspaceFolderBasename}"
          }
        }
      }
    }

Notes:
- Use an absolute `MEMORY_DB_PATH` so all projects use the same database.
- `WORKSPACE_KEY=${workspaceFolderBasename}` keeps memories separated per project automatically.
- Enable VS Code setting `chat.mcp.autoStart` (Experimental) to auto-start/restart MCP servers when needed.

Add this server to Claude Code as a local stdio MCP server.
This repository already includes:
- `.mcp.json` for project-scoped Claude MCP configuration
- `CLAUDE.md` for memory-first agent behavior guidelines

User scope:

    claude mcp add --transport stdio --scope user \
      --env MEMORY_DB_PATH=/absolute/path/to/local-memory-mcp/data/memory.zvec \
      --env OLLAMA_BASE_URL=http://localhost:11434 \
      --env OLLAMA_EMBED_MODEL=embeddinggemma \
      --env EMBEDDING_DIM=768 \
      --env WORKSPACE_KEY=default \
      local-memory-mcp -- bun --cwd /absolute/path/to/local-memory-mcp run start

Project scope:

    claude mcp add --transport stdio --scope project \
      --env MEMORY_DB_PATH=./data/memory.zvec \
      --env OLLAMA_BASE_URL=http://localhost:11434 \
      --env OLLAMA_EMBED_MODEL=embeddinggemma \
      --env EMBEDDING_DIM=768 \
      --env WORKSPACE_KEY=${PWD##*/} \
      local-memory-mcp -- bun run start

Project .mcp.json example:

    {
      "mcpServers": {
        "local-memory-mcp": {
          "type": "stdio",
          "command": "bun",
          "args": ["run", "start"],
          "env": {
            "MEMORY_DB_PATH": "./data/memory.zvec",
            "OLLAMA_BASE_URL": "http://localhost:11434",
            "OLLAMA_EMBED_MODEL": "embeddinggemma",
            "EMBEDDING_DIM": "768",
            "WORKSPACE_KEY": "${PWD##*/}"
          }
        }
      }
    }

Notes:
- `--scope project` writes to `.mcp.json` in the project root.
- `--scope user` stores the server in your user Claude configuration.
- Keep all Claude flags before the server name, and put `--` before the server command.

Useful commands:

    claude mcp list
    claude mcp get local-memory-mcp
    claude mcp remove local-memory-mcp

    {
      "tool": "memory.search",
      "arguments": {
        "query": "What is our policy for multi-session memory?",
        "topK": 8,
        "workspaceKey": "my-repo"
      }
    }

    {
      "tool": "memory.save",
      "arguments": {
        "workspaceKey": "my-repo",
        "type": "decision",
        "summary": "We use zvec with Ollama embeddinggemma for long-term memory.",
        "text": "Decision: The Copilot/agent memory sidecar stores vectors in zvec and generates embeddings via Ollama /api/embed using embeddinggemma.",
        "tags": ["memory", "zvec", "ollama", "embeddinggemma"],
        "importance": 0.8
      }
    }

    {
      "tool": "memory.delete",
      "arguments": {
        "workspaceKey": "my-repo",
        "id": 42
      }
    }

- The DB uses one Zvec collection with:
  - dense vector field `embedding`
  - scalar fields for metadata (`workspaceKey`, `type`, `summary`, etc.)
- KNN queries are executed through Zvec `querySync` with metadata filters.

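To illustrate what a filtered KNN query does conceptually, here is a brute-force sketch in plain TypeScript. It does not use or mirror the Zvec API: `StoredMemory`, `cosineSimilarity`, and `searchTopK` are hypothetical names, and cosine similarity is an assumed metric chosen for the example.

```typescript
// Hypothetical record shape: metadata fields plus a dense embedding.
interface StoredMemory {
  id: number;
  workspaceKey: string;
  type: string;
  superseded: boolean;
  embedding: Float32Array;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("Vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Applies the metadata filter first, then ranks the remaining memories by similarity.
function searchTopK(
  memories: StoredMemory[],
  query: Float32Array,
  workspaceKey: string,
  topK: number,
  type?: string,
): { id: number; score: number }[] {
  return memories
    .filter(
      (m) =>
        m.workspaceKey === workspaceKey &&
        !m.superseded &&
        (type === undefined || m.type === type),
    )
    .map((m) => ({ id: m.id, score: cosineSimilarity(query, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A vector database avoids scanning every record like this, but the observable behavior is the same idea: only memories matching the workspace (and optional type) are ranked, and superseded ones never appear.
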
Run all tests:

    bun run test

Current test coverage:
- `tests/embed.test.ts`
  - parses successful Ollama `/api/embed` responses into `Float32Array`
  - verifies error handling when Ollama returns non-2xx responses
- `tests/memory-db.test.ts`
  - validates `save` + `search` behavior with workspace/type filtering
  - validates `supersede` behavior (superseded items are excluded from search)
  - validates `delete` behavior and returned payload semantics