Graphiti is a framework for building and querying temporally-aware knowledge graphs, specifically tailored for AI agents operating in dynamic environments. Unlike traditional retrieval-augmented generation (RAG) methods, Graphiti continuously integrates user interactions, structured and unstructured enterprise data, and external information into a coherent, queryable graph. The framework supports incremental data updates, efficient retrieval, and precise historical queries without requiring complete graph recomputation, making it suitable for developing interactive, context-aware AI applications.
This is an experimental Model Context Protocol (MCP) server implementation for Graphiti. The MCP server exposes Graphiti's key functionality through the MCP protocol, allowing AI assistants to interact with Graphiti's knowledge graph capabilities.
The Graphiti MCP server exposes the following key high-level functions of Graphiti:
- Episode Management: Add, retrieve, and delete episodes (text, messages, or JSON data)
- Entity Management: Search and manage entity nodes and relationships in the knowledge graph
- Search Capabilities: Search for facts (edges) and node summaries using semantic and hybrid search
- Group Management: Organize and manage groups of related data with group_id filtering
- Graph Maintenance: Clear the graph and rebuild indices
- Flexible Ollama Configuration: Fully configurable LLM and embedding models via CLI arguments and environment variables
The fastest way to get started is using uvx to install and run from PyPI:
```bash
# Install and run with default settings
uvx montesmakes.graphiti-memory

# Run with custom configuration
uvx montesmakes.graphiti-memory --transport stdio --group-id my-project
```

For detailed uvx usage and configuration options, see UVXUSAGE.md.
For development or to run from source:
```bash
git clone https://github.com/mandelbro/graphiti-memory.git
cd graphiti/mcp_server
```

Or with GitHub CLI:

```bash
gh repo clone mandelbro/graphiti-memory
cd graphiti/mcp_server
```

- Note the full path to this directory: `cd graphiti && pwd`
- Install the Graphiti prerequisites.
- Configure Claude, Cursor, or another MCP client to use Graphiti with a stdio transport. See the client documentation on where to find their MCP configuration files.
- Change directory to the mcp_server directory: `cd graphiti/mcp_server`
- Start the service using Docker Compose: `docker compose up`
- Point your MCP client to http://localhost:8020/sse
The fastest way to get started is using uvx to run the server directly:
```bash
# Install and run with default settings
uvx montesmakes.graphiti-memory

# Run with custom configuration
uvx montesmakes.graphiti-memory --transport stdio --group-id my-project

# For detailed uvx usage, see UVXUSAGE.md
```

- Ensure you have Python 3.10 or higher installed.
- A running Neo4j database (version 5.26 or later required)
- Ollama installed and running (default) OR OpenAI API key for LLM operations
The server now defaults to using Ollama for LLM operations and embeddings. To set up Ollama:
- Install Ollama: Visit https://ollama.ai for installation instructions
- Start Ollama: Run `ollama serve` to start the Ollama server
- Pull required models:

  ```bash
  ollama pull deepseek-r1:7b    # LLM model
  ollama pull nomic-embed-text  # Embedding model
  ```
The server will automatically connect to Ollama at http://localhost:11434/v1 and use these models by default.
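If you want to confirm Ollama is reachable before starting the server, the following is a minimal, standard-library-only Python sketch that queries Ollama's OpenAI-compatible model listing. The base URL and model names are the defaults described above; adjust them to your setup.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434/v1"             # server default
REQUIRED_MODELS = {"deepseek-r1:7b", "nomic-embed-text"}  # default models listed above

# Ollama exposes an OpenAI-compatible model listing at /v1/models
with urllib.request.urlopen(f"{OLLAMA_BASE_URL}/models") as response:
    available = {model["id"] for model in json.load(response)["data"]}

missing = REQUIRED_MODELS - available
if missing:
    print(f"Missing models, run `ollama pull` for: {', '.join(sorted(missing))}")
else:
    print("Ollama is reachable and the default models are pulled.")
```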
If you prefer to use OpenAI instead of Ollama:
- Set the environment variable: `USE_OLLAMA=false`
- Configure your OpenAI API key: `OPENAI_API_KEY=your_api_key_here`
- Optionally customize the models using the `MODEL_NAME` and `SMALL_MODEL_NAME` environment variables
The project includes Docker Compose configuration for easy deployment. There are several ways to configure Ollama with Docker:
If you have Ollama running on your host machine:
- Start Ollama on your host: `ollama serve`
- Pull required models:

  ```bash
  ollama pull deepseek-r1:7b
  ollama pull nomic-embed-text
  ```

- Start the services: `docker compose up`

The server will connect to your host Ollama instance using `host.docker.internal:11434`.
Note: The Graphiti core library requires an OPENAI_API_KEY environment variable even when using Ollama (for the reranker component). The Docker configuration includes a dummy API key (abc) for this purpose.
If you prefer to run Ollama in a container:
- Uncomment the Ollama service in docker-compose.yml:

  ```yaml
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
  ```

- Update the OLLAMA_BASE_URL in the environment section:

  ```yaml
  environment:
    - OLLAMA_BASE_URL=http://ollama:11434/v1
  ```

- Uncomment the volume:

  ```yaml
  volumes:
    ollama_data:
  ```

- Start the services: `docker compose up`
- Pull models in the container:

  ```bash
  docker compose exec ollama ollama pull deepseek-r1:7b
  docker compose exec ollama ollama pull nomic-embed-text
  ```
To use OpenAI instead of Ollama in Docker:
- Create a .env file with your OpenAI configuration:

  ```bash
  USE_OLLAMA=false
  OPENAI_API_KEY=your_openai_api_key_here
  MODEL_NAME=gpt-4o-mini
  ```

- Update docker-compose.yml to use the .env file:

  ```yaml
  env_file:
    - .env
  ```

- Start the services: `docker compose up`
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd mcp_server
  ```

- Install dependencies: `uv sync`
- Set up environment variables (optional):

  ```bash
  cp .env.example .env
  # Edit .env with your configuration
  ```

- Start Neo4j (if using Docker): `docker compose up neo4j -d`
- Run the server: `uv run src/graphiti_mcp_server.py --transport sse`
The server supports multiple configuration methods with the following precedence (highest to lowest):
- CLI arguments (highest priority)
- Environment variables
- YAML configuration files
- Default values (lowest priority)
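To make the precedence concrete, here is an illustrative Python sketch, not the server's actual implementation, that resolves a single setting (the Ollama LLM model) from a built-in default, a YAML file, the `OLLAMA_LLM_MODEL` environment variable, and the `--ollama-llm-model` CLI flag. It assumes PyYAML is installed; the `resolve_llm_model` helper is hypothetical.

```python
import argparse
import os

import yaml  # PyYAML, assumed installed for this sketch

DEFAULTS = {"llm_model": "deepseek-r1:7b"}  # built-in default documented in this README

def resolve_llm_model(yaml_path: str | None = None, argv: list[str] | None = None) -> str:
    """Resolve one setting using CLI > environment > YAML > defaults."""
    value = DEFAULTS["llm_model"]  # 4. default value (lowest priority)

    # 3. YAML configuration file
    if yaml_path and os.path.exists(yaml_path):
        with open(yaml_path) as f:
            value = (yaml.safe_load(f) or {}).get("llm", {}).get("model", value)

    # 2. Environment variable
    value = os.getenv("OLLAMA_LLM_MODEL", value)

    # 1. CLI argument wins over everything else
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--ollama-llm-model")
    args, _ = parser.parse_known_args(argv)
    return args.ollama_llm_model or value

print(resolve_llm_model("config/providers/ollama.yml"))
```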
For complex configurations, you can use YAML files in the config/ directory:
```yaml
# config/providers/ollama.yml
llm:
  model: "deepseek-r1:7b"
  base_url: "http://localhost:11434/v1"
  temperature: 0.1
  max_tokens: 8192
  model_parameters:
    num_ctx: 4096        # Context window size
    num_predict: -1      # Number of tokens to predict
    repeat_penalty: 1.1  # Penalty for repeating tokens
    top_k: 40            # Limit token selection to top K
    top_p: 0.9           # Cumulative probability cutoff
```

See config/README.md for detailed information about YAML configuration.
The server uses the following environment variables:
- `NEO4J_URI`: URI for the Neo4j database (default: `bolt://localhost:7687`)
- `NEO4J_USER`: Neo4j username (default: `neo4j`)
- `NEO4J_PASSWORD`: Neo4j password (default: `demodemo`)
The server now defaults to using Ollama for LLM operations and embeddings. You can configure it using these environment variables:
- `USE_OLLAMA`: Use Ollama for LLM and embeddings (default: `true`)
- `OLLAMA_BASE_URL`: Ollama base URL (default: `http://localhost:11434/v1`)
- `OLLAMA_LLM_MODEL`: Ollama LLM model name (default: `deepseek-r1:7b`)
- `OLLAMA_EMBEDDING_MODEL`: Ollama embedding model name (default: `nomic-embed-text`)
- `OLLAMA_EMBEDDING_DIM`: Ollama embedding dimension (default: `768`)
- `LLM_MAX_TOKENS`: Maximum tokens for LLM responses (default: `8192`)
Ollama Model Parameters: You can now configure Ollama-specific model parameters like num_ctx, top_p, repeat_penalty, etc. using YAML configuration files. This provides fine-grained control over model behavior that wasn't previously available through environment variables alone.
To use OpenAI instead of Ollama, set USE_OLLAMA=false and configure:
- `OPENAI_API_KEY`: OpenAI API key (required for LLM operations)
- `OPENAI_BASE_URL`: Optional base URL for the OpenAI API
- `MODEL_NAME`: OpenAI model name to use for LLM operations (default: `gpt-4.1-mini`)
- `SMALL_MODEL_NAME`: OpenAI model name to use for smaller LLM operations (default: `gpt-4.1-nano`)
- `LLM_TEMPERATURE`: Temperature for LLM responses (0.0-2.0)
- `LLM_MAX_TOKENS`: Maximum tokens for LLM responses (default: `8192`)
To use Azure OpenAI, set USE_OLLAMA=false and configure:
- `AZURE_OPENAI_ENDPOINT`: Azure OpenAI LLM endpoint URL
- `AZURE_OPENAI_DEPLOYMENT_NAME`: Azure OpenAI LLM deployment name
- `AZURE_OPENAI_API_VERSION`: Azure OpenAI LLM API version
- `AZURE_OPENAI_EMBEDDING_API_KEY`: Azure OpenAI embedding deployment key (if different from `OPENAI_API_KEY`)
- `AZURE_OPENAI_EMBEDDING_ENDPOINT`: Azure OpenAI embedding endpoint URL
- `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME`: Azure OpenAI embedding deployment name
- `AZURE_OPENAI_EMBEDDING_API_VERSION`: Azure OpenAI API version
- `AZURE_OPENAI_USE_MANAGED_IDENTITY`: Use Azure Managed Identities for authentication
- `LLM_MAX_TOKENS`: Maximum tokens for LLM responses (default: `8192`)
- `SEMAPHORE_LIMIT`: Episode processing concurrency. See Concurrency and LLM Provider 429 Rate Limit Errors
- `MCP_SERVER_PORT`: Port for the MCP server when using SSE transport (default: `8020`)
You can set these variables in a .env file in the project directory. A sample configuration file (sample_env.txt) is provided with all available options and their default values.
To run the Graphiti MCP server directly using uv:
```bash
uv run src/graphiti_mcp_server.py
```

With options:

```bash
uv run src/graphiti_mcp_server.py --model gpt-4.1-mini --transport sse
```

Available arguments:

- `--model`: Overrides the `MODEL_NAME` environment variable (only when not using Ollama)
- `--small-model`: Overrides the `SMALL_MODEL_NAME` environment variable (only when not using Ollama)
- `--temperature`: Overrides the `LLM_TEMPERATURE` environment variable
- `--max-tokens`: Overrides the `LLM_MAX_TOKENS` environment variable
- `--transport`: Choose the transport method (`sse` or `stdio`, default: `sse`)
- `--port`: Port to bind the MCP server to (default: `8020`)
- `--group-id`: Set a namespace for the graph (optional). If not provided, defaults to "default"
- `--destroy-graph`: If set, destroys all Graphiti graphs on startup
- `--use-custom-entities`: Enable entity extraction using the predefined ENTITY_TYPES
- `--use-ollama`: Use Ollama for LLM and embeddings (default: `true`)
- `--ollama-base-url`: Ollama base URL (default: `http://localhost:11434/v1`)
- `--ollama-llm-model`: Ollama LLM model name (default: `deepseek-r1:7b`)
- `--ollama-embedding-model`: Ollama embedding model name (default: `nomic-embed-text`)
- `--ollama-embedding-dim`: Ollama embedding dimension (default: `768`)
The Graphiti MCP server provides flexible configuration options for Ollama models. Here are some common use cases:
Use default models:
```bash
# With default .env configuration
uv run src/graphiti_mcp_server.py

# Or explicitly set in .env file:
# USE_OLLAMA=true
# OLLAMA_LLM_MODEL=deepseek-r1:7b
# OLLAMA_EMBEDDING_MODEL=nomic-embed-text
```

Use a different LLM model:

```bash
uv run src/graphiti_mcp_server.py --ollama-llm-model llama3.2:3b
```

Use a different embedding model with custom dimension:

```bash
uv run src/graphiti_mcp_server.py --ollama-embedding-model all-minilm-l6-v2 --ollama-embedding-dim 384
```

Use custom max tokens for larger responses:

```bash
uv run src/graphiti_mcp_server.py --max-tokens 32768
```

Connect to a remote Ollama server:

```bash
uv run src/graphiti_mcp_server.py --ollama-base-url http://remote-server:11434/v1 --ollama-llm-model llama3.2:8b
```

You can also configure Ollama models using environment variables in a .env file:
```bash
# Create or edit .env file
nano .env
```

Add the following variables to your .env file:

```bash
# Ollama Configuration
OLLAMA_LLM_MODEL=mistral:7b
OLLAMA_EMBEDDING_MODEL=all-minilm-l6-v2
OLLAMA_EMBEDDING_DIM=384
LLM_TEMPERATURE=0.1
LLM_MAX_TOKENS=32768
```

Then run the server:

```bash
uv run src/graphiti_mcp_server.py
```

The configuration system follows this priority order (highest to lowest):
- CLI arguments - Override all other settings
- Environment variables - Provide defaults that can be overridden by CLI
- Default values - Built-in defaults for all settings
Common LLM Models:
- `deepseek-r1:7b` (default) - Good balance of performance and quality
- `llama3.2:3b` - Fast, smaller model for development
- `llama3.2:8b` - Higher quality, larger model
- `mistral:7b` - Excellent performance for many tasks
- `codellama:7b` - Specialized for code generation
- `phi3:3.8b` - Microsoft's efficient model
Common Embedding Models:
- `nomic-embed-text` (default) - High-quality embeddings
- `nomic-embed-text-v2` - Improved version of the default
- `all-minilm-l6-v2` - Fast, efficient embeddings
- `all-MiniLM-L6-v2` - Alternative spelling for the same model
- `text-embedding-ada-002` - OpenAI-compatible embeddings
- Smaller models (3B parameters) are faster but may have lower quality
- Larger models (7B+ parameters) provide better quality but require more resources
- Embedding dimensions affect both performance and storage requirements (see the rough estimate sketched after this list)
- Remote Ollama servers can be used for distributed deployments
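As a rough illustration of the storage point above, the back-of-the-envelope Python calculation below compares raw float32 vector storage at common embedding dimensions. The 100,000-vector count is an illustrative assumption, and real Neo4j storage overhead will differ.

```python
# Back-of-the-envelope storage estimate for float32 embeddings.
BYTES_PER_FLOAT32 = 4
NUM_VECTORS = 100_000  # illustrative assumption, not a Graphiti figure

def embedding_storage_mb(num_vectors: int, dimension: int) -> float:
    return num_vectors * dimension * BYTES_PER_FLOAT32 / 1_000_000

for dim in (384, 768, 1536):
    print(f"{dim:>4}-dim x {NUM_VECTORS:,} vectors = ~{embedding_storage_mb(NUM_VECTORS, dim):.0f} MB")
# 384 -> ~154 MB, 768 -> ~307 MB, 1536 -> ~614 MB of raw vector data
```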
Graphiti's ingestion pipelines are designed for high concurrency, controlled by the SEMAPHORE_LIMIT environment variable.
By default, SEMAPHORE_LIMIT is set to 10 concurrent operations to help prevent 429 rate limit errors from your LLM provider. If you encounter such errors, try lowering this value.
If your LLM provider allows higher throughput, you can increase SEMAPHORE_LIMIT to boost episode ingestion performance.
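The sketch below illustrates the general pattern behind `SEMAPHORE_LIMIT`; it is not the server's actual ingestion code. An `asyncio.Semaphore` caps how many episode-processing tasks, and therefore concurrent LLM calls, run at once. The `process_episode` function here is a placeholder.

```python
import asyncio
import os

SEMAPHORE_LIMIT = int(os.getenv("SEMAPHORE_LIMIT", "10"))  # default from this README

async def process_episode(episode_id: int) -> None:
    # Placeholder for the LLM-backed extraction work done per episode.
    await asyncio.sleep(0.1)
    print(f"processed episode {episode_id}")

async def ingest(episode_ids: list[int]) -> None:
    semaphore = asyncio.Semaphore(SEMAPHORE_LIMIT)

    async def bounded(episode_id: int) -> None:
        async with semaphore:  # at most SEMAPHORE_LIMIT concurrent LLM calls
            await process_episode(episode_id)

    await asyncio.gather(*(bounded(i) for i in episode_ids))

asyncio.run(ingest(list(range(50))))
```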
The Graphiti MCP server can be deployed using Docker. The Dockerfile uses uv for package management, ensuring
consistent dependency installation.
Before running the Docker Compose setup, you need to configure the environment variables. You have two options:
- Using a .env file (recommended):
  - Copy the provided .env.example file to create a .env file: `cp .env.example .env`
  - Edit the .env file to set your OpenAI API key and other configuration options:

    ```bash
    # Required for LLM operations
    OPENAI_API_KEY=your_openai_api_key_here
    MODEL_NAME=gpt-4.1-mini
    # Optional: OPENAI_BASE_URL only needed for non-standard OpenAI endpoints
    # OPENAI_BASE_URL=https://api.openai.com/v1
    ```

  - The Docker Compose setup is configured to use this file if it exists (it's optional)
- Using environment variables directly:
  - You can also set the environment variables when running the Docker Compose command:

    ```bash
    OPENAI_API_KEY=your_key MODEL_NAME=gpt-4.1-mini docker compose up
    ```
The Docker Compose setup includes a Neo4j container with the following default configuration:
- Username: `neo4j`
- Password: `demodemo`
- URI: `bolt://neo4j:7687` (from within the Docker network)
- Memory settings optimized for development use
A Graphiti MCP container is available at: zepai/knowledge-graph-mcp. The latest build of this container is used by the Compose setup below.
Start the services using Docker Compose:
```bash
docker compose up
```

Or if you're using an older version of Docker Compose:

```bash
docker-compose up
```

This will start both the Neo4j database and the Graphiti MCP server. The Docker setup:

- Uses `uv` for package management and running the server
- Installs dependencies from the `pyproject.toml` file
- Connects to the Neo4j container using the environment variables
- Exposes the server on port 8020 for HTTP-based SSE transport
- Includes a healthcheck for Neo4j to ensure it's fully operational before starting the MCP server
To use the Graphiti MCP server with an MCP-compatible client, configure it to connect to the server:
Important: You will need the Python package manager, uv, installed. Please refer to the uv install instructions.
Ensure that you set the full path to the uv binary and your Graphiti project folder.
Basic Ollama configuration:
```json
{
"mcpServers": {
"graphiti-memory": {
"transport": "stdio",
"command": "/Users/<user>/.local/bin/uv",
"args": [
"run",
"--isolated",
"--directory",
"/Users/<user>/dev/graphiti-memory",
"--project",
".",
"src/graphiti_mcp_server.py",
"--transport",
"stdio"
],
"env": {
"NEO4J_URI": "bolt://localhost:7687",
"NEO4J_USER": "neo4j",
"NEO4J_PASSWORD": "password"
}
}
}
}
```

Custom Ollama models via CLI arguments:
```json
{
"mcpServers": {
"graphiti-memory": {
"transport": "stdio",
"command": "/Users/<user>/.local/bin/uv",
"args": [
"run",
"--isolated",
"--directory",
"/Users/<user>/dev/graphiti-memory",
"--project",
".",
"src/graphiti_mcp_server.py",
"--transport",
"stdio",
"--ollama-llm-model",
"llama3.2:3b",
"--ollama-embedding-model",
"all-minilm-l6-v2",
"--ollama-embedding-dim",
"384"
],
"env": {
"NEO4J_URI": "bolt://localhost:7687",
"NEO4J_USER": "neo4j",
"NEO4J_PASSWORD": "password"
}
}
}
}
```

Custom Ollama models via environment variables:
```json
{
"mcpServers": {
"graphiti-memory": {
"transport": "stdio",
"command": "/Users/<user>/.local/bin/uv",
"args": [
"run",
"--isolated",
"--directory",
"/Users/<user>/dev/graphiti-memory",
"--project",
".",
"src/graphiti_mcp_server.py",
"--transport",
"stdio"
],
"env": {
"NEO4J_URI": "bolt://localhost:7687",
"NEO4J_USER": "neo4j",
"NEO4J_PASSWORD": "password",
"OLLAMA_LLM_MODEL": "mistral:7b",
"OLLAMA_EMBEDDING_MODEL": "nomic-embed-text-v2",
"OLLAMA_EMBEDDING_DIM": "768",
"LLM_TEMPERATURE": "0.1",
"LLM_MAX_TOKENS": "32768"
}
}
}
}
```

To use OpenAI instead of Ollama, set `USE_OLLAMA` to `false` and provide your API key:

```json
{
"mcpServers": {
"graphiti-memory": {
"transport": "stdio",
"command": "/Users/<user>/.local/bin/uv",
"args": [
"run",
"--isolated",
"--directory",
"/Users/<user>/dev/graphiti-memory",
"--project",
".",
"src/graphiti_mcp_server.py",
"--transport",
"stdio"
],
"env": {
"NEO4J_URI": "bolt://localhost:7687",
"NEO4J_USER": "neo4j",
"NEO4J_PASSWORD": "password",
"USE_OLLAMA": "false",
"OPENAI_API_KEY": "sk-XXXXXXXX",
"MODEL_NAME": "gpt-4.1-mini"
}
}
}
}
```

For SSE transport (HTTP-based), you can use this configuration:
```json
{
"mcpServers": {
"graphiti-memory": {
"transport": "sse",
"url": "http://localhost:8020/sse"
}
}
}
```

The Graphiti MCP server exposes the following tools (an example client call is sketched after the list):
- `add_episode`: Add an episode to the knowledge graph (supports text, JSON, and message formats)
- `search_nodes`: Search the knowledge graph for relevant node summaries
- `search_facts`: Search the knowledge graph for relevant facts (edges between entities)
- `delete_entity_edge`: Delete an entity edge from the knowledge graph
- `delete_episode`: Delete an episode from the knowledge graph
- `get_entity_edge`: Get an entity edge by its UUID
- `get_episodes`: Get the most recent episodes for a specific group
- `clear_graph`: Clear all data from the knowledge graph and rebuild indices
- `get_status`: Get the status of the Graphiti MCP server and Neo4j connection
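For a sense of how an MCP client could invoke these tools over the SSE transport, here is a hedged Python sketch using the official MCP Python SDK (the `mcp` package). The endpoint and the `add_episode` arguments mirror examples elsewhere in this README, but SDK details may vary by version, so treat this as illustrative rather than canonical.

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    # Connect to the Graphiti MCP server's SSE endpoint (default port 8020)
    async with sse_client("http://localhost:8020/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the tools listed above
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])

            # Add a structured JSON episode (same shape as the example below)
            result = await session.call_tool(
                "add_episode",
                arguments={
                    "name": "Customer Profile",
                    "episode_body": '{"company": {"name": "Acme Technologies"}}',
                    "source": "json",
                    "source_description": "CRM data",
                },
            )
            print(result)

asyncio.run(main())
```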
The Graphiti MCP server can process structured JSON data through the add_episode tool with source="json". This
allows you to automatically extract entities and relationships from structured data:
```python
add_episode(
    name="Customer Profile",
    episode_body="{\"company\": {\"name\": \"Acme Technologies\"}, \"products\": [{\"id\": \"P001\", \"name\": \"CloudSync\"}, {\"id\": \"P002\", \"name\": \"DataMiner\"}]}",
    source="json",
    source_description="CRM data"
)
```
To integrate the Graphiti MCP Server with the Cursor IDE, follow these steps:
- Run the Graphiti MCP server using the SSE transport:
  ```bash
  python src/graphiti_mcp_server.py --transport sse --use-custom-entities --group-id <your_group_id>
  ```

  Hint: specify a group_id to namespace graph data. If you do not specify a group_id, the server will use "default" as the group_id.

  or

  ```bash
  docker compose up
  ```

- Configure Cursor to connect to the Graphiti MCP server:
```json
{
"mcpServers": {
"graphiti-memory": {
"url": "http://localhost:8020/sse"
}
}
}
```

- Add the Graphiti rules to Cursor's User Rules. See cursor_rules.md for details.
- Kick off an agent session in Cursor.
The integration enables AI assistants in Cursor to maintain persistent memory through Graphiti's knowledge graph capabilities.
The Graphiti MCP Server container uses the SSE MCP transport. Claude Desktop does not natively support SSE, so you'll need to use a gateway like mcp-remote.
- Run the Graphiti MCP server using SSE transport: `docker compose up`
- (Optional) Install mcp-remote globally: If you prefer to have mcp-remote installed globally, or if you encounter issues with npx fetching the package, you can install it globally with `npm install -g mcp-remote`. Otherwise, npx (used in the next step) will handle it for you.
- Configure Claude Desktop: Open your Claude Desktop configuration file (usually claude_desktop_config.json) and add or modify the mcpServers section as follows:

  ```json
  {
    "mcpServers": {
      "graphiti-memory": {
        "command": "npx",
        "args": [
          "mcp-remote",
          "http://localhost:8020/sse"
        ]
      }
    }
  }
  ```

  You can choose a different server name than "graphiti-memory", use the full path to mcp-remote if npx is not in your PATH, and make sure the URL matches your Graphiti server's SSE endpoint. If you already have an mcpServers entry, add graphiti-memory (or your chosen name) as a new key within it.

- Restart Claude Desktop for the changes to take effect.
Server won't start with Ollama:
- Ensure Ollama is installed and running: `ollama serve`
- Check that required models are pulled: `ollama list`
- Verify the Ollama server is accessible: `curl http://localhost:11434/v1/models`
- Check your .env file has `USE_OLLAMA=true` (default)
Model not found errors:
- Pull the required model: `ollama pull <model-name>`
- Check model name spelling (case-sensitive)
- Verify model is available in Ollama library
Embedding dimension mismatch:
- Ensure `OLLAMA_EMBEDDING_DIM` matches your embedding model's output dimension (a quick check is sketched below)
- Common dimensions: 384 (all-minilm-l6-v2), 768 (nomic-embed-text), 1536 (nomic-embed-text-v2)
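One way to check the model's actual output dimension is to request a single embedding through Ollama's OpenAI-compatible embeddings endpoint. This is an illustrative, standard-library-only Python sketch that assumes the default base URL and the environment variable names documented above.

```python
import json
import os
import urllib.request

base_url = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1")
model = os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
expected = int(os.getenv("OLLAMA_EMBEDDING_DIM", "768"))

# Request one embedding and compare its length to OLLAMA_EMBEDDING_DIM
request = urllib.request.Request(
    f"{base_url}/embeddings",
    data=json.dumps({"model": model, "input": "dimension check"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    actual = len(json.load(response)["data"][0]["embedding"])

print(f"model={model} actual_dim={actual} expected_dim={expected}")
if actual != expected:
    print("Mismatch: update OLLAMA_EMBEDDING_DIM to match the model's output dimension.")
```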
Performance issues:
- Try smaller models for faster response times
- Adjust `SEMAPHORE_LIMIT` for concurrency control
- Consider using remote Ollama servers for distributed workloads
Neo4j connection errors:
- Verify Neo4j is running and accessible
- Check connection credentials and URI
- Ensure Neo4j version is 5.26 or later
MCP client connection issues:
- Verify transport method (stdio vs sse) matches client requirements
- Check port configuration for SSE transport
- Ensure firewall allows connections on configured ports
- Python 3.10 or higher
- Neo4j database (version 5.26 or later required)
- Ollama installed and running (default) OR OpenAI API key (for LLM operations)
- MCP-compatible client
The Graphiti MCP server uses the Graphiti core library, which includes anonymous telemetry collection. When you initialize the Graphiti MCP server, anonymous usage statistics are collected to help improve the framework.
- Anonymous identifier and system information (OS, Python version)
- Graphiti version and configuration choices (LLM provider, database backend, embedder type)
- No personal data, API keys, or actual graph content is ever collected
To disable telemetry in the MCP server, set the environment variable:
```bash
export GRAPHITI_TELEMETRY_ENABLED=false
```

Or add it to your .env file:

```bash
GRAPHITI_TELEMETRY_ENABLED=false
```
For complete details about what's collected and why, see the Telemetry section in the main Graphiti README.
This project is automatically published to PyPI using GitHub Actions and trusted publishing:
- PyPI: https://pypi.org/project/montesmakes.graphiti-memory/
- Installation: `uvx montesmakes.graphiti-memory` or `uv tool install montesmakes.graphiti-memory`
- Release Process: Use `scripts/prepare-release.sh` to prepare new releases
- Manual Testing: Use the "Manual Package Test" GitHub Action workflow
Releases are automatically published to PyPI when a new GitHub release is created. TestPyPI deployment happens automatically on pushes to the main branch.
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Clone the repository:

  ```bash
  git clone https://github.com/mandelbro/graphiti-memory.git
  cd graphiti/mcp_server
  ```

- Install dependencies: `uv sync --extra dev`
- Run tests: `uv run pytest`
- Format and lint:

  ```bash
  uv run ruff format
  uv run ruff check
  ```
This project is licensed under the same license as the parent Graphiti project.