
MemCP — Persistent Main Memory Database with MySQL Compatibility

MemCP is a MySQL-compatible database that keeps your data fully in main memory for maximum speed. Your data is safe — it's persisted to disk automatically and survives restarts — but because there's no disk I/O on the query path, reads and aggregations run 10–100× faster than MySQL.

Use your existing MySQL connector, ORMs, and SQL queries. No migration, no rewrite.

Status: Beta. Core SQL (SELECT, INSERT, UPDATE, DELETE, JOIN, GROUP BY, subqueries, triggers) works well in production. Some advanced SQL edge cases are still being improved — check open issues if a specific query doesn't behave as expected.

Why Switch from MySQL? 💡

10-100x Faster on Aggregations and Reports

  • No disk I/O on the query path — data lives in main memory, always ready
  • Fast for writes and reads — handles inserts and complex statistics in the same database
  • Built-in REST API — query your database directly over HTTP, no middleware needed
  • Sub-millisecond response times even on large tables with GROUP BY and aggregations

Why is it so much faster for statistics? MySQL reads entire rows to compute a SUM() or COUNT(). MemCP stores each column separately, so SELECT region, SUM(revenue) FROM orders GROUP BY region only reads the two columns it actually needs — not every field of every row.
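The access-pattern difference is easy to sketch in Python (illustrative only; MemCP's actual storage engine is written in Go). A row store must touch every field of every row to aggregate one column, while a column store scans a single contiguous array:

```python
# Illustrative sketch of row-store vs column-store aggregation.
# Not MemCP code, just the access-pattern difference it exploits.

# Row store: one record per row; SUM(revenue) walks entire rows.
rows = [
    {"id": 1, "region": "EU", "revenue": 100, "email": "a@example.com"},
    {"id": 2, "region": "US", "revenue": 250, "email": "b@example.com"},
    {"id": 3, "region": "EU", "revenue": 50,  "email": "c@example.com"},
]
row_sum = sum(r["revenue"] for r in rows)  # reads every field of every row

# Column store: each column is its own array; SUM(revenue) reads one array.
columns = {
    "id":      [1, 2, 3],
    "region":  ["EU", "US", "EU"],
    "revenue": [100, 250, 50],
    "email":   ["a@example.com", "b@example.com", "c@example.com"],
}
col_sum = sum(columns["revenue"])  # id, region, email are never touched

assert row_sum == col_sum == 400
```

The result is identical; only the amount of data scanned differs, and that gap grows with row width and row count.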

🔌 Drop-in MySQL Compatibility

-- Your existing MySQL queries work immediately
CREATE TABLE users (id INT, name VARCHAR(100), email VARCHAR(255));
INSERT INTO users VALUES (1, 'Alice', 'alice@example.com');
SELECT * FROM users WHERE id = 1;

🌐 Built-in REST API Server

# Start MemCP with REST API
./memcp --api-port=4321 lib/main.scm

# Query via HTTP instantly
curl -X POST http://localhost:4321/sql/mydb \
  -d "SELECT * FROM users" \
  -H "Authorization: Basic cm9vdDphZG1pbg=="
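The Authorization header above is standard HTTP Basic auth: the base64 encoding of user:password. You can verify (or regenerate) the token for the default root/admin credentials in two lines of Python:

```python
import base64

# HTTP Basic auth token = base64("user:password")
token = base64.b64encode(b"root:admin").decode()
print(token)  # cm9vdDphZG1pbg== (matches the curl header above)
```

With curl you can also pass `-u root:admin` and let curl build the header for you, as the later examples in this README do.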

API Endpoints:

  • /sql/<database> — MySQL-dialect SQL
  • /psql/<database> — PostgreSQL-dialect SQL
  • /rdf/<database> — SPARQL queries
  • /rdf/<database>/load_ttl — load RDF/Turtle data
  • /dashboard — admin dashboard with live CPU/memory/connection gauges, database browser, shard and compression statistics, and user management

📊 Perfect for Modern Workloads

  • Microservices - Embedded database per service
  • APIs and Web Apps - Ultra-low latency responses
  • Real-time Analytics - Process data as fast as it arrives
  • Development & Testing - Instant setup, no configuration

Architecture & Languages 🏗️

MemCP combines the best of multiple worlds with a carefully chosen tech stack:

Go (Storage Engine & Core)

  • High-performance storage engine built in Go
  • Concurrent request handling with goroutines
  • Memory-efficient data structures
  • Cross-platform compatibility

Scheme (SQL Parser & Compiler)

  • Advanced SQL parser written in Scheme
  • Query optimization and compilation
  • Extensible language for complex transformations
  • Functional programming advantages for parsing

Flexible Scripting Support

  • Command-line argument support for automation
  • Dynamic query generation and processing
  • Easy integration with existing workflows

Key Advantages 🎯

🔥 Ultra-Fast REST APIs

Traditional setup: Client → HTTP Server → Database Connection → Disk I/O
MemCP:             Client → HTTP Server → Main Memory

// Response times you'll see
MySQL (with network + disk):  10-50ms
MemCP (main memory):          0.1-1ms  // 50x faster!

⚡ Docker

docker pull carli2/memcp
docker run -it -p 4321:4321 -p 3307:3307 carli2/memcp

🧠 Persistent and Safe

  • Data is written to disk — restarts and crashes don't lose your data
  • S3, MinIO, and Ceph backends for cloud and distributed deployments
  • Automatic compression reduces storage footprint significantly vs. MySQL
  • Configurable data directory — point it at any local or remote path

Memory Management 🧩

MemCP is designed to run alongside other services on the same machine without blowing up your RAM.

Automatic compression — MemCP stores each column in the most compact format that fits the data: small integers get bit-packed, repeated strings become dictionary-encoded, sequential IDs are stored as ranges. A table that takes 10 GB in MySQL often fits in 1–3 GB in MemCP.
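Dictionary encoding, one of the techniques mentioned above, is simple to sketch (illustrative Python, not MemCP's Go implementation): each distinct string is stored once, and every cell becomes a small integer index into that dictionary.

```python
# Dictionary-encode a column of repeated strings (illustrative sketch).
def dict_encode(column):
    dictionary = []   # unique values, in first-seen order
    index = {}        # value -> position in dictionary
    codes = []        # one small int per row instead of one string
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes

region = ["EU", "US", "EU", "EU", "APAC", "US", "EU"]
dictionary, codes = dict_encode(region)

assert dictionary == ["EU", "US", "APAC"]
assert codes == [0, 1, 0, 0, 2, 1, 0]
# 7 strings collapse to 3 strings + 7 small integers; with only 3
# distinct values, each code fits in 2 bits once bit-packed.
assert [dictionary[c] for c in codes] == region  # lossless round-trip
```

The encoding is lossless, and low-cardinality columns (regions, statuses, categories) shrink dramatically, which is where the 10 GB → 1–3 GB figure comes from.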

Configurable memory budget — by default MemCP uses at most 50% of your server's RAM. You can set an exact limit via the dashboard or the settings API:

# Limit to 4 GB total
curl -u root:admin -X POST http://localhost:4321/dashboard/api/settings \
  -d '{"key":"MaxRamBytes","value":4294967296}'

# Or as a percentage of total RAM (default: 50)
curl -u root:admin -X POST http://localhost:4321/dashboard/api/settings \
  -d '{"key":"MaxRamPercent","value":40}'
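For reference, MaxRamBytes is specified in bytes; the value 4294967296 in the example above is exactly 4 GiB:

```python
# MaxRamBytes is in bytes: 4 GiB = 4 * 1024^3 = 4294967296
GIB = 1024 ** 3
assert 4 * GIB == 4294967296
```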

Automatic eviction — when MemCP approaches its memory limit, it automatically unloads the least recently used data from RAM. That data stays safe on disk and is transparently reloaded the next time a query needs it. Frequently accessed hot data stays in memory; cold data steps aside.
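The eviction policy described above (least recently used) can be sketched with Python's OrderedDict. This is a toy stand-in for MemCP's internal mechanism, which operates on column shards rather than key/value pairs:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: evicts the least recently used entry when full.
    In MemCP, evicted data still lives on disk and is reloaded on demand."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None             # cache miss: would reload from disk
        self.data.move_to_end(key)  # mark as recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # drop the coldest entry

cache = LRUCache(capacity=2)
cache.put("orders.revenue", "shard-a")
cache.put("users.email", "shard-b")
cache.get("orders.revenue")             # touched: now the hottest entry
cache.put("products.price", "shard-c")  # over budget: evicts users.email

assert cache.get("users.email") is None          # cold data was unloaded
assert cache.get("orders.revenue") == "shard-a"  # hot data stayed in RAM
```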

System-wide pressure awareness — if the whole server runs low on free RAM (below 10%), MemCP detects this and proactively releases its own cache — even if its own budget isn't exhausted yet. This keeps your application, web server, and OS responsive regardless of load spikes.

Separate budget for persistent data — a second budget (default: 30% of RAM) controls how much space the on-disk data loaded into RAM may occupy, independently of temporary query working memory. Both limits are tunable at runtime without restart.

🔧 Developer-Friendly

  • Comprehensive test suite with 2470+ SQL tests across 100+ test suites
  • YAML-based testing framework
  • Extensive error handling and validation
  • Built-in performance monitoring

Quick Start 🚀

# 1. Build MemCP
go mod download
make

# 2. Start with REST API
./memcp --api-port=4321 --mysql-port=3307 lib/main.scm

# Run as a background daemon (use --no-repl to avoid exiting when stdin closes)
./memcp --no-repl --api-port=4321 --mysql-port=3307 lib/main.scm &

# 3. Create your first database
curl -X POST http://localhost:4321/sql/system \
  -d "CREATE DATABASE myapp" \
  -u root:admin

# 4. Start building lightning-fast apps!
curl -X POST http://localhost:4321/sql/myapp \
  -d "CREATE TABLE products (id INT, name VARCHAR(100), price DECIMAL(10,2))" \
  -u root:admin

CLI Flags

Flag                      Default          Description
--api-port=PORT           4321             HTTP API listen port
--mysql-port=PORT         3307             MySQL protocol listen port
--mysql-socket=PATH       /tmp/memcp.sock  MySQL Unix socket path
--root-password=PASSWORD  admin            Initial root password (first run only)
--disable-api             off              Disable HTTP API server
--disable-mysql           off              Disable MySQL protocol server
--no-repl                 off              Disable interactive REPL (required for daemon/background use)
-data DIR                 ./data           Data directory

Authentication

Security note: Never expose MemCP directly to the internet with default credentials. Always set a strong --root-password before any network-accessible deployment.

  • Default credentials: root / admin.
  • Set the initial root password via CLI: --root-password=supersecret at the first run (on a fresh -data folder), or via Docker env ROOT_PASSWORD.
  • Docker Compose example:
services:
  memcp:
    image: carli2/memcp:latest
    environment:
      - ROOT_PASSWORD=supersecret
      - PARAMS=--api-port=4321
    ports:
      - "4321:4321"  # HTTP API
      - "3307:3307"  # MySQL protocol
    volumes:
      - memcp_data:/data
volumes:
  memcp_data: {}
  • Change the credentials with:
curl -X POST http://localhost:4321/sql/system \
  -d "ALTER USER root IDENTIFIED BY 'supersecret'" \
  -u root:admin

Performance Comparison 📈

Query                                                         MySQL (SSD)  MemCP
SELECT * FROM users WHERE id = 1                              1–5ms        0.1ms
SELECT region, SUM(revenue) FROM orders GROUP BY region       200–800ms    2–10ms
SELECT COUNT(*), AVG(price) FROM products WHERE category = ?  50–200ms     0.5ms
INSERT INTO events VALUES (...)                               10–30ms      0.2ms
REST API call (HTTP → query → JSON response)                  20–100ms     1–10ms

Measured on standard dev hardware. Aggregation queries show the largest speedup because MemCP only reads the columns a query actually uses.

Use Cases 💼

  • 📊 Dashboards and Reports - GROUP BY queries that take seconds in MySQL run in milliseconds
  • 📡 Realtime Monitoring - Aggregate metrics and counters over millions of rows without slowing down
  • 🛒 E-commerce - Product catalog queries, price calculations, and inventory stats at any scale
  • 🎮 Gaming Backends - Leaderboards, player statistics, and session data with sub-millisecond latency
  • 💰 Financial Applications - Aggregations, risk calculations, and transaction summaries in real time
  • 🧪 Development & Testing - Instant database setup, no configuration, throw it away when done

Contributing 🤝

We'd love your help making MemCP even better!

🌟 Why Contribute?

  • Work with cutting-edge database technology
  • Learn Go, Scheme, and database internals
  • Help shape an early-stage project where your contributions have real impact
  • Build ultra-high-performance systems

🛠️ Easy Ways to Contribute

  • 📝 Add test cases - Expand our comprehensive test suite
  • 🐛 Fix bugs - Help us squash issues and improve stability
  • ⚡ Performance optimization - Make fast even faster
  • 📚 Documentation - Help other developers get started
  • 🔧 New features - SQL functions, operators, and capabilities

🚀 Getting Started

# 1. Fork the repository
# 2. Clone your fork (the URL below is the upstream repo; substitute your own)
git clone https://github.com/launix-de/memcp.git

# 3. Set up development environment
cd memcp
go build -o memcp

# 4. Run the test suite (starts its own server automatically)
python3 run_sql_tests.py tests/01_basic_sql.yaml

# 5. Make your changes and add tests
# 6. Submit a pull request!

🎯 Current Contribution Opportunities

  • Vector database features - Advanced similarity search
  • Additional SQL functions - String, math, and date functions
  • Performance benchmarking - Automated performance testing
  • Driver development - Language-specific database drivers
  • Documentation examples - Real-world usage scenarios

Testing 🧪

MemCP includes a comprehensive test framework:

# Run all tests
make test

# Or if you want to contribute, deploy this as a Pre-commit hook:
cp git-pre-commit .git/hooks/pre-commit

# Run specific test suites (starts its own server automatically)
python3 run_sql_tests.py tests/01_basic_sql.yaml      # Basic operations
python3 run_sql_tests.py tests/02_functions.yaml      # SQL functions
python3 run_sql_tests.py tests/07_error_cases.yaml    # Error handling

# Connect to an already-running instance (skip startup)
python3 run_sql_tests.py tests/01_basic_sql.yaml 4321 --connect-only

Performance Testing 📊

MemCP includes an auto-calibrating performance test framework that adapts to your machine.

Running Performance Tests

# Run perf tests (uses calibrated baselines)
PERF_TEST=1 make test

# Calibrate for your machine (run ~10 times to reach target time range)
PERF_TEST=1 PERF_CALIBRATE=1 make test

# Freeze row counts for bisecting performance regressions
PERF_TEST=1 PERF_NORECALIBRATE=1 make test

# Show query plans for each test
PERF_TEST=1 PERF_EXPLAIN=1 make test

How Calibration Works

  1. Initial run starts with 10,000 rows per test
  2. Each calibration run scales row counts by 30% up/down
  3. Target is 10-20 seconds query time per test
  4. Baselines are stored in .perf_baseline.json
  5. After ~10 runs, row counts stabilize in the target range
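Under the simplifying assumption that query time scales linearly with row count, the calibration loop above can be simulated. The per-row cost here is a hypothetical value; on real hardware it varies per test, which is why several runs are needed before row counts settle:

```python
# Simulate the calibration loop (assumes time grows linearly with rows).
US_PER_ROW = 500                    # hypothetical cost: 0.5 ms per row
TARGET = (10_000_000, 20_000_000)   # target window: 10-20 s, in microseconds

rows = 10_000                       # initial row count per test
runs = 0
while True:
    t = rows * US_PER_ROW           # estimated query time for this row count
    if TARGET[0] <= t <= TARGET[1]:
        break                       # row count has stabilized in the window
    # scale by 30% up or down, as in the calibration procedure above
    rows = int(rows * (1.3 if t < TARGET[0] else 1 / 1.3))
    runs += 1

print(runs, rows)
```

Because each step only moves the row count by 30%, convergence takes several runs; a test whose initial estimate is far outside the window needs correspondingly more calibration passes.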

Output Format

✅ Perf: COUNT (7.9ms / 8700ms, 20,000 rows, 0.39µs/row, 11.4MB heap)
         │       │        │           │        │           └─ Heap memory after insert
         │       │        │           │        └─ Time per row
         │       │        │           └─ Calibrated row count
         │       │        └─ Threshold (from baseline × 1.1)
         │       └─ Actual query time
         └─ Test name

Performance Debugging Cookbook

Detecting a performance regression:

# 1. Freeze baselines to use consistent row counts
PERF_TEST=1 PERF_NORECALIBRATE=1 make test

# 2. If a test fails threshold, you have a regression

Bisecting a performance bug:

# 1. Checkout the known-good commit, run calibration
git checkout good-commit
PERF_TEST=1 PERF_CALIBRATE=1 make test  # run 10x to calibrate

# 2. Save the baseline
cp .perf_baseline.json .perf_baseline_good.json

# 3. Bisect with frozen row counts
git bisect start
git bisect bad HEAD
git bisect good good-commit
git bisect run bash -c 'PERF_TEST=1 PERF_NORECALIBRATE=1 make test'

Analyzing slow queries:

# Show query plans to understand execution
PERF_TEST=1 PERF_EXPLAIN=1 make test

Environment Variables

Variable            Values  Description
PERF_TEST           0/1     Enable performance tests
PERF_CALIBRATE      0/1     Update baselines with new times
PERF_NORECALIBRATE  0/1     Freeze row counts (for bisecting)
PERF_EXPLAIN        0/1     Show query plans

Remote Storage Backends 🗄️

MemCP supports storing databases on remote storage backends instead of the local filesystem. To configure a remote backend, create a JSON configuration file in the data folder instead of a directory.

S3 / MinIO Storage

Store your database on Amazon S3 or any S3-compatible storage (MinIO, Ceph RGW, etc.).

Configuration file (data/mydb.json):

{
  "backend": "s3",
  "access_key_id": "your-access-key",
  "secret_access_key": "your-secret-key",
  "region": "us-east-1",
  "bucket": "memcp-data",
  "prefix": "databases"
}

For MinIO or self-hosted S3-compatible storage:

{
  "backend": "s3",
  "access_key_id": "minioadmin",
  "secret_access_key": "minioadmin",
  "endpoint": "http://localhost:9000",
  "bucket": "memcp",
  "prefix": "data",
  "force_path_style": true
}

Quick MinIO setup for testing:

# Start MinIO with Docker
docker run -d --name minio \
  -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"

# Create a bucket (via MinIO Console at http://localhost:9001)
# Or via mc CLI:
mc alias set local http://localhost:9000 minioadmin minioadmin
mc mb local/memcp

Ceph/RADOS Storage

Store your database directly on Ceph RADOS for high-performance distributed storage.

Why is Ceph optional? The Ceph backend uses CGO to link against librados (the Ceph client library). This requires the C headers and library to be installed at compile time and the shared library at runtime. To keep the default build simple and portable, Ceph support is behind a build tag.

# Install Ceph development libraries (Ubuntu/Debian)
sudo apt-get install librados-dev

# Build MemCP with Ceph support
make ceph
# or: go build -tags=ceph

Configuration file (data/mydb.json):

{
  "backend": "ceph",
  "username": "client.memcp",
  "cluster": "ceph",
  "pool": "memcp",
  "prefix": "databases"
}

Optional fields:

  • conf_file: Path to ceph.conf (defaults to /etc/ceph/ceph.conf)

Setting up a Ceph development cluster with vstart.sh:

# Clone Ceph source
git clone https://github.com/ceph/ceph.git
cd ceph

# Install dependencies and build (only vstart target needed)
./install-deps.sh
pip install cython setuptools
./do_cmake.sh
cd build && ninja vstart

# Start a development cluster
cd ..
MON=1 OSD=3 MDS=0 MGR=1 ./build/bin/vstart.sh -d -n -x

# Create a pool for MemCP
./build/bin/ceph osd pool create memcp 32

# Create a user for MemCP (optional, can also use client.admin)
./build/bin/ceph auth get-or-create client.memcp \
  mon 'allow r' \
  osd 'allow rwx pool=memcp' \
  -o ceph.client.memcp.keyring

Environment variables for vstart cluster:

export CEPH_CONF=/path/to/ceph/build/ceph.conf
export CEPH_KEYRING=/path/to/ceph/build/keyring

Backend Configuration Reference

Field              Backend  Description
backend            all      Backend type: "s3" or "ceph"
prefix             all      Object key prefix for database objects
access_key_id      S3       AWS or S3-compatible access key
secret_access_key  S3       AWS or S3-compatible secret key
region             S3       AWS region (e.g., "us-east-1")
endpoint           S3       Custom endpoint URL (for MinIO, etc.)
bucket             S3       S3 bucket name
force_path_style   S3       Use path-style URLs (required for MinIO)
username           Ceph     Ceph user (e.g., "client.admin")
cluster            Ceph     Cluster name (usually "ceph")
conf_file          Ceph     Path to ceph.conf (optional)
pool               Ceph     RADOS pool name
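A quick sanity check for a backend config file can be built from the reference table above. This is a hypothetical helper, not part of MemCP, and the set of strictly required fields is inferred from the configuration examples in this README:

```python
import json

# Assumed required fields per backend, inferred from the examples above
# (region/endpoint are optional for S3-compatible stores like MinIO).
REQUIRED = {
    "s3":   {"backend", "access_key_id", "secret_access_key", "bucket"},
    "ceph": {"backend", "pool"},
}

def check_backend_config(text):
    """Return a list of problems found in a data/<db>.json config."""
    cfg = json.loads(text)
    backend = cfg.get("backend")
    if backend not in REQUIRED:
        return [f"unknown backend: {backend!r}"]
    missing = REQUIRED[backend] - cfg.keys()
    return [f"missing field: {f}" for f in sorted(missing)]

good = '{"backend": "s3", "access_key_id": "k", "secret_access_key": "s", "bucket": "memcp"}'
bad  = '{"backend": "s3", "bucket": "memcp"}'
assert check_backend_config(good) == []
assert check_backend_config(bad) == ["missing field: access_key_id",
                                     "missing field: secret_access_key"]
```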

License 📄

MemCP is open source software. See the LICENSE file for details.


Ready to experience database performance like never before? Get Started · Contribute · Join our Community

MemCP: Because your applications deserve better than "good enough" performance.

About

A columnar In-Memory Database as Drop-In Replacement for MySQL supporting 10x performance in OLAP workloads and similar performance in OLTP
