A lightweight, distributed, high-performance metadata management component that can replace resource-heavy systems like ZooKeeper and etcd. It can be integrated as a library (so/lib) or deployed as a single-process solution.
- Raft Consensus: Built on etcd's battle-tested raft library
- High Availability: Tolerates up to (N-1)/2 node failures in an N-node cluster
- Dual Storage Modes:
- Memory + WAL: Default mode with WAL-based persistence (fast, suitable for most use cases)
- RocksDB: Full persistent storage backend (requires RocksDB C++ library)
- HTTP API: Simple REST API for key-value operations
- Dynamic Membership: Add/remove nodes without downtime
- Snapshots: Automatic log compaction via snapshots
- Single Binary: Lightweight, easy to deploy
The easiest way to build and run MetaStore is using the provided Makefile:
# Build the binary
make build
# Or simply
make
# Run with memory storage
make run-memory
# Run with RocksDB storage
make run-rocksdb
# Start 3-node cluster with memory storage
make cluster-memory
# Start 3-node cluster with RocksDB storage
make cluster-rocksdb
# Stop all nodes
make stop-cluster
# Clean build artifacts
make clean
# Show all available commands
make help
Prerequisites:
- Go 1.23 or higher
- CGO enabled
- RocksDB C++ library installed
Linux:
# Debian/Ubuntu
sudo apt-get install librocksdb-dev
# RHEL/CentOS/Fedora
sudo yum install rocksdb-devel
# Build MetaStore
export CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
export CGO_ENABLED=1
go build -ldflags="-s -w" -o metaStore
Alternatively, build RocksDB v10.7.5 from source:
# Install build dependencies
sudo yum install -y gcc-c++ make cmake git \
snappy snappy-devel \
zlib zlib-devel \
bzip2 bzip2-devel \
lz4-devel \
zstd libzstd-devel \
gflags-devel
# Install GCC 11 toolset (required for RocksDB)
sudo dnf install -y gcc-toolset-11
scl enable gcc-toolset-11 bash
echo "source /opt/rh/gcc-toolset-11/enable" >> ~/.bashrc
source ~/.bashrc
# Clone and build RocksDB v10.7.5
git clone --branch v10.7.5 https://github.com/facebook/rocksdb.git
cd rocksdb
make clean
make static_lib -j$(nproc)
sudo make install
# Return to MetaStore directory and build
cd /path/to/MetaStore
export CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
export CGO_ENABLED=1
go build -ldflags="-s -w" -o metaStore
Note: Building RocksDB from source gives you the latest stable version (v10.7.5) with better performance and bug fixes. The package-manager version may be older.
macOS:
# Install RocksDB
brew install rocksdb
# Build MetaStore
export CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
export CGO_ENABLED=1
go build -ldflags="-s -w" -o metaStore
Alternatively, build RocksDB v10.7.5 from source:
# Install build dependencies
brew install cmake snappy zlib bzip2 lz4 zstd gflags
# Clone and build RocksDB v10.7.5
git clone --branch v10.7.5 https://github.com/facebook/rocksdb.git
cd rocksdb
make clean
make static_lib -j$(sysctl -n hw.ncpu)
sudo make install
# Return to MetaStore directory and build
cd /path/to/MetaStore
export CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
export CGO_ENABLED=1
go build -ldflags="-s -w" -o metaStore
Note for macOS users: If you encounter linking errors with Go 1.25+ on older SDK versions, see ROCKSDB_BUILD_MACOS.md for detailed troubleshooting steps.
Windows:
# Install RocksDB using vcpkg
vcpkg install rocksdb:x64-windows
# Build MetaStore
$env:CGO_ENABLED=1
$env:CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
go build -ldflags="-s -w" -o metaStore.exe
Alternatively, build RocksDB from source:
# Install dependencies (requires Visual Studio 2019 or later)
# Install CMake, Git, and required compression libraries via vcpkg
vcpkg install snappy:x64-windows zlib:x64-windows bzip2:x64-windows lz4:x64-windows zstd:x64-windows
# Clone and build RocksDB
git clone --branch v10.7.5 https://github.com/facebook/rocksdb.git
cd rocksdb
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019" -A x64 -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release
cmake --install . --prefix "C:\rocksdb"
# Build MetaStore (adjust paths to match your installation)
cd \path\to\MetaStore
$env:CGO_ENABLED=1
$env:CGO_CFLAGS="-IC:\rocksdb\include"
$env:CGO_LDFLAGS="-LC:\rocksdb\lib -lrocksdb"
go build -ldflags="-s -w" -o metaStore.exe
The unified build produces a single binary that supports both memory and RocksDB storage modes. You can switch between storage engines at runtime using the --storage flag.
# Start with memory + WAL storage (default)
metaStore --id 1 --cluster http://127.0.0.1:12379 --port 12380 --storage memory
# Create data directory first
mkdir -p data
# Start with RocksDB storage
metaStore --id 1 --cluster http://127.0.0.1:12379 --port 12380 --storage rocksdb
Each MetaStore process runs a single raft instance and a key-value server. The process's comma-separated peer list (--cluster), its raft ID as an index into that list (--id), and the HTTP key-value server port (--port) are passed on the command line.
Next, store a value ("hello") to a key ("my-key"):
curl -L http://127.0.0.1:12380/my-key -XPUT -d hello
Finally, retrieve the stored key:
curl -L http://127.0.0.1:12380/my-key
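The same operations can be issued programmatically. Below is a minimal Go sketch using only the standard library; the base URL and key mirror the curl calls above, and nothing here is part of an official MetaStore client API:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	url := "http://127.0.0.1:12380/my-key"

	// PUT "hello" under "my-key" (equivalent to: curl -L <url> -XPUT -d hello).
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader("hello"))
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// GET the value back (equivalent to: curl -L <url>).
	resp, err = http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	val, _ := io.ReadAll(resp.Body)
	fmt.Printf("my-key = %s\n", val)
}
```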
To test cluster recovery, first start a cluster and write a value "foo":
curl -L http://127.0.0.1:12380/my-key -XPUT -d foo
Next, remove a node and replace the value with "bar" to check cluster availability:
curl -L http://127.0.0.1:12380/my-key -XPUT -d bar
curl -L http://127.0.0.1:32380/my-key
Finally, bring the node back up and verify it recovers with the updated value "bar":
curl -L http://127.0.0.1:22380/my-key
Nodes can be added to or removed from a running cluster using requests to the REST API.
For example, suppose we have a 3-node cluster that was started with the commands:
metaStore --id 1 --cluster http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 --port 12380
metaStore --id 2 --cluster http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 --port 22380
metaStore --id 3 --cluster http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 --port 32380
A fourth node with ID 4 can be added by issuing a POST:
curl -L http://127.0.0.1:12380/4 -XPOST -d http://127.0.0.1:42379
Then the new node can be started as the others were, using the --join option:
metaStore --id 4 --cluster http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379,http://127.0.0.1:42379 --port 42380 --join
The new node should join the cluster and be able to service key/value requests.
We can remove a node using a DELETE request:
curl -L http://127.0.0.1:12380/3 -XDELETE
Node 3 should shut itself down once the cluster has processed this request.
MetaStore consists of three components: a raft-backed key-value store, a REST API server, and a raft consensus server based on etcd's raft implementation.
The raft-backed key-value store is a key-value map that holds all committed key-values. The store bridges communication between the raft server and the REST server. Key-value updates are issued through the store to the raft server. The store updates its map once raft reports the updates are committed.
The REST server exposes the current raft consensus by accessing the raft-backed key-value store. A GET command looks up a key in the store and returns the value, if any. A key-value PUT command issues an update proposal to the store.
The raft server participates in consensus with its cluster peers. When the REST server submits a proposal, the raft server transmits it to its peers. When raft reaches consensus, the server publishes all committed updates over a commit channel, which the key-value store consumes.
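The commit-channel pattern described above can be sketched in Go. This is an illustrative simplification rather than the actual MetaStore types; it shows how reads serve only committed state while writes flow through raft before touching the map:

```go
package store

import (
	"bytes"
	"encoding/gob"
	"strings"
	"sync"
)

// kv is a single proposed key-value update.
type kv struct {
	Key string
	Val string
}

// kvStore sketches the raft-backed store: reads come from the local map,
// while writes are proposed to raft and applied only once committed.
type kvStore struct {
	mu       sync.RWMutex
	data     map[string]string
	proposeC chan<- string // proposals flow to the raft server
}

// Lookup serves a GET from committed state only.
func (s *kvStore) Lookup(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Propose forwards a PUT to raft; the local map is not modified here.
func (s *kvStore) Propose(key, val string) {
	var buf bytes.Buffer
	gob.NewEncoder(&buf).Encode(kv{Key: key, Val: val})
	s.proposeC <- buf.String()
}

// readCommits applies updates only after raft reports them committed.
func (s *kvStore) readCommits(commitC <-chan string) {
	for entry := range commitC {
		var e kv
		gob.NewDecoder(strings.NewReader(entry)).Decode(&e)
		s.mu.Lock()
		s.data[e.Key] = e.Val
		s.mu.Unlock()
	}
}
```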
- Persistence: Write-Ahead Log (WAL) + periodic snapshots
- Storage Location: ./metaStore-{id}/ (WAL), ./metaStore-{id}-snap/ (snapshots)
- Use Case: Fast performance, suitable for most scenarios
- Data Loss: Minimal (only uncommitted entries on crash)
- Recovery: Fast snapshot + WAL replay
- Persistence: Full persistent storage with RocksDB backend
- Storage Location: ./data/{id}/ (RocksDB, with snapshots in ./data/{id}/snap/)
- Use Case: When you need guaranteed persistence of all data
- Data Loss: None (all data persisted to disk atomically)
- Recovery: Direct from RocksDB (faster for large datasets)
- Requirements: RocksDB C++ library, CGO enabled (the unified build no longer needs the -tags=rocksdb flag)
- Note: The data/ parent directory must exist before starting the node
Memory + WAL:
- Pros: Faster reads/writes, lower latency, no external dependencies
- Cons: Limited by available RAM for large datasets, slower recovery with large WAL
RocksDB:
- Pros: Handles TB-scale datasets, faster recovery, guaranteed persistence, efficient compaction
- Cons: Slightly higher latency due to disk I/O, requires RocksDB library and CGO
--id int Node ID (default: 1)
--cluster string Comma-separated list of cluster peer URLs (default: "http://127.0.0.1:9021")
--port int HTTP API port for key-value operations (default: 9121)
--join Join an existing cluster (default: false)
--storage string Storage engine: "memory" or "rocksdb" (default: "memory")
The unified binary supports runtime storage engine selection. Both memory and RocksDB modes are always available.
export CGO_LDFLAGS="-lrocksdb -lpthread -lstdc++ -ldl -lm -lzstd -llz4 -lz -lsnappy -lbz2"
export CGO_ENABLED=1
go test -v
macOS users: See ROCKSDB_BUILD_MACOS.md for SDK compatibility issues.
# Single node test
go test -v -run TestPutAndGetKeyValue
# Cluster test
go test -v -run TestProposeOnCommit
# Snapshot test
go test -v -run TestSnapshot
# RocksDB storage test
go test -v -run TestRocksDBStorage
A cluster of N nodes can tolerate up to (N-1)/2 failures, as derived in the sketch after this list:
- 1 node: 0 failures (no fault tolerance)
- 3 nodes: 1 failure
- 5 nodes: 2 failures
- 7 nodes: 3 failures
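These tolerances follow from raft's majority-quorum rule: committing a write requires floor(N/2)+1 reachable nodes, so a cluster survives losing N minus that quorum. A quick check in Go:

```go
package main

import "fmt"

// tolerableFailures returns how many members an n-node raft cluster can
// lose while a majority quorum of floor(n/2)+1 nodes remains reachable.
func tolerableFailures(n int) int {
	quorum := n/2 + 1
	return n - quorum // simplifies to (n-1)/2
}

func main() {
	for _, n := range []int{1, 3, 5, 7} {
		fmt.Printf("%d nodes: tolerates %d failure(s)\n", n, tolerableFailures(n))
	}
}
```

This also shows why even-sized clusters are rarely used: 4 nodes tolerate only 1 failure, the same as 3.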
Breaking Changes:
- Removed build tag separation between memory and RocksDB modes
- Single unified binary now supports both storage engines at runtime
- Simplified build process: no need for the -tags=rocksdb flag
New Features:
- Runtime storage engine selection via the --storage flag
- Unified main.go entry point for both memory and RocksDB modes
- Consistent command-line interface across all storage modes
Improvements:
- Simplified build process: single go build command
- No more separate binaries for different storage modes
- Easier maintenance with unified codebase
- Both storage engines always available in single binary
Migration Guide:
- Old: go build -tags=rocksdb → New: go build (with CGO and RocksDB libraries)
- Old: Binary selection at compile time → New: Runtime selection with the --storage flag
- Default storage mode remains memory for backward compatibility
Apache License 2.0 (inherited from etcd)
MetaStore follows the golang-standards/project-layout standard:
MetaStore/
├── cmd/metastore/ # Application entry point
├── internal/ # Private packages (layered by function)
│ ├── store/ # Storage layer (Memory & RocksDB implementations)
│ ├── raft/ # Raft consensus layer
│ ├── http/ # HTTP API layer
│ └── storage/ # Low-level storage engine
├── Makefile # Build automation
└── README.md
For detailed structure information, see PROJECT_LAYOUT.md.
- Quick Start Guide - 10-step tutorial to get started
- This README - Complete feature overview and API reference
- ⭐ Architecture Design - Comprehensive architecture overview
- Package structure and responsibilities
- Dual storage engine explanation (Memory vs RocksDB)
- Raft storage layer deep dive
- Data flow and component relationships
- Must-read for understanding the codebase!
- Implementation Details - Architecture and design decisions
- Project Summary - Complete project overview
- Files Checklist - Complete file inventory
- RocksDB Test Guide - How to run RocksDB tests in different environments
- RocksDB Test Report - Expected test results and performance benchmarks
- Git Commit Guide - How to commit changes to the project