A production-grade, in-memory distributed cache built in Node.js. Keys are partitioned across a cluster via consistent hashing, eviction follows an LRU policy, and writes are replicated asynchronously across configurable replica nodes. Each node exposes an HTTP API and optionally persists data to a Redis sidecar.
- Architecture
- Features
- Project Structure
- Getting Started
- HTTP API
- Configuration
- Environment Variables
- Tests
- Design Decisions
```
┌────────────────────────────────────────────────────────────────────┐
│                            CacheCluster                            │
│                                                                    │
│  ┌────────────────┐  ┌────────────────────┐  ┌──────────────────┐  │
│  │ ConsistentHash │  │ ReplicationManager │  │ RebalanceManager │  │
│  │ (ring, 150     │  │ (async N-replica   │  │ (key migration   │  │
│  │  vnodes/node)  │  │  writes)           │  │  on join/leave)  │  │
│  └───────┬────────┘  └─────────┬──────────┘  └────────┬─────────┘  │
│          │                     │                      │            │
│  ┌───────▼─────────────────────▼──────────────────────▼─────────┐  │
│  │                        CacheNode × N                         │  │
│  │           LRUCache (doubly-linked list + HashMap)            │  │
│  │             Optional: Redis persistence sidecar              │  │
│  └──────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────┘
                                  ▲
                                  │ HTTP (Express)
                                  │
                           ┌──────┴──────┐   ┌─────────────┐
                           │ CacheServer │   │ CacheClient │  ← hash-routes + replica fallback
                           └─────────────┘   └─────────────┘
```
Every container/pod runs one CacheServer instance representing a single node. Peers are discovered at startup via the PEER_NODES environment variable.
| Feature | Detail |
|---|---|
| LRU eviction | O(1) get/set via doubly-linked list + HashMap |
| TTL support | Per-key expiry tracked in-memory and in Redis |
| Consistent hashing | 150 virtual nodes per physical node, O(log n) lookup |
| Async replication | Configurable replica count, writes fan-out in parallel |
| Fault tolerance | On node failure, missing replicas are backfilled with TTL intact |
| Online rebalancing | Keys migrate automatically when nodes join or leave |
| Redis persistence | Optional per-node backing store; PTTL used to preserve remaining TTL |
| HTTP API | GET / POST / PUT / DELETE on /cache/:key |
| Observability | Hit/miss rates, eviction counts, latency histograms, cluster health events |
| Docker Compose | One-command 5-node cluster with Redis sidecars |
| Kubernetes | StatefulSet, headless service, ConfigMap |
```
.
├── config/
│   ├── default.js            # Development defaults
│   └── production.js         # Production overrides (loaded when NODE_ENV=production)
├── docker/
│   ├── Dockerfile
│   └── docker-compose.yml
├── k8s/
│   ├── namespace.yaml
│   ├── configmap.yaml
│   ├── statefulset.yaml
│   └── service.yaml
├── src/
│   ├── cache/
│   │   ├── LRUCache.js       # Core LRU cache (doubly-linked list + HashMap)
│   │   ├── ConsistentHash.js # Hash ring with virtual nodes
│   │   ├── CacheNode.js      # Single node (LRU + optional Redis)
│   │   └── CacheCluster.js   # Cluster orchestrator
│   ├── replication/
│   │   ├── ReplicationManager.js # Async fan-out writes
│   │   └── RebalanceManager.js   # Key migration on topology changes
│   ├── server/
│   │   ├── CacheServer.js    # Express app entry point
│   │   ├── routes/
│   │   │   ├── cache.routes.js   # /cache endpoints
│   │   │   └── health.routes.js  # /health endpoints
│   │   └── middleware/
│   │       ├── errorHandler.js
│   │       └── requestLogger.js
│   ├── client/
│   │   └── CacheClient.js    # Hash-routing client with replica fallback
│   ├── monitoring/
│   │   ├── MetricsCollector.js
│   │   └── ClusterMonitor.js
│   └── utils/
│       ├── hash.js
│       └── logger.js
└── tests/
    ├── unit/         # LRUCache, ConsistentHash, CacheNode, ReplicationManager
    ├── integration/  # CacheCluster, CacheClient end-to-end
    └── simulation/   # Zipf-distribution hit-rate simulation
```
- Node.js 18+
- npm 9+
- Docker + Docker Compose (for multi-node mode)
```bash
git clone https://github.com/ayomidelog/Dis.git
cd Dis
npm install
```

Run a single node:

```bash
NODE_ID=node-1 PORT=3001 npm start
```

Start a multi-node cluster with Docker Compose:

```bash
cd docker
docker compose up --build
```

This starts 5 cache nodes (ports 3001–3005), each with a dedicated Redis sidecar (ports 6381–6385). Each node automatically discovers its peers via `PEER_NODES`.
```bash
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/statefulset.yaml
kubectl apply -f k8s/service.yaml
```

All endpoints are served on each node at `http://<host>:<port>`.
Retrieve a cached value.

```
GET /cache/user:42
```

200 OK

```json
{ "key": "user:42", "value": { "name": "Alice" } }
```

404 Not Found

```json
{ "error": "Key not found" }
```

Store a value (key in request body).
```
POST /cache
Content-Type: application/json

{ "key": "user:42", "value": { "name": "Alice" }, "ttl": 60000 }
```

| Field | Type | Required | Description |
|---|---|---|---|
| `key` | string | Yes | Cache key |
| `value` | any | Yes | Value to store |
| `ttl` | integer | No | Time-to-live in milliseconds (must be > 0) |
201 Created

```json
{ "key": "user:42", "value": { "name": "Alice" }, "ttl": 60000 }
```

Update/upsert a value (key in URL path).
```
PUT /cache/user:42
Content-Type: application/json

{ "value": { "name": "Alice" }, "ttl": 60000 }
```

200 OK

```json
{ "key": "user:42", "value": { "name": "Alice" }, "ttl": 60000 }
```

Remove a key.
```
DELETE /cache/user:42
```

200 OK

```json
{ "key": "user:42", "deleted": true }
```

404 Not Found

```json
{ "error": "Key not found" }
```

Cluster health summary.

```
GET /health
```

200 OK
```json
{
  "status": "ok",
  "nodeCount": 5,
  "uptime": 142.3,
  "stats": {
    "nodeCount": 5,
    "replicaCount": 2,
    "totals": { "hits": 9120, "misses": 880, "evictions": 14, "hitRate": 0.912 }
  }
}
```

Configs live in `config/`. The active file is selected by `NODE_ENV` (falls back to `default`).
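For illustration, a `config/default.js` exporting the development defaults listed below might look like this. This is a sketch of the shape only; the actual file may organize or name things differently.

```javascript
// Hypothetical shape of config/default.js; key names follow the table below,
// but the real file may differ.
module.exports = {
  cluster: { nodeCount: 5, replicaCount: 2, virtualNodes: 150 },
  cache: { defaultCapacity: 1000, defaultTTL: 3600000 }, // TTL in milliseconds
  server: { basePort: 3001 },
  monitoring: { healthCheckInterval: 5000 }, // milliseconds
};
```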
| Key | Default | Production |
|---|---|---|
| `cluster.nodeCount` | 5 | 5 |
| `cluster.replicaCount` | 2 | 3 |
| `cluster.virtualNodes` | 150 | 200 |
| `cache.defaultCapacity` | 1 000 | 10 000 |
| `cache.defaultTTL` | 3 600 000 ms | 86 400 000 ms |
| `server.basePort` | 3001 | 3001 |
| `monitoring.healthCheckInterval` | 5 000 ms | 10 000 ms |
| Variable | Description | Example |
|---|---|---|
| `NODE_ENV` | Config profile (`development` / `production`) | `production` |
| `NODE_ID` | Unique ID for this node | `node-1` |
| `PORT` | HTTP listen port | `3001` |
| `REDIS_HOST` | Redis host for persistence | `redis-1` |
| `REDIS_PORT` | Redis port | `6379` |
| `PEER_NODES` | Comma-separated list of peer nodes in `id:host:port` format | `node-2:cache-node-2:3001,node-3:cache-node-3:3001` |
| `LOG_LEVEL` | Winston log level | `info` |
| `CACHE_SIMULATION_LOG` | Set to any value to print simulation stats | `1` |
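As an illustration, the `PEER_NODES` format above could be parsed into peer descriptors like this. The helper name is hypothetical; the repo's actual discovery code may differ.

```javascript
// Hypothetical helper: parse PEER_NODES ("id:host:port,...") into peer objects.
// A sketch only, not necessarily the repo's implementation.
function parsePeerNodes(raw) {
  if (!raw) return [];
  return raw.split(',').map((entry) => {
    const [id, host, port] = entry.trim().split(':');
    return { id, host, port: Number(port) };
  });
}

const peers = parsePeerNodes('node-2:cache-node-2:3001,node-3:cache-node-3:3001');
// peers[0] is { id: 'node-2', host: 'cache-node-2', port: 3001 }
```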
```bash
# All tests (51 total)
npm test

# Unit tests only
npm run test:unit

# Integration tests only
npm run test:integration

# Hit-rate simulation (Zipf distribution)
npm run test:simulation
```

No running Redis instance is required — Redis persistence is opt-in.
Simulation results (10 000 requests, Zipf skew 1.5, top-20% pre-populated):

```
hitRate = 96.50%   (threshold >= 90%)
```
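For context, a Zipf-distributed request stream like the one the simulation describes can be generated with inverse-CDF sampling over cumulative weights. This sketch is illustrative only and is not the repo's simulation code.

```javascript
// Illustrative Zipf key sampler: P(rank k) is proportional to 1 / k^s.
// Not the repo's code; shown to make the "skewed workload" idea concrete.
function makeZipfSampler(n, s) {
  const cumulative = [];
  let total = 0;
  for (let k = 1; k <= n; k++) {
    total += 1 / Math.pow(k, s);
    cumulative.push(total);
  }
  return function sample() {
    const r = Math.random() * total;
    // binary search for the first cumulative weight >= r
    let lo = 0, hi = n - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (cumulative[mid] < r) lo = mid + 1; else hi = mid;
    }
    return lo; // rank 0 is the hottest key
  };
}

const sample = makeZipfSampler(100, 1.5);
const counts = new Array(100).fill(0);
for (let i = 0; i < 10000; i++) counts[sample()]++;
// With skew 1.5, the lowest ranks dominate the stream, which is why
// a small cache can still achieve a high hit rate.
```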
**Why consistent hashing?**
Adding or removing a node remaps only about `keys / n` entries instead of all keys, making topology changes cheap.
**Why 150 virtual nodes per physical node?**
More virtual nodes give a more uniform key distribution across physical nodes. 150 is a common sweet spot for small-to-medium clusters (< 100 nodes).
**Why O(1) LRU with a doubly-linked list?**
Both `get` (move to front) and `set` (evict tail) run in O(1) time — no heap or sorted structure needed.
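A compact way to see the O(1) behavior: JavaScript's `Map` iterates in insertion order, so it can stand in for the recency list. This is an equivalent sketch of the idea only; the repo's `LRUCache.js` uses an explicit doubly-linked list plus HashMap.

```javascript
// Minimal O(1) LRU using Map's insertion-order guarantee (oldest entry first).
// An illustration of the eviction policy, not the repo's LRUCache.js.
class LRU {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // evict the least-recently-used (first) entry
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const lru = new LRU(2);
lru.set('a', 1);
lru.set('b', 2);
lru.get('a');    // touch 'a', so 'b' is now least recently used
lru.set('c', 3); // evicts 'b'
```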
**Why `PTTL` instead of storing the original TTL?**
Storing the original TTL and re-applying it on every Redis cache-miss would extend expiry on each read. `PTTL` returns the remaining time-to-live, so the LRU warm-up preserves the original expiry.
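The warm-up idea can be sketched as follows: on an in-memory miss, fetch the value and its *remaining* TTL (Redis `PTTL`, in milliseconds) and re-cache with that remainder. The function and the `lru.set(key, value, ttl)` signature are illustrative assumptions; `get`/`pttl` follow the ioredis-style client surface.

```javascript
// Sketch of LRU warm-up from Redis that preserves the original expiry.
// `redis` is assumed to expose get/pttl as in ioredis; names are illustrative.
async function warmUp(redis, lru, key) {
  const value = await redis.get(key);
  if (value === null) return undefined; // not in Redis either
  const ttl = await redis.pttl(key); // remaining ms; -1 = no expiry, -2 = gone
  if (ttl === -2) return undefined;
  lru.set(key, value, ttl > 0 ? ttl : undefined);
  return value;
}

// Stub standing in for a real Redis client, for demonstration:
const stub = {
  get: async (k) => (k === 'user:42' ? '{"name":"Alice"}' : null),
  pttl: async () => 42000, // e.g. 42 s left of an original 60 s TTL
};
```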
**Why duck-typing for CacheCluster detection in CacheClient?**
`instanceof` and `constructor.name` break with bundlers, subclasses, and multiple module copies. Checking for required method signatures is more robust.
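A check in that spirit might look like the following. The method list is an illustrative assumption, not CacheCluster's actual surface.

```javascript
// Duck-typing check: verify the required method surface instead of identity.
// The method names here are hypothetical examples.
function looksLikeCacheCluster(obj) {
  return !!obj &&
    ['get', 'set', 'delete', 'getStats'].every((m) => typeof obj[m] === 'function');
}

class FakeCluster {
  get() {} set() {} delete() {} getStats() {}
}

const ok = looksLikeCacheCluster(new FakeCluster()); // shape matches
const notOk = looksLikeCacheCluster({ get() {} });   // missing methods
```

Unlike `instanceof`, this accepts any object with the right shape, including instances from a duplicate copy of the module loaded by a bundler.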