Welcome to Redis from Scratch, a comprehensive implementation of a Redis-like in-memory database server built from the ground up. This project demonstrates how Redis works internally by implementing core features including single-threaded event loop architecture, key-value storage with TTL, persistence mechanisms (AOF and RDB), and a publish/subscribe messaging system.
- Single-threaded event loop with non-blocking I/O
- Connection multiplexing using the `select()` system call
- Redis Serialization Protocol (RESP) for client communication
- Command pipelining support
- In-memory key-value store with Python dictionaries
- Time-To-Live (TTL) support for automatic key expiration
- Hybrid expiration strategy: lazy + active cleanup
- Memory tracking with real-time usage monitoring
- Append-Only File (AOF) for command-level durability
- RDB snapshots for point-in-time backups
- Configurable sync policies (always, everysec, no)
- Background rewriting for AOF compaction
- Publish/Subscribe system with channel-based messaging
- Fire-and-forget delivery model
- Pattern subscriptions support
- Channel management commands
- Basic: SET, GET, DEL, EXISTS, KEYS, FLUSHALL, TYPE
- TTL: EXPIRE, EXPIREAT, TTL, PTTL, PERSIST
- Persistence: SAVE, BGSAVE, LASTSAVE, BGREWRITEAOF
- Pub/Sub: SUBSCRIBE, UNSUBSCRIBE, PUBLISH, PUBSUB
- Utility: PING, ECHO, INFO, QUIT
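The RESP protocol that carries these commands is simple to sketch. The helpers below are illustrative, not the project's actual `response.py` API: they encode a client command as a RESP array of bulk strings and parse the single-line reply types.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode("utf-8")
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)

def parse_reply(data: bytes):
    """Parse a single-line RESP reply (+simple, -error, :integer)."""
    line, _, _ = data.partition(b"\r\n")
    marker, payload = line[:1], line[1:].decode("utf-8")
    if marker == b"+":
        return payload               # simple string, e.g. "OK"
    if marker == b":":
        return int(payload)          # integer reply
    if marker == b"-":
        raise RuntimeError(payload)  # error reply
    raise NotImplementedError("bulk strings/arrays omitted in this sketch")

print(encode_command("SET", "mykey", "Hello Redis"))
print(parse_reply(b"+OK\r\n"))
```

Because every frame is length-prefixed, a client can concatenate several encoded commands into one write, which is exactly what makes pipelining cheap.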
```
redis_from_scratch/
├── main.py                  # Entry point
├── redis_server/
│   ├── __init__.py
│   ├── server.py            # Main event loop and networking
│   ├── command_handler.py   # Command routing and execution
│   ├── storage.py           # In-memory data store with TTL
│   ├── pubsub.py            # Publish/Subscribe implementation
│   ├── response.py          # RESP protocol formatter
│   ├── commands/            # Command implementations
│   │   ├── basic.py         # SET, GET, DEL, etc.
│   │   ├── expiration.py    # TTL commands
│   │   ├── persistence.py   # SAVE, BGSAVE, etc.
│   │   ├── pubsub.py        # Pub/Sub commands
│   │   └── ...
│   └── persistence/         # Persistence layer
│       ├── manager.py       # Persistence orchestration
│       ├── aof.py           # AOF implementation
│       ├── rdb.py           # RDB implementation
│       ├── config.py        # Configuration management
│       └── recovery.py      # Data recovery
└── data/                    # Persistence files (created at runtime)
```
- Python 3.8+
- `telnet` (for testing)
- Basic understanding of Redis commands
- Clone the repository:
  ```bash
  git clone https://github.com/poridhioss/Redis_from_scratch.git
  cd Redis_from_scratch
  ```

- Install dependencies (if any):

  ```bash
  pip install -r requirements.txt
  ```

Start the Redis server:

```bash
python main.py
```

You should see output similar to:

```
Single-threaded Redis server listening on localhost:6379...
Data recovered from persistence files (if any).
Ready to accept connections.
```
- Open a new terminal and connect using telnet:
  ```bash
  telnet localhost 6379
  ```

- You should receive a welcome message:

  ```
  +OK Server ready
  ```
- Test basic commands:

  ```
  PING
  +OK
  SET mykey "Hello Redis"
  +OK
  GET mykey
  $11
  Hello Redis
  ```
- Set a key with expiration:

  ```
  SET session:user123 "data" EX 10
  +OK
  TTL session:user123
  :8          # Remaining seconds
  # Wait 10 seconds...
  GET session:user123
  $-1         # Key expired (nil reply)
  ```
- Use EXPIREAT with a specific timestamp:

  ```
  SET temp "value"
  +OK
  EXPIREAT temp 1893456000   # Absolute Unix timestamp
  :1
  TTL temp
  :31536000                  # Seconds remaining until that timestamp
  ```
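Since EXPIREAT takes an absolute Unix timestamp rather than a relative duration, clients usually compute the deadline from the current clock. A quick sketch:

```python
import time

# EXPIREAT wants an absolute Unix timestamp, so add the desired
# lifetime (in seconds) to the current time.
lifetime = 3600
deadline = int(time.time()) + lifetime
print(f"EXPIREAT temp {deadline}")

# TTL then reports the seconds remaining until that deadline.
remaining = deadline - int(time.time())
print(remaining)
```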
- Enable AOF (if not already enabled in config):

  ```
  CONFIG SET appendonly yes
  +OK
  ```

- Perform write operations:

  ```
  SET persistent:1 "data1"
  SET persistent:2 "data2"
  DEL persistent:1
  ```

- Check the AOF file:

  ```bash
  cat data/appendonly.aof
  ```

- Restart the server and verify data recovery:

  ```bash
  # Stop server with Ctrl+C
  python main.py
  ```

  ```
  # Reconnect with telnet
  GET persistent:2
  $5
  data2
  ```

- Create a manual snapshot:

  ```
  SAVE
  +OK                          # Synchronous save
  BGSAVE
  +Background saving started   # Asynchronous save
  ```

- Check the RDB file:

  ```bash
  ls -la data/dump.rdb
  ```

- Test automatic snapshots (configured in `persistence/config.py`):

  ```
  # Modify many keys quickly to trigger automatic save
  ```

Terminal 1 (Subscriber):
```
telnet localhost 6379
SUBSCRIBE news alerts
```

Terminal 2 (Subscriber 2):

```
telnet localhost 6379
SUBSCRIBE news
```

Terminal 3 (Publisher):

```
telnet localhost 6379
PUBLISH news "Breaking news!"
:2          # Delivered to 2 subscribers
PUBSUB CHANNELS
*2
$4
news
$6
alerts
```

Open multiple terminal windows and connect simultaneously:
```
# Terminal 1
telnet localhost 6379
SET counter 0
INCR counter

# Terminal 2
telnet localhost 6379
GET counter
$1
1

# Terminal 3
telnet localhost 6379
INCR counter
GET counter
$1
2
```

Send multiple commands at once:

```bash
echo -e "SET a 1\nSET b 2\nGET a\nGET b" | telnet localhost 6379
```

Configuration is managed in `redis_server/persistence/config.py`. Key settings:
```python
DEFAULT_CONFIG = {
    'appendonly': True,             # Enable AOF
    'appendfsync': 'everysec',      # Sync policy: always/everysec/no
    'rdb_enabled': True,            # Enable RDB snapshots
    'rdb_filename': 'dump.rdb',     # RDB file name
    'rdb_save_conditions': [        # Automatic save triggers
        (900, 1),                   # 1 change in 15 minutes
        (300, 10),                  # 10 changes in 5 minutes
        (60, 10000),                # 10000 changes in 1 minute
    ],
    'aof_rewrite_percentage': 100,  # Rewrite when 100% bigger
    'aof_rewrite_min_size': 64,     # Minimum size in MB
}
```

The core of the system is a single-threaded event loop using Python's `select()` for I/O multiplexing:
```python
def _event_loop(self):
    while self.running:
        # Monitor sockets for activity
        readable, _, _ = select.select(
            [self.server_socket] + list(self.clients.keys()),
            [], [], 0.05  # 50ms timeout
        )
        # Handle new connections
        if self.server_socket in readable:
            self._accept_new_connection()
        # Handle client data
        for client_socket in readable:
            if client_socket != self.server_socket:
                self._handle_client_data(client_socket)
        # Background tasks (every 100ms)
        if time.time() - self.last_background_run >= 0.1:
            self._run_background_tasks()
```

- Lazy Expiration: Keys are checked for expiration on every access
- Active Expiration: Background thread samples and removes expired keys
```python
def _is_key_valid(self, key):
    """Check if key exists and hasn't expired"""
    if key not in self._data:
        return False
    value, _, expiry_time = self._data[key]
    if expiry_time and expiry_time <= time.time():
        # Key expired - remove it
        self._remove_expired_key(key, value)
        return False
    return True
```

AOF logs every write command to disk:
```python
def log_command(self, command, *args):
    """Log write command to AOF file"""
    if command.upper() not in WRITE_COMMANDS:
        return
    # Format: timestamp COMMAND args
    timestamp = int(time.time())
    line = f"{timestamp} {command.upper()} {' '.join(args)}\n"
    # Write to buffer
    self.buffer.write(line.encode('utf-8'))
    # Sync based on policy
    if self.sync_policy == 'always':
        self._fsync()
```

RDB creates binary snapshots of the entire dataset:
```python
def save_rdb(self, data_store, filename):
    """Save current dataset to RDB file"""
    temp_filename = filename + '.tmp'
    with open(temp_filename, 'wb') as f:
        # Write magic header
        f.write(b'REDIS0001')
        # Serialize all keys with their data
        for key, (value, data_type, expiry) in data_store.items():
            self._write_key(f, key, value, data_type, expiry)
        # Write checksum
        if self.config['rdb_checksum']:
            f.write(self._calculate_checksum())
    # Atomic rename after the file is flushed and closed
    os.rename(temp_filename, filename)
```

Channel-based messaging with fire-and-forget delivery:
```python
class PubSubManager:
    def __init__(self):
        self.channels = defaultdict(set)              # channel -> set of clients
        self.client_subscriptions = defaultdict(set)  # client -> set of channels

    def publish(self, channel, message):
        """Publish message to all subscribers of a channel"""
        subscribers = self.channels.get(channel, set())
        delivered = 0
        # Iterate over a copy: cleaning up a dead client mutates the set
        for client in list(subscribers):
            try:
                # Build RESP response: ["message", channel, message]
                response = ResponseBuilder.array([
                    ResponseBuilder.bulk_string("message"),
                    ResponseBuilder.bulk_string(channel),
                    ResponseBuilder.bulk_string(message)
                ])
                client.send(response)
                delivered += 1
            except OSError:
                # Client disconnected
                self._cleanup_client(client)
        return delivered
```

| Operation | Time Complexity | Notes |
|---|---|---|
| GET/SET | O(1) | Hash table lookup |
| KEYS | O(n) | Scans all keys |
| EXPIRE | O(1) | Updates expiration metadata |
| PUBLISH | O(n) | n = number of subscribers |
| AOF append | O(1) | Appends to file buffer |
| RDB save | O(n) | n = number of keys |
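The O(n) cost of KEYS comes from scanning every entry in the store. A minimal sketch of such a scan (using `fnmatch` for glob patterns is an assumption; the project's actual matcher may differ):

```python
from fnmatch import fnmatchcase

def keys_command(store: dict, pattern: str = "*") -> list:
    """KEYS walks the whole keyspace - O(n) in the number of keys."""
    return [key for key in store if fnmatchcase(key, pattern)]

store = {"user:1": "alice", "user:2": "bob", "session:9": "token"}
print(keys_command(store, "user:*"))   # scans all three keys to find two
print(keys_command(store))             # "*" matches everything
```

This is why production Redis discourages KEYS on large datasets and offers the incremental SCAN command instead.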
```
# Set cache with 5-second TTL
SET cache:user:123 "{'name': 'John', 'age': 30}" EX 5
# Use cache
GET cache:user:123
# Cache automatically expires after 5 seconds
```

```
# Create session
SET session:abc123 "{'user_id': 456, 'logged_in': true}" EX 3600
# Check session
TTL session:abc123
:3598
# Refresh session
EXPIRE session:abc123 3600
```

```
# Client 1 subscribes
SUBSCRIBE notifications:user:789
# Client 2 publishes
PUBLISH notifications:user:789 "You have a new message!"
```

```
# Enable both persistence mechanisms
CONFIG SET appendonly yes
CONFIG SET save "900 1 300 10 60 10000"
# Perform critical operations
SET order:1001 "confirmed"
SET payment:1001 "processed"
# Force save
SAVE
# Verify persistence
INFO persistence
```
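Recovery after a restart boils down to replaying the AOF log in order. The function below is a simplified sketch of what `recovery.py` might do, assuming the `timestamp COMMAND args` line format shown in the AOF section:

```python
def replay_aof(lines):
    """Rebuild the keyspace by replaying logged write commands."""
    store = {}
    for line in lines:
        _, command, *args = line.split()
        if command == "SET":
            key, value = args[0], " ".join(args[1:])
            store[key] = value
        elif command == "DEL":
            for key in args:
                store.pop(key, None)
    return store

log = [
    "1700000000 SET order:1001 confirmed",
    "1700000001 SET payment:1001 processed",
    "1700000002 DEL order:1001",
]
print(replay_aof(log))  # only payment:1001 survives the DEL
```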
- "Connection refused" error:
  - Ensure the server is running: `python main.py`
  - Check that port 6379 is not blocked

- Commands not recognized:
  - Check command spelling (Redis commands are case-insensitive)
  - Ensure you're using supported commands

- Memory usage high:
  - Check for keys without TTL
  - Use `INFO memory` to analyze usage
  - Consider implementing eviction policies

- Persistence files not created:
  - Verify write permissions in the `data/` directory
  - Check configuration settings
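For the high-memory case, a rough way to see where the bytes go is to total up shallow object sizes, as an `INFO memory` implementation might. This is a sketch with an illustrative function name; `sys.getsizeof` ignores nested objects, so treat the result as a lower bound:

```python
import sys

def estimate_store_bytes(store: dict) -> int:
    """Shallow estimate: dict overhead plus each key and value."""
    total = sys.getsizeof(store)
    for key, value in store.items():
        total += sys.getsizeof(key) + sys.getsizeof(value)
    return total

store = {"user:1": "x" * 1024, "user:2": "y" * 1024}
print(estimate_store_bytes(store))  # a few KB for this toy store
```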
Enable verbose logging by modifying `server.py`:

```python
class RedisServer:
    def __init__(self, debug=False):
        self.debug = debug
        # ...

    def _log(self, message):
        if self.debug:
            print(f"[DEBUG] {message}")
```

- I/O Multiplexing: How `select()` enables single-threaded concurrency
- Copy-on-Write: RDB snapshot mechanism
- Redis Protocol (RESP): Client-server communication format
- Memory Management: Python's dict implementation and memory overhead
- Persistence Trade-offs: AOF vs RDB strengths and weaknesses
- Add new data types: Lists, Sets, Sorted Sets, Hashes
- Implement transactions: MULTI/EXEC commands
- Add replication: Master-slave replication
- Implement clustering: Sharding across multiple instances
- Add Lua scripting: Embedded script execution
This Redis-from-scratch implementation demonstrates the core principles of in-memory databases:
- Simplicity: Single-threaded design eliminates concurrency issues
- Performance: Memory-based operations, most with O(1) complexity
- Durability: Multiple persistence strategies for different use cases
- Extensibility: Modular architecture for adding new features
By understanding and experimenting with this implementation, you'll gain deep insights into how production systems like Redis are built and optimized.