Skip to content

ANSHAM1/Embedded_LSM_KV_Database

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Embedded LSM-Tree Key-Value Database (C++)

A single-node, embedded, crash-safe key–value database implemented in C++, inspired by real-world LSM-tree storage engines (LevelDB / RocksDB internals).

This project focuses on core database internals:

  • Write-Ahead Logging (WAL)
  • MemTable / Immutable MemTable
  • SSTable (Sorted String Table)
  • Crash recovery
  • Tombstone-based deletes

No networking. No SQL. No distributed complexity.
Just correct storage-engine fundamentals.


Architecture

The database is not a linear pipeline.
It is a layered, event-driven system with short-circuit read paths and out-of-band maintenance.

LSM-Tree Architecture


High-Level Design

Write Path (put, del)

  1. Append operation to WAL
  2. fsync() → durability commit point
  3. Update MemTable
  4. When MemTable reaches size threshold:
    • Freeze it as Immutable MemTable
    • Create a new empty MemTable
    • Flush Immutable MemTable to an SSTable

Read Path (get)

Reads short-circuit through layers:

  1. MemTable
  2. Immutable MemTable
  3. SSTables (newest → oldest)

Search stops immediately on:

  • Value found
  • Tombstone found

Core Components

Write-Ahead Log (WAL)

  • Append-only binary file
  • Stores PUT and DELETE operations
  • fsync() defines the commit boundary
  • Used only for crash recovery

MemTable

  • In-memory ordered map
  • Stores latest key states
  • Deletes stored as tombstones
  • Fast reads and writes

Immutable MemTable

  • Read-only snapshot of a full MemTable
  • Allows writes to continue without blocking
  • Flushed to disk as an SSTable

SSTable

  • Immutable, binary, sorted on-disk file
  • Stores final key states
  • Includes:
    • Data blocks (sorted key-value records)
    • Sparse index (key → offset)
    • Footer (index location)

Crash Recovery

On startup:

  1. Discover existing SSTables
  2. Open WAL
  3. Replay WAL records sequentially
  4. Rebuild MemTable
  5. Resume normal operation

Partial WAL records are safely ignored.


Supported Operations

put(key, value);
get(key);
del(key);

About

Embedded LSM-Tree Key–Value Database

Topics

Resources

License

Stars

Watchers

Forks

Contributors