TitanLB v2 is a specialized Layer 4 load balancer designed for the edge. It combines the speed of XDP (eBPF) with lightweight state tracking to handle fragmented traffic reliably, addressing a key weakness of stateless load balancers such as Katran.
This repository contains the Digital Twin, a rigorous Python simulation that mathematically models the kernel behavior before we write the final C code.
The project is split into two cores:
- `simulation_core/` (The Digital Twin): A high-fidelity Python engine that mimics Linux kernel logic (XDP hooks).
  - Goal: Validate algorithms (Maglev Hashing, LRU Eviction, Fragment Reassembly) in a controlled environment.
  - Parity: Uses deterministic "Simulation Time Units" (STU) to model CPU cost per instruction.
- `ebpf_blueprint/` (The Reference): Contains the C/Go code templates that will be deployed to Linux in Phase 2.
  - Note: The logic here maps 1:1 to the Python simulation.
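LRU eviction, one of the algorithms the Digital Twin validates, can be sketched as a bounded flow table. This is a minimal illustration only: the class name `LRUFlowTable` and its methods are hypothetical and are not taken from `lb_logic.py`.

```python
from collections import OrderedDict

class LRUFlowTable:
    """Bounded flow table that evicts the least recently used entry (illustrative)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last

    def lookup(self, flow_key):
        """Return the backend for a flow, or None; a hit refreshes its recency."""
        if flow_key not in self._entries:
            return None
        self._entries.move_to_end(flow_key)
        return self._entries[flow_key]

    def update(self, flow_key, backend):
        """Insert or refresh a flow, evicting the oldest entry when over capacity."""
        if flow_key in self._entries:
            self._entries.move_to_end(flow_key)
        self._entries[flow_key] = backend
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used flow

if __name__ == "__main__":
    table = LRUFlowTable(capacity=2)
    table.update("flow-a", 101)
    table.update("flow-b", 102)
    table.lookup("flow-a")          # flow-a becomes the most recent entry
    table.update("flow-c", 103)     # evicts flow-b
    print(table.lookup("flow-b"))   # None
```

A fixed capacity keeps memory bounded, which matters for any table that grows with the number of tracked flows.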
We do not just "mock" traffic; we simulate the mathematical constraints of the kernel.
Instead of relying on variable CPU clock speed, we assign fixed costs to operations:
- Hash Calculation: 10 STU
- Map Lookup (Read): 20 STU
- Map Update (Write/Lock): 60 STU
- Packet Encap (Resize): 40 STU
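As a rough sketch of how these fixed costs add up per packet, the snippet below sums them along two hypothetical packet paths. The operation mix for each path (`ESTABLISHED_FLOW`, `NEW_FLOW`) and the helper names are assumptions made for illustration; the simulator's actual paths may differ.

```python
# Per-operation costs in Simulation Time Units (STU), copied from the list above.
STU_COST = {
    "hash": 10,        # Hash Calculation
    "map_lookup": 20,  # Map Lookup (Read)
    "map_update": 60,  # Map Update (Write/Lock)
    "encap": 40,       # Packet Encap (Resize)
}

# Hypothetical packet paths; the exact operations per path are an assumption.
ESTABLISHED_FLOW = ["hash", "map_lookup", "encap"]
NEW_FLOW = ["hash", "map_lookup", "map_update", "encap"]

def path_cost(ops):
    """Sum the STU cost of a sequence of operations."""
    return sum(STU_COST[op] for op in ops)

def average_cost(new_flow_ratio):
    """Expected STU per packet when new_flow_ratio (0.0 to 1.0) of packets open a flow."""
    return (new_flow_ratio * path_cost(NEW_FLOW)
            + (1.0 - new_flow_ratio) * path_cost(ESTABLISHED_FLOW))

if __name__ == "__main__":
    print(path_cost(ESTABLISHED_FLOW))  # 70
    print(path_cost(NEW_FLOW))          # 130
    print(average_cost(0.25))           # 85.0
```

Because every cost is a constant, the result depends only on the sequence of operations and not on the host CPU, so runs are reproducible.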
We benchmark TitanLB against industry giants by simulating their architectural trade-offs:
| Load Balancer | Architecture | Strength | Weakness |
|---|---|---|---|
| TitanLB v2 | Stateful Optimization | Zero Drops on Fragments | Slightly higher write cost on new flows. |
| Katran (FB) | Stateless Maglev | Fastest (Lowest Cost) | Drops ~50% of fragmented traffic (Hash Mismatch). |
| Cilium | Pipeline/Tail-Calls | Feature Rich (Security) | 2x-3x Higher Cost due to deep map chains. |
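The fragment row in the table can be illustrated with a small sketch. Only the first IP fragment carries the L4 header, so a stateless balancer that hashes ports cannot compute the same key for later fragments. The model below is a simplification under stated assumptions: `stateless_pick` approximates a stateless design as described here, `FragmentTracker` is a hypothetical class (not the code in `lb_logic.py`), and SHA-256 stands in for whatever hash the simulator uses.

```python
import hashlib
from typing import NamedTuple, Optional

BACKENDS = [101, 102, 103]  # server IDs borrowed from the dashboard example

class Packet(NamedTuple):
    src: str
    dst: str
    proto: int
    ip_id: int                   # IPv4 Identification field, shared by all fragments
    more_fragments: bool         # IPv4 MF flag
    sport: Optional[int] = None  # None on non-first fragments (no L4 header)
    dport: Optional[int] = None

def pick_backend(key: tuple) -> int:
    """Map a key to a backend with a stable hash (stand-in for the real hash)."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]

def stateless_pick(pkt: Packet) -> int:
    """Hash the 5-tuple when ports exist, else fall back to a 3-tuple."""
    if pkt.sport is not None:
        return pick_backend((pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport))
    return pick_backend((pkt.src, pkt.dst, pkt.proto))  # may differ from fragment #1

class FragmentTracker:
    """Remember the first fragment's backend, keyed by (src, dst, proto, IP_ID)."""

    def __init__(self) -> None:
        self._by_ip_id = {}

    def pick(self, pkt: Packet) -> Optional[int]:
        frag_key = (pkt.src, pkt.dst, pkt.proto, pkt.ip_id)
        if pkt.sport is not None:
            backend = pick_backend((pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport))
            if pkt.more_fragments:  # first fragment of a fragmented datagram
                self._by_ip_id[frag_key] = backend
            return backend
        if pkt.more_fragments:      # middle fragment: keep the entry
            return self._by_ip_id.get(frag_key)
        return self._by_ip_id.pop(frag_key, None)  # last fragment: release the entry

if __name__ == "__main__":
    first = Packet("10.0.0.1", "10.0.0.9", 17, 4242, True, 5000, 53)
    later = Packet("10.0.0.1", "10.0.0.9", 17, 4242, False)
    tracker = FragmentTracker()
    print(tracker.pick(first) == tracker.pick(later))  # True
    print(stateless_pick(first), stateless_pick(later))  # may or may not match
```

This sketch returns `None` when a later fragment arrives before the first one; how out-of-order fragments should be handled is left open here.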
1. Install dependencies: `pip install flask mmh3`
2. Start the server: `python simulation_core/app.py`
3. Open the dashboard by clicking the link printed in the terminal: http://127.0.0.1:5000
- Go to the Dashboard.
- Drag the Fragmentation Slider to 100%.
- Observe:
  - TitanLB: Reliability stays at 100%. (It tracks `IP_ID` to link fragments.)
  - Katran: Reliability crashes to ~50%. (It hashes Frag #2 differently than Frag #1, sending it to the wrong server.)
- Look at the Avg Cost (STU) metric.
- Observe:
  - TitanLB: ~80-90 STU (Efficient).
  - Cilium: ~175+ STU (Heavy pipeline overhead).
- Uncheck the switch next to Server 103.
- Observe:
  - The Maglev Ring (Visual Bar) instantly rebalances.
  - Purple slots (103) disappear and are taken over by Blue/Indigo (101/102).
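The rebalancing seen when Server 103 is switched off can be sketched with the standard Maglev table-population procedure. This is an illustrative version, not the contents of `maglev.py`: the function names, the table size of 251, and the use of SHA-256 (instead of `mmh3`, to keep the example dependency-free) are all assumptions.

```python
import hashlib

def _hash(name: str, seed: str) -> int:
    """Stable 64-bit hash of a backend name (stand-in for the real hash function)."""
    digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def build_table(backends, size=251):
    """Populate a Maglev lookup table. size must be prime so that each
    backend's permutation visits every slot exactly once."""
    table = [None] * size
    if not backends:
        return table
    offsets = [_hash(b, "offset") % size for b in backends]
    skips = [_hash(b, "skip") % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)  # position of each backend in its permutation
    filled = 0
    while True:
        for i, backend in enumerate(backends):
            # Walk backend i's preference list until a free slot is found.
            slot = (offsets[i] + next_idx[i] * skips[i]) % size
            while table[slot] is not None:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
            table[slot] = backend
            next_idx[i] += 1
            filled += 1
            if filled == size:
                return table

def moved_fraction(old, new):
    """Share of slots whose owner changed between two tables of equal size."""
    return sum(a != b for a, b in zip(old, new)) / len(old)

if __name__ == "__main__":
    full = build_table(["101", "102", "103"])
    reduced = build_table(["101", "102"])   # Server 103 unchecked
    print("103" in reduced)                 # False
    print(moved_fraction(full, reduced))    # at least the share 103 used to own
```

Backends take turns claiming slots, so each one ends up with a near-equal share of the table, and removing a backend hands its slots to the remaining ones.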
TitanLB_v2/
├── simulation_core/
│ ├── app.py # Flask Server & API
│ ├── simulator.py # Traffic Engine & Threading
│ ├── lb_logic.py # The Core Algorithms (Maglev, LRU, Pipeline)
│ ├── maglev.py # Consistent Hashing Math
│ └── templates/
│   └── dashboard.html # The UI with Chart.js
├── ebpf_blueprint/ # C/Go code for the Linux phase (Phase 2)
│ ├── titanlb.c
│ └── agent.go
└── README.md # This file