PulseRoute - Adaptive gRPC Load Balancer

Client-side gRPC load balancer that routes requests using EWMA latency scores, per-endpoint circuit breakers, jittered exponential backoff, and optional request hedging to minimize tail latency and maximize availability.

Inspired by the go-zero microservices framework and modern distributed systems research, this project goes beyond simple round-robin or random selection to provide adaptive, performance-aware request distribution with advanced resilience patterns.

Core Concepts

1. P2C (Power of 2 Choices) Algorithm

Instead of checking all nodes (expensive) or picking one at random (potentially bad), P2C:

  • Randomly selects 2 nodes
  • Compares their current load
  • Routes the request to the less loaded node

Why it works: Research shows that choosing the better of 2 random options provides exponentially better load distribution than pure random selection, at minimal cost.

Complexity: O(1) per selection versus O(n) for scanning every node, while achieving a load distribution close to what a full scan would give.
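
As a rough illustration, a P2C pick fits in a few lines of Go. The Node type and Load() method below are placeholders for the balancer's real types in pkg/, not its actual API:

package balancer

import "math/rand"

// Node is a placeholder for the balancer's real node type.
type Node struct {
	Addr string
	load float64
}

// Load returns the node's current load score (placeholder).
func (n *Node) Load() float64 { return n.load }

// p2cPick chooses two nodes at random and routes to the less loaded one.
// (A real implementation would usually make sure the two picks differ.)
func p2cPick(nodes []*Node, rng *rand.Rand) *Node {
	if len(nodes) == 0 {
		return nil
	}
	if len(nodes) == 1 {
		return nodes[0]
	}
	a := nodes[rng.Intn(len(nodes))]
	b := nodes[rng.Intn(len(nodes))]
	if a.Load() <= b.Load() {
		return a
	}
	return b
}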

2. EWMA (Exponentially Weighted Moving Average)

EWMA calculates a weighted average that gives more importance to recent data:

new_lag = old_lag × β + responseTime × (1 - β)

Where:

  • β (beta) is a decay factor between 0 and 1
  • Higher β = more weight on history
  • Lower β = more weight on recent measurements

Why EWMA? It smooths out noise while remaining sensitive to real performance changes. A single slow request won't drastically change the average, but a sustained slowdown will be quickly detected.
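
For example, with β = 0.9, an old lag of 100 ms, and a new response time of 200 ms, the update gives new_lag = 100 × 0.9 + 200 × 0.1 = 110 ms: the slow response nudges the average up without letting a single outlier dominate it.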

3. Dynamic β Calculation

The algorithm dynamically adjusts β based on request frequency using Newton's Law of Cooling:

β = e^(-td/τ)

Where:

  • td = time since last request (seconds)
  • τ (tau) = time constant (600 seconds = 10 minutes)

Examples:

  • High traffic (td = 1s): β ≈ 0.998 → Trust history, smooth out noise
  • Medium gap (td = 600s): β ≈ 0.368 → Balance old and new data
  • Long gap (td = 3000s): β ≈ 0.007 → Old data is stale, trust new measurement

This makes the system self-tuning based on traffic patterns.
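
A minimal sketch of this update in Go, assuming the time gap is measured in seconds and the response time uses the same unit as lag; updateLag is an illustrative name, not necessarily the repo's exact function:

package balancer

import (
	"math"
	"time"
)

const tau = 600.0 // time constant τ in seconds (10 minutes)

// updateLag folds a new response time into the EWMA lag, using a β derived
// from how long it has been since the previous request on this node.
func updateLag(oldLag, responseTime float64, sinceLast time.Duration) float64 {
	td := sinceLast.Seconds()
	beta := math.Exp(-td / tau) // β = e^(-td/τ)
	return oldLag*beta + responseTime*(1-beta)
}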

4. Load Calculation Formula

load = lag × (inflight + 1)

Where:

  • lag = EWMA of response time (milliseconds)
  • inflight = number of currently active requests

Why this formula?

  • Captures both latency (how slow is the node?) and concurrency (how busy is it?)
  • The +1 ensures lag always matters, even for idle nodes
  • A slow but idle node gets a moderate score
  • A fast but busy node gets a moderate score
  • A slow AND busy node gets a high (bad) score
  • A fast AND idle node gets the best (low) score
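
The score itself is a one-liner; the sketch below uses illustrative names rather than the repo's exact fields:

package balancer

// loadScore combines EWMA latency and in-flight concurrency into one number.
// lag is the EWMA response time; inflight is the number of active requests.
func loadScore(lag float64, inflight int64) float64 {
	return lag * float64(inflight+1)
}

For example, a node with lag = 50 ms and 3 in-flight requests scores 50 × 4 = 200, while an idle node with lag = 80 ms scores 80 × 1 = 80, so P2C prefers the idle-but-slower node.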

Implementation Details

Thread Safety

All shared state uses atomic operations:

  • atomic.LoadInt64 / atomic.StoreInt64 for counters
  • atomic.LoadUint64 / atomic.StoreUint64 for lag (stored as float64 bits)
  • sync.RWMutex for protecting the node list

This allows safe concurrent access from multiple goroutines.
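
For example, because sync/atomic has no float64 operations, the lag can be kept as raw float64 bits inside a uint64. This is a common Go pattern, sketched here rather than copied from the repo:

package balancer

import (
	"math"
	"sync/atomic"
)

// loadLag reads an EWMA lag that is stored as float64 bits in a uint64.
func loadLag(bits *uint64) float64 {
	return math.Float64frombits(atomic.LoadUint64(bits))
}

// storeLag writes a new lag value atomically without taking a lock.
func storeLag(bits *uint64, lag float64) {
	atomic.StoreUint64(bits, math.Float64bits(lag))
}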

Node Lifecycle

When a request starts:

  1. P2C selects the best of 2 random nodes
  2. StartRequest() increments the inflight counter
  3. Updates the last timestamp
  4. Load increases immediately (more inflight requests)

When a request finishes:

  1. FinishRequest() decrements the inflight counter
  2. Calculates the actual response time
  3. Computes dynamic β based on time since last request
  4. Updates lag using EWMA formula
  5. Load adjusts based on new lag and reduced inflight count
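
Put together, one request through the balancer follows the shape below. The picker and endpoint interfaces are only what this sketch needs, and passing the measured duration to FinishRequest is an assumption made for clarity; the repo's real types and signatures may differ:

package balancer

import "time"

// picker and endpoint describe just enough of the balancer for this sketch.
type picker interface{ Pick() endpoint }

type endpoint interface {
	Addr() string
	StartRequest()                   // inflight++, timestamp updated
	FinishRequest(rtt time.Duration) // inflight--, dynamic β, EWMA lag update
}

// handleOnce routes a single call and feeds the measured latency back into the node.
func handleOnce(b picker, call func(addr string) error) error {
	node := b.Pick() // P2C: best of two random nodes
	node.StartRequest()
	start := time.Now()
	err := call(node.Addr())
	node.FinishRequest(time.Since(start))
	return err
}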

Current Output Example

[req 0] picked node: node2:8080
  load after start: 2
  load after finish: 4.182450324601795
---
[req 1] picked node: node1:8080
  load after start: 2
  load after finish: 1.6581643718150858
---
[req 2] picked node: node3:8080
  load after start: 2
  load after finish: 1.75791935244032
---
[req 3] picked node: node1:8080
  load after start: 3.3163287436301716
  load after finish: 7.825876886123887
---

Understanding the Output

Request 0:

  • Picked node2 (all nodes start with load = 1.0)
  • After start: load = 2 (lag=1.0, inflight=1, so 1.0 × (1+1) = 2)
  • After finish: load ≈ 4.18 (inflight is back to 0, so load equals the new EWMA lag; this simulated request was slow, pulling the lag up from its initial 1.0)

Request 1:

  • Picked node1 instead of node2 (P2C compared two nodes, node1's load of 1.0 < node2's 4.18)
  • Shows the algorithm avoiding the slower node

Request 3:

  • Picked node1 again (still had lower load than alternatives)
  • After start: load = 3.32 (lag ≈ 1.66 carried over from request 1, times (inflight 1 + 1))
  • After finish: load = 7.83 (this request was also slow)

The algorithm continuously adapts, routing requests away from slow or busy nodes.

Project Structure

PulseRoute/
├── pkg/
│   ├── load.go      # Node struct, EWMA calculation, load tracking
│   └── balancer.go  # Balancer struct, P2C selection algorithm
├── main.go          # Demo/test implementation
├── go.mod           # Go module definition
└── README.md        # This file

Running the Project

# Run the demo
go run main.go

# Build the binary
go build -o pulseroute

# Run the binary
./pulseroute

Vision & Roadmap

This project is evolving from a learning exercise into a production-grade gRPC load balancer. See ARCHITECTURE.md for detailed design and ROADMAP.md for the complete implementation plan.

Current Status: Phase 1 - Foundation ✅

Completed:

  • ✅ Basic Node structure with atomic operations
  • ✅ EWMA calculation with dynamic β (beta)
  • ✅ P2C (Power of 2 Choices) selection algorithm
  • ✅ Load calculation formula (lag × (inflight + 1))
  • ✅ Request lifecycle tracking (StartRequest/FinishRequest)
  • ✅ Basic simulation demonstrating adaptive behavior

Next: Circuit Breaker Implementation

The next step is implementing per-endpoint circuit breakers with state transitions (CLOSED → OPEN → HALF-OPEN). This will make the system resilient to failing endpoints.
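
One possible shape for those states, shown only as a sketch of the classic pattern and not as the planned implementation:

package breaker

// State models the circuit-breaker lifecycle described above.
type State int

const (
	Closed   State = iota // requests flow normally; failures are counted
	Open                  // failure threshold exceeded; calls fail fast
	HalfOpen              // after a cool-down, a few probe requests are let through
)

// next sketches the transitions: CLOSED → OPEN on repeated failures,
// OPEN → HALF-OPEN after a timeout, HALF-OPEN → CLOSED on a successful probe
// (or back to OPEN if the probe fails).
func next(s State, tooManyFailures, cooldownElapsed, probeSucceeded bool) State {
	switch s {
	case Closed:
		if tooManyFailures {
			return Open
		}
	case Open:
		if cooldownElapsed {
			return HalfOpen
		}
	case HalfOpen:
		if probeSucceeded {
			return Closed
		}
		return Open
	}
	return s
}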

Full Roadmap

See ROADMAP.md for the complete 12-week implementation plan covering:

  • Phase 2: Enhanced P2C with health awareness & retry logic
  • Phase 3: gRPC integration with interceptors & health checks
  • Phase 4: Request hedging & service discovery (Consul/k8s)
  • Phase 5: Observability (Prometheus, Jaeger, Grafana)
  • Phase 6: Production readiness (testing, docs, benchmarks)
  • Phase 7: Advanced features (sidecar agent, control plane UI)

Algorithm Advantages

  1. Self-balancing: No central coordinator needed
  2. Adaptive: Responds to changing node performance
  3. Efficient: O(1) selection cost
  4. Resilient: Automatically avoids slow or overloaded nodes
  5. Smooth: EWMA prevents overreaction to temporary spikes
  6. Self-tuning: Dynamic β adjusts to traffic patterns

References

Built as a learning project to understand adaptive load balancing algorithms and their real-world applications in distributed systems.
