Client-side gRPC load balancer that routes requests using EWMA latency scores, per-endpoint circuit breakers, jittered exponential backoff, and optional request hedging to minimize tail latency and maximize availability.
An intelligent, production-grade load balancer inspired by the go-zero microservices framework and modern distributed systems research. This project goes beyond simple round-robin or random selection to provide adaptive, performance-aware request distribution with advanced resilience patterns.
Instead of checking all nodes (expensive) or picking one at random (potentially bad), P2C (Power of Two Choices):
- Randomly selects 2 nodes
- Compares their current load
- Routes the request to the less loaded node
Why it works: Research shows that choosing the better of 2 random options provides exponentially better load distribution than pure random selection, at minimal cost.
Complexity: O(1) vs O(n) for checking all nodes, while achieving ~90% of the optimal distribution.
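A minimal, self-contained sketch of the selection step (the `node` struct and field names here are illustrative only; the real implementation lives in `pkg/balancer.go`):

```go
package main

import (
	"fmt"
	"math/rand"
)

// node is a simplified stand-in for the real Node type in pkg/load.go.
type node struct {
	addr string
	load float64
}

// pickP2C draws two distinct nodes at random and returns the less loaded one.
func pickP2C(nodes []*node) *node {
	if len(nodes) == 0 {
		return nil
	}
	if len(nodes) == 1 {
		return nodes[0]
	}
	i := rand.Intn(len(nodes))
	j := rand.Intn(len(nodes) - 1)
	if j >= i { // shift so the second pick never equals the first
		j++
	}
	if nodes[i].load < nodes[j].load {
		return nodes[i]
	}
	return nodes[j]
}

func main() {
	nodes := []*node{
		{"node1:8080", 1.0},
		{"node2:8080", 4.2},
		{"node3:8080", 1.8},
	}
	fmt.Println("picked:", pickP2C(nodes).addr)
}
```

In the real balancer the compared value is not a fixed field but the live score produced by the EWMA and inflight tracking described below.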
EWMA (Exponentially Weighted Moving Average) calculates a weighted average that gives more importance to recent data:
new_lag = old_lag × β + responseTime × (1 - β)
Where:
- β (beta) is a decay factor between 0 and 1
- Higher β = more weight on history
- Lower β = more weight on recent measurements
Why EWMA? It smooths out noise while remaining sensitive to real performance changes. A single slow request won't drastically change the average, but a sustained slowdown will be quickly detected.
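As a hedged sketch (the function name is illustrative, not necessarily what `pkg/load.go` uses), the update itself is a one-liner:

```go
package ewma

// UpdateLag folds a new response time into the running average:
// newLag = oldLag*β + responseTime*(1-β).
// With β = 0.9, a single 500 ms spike on a 100 ms average only moves
// it to 140 ms, while a sustained slowdown keeps pushing it upward.
func UpdateLag(oldLag, responseTime, beta float64) float64 {
	return oldLag*beta + responseTime*(1-beta)
}
```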
The algorithm dynamically adjusts β based on request frequency using Newton's Law of Cooling:
β = e^(-td/τ)
Where:
- td = time since last request (seconds)
- τ (tau) = time constant (600 seconds = 10 minutes)
Examples:
- High traffic (td = 1s): β ≈ 0.998 → Trust history, smooth out noise
- Medium gap (td = 600s): β ≈ 0.368 → Balance old and new data
- Long gap (td = 3000s): β ≈ 0.007 → Old data is stale, trust new measurement
This makes the system self-tuning based on traffic patterns.
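A small sketch of the β computation (function and constant names are illustrative):

```go
package ewma

import (
	"math"
	"time"
)

// tau is the time constant from the formula above (10 minutes).
const tau = 600 * time.Second

// DynamicBeta computes β = e^(-td/τ) from the gap since the last request.
// A 1 s gap gives β ≈ 0.998 (trust history), a 600 s gap gives β ≈ 0.368,
// and a 3000 s gap gives β ≈ 0.007 (trust the new measurement).
func DynamicBeta(sinceLast time.Duration) float64 {
	return math.Exp(-float64(sinceLast) / float64(tau))
}
```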
load = lag × (inflight + 1)
Where:
- lag = EWMA of response time (milliseconds)
- inflight = number of currently active requests
Why this formula?
- Captures both latency (how slow is the node?) and concurrency (how busy is it?)
- The +1 ensures lag always matters, even for idle nodes
- A slow but idle node gets a moderate score
- A fast but busy node gets a moderate score
- A slow AND busy node gets a high (bad) score
- A fast AND idle node gets the best (low) score
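A minimal sketch of the score calculation (names are illustrative):

```go
package ewma

// Load combines the latency estimate with the concurrency level:
// load = lag × (inflight + 1). The +1 keeps lag relevant for idle nodes.
func Load(lagMillis float64, inflight int64) float64 {
	return lagMillis * float64(inflight+1)
}
```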
All shared state uses atomic operations:
- `atomic.LoadInt64` / `atomic.StoreInt64` for counters
- `atomic.LoadUint64` / `atomic.StoreUint64` for lag (stored as float64 bits)
- `sync.RWMutex` for protecting the node list
This allows safe concurrent access from multiple goroutines.
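A sketch of how that state can be laid out (field and type names here are illustrative, not necessarily those in `pkg/load.go` and `pkg/balancer.go`):

```go
package balancer

import (
	"math"
	"sync"
	"sync/atomic"
)

// nodeState shows one way to hold the shared per-node counters safely.
type nodeState struct {
	inflight int64  // active requests, updated with atomic.AddInt64
	last     int64  // unix nanos of the last request, atomic.StoreInt64
	lagBits  uint64 // EWMA lag as float64 bits, atomic.StoreUint64
}

// Lag reads the float64 lag back out of its uint64 bit representation.
func (n *nodeState) Lag() float64 {
	return math.Float64frombits(atomic.LoadUint64(&n.lagBits))
}

func (n *nodeState) setLag(v float64) {
	atomic.StoreUint64(&n.lagBits, math.Float64bits(v))
}

// Balancer guards its node slice with an RWMutex: selection takes the
// read lock, membership changes take the write lock.
type Balancer struct {
	mu    sync.RWMutex
	nodes []*nodeState
}
```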
When a request starts:
- P2C selects the best of 2 random nodes
- `StartRequest()` increments the `inflight` counter
- Updates the `last` timestamp
- Load increases immediately (more inflight requests)
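A hedged sketch of that path, continuing the illustrative `nodeState` above (adds a `time` import):

```go
// StartRequest records that a request is now in flight on this node.
// Both updates are atomic, so concurrent goroutines can call this safely.
func (n *nodeState) StartRequest() {
	atomic.AddInt64(&n.inflight, 1)
	atomic.StoreInt64(&n.last, time.Now().UnixNano())
}
```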
When a request finishes:
- `FinishRequest()` decrements the `inflight` counter
- Calculates the actual response time
- Computes dynamic β based on time since last request
- Updates `lag` using the EWMA formula
- Load adjusts based on new lag and reduced inflight count
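A sketch of that path with the same illustrative fields (the real code in `pkg/load.go` may order these updates differently or use a CAS loop to make the lag read-modify-write fully atomic):

```go
// FinishRequest folds one completed request back into the node's stats.
func (n *nodeState) FinishRequest(start time.Time) {
	atomic.AddInt64(&n.inflight, -1)

	now := time.Now()
	responseMs := float64(now.Sub(start)) / float64(time.Millisecond)

	// Dynamic β from the gap since the previous request (Newton's cooling).
	sinceLast := now.Sub(time.Unix(0, atomic.LoadInt64(&n.last)))
	beta := math.Exp(-float64(sinceLast) / float64(600*time.Second))

	// EWMA update: newLag = oldLag*β + responseTime*(1-β).
	// Note: the real code may guard this read-modify-write with a CAS loop.
	oldLag := math.Float64frombits(atomic.LoadUint64(&n.lagBits))
	newLag := oldLag*beta + responseMs*(1-beta)
	atomic.StoreUint64(&n.lagBits, math.Float64bits(newLag))
}
```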
```
[req 0] picked node: node2:8080
load after start: 2
load after finish: 4.182450324601795
---
[req 1] picked node: node1:8080
load after start: 2
load after finish: 1.6581643718150858
---
[req 2] picked node: node3:8080
load after start: 2
load after finish: 1.75791935244032
---
[req 3] picked node: node1:8080
load after start: 3.3163287436301716
load after finish: 7.825876886123887
---
```
Request 0:
- Picked node2 (all nodes start with load = 1.0)
- After start: load = 2 (lag=1.0, inflight=1, so 1.0 × (1+1) = 2)
- After finish: load = 4.18 (request took ~4.18 seconds, EWMA updated)
Request 1:
- Picked node1 instead of node2 (P2C compared two nodes, node1's load of 1.0 < node2's 4.18)
- Shows the algorithm avoiding the slower node
Request 3:
- Picked node1 again (still had lower load than alternatives)
- After start: load = 3.32 (lag ≈ 1.66 from its previous request × (1 inflight + 1))
- After finish: load = 7.83 (this request was also slow)
The algorithm continuously adapts, routing requests away from slow or busy nodes.
```
PulseRoute/
├── pkg/
│   ├── load.go       # Node struct, EWMA calculation, load tracking
│   └── balancer.go   # Balancer struct, P2C selection algorithm
├── main.go           # Demo/test implementation
├── go.mod            # Go module definition
└── README.md         # This file
```
```bash
# Run the demo
go run main.go

# Build the binary
go build -o pulseroute

# Run the binary
./pulseroute
```

This project is evolving from a learning exercise into a production-grade gRPC load balancer. See ARCHITECTURE.md for detailed design and ROADMAP.md for the complete implementation plan.
Completed:
- ✅ Basic Node structure with atomic operations
- ✅ EWMA calculation with dynamic β (beta)
- ✅ P2C (Power of 2 Choices) selection algorithm
- ✅ Load calculation formula (lag × inflight)
- ✅ Request lifecycle tracking (StartRequest/FinishRequest)
- ✅ Basic simulation demonstrating adaptive behavior
The next step is implementing per-endpoint circuit breakers with state transitions (CLOSED → OPEN → HALF-OPEN). This will make the system resilient to failing endpoints.
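One possible shape for that state machine, purely as a concept sketch (not the planned implementation; thresholds, names, and transition rules are placeholders):

```go
package breaker

import (
	"sync"
	"time"
)

type state int

const (
	closed   state = iota // requests flow normally
	open                  // endpoint is failing; requests are rejected
	halfOpen              // probing: let a trial request through
)

// Breaker is a minimal per-endpoint circuit breaker sketch.
type Breaker struct {
	mu          sync.Mutex
	state       state
	failures    int
	openedAt    time.Time
	maxFailures int           // e.g. 5 consecutive failures trip the breaker
	cooldown    time.Duration // how long to stay OPEN before probing
}

// Allow reports whether a request may be sent to this endpoint.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.state == open && time.Since(b.openedAt) >= b.cooldown {
		b.state = halfOpen // cooldown elapsed: allow one probe request
	}
	return b.state != open
}

// Report records a request outcome and drives the state transitions.
func (b *Breaker) Report(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.failures = 0
		b.state = closed // success closes the breaker (also from HALF-OPEN)
		return
	}
	b.failures++
	if b.state == halfOpen || b.failures >= b.maxFailures {
		b.state = open
		b.openedAt = time.Now()
	}
}
```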
See ROADMAP.md for the complete 12-week implementation plan covering:
- Phase 2: Enhanced P2C with health awareness & retry logic
- Phase 3: gRPC integration with interceptors & health checks
- Phase 4: Request hedging & service discovery (Consul/k8s)
- Phase 5: Observability (Prometheus, Jaeger, Grafana)
- Phase 6: Production readiness (testing, docs, benchmarks)
- Phase 7: Advanced features (sidecar agent, control plane UI)
- Self-balancing: No central coordinator needed
- Adaptive: Responds to changing node performance
- Efficient: O(1) selection cost
- Resilient: Automatically avoids slow or overloaded nodes
- Smooth: EWMA prevents overreaction to temporary spikes
- Self-tuning: Dynamic β adjusts to traffic patterns
- go-zero Adaptive Load Balancing - Original implementation
- The Power of Two Choices in Randomized Load Balancing - Research paper
- EWMA on Wikipedia
Built as a learning project to understand adaptive load balancing algorithms and their real-world applications in distributed systems.