Warning
This project is in alpha. APIs, configuration format, and wire protocols may change without notice. Do not use in production.
A high-performance, horizontally scalable rate-limiting service designed for Envoy sidecars. DRL eliminates the latency of external databases by using a Peer-to-Peer Hybrid Architecture:
- Local enforcement — fully-replicated in-memory Blocklist for O(1) rejection
- Shadow accounting — hashed, asynchronous global quota tracking
- Warm-bootstrap — state sync on startup prevents vulnerability windows during rolling updates
DRL's primary deployment model is as a second sidecar in the same pod as Envoy. The ShouldRateLimit
gRPC call never crosses a network boundary — it resolves over the loopback interface, eliminating DNS
resolution, TLS negotiation, and switch hops from the enforcement path entirely. Block decisions are
O(1) in-process blocklist lookups that return in microseconds.
Everything else — counter forwarding to the consistent-hash owner and block-event gossip across the cluster — happens asynchronously, after the response has already been returned to Envoy. A slow peer, a GC pause, or a temporary network partition between DRL instances never delays a rate-limit decision.
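The split between the synchronous check and the asynchronous bookkeeping can be sketched in a few lines of Go. This is an illustration of the pattern, not DRL's actual internals — the `limiter` type, field names, and channel capacity are hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// limiter sketches the hot-path split: a local blocklist consulted
// synchronously, and a buffered channel that decouples counter
// forwarding from the decision.
type limiter struct {
	blocklist sync.Map    // entity -> struct{} (hypothetical layout)
	counterCh chan string // drained asynchronously by a flusher goroutine
}

func (l *limiter) ShouldRateLimit(entity string) string {
	if _, blocked := l.blocklist.Load(entity); blocked {
		return "OVER_LIMIT" // O(1) in-process lookup, no I/O
	}
	select {
	case l.counterCh <- entity: // forwarded to the owner node later
	default: // buffer full — drop the increment rather than block Envoy
	}
	return "OK"
}

func main() {
	l := &limiter{counterCh: make(chan string, 1024)}
	l.blocklist.Store("abuser", struct{}{})
	fmt.Println(l.ShouldRateLimit("alice"))  // OK
	fmt.Println(l.ShouldRateLimit("abuser")) // OVER_LIMIT
}
```

The non-blocking `select` is the key design point: a slow consumer can never stall the decision, only cause a dropped increment.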
```mermaid
%%{init: {'flowchart': {'curve': 'step'}}}%%
flowchart LR
    subgraph pod-a ["Pod A"]
        WA["Workload"] <--> EA["Envoy\nsidecar"]
        EA -- "① localhost gRPC" --> DA["DRL\nsidecar"]
        DA -- "OK / OVER_LIMIT" --> EA
    end
    subgraph pod-b ["Pod B"]
        WB["Workload"] <--> EB["Envoy\nsidecar"]
        EB -- "① localhost gRPC" --> DB["DRL\nsidecar"]
        DB -- "OK / OVER_LIMIT" --> EB
    end
    subgraph pod-c ["Pod C"]
        WC["Workload"] <--> EC["Envoy\nsidecar"]
        EC -- "① localhost gRPC" --> DC["DRL\nsidecar"]
        DC -- "OK / OVER_LIMIT" --> EC
    end
    DA <-.->|"② gossip + block events"| DB
    DB <-.->|"② gossip + block events"| DC
    DA <-.->|"② gossip + block events"| DC
    DA -.->|"③ UDP counter batch"| DB
    DB -.->|"③ UDP counter batch"| DC
    DC -.->|"③ UDP counter batch"| DA
```
| Path | Purpose | Transport | Blocks Envoy? |
|---|---|---|---|
| ① | Envoy → DRL block check | localhost gRPC | yes — microseconds |
| ② | DRL → DRL block propagation | Memberlist gossip (UDP/TCP) | no — fire and forget |
| ③ | DRL → owner counter increment | UDP CounterBatch | no — fire and forget |
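Paths ② and ③ share a property: the sender never waits for an answer. A minimal sketch of path ③ — hashing an entity to pick its accounting owner and sending a datagram without awaiting a reply. The modulo-hash ring is a stand-in for DRL's consistent hashing, and the payload string is hypothetical, not DRL's wire format:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"net"
)

// owner maps an entity to the peer that accounts for it — a simple
// stand-in for a consistent-hash ring lookup.
func owner(entity string, peers []string) string {
	h := fnv.New32a()
	h.Write([]byte(entity))
	return peers[h.Sum32()%uint32(len(peers))]
}

func main() {
	peers := []string{"10.0.0.1:7946", "10.0.0.2:7946", "10.0.0.3:7946"}
	fmt.Println("owner of user:42 is", owner("user:42", peers))

	// Fire and forget: a UDP write returns as soon as the datagram is
	// handed to the kernel; nothing on the hot path waits for a reply.
	ln, _ := net.ListenPacket("udp", "127.0.0.1:0")
	defer ln.Close()
	conn, _ := net.Dial("udp", ln.LocalAddr().String())
	defer conn.Close()
	conn.Write([]byte("CounterBatch{user:42 -> +1}")) // hypothetical payload

	buf := make([]byte, 64)
	n, _, _ := ln.ReadFrom(buf)
	fmt.Println("received:", string(buf[:n]))
}
```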
A request that slips through once costs nothing. A rate limiter that adds latency to every request costs everything.
DRL is built on a deliberate trade-off: it tolerates a brief window where a handful of requests may pass through after a limit is triggered, in exchange for never needing an external store and keeping the enforcement path at sub-millisecond latency.
| Property | Traditional centralised approach (Redis / Memcached) | DRL |
|---|---|---|
| Enforcement latency | +1–5 ms per request (network round-trip to store) | ~0 ms (in-process blocklist lookup) |
| External dependency | Required — the store is a single point of failure | None — each node is self-contained |
| Sidecar deployment | Sidecar still calls out over the network | Sidecar calls localhost — same OS network namespace |
| Consistency window | Strong (synchronous write before OK) | Eventual — gossip convergence typically < 1 s |
| Failure mode | Store outage → rate limiting fails open or hard | Node isolation → local blocklist still enforces; remote counters lag |
The scenarios where a few requests sneak through are narrow and short-lived:
- Sub-second gossip convergence — when a block is decided on the owner node, Serf/Memberlist propagates the event cluster-wide in well under a second. The "leak window" is bounded by gossip latency, not by request rate.
- Repeat offenders are caught locally — once a block event reaches a node, every subsequent request from that entity is rejected at the in-process blocklist check before the response is even assembled.
- The alternative is worse — synchronous distributed consensus on every request serialises traffic through a bottleneck, adds tail latency to the hot path, and introduces a new failure domain. DRL eliminates all three problems.
- Sidecar topology amplifies the benefit — when deployed as a sidecar next to Envoy, the gRPC ShouldRateLimit call never leaves the host. There is no network hop, no TLS handshake overhead, and no DNS resolution. The blocklist lookup is effectively a function call.
For the overwhelming majority of rate-limiting use cases — API abuse prevention, bot mitigation, per-user quota enforcement — a sub-second enforcement window is operationally indistinguishable from strong consistency, while the latency and reliability properties are dramatically better.
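The size of that enforcement window is easy to bound: the worst-case number of requests a single offender leaks is its request rate multiplied by the gossip convergence time, paid once per block event. A back-of-envelope sketch — the rates here are illustrative, not measured DRL figures:

```go
package main

import "fmt"

// leaked bounds the requests one offender can slip through between
// the block decision on the owner node and cluster-wide convergence
// of the block event.
func leaked(requestsPerSec, convergenceSec float64) float64 {
	return requestsPerSec * convergenceSec
}

func main() {
	// Illustrative: a 1,000 rps abuser against 500 ms gossip convergence.
	fmt.Printf("worst-case leak: %.0f requests\n", leaked(1000, 0.5)) // 500 requests
}
```

Because the bound depends on gossip latency rather than total cluster traffic, it stays constant as the fleet scales.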
DRL ships a built-in web dashboard for real-time cluster monitoring — no external tools required.
The dashboard is served directly from the DRL binary at http://<node>:8082/drl/ui/. Access is
zero-touch: the browser performs an automatic ECDH key exchange on page load, establishing an encrypted
session without any manual login. Metrics from all cluster nodes are aggregated in real time.
See the UI documentation for access instructions, authentication details, and deployment examples.
| Topic | Description |
|---|---|
| Getting Started | Quick start and overview |
| Configuration | Complete KDL config reference and environment variables |
| Membership | Cluster formation, gossip, warm-bootstrap, block propagation |
| Cache | In-memory blocklist and accounting cache architecture |
| Accounting | Shadow accounting, entity hashing, batched flushing |
| gRPC API | Envoy ratelimit.v3 service implementation |
| Internal HTTP API | Management endpoints and digest authentication |
| Control Plane UI | Built-in web dashboard: access, authentication, and cross-node metrics aggregation |
| Metrics | Prometheus metrics reference, label definitions, and Grafana panel queries |
| Sizing Guide | Memory footprint, capacity tables, and deployment recommendations |
| Deployment Models | Docker Compose, ECS Fargate, Kubernetes sidecar/fleet, and Istio configurations |
Ready-to-use deployment configurations live under deployments/:
| Flavour | Path | Infrastructure | Description |
|---|---|---|---|
| Docker Compose | `deployments/docker-compose/` | Local machine | Full stack via `docker compose up` — fastest way to try DRL |
| ECS Sidecar | `deployments/ecs-sidecar/` | AWS ECS Fargate (Terraform) | echo-server + envoy + DRL as co-located Fargate task sidecars |
| K8s Sidecar | `deployments/k8s-sidecar/` | Any Kubernetes cluster (Kustomize) | DRL as a third container inside each application pod |
| K8s Fleet | `deployments/k8s-fleet/` | Any Kubernetes cluster (Kustomize) | DRL as a dedicated Deployment; Envoy connects via ClusterIP Service |
| Istio | `deployments/istio/` | Istio service mesh | Configuration guide: inject DRL into existing Istio-managed sidecars via EnvoyFilter / AuthorizationPolicy |
Reports are published to GitHub Pages after each successful run on main.
| Job | Goal | Pipeline | Report |
|---|---|---|---|
| Lint & Unit Tests | Runs `golangci-lint` and `go test -race ./...` with coverage on every push. | runs on main | — |
| Functional (1 replica) | Validates core rate-limiting correctness on a single node: requests below the threshold are allowed; requests above it are blocked at the configured ratio. | runs on main | report |
| Functional (5 replicas) | Same correctness check on a 5-node cluster. Verifies that block events propagate via gossip and are enforced cluster-wide, not just on the owner node. | runs on main | report |
| Functional (10 replicas) | Stress-tests gossip convergence and consistent-hash ownership at a larger scale. Confirms allowed/blocked ratios stay within acceptable thresholds as the ring grows. | runs on main | report |
| Handover | Verifies graceful state transfer during a rolling update: a leaving node evacuates its accounting counters to a peer, so rate-limit enforcement continues uninterrupted after scale-down. | runs on main | report |
| Performance | Measures sustained throughput and p95/p99 latency of the ShouldRateLimit gRPC path under a ramp-up traffic model. Establishes a baseline for regression detection. | runs on main | report |
MIT
