…anagement servers
- Signal server: Redis distributed registry and pub/sub for cross-instance peer routing and message forwarding
- Management server: Redis pub/sub for account updates, distributed locks (SET NX EX), ephemeral peer deadline management
- Traefik load balancer with health checks for automatic failover
- 14 integration tests validating HA behavior (7 signal + 7 management)
- Full WireGuard encrypted login+sync in failover tests
- Comprehensive documentation: README.md, docs/TESTING.md, docs/BUILD_DEPLOY.md, docs/REBASE_GUIDE.md
- All HA parameters externally configurable via env vars and YAML
- Docker Compose test environment with 2x signal + 2x management + Traefik + Redis
- Original upstream README preserved as original_readme.md
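As a rough illustration of what "externally configurable via env vars" could look like, here is a minimal sketch. The variable names (`HA_ENABLED`, `HA_REDIS_ADDR`, `HA_INSTANCE_ID`, `HA_LOCK_TTL_SECONDS`), the `HAConfig` type, and the defaults are all hypothetical and are not taken from this PR.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HAConfig:
    """Hypothetical HA settings; field names are illustrative only."""
    enabled: bool
    redis_addr: str
    instance_id: str
    lock_ttl_seconds: int


def load_ha_config(env=os.environ):
    """Read HA settings from a mapping of environment variables.

    Every variable name and default below is an assumption made for
    this sketch, not a setting documented by the PR.
    """
    return HAConfig(
        enabled=env.get("HA_ENABLED", "false").lower() in ("1", "true", "yes"),
        redis_addr=env.get("HA_REDIS_ADDR", "localhost:6379"),
        instance_id=env.get("HA_INSTANCE_ID", "instance1"),
        lock_ttl_seconds=int(env.get("HA_LOCK_TTL_SECONDS", "30")),
    )
```

Passing the environment as a plain mapping keeps the loader easy to exercise in tests without mutating the real process environment.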
VPN Dev seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Caution: Review failed. The pull request is closed.
📝 Walkthrough
This PR introduces a complete high-availability (HA) infrastructure for NetBird, enabling horizontal scaling of Signal and Management servers through Redis-backed distributed coordination. It adds configuration, deployment tooling, comprehensive integration tests, and documentation to support active-active clustering with automatic failover, cross-instance routing, and distributed locks.
Sequence Diagram(s)
sequenceDiagram
participant PeerA as Peer A<br/>(Signal Instance 1)
participant Sig1 as Signal 1<br/>HA Server
participant Redis as Redis<br/>Registry
participant Sig2 as Signal 2<br/>HA Server
participant PeerB as Peer B<br/>(Signal Instance 2)
PeerA->>Sig1: Connect Stream (register)
Sig1->>Redis: HSet peer_registry: peerA → instance1
Sig1->>Redis: Expire peer_registry (TTL)
PeerB->>Sig2: Connect Stream (register)
Sig2->>Redis: HSet peer_registry: peerB → instance2
Sig2->>Redis: Expire peer_registry (TTL)
PeerA->>Sig1: Send message to peerB
Sig1->>Redis: HGet peer_registry for peerB
Redis-->>Sig1: instance2
Sig1->>Redis: Publish to signal:instance2 channel
Sig2->>Redis: Subscribe signal:instance2 channel
Redis-->>Sig2: Receive message envelope
Sig2->>PeerB: Route message
PeerB-->>Sig2: Receive
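The routing flow in the diagram above can be sketched roughly as follows. This is an in-memory model, not the PR's implementation: `InMemoryRedis` is a hypothetical stand-in that mimics only `hset`, `hget`, and `publish` (its callback-based `subscribe` is a simplification of real Redis pub/sub), the envelope fields are assumed, and the registry TTL step is omitted.

```python
import json
from collections import defaultdict


class InMemoryRedis:
    """Hypothetical stand-in for the few Redis commands used below."""

    def __init__(self):
        self.hashes = defaultdict(dict)
        self.subscribers = defaultdict(list)

    def hset(self, name, key, value):
        self.hashes[name][key] = value

    def hget(self, name, key):
        return self.hashes[name].get(key)

    def publish(self, channel, message):
        # Deliver synchronously and return the number of receivers.
        for callback in self.subscribers[channel]:
            callback(message)
        return len(self.subscribers[channel])

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)


class SignalInstance:
    def __init__(self, instance_id, redis):
        self.instance_id = instance_id
        self.redis = redis
        # peer_id -> delivered messages (stands in for a peer's stream)
        self.local_peers = {}
        redis.subscribe(f"signal:{instance_id}", self._on_envelope)

    def register(self, peer_id):
        """Record the peer locally and in the shared registry."""
        self.local_peers[peer_id] = []
        self.redis.hset("peer_registry", peer_id, self.instance_id)

    def send(self, from_peer, to_peer, body):
        """Deliver locally if possible, otherwise forward via pub/sub."""
        if to_peer in self.local_peers:
            self.local_peers[to_peer].append((from_peer, body))
            return True
        owner = self.redis.hget("peer_registry", to_peer)
        if owner is None:
            return False  # peer is not registered on any instance
        envelope = json.dumps({"from": from_peer, "to": to_peer, "body": body})
        return self.redis.publish(f"signal:{owner}", envelope) > 0

    def _on_envelope(self, raw):
        envelope = json.loads(raw)
        inbox = self.local_peers.get(envelope["to"])
        if inbox is not None:
            inbox.append((envelope["from"], envelope["body"]))
```

In this model each instance subscribes to its own channel at startup, so a published envelope always has a listener as long as the owning instance is alive.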
sequenceDiagram
participant Agent as Agent
participant Mgmt1 as Management 1<br/>HA Instance
participant Redis as Redis<br/>Registry & Pub/Sub
participant Mgmt2 as Management 2<br/>HA Instance
Agent->>Mgmt1: Login (acquire lock)
Mgmt1->>Redis: SETNX lock:peerID with instanceID
Redis-->>Mgmt1: OK
Mgmt1->>Mgmt1: Recalculate network map
Mgmt1->>Redis: Publish account:accountID update
Redis-->>Mgmt2: Receive account update
Mgmt2->>Mgmt2: Recalculate network map
Mgmt2->>Redis: HSet peer_registry: agentID → instance1
alt Instance 1 Fails
Agent->>Redis: Query peer_registry for mgmt instance
Redis-->>Agent: instance2
Agent->>Mgmt2: Reconnect, sync
Mgmt2->>Agent: Send updated network config
end
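The lock step at the start of the diagram above, together with the `SET NX EX` pattern named in the PR description, can be modeled as follows. This is a sketch under stated assumptions: `InMemoryRedis` is a hypothetical stand-in whose `set(..., nx=True, ex=...)` mirrors the return convention of redis-py (`True` on success, `None` when the key already exists), and `acquire_lock` / `release_lock` are illustrative helper names rather than functions from this PR.

```python
import time


class InMemoryRedis:
    """Hypothetical stand-in modeling SET with NX/EX, GET, and DEL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expires_at or None)
        self._clock = clock

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired keys
            return None
        return value

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key) is not None:
            return None  # key exists, NX refuses to overwrite
        expires_at = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True

    def get(self, key):
        return self._live(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def acquire_lock(redis, peer_id, owner, ttl_seconds):
    """Try to take lock:<peer_id>; the TTL frees it if the owner dies."""
    return redis.set(f"lock:{peer_id}", owner, nx=True, ex=ttl_seconds) is True


def release_lock(redis, peer_id, owner):
    """Release only if we still hold the lock.

    Against a real Redis server this check-and-delete would need to be
    atomic (for example via a server-side script); this single-threaded
    model does not need that.
    """
    key = f"lock:{peer_id}"
    if redis.get(key) == owner:
        redis.delete(key)
        return True
    return False
```

The expiry is what makes failover possible in this model: if the instance holding the lock crashes, a second instance can acquire it once the TTL elapses.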
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes