feat: Ubuntu 24 generic bootstrap (network scan + peer discovery) #5
alexandremattioli wants to merge 11 commits into main
…overy, systemd, cloud-init)
- scripts/bootstrap/ubuntu24/install.sh: installs deps and sets up systemd
- scripts/bootstrap/ubuntu24/bootstrap.sh: detects iface, ip/cidr, scans for next free IP
- scripts/bootstrap/ubuntu24/network_utils.sh: helpers for IP math and reachability
- scripts/bootstrap/ubuntu24/peer_agent.py: UDP broadcast-based peer discovery with optional HMAC
- scripts/bootstrap/ubuntu24/build-agent.service: systemd unit
- scripts/bootstrap/ubuntu24/agent-runner.sh: entrypoint for service
- scripts/bootstrap/ubuntu24/cloud-init.yaml: one-shot cloud-init to install agent
- docs/ubuntu24-bootstrap.md: usage
… openssl installed
Pull Request Overview
This PR introduces a peer discovery and network management system for Ubuntu 24.04 build agent nodes. The implementation enables build servers to automatically discover each other on a local network using UDP broadcast with optional HMAC authentication.
- Implements UDP broadcast-based peer discovery with HELLO/WELCOME message protocol
- Provides network utilities for interface detection and IP address management
- Sets up automated deployment via cloud-init and systemd service
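The HELLO/WELCOME exchange with optional HMAC authentication could look roughly like the following. This is a minimal sketch, not the code in `peer_agent.py`: the envelope layout (`type`, `payload`, `sig`), the function names `sign_message` and `verify_message`, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def _canonical(envelope):
    # Sorted keys and fixed separators so sender and receiver hash identical bytes.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode()


def sign_message(msg_type, payload, secret=None):
    """Serialize a message; attach an HMAC-SHA256 tag only if a secret is set."""
    envelope = {"type": msg_type, "payload": payload}
    if secret:
        envelope["sig"] = hmac.new(secret, _canonical(envelope), hashlib.sha256).hexdigest()
    return json.dumps(envelope).encode()


def verify_message(raw, secret=None):
    """Return the decoded envelope, or None if it is malformed or fails verification."""
    try:
        envelope = json.loads(raw)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(envelope, dict):
        return None
    if not secret:
        return envelope  # authentication disabled (the "optional" case)
    sig = envelope.pop("sig", None)
    if not isinstance(sig, str):
        return None
    expected = hmac.new(secret, _canonical(envelope), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return envelope if hmac.compare_digest(sig.encode(), expected.encode()) else None
```

With a shared secret, a receiver would drop any datagram for which `verify_message` returns `None`. Without a secret, every well-formed message is accepted, which is why the shared-secret distribution deserves review.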
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/bootstrap/ubuntu24/peer_agent.py | Core peer discovery agent using UDP broadcast for node communication |
| scripts/bootstrap/ubuntu24/network_utils.sh | Network utility functions for interface detection and IP scanning |
| scripts/bootstrap/ubuntu24/install.sh | Installation script for deploying agent on Ubuntu 24.04 nodes |
| scripts/bootstrap/ubuntu24/cloud-init.yaml | Cloud-init configuration for automated bootstrap |
| scripts/bootstrap/ubuntu24/build-agent.service | Systemd service definition for peer discovery agent |
| scripts/bootstrap/ubuntu24/bootstrap.sh | Bootstrap script for network configuration and IP claiming |
| scripts/bootstrap/ubuntu24/agent-runner.sh | Service wrapper that determines network parameters and launches agent |
| docs/ubuntu24-bootstrap.md | Documentation for installation and usage |
    'hostname': hostname,
    'primary_ip': primary,
    'candidate_ip': args.candidate,
    'ts': datetime.utcnow().isoformat()+"Z",
The datetime.utcnow() method is deprecated as of Python 3.12. Replace with datetime.now(timezone.utc) to ensure compatibility with newer Python versions. Import timezone from datetime.
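One detail to watch when making this change: `isoformat()` on a timezone-aware datetime already ends in `+00:00`, so keeping the existing `+"Z"` suffix would produce `...+00:00Z`. A small sketch of one way to keep the original wire format follows; the helper name `utc_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def utc_timestamp():
    """ISO 8601 UTC timestamp ending in 'Z', same shape as utcnow().isoformat()+"Z"."""
    now = datetime.now(timezone.utc)
    # An aware datetime serializes with a '+00:00' offset; swap it for 'Z'.
    return now.isoformat().replace("+00:00", "Z")
```

The payload entry would then read `'ts': utc_timestamp(),`.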
    sock=socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(2.0)
The socket is not bound to the PORT, which means it cannot receive messages from other peers. Add sock.bind(('', PORT)) after line 67 to enable receiving broadcast messages on the designated port.
Suggested change:

    - sock.settimeout(2.0)
    + sock.settimeout(2.0)
    + sock.bind(('', PORT))
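Put together, the socket setup with the bind applied might look like the sketch below. The port value and the function name `open_discovery_socket` are placeholders, not taken from `peer_agent.py`; `SO_REUSEADDR` is included because a later commit in this PR mentions it for reliable port binding.

```python
import socket

PORT = 50555  # hypothetical value; use the PORT constant defined in peer_agent.py


def open_discovery_socket(port=PORT, timeout=2.0):
    """Create a UDP socket that can both send broadcasts and receive on `port`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow a quick rebind after a service restart.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Required before sending to a broadcast address.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    # Bind to all interfaces so datagrams addressed to this port are delivered.
    sock.bind(("", port))
    return sock
```

Without the `bind`, the operating system assigns an ephemeral source port on the first send, so broadcasts that other peers address to `PORT` never reach this socket.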
    pl=msg['payload']
    if pl.get('node_id') == node_id:
        continue
    peers.append({'ip': addr[0], **pl})
Peers can be added multiple times if they send multiple messages (e.g., both HELLO and WELCOME). This results in duplicate entries in the peers list. Consider tracking unique peers by node_id and only adding new ones or updating existing entries.
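One way to do this is to key peers by `node_id` in a dict instead of appending to a list. The sketch below assumes the payload always carries a `node_id`; the helper name `record_peer` is hypothetical.

```python
def record_peer(peers, addr_ip, payload, own_node_id):
    """Insert or update a peer entry keyed by node_id; ignore our own messages."""
    node_id = payload.get("node_id")
    if node_id is None or node_id == own_node_id:
        return peers
    # A later message from the same node overwrites the earlier entry,
    # so a HELLO followed by a WELCOME yields one record, not two.
    peers[node_id] = {"ip": addr_ip, **payload}
    return peers
```

When the results are serialized, `list(peers.values())` gives back the same list-of-dicts shape the current code produces.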
    @@ -0,0 +1,104 @@
    #!/usr/bin/env python3
    import argparse, json, os, socket, struct, sys, time, uuid, hmac, hashlib
Import of 'sys' is not used.
Import of 'struct' is not used.
Suggested change:

    - import argparse, json, os, socket, struct, sys, time, uuid, hmac, hashlib
    + import argparse, json, os, socket, time, uuid, hmac, hashlib
    except Exception:
        pass
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

    - except Exception:
    -     pass
    + except Exception as e:
    +     print(f"Warning: Could not read secret file '{args.secret_file}': {e}. HMAC signing will be disabled.", file=sys.stderr)
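Note that this handler writes to `sys.stderr`, so the `sys` import has to stay if the suggestion is adopted, even though another comment in this review flags `sys` as unused. A self-contained sketch of the secret-loading step follows. The function name `load_secret` is hypothetical, and narrowing the handler to `OSError` is a variation on the reviewer's suggestion rather than what the PR does.

```python
import sys


def load_secret(path):
    """Return the secret bytes from `path`, or None if unset, unreadable, or empty."""
    if not path:
        return None
    try:
        with open(path, "rb") as fh:
            secret = fh.read().strip()
    except OSError as exc:  # missing file, permission denied, path is a directory, ...
        print(
            f"Warning: could not read secret file '{path}': {exc}. "
            "HMAC signing will be disabled.",
            file=sys.stderr,
        )
        return None
    # Treat an empty file the same as no secret at all.
    return secret or None
```

Whether an unreadable secret should downgrade to unsigned operation or abort startup is a policy question for the security review; this sketch keeps the warn-and-continue behavior from the suggestion.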
…or identity
- peer_agent.py: send IDENTIFY request, receive role/packages/config advice
- Peers respond with suggested role based on cluster state
- New server waits for consensus before self-configuring
- Persistent identity at /var/lib/build/identity.json
- README updated: role assignment is peer-driven, not predefined
…change ideas, receive messages
- Complete hive philosophy: peer-driven, self-organizing
- Step-by-step joining instructions with secret sharing
- Message exchange protocol documentation
- Real-world examples of 3 nodes joining
- Troubleshooting common hive connectivity issues
- Extensions: task distribution, heartbeat, capability exchange
- Security model for cross-subnet and secret rotation
…, signal handlers, CLI tool
- Replace print() with structured logging (timestamps, levels)
- Add retry logic with exponential backoff for IDENTIFY messages
- Auto-install packages immediately after role assignment
- Graceful shutdown on SIGTERM/SIGINT
- SO_REUSEADDR for reliable port binding
- CLI tool: hive status|peers|identity|reset
- Improved systemd units: restart limits, timeouts
- Better error messages throughout
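The retry logic with exponential backoff for IDENTIFY messages could be structured along these lines. This is an illustrative sketch only: the function name `send_with_backoff`, its parameters, the default delays, and the jitter are assumptions, and `send_once` stands in for whatever sends one IDENTIFY and waits for a reply.

```python
import random
import time


def send_with_backoff(send_once, attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call send_once() until it returns a truthy reply or attempts run out."""
    for attempt in range(attempts):
        reply = send_once()
        if reply:
            return reply
        if attempt < attempts - 1:
            # Double the wait each round, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Up to 10% jitter so nodes booted together do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    return None
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.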
- peer_agent.py now checks for /var/lib/build/hackerbook_acknowledged
- warns and exits (non-founder) if not present
- founder node allowed to proceed but still warned
…n identity
- peer_agent annotates identity with build_server name and is_coordinator
- advisor can read peers.json to know existing role distribution
…ate on shutdown/startup
… task placeholder
Adds an initial Ubuntu 24 bootstrap capability:
Peers discovered are written to /var/lib/build/peers.json.
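Since other tools read this file (the advisor and the CLI mentioned in the commits), writing it atomically avoids a reader seeing a half-written document. The following is a sketch under assumptions: the function name `write_peers` is hypothetical, and the on-disk shape (a JSON list of peer objects) is a guess, not confirmed by the PR.

```python
import json
import os
import tempfile


def write_peers(peers, path="/var/lib/build/peers.json"):
    """Write `peers` as JSON to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(path) or "."
    # The temp file must live in the same directory so os.replace stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".peers-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(peers, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file on any failure, then re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

`tempfile.mkstemp` creates the file readable and writable only by its owner, so a reader running as a different user would need the mode adjusted after the rename.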
Future enhancements proposed in docs:
Please review for security assumptions (shared secret distribution) and network scanning limits (currently naive).
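For reviewers thinking about the scanning limits, the "next free IP" search can be pictured as the sketch below. It is written in Python for illustration and is not a port of `bootstrap.sh` or `network_utils.sh`; the function name `next_free_ip`, the `is_taken` callback, and the `limit` cap are all assumptions.

```python
import ipaddress


def next_free_ip(cidr, is_taken, start_after=None, limit=256):
    """Return the first host address in `cidr` that is_taken() reports as unused.

    is_taken is a caller-supplied probe (for example a ping or ARP check).
    At most `limit` addresses are probed so large subnets are not swept fully.
    """
    network = ipaddress.ip_network(cidr, strict=False)  # accepts host/prefix form
    start = ipaddress.ip_address(start_after) if start_after else None
    probed = 0
    for host in network.hosts():  # excludes network and broadcast addresses
        if start is not None and host <= start:
            continue
        if probed >= limit:
            break
        probed += 1
        if not is_taken(str(host)):
            return str(host)
    return None
```

A probe that gets no answer cannot prove an address is unused: hosts that drop ICMP look free, and two nodes scanning at the same moment can pick the same candidate. Those are the main reasons a scan like this counts as naive.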