
feat: Ubuntu 24 generic bootstrap (network scan + peer discovery)#5

Open
alexandremattioli wants to merge 11 commits into main from feature/ubuntu24-bootstrap

Conversation

@alexandremattioli
Owner

Adds an initial Ubuntu 24 bootstrap capability:

  • install.sh to install dependencies & systemd unit
  • bootstrap.sh & network_utils.sh for interface/IP detection & next free IP scan
  • peer_agent.py for UDP broadcast-based discovery (HELLO/WELCOME) with optional HMAC signatures
  • agent-runner.sh entrypoint
  • systemd unit build-agent.service
  • cloud-init.yaml for quick provisioning
  • docs/ubuntu24-bootstrap.md explaining usage & next steps
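
As a rough illustration of the optional HMAC signing mentioned for peer_agent.py, the sketch below signs and verifies a HELLO message with HMAC-SHA256. The envelope layout (`body`, `sig`) and the function names `sign_message` and `verify_message` are assumptions made for this example; the actual wire format used in the PR is not shown in this conversation.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_message(msg_type: str, payload: dict, secret: Optional[bytes]) -> bytes:
    """Serialize a message, attaching an HMAC-SHA256 signature when a secret is set."""
    # Canonical JSON (sorted keys, compact separators) so both sides hash the same bytes.
    body = json.dumps({"type": msg_type, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    envelope = {"body": body}  # hypothetical envelope layout
    if secret:
        envelope["sig"] = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return json.dumps(envelope).encode()


def verify_message(datagram: bytes, secret: Optional[bytes]) -> Optional[dict]:
    """Return the decoded message, or None if it is malformed or fails verification."""
    try:
        envelope = json.loads(datagram)
        body = envelope["body"]
        if secret:
            expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
            # compare_digest avoids leaking the mismatch position through timing.
            if not hmac.compare_digest(expected, str(envelope.get("sig", ""))):
                return None
        return json.loads(body)
    except (ValueError, KeyError, TypeError, AttributeError):
        return None
```

With no secret configured, messages pass through unsigned and unverified, which matches the "optional" behaviour described above but means any host on the broadcast domain can inject peers.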

Peers discovered are written to /var/lib/build/peers.json.

Future enhancements proposed in docs:

  • richer coordination protocol
  • cross-subnet registry
  • role-based configuration

Please review for security assumptions (shared secret distribution) and network scanning limits (currently naive).
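
To make the "next free IP" scan concrete, here is a minimal Python sketch of the idea. The PR implements this in shell (bootstrap.sh and network_utils.sh), so this is not the author's code; the function name `next_free_ip`, the `is_in_use` probe callback, and the `max_probes` cap are assumptions for illustration.

```python
import ipaddress
from typing import Callable, Optional


def next_free_ip(cidr: str, is_in_use: Callable[[str], bool],
                 start_after: Optional[str] = None,
                 max_probes: int = 64) -> Optional[str]:
    """Return the first host address in `cidr` that the probe reports as free."""
    # strict=False lets callers pass an interface address such as 192.168.1.10/24.
    network = ipaddress.ip_network(cidr, strict=False)
    floor = ipaddress.ip_address(start_after) if start_after else None
    probes = 0
    for host in network.hosts():  # hosts() skips the network and broadcast addresses
        if floor is not None and host <= floor:
            continue
        if probes >= max_probes:  # bound the scan so large subnets are not swept
            break
        probes += 1
        if not is_in_use(str(host)):
            return str(host)
    return None
```

A ping-based `is_in_use` probe is only a heuristic: hosts that drop ICMP look free, and nothing reserves the address between the scan and the claim, so two nodes booting together can pick the same candidate.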

…overy, systemd, cloud-init)

- scripts/bootstrap/ubuntu24/install.sh: installs deps and sets up systemd
- scripts/bootstrap/ubuntu24/bootstrap.sh: detects iface, ip/cidr, scans for next free IP
- scripts/bootstrap/ubuntu24/network_utils.sh: helpers for IP math and reachability
- scripts/bootstrap/ubuntu24/peer_agent.py: UDP broadcast-based peer discovery with optional HMAC
- scripts/bootstrap/ubuntu24/build-agent.service: systemd unit
- scripts/bootstrap/ubuntu24/agent-runner.sh: entrypoint for service
- scripts/bootstrap/ubuntu24/cloud-init.yaml: one-shot cloud-init to install agent
- docs/ubuntu24-bootstrap.md: usage
Copilot AI review requested due to automatic review settings November 8, 2025 00:56

Copilot AI left a comment

Pull Request Overview

This PR introduces a peer discovery and network management system for Ubuntu 24.04 build agent nodes. The implementation enables build servers to automatically discover each other on a local network using UDP broadcast with optional HMAC authentication.

  • Implements UDP broadcast-based peer discovery with HELLO/WELCOME message protocol
  • Provides network utilities for interface detection and IP address management
  • Sets up automated deployment via cloud-init and systemd service

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
| --- | --- |
| scripts/bootstrap/ubuntu24/peer_agent.py | Core peer discovery agent using UDP broadcast for node communication |
| scripts/bootstrap/ubuntu24/network_utils.sh | Network utility functions for interface detection and IP scanning |
| scripts/bootstrap/ubuntu24/install.sh | Installation script for deploying agent on Ubuntu 24.04 nodes |
| scripts/bootstrap/ubuntu24/cloud-init.yaml | Cloud-init configuration for automated bootstrap |
| scripts/bootstrap/ubuntu24/build-agent.service | Systemd service definition for peer discovery agent |
| scripts/bootstrap/ubuntu24/bootstrap.sh | Bootstrap script for network configuration and IP claiming |
| scripts/bootstrap/ubuntu24/agent-runner.sh | Service wrapper that determines network parameters and launches agent |
| docs/ubuntu24-bootstrap.md | Documentation for installation and usage |

'hostname': hostname,
'primary_ip': primary,
'candidate_ip': args.candidate,
'ts': datetime.utcnow().isoformat()+"Z",

Copilot AI Nov 8, 2025

The datetime.utcnow() method is deprecated as of Python 3.12. Replace with datetime.now(timezone.utc) to ensure compatibility with newer Python versions. Import timezone from datetime.
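
One detail worth checking when applying this change: `isoformat()` on a timezone-aware datetime already ends in `+00:00`, so keeping the existing `+"Z"` concatenation would yield a malformed `+00:00Z` suffix. A minimal sketch of one way to keep the trailing `Z` (the helper name `utc_timestamp` is hypothetical):

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Return an ISO 8601 UTC timestamp with a trailing 'Z'."""
    # An aware datetime serializes with a '+00:00' offset; swap it for 'Z'
    # instead of appending 'Z' after it.
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
```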

sock=socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.settimeout(2.0)

Copilot AI Nov 8, 2025

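The socket is never bound to PORT, so it cannot receive broadcasts that other peers send to that port; an unbound UDP socket only receives on the ephemeral port the OS assigns at its first send. Add sock.bind(('', PORT)) after line 67 to enable receiving broadcast messages on the designated port.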

Suggested change
- sock.settimeout(2.0)
+ sock.settimeout(2.0)
+ sock.bind(('', PORT))
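
A minimal sketch of a discovery socket with the bind applied, assuming the same options as the snippet above. The helper name `make_discovery_socket` is hypothetical, and `SO_REUSEADDR` is included because a later commit in this PR mentions it for reliable port binding.

```python
import socket


def make_discovery_socket(port: int, timeout: float = 2.0) -> socket.socket:
    """Create a UDP socket that can both send broadcasts and receive on `port`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow quick rebinding after a service restart.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Required before sending to a broadcast address.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    # Binding to '' listens on all interfaces, so datagrams sent to the
    # broadcast address on this port are delivered to the socket.
    sock.bind(("", port))
    return sock
```

Binding to all interfaces is the simplest choice; on multi-homed hosts it also means the agent accepts discovery traffic from every attached network.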

pl=msg['payload']
if pl.get('node_id') == node_id:
    continue
peers.append({'ip': addr[0], **pl})

Copilot AI Nov 8, 2025

Peers can be added multiple times if they send multiple messages (e.g., both HELLO and WELCOME). This results in duplicate entries in the peers list. Consider tracking unique peers by node_id and only adding new ones or updating existing entries.
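
One way to do this is to keep peers in a dict keyed by node_id instead of a list, so a second message from the same node refreshes its entry rather than appending a duplicate. The sketch below assumes the payload shape from the snippet above; `record_peer` is a hypothetical helper name.

```python
from typing import Dict, Tuple


def record_peer(peers: Dict[str, dict], addr: Tuple[str, int],
                payload: dict, own_id: str) -> bool:
    """Insert or refresh a peer keyed by node_id; return True if it was new."""
    peer_id = payload.get("node_id")
    # Ignore our own broadcasts and messages that carry no node_id.
    if not peer_id or peer_id == own_id:
        return False
    is_new = peer_id not in peers
    # Later messages (e.g. WELCOME after HELLO) overwrite the earlier entry.
    peers[peer_id] = {"ip": addr[0], **payload}
    return is_new
```

When writing peers.json, `list(peers.values())` would give the same list-of-objects shape as before, without duplicates.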

@@ -0,0 +1,104 @@
#!/usr/bin/env python3
import argparse, json, os, socket, struct, sys, time, uuid, hmac, hashlib

Copilot AI Nov 8, 2025

Import of 'sys' is not used.
Import of 'struct' is not used.

Suggested change
- import argparse, json, os, socket, struct, sys, time, uuid, hmac, hashlib
+ import argparse, json, os, socket, time, uuid, hmac, hashlib

Comment on lines +31 to +32
except Exception:
    pass

Copilot AI Nov 8, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except Exception:
-     pass
+ except Exception as e:
+     print(f"Warning: Could not read secret file '{args.secret_file}': {e}. HMAC signing will be disabled.", file=sys.stderr)
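
Note that this suggestion relies on `sys`, so the `sys` import has to stay if it is adopted, even though the unused-import comment proposes removing it. A slightly fuller sketch of the same idea, isolating the secret loading in one function, might look like this; `load_secret` is a hypothetical name, and catching `OSError` rather than `Exception` is a choice made for this example.

```python
import sys
from typing import Optional


def load_secret(path: Optional[str]) -> Optional[bytes]:
    """Read the shared secret, or return None (signing disabled) with a warning."""
    if not path:
        return None
    try:
        with open(path, "rb") as fh:
            secret = fh.read().strip()
    except OSError as exc:
        # Say why signing is off instead of silently passing.
        print(f"Warning: could not read secret file {path!r}: {exc}. "
              "HMAC signing will be disabled.", file=sys.stderr)
        return None
    if not secret:
        print(f"Warning: secret file {path!r} is empty. "
              "HMAC signing will be disabled.", file=sys.stderr)
        return None
    return secret
```

Falling back to unsigned operation is what the original code does; a stricter deployment might prefer to exit when a configured secret file cannot be read.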

…or identity

- peer_agent.py: send IDENTIFY request, receive role/packages/config advice
- Peers respond with suggested role based on cluster state
- New server waits for consensus before self-configuring
- Persistent identity at /var/lib/build/identity.json
- README updated: role assignment is peer-driven, not predefined
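
The commit message does not say how "consensus" is decided, so the following is only one possible reading: accept a role once a strict majority of responding peers suggest it and a minimum number of votes is reached, then persist the identity atomically. The names `agreed_role` and `save_identity` and the `quorum` parameter are assumptions, not taken from peer_agent.py.

```python
import json
import os
import tempfile
from collections import Counter
from typing import Iterable, Optional


def agreed_role(suggestions: Iterable[str], quorum: int) -> Optional[str]:
    """Return the role backed by >= quorum votes and a strict majority, else None."""
    counts = Counter(suggestions)
    if not counts:
        return None
    role, votes = counts.most_common(1)[0]
    total = sum(counts.values())
    if votes >= quorum and votes * 2 > total:
        return role
    return None  # tie or too few responses: keep waiting


def save_identity(path: str, identity: dict) -> None:
    """Write identity JSON atomically so a crash cannot leave a partial file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(identity, fh, indent=2)
        os.replace(tmp, path)  # atomic rename within the same directory
    except Exception:
        os.unlink(tmp)
        raise
```

In the PR the identity file lives at /var/lib/build/identity.json; the path is a parameter here so the sketch can run without root.
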
…change ideas, receive messages

- Complete hive philosophy: peer-driven, self-organizing
- Step-by-step joining instructions with secret sharing
- Message exchange protocol documentation
- Real-world examples of 3 nodes joining
- Troubleshooting common hive connectivity issues
- Extensions: task distribution, heartbeat, capability exchange
- Security model for cross-subnet and secret rotation
…, signal handlers, CLI tool

- Replace print() with structured logging (timestamps, levels)
- Add retry logic with exponential backoff for IDENTIFY messages
- Auto-install packages immediately after role assignment
- Graceful shutdown on SIGTERM/SIGINT
- SO_REUSEADDR for reliable port binding
- CLI tool: hive status|peers|identity|reset
- Improved systemd units: restart limits, timeouts
- Better error messages throughout
- peer_agent.py now checks for /var/lib/build/hackerbook_acknowledged
- warns and exits (non-founder) if not present
- founder node allowed to proceed but still warned
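
The retry and shutdown items listed above can be sketched together: a signal handler sets an event, and the backoff loop uses that event as an interruptible sleep, so SIGTERM ends a long wait immediately. This is an illustration under assumed names (`retry_with_backoff`, `install_signal_handlers`, `stop_event`) and assumed delay values, not the code from the commit.

```python
import signal
import threading
from typing import Callable

stop_event = threading.Event()  # set once a shutdown signal arrives


def _request_stop(signum, frame) -> None:
    stop_event.set()


def install_signal_handlers() -> None:
    """Route SIGTERM/SIGINT to the stop event (call from the main thread)."""
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, _request_stop)


def retry_with_backoff(attempt: Callable[[], bool], max_attempts: int = 5,
                       base_delay: float = 1.0, max_delay: float = 30.0,
                       stop: threading.Event = stop_event) -> bool:
    """Call `attempt` until it returns True, doubling the delay between tries."""
    delay = base_delay
    for n in range(max_attempts):
        if stop.is_set():
            return False
        if attempt():
            return True
        if n < max_attempts - 1:
            # Event.wait doubles as a sleep that a shutdown request can cut short.
            if stop.wait(min(delay, max_delay)):
                return False
            delay *= 2
    return False
```

Here `attempt` would wrap sending an IDENTIFY message and waiting for a reply; adding random jitter to the delay is a common refinement when many nodes boot at once.
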
…n identity

- peer_agent annotates identity with build_server name and is_coordinator
- advisor can read peers.json to know existing role distribution

2 participants