feat(gateway): add libkrun microVM gateway for hardware-isolated cluster bootstrap#76
Closed
…ter bootstrap

Introduce the navigator-gateway crate with safe Rust wrappers over the libkrun C FFI, enabling lightweight microVM execution via Apple Hypervisor.framework (macOS) or KVM (Linux).

Key components:
- navigator-gateway crate: KrunContextBuilder with RAII, ~15 FFI bindings, support for virtio-fs, console redirect, TSI control, and gvproxy virtio-net networking
- nav gateway run: ad-hoc microVM execution (direct enter model)
- nav gateway cluster: boots k3s inside a microVM with gvproxy networking, automatic rootfs extraction from Docker images, and port forwarding via gvproxy HTTP API
- vm-init.sh: guest bootstrap script with DHCP networking, noop CNI plugin (kernel lacks bridge module), and tmpfs-backed k3s data dir
- macOS codesigning with com.apple.security.hypervisor entitlement via auto-signing in scripts/bin/nav
- navigator-cli build.rs for libkrun/libkrunfw rpath resolution
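As a rough illustration of the "builder with RAII" shape mentioned above, the sketch below uses a fake in-memory backend in place of the C library. Every name here (`FakeVmm`, `VmContext`, `VmContextBuilder`, the `vcpus` and `ram_mib` settings) is hypothetical and is not the crate's actual API; the point is only that the handle is released exactly once, when the owning value is dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the C library: hands out ids and records frees.
#[derive(Default)]
pub struct FakeVmm {
    next_id: u32,
    pub freed: Vec<u32>,
}

/// Owns one context id; the id is released when this value is dropped.
pub struct VmContext {
    pub id: u32,
    pub vcpus: u8,
    pub ram_mib: u32,
    vmm: Rc<RefCell<FakeVmm>>,
}

impl Drop for VmContext {
    fn drop(&mut self) {
        // In a real wrapper this is where the FFI "free context" call would go.
        self.vmm.borrow_mut().freed.push(self.id);
    }
}

/// Collects settings first, then validates and allocates in `build()`.
pub struct VmContextBuilder {
    vmm: Rc<RefCell<FakeVmm>>,
    vcpus: u8,
    ram_mib: u32,
}

impl VmContextBuilder {
    pub fn new(vmm: Rc<RefCell<FakeVmm>>) -> Self {
        Self { vmm, vcpus: 1, ram_mib: 512 }
    }

    pub fn vcpus(mut self, n: u8) -> Self {
        self.vcpus = n;
        self
    }

    pub fn ram_mib(mut self, n: u32) -> Self {
        self.ram_mib = n;
        self
    }

    pub fn build(self) -> Result<VmContext, String> {
        // Validate before allocating so a failed build never leaks a handle.
        if self.vcpus == 0 || self.ram_mib == 0 {
            return Err("vcpus and ram_mib must be non-zero".to_string());
        }
        let id = {
            let mut vmm = self.vmm.borrow_mut();
            let id = vmm.next_id;
            vmm.next_id += 1;
            id
        };
        Ok(VmContext { id, vcpus: self.vcpus, ram_mib: self.ram_mib, vmm: self.vmm })
    }
}
```

Because validation happens before allocation, an invalid configuration returns an error without ever creating a handle that would need freeing.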
…ot retry

After fork_start(), the parent now:
- Polls https://localhost:<kube_port>/readyz with 2s intervals (120s timeout)
- Checks child PID is alive between polls (fast-fail on VM crash)
- Reads kubeconfig from rootfs via virtio-fs (host-visible)
- Rewrites server URL and cluster name, stores in standard location
- Prints kubectl usage instructions when ready

Always forwards kube API port (ephemeral if --kube-port not specified) to enable health checking. Adds --name flag for kubeconfig context naming.

Boot reliability: vm-init.sh now retries k3s up to 3 times with cleanup between attempts, handling the transient kine SQLite race condition on tmpfs that occasionally crashes k3s on first boot.

Exports rewrite_kubeconfig and store_kubeconfig from navigator-bootstrap for use by the gateway cluster command.
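The poll-with-fast-fail loop described above can be sketched as follows. This is a minimal illustration, not the PR's code: `wait_until_ready` and `WaitOutcome` are hypothetical names, and the readiness probe and liveness check are passed in as closures so the loop itself stays independent of HTTP or process APIs.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
pub enum WaitOutcome {
    Ready,
    ChildExited,
    TimedOut,
}

/// Polls `is_ready` every `interval` until it succeeds, the child dies,
/// or `timeout` elapses. Liveness is checked before every probe so a
/// crashed VM is reported immediately instead of after the full timeout.
pub fn wait_until_ready(
    mut is_ready: impl FnMut() -> bool,
    mut child_alive: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> WaitOutcome {
    let deadline = Instant::now() + timeout;
    loop {
        if !child_alive() {
            return WaitOutcome::ChildExited;
        }
        if is_ready() {
            return WaitOutcome::Ready;
        }
        // Give up if the next sleep would carry us past the deadline.
        if Instant::now() + interval > deadline {
            return WaitOutcome::TimedOut;
        }
        std::thread::sleep(interval);
    }
}
```

With the values quoted in the commit message, the call would use `Duration::from_secs(2)` for the interval and `Duration::from_secs(120)` for the timeout, with `is_ready` wrapping the `/readyz` request.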
…er export leak

is_pid_alive() now uses waitpid(WNOHANG) instead of kill(pid, 0). kill(pid, 0) returns success for zombie processes, so the health check loop would never detect an early child exit: it would spin for the full 120s timeout instead of failing fast.

Also fixes a resource leak in extract_rootfs_from_docker(): the docker export child process was never waited on, leaving a zombie and missing its exit status check.

Fixes doc comment on net_gvproxy() that incorrectly stated guest IP as .2 (actual: .3 with gvproxy v0.8.6).

Removes misleading explicit drop of c_env_strings in build().
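The difference between the two liveness checks can be sketched with direct declarations of the POSIX calls. This is an illustration only, not the PR's implementation: it assumes `pid_t` is a C `int` and `WNOHANG` is 1 (true on Linux and macOS), uses the `unsafe extern` syntax available from Rust 1.82, and `kill_probe` is a hypothetical name for the old behaviour.

```rust
use std::ffi::c_int;

// Declared by hand to keep the sketch dependency-free; the `libc` crate
// provides the same functions with platform-correct types.
unsafe extern "C" {
    fn waitpid(pid: c_int, status: *mut c_int, options: c_int) -> c_int;
    fn kill(pid: c_int, sig: c_int) -> c_int;
}

/// Assumed value of WNOHANG on Linux and macOS.
const WNOHANG: c_int = 1;

/// Returns true only while the child is still running.
/// waitpid(WNOHANG) returns 0 if the child has not changed state, the pid
/// if it has exited (reaping it as a side effect), and -1 on error, for
/// example when `pid` is not a child of the calling process.
pub fn is_pid_alive(pid: c_int) -> bool {
    let mut status: c_int = 0;
    unsafe { waitpid(pid, &mut status, WNOHANG) == 0 }
}

/// The older check: signal 0 only tests that the pid exists, and an
/// exited-but-unreaped (zombie) child still exists.
pub fn kill_probe(pid: c_int) -> bool {
    unsafe { kill(pid, 0) == 0 }
}
```

Note that `is_pid_alive` only works for direct children of the caller, and that a `false` result for an exited child also reaps it, so the exit status must be taken from `status` at that point if it is needed later.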
…e log

Root cause of readyz timeout: gvproxy DHCP assigns guest IPs nondeterministically (.2 or .3 depending on timing), but port forwarding was hardcoded to .3. When the guest got .2, the health check polled the wrong IP for 120s.

Fix: vm-init.sh now assigns 192.168.127.2 statically instead of using DHCP. The CLI guest_ip constant matches. This eliminates the race.

Also fixes libkrun VMM warnings (virtio-fs passthrough symlink errors) appearing on the parent terminal: fork_start() now redirects the child's stderr to the console log file via dup2(). The warnings are from macOS symlink resolution limits on Kubernetes ConfigMap mounts and are non-fatal.

vm-init.sh is now always updated in the cached rootfs on every boot (not just during initial extraction), so networking fixes take effect without requiring rootfs re-extraction.
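To make the fix concrete, the sketch below builds a port-forward request body from a single guest IP constant, so the forwarding target and the address assigned in vm-init.sh cannot drift apart. The endpoint path and JSON field names are assumed from the gvisor-tap-vsock README and should be checked against the gvproxy version in use; `GUEST_IP`, `EXPOSE_PATH`, and `expose_request_body` are hypothetical names, not the CLI's.

```rust
/// Static guest address, matching the value the commit message says
/// vm-init.sh now assigns.
pub const GUEST_IP: &str = "192.168.127.2";

/// Assumed gvproxy HTTP API path for adding a port forward.
pub const EXPOSE_PATH: &str = "/services/forwarder/expose";

/// Builds the JSON body that forwards `host_port` on the host to
/// `guest_port` on the guest. Field names are an assumption (see lead-in).
pub fn expose_request_body(host_port: u16, guest_port: u16) -> String {
    format!(
        "{{\"local\":\":{}\",\"remote\":\"{}:{}\"}}",
        host_port, GUEST_IP, guest_port
    )
}
```

For example, `expose_request_body(6443, 6443)` yields a body whose `remote` field is `192.168.127.2:6443`, which would be sent as a POST to `EXPOSE_PATH` over gvproxy's control socket.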
This is just a prototype. Not intended to merge
Summary
Introduces a new navigator gateway command group that uses libkrun to launch hardware-isolated microVMs via Apple Hypervisor.framework (macOS ARM64) or KVM (Linux), replacing Docker+k3s containers with a lighter-weight VM-based approach.

- nav gateway run: ad-hoc microVM execution with a user-provided rootfs
- nav gateway cluster: boots k3s inside a microVM with gvproxy networking, automatic rootfs extraction from Docker images, health checking, and kubeconfig extraction
- navigator-gateway crate: safe Rust wrappers over ~15 libkrun C FFI functions with RAII, builder pattern, and support for virtio-fs, console redirect, TSI control, and gvproxy virtio-net networking

Key design decisions
- … connect() syscalls, which breaks k3s internal localhost connections
- fork_start() boots the VM in a child process; the parent polls /readyz, extracts kubeconfig from the virtio-fs rootfs, and blocks on waitpid

New files
- crates/navigator-gateway/
- crates/navigator-cli/build.rs
- deploy/gateway/vm-init.sh
- crates/navigator-gateway/entitlements.plist

Prerequisites
Usage