Ansible-driven infrastructure automation for a single-node Kubernetes homelab running on an NVIDIA Jetson Orin NX. Everything runs from Docker containers — no local Ansible install needed.
| Host | Hardware | OS | Kernel | Role |
|---|---|---|---|---|
| jetson | Seeed reComputer J4012 (Orin NX 16GB) | Ubuntu 20.04.6 LTS (L4T R35.5.0) | 5.10.192-tegra | k3s server, Tailscale exit node |
## Stack

- k3s (v1.34.5) — Lightweight Kubernetes with Flannel VXLAN + kube-proxy
- FluxCD — GitOps continuous delivery, watches `k8s/` in this repo
- Tailscale — Mesh VPN, exit node for LAN access, remote kubectl
- GitHub Actions Runner Controller (ARC) — Self-hosted ARM64 CI runners (`jetson-arm64`)
## Repository layout

```
.
├── Dockerfile # Ansible runtime (Python 3.12, ansible-core <2.18)
├── Makefile # All commands run via Docker
├── entrypoint.sh # Docker entrypoint (vault password handling)
├── ansible/
│ ├── ansible.cfg # Ansible configuration
│ ├── vault.yml # Encrypted secrets (ansible-vault)
│ ├── vault.yml.example # Vault template
│ ├── requirements.yml # Ansible Galaxy collections
│ ├── site.yml # Full playbook: common → kernel-modules → tailscale → k3s → flux
│ ├── k3s.yml # k3s-only: common → kernel-modules → k3s
│ ├── tailscale.yml # Tailscale-only playbook
│ ├── flux.yml # FluxCD bootstrap playbook
│ ├── diagnose.yml # Networking diagnostics playbook
│ ├── inventory/
│ │ ├── hosts.yml # Host inventory
│ │ ├── group_vars/all.yml # Shared variables (k3s version, timezone)
│ │ └── host_vars/jetson/vars.yml # Jetson-specific variables
│ └── roles/
│ ├── common/ # Base packages, sysctl, iptables-legacy, kernel modules
│ ├── kernel-modules/ # Compile missing Tegra kernel modules (see KERNEL.md)
│ ├── tailscale/ # Install, configure, exit node NAT fix
│ ├── k3s/ # Install/configure k3s, fetch kubeconfig
│ └── flux/ # Bootstrap FluxCD, create ARC secrets
├── k8s/
│ ├── infrastructure/
│ │ ├── arc-controller/ # ARC controller HelmRelease
│ │ └── arc-runners/ # Runner scale set (jetson-arm64)
│ └── apps/ # Application manifests (FluxCD-managed)
└── kubeconfig/ # Fetched kubeconfigs (gitignored)
```
## Prerequisites

- Docker
- SSH key access to the Jetson (via Tailscale or LAN)
- 1Password CLI (optional — for automatic vault password)
## Quick start

```bash
# Build the Ansible Docker image
make build

# Set up vault secrets
make setup
vim ansible/vault.yml # fill in your secrets
make vault-encrypt

# Verify SSH connectivity
make ping

# Full deploy (common + kernel-modules + tailscale + k3s + flux)
make deploy
```

## Secrets

The vault file (`ansible/vault.yml`) stores three secrets:
| Variable | Purpose |
|---|---|
| `vault_become_pass` | sudo password for the Jetson |
| `vault_tailscale_authkey` | Tailscale auth key for node registration |
| `vault_github_token` | GitHub PAT for Flux bootstrap and ARC runners |
The GitHub PAT needs Contents (read/write) and Administration (read/write) on this repo.
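For reference, the decrypted `ansible/vault.yml` follows `vault.yml.example`; given the variables above, it looks roughly like this sketch (all values are placeholders):

```yaml
# ansible/vault.yml (encrypt with `make vault-encrypt` before committing)
vault_become_pass: "changeme"                # sudo password on the Jetson
vault_tailscale_authkey: "tskey-auth-xxxxx"  # from the Tailscale admin console
vault_github_token: "ghp_xxxxx"              # PAT with Contents + Administration
```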
Vault password resolution order:

1. `VAULT_PASSWORD` environment variable
2. 1Password CLI (`op read 'op://Private/jetson-ansible-playbook/password'`)
3. `.vault-password` file (gitignored)
4. Interactive prompt
## Make targets

```
make build             # Build the Ansible Docker image
make deploy            # Run full site playbook (all roles)
make deploy-tailscale  # Run Tailscale-only playbook
make deploy-k3s        # Run k3s-only playbook
make deploy-k3s-clean  # Run k3s with clean reinstall
make deploy-flux       # Bootstrap FluxCD on the cluster
make diagnose          # Run networking diagnostics on the Jetson
make ping              # Verify SSH connectivity to all hosts
make k c="get pods -A" # Run kubectl against the k3s cluster
make lint              # Lint playbooks and roles
make shell             # Open a shell in the Ansible container
make setup             # Set up vault secrets from the template
make vault-encrypt     # Encrypt the vault file
make vault-edit        # Edit the encrypted vault file
make install-ca-cert   # Trust the homelab CA on your machine (macOS/Linux)
make kubeconfig-merge  # Merge homelab kubeconfig into ~/.kube/config
make clean             # Remove the Docker image
```
## Using the cluster

After deploying k3s, the kubeconfig is fetched to `kubeconfig/jetson.yaml` with the Tailscale IP as the server address:

```bash
# Using the make shortcut (scoped KUBECONFIG, doesn't pollute your shell)
make k c="get nodes"
make k c="get pods -A"

# Or export it manually
export KUBECONFIG=$(pwd)/kubeconfig/jetson.yaml
kubectl get nodes
```

## Roles

### common

Base system setup applied to all hosts:
- Installs essential packages (curl, jq, open-iscsi, nfs-common, etc.)
- Ensures `iptables-legacy` is the active alternative (not nftables)
- Loads kernel modules (br_netfilter, overlay, iptables modules)
- Applies sysctl settings (IP forwarding, inotify limits, bridge netfilter)
- Sets timezone
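As an illustration of the sysctl step, a minimal task sketch; the exact names and values live in the role itself, and those shown here are assumptions:

```yaml
# Hypothetical excerpt; settings are illustrative, not copied from the role
- name: Apply Kubernetes networking sysctls
  ansible.posix.sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    reload: true
  loop:
    - { name: net.ipv4.ip_forward, value: "1" }
    - { name: net.bridge.bridge-nf-call-iptables, value: "1" }  # needs br_netfilter loaded
    - { name: fs.inotify.max_user_watches, value: "524288" }
```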
### kernel-modules

Compiles missing netfilter/ipset kernel modules for the Tegra kernel. This is required because NVIDIA's stock kernel doesn't include several modules that Kubernetes networking needs. See `ansible/KERNEL.md` for the full story.
- Downloads L4T R35.5.0 BSP sources from NVIDIA
- Enables missing kernel configs (IP_SET, XT_MATCH_NFACCT, etc.)
- Compiles only the needed subdirectories using `make M=<dir> modules`
- Installs `.ko` files to `/lib/modules/` and runs `depmod`
- Loads modules and persists them across reboots
- Skips entirely on subsequent runs if modules already work
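A rough sketch of the build-and-install steps, assuming the BSP sources are unpacked under `/usr/src`; the source path and module directory here are placeholders, and the role plus `ansible/KERNEL.md` are authoritative:

```yaml
# Illustrative only; paths are assumptions, not the role's actual variables
- name: Build just the netfilter modules against the running kernel config
  ansible.builtin.command:
    cmd: make M=net/netfilter modules
    chdir: /usr/src/l4t-kernel/kernel-5.10   # assumed BSP source location

- name: Install the new modules and refresh module dependencies
  ansible.builtin.shell: |
    cp net/netfilter/*.ko /lib/modules/$(uname -r)/extra/
    depmod -a
  args:
    chdir: /usr/src/l4t-kernel/kernel-5.10
```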
### tailscale

Installs and configures Tailscale:
- Adds Tailscale apt repo and installs the package
- Brings up Tailscale with auth key, exit node, and route advertisement
- Exit node NAT fix: Deploys a systemd service + timer that re-applies iptables MASQUERADE rules every 5 minutes (k3s/flannel clobbers `ts-postrouting`)
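A sketch of how such a timer can be deployed; the unit name and schedule below are made up, and the role's own files are the source of truth:

```yaml
# Hypothetical unit name; illustrative, not copied from the role
- name: Install timer that periodically re-applies the exit-node NAT rules
  ansible.builtin.copy:
    dest: /etc/systemd/system/tailscale-nat-fix.timer
    content: |
      [Unit]
      Description=Re-apply Tailscale MASQUERADE rules clobbered by flannel

      [Timer]
      OnBootSec=1min
      OnUnitActiveSec=5min

      [Install]
      WantedBy=timers.target

- name: Enable and start the timer
  ansible.builtin.systemd:
    name: tailscale-nat-fix.timer
    enabled: true
    state: started
    daemon_reload: true
```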
### k3s

Installs k3s using the official install script:
- Optionally uninstalls existing k3s for clean installs (`k3s_clean_install: true`)
- Deploys `/etc/rancher/k3s/config.yaml` with Flannel VXLAN, cluster/service CIDRs
- Adds Tailscale IP as a TLS SAN for remote kubectl access
- Disables Traefik by default (managed by Flux instead)
- Fetches kubeconfig locally with the Tailscale IP substituted as server address
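For orientation, a minimal sketch of what such a `config.yaml` can look like; the CIDRs and SAN below are illustrative defaults, not values taken from this repo:

```yaml
# /etc/rancher/k3s/config.yaml; real values come from host_vars
flannel-backend: vxlan
disable:
  - traefik                  # Traefik is managed by Flux instead
cluster-cidr: 10.42.0.0/16   # k3s default, shown as a placeholder
service-cidr: 10.43.0.0/16   # k3s default, shown as a placeholder
tls-san:
  - 100.64.0.1               # placeholder for the node's Tailscale IP
```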
### flux

Bootstraps FluxCD for GitOps:
- Installs the Flux CLI
- Bootstraps Flux pointing at `k8s/` in this repo
- Creates the ARC GitHub secret for runner authentication
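The bootstrap itself is a single CLI call; a hedged sketch of how the role might invoke it (the owner/repository variable names are assumptions):

```yaml
# Hypothetical variable names; the role's real task may differ
- name: Bootstrap Flux against the k8s/ path of this repository
  ansible.builtin.command:
    cmd: >-
      flux bootstrap github
      --owner={{ github_owner }}
      --repository={{ github_repository }}
      --path=k8s
      --personal
  environment:
    GITHUB_TOKEN: "{{ vault_github_token }}"
```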
## GitOps workflow

Flux watches the `k8s/` directory in this repo and reconciles changes automatically:

- `k8s/infrastructure/` — Cluster infrastructure (ARC controller, runner scale sets)
- `k8s/apps/` — Application workloads
To deploy something new, add manifests to k8s/apps/, commit, and push. Flux picks it up.
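For example, a minimal app manifest that Flux would reconcile once pushed; the `whoami` app below is hypothetical and not part of this repo:

```yaml
# k8s/apps/whoami/deployment.yaml (hypothetical example)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami:v1.10   # multi-arch image, runs on ARM64
          ports:
            - containerPort: 80
```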
## Internal TLS

The cluster uses a self-signed CA to issue wildcard certificates for `*.int.shankyjs.com`. cert-manager handles issuance and renewal automatically.

```bash
# Generate a 4096-bit RSA CA keypair (valid 10 years)
openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
  -keyout ca.key -out ca.crt -days 3650 \
  -subj "/CN=shankyjs-homelab-ca/O=shankyjs-homelab"

# Store both in 1Password (item: ca_key_int_shankyjs_com)
op item edit ca_key_int_shankyjs_com \
  "credential[password]=$(cat ca.key)" \
  "certificate[password]=$(cat ca.crt)"

# Delete the private key from disk — it lives only in 1Password now
rm ca.key
```

The public cert is committed at `k8s/certs/ca.crt` for trust distribution. The private key never touches the repo.
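cert-manager consumes the key through a CA issuer; a minimal sketch using the resource names from the chain diagram below:

```yaml
# Sketch only; resource names follow the diagram below
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-ca
spec:
  ca:
    secretName: ca-key-secret   # kept in sync from 1Password by the ExternalSecret
```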
The certificate chain, from 1Password down to the issued wildcards:

```
1Password (ca_key_int_shankyjs_com)
 └─ ExternalSecret (cert-manager/ca-key-secret) ── refreshes every 24h
     └─ ClusterIssuer (selfsigned-ca)
         ├─ Certificate (traefik/wildcard-tls) ── *.int.shankyjs.com, 1yr, auto-renew at 30d
         └─ Certificate (zot/wildcard-tls) ── *.int.shankyjs.com, 1yr, auto-renew at 30d
```

The CA public cert is also distributed as a ConfigMap (`kube-system/homelab-ca-cert`) for containers that need to trust it (e.g. ARC DinD runners pushing to Zot).

```bash
# macOS (adds to System Keychain) / Linux (update-ca-certificates)
scripts/install-ca-cert.sh
```

| Item | Field | Contents |
|---|---|---|
| `ca_key_int_shankyjs_com` | `credential` | PEM-encoded CA private key |
| `ca_key_int_shankyjs_com` | `certificate` | PEM-encoded CA public certificate |

## Internal DNS overrides

Some services (like Zot) are reachable via public DNS but must be accessed through the internal Traefik IP from inside the cluster. We use CoreDNS to provide a split-horizon override so pods resolve the internal IP.

- Source: `k8s/infrastructure/coredns/configmap.yaml`
- Override: `zot.int.shankyjs.com -> 100.88.193.63`

Self-checks (run from a pod, e.g. an ARC runner):

```bash
kubectl -n arc-runners exec <runner-pod> -- getent hosts zot.int.shankyjs.com
kubectl -n arc-runners exec <runner-pod> -- sh -c \
  'curl -vk https://zot.int.shankyjs.com/v2/_catalog \
   --resolve zot.int.shankyjs.com:443:100.88.193.63'
```
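A hedged sketch of what such an override can look like in CoreDNS terms; the real file is `k8s/infrastructure/coredns/configmap.yaml`, and the ConfigMap name and structure below are illustrative:

```yaml
# Illustrative only; k3s can merge keys from a coredns-custom ConfigMap,
# but the repo's actual configmap.yaml is authoritative
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  zot.server: |
    zot.int.shankyjs.com:53 {
      hosts {
        100.88.193.63 zot.int.shankyjs.com
        fallthrough
      }
    }
```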
## Design decisions

- All Ansible runs happen in Docker — No local Ansible install, no Python version conflicts, reproducible across machines
- Ansible pinned to <2.18 — The Jetson runs Ubuntu 20.04 with Python 3.8; newer Ansible drops support for Python 3.8 on targets
- Flannel + kube-proxy over Calico eBPF — Calico requires ipset support which needed custom kernel modules anyway; Flannel works out of the box once the kernel modules are compiled
- Single GitHub PAT for both Flux and ARC — Simplifies secret management
- Monorepo — Ansible config, k8s manifests, and CI config all in one place