HA k3s cluster on NixOS, managed with Ansible and Helmfile.

Notes:
- Currently all nodes are master+worker
- The Longhorn config relies on at least 2 replicas, so the cluster needs at least 2 nodes
- The Logs Drilldown plugin is downloaded straight from GitHub Releases, bypassing Grafana Cloud
- nixos-init uses the whole disk: it formats the disk and installs NixOS on it

TODO:
- Fix loki-canary drop rules
- Terraform for Cloudflare
- Better support for custom dashboards
- Better way of declaring plugin GitHub links for Grafana
- Better Helm install and management flow

Stack:
- NixOS - declarative OS configuration
- k3s - lightweight Kubernetes
- Cloudflare Tunnels - zero-trust SSH and ingress access
- Longhorn - distributed block storage
- kube-prometheus-stack - Prometheus, Grafana, Alertmanager, node-exporter
- Loki + Promtail - log aggregation
cd ansible/inventory
cp hosts.yml.example hosts.yml
cp group_vars/all.yml.example group_vars/all.yml

Edit hosts.yml with node IPs and Cloudflare SSH tunnel tokens.
Edit group_vars/all.yml with root_domain, k3s_token, and cloudflare_ingress_tunnel_token.
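
Before running any playbook, it can help to confirm that the three variables named above are actually filled in. The following is a minimal sketch, not part of this repo: `check_vars` is a hypothetical helper, and it assumes each variable is a top-level, single-line YAML key in group_vars/all.yml.

```shell
#!/bin/sh
# Sketch only: check_vars is a hypothetical helper, not shipped with this repo.
# Assumption: each variable is a top-level "key: value" line in the file.
check_vars() {
    file="$1"
    missing=0
    for key in root_domain k3s_token cloudflare_ingress_tunnel_token; do
        # Require "key:" at the start of a line, followed by a non-blank,
        # non-comment value.
        if ! grep -Eq "^${key}:[[:space:]]*[^[:space:]#]" "$file"; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    # Exit status 0 when every key has a value, 1 otherwise.
    return "$missing"
}
```

Example use from ansible/inventory: `check_vars group_vars/all.yml && echo "all.yml looks complete"`. The check only catches absent or blank keys; it does not validate the values themselves.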
Per node:
- Boot NixOS minimal ISO
- Set password:
  passwd nixos
- Run:
  ansible-playbook playbooks/nixos-init.yml -i inventory/hosts.yml --limit <node>
- Remove USB and reboot
- Change the password from the default "changeme" set by the config
First node in k8s_control_plane is the cluster seed.
cd ansible
# Storage (first)
ansible-playbook playbooks/deploy-helm.yml -i inventory/hosts.yml -e helm_release=longhorn
# Logging
ansible-playbook playbooks/deploy-helm.yml -i inventory/hosts.yml -e helm_release=loki
ansible-playbook playbooks/deploy-helm.yml -i inventory/hosts.yml -e helm_release=promtail
# Monitoring
ansible-playbook playbooks/deploy-helm.yml -i inventory/hosts.yml -e helm_release=kube-prometheus-stack
# Cloudflare Tunnel
ansible-playbook playbooks/deploy-cloudflare-ingress-tunnel.yml -i inventory/hosts.yml

Update node config:
ansible-playbook playbooks/nixos-update.yml -i inventory/hosts.yml

Reset k3s on a node (rejoin cluster):
ansible-playbook playbooks/nixos-update.yml -i inventory/hosts.yml -e reset_k3s=true --limit <node>

Add to ~/.ssh/config:
Host node1.domain.com
    ProxyCommand cloudflared access ssh --hostname %h
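
The Helm releases in the deploy step are installed in a fixed order: storage first, then logging, then monitoring. A minimal sketch of wrapping that order in one function is shown here. `deploy_all` and the `DRY_RUN` variable are assumptions of this sketch, not part of the repo; the playbook path, inventory path, and release names are the ones used in the deploy commands.

```shell
#!/bin/sh
# Sketch only: deploy_all and DRY_RUN are hypothetical, not part of this repo.
# Run from the ansible directory, like the individual deploy commands.
deploy_all() {
    # Order matters: Longhorn (storage) goes first.
    for release in longhorn loki promtail kube-prometheus-stack; do
        set -- ansible-playbook playbooks/deploy-helm.yml \
            -i inventory/hosts.yml -e "helm_release=${release}"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            echo "$*"
        else
            # Stop at the first failing release so later ones are not
            # deployed on top of a broken dependency.
            "$@" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 deploy_all` prints the four commands without executing them, which is a cheap way to check the order before touching the cluster. The Cloudflare Tunnel playbook is left out on purpose, since it is a separate playbook rather than a Helm release.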