From 24a50da57a2dd81a9aa6b4653c13c2c5f434eeb7 Mon Sep 17 00:00:00 2001 From: Edvin Norling Date: Fri, 18 Feb 2022 15:10:12 +0100 Subject: [PATCH 1/2] Initial node docs --- docs/xks/operator-guide/node.md | 57 +++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 docs/xks/operator-guide/node.md diff --git a/docs/xks/operator-guide/node.md b/docs/xks/operator-guide/node.md new file mode 100644 index 00000000000..332ee758712 --- /dev/null +++ b/docs/xks/operator-guide/node.md @@ -0,0 +1,57 @@ +--- +id: node +title: Node config +--- + +## Azure + +Our nodes are managed by azure but we set a number of + +For smaller clusters we normally use Standard_E2s_v4 for default node pool. + +### Azure Node type selection + +### Azure default node pool + +### Azure ephemeral nodes + +Should I use ephemeral nodes? +The short answer is yes, using Ephemeral nodes is the new standard in Azure but it's not the standard in terraform. +This is probably due to terraform azurerm provider don't want to do breaking changes. +Using Ephemeral nodes will give you better IO on your OS disk, for example it will speed up the boot time of your Kubernetes nodes. +For more information look at the official [Azure Ephemeral os disk](https://docs.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks) documentation. + +But there is a problem when starting to us ephemeral disk and it's that if you try to use more disk then the available cache disk that is available on your node type Azure will automatically change you from `Ephemeral` disk to `Managed`. +If you are using the azure cli to create your cluster you don't have to think about this (it will use the default value for the node) but when using Terraform you have to define `os_disk_size_gb`. + +So how do I know how much disk I can use? + +At the time of writing these docs there is no good answer to that question. The Azure documentation is rather inconsistent. +According to there docs you should be able see how much cache disk you can use by looking at the `()` in the table in: [fsv2](https://docs.microsoft.com/en-us/azure/virtual-machines/fsv2-series) + +This is true for `fsv2-series` but it's not the case when it comes to. + +When you have decided what type you want you can verify the available disk by running: + +```shell +curl https://ephemeraldisk.danielstechblog.de/api/ephemeraldisk?location=northeurope&family=Ddsv4 + +curl https://ephemeraldisk.danielstechblog.de/api/ephemeraldisk?location=northeurope&family=Edsv4 +``` + +Instead our default node will be: `Standard_E2ds_v4` + +To verify that you have changed the disk type: + +```shell +az aks nodepool show --cluster-name aks-dev-we-aks1 -g rg-dev-we-aks -n default | jq .osDiskType +``` + +Instead of we `Standard_D2s_v5` change to `Standard_D2d_v5` + +References: + +- [https://www.danielstechblog.io/identify-the-max-capacity-of-ephemeral-os-disks-for-azure-vm-sizes/](https://www.danielstechblog.io/identify-the-max-capacity-of-ephemeral-os-disks-for-azure-vm-sizes/) +- [https://github.com/azure/AKS/issues/2672](https://github.com/azure/AKS/issues/2672) + +## AWS From e8316e07be0763110ab7e52e30a48fb12e5bdbc1 Mon Sep 17 00:00:00 2001 From: Edvin Norling Date: Fri, 18 Feb 2022 15:49:07 +0100 Subject: [PATCH 2/2] Add info about ephemeral nodes --- docs/xks/operator-guide/node.md | 63 ++++++++++++++++++++++++++++----- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/docs/xks/operator-guide/node.md b/docs/xks/operator-guide/node.md index 332ee758712..2af83268769 100644 --- a/docs/xks/operator-guide/node.md +++ b/docs/xks/operator-guide/node.md @@ -5,14 +5,18 @@ title: Node config ## Azure -Our nodes are managed by azure but we set a number of - -For smaller clusters we normally use Standard_E2s_v4 for default node pool. +Our Kubernetes nodes are managed by azure. ### Azure Node type selection +TBD + ### Azure default node pool +For smaller clusters we normally use Standard_E2s_v4 for default node pool, if you want to use ephemeral OS disk you can use: Standard_E2ds_v4. + +TBD + ### Azure ephemeral nodes Should I use ephemeral nodes? @@ -29,29 +33,72 @@ So how do I know how much disk I can use? At the time of writing these docs there is no good answer to that question. The Azure documentation is rather inconsistent. According to there docs you should be able see how much cache disk you can use by looking at the `()` in the table in: [fsv2](https://docs.microsoft.com/en-us/azure/virtual-machines/fsv2-series) -This is true for `fsv2-series` but it's not the case when it comes to. +This is true for `fsv2-series` but it's not the case when it comes to `Edsv4-series` and both of them say that `Ephemeral OS Disks: Supported`. When you have decided what type you want you can verify the available disk by running: ```shell +curl https://ephemeraldisk.danielstechblog.de/api/ephemeraldisk?location=northeurope&family=Fsv2 + curl https://ephemeraldisk.danielstechblog.de/api/ephemeraldisk?location=northeurope&family=Ddsv4 curl https://ephemeraldisk.danielstechblog.de/api/ephemeraldisk?location=northeurope&family=Edsv4 ``` -Instead our default node will be: `Standard_E2ds_v4` +In your terraform config set define `os_disk_type="Ephemeral"` and `os_disk_size_gb = 128` if you are using `Standard_F8s_v2`. + +If you define bigger os disk size than what is available you will get an error when using terraform. + +```.hcl +aks_config = { + kubernetes_version = "1.21.2" + sku_tier = "Paid" + default_node_pool = { + orchestrator_version = "1.21.2" + vm_size = "Standard_E2s_v4" + min_count = 1 + max_count = 1 + node_labels = {} + }, + additional_node_pools = [ + { + name = "standard2" + orchestrator_version = "1.21.2" + vm_size = "Standard_F8s_v2" + min_count = 3 + max_count = 20 + node_labels = {} + node_taints = [] + os_disk_type = "Ephemeral" + os_disk_size_gb = 128 + spot_enabled = false + spot_max_price = null + } + ] +} +``` + +The minimal AKS image size supported is [30Gb](https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#use-ephemeral-os-on-existing-clusters). +The default value of `os_disk_size_gb` is 128Gb. Many of the smaller node types have rather small ephemeral disks available. For example `Standard_F2s_v2` only have 32Gb available. +This leaves a rather small place for things like container images, logs and emptyDir. + +To workaround this issue it is possible to set `kubelet_disk_type = "Temporary"`, this way it will be possible to use the temporary disk +that always is available on all azure vm:s. These disks are in general bigger then the ephemeral disk. +This features got supported very recently in [azurerm](https://github.com/hashicorp/terraform-provider-azurerm/issues/15449) so we haven't had time to implement it. +This is high on our TODO and you can follow the feature [here](https://github.com/XenitAB/terraform-modules/issues/554). +In the mean time you can only use ephemeral or managed os disk. -To verify that you have changed the disk type: +You can easy see what disk type you are using on your nodepool: ```shell az aks nodepool show --cluster-name aks-dev-we-aks1 -g rg-dev-we-aks -n default | jq .osDiskType ``` -Instead of we `Standard_D2s_v5` change to `Standard_D2d_v5` - References: - [https://www.danielstechblog.io/identify-the-max-capacity-of-ephemeral-os-disks-for-azure-vm-sizes/](https://www.danielstechblog.io/identify-the-max-capacity-of-ephemeral-os-disks-for-azure-vm-sizes/) - [https://github.com/azure/AKS/issues/2672](https://github.com/azure/AKS/issues/2672) ## AWS + +TBD