diff --git a/keps/prod-readiness/sig-network/4762.yaml b/keps/prod-readiness/sig-network/4762.yaml index 38fb70c6754..6e9112666cc 100644 --- a/keps/prod-readiness/sig-network/4762.yaml +++ b/keps/prod-readiness/sig-network/4762.yaml @@ -1,3 +1,5 @@ kep-number: 4762 alpha: approver: "@soltysh" +beta: + approver: "@soltysh" diff --git a/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/README.md b/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/README.md index 0a9b0d947db..3d3df2c0e0f 100644 --- a/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/README.md +++ b/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/README.md @@ -1,81 +1,4 @@ - # KEP-4762: Allows setting arbitrary FQDN as the pod's hostname - - - - - - [Release Signoff Checklist](#release-signoff-checklist) - [Summary](#summary) @@ -133,8 +56,8 @@ Items marked with (R) are required *prior to targeting to a milestone / release* - [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) - [x] (R) KEP approvers have approved the KEP status as `implementable` - [x] (R) Design details are appropriately documented -- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) - - [ ] e2e Tests for all Beta API Operations (endpoints) +- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [x] e2e Tests for all Beta API Operations (endpoints) - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free - [ ] (R) Graduation criteria is in place @@ -142,7 +65,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release* - [ ] (R) Production readiness review completed - [ ] (R) Production readiness review approved - [ ] "Implementation History" section is up-to-date for milestone -- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes - #### Alpha - Use the `HostnameOverride` feature gate to implement this feature. - Initial e2e tests completed and enabled. + - The link to the added e2e test: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/common/node/pod_hostnameoverride.go - Add documentation for feature gates. - Add a detailed table to the docs illustrating the mappings between pod hostnames and DNS records under different configurations. #### Beta -- Gather feedback from developers and surveys - Make feature gate to be enabled by default. - Update the feature gate documentation. @@ -342,28 +221,6 @@ Older apiserver versions will similarly ignore the hostnameOverride field: ## Production Readiness Review Questionnaire - - ### Feature Enablement and Rollback ###### How can this feature be enabled / disabled in a live cluster? @@ -393,7 +250,7 @@ There will be no impact on running Pods in the cluster. This change solely affec ###### Are there any tests for feature enablement/disablement? -We will verify proper functionality through unit tests and e2e tests, covering both enabled and disabled states of the feature gate. +We have added unit tests for enabling and disabling the feature gate in: `pkg/kubelet/kubelet_pods_test.go#TestGeneratePodHostNameAndDomain` ### Rollout, Upgrade and Rollback Planning @@ -413,7 +270,104 @@ The `kubelet_started_pods_total` metrics helps determine whether enabling/disabl ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? -N/A +I use `FEATURE_GATES=HostnameOverride=true ./hack/local-up-cluster.sh` to create a new cluster. + +Check the cluster version: +``` +$GOPATH/src/k8s.io/kubernetes/_output/bin/kubectl version +Client Version: v1.35.0 +Kustomize Version: v5.7.1 +Server Version: v1.35.0 +``` +Run a pod that uses HostnameOverride: +``` +cat < - ###### How can an operator determine if the feature is in use by workloads? - +Users can check which workloads are utilizing this feature with the following command: +``` +kubectl get pods -A -o json | jq -r '.items[] | select(.spec.hostnameOverride != null) | "\(.metadata.namespace) \(.metadata.name) \(.spec.hostnameOverride)"' +``` ###### How can someone using this feature know that it is working for their instance? - - -Validate alignment of `podSpec.hostnameOverride` with the pod's actual hostname using: -kubectl exec -- hostname +Users can use the following command to identify which workloads are using this feature and verify whether it is functioning as expected. +``` +kubectl get pods -A -o json | jq -r '.items[] | select(.spec.hostnameOverride != null) | "\(.metadata.namespace) \(.metadata.name) \(.spec.hostnameOverride)"' | while IFS=' ' read -r ns pod ho; do actual=$(kubectl exec -n "$ns" "$pod" -- hostname 2>/dev/null); [ "$actual" = "$ho" ] && echo "$ns $pod $actual $ho"; done +``` ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? - +If the `kubelet_started_pods_errors_total` metric in a cluster remains consistently at 0, then after introducing this feature, the value of `kubelet_started_pods_errors_total` should similarly remain at 0. ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? - - [x] Metrics - Metric name: `run_podsandbox_errors_total`, `kubelet_started_pods_total`, `kubelet_started_pods_errors_total`, `kubelet_restarted_pods_total` @@ -482,33 +405,16 @@ Pick one more of these and delete the rest. ###### Are there any missing metrics that would be useful to have to improve observability of this feature? - +No ### Dependencies - - ###### Does this feature depend on any specific services running in the cluster? No ### Scalability - - ###### Will enabling / using this feature result in any new API calls? No @@ -539,33 +445,13 @@ No ### Troubleshooting - - ###### How does this feature react if the API server and/or etcd is unavailable? +No impact to the running workloads + ###### What are other known failure modes? - +No known failure modes. ###### What steps should be taken if SLOs are not being met to determine the problem? @@ -573,6 +459,7 @@ For each of them, fill in the following information by copying the below templat - 2024-07-18: Initial draft KEP - 2025-08-13: Align KEPs with implemented PRs and documentation. +- 2025-10-10: Promote to beta stage ## Drawbacks diff --git a/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/kep.yaml b/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/kep.yaml index 76ab65c0e26..4e0a07d2af3 100644 --- a/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/kep.yaml +++ b/keps/sig-network/4762-allow-arbitrary-fqdn-as-pod-hostname/kep.yaml @@ -14,12 +14,12 @@ approvers: - "@SergeyKanzhelev" # The target maturity stage in the current dev cycle for this KEP. -stage: alpha +stage: beta # The most recent milestone for which work toward delivery of this KEP has been # done. This can be the current (upcoming) milestone, if it is being actively # worked on. -latest-milestone: "v1.34" +latest-milestone: "v1.35" # The milestone at which this feature was, or is targeted to be, at each stage. milestone: