From b901bf4f217d4ad81948b8812f931d704ee9f44b Mon Sep 17 00:00:00 2001 From: Peter Hunt Date: Tue, 7 Oct 2025 16:18:47 -0400 Subject: [PATCH] KEP-2371: update to beta Signed-off-by: Peter Hunt --- keps/prod-readiness/sig-node/2371.yaml | 2 + .../2371-cri-pod-container-stats/README.md | 212 +++++++++--------- .../2371-cri-pod-container-stats/kep.yaml | 26 ++- 3 files changed, 119 insertions(+), 121 deletions(-) diff --git a/keps/prod-readiness/sig-node/2371.yaml b/keps/prod-readiness/sig-node/2371.yaml index 5ef2d7ac4d0..bf53590e29b 100644 --- a/keps/prod-readiness/sig-node/2371.yaml +++ b/keps/prod-readiness/sig-node/2371.yaml @@ -1,3 +1,5 @@ kep-number: 2371 alpha: approver: "@deads2k" +beta: + approver: "@deads2k" diff --git a/keps/sig-node/2371-cri-pod-container-stats/README.md b/keps/sig-node/2371-cri-pod-container-stats/README.md index ea296f39247..4bf44dfdde9 100644 --- a/keps/sig-node/2371-cri-pod-container-stats/README.md +++ b/keps/sig-node/2371-cri-pod-container-stats/README.md @@ -35,9 +35,9 @@ - [Integration tests](#integration-tests) - [e2e tests](#e2e-tests) - [Graduation Criteria](#graduation-criteria) - - [Alpha implementation](#alpha-implementation) - - [Alpha -> Beta Graduation](#alpha---beta-graduation) - - [Beta -> GA Graduation](#beta---ga-graduation) + - [Alpha](#alpha) + - [Beta](#beta) + - [GA](#ga) - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) - [Version Skew Strategy](#version-skew-strategy) - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) @@ -50,6 +50,7 @@ - [Implementation History](#implementation-history) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) + - [Infrastructure Needed (Optional)](#infrastructure-needed-optional) # cAdvisor-less, CRI-full Container and Pod Stats @@ -208,7 +209,7 @@ We want to avoid using cAdvisor for container & pod level stats and move metric * Improve performance and reduce confusion on metrics collection in the Kubelet. 
* Do not introduce breaking changes to the Summary API. * Eliminate dependencies on container runtime clients used by cAdvisor. -* Enhance CRI implementations to provide metrics analogous to the existing metrics provided by `/metrics/cadvisor`. +* Enhance CRI implementations to provide metrics analogous to the existing metrics provided by `/metrics/cadvisor`, and have the kubelet report metrics through that endpoint. ### Non-Goals @@ -221,6 +222,7 @@ We want to avoid using cAdvisor for container & pod level stats and move metric - Propose alternatives to the Summary API - Drop support for the fields in `/metrics/cadvisor` - Support `/metrics/cadvisor` from the Kubelet longterm. +- Windows stats and metrics ## Proposal @@ -241,11 +243,10 @@ This will be described in more detail in the [design details section](#design-de ### /metrics/cadvisor -1. Expose the metric fields provided in `/metrics/cadvisor` in an analogous Prometheus endpoint directly from the CRI implementation. +1. Expose the metric fields provided in `/metrics/cadvisor` in the same Prometheus endpoint, gathered by Kubelet from the CRI implementation and reported through the Kubelet. 2. cAdvisor should be updated to support no longer collecting stats that are duplicated with CRI implementation, and omit them from the report sent to `/metrics/cadvisor`. -3. The precise endpoint can change, but all the fields should be duplicated (so custom rules can be maintained). -4. Kubelet does not collect nor expose pod and container level metrics that were formally collected for and exposed by `/metrics/cadvisor`. -5. Kubelet should broadcast the endpoint from the CRI, similarly to how it does for `/metrics/cadvisor`. +3. The precise endpoint will not change, but all the fields should be duplicated (so custom rules can be maintained). +4. Kubelet does not collect pod and container level metrics that were formerly collected for and exposed by `/metrics/cadvisor`. 
### User Stories [optional] @@ -265,7 +266,7 @@ on two entities reporting metrics, not totally changing what stats the Kubelet r Thus, this KEP largely the plan described [here](#plan), with some changes: -- The CRI implementation will be responsible for the fields in the `/metrics/cadvisor` endpoint, though the name of the endpoint and location may change. +- The CRI implementation will be responsible for gathering the fields for the `/metrics/cadvisor` endpoint, though they will be broadcast by the Kubelet. - CRI API is used for all of the monitoring endpoints related to Containers and Pods (except Volume and Ephemeral Storage) - CRI API is used to provide metrics for eviction (as it relies on the summary API, which will be populated by the CRI implementation) @@ -315,7 +316,12 @@ as cAdvisor is fine tuned to perform in an adequate manner. ### Stats Summary API #### CRI Implementation -The CRI implementation will need to be extended to support reporting the full set of container-level from the [Summary API](#summary-container-stats-object). A new gRPC call will also be added to the CRI that allows reporting for metrics currently exported by cAdvisor, but are outside the scope of the Summary API. This new gRPC call will return a Prometheus metric based response which Kubelet can export. Additionally, `PodAndContainerStatsFromCRI` feature gate support will be added to only report Prometheus based metrics from the CRI when calling `/metrics/cadvisor` endpoint when the feature gate is enabled. The additional metrics we support will need to be added to the individual container runtimes. +The CRI implementation will need to be extended to support reporting the full set of container-level stats from the [Summary API](#summary-container-stats-object). +A new gRPC call will also be added to the CRI that allows reporting of metrics that are currently exported by cAdvisor but are outside the scope of the Summary API. 
+This new gRPC call will return a Prometheus metric based response which Kubelet can export. Additionally, `PodAndContainerStatsFromCRI` feature gate support +will be added so that, when the feature gate is enabled, the `/metrics/cadvisor` endpoint reports only Prometheus metrics translated from the CRI implementation. +The additional metrics we support will need to be added to the individual container runtimes. + ##### ContainerStats additions Currently, the CRI endpoints `{,List}ContainerStats` report the following fields for each container: - CPU @@ -608,9 +614,6 @@ Once all required CRI changes are completed, Kubelet can update its CRI stats pr To do so, we propose to add a feature gate, that, when set, modifies the existing CRI stats provider by removing all usage of cAdvisor for pod and container level stats. It will also configure cAdvisor to not report these stats. -As a note on that point: if users enable this behavior in alpha, and rely on `/metrics/cadvisor`, they would need to enable cAdvisor as a daemonset on the node. -There is no plan for the alpha iteration of this KEP to support `/metrics/cadvisor` coming from the built-in cAdvisor (when the feature gate is set). - Since all internal entities rely solely on the Summary API (eviction, preemption, metrics server), their needs will be satisfied by using the information gathered from the CRI. For users that rely on `/metrics/cadvisor`, see the details below. @@ -619,6 +622,8 @@ Additional work may be required to evaluate other kubelet components (e.g. evict Ideally all components will rely on summary API thereby alleviating need for cAdvisor for container and pod level stats. This is also a requirement to be able to disable cAdvisor container metrics collection. +To make clear to cluster admins when metrics are coming from the CRI rather than cAdvisor, a new metric `kubelet_metrics_provider` will be used, with a `provider` label of either `cri` or `cadvisor`. 
+ #### cAdvisor Once CRI and Kubelet stats provider level changes are in place, we can evaluate disabling cAdvisor from collecting container and pod level stats. @@ -645,9 +650,9 @@ so users can rely on them as a plug-and-play interface between the different imp The table above describes the various metrics that are in this endpoint. Each compliant CRI implementation must: -- Have a location broadcasted about where these metrics can be gathered from. The endpoint name must not necessarily be `/metrics/cadvisor`, nor be gathererd from the same port as it was from cAdvisor +- Respond to the PodSandboxMetrics and PodSandboxMetricDescriptors CRI calls with metrics from pods and containers that include the same information cAdvisor formerly reported. - Implement *all* metrics within the set of metrics that are decided on. - - **TODO** How will we decide this set? We could support all, or take polls from the community and come up with a set of sufficiently useful metrics. + - Currently, the [e2e_node](https://github.com/kubernetes/kubernetes/pull/126213) suite has been extended to include tests for the required metrics. - Pass a set of tests in the critest suite that verify they report the correct values for *all* supported metrics labels (to ensure continued conformance and standardization). Below is the proposed strategy for doing so: @@ -722,6 +727,8 @@ These are not used currently but are be very specific to each implementation and Based on the above settings, we should stay conservative and expose the existing set of working overlapping stats. This is what is proposed in the [changes cri](#cri-implementation) +Note: the actual implementation of these stats and metrics should be done in a separate KEP. These are included for informational purposes only. + ### Test Plan