Commit bcdb346

Merge remote-tracking branch 'origin/mainline' into eks-mcp-launch
2 parents 4bd9e8a + 5d540f9 commit bcdb346

12 files changed, +147 -1160 lines changed

latest/ug/clusters/create-cluster.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ Learn how to create an Amazon EKS cluster to run Kubernetes applications, includ
1212

1313
[NOTE]
1414
====
15-
This topic covers creating EKS clusters without EKS Auto Mode.
15+
This topic covers creating Amazon EKS clusters without EKS Auto Mode.
1616
1717
For detailed instructions on creating an EKS Auto Mode cluster, see <<create-cluster-auto>>.
1818

latest/ug/doc-history.adoc

Lines changed: 0 additions & 8 deletions
@@ -34,14 +34,6 @@ link:eks/latest/userguide/security-iam-awsmanpol.html[type="documentation"]
3434
Added `ec2:CopyVolumes` permission to `AmazonEBSCSIDriverPolicy` to allow the EBS CSI Driver to copy EBS volumes directly.
3535

3636

37-
[.update,date="2025-11-20"]
38-
=== New {aws} managed policy
39-
[.update-ulink]
40-
link:eks/latest/userguide/security-iam-awsmanpol.html#security-iam-awsmanpol-updates[type="documentation"]
41-
42-
Amazon EKS has released a new managed policy `AmazonEKSMCPReadOnlyAccess` to enable read-only tools in the Amazon EKS MCP Server for observability and troubleshooting. For information, see link:eks/latest/userguide/security-iam-awsmanpol.html#security-iam-awsmanpol-updates[Amazon EKS updates to {aws} managed policies,type="documentation"].
43-
44-
4537
[.update,date="2025-10-22"]
4638
=== {aws} managed policy updates
4739
[.update-ulink]

latest/ug/networking/lbc-manifest.adoc

Lines changed: 1 addition & 1 deletion
@@ -229,7 +229,7 @@ curl -Lo v2_14_1_full.yaml https://github.com/kubernetes-sigs/aws-load-balancer-
229229
+
230230
[source,shell,subs="verbatim,attributes"]
231231
----
232-
sed -i.bak -e '764,773d' ./v2_14_1_full.yaml
232+
sed -i.bak -e '764,772d' ./v2_14_1_full.yaml
233233
----
234234
+
235235
If you downloaded a different file version, then open the file in an editor and remove the following lines.

latest/ug/observability/container-network-observability.adoc

Lines changed: 144 additions & 18 deletions
@@ -11,7 +11,7 @@ In addition, Amazon EKS now provides network monitoring visualizations in the {a
1111

1212
These capabilities are enabled by https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor.html[Amazon CloudWatch Network Flow Monitor].
1313

14-
== Use Cases
14+
== Use cases
1515

1616
=== Measure network performance to detect anomalies
1717
Several teams standardize on an observability stack that allows them to measure their system’s performance, visualize system metrics, and be alerted when a specific threshold is breached. Container network observability in EKS aligns with this by exposing key system metrics that you can scrape to broaden observability of your system’s network performance at the pod and worker node level.
@@ -24,23 +24,137 @@ A lot of teams run EKS as the foundation for their platforms, making it the foca
2424

2525
== Features
2626

27-
. Performance metrics - This feature allows you to scrape network-related system metrics for pods and worker nodes directly from the Network Flow Monitor Agent running in your EKS cluster.
28-
. Service map - This feature dynamically visualizes intercommunication between workloads in the cluster, allowing you to quickly disclose key metrics (RT, RTO, and DT) associated with network flows between communicating pods.
27+
. Performance metrics - This feature allows you to scrape network-related system metrics for pods and worker nodes directly from the Network Flow Monitor (NFM) Agent running in your EKS cluster.
28+
. Service map - This feature dynamically visualizes intercommunication between workloads in the cluster, allowing you to quickly view key metrics (retransmissions - RT, retransmission timeouts - RTO, and data transferred - DT) associated with network flows between communicating pods.
2929
. Flow table - With this table, you can monitor the top talkers across the Kubernetes workloads in your cluster from three different angles: {aws} service view, cluster view, and external view. For each view, you can see the retransmissions, retransmission timeouts, and data transferred between the source pod and its destination.
3030
* {aws} service view: Shows top talkers to {aws} services (DynamoDB and S3)
3131
* Cluster view: Shows top talkers within the cluster (east ← to → west)
3232
* External view: Shows top talkers to cluster-external destinations outside {aws}
3333

34-
== Get Started
35-
To get started, enable Container Network Observability in the EKS console for a new or existing cluster. This will automate the creation of Network Flow Monitor dependencies (https://docs.aws.amazon.com/networkflowmonitor/2.0/APIReference/API_CreateScope.html[Scope] and https://docs.aws.amazon.com/networkflowmonitor/2.0/APIReference/API_CreateMonitor.html[Monitor] resources). In addition, you will have to install the https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor-agents-kubernetes-eks.html[Network Flow Monitor Agent add-on]. Alternatively, you can install these dependencies using the `{aws} CLI`, https://docs.aws.amazon.com/eks/latest/APIReference/API_Operations_Amazon_Elastic_Kubernetes_Service.html[EKS APIs] (for the add-on), https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor-API-operations.html[NFM APIs] or Infrastructure as Code (like https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/networkflowmonitor_monitor[Terraform]). Once these dependencies are in place, you can configure your preferred monitoring tool to scrape network performance metrics for pods and worker nodes from the NFM agent. To visualize the network activity and performance of your workloads, you can navigate to the EKS console under the “Network” tab of the cluster’s observability dashboard.
34+
== Get started
35+
To get started, enable Container Network Observability in the EKS console for a new or existing cluster. This will automate the creation of Network Flow Monitor (NFM) dependencies (https://docs.aws.amazon.com/networkflowmonitor/2.0/APIReference/API_CreateScope.html[Scope] and https://docs.aws.amazon.com/networkflowmonitor/2.0/APIReference/API_CreateMonitor.html[Monitor] resources). In addition, you will have to install the https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor-agents-kubernetes-eks.html[Network Flow Monitor Agent add-on]. Alternatively, you can install these dependencies using the `{aws} CLI`, https://docs.aws.amazon.com/eks/latest/APIReference/API_Operations_Amazon_Elastic_Kubernetes_Service.html[EKS APIs] (for the add-on), https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor-API-operations.html[NFM APIs] or Infrastructure as Code (like https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/networkflowmonitor_monitor[Terraform]). Once these dependencies are in place, you can configure your preferred monitoring tool to scrape network performance metrics for pods and worker nodes from the NFM agent. To visualize the network activity and performance of your workloads, you can navigate to the EKS console under the “Network” tab of the cluster’s observability dashboard.
3636

3737
When using Network Flow Monitor in EKS, you can maintain your existing observability workflow and technology stack while leveraging a set of additional features which further enable you to understand and optimize the network layer of your EKS environment. You can learn more about the https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor.pricing.html[Network Flow Monitor pricing here].
3838

39+
== Prerequisites and important notes
40+
41+
. As mentioned above, if you enable Container Network Observability from the EKS console, the underlying NFM resource dependencies (Scope and Monitor) will be automatically created on your behalf, and you will be guided through the installation process of the EKS add-on for NFM.
42+
. If you want to enable this feature using Infrastructure as Code (IaC) like Terraform, you will have to define the following dependencies in your IaC: NFM Scope, NFM Monitor, EKS add-on for NFM. In addition, you'll have to grant the https://docs.aws.amazon.com/aws-managed-policy/latest/reference/CloudWatchNetworkFlowMonitorAgentPublishPolicy.html[relevant permissions] to the EKS add-on using https://docs.aws.amazon.com/eks/latest/userguide/pod-id-agent-setup.html[Pod Identity].
43+
. The NFM agent's EKS add-on must be version 1.1.0 or later.
44+
45+
=== Required IAM permissions
46+
47+
==== EKS add-on for NFM agent
48+
You can use the `CloudWatchNetworkFlowMonitorAgentPublishPolicy` https://docs.aws.amazon.com/aws-managed-policy/latest/reference/CloudWatchNetworkFlowMonitorAgentPublishPolicy.html[{aws} managed policy] with Pod Identity. This policy contains permissions for the NFM agent to send telemetry reports (metrics) to a Network Flow Monitor endpoint.
49+
[source,json,subs="verbatim,attributes"]
50+
----
51+
{
52+
"Version" : "2012-10-17",
53+
"Statement" : [
54+
{
55+
"Effect" : "Allow",
56+
"Action" : [
57+
"networkflowmonitor:Publish"
58+
],
59+
"Resource" : "*"
60+
}
61+
]
62+
}
63+
----
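One way to attach this managed policy to the agent through Pod Identity from the command line is sketched below. By default it is a dry run that only prints the {aws} CLI commands. The cluster name, role name, account ID, namespace, and service account are placeholder assumptions that do not come from this guide, so confirm the values your add-on installation actually uses before running anything.

```shell
set -eu

# Placeholder values: assumptions for illustration, not taken from this guide.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
ROLE_NAME="${ROLE_NAME:-nfm-agent-role}"
ACCOUNT_ID="${ACCOUNT_ID:-111122223333}"
# Namespace and service account of the NFM agent (assumed; verify in your cluster).
NAMESPACE="${NAMESPACE:-amazon-network-flow-monitor}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-aws-network-flow-monitor-agent-service-account}"
# Managed policy ARN (verify it in the managed policy reference).
POLICY_ARN="arn:aws:iam::aws:policy/CloudWatchNetworkFlowMonitorAgentPublishPolicy"

# Trust policy that lets EKS Pod Identity assume the role.
TRUST_POLICY='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"pods.eks.amazonaws.com"},"Action":["sts:AssumeRole","sts:TagSession"]}]}'

# Print the command unless DRY_RUN=0 is set.
# Printed commands are a preview and are not shell-quoted.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

plan_pod_identity() {
  run aws iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document "$TRUST_POLICY"
  run aws iam attach-role-policy --role-name "$ROLE_NAME" \
    --policy-arn "$POLICY_ARN"
  run aws eks create-pod-identity-association \
    --cluster-name "$CLUSTER_NAME" \
    --namespace "$NAMESPACE" \
    --service-account "$SERVICE_ACCOUNT" \
    --role-arn "arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}"
}

plan_pod_identity
```

Review the printed commands first, then rerun with `DRY_RUN=0` once the placeholder values match your environment.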
64+
65+
==== Container Network Observability in the EKS console
66+
The following permissions are required to enable the feature and visualize the service map and flow table in the console.
67+
[source,json,subs="verbatim,attributes"]
68+
----
69+
{
70+
"Version" : "2012-10-17",
71+
"Statement" : [
72+
{
73+
"Effect": "Allow",
74+
"Action": [
75+
"networkflowmonitor:ListScopes",
76+
"networkflowmonitor:ListMonitors",
77+
"networkflowmonitor:GetScope",
78+
"networkflowmonitor:GetMonitor",
79+
"networkflowmonitor:CreateScope",
80+
"networkflowmonitor:CreateMonitor",
81+
"networkflowmonitor:TagResource",
82+
"networkflowmonitor:StartQueryMonitorTopContributors",
83+
"networkflowmonitor:StopQueryMonitorTopContributors",
84+
"networkflowmonitor:GetQueryStatusMonitorTopContributors",
85+
"networkflowmonitor:GetQueryResultsMonitorTopContributors"
86+
],
87+
"Resource": "*"
88+
}
89+
]
90+
}
91+
----
92+
93+
== Using Infrastructure as Code (IaC)
94+
95+
=== Terraform
96+
97+
If you are using Terraform to manage your {aws} cloud infrastructure, you can include the following resource configurations to enable Container Network Observability for your cluster.
98+
99+
===== NFM Scope
100+
101+
```
102+
data "aws_caller_identity" "current" {}
103+
104+
resource "aws_networkflowmonitor_scope" "example" {
105+
target {
106+
region = "us-east-1"
107+
target_identifier {
108+
target_type = "ACCOUNT"
109+
target_id {
110+
account_id = data.aws_caller_identity.current.account_id
111+
}
112+
}
113+
}
114+
115+
tags = {
116+
Name = "example"
117+
}
118+
}
119+
```
120+
121+
===== NFM Monitor
122+
123+
```
124+
resource "aws_networkflowmonitor_monitor" "example" {
125+
monitor_name = "eks-cluster-name-monitor"
126+
scope_arn = aws_networkflowmonitor_scope.example.scope_arn
127+
128+
local_resource {
129+
type = "AWS::EKS::Cluster"
130+
identifier = aws_eks_cluster.example.arn
131+
}
132+
133+
remote_resource {
134+
type = "AWS::Region"
135+
identifier = "us-east-1" # this must be the same region that the cluster is in
136+
}
137+
138+
tags = {
139+
Name = "example"
140+
}
141+
}
142+
```
143+
144+
===== EKS add-on for NFM
145+
146+
```
147+
resource "aws_eks_addon" "example" {
148+
cluster_name = aws_eks_cluster.example.name
149+
addon_name = "aws-network-flow-monitoring-agent"
150+
}
151+
```
152+
39153
== How does it work?
40154

41-
=== Performance Metrics
155+
=== Performance metrics
42156

43-
==== System Metrics
157+
==== System metrics
44158
If you are running third party (3P) tooling to monitor your EKS environment (such as Prometheus and Grafana), you can scrape the supported system metrics directly from the Network Flow Monitor agent. These metrics can be sent to your monitoring stack to expand measurement of your system’s network performance at the pod and worker node level. The available metrics are listed in the table, under Supported system metrics.
45159

46160
image::images/nfm-eks-metrics-workflow.png[Illustration of scraping system metrics]
@@ -62,7 +176,7 @@ OPEN_METRICS_PORT:
62176
Range: [0..65535]
63177
----
64178

65-
==== Flow Level Metrics
179+
==== Flow level metrics
66180
In addition, Network Flow Monitor captures network flow data along with flow level metrics: retransmissions, retransmission timeouts, and data transferred. This data is processed by Network Flow Monitor and visualized in the EKS console to surface traffic in your cluster’s environment, and how it’s performing based on these flow level metrics.
67181

68182
The diagram below depicts a workflow in which both types of metrics (system and flow level) can be leveraged to gain more operational intelligence.
@@ -74,60 +188,71 @@ image::images/nfm-eks-metrics-types-workflow.png[Illustration of workflow with d
74188

75189
Important note: The scraping of system metrics from the NFM agent and the process of the NFM agent pushing flow-level metrics to the NFM backend are independent processes.
76190

77-
===== Supported System Metrics
191+
===== Supported system metrics
78192

79193
Important note: system metrics are exported in https://openmetrics.io/[OpenMetrics] format.
80194

81-
[%header,cols="3"]
82-
|===
195+
[%header,cols="4"]
196+
|====
83197

84198
|Metric name
85199
|Type
200+
|Dimensions
86201
|Description
87202

88203
|ingress_flow_count
89204
|Counter
205+
|podName, podNamespace, nodeName
90206
|Number of flows to a pod
91207

92208
|egress_flow_count
93209
|Counter
210+
|podName, podNamespace, nodeName
94211
|Number of flows from a pod to anywhere
95212

96213
|ingress_pkt_count
97214
|Counter
215+
|podName, podNamespace, nodeName
98216
|Number of TCP packets received by a pod
99217

100218
|egress_pkt_count
101219
|Counter
220+
|podName, podNamespace, nodeName
102221
|Number of TCP packets sent out by a pod
103222

104223
|ingress_bytes_count
105224
|Counter
225+
|podName, podNamespace, nodeName
106226
|Number of bytes received by a pod
107227

108228
|egress_bytes_count
109229
|Counter
230+
|podName, podNamespace, nodeName
110231
|Number of bytes sent out by a pod
111232

112233
|bw_in_allowance_exceeded
113234
|Counter
235+
|eniID, nodeName
114236
|Number of packets queued or dropped because the inbound aggregate bandwidth exceeded the maximum for the instance
115237

116238
|bw_out_allowance_exceeded
117239
|Counter
240+
|eniID, nodeName
118241
|Number of packets queued or dropped because the outbound aggregate bandwidth exceeded the maximum for the instance
119242

120243
|pps_allowance_exceeded
121244
|Counter
245+
|eniID, nodeName
122246
|Packets per second limit breached at a pod
123247

124248
|conntrack_allowance_exceeded
125249
|Counter
250+
|eniID, nodeName
126251
|Connection tracking (conntrack) limit breached. An event is generated and logged on the node when 90 to 95% of the conntrack table limit is reached.
127252

128-
|===
253+
|====
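To illustrate how the dimensions in the table can be used once the metrics are scraped, here is a minimal sketch that sums one counter per pod from OpenMetrics text. The sample lines are hypothetical and were written for this example; the exact exposition produced by the agent (for example metric name suffixes or timestamps) may differ.

```shell
# Hypothetical scrape output; real metric lines from the agent may differ.
SAMPLE='# TYPE ingress_bytes_count counter
ingress_bytes_count{podName="web-1",podNamespace="default",nodeName="node-a"} 1200
ingress_bytes_count{podName="web-2",podNamespace="default",nodeName="node-a"} 300
egress_bytes_count{podName="web-1",podNamespace="default",nodeName="node-a"} 900
ingress_bytes_count{podName="web-1",podNamespace="default",nodeName="node-b"} 50
# EOF'

# Sum a metric per namespace/pod.
# $1 = metric name; OpenMetrics text is read from stdin.
# Assumes the sample value is the last field (no trailing timestamp).
sum_by_pod() {
  awk -v metric="$1" '
    index($0, metric "{") == 1 {
      pod = ""; ns = ""
      if (match($0, /podName="[^"]*"/))
        pod = substr($0, RSTART + 9, RLENGTH - 10)
      if (match($0, /podNamespace="[^"]*"/))
        ns = substr($0, RSTART + 14, RLENGTH - 15)
      total[ns "/" pod] += $NF
    }
    END { for (k in total) printf "%s %d\n", k, total[k] }
  ' | sort
}

printf '%s\n' "$SAMPLE" | sum_by_pod ingress_bytes_count
```

In practice a monitoring tool such as Prometheus would do this aggregation for you. The sketch only shows how the `podName` and `podNamespace` dimensions identify a pod across nodes.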
129254

130-
===== Supported System Metrics
255+
===== Supported flow level metrics
131256

132257
[%header,cols="3"]
133258
|===
@@ -150,7 +275,7 @@ Important note: system metrics are exported in https://openmetrics.io/[OpenMetri
150275

151276
|===
152277

153-
=== Service Map and Flow Table
278+
=== Service map and flow table
154279

155280
image::images/nfm-eks-workflow.png[Illustration of how NFM works with EKS]
156281

@@ -167,7 +292,7 @@ image::images/nfm-eks-workflow.png[Illustration of how NFM works with EKS]
167292

168293
The network flows pulled from the Top Contributors API are scoped to a 1 hour time range, and can include up to 500 flows from each category. For the service map, this means up to 1000 flows can be sourced and presented from the Intra AZ and Inter AZ flow categories over a 1 hour time range. For the flow table, this means that up to 3000 network flows can be sourced and presented from all 6 network flow categories over a 2 hour time range.
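The flow limits quoted above follow from multiplying the per-category cap by the number of flow categories each view draws from. The short calculation below reproduces them; the variable names are illustrative only.

```shell
# Per-category cap and category counts as stated in the text above.
FLOWS_PER_CATEGORY=500
SERVICE_MAP_CATEGORIES=2   # Intra AZ and Inter AZ
FLOW_TABLE_CATEGORIES=6    # all network flow categories

service_map_max=$((FLOWS_PER_CATEGORY * SERVICE_MAP_CATEGORIES))
flow_table_max=$((FLOWS_PER_CATEGORY * FLOW_TABLE_CATEGORIES))

echo "Service map: up to ${service_map_max} flows"
echo "Flow table:  up to ${flow_table_max} flows"
```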
169294

170-
===== Example: Service Map
295+
===== Example: Service map
171296

172297
_Deployment view_
173298

@@ -185,7 +310,7 @@ _Pod view_
185310

186311
image::images/photo-gallery-pod.png[Illustration of service map with photo-gallery app in pod view]
187312

188-
===== Example: Flow Table
313+
===== Example: Flow table
189314

190315
_{aws} service view_
191316

@@ -195,8 +320,9 @@ _Cluster view_
195320

196321
image::images/cluster-view.png[Illustration of flow table in cluster view]
197322

198-
== Considerations and Limitations
323+
== Considerations and limitations
199324
* Container Network Observability in EKS is only available in regions where https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-NetworkFlowMonitor-Regions.html[Network Flow Monitor is supported].
200325
* Supported system metrics are in OpenMetrics format, and can be directly scraped from the Network Flow Monitor (NFM) agent.
201326
* To enable Container Network Observability in EKS using Infrastructure as Code (IaC) like https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/networkflowmonitor_monitor[Terraform], you need to have these dependencies defined and created in your configurations: NFM scope, NFM monitor and the NFM agent.
202-
* Network Flow Monitor supports up to approximately 5 million flows per minute. This is approximately 5,000 EC2 instances (EKS worker nodes) with the Network Flow Monitor agent installed. Installing agents on more than 5000 instances may affect monitoring performance until additional capacity is available.
327+
* Network Flow Monitor supports up to approximately 5 million flows per minute. This corresponds to approximately 5,000 EC2 instances (EKS worker nodes) with the Network Flow Monitor agent installed. Installing agents on more than 5,000 instances may affect monitoring performance until additional capacity is available.
328+
* The NFM agent's EKS add-on must be version 1.1.0 or later.
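The minimum add-on version can be checked with a small helper like the sketch below. It assumes add-on version strings of the form `v1.1.0-eksbuild.1`, which is an assumption about the format rather than something stated in this topic, and it compares only the numeric `major.minor.patch` part.

```shell
# Return 0 when version $1 is at least version $2.
# A leading "v" and any "-suffix" (for example "-eksbuild.1") are ignored.
version_ge() {
  have=$(printf '%s' "$1" | sed -e 's/^v//' -e 's/-.*$//')
  need=$(printf '%s' "$2" | sed -e 's/^v//' -e 's/-.*$//')
  # Sort both versions numerically by field and take the lower one.
  lowest=$(printf '%s\n%s\n' "$have" "$need" \
    | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
  [ "$lowest" = "$need" ]
}

# The installed version could be read with, for example:
#   aws eks describe-addon --cluster-name my-cluster \
#     --addon-name aws-network-flow-monitoring-agent \
#     --query 'addon.addonVersion' --output text
# Here a hypothetical value is used instead of calling the API.
INSTALLED="v1.1.0-eksbuild.1"

if version_ge "$INSTALLED" "1.1.0"; then
  echo "NFM agent add-on version is supported"
else
  echo "NFM agent add-on must be upgraded to 1.1.0 or later"
fi
```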

latest/ug/security/iam-reference/security-iam-awsmanpol.adoc

Lines changed: 0 additions & 23 deletions
@@ -84,25 +84,6 @@ This policy includes the following permissions that allow Amazon EKS to complete
8484
8585
To view the latest version of the JSON policy document, see link:aws-managed-policy/latest/reference/AmazonEKSFargatePodExecutionRolePolicy.html#AmazonEKSFargatePodExecutionRolePolicy-json[AmazonEKSFargatePodExecutionRolePolicy,type="documentation"] in the {aws} Managed Policy Reference Guide.
8686

87-
[#security-iam-awsmanpol-amazoneksmcpreadonlyaccess]
88-
== {aws} managed policy: AmazonEKSMCPReadOnlyAccess
89-
:info_titleabbrev: AmazonEKSMCPReadOnlyAccess
90-
91-
You can attach `AmazonEKSMCPReadOnlyAccess` to your IAM entities. This policy provides read-only access to Amazon EKS resources and related {aws} services, enabling the Amazon EKS Model Context Protocol (MCP) Server to perform observability and troubleshooting operations without making any modifications to your infrastructure.
92-
93-
*Permissions details*
94-
95-
This policy includes the following permissions that allow principals to complete the following tasks:
96-
97-
* *`eks`* - Allows principals to describe and list EKS clusters, node groups, add-ons, access entries, insights, and access the Kubernetes API for read-only operations.
98-
* *`iam`* - Allows principals to retrieve information about IAM roles, policies, and their attachments to understand the permissions associated with EKS resources.
99-
* *`ec2`* - Allows principals to describe VPCs, subnets, and route tables to understand the network configuration of EKS clusters.
100-
* *`sts`* - Allows principals to retrieve caller identity information for authentication and authorization purposes.
101-
* *`logs`* - Allows principals to start queries and retrieve query results from CloudWatch Logs for troubleshooting and monitoring.
102-
* *`cloudwatch`* - Allows principals to retrieve metric data for monitoring cluster and workload performance.
103-
* *`eks-mcp`* - Allows principals to invoke MCP operations and call read-only tools within the Amazon EKS MCP Server.
104-
105-
To view the permissions for this policy, see link:aws-managed-policy/latest/reference/AmazonEKSMCPReadOnlyAccess.html[AmazonEKSMCPReadOnlyAccess,type="documentation"] in the {aws} Managed Policy Reference.
10687

10788
[#security-iam-awsmanpol-AmazonEKSConnectorServiceRolePolicy]
10889
== {aws} managed policy: AmazonEKSConnectorServiceRolePolicy
@@ -453,10 +434,6 @@ https://github.com/awsdocs/amazon-eks-user-guide/commits/mainline/latest/ug/secu
453434
|Added `ec2:CopyVolumes` permission to allow the EBS CSI Driver to copy EBS volumes directly.
454435
|November 17, 2025
455436

456-
|Introduced <<security-iam-awsmanpol-amazoneksmcpreadonlyaccess>>.
457-
|Amazon EKS introduced new managed policy `AmazonEKSMCPReadOnlyAccess` to enable read-only tools in the Amazon EKS MCP Server for observability and troubleshooting.
458-
|November 18, 2025
459-
460437
|Added permission to <<security-iam-awsmanpol-amazoneksservicerolepolicy>>.
461438
|Added `ec2:DescribeRouteTables` and `ec2:DescribeNetworkAcls` permissions to `AmazonEKSServiceRolePolicy`. This allows Amazon EKS to detect configuration issues with VPC route tables and network ACLs for hybrid nodes as part of cluster insights.
462439
|Oct 22, 2025
