Create the 4.18 monitoring branch and move the existing content #97809

Merged
16 changes: 7 additions & 9 deletions about-ocp-monitoring/about-ocp-monitoring.adoc
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@ include::_attributes/common-attributes.adoc[]

toc::[]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
ifndef::openshift-dedicated,openshift-rosa[]
{product-title} includes a preconfigured, preinstalled, and self-updating monitoring stack that provides monitoring for core platform components. You also have the option to xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_preparing-to-configure-the-monitoring-stack-uwm[enable monitoring for user-defined projects].

A cluster administrator can xref:../configuring-core-platform-monitoring/preparing-to-configure-the-monitoring-stack.adoc#preparing-to-configure-the-monitoring-stack[configure the monitoring stack] with the supported configurations. {product-title} delivers monitoring best practices out of the box.
@@ -15,14 +15,12 @@ A set of alerts are included by default that immediately notify administrators a

After installing {product-title}, cluster administrators can optionally enable monitoring for user-defined projects. By using this feature, cluster administrators, developers, and other users can specify how services and pods are monitored in their own projects.
As a cluster administrator, you can find answers to common problems such as user metrics unavailability and high consumption of disk space by Prometheus in xref:../troubleshooting/troubleshooting-monitoring-issues.adoc#troubleshooting-monitoring-issues[Troubleshooting monitoring issues].
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
endif::openshift-dedicated,openshift-rosa[]

ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
ifdef::openshift-dedicated,openshift-rosa[]
In {product-title}, you can monitor your own projects in isolation from Red{nbsp}Hat Site Reliability Engineering (SRE) platform metrics. You can monitor your own projects without the need for an additional monitoring solution.
endif::openshift-dedicated,openshift-rosa[]




The {product-title}
ifdef::openshift-rosa,openshift-rosa-hcp[]
(ROSA)
endif::openshift-rosa,openshift-rosa-hcp[]
monitoring stack is based on the link:https://prometheus.io/[Prometheus] open source project and its wider ecosystem.
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
35 changes: 16 additions & 19 deletions about-ocp-monitoring/monitoring-stack-architecture.adoc
@@ -6,24 +6,22 @@ include::_attributes/common-attributes.adoc[]

toc::[]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
The {product-title} monitoring stack is based on the link:https://prometheus.io/[Prometheus] open source project and its wider ecosystem.
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
The monitoring stack includes default monitoring components and components for monitoring user-defined projects.
The {product-title}
ifdef::openshift-rosa[]
(ROSA)
endif::openshift-rosa[]
monitoring stack is based on the link:https://prometheus.io/[Prometheus] open source project and its wider ecosystem. The monitoring stack includes default monitoring components and components for monitoring user-defined projects.

// Understanding the monitoring stack
include::modules/monitoring-understanding-the-monitoring-stack.adoc[leveloffset=+1]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
ifndef::openshift-dedicated,openshift-rosa[]
//Default monitoring components
include::modules/monitoring-default-monitoring-components.adoc[leveloffset=+1]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

include::modules/monitoring-default-monitoring-targets.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources
* xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#getting-detailed-information-about-a-target_accessing-metrics-as-an-administrator[Getting detailed information about a metrics target]
endif::openshift-dedicated,openshift-rosa[]

//Components for monitoring user-defined projects
include::modules/monitoring-components-for-monitoring-user-defined-projects.adoc[leveloffset=+1]
@@ -35,25 +33,24 @@ include::modules/monitoring-monitoring-stack-in-ha-clusters.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html-single/operators/index#osdk-ha-sno[High-availability or single-node cluster detection and support]
* xref:../configuring-core-platform-monitoring/storing-and-recording-data.adoc#configuring-persistent-storage_storing-and-recording-data[Configuring persistent storage]
* xref:../configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#configuring-performance-and-scalability[Configuring performance and scalability]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* xref:../configuring-user-workload-monitoring/storing-and-recording-data-uwm.adoc#configuring-persistent-storage_storing-and-recording-data-uwm[Configuring persistent storage]
* xref:../configuring-user-workload-monitoring/configuring-performance-and-scalability-uwm.adoc#configuring-performance-and-scalability-uwm[Configuring performance and scalability]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

//Glossary of common terms for OCP monitoring
include::modules/monitoring-common-terms.adoc[leveloffset=+1]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
ifndef::openshift-dedicated,openshift-rosa[]
[role="_additional-resources"]
[id="additional-resources_{context}"]
== Additional resources
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/support/index#about-remote-health-monitoring[About remote health monitoring]
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#granting-users-permission-to-monitor-user-defined-projects_preparing-to-configure-the-monitoring-stack-uwm[Granting users permissions for monitoring for user-defined projects]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/security_and_compliance/index#tls-security-profiles[Configuring TLS security profiles]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
endif::openshift-dedicated,openshift-rosa[]






4 changes: 1 addition & 3 deletions accessing-metrics/accessing-metrics-as-a-developer.adoc
@@ -13,10 +13,8 @@ You can access metrics to monitor the performance of your cluster workloads.

* xref:../key-concepts/key-concepts.adoc#understanding-metrics_key-concepts[Understanding metrics]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
//Viewing a list of available metrics
include::modules/monitoring-viewing-a-list-of-available-metrics.adoc[leveloffset=+1]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

//Querying metrics for user-defined projects with the OCP web console
include::modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc[leveloffset=+1]
@@ -33,4 +31,4 @@ include::modules/monitoring-reviewing-monitoring-dashboards-developer.adoc[level
.Additional resources

* xref:../key-concepts/key-concepts.adoc#about-monitoring-dashboards_key-concepts[About monitoring dashboards]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/building_applications/index#monitoring-project-and-application-metrics-using-developer-perspective[Monitoring project and application metrics using the Developer perspective]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html-single/building_applications/index#odc-monitoring-project-and-application-metrics-using-developer-perspective[Monitoring project and application metrics using the Developer perspective]
3 changes: 1 addition & 2 deletions accessing-metrics/accessing-metrics-as-an-administrator.adoc
@@ -13,10 +13,8 @@ You can access metrics to monitor the performance of cluster components and your

* xref:../key-concepts/key-concepts.adoc#understanding-metrics_key-concepts[Understanding metrics]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
//Viewing a list of available metrics
include::modules/monitoring-viewing-a-list-of-available-metrics.adoc[leveloffset=+1]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

//Querying metrics for all projects with the OCP web console
include::modules/monitoring-querying-metrics-for-all-projects-with-mon-dashboard.adoc[leveloffset=+1]
@@ -36,3 +34,4 @@ include::modules/monitoring-reviewing-monitoring-dashboards-admin.adoc[leveloffs
.Additional resources

* xref:../key-concepts/key-concepts.adoc#about-monitoring-dashboards_key-concepts[About monitoring dashboards]

@@ -43,10 +43,8 @@ include::modules/monitoring-resources-reference-for-the-cluster-monitoring-opera
[id="additional-resources_{context}"]
== Additional resources

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_preparing-to-configure-the-monitoring-stack-uwm[Enabling monitoring for user-defined projects]
* xref:../configuring-core-platform-monitoring/configuring-metrics.adoc#configuring-remote-write-storage_configuring-metrics[Configuring remote write storage for core platform monitoring]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* xref:../configuring-user-workload-monitoring/configuring-metrics-uwm.adoc#configuring-remote-write-storage_configuring-metrics-uwm[Configuring remote write storage for monitoring of user-defined projects]
* xref:../accessing-metrics/accessing-metrics-as-an-administrator.adoc#accessing-metrics-as-an-administrator[Accessing metrics as an administrator]
* xref:../accessing-metrics/accessing-metrics-as-a-developer.adoc#accessing-metrics-as-a-developer[Accessing metrics as a developer]
@@ -32,7 +32,13 @@ The configuration file is always defined under the `config.yaml` key in the conf
====
* Not all configuration parameters for the monitoring stack are exposed.
Only the parameters and fields listed in this reference are supported for configuration.
For more information about supported configurations, see xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#maintenance-and-support-for-monitoring[Maintenance and support for monitoring].
For more information about supported configurations, see
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#maintenance-and-support-for-monitoring[Maintenance and support for monitoring]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
xref:../../observability/monitoring/configuring-the-monitoring-stack.adoc#maintenance-and-support_configuring-the-monitoring-stack[Maintenance and support for monitoring].
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

* Configuring cluster monitoring is optional.
* If a configuration does not exist or is empty, default values are used.
@@ -55,7 +61,7 @@ link:#thanosrulerconfig[ThanosRulerConfig]
[options="header"]
|===
| Property | Type | Description
|apiVersion|string|Defines the API version of Alertmanager. `v1` is no longer supported, `v2` is set as the default value.
|apiVersion|string|Defines the API version of Alertmanager. Possible values are `v1` or `v2`. The default is `v2`.

|bearerToken|*v1.SecretKeySelector|Defines the secret key reference containing the bearer token to use when authenticating to Alertmanager.

@@ -718,7 +724,7 @@ Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration]
[options="header"]
|===
| Property | Type | Description
|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures how the Thanos Ruler component communicates with additional Alertmanager instances. The Cluster Monitoring Operator reads the cluster-wide proxy settings and configures the appropriate proxy URL for the Alertmanager endpoints. All Alertmanager endpoints in this group are expected to use the same proxy URL. Endpoints that bypass the cluster proxy should be placed in a separate group. The default value is `nil`.
|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures how the Thanos Ruler component communicates with additional Alertmanager instances. The default value is `nil`.

|evaluationInterval|string|Configures the default interval between Prometheus rule evaluations in case the `PrometheusRule` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). It applies to `PrometheusRule` resources without the `openshift.io/prometheus-rule-evaluation-scope=\"leaf-prometheus\"` label. The default value is `15s`.

@@ -49,7 +49,7 @@ Alertmanager does not send notifications by default. It is strongly recommended
* xref:../key-concepts/key-concepts.adoc#sending-notifications-to-external-systems_key-concepts[Sending notifications to external systems]
* link:https://www.pagerduty.com/[PagerDuty website]
* link:https://www.pagerduty.com/docs/guides/prometheus-integration-guide/[Prometheus Integration Guide (PagerDuty documentation)]
* xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#support-version-matrix-for-monitoring-components_maintenance-and-support-for-monitoring[Support version matrix for monitoring components]
* xref:../release-notes/monitoring-release-notes.adoc#support-version-matrix-for-monitoring-components_monitoring-release-notes[Support version matrix for monitoring components]
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-alert-routing-for-user-defined-projects_preparing-to-configure-the-monitoring-stack-uwm[Enabling alert routing for user-defined projects]

include::modules/monitoring-configuring-alert-routing-default-platform-alerts.adoc[leveloffset=+2]
@@ -19,26 +19,18 @@ include::modules/monitoring-adding-a-secret-to-the-alertmanager-configuration.ad
//Attaching additional labels to your time series and alerts
include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1,tags=**;!CPM;UWM]

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
[role="_additional-resources"]
.Additional resources

* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_preparing-to-configure-the-monitoring-stack-uwm[Enabling monitoring for user-defined projects]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

[id="configuring-alert-notifications_{context}"]
== Configuring alert notifications

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
In {product-title}, an administrator can enable alert routing for user-defined projects with one of the following methods:

* Use the default platform Alertmanager instance.
* Use a separate Alertmanager instance only for user-defined projects.
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

ifdef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
In {product-title}, the `dedicated-admin` user can enable alert routing for user-defined projects by using a separate Alertmanager instance for user-defined projects.
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

Developers and other users with the `alert-routing-edit` cluster role can configure custom alert notifications for their user-defined projects by configuring alert receivers.

@@ -58,7 +50,7 @@ Review the following limitations of alert routing for user-defined projects:
* xref:../key-concepts/key-concepts.adoc#sending-notifications-to-external-systems_key-concepts[Sending notifications to external systems]
* link:https://www.pagerduty.com/[PagerDuty website]
* link:https://www.pagerduty.com/docs/guides/prometheus-integration-guide/[Prometheus Integration Guide (PagerDuty documentation)]
* xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#support-version-matrix-for-monitoring-components_maintenance-and-support-for-monitoring[Support version matrix for monitoring components]
* xref:../release-notes/monitoring-release-notes.adoc#support-version-matrix-for-monitoring-components_monitoring-release-notes[Support version matrix for monitoring components]
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-alert-routing-for-user-defined-projects_preparing-to-configure-the-monitoring-stack-uwm[Enabling alert routing for user-defined projects]

include::modules/monitoring-configuring-alert-routing-for-user-defined-projects.adoc[leveloffset=+2]
@@ -20,9 +20,8 @@ include::modules/monitoring-configuring-remote-write-storage.adoc[leveloffset=+1

[role="_additional-resources"]
.Additional resources
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/monitoring_apis/index#spec-remotewrite-writerelabelconfigs[`writeRelabelConfigs`]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config[`relabel_config` (Prometheus documentation)]

include::modules/monitoring-supported-remote-write-authentication-settings.adoc[leveloffset=+2]
@@ -33,9 +32,8 @@ include::modules/monitoring-example-remote-write-queue-configuration.adoc[levelo

[role="_additional-resources"]
.Additional resources
ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]

* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/monitoring_apis/index#spec-remotewrite-2[Prometheus REST API reference for remote write]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* link:https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage[Remote write compatible endpoints (Prometheus documentation)]
* link:https://prometheus.io/docs/practices/remote_write/#remote-write-tuning[Remote write tuning (Prometheus documentation)]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/nodes/index#nodes-pods-secrets-about_nodes-pods-secrets[Understanding secrets]
@@ -64,9 +62,7 @@ include::modules/monitoring-example-service-endpoint-authentication-settings.ado
[role="_additional-resources"]
.Additional resources

ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* xref:../configuring-user-workload-monitoring/preparing-to-configure-the-monitoring-stack-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_preparing-to-configure-the-monitoring-stack-uwm[Enabling monitoring for user-defined projects]
* link:https://access.redhat.com/articles/6675491[Scrape Prometheus metrics using TLS in ServiceMonitor configuration] (Red{nbsp}Hat Customer Portal article)
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/monitoring_apis/index#podmonitor-monitoring-coreos-com-v1[PodMonitor API]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html-single/monitoring_apis/index#servicemonitor-monitoring-coreos-com-v1[ServiceMonitor API]
endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[]
* link:https://access.redhat.com/articles/6675491[Scrape Prometheus metrics using TLS in ServiceMonitor configuration (Red{nbsp}Hat Customer Portal)]