-
Notifications
You must be signed in to change notification settings - Fork 521
OCPEDGE-2215: Updated TNF EP to address some drift from original requirements. #1887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
jaypoulz
commented
Nov 6, 2025
- Updated warning against baremetal platform including BMC block
- Updated test section to note that we'll skip requirements criteria if no requirements are provided
- Added a new block that explains the PacemakerCluster API, the status collector, and the health check controller
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
enhancements/two-node-fencing/tnf.md
Outdated
| @@ -317,6 +317,9 @@ In the future, it may be possible to lower the privilege level of the TNF contro | |||
| to run without root privileges. We are working with the RHEL-HA team to identify the specific set of commands that we use to narrow the scope of progress towards this goal. This remains a long-term | |||
| objective for both teams. | |||
|
|
|||
| ##### The PacemakerCluster Health Check | |||
| See [Status Propogation with PacemakerCluster Health Check](#status-propogation-with-pacemakercluster-health-check) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: propagation (somewhere else in the doc too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that word is hard. :)
enhancements/two-node-fencing/tnf.md
Outdated
| need to do so. An example of this would be if the cluster administrator rotated their BMC password without updating the fencing secret in the cluster. This would be caught by the pacemaker monitoring | ||
| checks, but something in the cluster would need to aware of that to propogate that information to the user directly. | ||
|
|
||
| To acheive this, we plan on using a pair of new controllers in CEO. The first is a status collector, which syncs every 30 seconds to gather that current state of pacemaker via `sudo pcs status xml`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, some spelling:
To achieve this, we plan on using two new controllers in CEO. The first is a status collector which syncs every 30 seconds to gather the current state of pacemaker via sudo pcs status xml
enhancements/two-node-fencing/tnf.md
Outdated
| @@ -708,6 +665,35 @@ This collection of diagrams collects a series of scenarios where both nodes fail | |||
|
|
|||
|  | |||
|
|
|||
| #### Status Propogation with PacemakerCluster Health Check | |||
| An important goal of Two Node OpenShift with Fencing is ensuring that we always warn the user before a disaster event can occur that would require manually intervention if we have the information we | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is really hard to parse. Maybe something like:
An important goal of Two Node OpenShift with Fencing is ensuring an early warning for potentially disastrous events (that would requiring manual intervention). To provide this warning, the information needs to be available within the cluster.
enhancements/two-node-fencing/tnf.md
Outdated
| - A list of nodes currently registered in pacemaker | ||
| - A list of recent events recorded by the pacemaker resources | ||
| - A list of recent fencing events performed by pacemaker | ||
| - A dump of the full pacemaker XML. This is kept so that in the case that the XML API is changed in a way that breaks the other fields, we can quickly deliver a fix for the breakage that parses the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wording: This is kept to be able to deliver a quick fix to the XML parsing code if the XML API is changed in such a way that other fields break
enhancements/two-node-fencing/tnf.md
Outdated
| - A dump of the full pacemaker XML. This is kept so that in the case that the XML API is changed in a way that breaks the other fields, we can quickly deliver a fix for the breakage that parses the | ||
| XML directly. | ||
|
|
||
| Once the PacemakerCluster object is populated is it handled on the CEO side by a new pacemaker healthcheck controller. This controller evaluates the status of the report and creates events in CEO for the following things: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it -> it is
enhancements/two-node-fencing/tnf.md
Outdated
| - Not all resources and nodes are in their expected / healthy state | ||
| - The pacemakercluster status object is stale (hasn't been updated in the last 5 minutes) | ||
|
|
||
| Overall these health checks are almost entirely informational. The only time they are used outside of this event creation or operator status is to ensure that the nodes recorded in pacemaker match the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove "this" in "this event"
enhancements/two-node-fencing/tnf.md
Outdated
| - A list of nodes currently registered in pacemaker | ||
| - A list of recent events recorded by the pacemaker resources | ||
| - A list of recent fencing events performed by pacemaker | ||
| - A dump of the full pacemaker XML. This is kept so that in the case that the XML API is changed in a way that breaks the other fields, we can quickly deliver a fix for the breakage that parses the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't feel like the right place to put this. This to me at least is not really a field a user would care about. Who is this API for?
enhancements/two-node-fencing/tnf.md
Outdated
| - A dump of the full pacemaker XML. This is kept so that in the case that the XML API is changed in a way that breaks the other fields, we can quickly deliver a fix for the breakage that parses the | ||
| XML directly. | ||
|
|
||
| Once the PacemakerCluster object is populated is it handled on the CEO side by a new pacemaker healthcheck controller. This controller evaluates the status of the report and creates events in CEO for the following things: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can the writer of PacemakerCluster not produce the events? Seems that component has all of the correct/relevant information to be able to write the events?
| - Warnings for fencing events that have happened on the cluster | ||
|
|
||
| More importantly, it also sets the CEO's status to degraded if one of the following conditions are true: | ||
| - Not all resources and nodes are in their expected / healthy state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct for CEO to go degraded? I thought I saw kubelet listed? Wouldn't some other component be responsible for alerting when a kubelet on a control plane node is down? Doesn't really feel like a CEO issue to report?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most other components that relying on multiple replicas will be degraded at the same time. The obvious one is API server. In fact, CEO already reports degraded when kubelet is down because it doesn't have all of the endpoints it thinks it's supposed to have (one per control-plane node).
The reason we include reporting the kubelet behavior in the pacemaker status is because pacemaker ensures that kubelet is started before etcd. That means, that for etcd to be healthy, kubelet must be healthy. We could ignore the state of the kubelet resource when reporting the state of pacemaker, but as I mentioned before, the etcd member controller is going to be reporting degraded anyway so it's just extra information that explains why pacemaker is unhealthy.
|
|
||
| More importantly, it also sets the CEO's status to degraded if one of the following conditions are true: | ||
| - Not all resources and nodes are in their expected / healthy state | ||
| - The pacemakercluster status object is stale (hasn't been updated in the last 5 minutes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs admin intervention in a fairly prompt manner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't know for sure. We can only give admins instructions if we know the state of pacemaker. If we haven't received a status, this means that CEO's status collector cronjob has stopped posting them or what's being posted is being rejected by the API.
In either case, the cluster could be in a state where the cluster could fail without recovering automatically. The goal is to raise this in a way where the cluster admin knows that something could be wrong.
- Updated warning against baremetal platform including BMC block - Updated test section to note that we'll skip requirements criteria if no requirements are provided - Added a new block that explains the PacemakerCluster API, the status collector, and the health check controller
|
@jaypoulz: This pull request references OCPEDGE-2215 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@jaypoulz: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| would be caught by the pacemaker monitoring checks, but something in the cluster needs to propagate that information to the user directly. | ||
|
|
||
| To acheive this, we plan on using two new controllers in CEO. The first is a status collector which syncs every 30 seconds to gather that current state of pacemaker via `sudo pcs status xml`. | ||
| This is parsed and populates a new status object called a `PacemakerCluster`, which is a singleton resource that created by CEO when the transition to an external etcd is completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some typos in thi sentece, which could flow better like so:
This is parsed and populates by a new status object called a PacemakerCluster, which is a singleton resource that created by CEO when the transition to an external etcd is completed.
| - A list of recent fencing events performed by pacemaker | ||
| - A dump of the full pacemaker XML. This is kept to be able to deliver a quick fix to the XML parsing code if the XML API is changed in such a way that other fields break. | ||
|
|
||
| Once the PacemakerCluster object is populated it is handled on the CEO side by a new pacemaker healthcheck controller. This controller evaluates the status of the report and creates events in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence seems a bit wordy. What about the following:
Once populated, the PacemakerCluster object is handled ...