Skip to content

Comments

Add Tuned to configure NIC queues#108

Open
MarSik wants to merge 1 commit intoopenshift-kni:mainfrom
MarSik:nic-queues
Open

Add Tuned to configure NIC queues#108
MarSik wants to merge 1 commit intoopenshift-kni:mainfrom
MarSik:nic-queues

Conversation

@MarSik
Copy link
Member

@MarSik MarSik commented Feb 17, 2025

This file configures the number of NIC queues for kernel level networking.

@openshift-ci openshift-ci bot requested review from fedepaol and ffromani February 17, 2025 19:31
@openshift-ci
Copy link

openshift-ci bot commented Feb 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MarSik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 17, 2025
- data: |
[main]
summary=Configuration to set to 32 the nb of interrupt per NIC
include=openshift-node-performance-$name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the reference configuration we also include the PerformanceProfile. We should explicitly align the names here along with a note (comment) that if you change one you need to change the other.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use lookupCR here to match the perf profile name, however it would not work when multiple profiles (and multiple MCPs) show up..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this block is from the reference CR, not the cluster-compare template, so it may be good to hard-code some kind of reasonable default, and document when it needs to be changed.

But this is a general problem we seem to keep hitting. Is it worth opening an RFE against NTO that would allow some kind of tuned-template way to say "Include the right tuned performance profile based on this node's MCP pool name"?

type=net
# Update the NIC name pattern
devices_udev_regex=^INTERFACE=ens1f(0|1)
channels=combined 32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During discussion it was floated that 16 may be sufficient. Do we have data to back up either choice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, this is a conservative recommendation. We can start with 16 as that is what we recommend as a starter point for schedulable control plane as well.

@MarSik MarSik force-pushed the nic-queues branch 2 times, most recently from 07336d1 to 370879a Compare February 18, 2025 08:28
@MarSik
Copy link
Member Author

MarSik commented Feb 18, 2025

@imiller0 I need a bit of a help with

./compare.sh reference-crs reference-crs-kube-compare reference-crs-kube-compare/compare_ignore
The following files exist in source-crs only, but not found in reference:
reference-crs/required/networking/tuned-queues.yaml

@lack
Copy link
Member

lack commented Feb 19, 2025

You need to add another copy of the file in telco-core/configuration/reference-crs-kube-compare, and you'll need to decide which (if any) fields can be altered by an end-user and still be compliant with the RDS. Any of those user-settable values should be templated in the kube-compare reference copy.

@MarSik
Copy link
Member Author

MarSik commented Feb 20, 2025

@imiller0 @lack Like this? Can you give it lgtm so I am not approving my own PR?

Copy link
Member

@lack lack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some notes on how to describe the ways users will likely customize the values in the example object.

You will also need to add the tuned-queues.yaml to the metadata.yaml under reference-crs-kube-compare, and if you do use the capturegroup mechanism you will also need to declare the field that has the capturegroups to use that enhanced algorthm:

...
          - path: required/networking/tuned-queues.yaml
            config:
              perField:
                - pathToKey: spec.profile.0.data
                  inlineDiffFunc: capturegroups
...

[main]
summary=Configuration to set to 32 the nb of interrupt per NIC
# Make sure the include name matches the PerformanceProfile
include=openshift-node-performance-{{ .metadata.name }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will force the include to be include=openshift-node-performance-configuration-nic, which I'm not sure is right - It actually needs to be the name of the PerformanceProfile, not the name of this TuneD object, right?

If so, it's probably better to just use a capturegroup regex here (and also devices_udev_regex); something like this:

Suggested change
include=openshift-node-performance-{{ .metadata.name }}
include=openshift-node-performance-(?<performanceProfileName>[[:^space:]]+)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarSik can we accept Jim's changes and re-validate?

[net]
type=net
# Update the NIC name pattern
devices_udev_regex=^INTERFACE=ens1f(0|1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, if users are going to customize the regex part of the string, you should use something like this:

Suggested change
devices_udev_regex=^INTERFACE=ens1f(0|1)
devices_udev_regex=^INTERFACE=(?<ifaceRegex>[^\r\n]+)

Note: Depends on openshift/release#62113 to merge first so CI will pass (fixes a bug in the cluster-compare version used so it doesn't get confused by the trailing ) in the example CR)

type=net
# Update the NIC name pattern
devices_udev_regex=^INTERFACE=ens1f(0|1)
channels=combined 32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users are allowed to change this number, consider:

Suggested change
channels=combined 32
channels=combined (?<channels>[0-9]+)

Comment on lines +20 to +24
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: "worker"
priority: 19
profile: configuration-nic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we want to template the MCP name here and it's inside a list, there's a little work that needs doing, something like this:

Suggested change
recommend:
- machineConfigLabels:
machineconfiguration.openshift.io/role: "worker"
priority: 19
profile: configuration-nic
recommend:
{{- range .spec.recommend }}
- machineConfigLabels:
machineconfiguration.openshift.io/role: {{ .machineConfigLabels."machineconfiguration.openshift.io/role" }}
priority: 19
profile: configuration-nic
{{- end }}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a list, however only the profile created in the Tuned should be referenced here. So the name of the profile is static as well as the number of items.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this always be "worker", or could this need to match different MCP pools?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can match various MCP pools indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants