
Add support for Rocky Linux 10.1#797

Open
pranavcracker wants to merge 3 commits into main from feat/rocky_10.1

Conversation

Collaborator

@pranavcracker pranavcracker commented Mar 18, 2026

Description

  • This PR adds support for Rocky Linux 10.1 in Kubemarine and introduces improved handling of kernel upgrades during the system preparation phase.

Background

While validating Kubemarine on Rocky Linux 10.1, an issue was observed during the prepare.system.modprobe stage:

  • Kernel packages are upgraded during prepare.package_manager.manage_packages
  • The system continues running the old kernel (no reboot yet)
  • Required kernel modules (e.g., br_netfilter) are unavailable
  • This leads to failures in modprobe

Solution

  • Add support for Rocky Linux 10.1
  • Kernel Upgrade Detection
  • Early Reboot Mechanism
  • Avoid Duplicate Reboots

Changes Introduced

1. Rocky Linux 10.1 Support

  • Added support for rhel10 OS family where applicable
  • Ensured compatibility with kernel 6.x and updated package behavior

2. Kernel Upgrade Detection

  • Detects mismatch between:
    • Running kernel (uname -r)
    • Latest installed kernel
  • Implemented for:
    • RHEL-based systems (rpm -q kernel-core)
    • Debian-based systems (/boot/vmlinuz-*)

3. Early Reboot Mechanism

  • If a kernel upgrade is detected:
    • Performs an immediate reboot of the affected nodes
    • Uses reboot_group to scope the reboot to those nodes
  • Ensures subsequent steps (e.g., modprobe, sysctl) run on the updated kernel

4. Avoid Duplicate Reboots

  • Introduced cluster.context["early_reboot_done"]
  • Prevents redundant reboot at cumulative stage
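One way the early reboot and the `early_reboot_done` flag could interact is sketched below. This is an assumption-laden illustration rather than the PR's implementation: `early_reboot_if_needed` and `cumulative_reboot` are hypothetical names, `context` stands in for `cluster.context` as a plain dict, and `reboot` is any callable that reboots a list of hosts. The real cumulative stage may apply further conditions.

```python
from typing import Callable, Dict, List


def early_reboot_if_needed(
    context: Dict[str, object],
    affected_hosts: List[str],
    reboot: Callable[[List[str]], None],
) -> bool:
    """Reboot only the hosts with a pending kernel upgrade and record that it happened."""
    if not affected_hosts:
        return False
    reboot(affected_hosts)
    # Flag name taken from the PR description; consumed by the later stage.
    context["early_reboot_done"] = True
    return True


def cumulative_reboot(
    context: Dict[str, object],
    scheduled_hosts: List[str],
    reboot: Callable[[List[str]], None],
) -> bool:
    """Skip the later reboot when the early one has already been performed."""
    if context.get("early_reboot_done") or not scheduled_hosts:
        return False
    reboot(scheduled_hosts)
    return True
```

Keeping the flag in the shared context means the later stage needs no knowledge of why the early reboot happened, only that it did.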

How to apply

Test Cases

TestCase 1
Steps:

  1. Install cluster with Rocky Linux 10.1 OS

Results:

Before | After
Failure | Success

TestCase 2: Kernel upgrade during installation (Rocky Linux 10.1)
Steps:

  1. Prepare a fresh VM with Rocky Linux 10.1
  2. Ensure base image contains older running kernel
  3. Run Kubemarine installation
  4. During prepare.package_manager.manage_packages, allow kernel packages to upgrade
  5. Observe behavior before prepare.system.modprobe

Results:

Before | After
modprobe failure due to missing kernel modules | Early reboot triggered and modprobe succeeds

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • There are no breaking changes, or a migration patch is provided
  • Integration CI passed
  • Unit tests. If yes, list the new/changed tests with a brief description
  • There are no merge conflicts

Unit tests

Indicate new or changed unit tests and what they do, if any.

@pranavcracker pranavcracker marked this pull request as ready for review March 26, 2026 11:55
manage_custom_packages
])

affected_hosts = system.detect_kernel_upgrade(group)
Collaborator


Could you please help us understand the following:

  • Exactly which package causes the kernel upgrade? Could you share an install log that shows the kernel being upgraded?
  • From which version to which version is the kernel upgraded?

Collaborator Author

@pranavcracker pranavcracker Mar 27, 2026


During installation of the mandatory packages, a newer kernel is installed, as shown in the log below:

*** TASK prepare.package_manager.manage_packages ***
 Running kubemarine.procedures.install.manage_mandatory_packages: 
 Group 'balancer' is requested for usage, but this group does not exist.
 Group 'balancer' is requested for usage, but this group does not exist.
 Group 'balancer' is requested for usage, but this group does not exist.
 Group 'balancer' is requested for usage, but this group does not exist.
 [install.manage_mandatory_packages] Installing ['conntrack-tools', 'iptables-nft', 'openssl', 'curl', 'policycoreutils-python-utils', 'kmod'] on 'k8s-prpa-aio'
 [executor.queue] Performing sudo ('yum install -y -d1 --color=never conntrack-tools iptables-nft openssl curl policycoreutils-python-utils kmod; PACKAGES=$(rpm -q conntrack-tools iptables-nft openssl curl policycoreutils-python-utils kmod); if [ $? != 0 ]; then echo "$PACKAGES" | grep \'is not installed\'; echo "Failed to check version for some packages. Make sure packages are not already installed with higher versions. Also, make sure user-defined packages have rpm-compatible names. "; exit 1; fi ',) on nodes ['10.102.0.126'] with options: {'warn': False, 'hide': True, 'pty': True, 'env': None, 'timeout': None}
 [executor._prepare_merged_action] Executing sudo ['yum install -y -d1 --color=never conntrack-tools iptables-nft openssl curl policycoreutils-python-utils kmod; PACKAGES=$(rpm -q conntrack-tools iptables-nft openssl curl policycoreutils-python-utils kmod); if [ $? != 0 ]; then echo "$PACKAGES" | grep \'is not installed\'; echo "Failed to check version for some packages. Make sure packages are not already installed with higher versions. Also, make sure user-defined packages have rpm-compatible names. "; exit 1; fi '] on host 10.102.0.126 with options: {'warn': False, 'hide': True, 'pty': True, 'env': None, 'timeout': 2700}
10.102.0.126: code=0
	=== stdout ===
	Last metadata expiration check: 0:00:02 ago on Thu 26 Mar 2026 11:05:04 AM UTC.
	Package openssl-1:3.5.1-3.el10.x86_64 is already installed.
	Package curl-8.12.1-2.el10.x86_64 is already installed.
	Package policycoreutils-python-utils-3.9-1.el10.noarch is already installed.
	Package kmod-31-12.el10.x86_64 is already installed.
	Dependencies resolved.
	========================================================================================================================================================================================================
	 Package                                             Architecture                        Version                                               Repository                                          Size
	========================================================================================================================================================================================================
	Installing:
	 conntrack-tools                                     x86_64                              1.4.8-3.el10                                          updatesrocky10-yumsrv                              235 k
	 iptables-nft                                        x86_64                              1.8.11-12.el10_1                                      updatesrocky10-yumsrv                              189 k
	 kernel-core                                         x86_64                              6.12.0-124.40.1.el10_1                                rocky10-yumsrv                                      18 M
	Upgrading:
	 curl                                                x86_64                              8.12.1-2.el10_1.2                                     rocky10-yumsrv                                     217 k
	 iptables-libs                                       x86_64                              1.8.11-12.el10_1                                      rocky10-yumsrv                                     408 k
	 libcurl                                             x86_64                              8.12.1-2.el10_1.2                                     rocky10-yumsrv                                     368 k
	 openssl                                             x86_64                              1:3.5.1-7.el10_1                                      rocky10-yumsrv                                     1.2 M
	 openssl-fips-provider                               x86_64                              1:3.5.1-7.el10_1                                      rocky10-yumsrv                                     812 k
	 openssl-libs                                        x86_64                              1:3.5.1-7.el10_1                                      rocky10-yumsrv                                     2.2 M
	Installing dependencies:
	 kernel-modules                                      x86_64                              6.12.0-124.40.1.el10_1                                rocky10-yumsrv                                      41 M
	 kernel-modules-core                                 x86_64                              6.12.0-124.40.1.el10_1                                rocky10-yumsrv                                      30 M
	 kernel-modules-extra                                x86_64                              6.12.0-124.40.1.el10_1                                rocky10-yumsrv                                     2.8 M
	 libnetfilter_cthelper                               x86_64                              1.0.1-1.el10                                          updatesrocky10-yumsrv                               23 k
	 libnetfilter_cttimeout                              x86_64                              1.0.0-27.el10                                         updatesrocky10-yumsrv                               23 k
	 libnetfilter_queue                                  x86_64                              1.0.5-9.el10                                          updatesrocky10-yumsrv                               28 k
	 libnftnl                                            x86_64                              1.2.8-4.el10                                          rocky10-yumsrv                                      84 k
	
	Transaction Summary
	========================================================================================================================================================================================================
	Install  10 Packages
	Upgrade   6 Packages
	
	Total download size: 97 M
	Downloading Packages:
	--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
	Total                                                                                                                                                                    46 MB/s |  97 MB     00:02     
	Running transaction check
	Transaction check succeeded.
	Running transaction test
	Transaction test succeeded.
	Running transaction
	kdump: For kernel=/boot/vmlinuz-6.12.0-124.40.1.el10_1.x86_64, crashkernel=2G-64G:256M,64G-:512M now. Please reboot the system for the change to take effect. Note if you don't want kdump-utils to manage the crashkernel kernel parameter, please set auto_reset_crashkernel=no in /etc/kdump.conf.
	
	
	Upgraded:
	  curl-8.12.1-2.el10_1.2.x86_64          iptables-libs-1.8.11-12.el10_1.x86_64   libcurl-8.12.1-2.el10_1.2.x86_64   openssl-1:3.5.1-7.el10_1.x86_64   openssl-fips-provider-1:3.5.1-7.el10_1.x86_64  
	  openssl-libs-1:3.5.1-7.el10_1.x86_64  
	Installed:
	  conntrack-tools-1.4.8-3.el10.x86_64                 iptables-nft-1.8.11-12.el10_1.x86_64                 kernel-core-6.12.0-124.40.1.el10_1.x86_64   kernel-modules-6.12.0-124.40.1.el10_1.x86_64  
	  kernel-modules-core-6.12.0-124.40.1.el10_1.x86_64   kernel-modules-extra-6.12.0-124.40.1.el10_1.x86_64   libnetfilter_cthelper-1.0.1-1.el10.x86_64   libnetfilter_cttimeout-1.0.0-27.el10.x86_64   
	  libnetfilter_queue-1.0.5-9.el10.x86_64              libnftnl-1.2.8-4.el10.x86_64                        
	
	Complete!

With the proposed changes, this kernel upgrade is detected and a reboot is triggered once kubemarine.procedures.install.manage_custom_packages has run, as shown below:

[install.manage_custom_packages] Skipped - no packages configuration defined in config file
[executor.queue] Performing sudo ('uname -r',) on nodes ['10.102.0.126'] with options: {'warn': False, 'hide': True, 'pty': False, 'env': None, 'timeout': None}
[executor._prepare_merged_action] Executing sudo ['uname -r'] on host 10.102.0.126 with options: {'warn': False, 'hide': True, 'pty': False, 'env': None, 'timeout': 2700}
[executor.queue] Performing sudo ('rpm -q kernel-core',) on nodes ['10.102.0.126'] with options: {'warn': True, 'hide': True, 'pty': False, 'env': None, 'timeout': None}
[executor._prepare_merged_action] Executing sudo ['rpm -q kernel-core'] on host 10.102.0.126 with options: {'warn': True, 'hide': True, 'pty': False, 'env': None, 'timeout': 2700}
[system.detect_kernel_upgrade] 10.102.0.126: Kernel upgrade detected (running kernel: 6.12.0-124.8.1.el10_1.x86_64, latest installed: 6.12.0-124.40.1.el10_1.x86_64)
[install.system_prepare_package_manager_manage_packages] Rebooting node(s) 10.102.0.126 to apply updated kernel before proceeding                             with further system configuration.
[__init__.is_cluster_installed] Searching for already installed cluster...
[executor.queue] Performing sudo ('kubectl cluster-info',) on nodes ['10.102.0.126'] with options: {'warn': True, 'hide': True, 'pty': True, 'env': None, 'timeout': 15}
[executor._prepare_merged_action] Executing sudo ['kubectl cluster-info'] on host 10.102.0.126 with options: {'warn': True, 'hide': True, 'pty': True, 'env': None, 'timeout': 15}
[__init__.is_cluster_installed] Failed to detect any Kubernetes cluster
[executor.queue] Performing sudo ('last reboot',) on nodes ['10.102.0.126'] with options: {'warn': False, 'hide': True, 'pty': False, 'env': None, 'timeout': None}
[executor._prepare_merged_action] Executing sudo ['last reboot'] on host 10.102.0.126 with options: {'warn': False, 'hide': True, 'pty': False, 'env': None, 'timeout': 2700}
[executor.queue] Performing sudo ('systemctl stop sshd || sudo systemctl stop ssh ; sudo reboot 2>/dev/null >/dev/null',) on nodes ['10.102.0.126'] with options: {'warn': True, 'hide': True, 'pty': False, 'env': None, 'timeout': None}
[executor._prepare_merged_action] Executing sudo ['systemctl stop sshd || sudo systemctl stop ssh ; sudo reboot 2>/dev/null >/dev/null'] on host 10.102.0.126 with options: {'warn': True, 'hide': True, 'pty': False, 'env': None, 'timeout': 2700}
[system.perform_group_reboot] Waiting for boot up...
[system.perform_group_reboot] Initial boot history:
[system.perform_group_reboot] 10.102.0.126: code=0
[system.perform_group_reboot] 	=== stdout ===
[system.perform_group_reboot] 	reboot   system boot  6.12.0-124.8.1.e Thu Mar 26 11:03   still running
[system.perform_group_reboot] 	
[system.perform_group_reboot] 	wtmp begins Thu Mar 26 11:03:58 2026
[system.perform_group_reboot] 	
[executor._wait_for_boot_with_executor] Trying to connect to nodes, timeout is 600 seconds...
[executor._disconnect] Disconnected session with 10.102.0.126
....
[executor._wait_for_boot_with_executor] Attempting to connect to nodes...
[executor._prepare_merged_action] Executing run ["sudo -S -p '[sudo] password: ' last reboot"] on host 10.102.0.126 with options: {'hide': True, 'pty': True, 'watchers': [<kubemarine.core.executor.RawExecutor._do_nopasswd.<locals>.NoPasswdResponder object at 0x78b8c14903b0>], 'timeout': 2700}
[executor._wait_for_boot_with_executor] All nodes are online now
