shreeya-patel98

Update process (the CentOS base for this kernel is 5.14.0-570)

  • Kernel History Rebuild Process for all src.rpms hosted by RESF
  • Create sig-cloud-9/5.14.0-570.X.1.el9_6 branch
  • Check if any maintained code is included in the new el release.
  • Cherry-pick all code from previous branch into new branch (skipping unneeded code)
    • Fix conflicts as they arise
  • Build and Test

Removed Commits

None

Forward Port Process

shreeya@spatel-dev-bom:~/ciq/workspace/sig-cloud-9/kernel-src-tree-tools$ python3 rolling-release-update.py \
    --repo ../kernel-src-tree/ \
    --new-base-branch rocky9_6 \
    --old-rolling-branch sig-cloud-9/5.14.0-570.39.1.el9_6
[rolling release update] Rolling Product:  sig-cloud-9
[rolling release update] Checking out branch:  sig-cloud-9/5.14.0-570.39.1.el9_6
[rolling release update] Gathering all the RESF kernel Tags
b'1b9ea68b26cf (tag: resf_kernel-5.14.0-570.39.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.39.1.el9_6'
b'6ad42715f2bf (tag: resf_kernel-5.14.0-570.37.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.37.1.el9_6'
b'0564e55498d2 (tag: resf_kernel-5.14.0-570.32.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.32.1.el9_6'
b'e0a1a84bc26b (tag: resf_kernel-5.14.0-570.30.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.30.1.el9_6'
b'9fbeb8c24bbd (tag: resf_kernel-5.14.0-570.28.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.28.1.el9_6'
b'8cc6f289778f (tag: resf_kernel-5.14.0-570.26.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.26.1.el9_6'
b'cad0cbcb03be (tag: resf_kernel-5.14.0-570.25.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.25.1.el9_6'
b'4743a27158ca (tag: resf_kernel-5.14.0-570.24.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.24.1.el9_6'
b'08b6475feb07 (tag: resf_kernel-5.14.0-570.23.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.23.1.el9_6'
b'667004a38548 (tag: resf_kernel-5.14.0-570.22.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.22.1.el9_6'
b'9477e3364951 (tag: resf_kernel-5.14.0-570.21.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.21.1.el9_6'
b'b94108159618 (tag: resf_kernel-5.14.0-570.19.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.19.1.el9_6'
b'e8b954c95fef (tag: resf_kernel-5.14.0-570.18.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.18.1.el9_6'
b'838cd1e8d046 (tag: resf_kernel-5.14.0-570.17.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.17.1.el9_6'
b'171ceb527773 (tag: resf_kernel-5.14.0-570.16.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.16.1.el9_6'
b'18c0812a6563 (tag: resf_kernel-5.14.0-570.12.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.12.1.el9_6'
[rolling release update] Old Rolling Branch Tags:  [b'1b9ea68b26cf', b'6ad42715f2bf', b'0564e55498d2', b'e0a1a84bc26b', b'9fbeb8c24bbd', b'8cc6f289778f', b'cad0cbcb03be', b'4743a27158ca', b'08b6475feb07', b'667004a38548', b'9477e3364951', b'b94108159618', b'e8b954c95fef', b'838cd1e8d046', b'171ceb527773', b'18c0812a6563']
[rolling release update] Checking out branch:  rocky9_6
[rolling release update] Gathering all the RESF kernel Tags
b'4eb20e218dc1 (HEAD -> rocky9_6, tag: resf_kernel-5.14.0-570.42.2.el9_6, origin/rocky9_6_rebuild, origin/rocky9_6) Rebuild rocky9_6 with kernel-5.14.0-570.42.2.el9_6'
b'4d2fb3e9de8a (tag: resf_kernel-5.14.0-570.41.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.41.1.el9_6'
b'1b9ea68b26cf (tag: resf_kernel-5.14.0-570.39.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.39.1.el9_6'
b'6ad42715f2bf (tag: resf_kernel-5.14.0-570.37.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.37.1.el9_6'
b'0564e55498d2 (tag: resf_kernel-5.14.0-570.32.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.32.1.el9_6'
b'e0a1a84bc26b (tag: resf_kernel-5.14.0-570.30.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.30.1.el9_6'
b'9fbeb8c24bbd (tag: resf_kernel-5.14.0-570.28.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.28.1.el9_6'
b'8cc6f289778f (tag: resf_kernel-5.14.0-570.26.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.26.1.el9_6'
b'cad0cbcb03be (tag: resf_kernel-5.14.0-570.25.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.25.1.el9_6'
b'4743a27158ca (tag: resf_kernel-5.14.0-570.24.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.24.1.el9_6'
b'08b6475feb07 (tag: resf_kernel-5.14.0-570.23.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.23.1.el9_6'
b'667004a38548 (tag: resf_kernel-5.14.0-570.22.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.22.1.el9_6'
b'9477e3364951 (tag: resf_kernel-5.14.0-570.21.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.21.1.el9_6'
b'b94108159618 (tag: resf_kernel-5.14.0-570.19.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.19.1.el9_6'
b'e8b954c95fef (tag: resf_kernel-5.14.0-570.18.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.18.1.el9_6'
b'838cd1e8d046 (tag: resf_kernel-5.14.0-570.17.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.17.1.el9_6'
b'171ceb527773 (tag: resf_kernel-5.14.0-570.16.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.16.1.el9_6'
b'18c0812a6563 (tag: resf_kernel-5.14.0-570.12.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.12.1.el9_6'
[rolling release update] New Base Branch Tags:  [b'4eb20e218dc1', b'4d2fb3e9de8a', b'1b9ea68b26cf', b'6ad42715f2bf', b'0564e55498d2', b'e0a1a84bc26b', b'9fbeb8c24bbd', b'8cc6f289778f', b'cad0cbcb03be', b'4743a27158ca', b'08b6475feb07', b'667004a38548', b'9477e3364951', b'b94108159618', b'e8b954c95fef', b'838cd1e8d046', b'171ceb527773', b'18c0812a6563']
[rolling release update] Latest RESF tag sha:  b'1b9ea68b26cf'
"1b9ea68b26cff8acb6b0c3ca118cb6ca6fe35d6e Rebuild rocky9_6 with kernel-5.14.0-570.39.1.el9_6"
[rolling release update] Checking out old rolling branch:  sig-cloud-9/5.14.0-570.39.1.el9_6
[rolling release update] Finding the CIQ Kernel and Associated Upstream commits between the last resf tag and HEAD
[rolling release update] Last RESF tag sha:  b'1b9ea68b26cf'
[rolling release update] Total Commit in old branch:  21
{ "CIQ COMMMIT" : "UPSTREAM COMMMIT" }
Printing first 5 and last 5 commits
{
  "349825fcca33d889679db40e53a61621a19992ae": "fbe346ce9d626680a4dd0f079e17c7b5dd32ffad",
  "314c9a34f8dfb110f0c7a3f0957bc907b7a1b172": "7768c5f417336fa58dbfef9bb7ecd7eeec6d8886",
  "0a97725f214d7c6644b0affcef53ba66542bcabd": "c09ef59e17c6921c577d54bc8da4331b955d01a7",
  "17025007533176bfa9bc9b7932cf6038cacdc50b": "fa37a8849634db2dd3545116873da8cf4b1e67c6",
  "e5b20bdb3ace00d955bd2795997184b88625d4e6": "2fc8a346625eb1abfe202062c7e6a13d76cde5ea"
}
{
  "12a929d0f463d1ac52efde69381c3a3ad294d457": "9e517a8e9d9a303bf9bde35e5c5374795544c152",
  "b65dbea3f3b465d68a3c1ef1022097c4a112cfd1": "4a3b99bc04e501b816db78f70064e26a01257910",
  "7ac254549e9371b99413b6202e433e48acab7def": "a9c0b33ef2306327dd2db02c6274107065ff9307",
  "0b471068cba4bcec6160e168c681e41a1d8de601": "290e5d3c49f687c1567bde634dc33d57b0674919",
  "c8848a51fde7c5d7914260d1fbc5e81c8825fb1b": ""
}
[rolling release update] Checking out new base branch:  rocky9_6
[rolling release update] Finding the kernel version for the new rolling release
b'4eb20e218dc1 (HEAD -> rocky9_6, tag: resf_kernel-5.14.0-570.42.2.el9_6, origin/rocky9_6_rebuild, origin/rocky9_6) Rebuild rocky9_6 with kernel-5.14.0-570.42.2.el9_6'
<re.Match object; span=(0, 70), match=b'4eb20e218dc1 (HEAD -> rocky9_6, tag: resf_kernel>
[rolling release update} New Branch to create  sig-cloud-9/5.14.0-570.42.2.el9_6
[rolling release update] Check if branch Exists:  sig-cloud-9/5.14.0-570.42.2.el9_6
Branch sig-cloud-9/5.14.0-570.42.2.el9_6 does not exists creating
[rolling release update] Creating new branch for PR:  shreeya_sig-cloud-9/5.14.0-570.42.2.el9_6
[rolling release update] Creating Map of all new commits from last rolling release fork
[rolling release update] Total Commit in new branch:  33
{ "CIQ COMMMIT" : "UPSTREAM COMMMIT" }
Printing first 5 and last 5 commits
{
  "4eb20e218dc14ccebeba83cebb0c1e99ae9558d6": "",
  "dd29e40cfb67c62b0712db4c95c85a0b7406404b": "f90fff1e152dedf52b932240ebbd670d83330eca",
  "41f852672141afc17a72f2f6028cad4bb3e08ae3": "67dfc11982f7e3c37f0977e74671da2391b29181",
  "08cd329a6ea02002ecf3cfe42f2ec820a1af4a9c": "6aa989ab2bd0d37540c812b4270006ff794662e7",
  "304da2d7727fcc997fd927d285d382f2dbaeeb98": "f6bfc9afc7510cb5e6fbe0a17c507917b0120280"
}
{
  "7a72e84c4d36e1cdc7a85a36b4ef22a48b87ee53": "7b90df78184de90fe5afcc45393c8ad83b5b18a1",
  "1a168d89311513a7aabd7a84c5984945b54f10e2": "15589bda46830695a3261518bb7627afac61f519",
  "d7b9b6684636453b125f4f507e62682c67ac4a98": "bcd6e41d983621954dfc3f1f64249a55838b3e6a",
  "784526d2187b2be7e7fdd79a5e19384e88ac0eeb": "021ba7f1babd029e714d13a6bf2571b08af96d0f",
  "dd23207689a2bc1a63fe81dc92ad46c1d6daebc4": "b2beb5bb2cd90d7939e470ed4da468683f41baa3"
}
[rolling release update] Checking if any of the commits from the old rolling release are already present in the new base branch
[rolling release update] Removing commits from the new branch
[rolling release update] Applying the remaining commits to the new branch
Applying commit  "c8848a51fde7c5d7914260d1fbc5e81c8825fb1b selftests/mm temporary fix of hmm infinite loop"
Applying commit  "0b471068cba4bcec6160e168c681e41a1d8de601 net: mana: Add support for Multi Vports on Bare metal"
Applying commit  "7ac254549e9371b99413b6202e433e48acab7def tools: hv: Enable debug logs for hv_kvp_daemon"
Applying commit  "b65dbea3f3b465d68a3c1ef1022097c4a112cfd1 RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page"
Applying commit  "12a929d0f463d1ac52efde69381c3a3ad294d457 RDMA/mana_ib: use the correct page table index based on hardware page size"
Applying commit  "41a9a9c21b0f458305b54d2586a2e836d9a30788 scsi: storvsc: Increase the timeouts to storvsc_timeout"
Applying commit  "4ec8399c6e8d6349834b6c647d1d2ba243f47261 Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges"
Applying commit  "35695361f8b4f847dc1cc7a28fbb44ccec1f28e6 hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages"
Applying commit  "44b3561e08ba773b343636d8275907a9283095a1 hv_netvsc: Preserve contiguous PFN grouping in the page buffer array"
Applying commit  "eca0b96fc4223b14732d186bb3a88c3e4bd4a49c hv_netvsc: Remove rmsg_pgcnt"
Applying commit  "53e1a41e5ad21c27fabda594aeea310ea75f1d69 Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()"
Applying commit  "9b911fa17c507eb948f70bc7da20a2893d617c2d hv_netvsc: Use VF's tso_max_size value when data path is VF"
Applying commit  "02431480bd695bcb58e7c6c438ab3e3a0a5adee8 net: mana: Allow tso_max_size to go up-to GSO_MAX_SIZE"
Applying commit  "29b7ce11a575cbc82d7cea1ef8a0684a0a1b297c net: mana: Add debug logs in MANA network driver"
Applying commit  "3f0545b63fd50b3c89ccae98d5e70216ea7e3b2d net: mana: Change the function signature of mana_get_primary_netdev_rcu"
Applying commit  "cb78015c75a9a9496a9245e2ef4bc1b06f3cbf9f RDMA/mana_ib: Handle net event for pointing to the current netdev"
Applying commit  "e5b20bdb3ace00d955bd2795997184b88625d4e6 net: mana: Support holes in device list reply msg"
Applying commit  "17025007533176bfa9bc9b7932cf6038cacdc50b net: mana: Switch to page pool for jumbo frames"
Applying commit  "0a97725f214d7c6644b0affcef53ba66542bcabd net: mana: Expose additional hardware counters for drop and TC via ethtool."
Applying commit  "314c9a34f8dfb110f0c7a3f0957bc907b7a1b172 net: mana: Add handler for hardware servicing events"
Applying commit  "349825fcca33d889679db40e53a61621a19992ae net: mana: Handle Reset Request from MANA NIC"
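
The tag-matching steps in the log above can be sketched roughly as follows. This is an illustrative reconstruction from the log format only, not the actual code of rolling-release-update.py; the function names and the regular expression are assumptions.

```python
import re

# Assumed pattern for "<sha> (... tag: resf_kernel-<version> ...) <subject>" lines,
# derived from the log output above, not taken from the tool itself.
TAG_LINE = re.compile(
    r"^(?P<sha>[0-9a-f]{12}) \(.*?tag: resf_kernel-(?P<version>[^,)\s]+)"
)


def parse_tag_lines(lines):
    """Return (sha, version) pairs in input order (newest first, as in the log)."""
    pairs = []
    for line in lines:
        m = TAG_LINE.match(line)
        if m:
            pairs.append((m.group("sha"), m.group("version")))
    return pairs


def latest_shared_tag(old_tags, new_tags):
    """Newest tag of the old rolling branch that also exists on the new base."""
    new_shas = {sha for sha, _ in new_tags}
    for sha, version in old_tags:
        if sha in new_shas:
            return sha, version
    return None


def new_rolling_branch(product, new_tags):
    """Name the new rolling branch after the newest tag on the new base branch."""
    return f"{product}/{new_tags[0][1]}"


# Two lines per branch, copied from the log above.
old_tags = parse_tag_lines([
    "1b9ea68b26cf (tag: resf_kernel-5.14.0-570.39.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.39.1.el9_6",
    "6ad42715f2bf (tag: resf_kernel-5.14.0-570.37.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.37.1.el9_6",
])
new_tags = parse_tag_lines([
    "4eb20e218dc1 (HEAD -> rocky9_6, tag: resf_kernel-5.14.0-570.42.2.el9_6, origin/rocky9_6_rebuild, origin/rocky9_6) Rebuild rocky9_6 with kernel-5.14.0-570.42.2.el9_6",
    "4d2fb3e9de8a (tag: resf_kernel-5.14.0-570.41.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.41.1.el9_6",
    "1b9ea68b26cf (tag: resf_kernel-5.14.0-570.39.1.el9_6) Rebuild rocky9_6 with kernel-5.14.0-570.39.1.el9_6",
])

shared = latest_shared_tag(old_tags, new_tags)
branch = new_rolling_branch("sig-cloud-9", new_tags)
print(shared, branch)
```

With the sample lines, the shared tag is 1b9ea68b26cf and the branch name is sig-cloud-9/5.14.0-570.42.2.el9_6, matching the "Latest RESF tag sha" and "New Branch to create" lines in the log.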

BUILD

/mnt/scratch/workspace/fips-9-compliant/kernel-src-tree
Skipping make mrproper
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-shreeya_fips-9-compliant_5.14.0-284.30.1-88ef6b42321d"
Making olddefconfig
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
  UPD     include/config/kernel.release
  DESCEND objtool
  DESCEND bpf/resolve_btfids
  UPD     include/generated/utsrelease.h
  CALL    scripts/atomic/check-atomics.sh
warning: generated include/linux/atomic/atomic-instrumented.h has been modified.
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CC      init/version.o
  CC      arch/x86/crypto/aesni-intel_glue.o
  AR      init/built-in.a
  CC      kernel/sys.o
  CC      crypto/fips.o
  AR      arch/x86/crypto/built-in.a
  CC      crypto/algapi.o
  CC      security/integrity/ima/ima_init.o
  AR      arch/x86/built-in.a
  CC      net/ethtool/ioctl.o
--
  INSTALL /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+/kernel/drivers/hwmon/ads7828.ko
  STRIP   /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+/kernel/drivers/hwmon/ads7828.ko
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+/kernel/drivers/hwmon/ads7828.ko
  DEPMOD  /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+
[TIMER]{MODULES}: 10s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+ \
	arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 21s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+ and Index to 2
The default is /boot/loader/entries/c3004629c739468680b07075d6fee68a-5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+.conf with index 2 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+
The default is /boot/loader/entries/c3004629c739468680b07075d6fee68a-5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+.conf with index 2 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.42.2.el9_6-2016fc30f6e+
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 9s
[TIMER]{BUILD}: 1851s
[TIMER]{MODULES}: 10s
[TIMER]{INSTALL}: 21s
[TIMER]{TOTAL} 1896s
Rebooting in 10 seconds

kernel-build.log

KselfTests

shreeya@spatel-dev-bom ~/c/w/sig-cloud-9> grep -a ^ok kselftest-before.log | wc -l
349
shreeya@spatel-dev-bom ~/c/w/sig-cloud-9> grep -a ^ok kselftest-after.log | wc -l
349
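
Equal pass counts do not by themselves show that the same tests passed. A minimal sketch of a per-test comparison, assuming the logs contain TAP-style lines of the form "ok <number> <test name>" (the helper names and the sample log contents below are made up for illustration):

```python
import re

# Assumed TAP-style result line: "ok <number> <test name>".
OK_LINE = re.compile(rb"^ok \d+ (.+)$")


def passing_tests(log_bytes):
    """Return the set of test names reported as 'ok' in a log (read as bytes)."""
    names = set()
    for line in log_bytes.splitlines():
        m = OK_LINE.match(line)
        if m:
            names.add(m.group(1).strip())
    return names


def compare_runs(before, after):
    """Summarise which tests passed only before or only after the update."""
    b, a = passing_tests(before), passing_tests(after)
    return {
        "before": len(b),
        "after": len(a),
        "regressed": sorted(b - a),
        "newly_passing": sorted(a - b),
    }


# Invented sample data; real input would be the contents of the two log files,
# e.g. open("kselftest-before.log", "rb").read().
sample_before = (
    b"ok 1 selftests: mm: test_a\n"
    b"not ok 2 selftests: net: test_b\n"
    b"ok 3 selftests: bpf: test_c\n"
)
sample_after = (
    b"ok 1 selftests: mm: test_a\n"
    b"ok 2 selftests: net: test_b\n"
    b"not ok 3 selftests: bpf: test_c\n"
)
summary = compare_runs(sample_before, sample_after)
print(summary)
```

In the invented sample both runs report two passing tests, yet one test regressed and another started passing, which a plain line count would not reveal.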

kselftest-after.log
kselftest-before.log

PlaidCat and others added 21 commits September 23, 2025 11:39
jira SECO-170

In Rocky9 if you run ./run_vmtests.sh -t hmm it will fail and cause an
infinite loop on ASSERTs in FIXTURE_TEARDOWN()
This temporary fix is based on the discussion here
https://patchwork.kernel.org/project/linux-kselftest/patch/26017fe3-5ad7-6946-57db-e5ec48063ceb@suse.cz/#25046055

We will investigate further kselftest updates that will resolve the root
causes of this.

Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3208
feature net_mana
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit 290e5d3

To support Multi Vports on Bare metal, increase the device config response
version. And, skip the register HW vport, and register filter steps, when
the Bare metal hostmode is set.

	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1747671636-5810-1-git-send-email-haiyangz@microsoft.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit 290e5d3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
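
The commits in this PR share a small "key value" header (jira, feature, commit-author, commit, upstream-diff) ahead of the message body. A minimal sketch of reading that header, assuming the key list inferred from the messages in this PR is complete and that each field fits on one line (a multi-line upstream-diff would only have its first line captured); the function name is hypothetical:

```python
# Keys inferred from the commit messages in this PR; the list is an assumption.
HEADER_KEYS = ("jira", "feature", "commit-author", "commit", "upstream-diff")


def parse_commit_header(message):
    """Collect 'key value' header lines up to the first blank line."""
    header = {}
    for line in message.splitlines():
        if not line.strip():
            break  # the header ends at the first blank line
        key, _, value = line.partition(" ")
        if key in HEADER_KEYS:
            header[key] = value.strip()
    return header


# Header copied from the commit message above.
sample = (
    "jira LE-3208\n"
    "feature net_mana\n"
    "commit-author Haiyang Zhang <haiyangz@microsoft.com>\n"
    "commit 290e5d3\n"
    "\n"
    "To support Multi Vports on Bare metal, increase the device config response\n"
)
header = parse_commit_header(sample)
print(header)
```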
jira LE-3207
feature tools_hv
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit a9c0b33

Allow the KVP daemon to log the KVP updates triggered in the VM
with a new debug flag(-d).
When the daemon is started with this flag, it logs updates and debug
information in syslog with loglevel LOG_DEBUG. This information comes
in handy for debugging issues where the key-value pairs for certain
pools show mismatch/incorrect values.
The distro-vendors can further consume these changes and modify the
respective service files to redirect the logs to specific files as
needed.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Naman Jain <namjain@linux.microsoft.com>
	Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com
	Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com>
(cherry picked from commit a9c0b33)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
…l page

jira LE-3813
commit-author Long Li <longli@microsoft.com>
commit 4a3b99b

When mapping doorbell page from user-mode, the driver should use the system
page size as this memory is allocated via mmap() from user-mode.

	Cc: stable@vger.kernel.org
Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
	Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1725030993-16213-2-git-send-email-longli@linuxonhyperv.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 4a3b99b)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
… size

jira LE-3813
commit-author Long Li <longli@microsoft.com>
commit 9e517a8

MANA hardware uses 4k page size. When calculating the page table index,
it should use the hardware page size, not the system page size.

	Cc: stable@vger.kernel.org
Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
	Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1725030993-16213-1-git-send-email-longli@linuxonhyperv.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 9e517a8)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
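
To illustrate why the page size matters in the commit above, here is the index arithmetic only, not the driver code. The 64K system page size and the byte offset are assumed example values, and the constant and function names are hypothetical:

```python
HW_PAGE_SIZE = 4096        # MANA hardware page size, per the commit message
SYSTEM_PAGE_SIZE = 65536   # assumed example: a kernel built with 64K pages


def page_index(offset, page_size):
    """Index of the page containing the given byte offset."""
    return offset // page_size


offset = 0x12000  # arbitrary example byte offset (73728)
index_hw = page_index(offset, HW_PAGE_SIZE)
index_sys = page_index(offset, SYSTEM_PAGE_SIZE)
print(index_hw, index_sys)
```

The two results differ (18 versus 1) whenever the system page size is not 4K, so indexing by the system page size selects the wrong hardware page table entry. When both sizes are 4K the results are identical.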
jira LE-3545
commit-author Dexuan Cui <decui@microsoft.com>
commit b2f9665

Currently storvsc_timeout is only used in storvsc_sdev_configure(), and
5s and 10s are used elsewhere. It turns out that rarely the 5s is not
enough on Azure, so let's use storvsc_timeout everywhere.

In case a timeout happens and storvsc_channel_init() returns an error,
close the VMBus channel so that any host-to-guest messages in the
channel's ringbuffer, which might come late, can be safely ignored.

Add a "const" to storvsc_timeout.

	Cc: stable@kernel.org
	Signed-off-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1749243459-10419-1-git-send-email-decui@microsoft.com
	Reviewed-by: Long Li <longli@microsoft.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b2f9665)
	Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3554
commit-author Michael Kelley <mhklinux@outlook.com>
commit 380b75d

vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
and is hardcoded to create a single GPA range. To allow it to also be
used by the netvsc driver to create multiple GPA ranges, no longer
hardcode as having a single GPA range. Allow the calling driver to
specify the rangecount in the supplied descriptor.

Update the storvsc driver to reflect this new approach.

	Cc: <stable@vger.kernel.org> # 6.1.x
	Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 380b75d)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3554
commit-author Michael Kelley <mhklinux@outlook.com>
commit 4f98616

netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
messages. This function creates a series of GPA ranges, each of which
contains a single PFN. However, if the rndis header in the VMBus
message crosses a page boundary, the netvsc protocol with the host
requires that both PFNs for the rndis header must be in a single "GPA
range" data structure, which isn't possible with
vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
new function netvsc_build_mpb_array() to build a VMBus message with
multiple GPA ranges, each of which may contain multiple PFNs. Use
vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.

There's no functional change since higher levels of netvsc don't
maintain or propagate knowledge of contiguous PFNs. Based on its
input, netvsc_build_mpb_array() still produces a separate GPA range
for each PFN and the behavior is the same as with
vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
subsequent patch to provide the necessary grouping.

	Cc: <stable@vger.kernel.org> # 6.1.x
	Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-3-mhklinux@outlook.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 4f98616)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3554
commit-author Michael Kelley <mhklinux@outlook.com>
commit 41a6328

Starting with commit dca5161 ("hv_netvsc: Check status in
SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
driver for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

The root cause of the problem is that the netvsc driver in Linux may
send an incorrectly formatted VMBus message to Hyper-V when
transmitting the network packet. The incorrect formatting occurs when
the rndis header of the VMBus message crosses a page boundary due to
how the Linux skb head memory is aligned. In such a case, two PFNs are
required to describe the location of the rndis header, even though
they are contiguous in guest physical address (GPA) space. Hyper-V
requires that two rndis header PFNs be in a single "GPA range" data
structure, but current netvsc code puts each PFN in its own GPA range,
which Hyper-V rejects as an error.

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed ("net: allow small head cache usage with
large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
much more frequently in VMs with 16 or more vCPUs. It may occur every
few seconds, or even more frequently, in an ssh session that outputs a
lot of text. Commit 14ad6ed subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

Current code in init_page_array() creates a separate page buffer array
entry for each PFN required to identify the data to be transmitted.
Contiguous PFNs get separate entries in the page buffer array, and any
information about contiguity is lost.

Fix the core issue by having init_page_array() construct the page
buffer array to represent contiguous ranges rather than individual
pages. When these ranges are subsequently passed to
netvsc_build_mpb_array(), it can build GPA ranges that contain
multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
error status: 2". If instead the network packet is sent by copying
into a pre-allocated send buffer slot, the copy proceeds using the
contiguous ranges rather than individual pages, but the result of the
copying is the same. Also fix rndis_filter_send_request() to construct
a contiguous range, since it has its own page buffer array.

This change has a side benefit in CoCo VMs in that netvsc_dma_map()
calls dma_map_single() on each contiguous range instead of on each
page. This results in fewer calls to dma_map_single() but on larger
chunks of memory, which should reduce contention on the swiotlb.

Since the page buffer array now contains one entry for each contiguous
range instead of for each individual page, the number of entries in
the array can be reduced, saving 208 bytes of stack space in
netvsc_xmit() when MAX_SKB_FRAGS has the default value of 17.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
	Cc: <stable@vger.kernel.org> # 6.1.x
	Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-4-mhklinux@outlook.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 41a6328)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
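
The difference between per-page entries and contiguous ranges described in the commit above can be modelled in a few lines. This is a simplified sketch, not the driver code: entries are plain (pfn, offset, length) tuples, the 4K page size is assumed, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed guest page size for this illustration


def per_page_entries(addr, length):
    """Old behaviour as described: one (pfn, offset, len) entry per page touched."""
    entries = []
    while length > 0:
        offset = addr % PAGE_SIZE
        chunk = min(PAGE_SIZE - offset, length)
        entries.append((addr // PAGE_SIZE, offset, chunk))
        addr += chunk
        length -= chunk
    return entries


def contiguous_entry(addr, length):
    """New behaviour as described: one entry for the whole contiguous buffer."""
    return (addr // PAGE_SIZE, addr % PAGE_SIZE, length)


# A 200-byte header starting 96 bytes before the end of page 5.
header_addr = 5 * PAGE_SIZE + 4000
split = per_page_entries(header_addr, 200)
single = contiguous_entry(header_addr, 200)
print(split, single)
```

The per-page form yields two entries for the header that crosses the page boundary, which is the shape Hyper-V rejects according to the commit message. The contiguous form keeps a single entry, from which one GPA range covering both PFNs can be built.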
jira LE-3554
commit-author Michael Kelley <mhklinux@outlook.com>
commit 5bbc644

init_page_array() now always creates a single page buffer array entry
for the rndis message, even if the rndis message crosses a page
boundary. As such, the number of page buffer array entries used for
the rndis message must no longer be tracked -- it is always just 1.
Remove the rmsg_pgcnt field and use "1" where the value is needed.

	Cc: <stable@vger.kernel.org> # 6.1.x
	Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-5-mhklinux@outlook.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 5bbc644)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3554
commit-author Michael Kelley <mhklinux@outlook.com>
commit 45a442f

With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
callers. Remove it.

	Cc: <stable@vger.kernel.org> # 6.1.x
	Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-6-mhklinux@outlook.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 45a442f)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3885
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 6859209

On Azure, increasing VF's gso/gro packet size to up-to GSO_MAX_SIZE
is not possible without allowing the same for netvsc NIC
(as the NICs are bonded together). For bonded NICs, the min of the max
aggregated pkt size of the members is propagated in the stack.

Therefore, we use netif_set_tso_max_size() to set max aggregated pkt size
to VF's packet size for netvsc too, when the data path is switched over
to the VF.
Tested on azure env with Accelerated Networking enabled and disabled.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6859209)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3885
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 2731583

Allow the max aggregated pkt size to go up-to GSO_MAX_SIZE for MANA NIC.
This patch only increases the max allowable gso/gro pkt size for MANA
devices and does not change the defaults.
Following are the perf benefits of increasing the pkt aggregate size from
the legacy gso_max_size value (64K) to the newer one (up to 511K):

IPv4 tests
for i in {1..10}; do netperf -t TCP_RR  -H 10.0.0.5 -p50000 -- -r80000,80000
-O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done

min	p90	p99	Throughput		gso_max_size
93	171	194	6594.25
97	154	180	7183.74
95	165	189	6927.86
96	165	188	6976.04
93	154	185	7338.05			64K
93	168	189	6938.03
94	169	189	6784.93
92	166	189	7117.56
94	179	191	6678.44
95	157	183	7277.81

min	p90	p99	Throughput
93	134	146	8448.75
95	134	140	8396.54
94	137	148	8204.12
94	137	148	8244.41
94	128	139	8666.52			80K
94	141	153	8116.86
94	138	149	8163.92
92	135	142	8362.72
92	134	142	8497.57
93	136	148	8393.23

IPv6 Tests
for i in {1..10}; do netperf -t TCP_RR  -H fd00:9013:cadd::4 -p50000 --
-r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done

min	p90	p99	Throughput		gso_max_size
108	165	170	6673.2
101	169	189	6451.69
101	165	169	6737.65
102	167	175	6614.64
101	178	189	6247.13			64K
107	163	169	6678.63
106	176	187	6350.86
100	164	169	6617.36
102	163	170	6849.21
102	168	175	6605.7

min	p90	p99	Throughput
108	155	166	7183
110	154	163	7268.87
109	152	159	7434.35
107	145	157	7569.15
107	149	164	7496.17			80K
110	154	159	7245.85
108	156	162	7266.24
109	145	158	7526.66
106	145	151	7785.75
111	148	157	7246.65

Tested on azure env with Accelerated Networking enabled and disabled.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2731583)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
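
As a rough summary of the IPv4 numbers in the commit message above, the mean throughput of each run can be compared. The values are copied from the two IPv4 tables; the helper below is only a convenience sketch, and the averaging is not part of the original commit message.

```python
from statistics import mean

# Throughput column of the IPv4 tables above (gso_max_size 64K and 80K).
ipv4_64k = [6594.25, 7183.74, 6927.86, 6976.04, 7338.05,
            6938.03, 6784.93, 7117.56, 6678.44, 7277.81]
ipv4_80k = [8448.75, 8396.54, 8204.12, 8244.41, 8666.52,
            8116.86, 8163.92, 8362.72, 8497.57, 8393.23]


def gain_percent(baseline, candidate):
    """Relative change of the mean, in percent."""
    return (mean(candidate) / mean(baseline) - 1.0) * 100.0


ipv4_gain = gain_percent(ipv4_64k, ipv4_80k)
print(f"mean 64K: {mean(ipv4_64k):.2f}")
print(f"mean 80K: {mean(ipv4_80k):.2f}")
print(f"gain:     {ipv4_gain:.1f}%")
```

For the IPv4 runs this works out to a mean throughput gain of a little under 20 percent. The same helper can be applied to the IPv6 table.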
jira LE-3889
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit 47dfd7a

Add more logs to assist in debugging and monitoring
driver behaviour, making it easier to identify potential
issues during development and testing.

	Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1739842455-23899-1-git-send-email-ernis@linux.microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 47dfd7a)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3893
commit-author Long Li <longli@microsoft.com>
commit a8445cf

Change mana_get_primary_netdev_rcu() to mana_get_primary_netdev(), and
return the ndev with refcount held. The caller is responsible for dropping
the refcount.

Also drop the check for IFF_SLAVE as it is not necessary if the upper
device is present.

	Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1741821332-9392-1-git-send-email-longli@linuxonhyperv.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit a8445cf)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3893
commit-author Long Li <longli@microsoft.com>
commit bee35b7
upstream-diff There were conflicts when applying this patch
due to the following missing commits :-
79bccd7 ("RDMA/mana_ib: Add port statistics support")
df91c47 ("RDMA/mana_ib: create/destroy AH")

When running under Hyper-V, the master device to the RDMA device is always
bonded to this RDMA device. This is not user-configurable.

The master device can be unbound/bound from the kernel. During those events,
the RDMA device should be set to the current netdev to reflect the change of
master device.

	Signed-off-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1741821332-9392-2-git-send-email-longli@linuxonhyperv.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit bee35b7)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3903
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit 2fc8a34

According to GDMA protocol, holes (zeros) are allowed at the beginning
or middle of the gdma_list_devices_resp message. The existing code
cannot properly handle this, and may miss some devices in the list.

To fix, scan the entire list until the num_of_devs are found, or until
the end of the list.
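
The scan described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the driver's code: `struct dev_slot`, `scan_dev_list`, and the convention that a zero `type` marks a hole are hypothetical names and choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one slot of the device list. */
struct dev_slot {
    uint16_t type;      /* assumption: 0 means the slot is a hole */
    uint16_t instance;
};

/*
 * Walk every slot until num_devs non-empty entries have been copied
 * or the list ends. Returns the number of devices copied into out.
 */
static size_t scan_dev_list(const struct dev_slot *list, size_t max_slots,
                            size_t num_devs, struct dev_slot *out)
{
    size_t found = 0;

    assert(list && out);
    for (size_t i = 0; i < max_slots && found < num_devs; i++) {
        if (list[i].type == 0)
            continue;           /* skip the hole instead of stopping */
        out[found++] = list[i];
    }
    return found;
}
```

A loop that stopped at the first zero entry, or that only looked at the first `num_devs` slots, would miss devices placed after a hole. The sketch keeps going until either bound is reached.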

	Cc: stable@vger.kernel.org
Fixes: ca9c54d ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Long Li <longli@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@microsoft.com>
	Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1741723974-1534-1-git-send-email-haiyangz@microsoft.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit 2fc8a34)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3907
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit fa37a88

Frag allocators, such as netdev_alloc_frag(), were not designed to
work for fragsz > PAGE_SIZE.

So, switch to page pool for jumbo frames instead of using page frag
allocators. This driver is using page pool for smaller MTUs already.

	Cc: stable@vger.kernel.org
Fixes: 80f6215 ("net: mana: Add support for jumbo frame")
	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Long Li <longli@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://patch.msgid.link/1742920357-27263-1-git-send-email-haiyangz@microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit fa37a88)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
…htool.

jira LE-3915
commit-author Dipayaan Roy <dipayanroy@linux.microsoft.com>
commit c09ef59

Add support for reporting additional hardware counters for drop and
TC using the ethtool -S interface.

These counters include:

- Aggregate Rx/Tx drop counters
- Per-TC Rx/Tx packet counters
- Per-TC Rx/Tx byte counters
- Per-TC Rx/Tx pause frame counters

The counters are exposed using ethtool_ops->get_ethtool_stats and
ethtool_ops->get_strings. These counters are not available
on all hardware versions.
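
How a string table and a value array stay in step for `ethtool -S` can be sketched in userspace C. The layout below is an assumption for illustration only: `struct hw_counters`, the counter names, `fill_strings` and `fill_stats` are hypothetical and are not taken from the driver. Only the 32-byte name width mirrors the kernel's `ETH_GSTRING_LEN`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32  /* same width as the kernel's ETH_GSTRING_LEN */

/* Hypothetical counter block; field names are illustrative only. */
struct hw_counters {
    uint64_t rx_drop;
    uint64_t tx_drop;
    uint64_t rx_tc0_packets;
    uint64_t tx_tc0_packets;
};

/* One table drives both the names and the values, so they cannot drift. */
struct stat_desc {
    char name[GSTRING_LEN];
    size_t offset;
};

static const struct stat_desc stat_table[] = {
    { "rx_drop",        offsetof(struct hw_counters, rx_drop) },
    { "tx_drop",        offsetof(struct hw_counters, tx_drop) },
    { "rx_tc0_packets", offsetof(struct hw_counters, rx_tc0_packets) },
    { "tx_tc0_packets", offsetof(struct hw_counters, tx_tc0_packets) },
};

#define NUM_STATS (sizeof(stat_table) / sizeof(stat_table[0]))

/* Analogue of get_strings: one fixed-width name per counter. */
static void fill_strings(char *buf)
{
    assert(buf);
    for (size_t i = 0; i < NUM_STATS; i++)
        memcpy(buf + i * GSTRING_LEN, stat_table[i].name, GSTRING_LEN);
}

/* Analogue of get_ethtool_stats: values in the same order as the names. */
static void fill_stats(const struct hw_counters *hc, uint64_t *data)
{
    assert(hc && data);
    for (size_t i = 0; i < NUM_STATS; i++)
        memcpy(&data[i], (const char *)hc + stat_table[i].offset,
               sizeof(uint64_t));
}
```

Driving both callbacks from one table is a common design choice because the i-th name and the i-th value must always describe the same counter.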

	Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
	Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20250609100103.GA7102@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit c09ef59)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3919
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit 7768c5f

To collaborate with hardware servicing events, upon receiving the special
EQE notification from the HW channel, remove the devices on this bus.
Then, after a waiting period based on the device specs, rescan the parent
bus to recover the devices.

	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1749834034-18498-1-git-send-email-haiyangz@linux.microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 7768c5f)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-3923
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit fbe346c
upstream-diff There were conflicts when applying this
patch due to the following missing commits:
ca8ac48 ("net: mana: Handle unsupported HWC commands")
505cc26 ("net: mana: Add support for auxiliary device servicing events")

Upon receiving the Reset Request, pause the connection and clean up
queues, wait for the specified period, then resume the NIC.
In the cleanup phase, the HWC is no longer responding, so set hwc_timeout
to zero to skip waiting on the response.
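
The sequence above (pause, clean up with a zero timeout, wait, resume) can be sketched as a small userspace simulation. Everything here is hypothetical: `struct fake_nic`, `send_teardown_cmd` and `handle_reset_request` are illustrative names, time is only accounted for rather than slept, and restoring the timeout before resuming is an assumption of the sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

enum nic_state { NIC_RUNNING, NIC_SUSPENDED };

/* Hypothetical model of the NIC and its hardware channel (HWC) timeout. */
struct fake_nic {
    enum nic_state state;
    unsigned int hwc_timeout_ms;     /* 0 means: do not wait for a reply */
    unsigned int default_timeout_ms; /* value restored before resuming */
    unsigned int waited_ms;          /* simulated time spent waiting */
};

/* Teardown command sent while the HWC no longer answers: worst case, the
 * sender waits for the full timeout before giving up. */
static void send_teardown_cmd(struct fake_nic *nic)
{
    nic->waited_ms += nic->hwc_timeout_ms;
}

static void handle_reset_request(struct fake_nic *nic, unsigned int num_queues,
                                 unsigned int service_wait_ms)
{
    assert(nic);
    nic->state = NIC_SUSPENDED;      /* pause the connection */
    nic->hwc_timeout_ms = 0;         /* skip waiting on HWC responses */
    for (unsigned int i = 0; i < num_queues; i++)
        send_teardown_cmd(nic);      /* clean up queues */
    nic->waited_ms += service_wait_ms;             /* wait the given period */
    nic->hwc_timeout_ms = nic->default_timeout_ms; /* assumption: restore */
    nic->state = NIC_RUNNING;        /* resume the NIC */
}
```

With a non-zero timeout during cleanup, each teardown command in this model would add a full timeout to the total wait, which is the delay the zero setting avoids.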

	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1751055983-29760-1-git-send-email-haiyangz@linux.microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit fbe346c)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
Collaborator

@bmastbergen bmastbergen left a comment

🥌

Collaborator

@PlaidCat PlaidCat left a comment

:shipit:

@shreeya-patel98 shreeya-patel98 merged commit 036fc39 into sig-cloud-9/5.14.0-570.42.2.el9_6 Sep 24, 2025
4 checks passed