/!\ To report a security issue please follow this procedure:
[https://github.com/OpenNebula/one/wiki/Vulnerability-Management-Process]
Description
The current implementation of Huge Pages support, introduced by the enhancement "Support use of huge pages without CPU pinning #6185," selects a NUMA node based on free resources, and this scheduling mechanism effectively balances load across NUMA nodes. However, the NUMA node selection is not re-evaluated during VM migration, which leads to inconsistencies.
To Reproduce
- Configure a VM to use Huge Pages and deploy it on a host.
- Initiate a migration using the standard SAVE/Restore or Live migration method.
- Observe that the VM continues to use the old NUMA node on the target host, even if the scheduler selects a different NUMA node based on the target host’s free resources.
- If there is insufficient memory in the old NUMA node on the target, the migration may fail.
- Deploy new VMs and note inconsistencies caused by incorrectly pinned VMs.
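For reference, a VM that triggers this path can be defined with a huge-pages request in its TOPOLOGY section; the values below are illustrative, and the NUMA node itself is chosen by the scheduler rather than the template:

```
MEMORY = "2048"
CPU    = "2"
VCPU   = "2"
TOPOLOGY = [
  HUGEPAGE_SIZE = "2"  # huge page size in MB; no CPU pinning requested
]
```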
Expected behavior
- When a VM is migrated using SAVE/Restore or Live migration methods, the NUMA node assignments should be updated based on the scheduler's decision.
- The migration should update the VM’s configuration with the correct NUMA assignments, avoiding failures and maintaining scheduling consistency.
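Concretely, when the scheduler picks a different NUMA node on the target host, the domain XML handed to libvirt should reflect that choice, e.g. in the `<numatune>` element (the nodeset value shown is illustrative):

```xml
<numatune>
  <!-- nodeset must match the NUMA node selected on the target host,
       not the node the VM used on the source host -->
  <memory mode='strict' nodeset='1'/>
</numatune>
```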
Details
- Affected Component: Scheduler, Virtual Machine Manager (VMM)
- Hypervisor: KVM
- Version: All
Additional context
- During SAVE/Restore and Live migration operations, we can use the --xml option (supported by virsh save, virsh restore, and virsh migrate) to provide a new XML configuration file with the updated NUMA topology and CPU pinning information. This ensures that the VM's NUMA node and CPU assignments are correctly updated on the target host.
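As an illustration only (not the actual OpenNebula code), the XML passed via --xml could be produced by rewriting the NUMA placement of the source domain XML to match the scheduler's decision for the target host. The helper below is a hypothetical sketch using only the Python standard library:

```python
import xml.etree.ElementTree as ET

def retarget_numa(domain_xml: str, target_nodeset: str) -> str:
    """Rewrite the <numatune> host nodeset so the VM's memory is
    pinned to the NUMA node chosen by the scheduler on the target host.
    Hypothetical helper for illustration; not OpenNebula's implementation."""
    root = ET.fromstring(domain_xml)
    numatune = root.find("numatune")
    if numatune is None:  # VM had no NUMA pinning recorded yet
        numatune = ET.SubElement(root, "numatune")
    memory = numatune.find("memory")
    if memory is None:
        memory = ET.SubElement(numatune, "memory", mode="strict")
    memory.set("nodeset", target_nodeset)
    return ET.tostring(root, encoding="unicode")

# Example: VM was pinned to host NUMA node 0, scheduler picked node 1
src = "<domain><numatune><memory mode='strict' nodeset='0'/></numatune></domain>"
dst = retarget_numa(src, "1")
```

The resulting XML can then be written to a file and passed to `virsh migrate --xml` (or `virsh restore --xml` for the SAVE/Restore path) so the target host applies the new placement.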
Progress Status
- Code committed - PR #6772: Fix for NUMA and CPU Pinning Discrepancies During VM Save and Live Migration #6773
- Testing - QA
- Documentation (Release notes - resolved issues, compatibility, known issues)