|
| 1 | +--- |
| 2 | +title: Windows Server Multipath I/O (MPIO) troubleshooting guidance |
| 3 | +description: Resolves Multipath I/O (MPIO) issues in Hyper-V, clustering, and virtualization environments. |
| 4 | +ms.date: 10/08/2025 |
| 5 | +manager: dcscontentpm |
| 6 | +audience: itpro |
| 7 | +ms.topic: troubleshooting |
| 8 | +ms.reviewer: kaushika |
| 9 | +ms.custom: |
| 10 | +- sap:Backup, Recovery, Disk, and Storage\Multipath IO (MPIO) and Storport |
| 11 | +- pcy:WinComm Storage High Avail |
| 12 | +appliesto: |
| 13 | + - <a href=https://learn.microsoft.com/windows/release-health/windows-server-release-info target=_blank>Supported versions of Windows Server</a> |
| 14 | +--- |
| 15 | + |
| 16 | +# Windows Server Multipath I/O (MPIO) troubleshooting guidance |
| 17 | + |
| 18 | +## Summary |
| 19 | + |
| 20 | +In modern Windows Server environments (Hyper-V, clustering, and virtualization), Multipath I/O (MPIO) is important for achieving storage high availability and fault tolerance. However, failures in configuration, hardware compatibility, or interactions with third-party DSMs (Device Specific Modules) can make disks unavailable and cause performance issues, path loss, and unexpected outages. This article provides troubleshooting help for administrators to resolve MPIO and storage path issues. |
| 21 | + |
| 22 | +## Troubleshooting checklist |
| 23 | + |
| 24 | +Use this checklist for systematic troubleshooting. |
| 25 | + |
| 26 | +**Preparation** |
| 27 | + |
| 28 | +- Verify a current and tested backup of all involved systems. |
| 29 | +- Confirm maintenance window and change management approval. |
| 30 | +- Review recent storage and network changes or firmware and driver updates. |
| 31 | +- Document all observed issues, error messages, event IDs, and timing. |
| 32 | + |
| 33 | +**Initial triage** |
| 34 | + |
| 35 | +- Are disks or volumes missing in Disk Management, Failover Cluster Manager, or virtual machine (VM) settings? |
| 36 | +- Are any MPIO or storage errors reported? (Check `mpclaim -s -d`, Device Manager, and Event Viewer.) |
| 37 | +- Are you using third-party DSMs? Are they certified for your OS version? |
| 38 | +- Were there recent hardware changes, such as new or replaced cables, HBAs, storage switches or zones, or server restarts? |
| 39 | +- Are storage controller and network paths visible in the system management console? |
| 40 | +- Are relevant Windows features (for example, Multipath-IO) installed and enabled? |
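| | + |
| | +The feature and DSM checks in this list can also be run from PowerShell. The following is a minimal sketch, assuming that the MPIO PowerShell module (installed together with the Multipath-IO feature) is available on the server. If the feature isn't installed, only the first command is expected to work. |
| | + |
| | +```powershell |
| | +# Check whether the Multipath-IO feature is installed. |
| | +Get-WindowsFeature -Name Multipath-IO |
| | + |
| | +# List the hardware IDs that the Microsoft DSM is configured to claim. |
| | +Get-MSDSMSupportedHW |
| | + |
| | +# List devices that are available to be claimed by MPIO. |
| | +Get-MPIOAvailableHW |
| | +``` |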
| 41 | + |
| 42 | +**Deeper checks** |
| 43 | + |
| 44 | +- Review Windows Event Viewer (system, storage, failover clustering logs). |
| 45 | +- Use PowerShell, DiskPart, and Sysinternals tools to inspect and manipulate the disk and volume status. |
| 46 | +- Are problems isolated to one node, or do they affect all nodes or all servers? |
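| | + |
| | +As an alternative to browsing Event Viewer, the System log can be filtered from PowerShell. This is a sketch only. The event IDs are the storage-related IDs that this article mentions, and the 24-hour window is an arbitrary example value. |
| | + |
| | +```powershell |
| | +# Query the System log for storage-related event IDs from the last 24 hours. |
| | +Get-WinEvent -FilterHashtable @{ |
| | +    LogName   = 'System' |
| | +    Id        = 46, 129, 153 |
| | +    StartTime = (Get-Date).AddHours(-24) |
| | +} -ErrorAction SilentlyContinue | |
| | +    Sort-Object TimeCreated | |
| | +    Select-Object TimeCreated, Id, ProviderName, Message |
| | +``` |
| | + |
| | +Running the same query on each node helps show whether the errors are isolated to one server. |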
| 47 | + |
| 48 | +## Common issues and solutions |
| 49 | + |
| 50 | +The following sections detail the most common failure modes and provide step-by-step solutions. |
| 51 | + |
| 52 | +### Disks or paths missing in MPIO or OS (including after maintenance or restart) |
| 53 | + |
| 54 | +#### Symptoms |
| 55 | + |
| 56 | +- Disk missing in Disk Management or cluster |
| 57 | +- `mpclaim -s -d` output is missing LUNs |
| 58 | +- "Discover" option unavailable (greyed out) |
| 59 | +- Event IDs 46, 153, 129 |
| 60 | + |
| 61 | +#### Resolution |
| 62 | + |
| 63 | +1. Check physical and storage connectivity and switch zoning. |
| 64 | +2. Remove ghost or hidden devices in Device Manager (select **View** > **Show hidden devices**), or run the DevNodeClean utility. |
| 65 | +3. Uninstall unsupported or duplicate DSMs (**Control Panel** > **Programs and Features**). |
| 66 | +4. Restart, reinstall, or re-enable the Multipath-IO feature (`Install-WindowsFeature Multipath-IO`). |
| 67 | +5. Run: |
| 68 | + |
| 69 | + ```console |
| 70 | + mpclaim -e |
| 71 | + mpclaim -s -d |
| 72 | + mpclaim -r -i -a "" |
| 73 | + ``` |
| 74 | + |
| 75 | + The `-r` switch restarts the server automatically without prompting. If devices are still missing, add their hardware IDs in **MPIO Properties** > **Discover Multi-Paths**. |
| 76 | +6. Open DiskPart: |
| 77 | + |
| 78 | + ```console |
| 79 | + diskpart |
| 80 | + san policy=OnlineAll |
| 81 | + exit |
| 82 | + ``` |
| 83 | + |
| 84 | +7. Bring missing disks online in Disk Management or through DiskPart. |
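| | + |
| | +For step 7, the Storage module cmdlets offer a PowerShell alternative to Disk Management. This is a minimal sketch. The disk number `3` is a placeholder, and on a cluster node you should bring a disk online only if the cluster doesn't manage it. |
| | + |
| | +```powershell |
| | +# List disks that are currently offline. |
| | +Get-Disk | Where-Object IsOffline | Select-Object Number, FriendlyName, OperationalStatus |
| | + |
| | +# Bring one disk online. Replace 3 with the number reported above. |
| | +Set-Disk -Number 3 -IsOffline $false |
| | +``` |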
| 85 | + |
| 86 | +### Multipath failover delays or path flapping |
| 87 | + |
| 88 | +#### Symptoms |
| 89 | + |
| 90 | +- Delays of 30 seconds or more, or unresponsiveness, during failover |
| 91 | +- I/O errors |
| 92 | +- "MPIO is in a degraded state" |
| 93 | + |
| 94 | +- Event IDs 140, 46, 153, 129 |
| 95 | + |
| 96 | +#### Resolution |
| 97 | + |
| 98 | +1. Update all storage, HBA, and multipath drivers and firmware. |
| 99 | +2. Set recommended load-balancing policy and failover parameters: |
| 100 | + |
| 101 | + ```powershell |
| 102 | + Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR |
| 103 | + Set-MPIOSetting -NewPathVerificationState Enabled |
| 104 | + Set-MPIOSetting -NewPathVerificationPeriod 30 |
| 105 | + Set-MPIOSetting -NewPDORemovePeriod 20 |
| 106 | + Set-MPIOSetting -CustomPathRecovery Enabled |
| 107 | + Set-MPIOSetting -NewPathRecoveryInterval 10 |
| 108 | + reg add HKLM\SYSTEM\CurrentControlSet\Services\mpio\Parameters /v PathVerifyEnabled /t REG_DWORD /d 1 /f |
| 109 | + ``` |
| 110 | + |
| 111 | + Restart if prompted to do so. |
| 112 | +3. On clusters, increase disk resource timeouts to tolerate slow failover. |
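| | + |
| | +To confirm which values are in effect after a change or restart, the current MPIO configuration can be read back. This sketch assumes the MPIO PowerShell module and the Microsoft DSM: |
| | + |
| | +```powershell |
| | +# Show the current MPIO timer and path verification settings. |
| | +Get-MPIOSetting |
| | + |
| | +# Show the default load-balancing policy for the Microsoft DSM. |
| | +Get-MSDSMGlobalDefaultLoadBalancePolicy |
| | + |
| | +# List MPIO disks and the load-balancing policy applied to each. |
| | +mpclaim -s -d |
| | +``` |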
| 113 | + |
| 114 | +### Cluster disk online pending or resource failure |
| 115 | + |
| 116 | +#### Symptoms |
| 117 | + |
| 118 | +- Cluster disk stuck "Online Pending" |
| 119 | +- Event IDs 1069, 4874 |
| 120 | +- Disk takes three to five minutes to come online |
| 121 | + |
| 122 | +#### Resolution |
| 123 | + |
| 124 | +1. Review cluster logs for timeout or resource errors. |
| 125 | +2. Increase resource timeout in Failover Cluster Manager (default is three minutes, try five minutes or more). |
| 126 | +3. Check for volume snapshots or FSRM quotas that delay bringing resources online. |
| 127 | +4. Clear any cluster dependencies or disk policies that are impeding access. |
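| | + |
| | +Steps 1 and 2 can also be done by using the FailoverClusters PowerShell module. This is a sketch under assumptions: `Cluster Disk 1` and `C:\Temp` are placeholders (the folder must already exist), and `PendingTimeout` is expressed in milliseconds. |
| | + |
| | +```powershell |
| | +# Show the pending timeout for each physical disk resource. |
| | +Get-ClusterResource | |
| | +    Where-Object { $_.ResourceType.Name -eq 'Physical Disk' } | |
| | +    Select-Object Name, State, OwnerNode, PendingTimeout |
| | + |
| | +# Raise the pending timeout to five minutes (300,000 ms). |
| | +(Get-ClusterResource -Name 'Cluster Disk 1').PendingTimeout = 300000 |
| | + |
| | +# Generate cluster logs that cover the last 30 minutes. |
| | +Get-ClusterLog -Destination C:\Temp -TimeSpan 30 |
| | +``` |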
| 128 | + |
| 129 | +### Persistent disk or path errors in the event log |
| 130 | + |
| 131 | +#### Symptoms |
| 132 | + |
| 133 | +- Recurring Event IDs 153 ("disk retried"), 129 ("reset to device"), 11 ("controller error"), 158 (identical disk GUIDs) |
| 134 | +- Application downtime |
| 135 | + |
| 136 | +#### Resolution |
| 137 | + |
| 138 | +1. Verify that physical and logical paths exist and are healthy. |
| 139 | +2. Update storage firmware and drivers. |
| 140 | +3. For Event ID 158, reset disk GUIDs on VHDs by using: |
| 141 | + |
| 142 | + ```powershell |
| 143 | + Set-VHD -Path <VHD-Path> -ResetDiskIdentifier |
| 144 | + ``` |
| 145 | + |
| 146 | +4. For repeated errors with third-party DSMs, consult the storage vendor, or migrate to the supported native DSM. |
| 147 | + |
| 148 | +### Duplicate disks or wrong disk order |
| 149 | + |
| 150 | +#### Symptoms |
| 151 | + |
| 152 | +- Duplicate instance of the same disk or LUN |
| 153 | +- Disks renumbered after a restart, addition, or removal |
| 154 | + |
| 155 | +#### Resolution |
| 156 | + |
| 157 | +1. Make sure that only one multipath solution is attached (remove unsupported DSMs, restart, and then reinstall or reconfigure MPIO). |
| 158 | + |
| 159 | +2. Disk order in Windows is nondeterministic. For applications, use persistent identifiers (GUID/UUID/WWN) instead of disk number or drive letter. |
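| | + |
| | +The persistent identifiers that Windows records for each disk can be listed next to the current disk number. This is an illustrative sketch. Which identifier is populated depends on the storage (for example, `Guid` is set only for GPT disks). |
| | + |
| | +```powershell |
| | +# Map the current (nonpersistent) disk numbers to persistent identifiers. |
| | +Get-Disk | |
| | +    Sort-Object Number | |
| | +    Select-Object Number, FriendlyName, SerialNumber, UniqueId, Guid |
| | +``` |
| | + |
| | +Scripts that select a disk by `UniqueId` or `SerialNumber` keep working when disk numbers change after a restart. |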
| 160 | + |
| 161 | +### Performance and latency issues, high disk I/O waits |
| 162 | + |
| 163 | +#### Symptoms |
| 164 | + |
| 165 | +- High latency in databases and apps |
| 166 | +- Periodic I/O spikes |
| 167 | +- Low throughput |
| 168 | +- Event 833 (SQL) |
| 169 | +- Slow backups |
| 170 | + |
| 171 | +#### Resolution |
| 172 | + |
| 173 | +1. Collect Performance Monitor counters to verify the latency (for example, by using `logman`): |
| 174 | + |
| 175 | + ```console |
| 176 | + logman create counter PerfLog -o C:\PerfLog.blg -f bincirc ... |
| 177 | + ``` |
| 178 | + |
| 179 | +2. Check for security or antivirus scans on storage volumes (exclude or temporarily disable to test the effect). |
| 180 | +3. Update drivers, firmware, and the OS. |
| 181 | +4. Use 64K allocation units for data volumes. Distribute disks across multiple controllers, if possible. |
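| | + |
| | +For a quick look before you set up a full counter log, disk latency and the allocation unit size can be sampled from PowerShell. This is a sketch only. The sampling interval and count are arbitrary, the counter names assume an English-language OS, and the `AllocationUnitSize` property is assumed to be available (Windows Server 2016 and later versions). |
| | + |
| | +```powershell |
| | +# Sample read and write latency for all physical disks (12 samples, 5 seconds apart). |
| | +Get-Counter -Counter '\PhysicalDisk(*)\Avg. Disk sec/Read', |
| | +                     '\PhysicalDisk(*)\Avg. Disk sec/Write' -SampleInterval 5 -MaxSamples 12 |
| | + |
| | +# Show the allocation unit size of each volume (65536 = 64K). |
| | +Get-Volume | Select-Object DriveLetter, FileSystemLabel, FileSystem, AllocationUnitSize |
| | +``` |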
| 182 | + |
| 183 | +### Cluster disk disappears after an expansion or resize |
| 184 | + |
| 185 | +#### Symptoms |
| 186 | + |
| 187 | +- After you increase disk or LUN capacity, the volume isn't visible in the cluster or in Disk Management until the role is cycled. |
| 188 | + |
| 189 | +#### Resolution |
| 190 | + |
| 191 | +1. Bring the affected cluster role offline, and then bring it online (or move the role to another node). |
| 192 | +2. Use DiskPart on the owner node to rescan and extend the file system during a maintenance window: |
| 193 | + |
| 194 | + ```console |
| 195 | + diskpart |
| 196 | + rescan |
| 197 | + list vol |
| 198 | + select vol x |
| 199 | + extend filesystem |
| 200 | + exit |
| 201 | + ``` |
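| | + |
| | +The same rescan and extend operation can be sketched with the Storage module cmdlets. The drive letter `E` is a placeholder, and the example assumes that the partition should grow to the maximum size that the resized LUN supports. |
| | + |
| | +```powershell |
| | +# Rescan storage so that Windows detects the new LUN size. |
| | +Update-HostStorageCache |
| | + |
| | +# Query the supported size range, and then extend the partition to the maximum. |
| | +$size = Get-PartitionSupportedSize -DriveLetter E |
| | +Resize-Partition -DriveLetter E -Size $size.SizeMax |
| | +``` |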
| 202 | + |
| 203 | +### MPIO path not detected without server restart |
| 204 | + |
| 205 | +#### Symptoms |
| 206 | + |
| 207 | +- After a cabling, zoning, or storage configuration change, new paths aren't detected until the server is restarted. |
| 208 | + |
| 209 | +#### Resolution |
| 210 | + |
| 211 | +1. Run: |
| 212 | + |
| 213 | + ```powershell |
| 214 | + Update-HostStorageCache |
| 215 | + mpclaim -n -d |
| 216 | + ``` |
| 217 | + |
| 218 | +2. If the commands are unsuccessful, a restart is required. |
| 219 | +3. Make sure that MPIO, DSM, and Storport are up to date. |
| 220 | + |
| 221 | +### Known bug and ICM or product defect scenarios |
| 222 | + |
| 223 | +#### Symptoms |
| 224 | + |
| 225 | +- Persistent failover problems |
| 226 | +- Event 153 or 129, despite all fixes |
| 227 | +- Documented "won’t fix" scenarios in the OS version |
| 228 | + |
| 229 | +#### Resolution |
| 230 | + |
| 231 | +1. Verify all bug and ICM IDs with Microsoft Support or vendor documentation. |
| 232 | +2. Implement documented workarounds. |
| 233 | +3. Upgrade to the latest supported Windows Server build if a fix is available (for example, from Windows Server 2019 to Windows Server 2022 for known MPIO bugs). |
| 234 | + |
| 235 | +## Common issues quick reference table |
| 236 | + |
| 237 | +| Symptom | Root cause | Resolution | |
| 238 | +| --- | --- | --- | |
| 239 | +| Disk missing in MPIO/Disk Mgmt | Zoning, DSM conflict | Check Device Manager, remove ghost devices, reinstall MPIO, restart | |
| 240 | +| Event 153/129/11 | Path loss, driver/firmware | Update firmware/drivers, set MPIO policy, check physical connectivity | |
| 241 | +| Failover delay, slow cluster resource online | Timeout, configuration error | Raise resource timeout, fix quotas/snapshots, update drivers | |
| 242 | +| Duplicate disks/wrong disk order | Bad DSM/configuration | Remove extra DSM, use persistent identifiers (GUID/WWN) | |
| 243 | +| High IO latency, app slow, Event 833 | Driver/antivirus | Exclude storage from AV, update drivers, Perfmon, use 64K clusters | |
| 244 | +| Disk GUID duplicate, Event 158 | No MPIO/clone VHDs | Enable MPIO, Set-VHD -ResetDiskIdentifier | |
| 245 | +| Cluster disk disappears after expansion | Driver/event miss | Cycle role offline/online, rescan, extend with DiskPart | |
| 246 | +| Paths not detected until restart | Storport configuration bug | Update-HostStorageCache, restart | |
| 247 | +| "Requested resource in use," can't bring disk online | Metadata/disk fail | Use CHKDSK, check logs, engage storage team | |
| 248 | +| MPIO failover bug (Win 2019/2022) | Known defect | Upgrade OS, reference vendor/MS documentation | |
| 249 | + |
| 250 | +## Data collection |
| 251 | + |
| 252 | +When issues persist after basic troubleshooting, use these steps to gather diagnostic data: |
| 253 | + |
| 254 | +- **Procmon (Process Monitor):** Trace "ACCESS DENIED" registry or file system events |
| 255 | +- **PowerShell:** |
| 256 | + - Get-VMNetworkAdapter -VMName \<YourVMName> |
| 257 | + - Import-VM for import verification and error export. |
| 258 | + - Set-VMFirmware -VMName \<VMName> -SystemUUID ([guid]::NewGuid()) for UUID management |
| 259 | +- **System and cluster logs:** |
| 260 | + - Windows Event Viewer: Collect logs from Hyper-V VMMS, cluster services, and storage |
| 261 | + - Cluster validation reports |
| 262 | +- **Registry Editor:** Audit permissions under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization\Worker |
| 263 | +- **Command Line:** |
| 264 | + - sfc /scannow |
| 265 | + - DISM /Online /Cleanup-Image /RestoreHealth |
| 266 | + - bcdedit /set hypervisorlaunchtype auto |
| 267 | + - gpupdate /force |
| 268 | +- **BIOS/UEFI:** screenshots or settings exports |
| 269 | +- **Minidump files** after blue screen |
| 270 | +- **Network trace logs** if connectivity issues exist |
| 271 | +- **Exported VM configuration files:** if you're troubleshooting import/export |
| 272 | +- **Driver versions** for storage, network, and security agents |
| 273 | + |
| 274 | +## References |
| 275 | + |
| 276 | +- [Troubleshoot disk issues](/windows-server/administration/windows-commands/diskpart) |
| 277 | +- [Event ID reference](/windows/win32/eventlog/event-identifiers) |