WDDM mode is extremely slower compared to TCC mode and Linux doesn't have this issue #1208
Unanswered. FurkanGozukara asked this question in Q&A. Replies: 1 comment, 1 reply.
(This issue has nothing to do with any packages provided by the CUDA Python team. Moving to discussion for later triage.)
Is this a duplicate?
Type of Bug
Performance
Component
Not sure
Describe the bug
We are training generative AI models.
We have noticed a massive slowdown on Windows compared to Linux when we do large data transfers between RAM and the GPU.
The gap is so large that Linux runs 2x faster than Windows, or even more.
Same GPU on both: RTX 5090
You can read more info here : kohya-ss/musubi-tuner#700
It turns out that if we enable TCC mode on Windows, it reaches the same speed as Linux.
However, NVIDIA blocks this at the driver level.
I found a Chinese article showing that by changing just a few characters, via patching nvlddmkm.sys, TCC mode becomes fully working on consumer GPUs.
Now my question is: why can't we get Linux speed on Windows?
Moreover, it seems Microsoft has added this feature: MCDM
https://learn.microsoft.com/en-us/windows-hardware/drivers/display/mcdm-architecture
Why is it still not available for consumer GPUs?
How can we solve this slowdown on Windows compared to Linux?
Thank you so much
How to Reproduce
Run large data transfers between the GPU and RAM and compare the speed on Windows and Linux
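A minimal sketch of how such a comparison could be timed, assuming PyTorch with CUDA support is installed. The original post does not include a benchmark script, so the function names (`bandwidth_gib_s`, `measure_transfers`), the 1 GiB buffer size, and the repeat count are all hypothetical choices for illustration. It copies one pinned host buffer to the GPU and back and reports the throughput in each direction. A real training workload, such as the one in the linked musubi-tuner issue, may use a different transfer pattern.

```python
import time


def bandwidth_gib_s(num_bytes, seconds):
    """Convert a transfer of num_bytes that took `seconds` into GiB/s."""
    return num_bytes / seconds / 2**30


def measure_transfers(size_mib=1024, repeats=10):
    """Time pinned host<->device copies with PyTorch.

    Returns a dict of GiB/s per direction, or None when PyTorch or a
    CUDA GPU is not available. Sizes and repeats are arbitrary defaults.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    num_bytes = size_mib * 2**20
    # Pinned (page-locked) host memory lets the copy run as a direct DMA transfer.
    host = torch.empty(num_bytes, dtype=torch.uint8, pin_memory=True)
    device = torch.empty(num_bytes, dtype=torch.uint8, device="cuda")

    results = {}
    for name, dst, src in (
        ("host_to_device", device, host),
        ("device_to_host", host, device),
    ):
        dst.copy_(src, non_blocking=True)  # warm-up copy, not timed
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(repeats):
            dst.copy_(src, non_blocking=True)
        torch.cuda.synchronize()  # wait until all queued copies have finished
        elapsed = time.perf_counter() - start
        results[name] = bandwidth_gib_s(num_bytes * repeats, elapsed)
    return results


if __name__ == "__main__":
    print(measure_transfers())
```

Running the same script on the same machine under Windows (WDDM) and under Linux would give two sets of numbers to compare directly.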
Expected behavior
Same speed on Windows as on Linux
Operating System
Windows 11
nvidia-smi output
Microsoft Windows [Version 10.0.26200.7019]
(c) Microsoft Corporation. All rights reserved.
C:\Users\Furkan>nvidia-smi
Sun Nov 2 13:18:29 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.57                 Driver Version: 581.57         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090      WDDM  |   00000000:01:00.0 Off |                  N/A |
|  0%   40C    P8             10W /  575W |     641MiB /  32607MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090 Ti   WDDM  |   00000000:0E:00.0  On |                  Off |
| 30%   52C    P0            102W /  450W |    8847MiB /  24564MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
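The Driver-Model column in the output above can also be read programmatically, which may help when comparing WDDM and TCC runs. The sketch below assumes `nvidia-smi` is on the PATH and uses its `driver_model.current` query field. The helper names (`parse_driver_models`, `query_driver_models`) are hypothetical, and the sample string at the bottom is reconstructed from the table above rather than captured from a real query.

```python
import subprocess

# nvidia-smi query for the active driver model of each GPU (WDDM or TCC on Windows).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,driver_model.current",
    "--format=csv,noheader",
]


def parse_driver_models(csv_text):
    """Parse 'index, name, driver_model' CSV lines into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split off the first and last fields so a comma in the name cannot break parsing.
        index, rest = line.split(",", 1)
        name, model = rest.rsplit(",", 1)
        rows.append(
            {
                "index": int(index),
                "name": name.strip(),
                "driver_model": model.strip(),
            }
        )
    return rows


def query_driver_models():
    """Run nvidia-smi and return parsed rows, or None if the call fails."""
    try:
        out = subprocess.run(
            QUERY, capture_output=True, text=True, check=True
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return parse_driver_models(out)


if __name__ == "__main__":
    # Sample text reconstructed from the table in this post, not real command output.
    sample = "0, NVIDIA GeForce RTX 5090, WDDM\n1, NVIDIA GeForce RTX 3090 Ti, WDDM\n"
    for row in parse_driver_models(sample):
        print(row)
```

The driver model is a Windows concept, so on Linux this field is not expected to report WDDM or TCC.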