Commit c4079dd

leofang, Copilot, and pre-commit-ci[bot] authored
Expand Windows test matrix to reproduce and fix nvbugs 5630448 (#1242)
* Initial plan
* Move install_gpu_driver.ps1 to ci/tools and update call sites
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Update install_gpu_driver.ps1 to support GPU type detection and driver modes
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Make nightly sections empty in ci/test-matrix.json
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Expand Windows test matrix with driver mode support
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Wire driver mode from test-matrix.json into Windows workflow
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Update install_gpu_driver.ps1 to match CCCL implementation with driver mode support
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Simplify driver mode handling per review feedback
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* Use GPU_TYPE env var instead of parsing JOB_RUNNER
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* ensure each GPU kind are tested under two modes
* fix arch coverage
  - we do not have access to rtx6000ada
  - rtxpro6000 is a datacenter card
  - cover WDDM in at least 2 pipelines
* make script more flexible; ensure cover 6 different GPUs, each with 2 different modes
  rtx2080, rtx4090, rtxpro6000, v100, a100, l4 (t4 nodes are too slow)
* Add driver mode verification and change v100 to rtxpro6000 for CUDA 13
  Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
* fix
* merge
  Removed redundant 'Ensure GPU is working' step and kept the driver mode verification.
* ensure using CTK 12.x with V100 + driver mode check can fail
* fix syntax
* avoid testing Quadro + WDDM; make driver mode show up in pipeline names
* add missing `test-cu12-ft` dep group
* fix VMM on Windows
* [pre-commit.ci] auto code formatting
* RTX cards cannot run MCDM, switch back to L4 for now
  Updated GPU configurations for Python versions 3.13 and 3.14.
* fix silly typo
* fix stupid negation

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 1e9e360 commit c4079dd

8 files changed

+160
-104
lines changed

.github/workflows/install_gpu_driver.ps1

Lines changed: 0 additions & 35 deletions
This file was deleted.

.github/workflows/test-wheel-linux.yml

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ jobs:
           echo "MATRIX=${MATRIX}" | tee --append "${GITHUB_OUTPUT}"
 
   test:
-    name: py${{ matrix.PY_VER }}, ${{ matrix.CUDA_VER }}, ${{ (matrix.LOCAL_CTK == '1' && 'local') || 'wheels' }}, GPU ${{ matrix.GPU }}
+    name: py${{ matrix.PY_VER }}, ${{ matrix.CUDA_VER }}, ${{ (matrix.LOCAL_CTK == '1' && 'local') || 'wheels' }}, ${{ matrix.GPU }}
     needs: compute-matrix
     strategy:
       fail-fast: false

.github/workflows/test-wheel-windows.yml

Lines changed: 15 additions & 3 deletions
@@ -63,7 +63,7 @@ jobs:
           echo "MATRIX=${MATRIX}" | tee --append "${GITHUB_OUTPUT}"
 
   test:
-    name: py${{ matrix.PY_VER }}, ${{ matrix.CUDA_VER }}, ${{ (matrix.LOCAL_CTK == '1' && 'local') || 'wheels' }}, GPU ${{ matrix.GPU }}
+    name: py${{ matrix.PY_VER }}, ${{ matrix.CUDA_VER }}, ${{ (matrix.LOCAL_CTK == '1' && 'local') || 'wheels' }}, ${{ matrix.GPU }} (${{ matrix.DRIVER_MODE }})
     # The build stage could fail but we want the CI to keep moving.
     needs: compute-matrix
     strategy:
@@ -80,11 +80,23 @@ jobs:
         continue-on-error: true
 
       - name: Update driver
+        env:
+          DRIVER_MODE: ${{ matrix.DRIVER_MODE }}
+          GPU_TYPE: ${{ matrix.GPU }}
         run: |
-          .github/workflows/install_gpu_driver.ps1
+          ci/tools/install_gpu_driver.ps1
 
       - name: Ensure GPU is working
-        run: nvidia-smi
+        run: |
+          nvidia-smi
+
+          $mode_output = nvidia-smi | Select-String -Pattern "${{ matrix.DRIVER_MODE }}"
+          Write-Output "Driver mode check: $mode_output"
+          if ("$mode_output" -eq "") {
+            Write-Error "Switching to driver mode ${{ matrix.DRIVER_MODE }} failed!"
+            exit 1
+          }
+          Write-Output "Driver mode verified: ${{ matrix.DRIVER_MODE }}"
 
       - name: Set environment variables
         env:
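
The "Ensure GPU is working" step in this workflow fails the job when no line of the `nvidia-smi` output matches the requested driver mode. The same check can be sketched in Python for illustration. `find_mode_lines` and `SAMPLE_OUTPUT` are hypothetical names, and the sample text is invented for the example rather than copied from a real `nvidia-smi` run. The regular-expression, case-insensitive match mirrors the default behavior of `Select-String -Pattern`.

```python
import re

# Invented, heavily abbreviated stand-in for nvidia-smi output (an assumption,
# not real tool output); only the presence of the mode string matters here.
SAMPLE_OUTPUT = "| GPU  Name   Driver-Model |\n|   0  NVIDIA L4   TCC |\n"


def find_mode_lines(smi_output: str, expected_mode: str) -> list[str]:
    """Return every line that matches the expected driver mode.

    Like Select-String -Pattern, the mode is treated as a regular
    expression and matched case-insensitively.
    """
    pattern = re.compile(expected_mode, re.IGNORECASE)
    return [line for line in smi_output.splitlines() if pattern.search(line)]


def verify_driver_mode(smi_output: str, expected_mode: str) -> None:
    """Raise if the expected mode does not appear anywhere in the output."""
    if not find_mode_lines(smi_output, expected_mode):
        raise RuntimeError(f"Switching to driver mode {expected_mode} failed!")


if __name__ == "__main__":
    verify_driver_mode(SAMPLE_OUTPUT, "TCC")
    print("Driver mode verified: TCC")
```

Because this is a plain text match over the whole output, it passes whenever the mode string appears on any line, not only in the driver-model column.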

ci/test-matrix.json

Lines changed: 15 additions & 57 deletions
@@ -1,6 +1,6 @@
 {
   "_description": "Test matrix configurations for CUDA Python CI workflows. This file consolidates the test matrices that were previously hardcoded in the workflow files. All GPU and ARCH values are hard-coded for each architecture: l4 GPU for amd64, a100 GPU for arm64.",
-  "_sorted_by": "Please keep matrices sorted in ascending order by [ARCH, PY_VER, CUDA_VER, LOCAL_CTK, GPU, DRIVER]",
+  "_sorted_by": "Please keep matrices sorted in ascending order by [ARCH, PY_VER, CUDA_VER, LOCAL_CTK, GPU, DRIVER]. Windows entries also include DRIVER_MODE.",
   "_notes": "DRIVER: 'earliest' does not work with CUDA 12.9.1 and LOCAL_CTK: 0 does not work with CUDA 12.0.1",
   "linux": {
     "pull-request": [
@@ -25,48 +25,7 @@
       { "ARCH": "arm64", "PY_VER": "3.14", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
       { "ARCH": "arm64", "PY_VER": "3.14t", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" }
     ],
-    "nightly": [
-      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "earliest" },
-      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "earliest" },
-      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "earliest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "earliest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "earliest" },
-      { "ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.11", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "earliest" },
-      { "ARCH": "arm64", "PY_VER": "3.11", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.11", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.11", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.11", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "earliest" },
-      { "ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.13", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "earliest" },
-      { "ARCH": "arm64", "PY_VER": "3.13", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.13", "CUDA_VER": "12.0.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.13", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "latest" },
-      { "ARCH": "arm64", "PY_VER": "3.13", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest" }
-    ],
+    "nightly": [],
     "special_runners": {
       "amd64": [
         { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "H100", "DRIVER": "latest" }
@@ -75,20 +34,19 @@
   },
   "windows": {
     "pull-request": [
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.14", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.14", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.14t", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.14t", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" }
+      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "rtx2080", "DRIVER": "latest", "DRIVER_MODE": "WDDM" },
+      { "ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "rtxpro6000", "DRIVER": "latest", "DRIVER_MODE": "TCC" },
+      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "v100", "DRIVER": "latest", "DRIVER_MODE": "MCDM" },
+      { "ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "rtx4090", "DRIVER": "latest", "DRIVER_MODE": "WDDM" },
+      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest", "DRIVER_MODE": "MCDM" },
+      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "a100", "DRIVER": "latest", "DRIVER_MODE": "TCC" },
+      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest", "DRIVER_MODE": "TCC" },
+      { "ARCH": "amd64", "PY_VER": "3.13", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "rtxpro6000", "DRIVER": "latest", "DRIVER_MODE": "MCDM" },
+      { "ARCH": "amd64", "PY_VER": "3.14", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "v100", "DRIVER": "latest", "DRIVER_MODE": "TCC" },
+      { "ARCH": "amd64", "PY_VER": "3.14", "CUDA_VER": "13.0.2", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest", "DRIVER_MODE": "MCDM" },
+      { "ARCH": "amd64", "PY_VER": "3.14t", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest", "DRIVER_MODE": "TCC" },
+      { "ARCH": "amd64", "PY_VER": "3.14t", "CUDA_VER": "13.0.2", "LOCAL_CTK": "0", "GPU": "a100", "DRIVER": "latest", "DRIVER_MODE": "MCDM" }
     ],
-    "nightly": [
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "0", "GPU": "l4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0", "LOCAL_CTK": "1", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "0", "GPU": "t4", "DRIVER": "latest" },
-      { "ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.9.1", "LOCAL_CTK": "1", "GPU": "l4", "DRIVER": "latest" }
-    ]
+    "nightly": []
   }
 }
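
The new Windows pull-request matrix can be checked mechanically for the two properties the file asks for: the documented sort order and a known `DRIVER_MODE` on every entry. The following is a minimal Python sketch, assuming the entries are reduced to `(PY_VER, CUDA_VER, LOCAL_CTK, GPU, DRIVER_MODE)` tuples (ARCH and DRIVER are constant in this section). `modes_by_gpu` and `is_sorted` are hypothetical helpers, not part of the repository.

```python
from collections import defaultdict

ALLOWED_MODES = {"WDDM", "TCC", "MCDM"}

# (PY_VER, CUDA_VER, LOCAL_CTK, GPU, DRIVER_MODE), copied from the new
# "windows" / "pull-request" entries in ci/test-matrix.json.
ENTRIES = [
    ("3.10", "12.9.1", "0", "rtx2080", "WDDM"),
    ("3.10", "13.0.2", "1", "rtxpro6000", "TCC"),
    ("3.11", "12.9.1", "1", "v100", "MCDM"),
    ("3.11", "13.0.2", "0", "rtx4090", "WDDM"),
    ("3.12", "12.9.1", "0", "l4", "MCDM"),
    ("3.12", "13.0.2", "1", "a100", "TCC"),
    ("3.13", "12.9.1", "1", "l4", "TCC"),
    ("3.13", "13.0.2", "0", "rtxpro6000", "MCDM"),
    ("3.14", "12.9.1", "0", "v100", "TCC"),
    ("3.14", "13.0.2", "1", "l4", "MCDM"),
    ("3.14t", "12.9.1", "1", "l4", "TCC"),
    ("3.14t", "13.0.2", "0", "a100", "MCDM"),
]


def modes_by_gpu(entries):
    """Map each GPU to the set of driver modes it is tested under."""
    coverage = defaultdict(set)
    for _py, _cuda, _ctk, gpu, mode in entries:
        if mode not in ALLOWED_MODES:
            raise ValueError(f"Unknown driver mode: {mode}")
        coverage[gpu].add(mode)
    return dict(coverage)


def is_sorted(entries):
    """Check ascending order by (PY_VER, CUDA_VER, LOCAL_CTK, GPU).

    Plain string comparison is an assumption of this sketch; it happens
    to be sufficient for the version strings used here.
    """
    keys = [entry[:4] for entry in entries]
    return keys == sorted(keys)


if __name__ == "__main__":
    for gpu, modes in sorted(modes_by_gpu(ENTRIES).items()):
        print(gpu, sorted(modes))
    print("sorted:", is_sorted(ENTRIES))
```

Run against these entries, the coverage map shows `rtx2080` and `rtx4090` under WDDM only, while `a100`, `l4`, `rtxpro6000`, and `v100` each appear under both TCC and MCDM.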

ci/tools/install_gpu_driver.ps1

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+#
+# SPDX-License-Identifier: Apache-2.0
+
+# Install the driver
+function Install-Driver {
+
+    # Set the correct URL, filename, and arguments to the installer
+    # This driver is picked to support Windows 11 & CUDA 13.0
+    $version = '581.15'
+
+    # Get GPU type from environment variable
+    $gpu_type = $env:GPU_TYPE
+
+    $data_center_gpus = @('a100', 'h100', 'l4', 't4', 'v100', 'rtxa6000', 'rtx6000ada')
+    $desktop_gpus = @('rtx2080', 'rtx4090', 'rtxpro6000')
+
+    if ($data_center_gpus -contains $gpu_type) {
+        Write-Output "Data center GPU detected: $gpu_type"
+        $filename="$version-data-center-tesla-desktop-winserver-2022-2025-dch-international.exe"
+        $server_path="tesla/$version"
+    } elseif ($desktop_gpus -contains $gpu_type) {
+        Write-Output "Desktop GPU detected: $gpu_type"
+        $filename="$version-desktop-win10-win11-64bit-international-dch-whql.exe"
+        $server_path="Windows/$version"
+    } else {
+        Write-Output "Unknown GPU type: $gpu_type"
+        exit 1
+    }
+
+    $url="https://us.download.nvidia.com/$server_path/$filename"
+    $filepath="C:\NVIDIA-Driver\$filename"
+
+    Write-Output "Installing NVIDIA driver version $version for GPU type $gpu_type"
+    Write-Output "Download URL: $url"
+
+    # Silent install arguments
+    $install_args = '/s /noeula /noreboot';
+
+    # Create the folder for the driver download
+    if (!(Test-Path -Path 'C:\NVIDIA-Driver')) {
+        New-Item -Path 'C:\' -Name 'NVIDIA-Driver' -ItemType 'directory' | Out-Null
+    }
+
+    # Download the file to a specified directory
+    # Disabling progress bar due to https://github.com/GoogleCloudPlatform/compute-gpu-installation/issues/29
+    $ProgressPreference_tmp = $ProgressPreference
+    $ProgressPreference = 'SilentlyContinue'
+    Write-Output 'Downloading the driver installer...'
+    Invoke-WebRequest $url -OutFile $filepath
+    $ProgressPreference = $ProgressPreference_tmp
+    Write-Output 'Download complete!'
+
+    # Install the file with the specified path from earlier
+    Write-Output 'Running the driver installer...'
+    Start-Process -FilePath $filepath -ArgumentList $install_args -Wait
+    Write-Output 'Done!'
+
+    # Handle driver mode configuration
+    # This assumes we have the prior knowledge on which GPU can use which mode.
+    $driver_mode = $env:DRIVER_MODE
+    if ($driver_mode -eq "WDDM") {
+        Write-Output "Setting driver mode to WDDM..."
+        nvidia-smi -fdm 0
+    } elseif ($driver_mode -eq "TCC") {
+        Write-Output "Setting driver mode to TCC..."
+        nvidia-smi -fdm 1
+    } elseif ($driver_mode -eq "MCDM") {
+        Write-Output "Setting driver mode to MCDM..."
+        nvidia-smi -fdm 2
+    } else {
+        Write-Output "Unknown driver mode: $driver_mode"
+        exit 1
+    }
+    pnputil /disable-device /class Display
+    pnputil /enable-device /class Display
+    # Give it a minute to settle:
+    Start-Sleep -Seconds 5
+}
+
+# Run the functions
+Install-Driver
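
The two decisions the script makes, which installer to download for a given `GPU_TYPE` and which `nvidia-smi -fdm` value to pass for a given `DRIVER_MODE`, can be restated as a pair of lookup functions. This is a Python transliteration for illustration only, not code from the repository; `driver_url` and `fdm_command` are hypothetical names. The GPU lists, file names, URL layout, and mode numbers are copied from the script above.

```python
DATA_CENTER_GPUS = {"a100", "h100", "l4", "t4", "v100", "rtxa6000", "rtx6000ada"}
DESKTOP_GPUS = {"rtx2080", "rtx4090", "rtxpro6000"}

# Value passed to `nvidia-smi -fdm` for each driver mode, as in the script.
FDM_VALUE = {"WDDM": 0, "TCC": 1, "MCDM": 2}


def driver_url(gpu_type: str, version: str = "581.15") -> str:
    """Build the installer download URL for a GPU type."""
    # PowerShell's -contains compares strings case-insensitively by default,
    # so the input is lower-cased to approximate that behavior.
    gpu = gpu_type.lower()
    if gpu in DATA_CENTER_GPUS:
        filename = f"{version}-data-center-tesla-desktop-winserver-2022-2025-dch-international.exe"
        server_path = f"tesla/{version}"
    elif gpu in DESKTOP_GPUS:
        filename = f"{version}-desktop-win10-win11-64bit-international-dch-whql.exe"
        server_path = f"Windows/{version}"
    else:
        raise ValueError(f"Unknown GPU type: {gpu_type}")
    return f"https://us.download.nvidia.com/{server_path}/{filename}"


def fdm_command(driver_mode: str) -> list[str]:
    """Return the nvidia-smi invocation the script uses for a driver mode."""
    if driver_mode not in FDM_VALUE:
        raise ValueError(f"Unknown driver mode: {driver_mode}")
    return ["nvidia-smi", "-fdm", str(FDM_VALUE[driver_mode])]


if __name__ == "__main__":
    print(driver_url("l4"))
    print(" ".join(fdm_command("MCDM")))
```

Keeping both mappings as data makes it easy to see that an unknown GPU type or driver mode is an error in either case, which matches the script's `exit 1` branches.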

cuda_core/cuda/core/experimental/_memory/_virtual_memory_resource.py

Lines changed: 3 additions & 1 deletion
@@ -70,6 +70,7 @@ class VirtualMemoryResourceOptions:
     peers: Iterable[int] = field(default_factory=tuple)
     self_access: VirtualMemoryAccessTypeT = "rw"
     peer_access: VirtualMemoryAccessTypeT = "rw"
+    win32_handle_metadata: int | None = 0
 
     _a = driver.CUmemAccess_flags
     _access_flags = {"rw": _a.CU_MEM_ACCESS_FLAGS_PROT_READWRITE, "r": _a.CU_MEM_ACCESS_FLAGS_PROT_READ, None: 0}
@@ -212,6 +213,7 @@ def modify_allocation(self, buf: Buffer, new_size: int, config: VirtualMemoryRes
         prop.location.id = self.device.device_id
         prop.allocFlags.gpuDirectRDMACapable = 1 if self.config.gpu_direct_rdma else 0
         prop.requestedHandleTypes = VirtualMemoryResourceOptions._handle_type_to_driver(self.config.handle_type)
+        prop.win32HandleMetaData = self.config.win32_handle_metadata if self.config.win32_handle_metadata else 0
 
         # Query granularity
         gran_flag = VirtualMemoryResourceOptions._granularity_to_driver(self.config.granularity)
@@ -495,11 +497,11 @@ def allocate(self, size: int, stream: Stream = None) -> Buffer:
         # ---- Build allocation properties ----
         prop = driver.CUmemAllocationProp()
         prop.type = VirtualMemoryResourceOptions._allocation_type_to_driver(config.allocation_type)
-
         prop.location.type = VirtualMemoryResourceOptions._location_type_to_driver(config.location_type)
         prop.location.id = self.device.device_id if config.location_type == "device" else -1
         prop.allocFlags.gpuDirectRDMACapable = 1 if config.gpu_direct_rdma else 0
         prop.requestedHandleTypes = VirtualMemoryResourceOptions._handle_type_to_driver(config.handle_type)
+        prop.win32HandleMetaData = self.config.win32_handle_metadata if self.config.win32_handle_metadata else 0
 
         # ---- Query and apply granularity ----
         # Choose min vs recommended granularity per config
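
The new `win32_handle_metadata` option accepts either an integer or `None`, and both added assignments collapse any falsy value to `0` before writing it into the allocation properties. Here is a minimal, self-contained sketch of that normalization. `OptionsSketch` and `handle_metadata_value` are hypothetical stand-ins that model only this one field; they are not the real `VirtualMemoryResourceOptions` or any CUDA driver type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptionsSketch:
    """Hypothetical stand-in modelling only the new option field."""

    win32_handle_metadata: Optional[int] = 0


def handle_metadata_value(options: OptionsSketch) -> int:
    """Mirror the expression used in the diff: None (or 0) becomes 0."""
    return options.win32_handle_metadata if options.win32_handle_metadata else 0


if __name__ == "__main__":
    print(handle_metadata_value(OptionsSketch()))                               # default
    print(handle_metadata_value(OptionsSketch(win32_handle_metadata=None)))    # explicit None
    print(handle_metadata_value(OptionsSketch(win32_handle_metadata=0x1234)))  # explicit value
```

With this expression the value written to the property is always an integer, whether the caller leaves the default, passes `None`, or supplies a handle value.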

cuda_core/pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ test-cu12 = ["cuda-core[test]", "cupy-cuda12x; python_version < '3.14'", "cuda-t
 test-cu13 = ["cuda-core[test]", "cupy-cuda13x; python_version < '3.14'", "cuda-toolkit[cudart]==13.*"]  # runtime headers needed by CuPy
 # free threaded build, cupy doesn't support free-threaded builds yet, so avoid installing it for now
 # TODO: cupy should support free threaded builds
+test-cu12-ft = ["cuda-core[test]", "cuda-toolkit[cudart]==12.*"]
 test-cu13-ft = ["cuda-core[test]", "cuda-toolkit[cudart]==13.*"]
 
 [project.urls]
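
The new `test-cu12-ft` group matters because the Windows matrix now pairs the free-threaded `3.14t` interpreter with CUDA 12.9.1. How the workflows actually pick a dependency group is not shown in this diff, so the helper below is a hypothetical sketch of one plausible naming rule, assuming the group name is derived from the CUDA major version plus an `-ft` suffix for free-threaded Python versions.

```python
def dependency_group(py_ver: str, cuda_ver: str) -> str:
    """Hypothetical helper: derive a dependency group name from a matrix entry.

    Assumes free-threaded interpreters are marked by a trailing "t" in
    PY_VER, as in the "3.14t" entries of ci/test-matrix.json.
    """
    cuda_major = cuda_ver.split(".")[0]
    suffix = "-ft" if py_ver.endswith("t") else ""
    return f"test-cu{cuda_major}{suffix}"


if __name__ == "__main__":
    print(dependency_group("3.14t", "12.9.1"))
    print(dependency_group("3.13", "13.0.2"))
```

Under this assumed rule, a `3.14t` entry with CUDA 12.9.1 resolves to `test-cu12-ft`, the group this commit adds to `pyproject.toml`.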
