Skip to content

Conversation

@baszalmstra
Copy link
Collaborator

This PR implements the __cuda_arch virtual package based on the nvidia-virtual-packages reference implementation. The package detects the minimum CUDA compute capability (SM version) across all CUDA devices on the system, enabling hardware-specific package constraints and multi-variant builds.

@carterbox I wanted to get a head start with this to also make it available in rattler-build so we can experiment with this more easily.

AI Disclosure

Written by Claude Sonnet 4.5, edited and reviewed by me.

@baszalmstra baszalmstra requested a review from wolfv November 20, 2025 12:39
@baszalmstra baszalmstra force-pushed the claude/add-virtual-package-01Aeo9zN7n665RzTuY9tXQhD branch 3 times, most recently from 1aadbac to 80d7361 Compare November 20, 2025 12:50
@baszalmstra baszalmstra marked this pull request as ready for review November 20, 2025 13:55
//! NVIDIA drivers. This is detected via:
//!
//! * CUDA driver library (libcuda) - Standard method
//! * nvidia-smi command - Fallback on musl systems where dynamic library loading is not supported

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!? I am not familiar with musl libc. Is this a platform restriction or a rust restriction?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

musl is an alternative libc implementation that doesnt allow dynamic library loading.

///
/// This corresponds to the `__cuda_arch` virtual package. Returns `None` if
/// no CUDA devices are detected or if device enumeration fails.
pub arch_info: Option<CudaArchInfo>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to have a discussion about which default values make sense. I currently use __cuda_arch=0=None as the fallback value, which you should probably match here.

However, I haven't determined what the correct behavior is if the system has no devices. Use the default values or just don't expose the virtual package?

Also, CEP-26 requires that you print a warning if default/fallback values are used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, Ill implement that as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, __cuda is not present if there are no CUDA drivers. The same goes for __win, __unix, etc. What would be the advantage of always exposing the virtual package? Knowing that the installer has the capability of exposing it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure of the existing behavior of __cuda if the driver is not detected. While I think there is value in knowing that the installer has the capability of detecting a virtual package but didn't find anything. I think it's better that we follow existing conventions and hide the virtual package when a device is not detected or fails to detect.

I suppose there could be a warning message to notify the user that the installer is capable of __cuda and __cuda_arch, but didn't detect anything? OR maybe that would be annoying to users without NVIDIA hardware.

@baszalmstra
Copy link
Collaborator Author

Thanks for taking a look at this @carterbox I highly appreciate it.

Ill try to follow your implementation closely.

@baszalmstra baszalmstra force-pushed the claude/add-virtual-package-01Aeo9zN7n665RzTuY9tXQhD branch 2 times, most recently from df0852a to f1e9534 Compare November 20, 2025 21:23
…ection

This commit implements the __cuda_arch virtual package based on the
nvidia-virtual-packages reference implementation. The package detects
the minimum CUDA compute capability across all detected CUDA devices.

Implementation Details:
- Unified CUDA detection that queries both driver version and compute
  capability in a single library load for efficiency
- Uses CUDA Driver API (cuDeviceGetCount, cuDeviceGetAttribute,
  cuDeviceGetName) to enumerate devices
- Returns minimum compute capability to ensure package compatibility
  across multi-GPU systems
- Proper C string handling with CStr for device names
- Independent failure modes (version can exist without devices)
- Cached detection for performance

Virtual Package Integration:
- Added CudaArch struct with version and device_name fields
- Implements EnvOverride trait with CONDA_OVERRIDE_CUDA_ARCH support
- Supports both 'version' and 'version=build' override formats
  (e.g., '8.6' or '8.6=NVIDIA GeForce RTX 3090')
- Device name used as build string to identify which GPU determined
  the minimum compute capability
- Full integration with VirtualPackage enum and VirtualPackages struct

Rationale:
CUDA compute capability determines which instruction sets and features
are available on a GPU. This virtual package enables hardware-specific
constraints and multi-variant builds optimized for different GPU
architectures while ensuring packages work on all GPUs in a system.

Testing:
- All existing tests pass
- Added comprehensive tests for override parsing
- Test both version-only and version=build formats
@baszalmstra baszalmstra force-pushed the claude/add-virtual-package-01Aeo9zN7n665RzTuY9tXQhD branch from f1e9534 to 68d6f14 Compare November 20, 2025 21:27
Update character validation and NVIDIA removal to match the stricter
approach from conda-incubator/nvidia-virtual-packages PR #5:

- Change `is_valid_build_string_char()` to allow only alphanumeric
  characters (a-z, A-Z, 0-9), removing support for underscore,
  period, and plus
- Update `sanitize_device_name()` to remove "NVIDIA" (case-insensitive)
  anywhere in the string, not just as a prefix
- Add `remove_nvidia_ci()` helper function for removing all occurrences
  of NVIDIA from a string
- Remove now-unused `strip_prefix_ci()` function
- Update all related tests to match new stricter behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants