Commit 93606e9

Add target_feature support for compute_*
This lets us gate code to virtual architectures at compile time using `cfg()`.
1 parent 3b646e6 commit 93606e9

8 files changed: +1210 -6 lines changed

crates/cuda_builder/src/lib.rs

Lines changed: 22 additions & 0 deletions
@@ -93,6 +93,23 @@ pub struct CudaBuilder {
     /// the GTX 1030, GTX 1050, GTX 1080, Tesla P40, etc. We default to this because
     /// Maxwell (5.x) will be deprecated in CUDA 12 and we anticipate for that. Moreover,
     /// `6.x` contains support for things like f64 atomic add and half precision float ops.
+    ///
+    /// ## Target Features for Conditional Compilation
+    ///
+    /// The chosen architecture enables a target feature that can be used for
+    /// conditional compilation with `#[cfg(target_feature = "compute_XX")]`.
+    /// This feature means "at least this capability", matching NVIDIA's semantics.
+    ///
+    /// For other patterns (exact ranges, maximum capabilities), use boolean `cfg` logic.
+    /// See the compute capabilities guide for examples.
+    ///
+    /// For example, with `.arch(NvvmArch::Compute61)`:
+    /// ```ignore
+    /// #[cfg(target_feature = "compute_61")]
+    /// {
+    ///     // Code that requires compute capability 6.1+
+    /// }
+    /// ```
     pub arch: NvvmArch,
     /// Flush denormal values to zero when performing single-precision floating point operations.
     /// `false` by default.
@@ -229,6 +246,11 @@ impl CudaBuilder {
     /// NOTE that this does not necessarily mean that code using a certain capability
     /// will not work on older capabilities. It means that if it uses certain
     /// features it may not work.
+    ///
+    /// ## Target Features for Conditional Compilation
+    ///
+    /// The chosen architecture enables target features for conditional compilation.
+    /// See the documentation on the `arch` field for more details.
     pub fn arch(mut self, arch: NvvmArch) -> Self {
         self.arch = arch;
         self
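
The added doc comment says that patterns other than "at least this capability" are expressed with boolean `cfg` logic. A minimal sketch of what that could look like follows. It assumes the `compute_XX` features are cumulative (selecting `compute_61` also satisfies `compute_60`), which is how the "at least" wording reads. The function names and tier labels are hypothetical and are not part of this commit.

```rust
/// Reports which branch was selected at compile time.
/// The `compute_*` feature names follow the doc comment in the diff above;
/// the returned labels are illustrative only.
pub fn capability_tier() -> &'static str {
    if cfg!(target_feature = "compute_70") {
        // Minimum pattern: compute capability 7.0 or newer.
        "7.0 or newer"
    } else if cfg!(all(
        target_feature = "compute_60",
        not(target_feature = "compute_70")
    )) {
        // Range pattern: at least 6.0 but below 7.0.
        "6.x"
    } else {
        // None of the checked features is enabled. This is also the branch
        // taken on a non-CUDA (host) build.
        "below 6.0"
    }
}

/// Maximum pattern: this item exists only when the target is below 7.0.
#[cfg(not(target_feature = "compute_70"))]
pub fn max_capability_fallback() -> &'static str {
    "compiled for targets below 7.0"
}
```

On an ordinary host build none of the `compute_*` features is set, so `capability_tier()` returns `"below 6.0"` and `max_capability_fallback` is compiled in. Recent toolchains may emit an `unexpected_cfgs` warning for feature names they do not recognize on the host target.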
