Always build with JIT+LTO #1923

Merged
rapids-bot[bot] merged 28 commits into rapidsai:main from KyleFromNVIDIA:jit-lto-cuda-12 on Mar 25, 2026

@KyleFromNVIDIA
Member

Since #1909, we no longer rely on `cudaLibraryEnumerateKernels()`, so we can use older versions of the CUDA driver. Since #1918, we've been using static cudart, which bundles the runtime library API with cuvs and lets us run on platforms that have a CUDA version older than 12.8 installed. Always build with JIT+LTO so that we get the full compile-time and binary-size benefits on CUDA 12 too.
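
To check whether a given machine falls into the "older than 12.8" case, the installed version can be compared against that threshold. The sketch below assumes the encoded integer comes from `cudaRuntimeGetVersion()` or `cuDriverGetVersion()`, which report versions as `1000 * major + 10 * minor`. The struct and helper names are hypothetical and are not part of cuVS.

```cpp
// Hypothetical helpers for illustration; not part of cuVS or the CUDA toolkit.
struct cuda_version {
  int major_ver;
  int minor_ver;
};

// CUDA encodes versions as 1000 * major + 10 * minor (12.8 is 12080).
constexpr cuda_version decode_cuda_version(int encoded)
{
  return cuda_version{encoded / 1000, (encoded % 1000) / 10};
}

// True when `v` is strictly older than major.minor.
constexpr bool is_older_than(cuda_version v, int major_ver, int minor_ver)
{
  return v.major_ver < major_ver ||
         (v.major_ver == major_ver && v.minor_ver < minor_ver);
}

// Compile-time checks of the encoding described above.
static_assert(decode_cuda_version(12080).major_ver == 12, "major of 12.8");
static_assert(decode_cuda_version(12080).minor_ver == 8, "minor of 12.8");
static_assert(is_older_than(decode_cuda_version(12040), 12, 8), "12.4 < 12.8");
```

In a real program the encoded value would come from one of the two CUDA calls named above; the comparison itself needs no CUDA headers.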

KyleFromNVIDIA requested review from a team as code owners on Mar 16, 2026, 20:16
KyleFromNVIDIA requested a review from msarahan on Mar 16, 2026, 20:16
KyleFromNVIDIA added the breaking (Introduces a breaking change) and improvement (Improves an existing functionality) labels on Mar 16, 2026
KyleFromNVIDIA changed the base branch from main to release/26.04 on Mar 16, 2026, 20:17
@KyleFromNVIDIA
Member Author

KyleFromNVIDIA commented Mar 16, 2026

It seems that even after #1918, we're still not using cudart_static, since rmm forces us to use the shared version:

./../../..//bin/gtests/libcuvs/STATS_TEST: symbol lookup error: /opt/conda/envs/test/bin/gtests/libcuvs/../../../lib/libcuvs.so: undefined symbol: cudaLibraryGetKernel, version libcudart.so.12

We have to get rmm on cudart_static.

@KyleFromNVIDIA
Member Author

We've decided to switch to the driver API instead, since rmm is blocked on rapidsai/cudf#20814, which in turn is also blocked.

KyleFromNVIDIA requested a review from a team as a code owner on Mar 16, 2026, 23:26
KyleFromNVIDIA requested a review from a team as a code owner on Mar 17, 2026, 00:05
KyleFromNVIDIA requested review from a team as code owners on Mar 23, 2026, 13:54
Member Author

@KyleFromNVIDIA KyleFromNVIDIA left a comment

Remove the debug statement when finished

@KyleFromNVIDIA
Member Author

Blocked on #1936

@gforsyth
Contributor

Just merged #1936 in

@KyleFromNVIDIA
Member Author

/merge

rapids-bot[bot] merged commit 4574fe3 into rapidsai:main on Mar 25, 2026
223 of 228 checks passed
github-project-automation[bot] moved this from In Progress to Done in Unstructured Data Processing on Mar 25, 2026
jrbourbeau pushed a commit to jrbourbeau/cuvs that referenced this pull request Mar 25, 2026
Since rapidsai#1909, we've been able to use older versions of the CUDA driver, since we no longer rely on `cudaLibraryEnumerateKernels()`. Since rapidsai#1918, we've been using static cudart, which allows us to run on platforms with versions of CUDA older than 12.8 installed, since the runtime library API is now bundled with cuvs. Always build with JIT+LTO so that we can get the full compile time and binary size benefits in CUDA 12 too.

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Ben Frederickson (https://github.com/benfred)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#1923
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Mar 31, 2026
rapidsai/cuvs#1923 switched cuVS to always enable JIT+LTO, which means it now requires nvJitLink at both build time and runtime.

Previously, it had only been required for CUDA 13.

cuGraph pulls cuVS via CPM and builds it from source, but wasn't updated to match, resulting in CUDA 12 builds here failing like this:

```text
 │ │ CMake Error at build/_deps/cuvs-src/cpp/CMakeLists.txt:754 (target_link_libraries):
 │ │   Target "cuvs" links to:
 │ │     CUDA::nvJitLink
 │ │   but the target was not found.  Possible reasons include:
 │ │     * There is a typo in the target name.
 │ │     * A find_package call is missing for an IMPORTED target.
 │ │     * An ALIAS target is missing.
 │ │ CMake Error at build/_deps/cuvs-src/cpp/CMakeLists.txt:812 (target_link_libraries):
 │ │   Target "cuvs_static" links to:
 │ │     CUDA::nvJitLink
 │ │   but the target was not found.  Possible reasons include:
 │ │     * There is a typo in the target name.
 │ │     * A find_package call is missing for an IMPORTED target.
 │ │     * An ALIAS target is missing.
```

([example build link](https://github.com/rapidsai/cugraph/actions/runs/23767677532/job/69251605108?pr=5469#step:11:1108))

This PR resolves that by removing the "if CUDA 13"-style guards around this project's libnvJitLink dependency.

## Notes for Reviewers

### Some builds may still fail here

Like this:

```text
  /__w/cugraph/cugraph/python/libcugraph/build/py3-none-linux_aarch64/_deps/cccl-src/lib/cmake/cub/../../../cub/cub/device/dispatch/kernels/kernel_scan_warpspeed.cuh(455): error: more than one instance of function template "cub::detail::scan::warpReduce" matches the argument list:
              function template "T raft::warpReduce(T, ReduceLambda)" (declared at line 49 of /pyenv/versions/3.14.3/lib/python3.14/site-packages/libraft/include/raft/util/reduction.cuh)
              function template "Tp cub::detail::scan::warpReduce(Tp, ScanOpT &)" (declared at line 247)
              argument types are: (unsigned int, raft::add_op)
     regWarpSum = warpReduce(regThreadSum, scan_op);
```

These failures are expected until rapidsai/cuvs#1954 is resolved; @viclafargue is working on that in rapidsai/cuvs#1963.
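
One plausible reading of the error above is an argument-dependent lookup (ADL) collision: the unqualified `warpReduce(...)` call inside `cub::detail::scan` receives a `raft::add_op` argument, so the compiler also considers `raft::warpReduce`, and neither template is a better match. The sketch below reproduces that pattern with hypothetical stand-in namespaces (`lib_a` for raft, `lib_b` for `cub::detail::scan`). It is not the actual code from either library, nor the fix proposed in rapidsai/cuvs#1963.

```cpp
namespace lib_a {  // hypothetical stand-in for raft

// Functor defined in lib_a, playing the role of raft::add_op.
struct add_op {
  template <typename T>
  T operator()(T a, T b) const { return a + b; }
};

// Generic helper that shares its name with the one in lib_b.
template <typename T, typename ReduceLambda>
T warp_reduce(T val, ReduceLambda op) { return op(val, val); }

}  // namespace lib_a

namespace lib_b {  // hypothetical stand-in for cub::detail::scan

// Same name, but takes the operator by reference.
template <typename T, typename ScanOp>
T warp_reduce(T val, ScanOp& op) { return op(val, T{1}); }

template <typename T, typename ScanOp>
T scan_step(T val, ScanOp op)
{
  // An unqualified call here, `warp_reduce(val, op)`, does not compile when
  // ScanOp is lib_a::add_op: ADL adds lib_a::warp_reduce to the candidates,
  // and overload resolution cannot rank it against lib_b::warp_reduce.
  // Qualifying the call disables ADL and selects lib_b's version.
  return lib_b::warp_reduce(val, op);
}

}  // namespace lib_b
```

With the qualified call, `lib_b::scan_step(3u, lib_a::add_op{})` compiles and uses `lib_b::warp_reduce`. Qualifying the call site is only one way to avoid the collision; renaming or otherwise hiding one of the helpers from lookup is another.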

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #5479