@abhilash1910
Contributor

Description

Issue Link - #981

Changes to be addressed in this WIP PR:

  • LTO IR testing
  • Is there a way to add multiple modules?
    (If/when it is possible to add multiple modules, a test with code that uses something from libdevice is probably a good idea.
    It is also useful to be able to add a module lazily.)
  • apply bitcode pattern input for libnvvm

cc @leofang

@abhilash1910 abhilash1910 marked this pull request as draft November 5, 2025 02:17
@leofang leofang added this to the cuda.core beta 9 milestone Nov 10, 2025
@leofang leofang added the enhancement (Any code-related improvements), P1 (Medium priority - Should do), and cuda.core (Everything related to the cuda.core module) labels Nov 10, 2025
@leofang
Member

leofang commented Nov 17, 2025

Thanks, @abhilash1910! Any ETA to wrap this up?

@abhilash1910
Contributor Author

pre-commit.ci autofix

@abhilash1910
Contributor Author

pre-commit.ci autofix

@leofang leofang linked an issue Nov 25, 2025 that may be closed by this pull request
Member

@leofang leofang left a comment

Thanks, Abhilash! Leaving some early feedback.

Comment on lines 481 to 487
# TODO: support pre-loaded headers & include names
# TODO: allow tuples once NVIDIA/cuda-python#72 is resolved

if options.extra_sources is not None:
    raise ValueError("extra_sources is not supported by the NVRTC backend (C++ code_type)")

self._mnff.handle = handle_return(nvrtc.nvrtcCreateProgram(code.encode(), options._name, 0, [], []))
Member

nit: organize the comments better

Suggested change
- # TODO: support pre-loaded headers & include names
- # TODO: allow tuples once NVIDIA/cuda-python#72 is resolved
- if options.extra_sources is not None:
-     raise ValueError("extra_sources is not supported by the NVRTC backend (C++ code_type)")
- self._mnff.handle = handle_return(nvrtc.nvrtcCreateProgram(code.encode(), options._name, 0, [], []))
+ # TODO: support pre-loaded headers & include names
+ if options.extra_sources is not None:
+     raise ValueError("extra_sources is not supported by the NVRTC backend (C++ code_type)")
+ # TODO: allow tuples once NVIDIA/cuda-python#72 is resolved
+ self._mnff.handle = handle_return(nvrtc.nvrtcCreateProgram(code.encode(), options._name, 0, [], []))

Comment on lines 538 to 546
bitcode_path = os.environ.get("BITCODE_NVVM_PATH")
if not bitcode_path:
    pytest.skip("BITCODE_NVVM_PATH environment variable is not set.Disabling the test.")
bitcode_file = Path(bitcode_path)
if not bitcode_file.exists():
    pytest.skip(f"Bitcode file not found: {bitcode_path}")

if bitcode_file.suffix != ".bc":
    pytest.skip(f"Expected .bc file, got: {bitcode_file.suffix}")
Member

Would it be possible for us to avoid having a file locally? We have bitcode in this repo already:

MINIMAL_NVVMIR_TXT_TEMPLATE = b"""\
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-i128:128:128-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
define void @kernel() {
entry:
ret void
}
!nvvm.annotations = !{!0}
!0 = !{void ()* @kernel, !"kernel", i32 1}
!nvvmir.version = !{!1}
!1 = !{i32 %d, i32 0, i32 %d, i32 0}
""" # noqa: E501
MINIMAL_NVVMIR_BITCODE_STATIC = {
(1, 3): # (major, debug_major)
"4243c0de3514000005000000620c30244a59be669dfbb4bf0b51804c01000000210c00007f010000"
"0b02210002000000160000000781239141c80449061032399201840c250508191e048b62800c4502"
"42920b42641032143808184b0a3232884870c421234412878c1041920264c808b1142043468820c9"
"01323284182a282a90317cb05c9120c3c8000000892000000b0000003222c80820624600212b2498"
"0c212524980c19270c85a4906032645c20246382a01801300128030173046000132677b00778a007"
"7cb0033a680377b0877420877408873618877a208770d8e012e5d006f0a0077640077a600774a007"
"7640076d900e71a00778a00778d006e980077a80077a80076d900e7160077a100776a0077160076d"
"900e7320077a300772a0077320076d900e7640077a600774a0077640076d900e71200778a0077120"
"0778a00771200778d006e6300772a0077320077a300772d006e6600774a0077640077a600774d006"
"f6100776a0077160077a100776d006f6300772a0077320077a300772d006f6600774a0077640077a"
"600774d006f610077280077a10077280077a10077280076de00e7160077a300772a0077640071a21"
"4c0e11de9c2e4fbbcfbe211560040000000000000000000000000620b141a0e86000004016080000"
"06000000321e980c19114c908c092647c6044362098c009401000000b1180000ac0000003308801c"
"c4e11c6614013d88433884c38c4280077978077398710ce6000fed100ef4800e330c421ec2c11dce"
"a11c6630053d88433884831bcc033dc8433d8c033dcc788c7470077b08077948877070077a700376"
"788770208719cc110eec900ee1300f6e300fe3f00ef0500e3310c41dde211cd8211dc2611e663089"
"3bbc833bd04339b4033cbc833c84033bccf0147660077b6807376887726807378087709087706007"
"76280776f8057678877780875f08877118877298877998812ceef00eeee00ef5c00eec300362c8a1"
"1ce4a11ccca11ce4a11cdc611cca211cc4811dca6106d6904339c84339984339c84339b8c3389443"
"3888033b94c32fbc833cfc823bd4033bb0c30cc7698770588772708374680778608774188774a087"
"19ce530fee000ff2500ee4900ee3400fe1200eec500e3320281ddcc11ec2411ed2211cdc811edce0"
"1ce4e11dea011e66185138b0433a9c833bcc50247660077b68073760877778077898514cf4900ff0"
"500e331e6a1eca611ce8211ddec11d7e011ee4a11ccc211df0610654858338ccc33bb0433dd04339"
"fcc23ce4433b88c33bb0c38cc50a877998877718877408077a28077298815ce3100eecc00ee5500e"
"f33023c1d2411ee4e117d8e11dde011e6648193bb0833db4831b84c3388c4339ccc33cb8c139c8c3"
"3bd4033ccc48b471080776600771088771588719dbc60eec600fede006f0200fe5300fe5200ff650"
"0e6e100ee3300ee5300ff3e006e9e00ee4500ef83023e2ec611cc2811dd8e117ec211de6211dc421"
"1dd8211de8211f66209d3bbc433db80339948339cc58bc7070077778077a08077a488777708719cb"
"e70eef300fe1e00ee9400fe9a00fe530c3010373a8077718875f988770708774a08774d087729881"
"844139e0c338b0433d904339cc40c4a01dcaa11de0411edec11c662463300ee1c00eec300fe9400f"
"e5000000792000001d000000721e482043880c19097232482023818c9191d144a01028643c313242"
"8e9021a318100a00060000006b65726e656c0000230802308240042308843082400c330c4230cc40"
"0c4441c84860821272b3b36b730973737ba30ba34b7b739b1b2528d271b3b36b4b9373b12b939b4b"
"7b731b2530000000a9180000250000000b0a7228877780077a587098433db8c338b04339d0c382e6"
"1cc6a10de8411ec2c11de6211de8211ddec11d1634e3600ee7500fe1200fe4400fe1200fe7500ef4"
"b08081077928877060077678877108077a28077258709cc338b4013ba4833d94c3026b1cd8211cdc"
"e11cdc201ce4611cdc201ce8811ec2611cd0a11cc8611cc2811dd861c1010ff4200fe1500ff4800e"
"00000000d11000000600000007cc3ca4833b9c033b94033da0833c94433890c30100000061200000"
"06000000130481860301000002000000075010cd14610000000000007120000003000000320e1022"
"8400fb020000000000000000650c00001f000000120394f000000000030000000600000006000000"
"4c000000010000005800000000000000580000000100000070000000000000000c00000013000000"
"1f000000080000000600000000000000700000000000000000000000010000000000000000000000"
"060000000000000006000000ffffffff00240000000000005d0c00000d0000001203946700000000"
"6b65726e656c31352e302e376e7670747836342d6e76696469612d637564613c737472696e673e00"
"00000000",
(2, 3): # (major, debug_major)
"4243c0de3514000005000000620c30244a59be669dfbb4bf0b51804c01000000210c000080010000"
"0b02210002000000160000000781239141c80449061032399201840c250508191e048b62800c4502"
"42920b42641032143808184b0a3232884870c421234412878c1041920264c808b1142043468820c9"
"01323284182a282a90317cb05c9120c3c8000000892000000b0000003222c80820624600212b2498"
"0c212524980c19270c85a4906032645c20246382a01801300128030173046000132677b00778a007"
"7cb0033a680377b0877420877408873618877a208770d8e012e5d006f0a0077640077a600774a007"
"7640076d900e71a00778a00778d006e980077a80077a80076d900e7160077a100776a0077160076d"
"900e7320077a300772a0077320076d900e7640077a600774a0077640076d900e71200778a0077120"
"0778a00771200778d006e6300772a0077320077a300772d006e6600774a0077640077a600774d006"
"f6100776a0077160077a100776d006f6300772a0077320077a300772d006f6600774a0077640077a"
"600774d006f610077280077a10077280077a10077280076de00e7160077a300772a0077640071a21"
"4c0e11de9c2e4fbbcfbe211560040000000000000000000000000620b141a0286100004016080000"
"06000000321e980c19114c908c092647c60443620914c10840190000b1180000ac0000003308801c"
"c4e11c6614013d88433884c38c4280077978077398710ce6000fed100ef4800e330c421ec2c11dce"
"a11c6630053d88433884831bcc033dc8433d8c033dcc788c7470077b08077948877070077a700376"
"788770208719cc110eec900ee1300f6e300fe3f00ef0500e3310c41dde211cd8211dc2611e663089"
"3bbc833bd04339b4033cbc833c84033bccf0147660077b6807376887726807378087709087706007"
"76280776f8057678877780875f08877118877298877998812ceef00eeee00ef5c00eec300362c8a1"
"1ce4a11ccca11ce4a11cdc611cca211cc4811dca6106d6904339c84339984339c84339b8c3389443"
"3888033b94c32fbc833cfc823bd4033bb0c30cc7698770588772708374680778608774188774a087"
"19ce530fee000ff2500ee4900ee3400fe1200eec500e3320281ddcc11ec2411ed2211cdc811edce0"
"1ce4e11dea011e66185138b0433a9c833bcc50247660077b68073760877778077898514cf4900ff0"
"500e331e6a1eca611ce8211ddec11d7e011ee4a11ccc211df0610654858338ccc33bb0433dd04339"
"fcc23ce4433b88c33bb0c38cc50a877998877718877408077a28077298815ce3100eecc00ee5500e"
"f33023c1d2411ee4e117d8e11dde011e6648193bb0833db4831b84c3388c4339ccc33cb8c139c8c3"
"3bd4033ccc48b471080776600771088771588719dbc60eec600fede006f0200fe5300fe5200ff650"
"0e6e100ee3300ee5300ff3e006e9e00ee4500ef83023e2ec611cc2811dd8e117ec211de6211dc421"
"1dd8211de8211f66209d3bbc433db80339948339cc58bc7070077778077a08077a488777708719cb"
"e70eef300fe1e00ee9400fe9a00fe530c3010373a8077718875f988770708774a08774d087729881"
"844139e0c338b0433d904339cc40c4a01dcaa11de0411edec11c662463300ee1c00eec300fe9400f"
"e5000000792000001e000000721e482043880c19097232482023818c9191d144a01028643c313242"
"8e9021a318100a00060000006b65726e656c0000230802308240042308843082400c23080431c320"
"04c30c045118858c04262821373bbb36973037b737ba30bab437b7b95102231d373bbbb6343917bb"
"32b9b9b437b7518203000000a9180000250000000b0a7228877780077a587098433db8c338b04339"
"d0c382e61cc6a10de8411ec2c11de6211de8211ddec11d1634e3600ee7500fe1200fe4400fe1200f"
"e7500ef4b08081077928877060077678877108077a28077258709cc338b4013ba4833d94c3026b1c"
"d8211cdce11cdc201ce4611cdc201ce8811ec2611cd0a11cc8611cc2811dd861c1010ff4200fe150"
"0ff4800e00000000d11000000600000007cc3ca4833b9c033b94033da0833c94433890c301000000"
"6120000006000000130481860301000002000000075010cd14610000000000007120000003000000"
"320e10228400fc020000000000000000650c00001f000000120394f0000000000300000006000000"
"060000004c000000010000005800000000000000580000000100000070000000000000000c000000"
"130000001f0000000800000006000000000000007000000000000000000000000100000000000000"
"00000000060000000000000006000000ffffffff00240000000000005d0c00000d00000012039467"
"000000006b65726e656c31352e302e376e7670747836342d6e76696469612d637564613c73747269"
"6e673e0000000000",
}
@pytest.fixture(params=("txt", "bitcode_static"))
def minimal_nvvmir(request):
    major, minor, debug_major, debug_minor = nvvm.ir_version()
    if request.param == "txt":
        return MINIMAL_NVVMIR_TXT_TEMPLATE % (major, debug_major)
    bitcode_static_binascii = MINIMAL_NVVMIR_BITCODE_STATIC.get((major, debug_major))
    if bitcode_static_binascii:
        return binascii.unhexlify(bitcode_static_binascii)
    raise RuntimeError(
        "Static bitcode for NVVM IR version "
        f"{major}.{debug_major} is not available in this test.\n"
        "Maintainers: Please run the helper script to generate it and add the "
        "output to the MINIMAL_NVVMIR_BITCODE_STATIC dict:\n"
        "    ../../toolshed/build_static_bitcode_input.py"
    )

So I suggest we move it to a common place, say a new file under cuda_python_test_helpers:
https://github.com/NVIDIA/cuda-python/tree/main/cuda_python_test_helpers/cuda_python_test_helpers
and import it from both the cuda.bindings and cuda.core tests.
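
For illustration only, a minimal sketch of what such a shared helper could look like. The function name `select_minimal_nvvmir` and its parameters are hypothetical, not an existing API in the repo. It takes the IR version tuple as an argument (rather than calling `nvvm.ir_version()` itself) so the helper stays importable without the bindings, and the usage below substitutes tiny stand-in data for the real template and hex tables.

```python
import binascii


# Hypothetical shared helper; the name and signature are assumptions for illustration.
def select_minimal_nvvmir(kind, ir_version, txt_template, static_bitcode):
    """Return minimal NVVM IR as bytes.

    ir_version is a (major, minor, debug_major, debug_minor) tuple, the shape
    unpacked from nvvm.ir_version() in the fixture quoted above.
    """
    major, _minor, debug_major, _debug_minor = ir_version
    if kind == "txt":
        # The template carries two %d slots: IR major and debug-info major version.
        return txt_template % (major, debug_major)
    if kind == "bitcode_static":
        hex_blob = static_bitcode.get((major, debug_major))
        if hex_blob is None:
            raise RuntimeError(
                f"Static bitcode for NVVM IR version {major}.{debug_major} is not available."
            )
        return binascii.unhexlify(hex_blob)
    raise ValueError(f"Unknown NVVM IR kind: {kind!r}")


# Stand-in data; a real caller would pass the actual template and hex table.
template = b"!1 = !{i32 %d, i32 0, i32 %d, i32 0}\n"
blobs = {(2, 3): "4243c0de"}
txt_ir = select_minimal_nvvmir("txt", (2, 0, 3, 1), template, blobs)
bc_ir = select_minimal_nvvmir("bitcode_static", (2, 0, 3, 1), template, blobs)
```

Each test suite could then wrap this in its own parametrized fixture, so the template and hex data live in one place.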

Contributor Author

Agree on this.

Contributor Author

This is partially addressed now; I have yet to remove the existing invocations from the cuda_bindings tests.

Comment on lines 532 to 533
extra_name = f"{options.name}_extra_{i}"
nvvm.add_module_to_program(self._mnff.handle, extra_source, len(extra_source), extra_name)
Member

Two notes:

  1. I am torn on whether we should let users specify the module names instead of us making one up for them. Could you try my "sequence of 2-tuples" suggestion from our offline chat and see how bad the code looks? We should have something similar already in one of the options.
  2. As discussed let's check if all modules can be lazily loaded.
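
To make the first note concrete, here is a rough sketch of how a "sequence of 2-tuples" input might be validated. The helper name `normalize_extra_sources` and the `(name, source)` ordering are assumptions for illustration, not the PR's actual design. The sketch does not touch NVVM or address the lazy-loading question.

```python
# Hypothetical helper; name and tuple layout are assumptions, not the PR's API.
def normalize_extra_sources(extra_sources):
    """Validate a sequence of (name, source) 2-tuples and return a list of (str, bytes)."""
    if extra_sources is None:
        return []
    if isinstance(extra_sources, (str, bytes, bytearray)):
        # A bare string or bytes object is iterable but is almost certainly a mistake here.
        raise TypeError("extra_sources must be a sequence of (name, source) 2-tuples")
    normalized = []
    for i, item in enumerate(extra_sources):
        if not isinstance(item, (tuple, list)) or len(item) != 2:
            raise ValueError(f"extra_sources[{i}] must be a (name, source) 2-tuple")
        name, source = item
        if not isinstance(name, str) or not name:
            raise TypeError(f"extra_sources[{i}]: module name must be a non-empty str")
        if isinstance(source, str):
            source = source.encode()  # textual IR is passed along as bytes
        elif not isinstance(source, (bytes, bytearray)):
            raise TypeError(f"extra_sources[{i}]: source must be str, bytes, or bytearray")
        normalized.append((name, bytes(source)))
    return normalized


modules = normalize_extra_sources(
    [("helper", "define void @f() { ret void }"), ("blob", b"BC\xc0\xde")]
)
```

With the input normalized this way, the loop in the quoted lines could pass each user-supplied name through instead of generating `f"{options.name}_extra_{i}"`.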

Development

Successfully merging this pull request may close these issues.

NVVM support - follow-up

2 participants