Add ROCm/HIP compatibility to CuMesh, enabling all 3 extensions by ZJLi2013 · Pull Request #30 · JeffreyXiang/CuMesh

ZJLi2013 · 2026-04-08T05:37:12Z

Add ROCm/HIP compatibility to CuMesh, enabling all 3 extensions (cumesh._C, cumesh._cubvh, cumesh._cumesh_xatlas) to compile and run on AMD GPUs.

Changes (4 files)

src/atlas.cu: Replace cuda::std::plus() with portable cub::Sum() — the CCCL cuda::std header is not available on HIP
src/clean_up.cu: Use rocprim::tuple for int3_decomposer on HIP via #ifdef — cuda::std::tuple and thrust::tuple are both unavailable/broken on ROCm 6.4
src/dtypes.cuh: Add __host__ qualifier to all Vec3f and QEM methods — hipCUB's DeviceSegmentedReduce template instantiation requires host-callable constructors
setup.py: Guard cubvh-specific nvcc flags (--extended-lambda, -U__CUDA_NO_HALF_*) behind IS_HIP check — these flags are NVIDIA-specific and cause errors with hipcc
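
The setup.py change above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name `device_compile_flags`, the `-O3` default, and the three macro names standing in for the `-U__CUDA_NO_HALF_*` pattern are assumptions made for the example.

```python
# Sketch of keeping NVIDIA-only compiler flags away from hipcc.
# Names and flag lists here are illustrative assumptions, not the PR's code.

# Flags that only nvcc understands.
NVCC_ONLY_FLAGS = [
    "--extended-lambda",
    # Assumed expansion of the "-U__CUDA_NO_HALF_*" pattern from the PR text.
    "-U__CUDA_NO_HALF_OPERATORS__",
    "-U__CUDA_NO_HALF_CONVERSIONS__",
    "-U__CUDA_NO_HALF2_OPERATORS__",
]


def device_compile_flags(is_hip, common_flags=("-O3",)):
    """Return device-compiler flags, adding nvcc-only ones on CUDA builds."""
    flags = list(common_flags)
    if not is_hip:
        # hipcc rejects these flags, so they are added only for nvcc.
        flags.extend(NVCC_ONLY_FLAGS)
    return flags
```

In a real setup.py the `is_hip` value would plausibly come from `torch.utils.cpp_extension.IS_HIP_EXTENSION`; whether the PR's `IS_HIP` check is derived that way is not stated here.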

What works

All CUB device algorithms (RadixSort, Scan, Select, Reduce, ReduceByKey, SegmentedReduce) are automatically converted to hipCUB by PyTorch's hipify
All CUDA runtime APIs (cudaMalloc, cudaMemcpy, etc.) are automatically converted to HIP equivalents
cubvh submodule (half precision + Eigen) compiles without additional changes
xatlas (CPU-only) is unaffected

What is NOT changed

No functional changes to CUDA code paths — all changes are additive #ifdef guards or portable replacements
No new dependencies introduced

Usage on AMD GPUs

git clone --recursive https://github.com/JeffreyXiang/CuMesh.git
cd CuMesh
GPU_ARCHS=gfx942 pip install . --no-build-isolation

Test Environment

GPU: AMD Instinct MI300X (gfx942)
ROCm: 6.4.3
PyTorch: 2.6.0
Docker: rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0
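
Before building, it can help to confirm that the installed PyTorch is a ROCm build like the one in this environment. The helper below is a hedged sketch: `gpu_backend` is a hypothetical name, and it only formats the two version strings PyTorch exposes as `torch.version.hip` and `torch.version.cuda`.

```python
def gpu_backend(hip_version, cuda_version):
    """Describe the GPU backend from PyTorch's version strings.

    Pass torch.version.hip and torch.version.cuda; each is None when
    PyTorch was not built for that backend.
    """
    if hip_version is not None:
        return f"ROCm/HIP {hip_version}"
    if cuda_version is not None:
        return f"CUDA {cuda_version}"
    return "CPU-only"
```

Usage, assuming PyTorch is installed: `import torch; print(gpu_backend(torch.version.hip, torch.version.cuda))`.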

Test	Status
Compile all 3 extensions (_C, _cubvh, _cumesh_xatlas)	✅
Import all modules	✅
Mesh init + read (vertex/face roundtrip)	✅
Compute face normals	✅
Compute vertex normals	✅
Mesh simplification (4→2 faces)	✅
Remove duplicate faces	✅
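
The "Import all modules" row can be reproduced with a small smoke check along these lines. It is a sketch, not the script used for the table: `import_status` is a hypothetical helper, and the module names are the three listed in the PR summary.

```python
import importlib

# Extension module names as given in the PR summary.
EXTENSIONS = ["cumesh._C", "cumesh._cubvh", "cumesh._cumesh_xatlas"]


def import_status(names):
    """Try to import each module and record whether it succeeded."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # Covers a missing module as well as a failed extension load.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in import_status(EXTENSIONS).items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```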

Commit 81b0b06: ROCm/HIP support for AMD GPUs: fix cuda::std, dtypes __host__, nvcc flags
Made-with: Cursor

ZJLi2013 wants to merge 1 commit into JeffreyXiang:main from PhysicalAI-AIM:rocm