
Add ROCm/HIP compatibility to CuMesh, enabling all 3 extensions #30

Open
ZJLi2013 wants to merge 1 commit into JeffreyXiang:main from PhysicalAI-AIM:rocm

ZJLi2013 commented Apr 8, 2026

Add ROCm/HIP compatibility to CuMesh, enabling all 3 extensions (cumesh._C, cumesh._cubvh, cumesh._cumesh_xatlas) to compile and run on AMD GPUs.

Changes (4 files)

  • src/atlas.cu: Replace cuda::std::plus() with portable cub::Sum() — the CCCL cuda::std header is not available on HIP
  • src/clean_up.cu: Use rocprim::tuple for int3_decomposer on HIP via an #ifdef guard; cuda::std::tuple and thrust::tuple are both unavailable or broken on ROCm 6.4
  • src/dtypes.cuh: Add __host__ qualifier to all Vec3f and QEM methods — hipCUB's DeviceSegmentedReduce template instantiation requires host-callable constructors
  • setup.py: Guard cubvh-specific nvcc flags (--extended-lambda, -U__CUDA_NO_HALF_*) behind IS_HIP check — these flags are NVIDIA-specific and cause errors with hipcc
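
The setup.py guard in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `build_nvcc_flags`, the default base flag `-O3`, and the expansion of `-U__CUDA_NO_HALF_*` into three specific macros are all assumptions. In a PyTorch-based build, the `is_hip` value would typically come from `torch.version.hip is not None`.

```python
from typing import List, Optional

# Assumed expansion of the "-U__CUDA_NO_HALF_*" pattern named in the PR;
# the flags actually used in setup.py may differ.
NVIDIA_ONLY_FLAGS = [
    "--extended-lambda",
    "-U__CUDA_NO_HALF_OPERATORS__",
    "-U__CUDA_NO_HALF_CONVERSIONS__",
    "-U__CUDA_NO_HALF2_OPERATORS__",
]


def build_nvcc_flags(is_hip: bool, base_flags: Optional[List[str]] = None) -> List[str]:
    """Return device-compiler flags, adding NVIDIA-only flags only for CUDA builds.

    `build_nvcc_flags` is a hypothetical helper used for illustration.
    """
    # Copy the caller's list so it is never mutated; "-O3" is an assumed default.
    flags = list(base_flags) if base_flags is not None else ["-O3"]
    if not is_hip:
        # hipcc rejects these flags, so they are appended for nvcc only.
        flags.extend(NVIDIA_ONLY_FLAGS)
    return flags
```

Keeping the NVIDIA-only flags in one list means the CUDA build receives exactly the flags it did before, while the HIP build simply never sees them.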

What works

  • All CUB device algorithms (RadixSort, Scan, Select, Reduce, ReduceByKey, SegmentedReduce) are automatically converted to hipCUB by PyTorch's hipify
  • All CUDA runtime APIs (cudaMalloc, cudaMemcpy, etc.) are automatically converted to HIP equivalents
  • cubvh submodule (half precision + Eigen) compiles without additional changes
  • xatlas (CPU-only) is unaffected
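
To give a feel for what the automatic conversion does, here is a toy sketch of the kind of textual renaming involved. It is not PyTorch's hipify implementation, which uses much larger mapping tables and also rewrites headers; the function name `toy_hipify` and the four-entry table are illustrative assumptions, although the individual renames shown are the standard CUDA-to-HIP equivalents.

```python
import re

# A handful of well-known CUDA -> HIP renames (illustrative subset only).
RENAMES = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cub::": "hipcub::",
}

# Match any key that is not preceded by an identifier character, so that an
# already converted "hipcub::" is not rewritten a second time.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(key) for key in sorted(RENAMES, key=len, reverse=True))
    + r")"
)


def toy_hipify(source: str) -> str:
    """Apply the small rename table above to a string of CUDA source code."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)
```

For example, `toy_hipify("cudaMalloc(&ptr, n);")` returns `"hipMalloc(&ptr, n);"`, and a call such as `cub::DeviceRadixSort::SortPairs(...)` becomes `hipcub::DeviceRadixSort::SortPairs(...)`.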

What is NOT changed

  • No functional changes to CUDA code paths — all changes are additive #ifdef guards or portable replacements
  • No new dependencies introduced

Usage on AMD GPUs

git clone --recursive https://github.com/JeffreyXiang/CuMesh.git
cd CuMesh
GPU_ARCHS=gfx942 pip install . --no-build-isolation

Test Environment

GPU: AMD Instinct MI300X (gfx942)
ROCm: 6.4.3
PyTorch: 2.6.0
Docker: rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0

Test Status
Compile all 3 extensions (_C, _cubvh, _xatlas)
Import all modules
Mesh init + read (vertex/face roundtrip)
Compute face normals
Compute vertex normals
Mesh simplification (4→2 faces)
Remove duplicate faces
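
The "Import all modules" check can be reproduced after installation with a short smoke test along these lines. This is a sketch rather than the test script used for this PR: the helper name `check_imports` is hypothetical, and the module names are taken from the description above.

```python
import importlib
from typing import Dict, Iterable, Optional

# Extension module names as listed in the PR description.
CUMESH_EXTENSIONS = ("cumesh._C", "cumesh._cubvh", "cumesh._cumesh_xatlas")


def check_imports(names: Iterable[str] = CUMESH_EXTENSIONS) -> Dict[str, Optional[str]]:
    """Try to import each module; map its name to None on success or an error string."""
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # report any failure instead of stopping at the first
            results[name] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `check_imports()` on a machine where the build succeeded should return `None` for all three names; any other value is the import error for that extension.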
