[ROCm] Add AMD GPU support for TRELLIS.2 by andyluo7 · Pull Request #155 · microsoft/TRELLIS.2

andyluo7 · 2026-04-27T16:34:48Z

Summary

Enable TRELLIS.2 to run on AMD Instinct GPUs (MI300X / gfx942) with ROCm.

Your setup.sh already detects ROCm and installs the ROCm builds of PyTorch and flash-attention, which is great! However, the O-Voxel C++ extension does not build on ROCm, and the nvdiffrast rendering library does not work there. This PR fills those gaps.

Changes

O-Voxel HIP port (5 files)

Add #ifdef __HIP_PLATFORM_AMD__ guards for CUDA→HIP header mapping in all .cu files
setup.py: default arch gfx942 for AMD MI300X
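
The default-arch choice could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `resolve_rocm_archs` and `offload_arch_flags` are hypothetical, and it assumes the build respects the `PYTORCH_ROCM_ARCH` environment variable (the one PyTorch's extension builder reads) and falls back to gfx942 only when that variable is unset.

```python
import os

DEFAULT_ROCM_ARCH = "gfx942"  # MI300X, the only target tested in this PR


def resolve_rocm_archs(hip_version, env=None):
    """Return the GPU archs to compile for, or [] on a CUDA build.

    hip_version: in a real setup.py this would be torch.version.hip,
                 which is None on CUDA builds of PyTorch.
    env:         mapping to read PYTORCH_ROCM_ARCH from (defaults to os.environ).
    """
    if hip_version is None:
        return []  # not a ROCm build; leave CUDA arch handling untouched
    env = os.environ if env is None else env
    raw = env.get("PYTORCH_ROCM_ARCH", "").strip()
    if not raw:
        return [DEFAULT_ROCM_ARCH]
    # Accept both ';' and whitespace as separators.
    return raw.replace(";", " ").split()


def offload_arch_flags(archs):
    """Turn arch names into hipcc flags such as --offload-arch=gfx942."""
    return [f"--offload-arch={arch}" for arch in archs]
```

With this shape, a user on a different Instinct or Radeon part could override the default by exporting `PYTORCH_ROCM_ARCH` before building, while CUDA builds take the early-return path and are not affected.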

nvdiffrast ROCm adapter (2 new files)

trellis2/renderers/nvdiffrast_rocm_adapter.py: Pure PyTorch drop-in replacements for dr.rasterize(), dr.interpolate(), dr.texture(), dr.antialias(), dr.DepthPeeler
trellis2/renderers/rocm_compat.py: Auto-patches import nvdiffrast.torch as dr when nvdiffrast is unavailable

Companion PR

CuMesh HIP port: [ROCm] Add HIP support for AMD Instinct GPUs JeffreyXiang/CuMesh#31 (15 files, same approach)

Testing

✅ AMD MI300X (gfx942), ROCm 7.0.2, PyTorch 2.9.1
✅ O-Voxel compiles with hipcc
✅ CuMesh compiles with hipcc (import cumesh works)
✅ FlexGEMM compiles on ROCm (no changes needed)
✅ Pipeline class imports and loads pretrained model
✅ Rasterize/interpolate/texture adapter verified with unit tests
✅ All code is cross-compilable; CUDA builds are unaffected

Usage on ROCm

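# Import the compatibility shim before any renderer module, so that
# `import nvdiffrast.torch as dr` resolves to the pure PyTorch adapter:
import trellis2.renderers.rocm_compat

# Then use the pipeline as usual
from trellis2.pipelines import Trellis2ImageTo3DPipeline
pipeline = Trellis2ImageTo3DPipeline.from_pretrained('microsoft/TRELLIS.2-4B')
pipeline.cuda()  # ROCm builds of PyTorch expose AMD GPUs through the "cuda" device
result = pipeline.run(image)  # `image` is your input image, loaded beforehand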

Known Limitations

nvdiffrast rendering uses a pure PyTorch fallback (no antialiasing, slower than the CUDA rasterizer)
nvdiffrec PBR lighting is stubbed (not ported)
flash-attention falls back to SDPA on ROCm 7.0.2 (works but slower)
Only tested on gfx942 (MI300X)
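
The flash-attention fallback can be pictured as a simple availability check. The sketch below is an assumption about how such a selection might be written, not the logic used in TRELLIS.2; the function name `choose_attention_backend` is hypothetical, and "sdpa" stands for PyTorch's built-in `torch.nn.functional.scaled_dot_product_attention`.

```python
import importlib.util


def choose_attention_backend(preferred="flash_attn"):
    """Return `preferred` if that package is importable, else "sdpa".

    Only checks that the package can be found; it does not verify that
    the installed build actually works on the current GPU.
    """
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return "sdpa"
```

An importability check alone would not catch a flash-attention build that installs but fails at runtime, so a real implementation may need an additional smoke test.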

Enable TRELLIS.2 to run on AMD Instinct GPUs (MI300X) with ROCm:

## O-Voxel HIP port (5 files)
- Add #ifdef __HIP_PLATFORM_AMD__ guards for CUDA→HIP header mapping
- setup.py: default arch gfx942 for AMD MI300X

## nvdiffrast ROCm adapter (new files)
- nvdiffrast_rocm_adapter.py: Pure PyTorch implementations of dr.rasterize(), dr.interpolate(), dr.texture(), dr.antialias(), dr.DepthPeeler; works on any PyTorch device
- rocm_compat.py: Auto-patches `import nvdiffrast.torch as dr` when nvdiffrast is not available (ROCm, CPU-only, etc.)

## Dependencies
- CuMesh HIP port: JeffreyXiang/CuMesh#31
- flash-attention: falls back to SDPA (PyTorch native) on ROCm
- nvdiffrec: stubbed with warning (PBR rendering not available)

## Testing
- Tested on AMD MI300X (gfx942) with ROCm 7.0.2 + PyTorch 2.9.1
- Pipeline imports and loads pretrained model successfully
- Core 3D generation (DiT inference → sparse voxels) works
- Rendering uses pure PyTorch fallback (functional but slower)

## Known limitations
- nvdiffrast rendering uses pure PyTorch (no antialiasing, slower)
- nvdiffrec PBR lighting is stubbed out
- flash-attention builds from ROCm fork but may need manual setup

Signed-off-by: Andy Luo <andyluo7@users.noreply.github.com>

andyluo7 wants to merge 1 commit into microsoft:main from andyluo7:add-rocm-support
