[ROCm] Add AMD GPU support for TRELLIS.2 #155
Open
andyluo7 wants to merge 1 commit into microsoft:main from
Enable TRELLIS.2 to run on AMD Instinct GPUs (MI300X) with ROCm:

## O-Voxel HIP port (5 files)
- Add `#ifdef __HIP_PLATFORM_AMD__` guards for CUDA→HIP header mapping
- setup.py: default arch gfx942 for AMD MI300X

## nvdiffrast ROCm adapter (new files)
- nvdiffrast_rocm_adapter.py: Pure PyTorch implementations of dr.rasterize(), dr.interpolate(), dr.texture(), dr.antialias(), dr.DepthPeeler; works on any PyTorch device
- rocm_compat.py: Auto-patches `import nvdiffrast.torch as dr` when nvdiffrast is not available (ROCm, CPU-only, etc.)

## Dependencies
- CuMesh HIP port: JeffreyXiang/CuMesh#31
- flash-attention: falls back to SDPA (PyTorch native) on ROCm
- nvdiffrec: stubbed with warning (PBR rendering not available)

## Testing
- Tested on AMD MI300X (gfx942) with ROCm 7.0.2 + PyTorch 2.9.1
- Pipeline imports and loads pretrained model successfully
- Core 3D generation (DiT inference → sparse voxels) works
- Rendering uses pure PyTorch fallback (functional but slower)

## Known limitations
- nvdiffrast rendering uses pure PyTorch (no antialiasing, slower)
- nvdiffrec PBR lighting is stubbed out
- flash-attention builds from ROCm fork but may need manual setup

Signed-off-by: Andy Luo <andyluo7@users.noreply.github.com>
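The flash-attention fallback listed under Dependencies can be pictured as a small dispatch helper. The following is only a sketch under stated assumptions, not code from this PR: `pick_attention_backend` and `attention` are hypothetical names, and the tensors are assumed to be dense and laid out as `(batch, seq, heads, head_dim)`, which is the layout `flash_attn_func` expects.

```python
import importlib.util


def pick_attention_backend(preferred: str = "flash_attn") -> str:
    """Return `preferred` if that package is importable, otherwise "sdpa".

    Hypothetical helper for illustration; it only checks that the package
    can be found and does not import it.
    """
    try:
        found = importlib.util.find_spec(preferred) is not None
    except (ImportError, ValueError):
        found = False
    return preferred if found else "sdpa"


BACKEND = pick_attention_backend()


def attention(q, k, v):
    """Dense attention on (batch, seq, heads, head_dim) tensors.

    Imports are done lazily so this module loads even when neither
    flash-attn nor PyTorch is installed.
    """
    if BACKEND == "flash_attn":
        from flash_attn import flash_attn_func

        return flash_attn_func(q, k, v)

    import torch.nn.functional as F

    # PyTorch SDPA expects (batch, heads, seq, head_dim), so swap the
    # sequence and head axes on the way in and restore them on the way out.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
    )
    return out.transpose(1, 2)
```

Deciding the backend once at import time keeps the per-call path free of repeated import checks; whether the project's own attention modules are organized this way is an assumption.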
Summary
Enable TRELLIS.2 to run on AMD Instinct GPUs (MI300X / gfx942) with ROCm.
Your `setup.sh` already detects ROCm and installs ROCm PyTorch + ROCm flash-attention, which is great! However, the C++ extensions (O-Voxel) and rendering libraries (nvdiffrast) don't build/work on ROCm. This PR fills those gaps.
Changes
O-Voxel HIP port (5 files)
- `#ifdef __HIP_PLATFORM_AMD__` guards for CUDA→HIP header mapping in all `.cu` files
- `setup.py`: default arch `gfx942` for AMD MI300X
nvdiffrast ROCm adapter (2 new files)
- `trellis2/renderers/nvdiffrast_rocm_adapter.py`: Pure PyTorch drop-in replacements for `dr.rasterize()`, `dr.interpolate()`, `dr.texture()`, `dr.antialias()`, `dr.DepthPeeler`
- `trellis2/renderers/rocm_compat.py`: Auto-patches `import nvdiffrast.torch as dr` when nvdiffrast is unavailable
Companion PR
Testing
`import cumesh` works)
Usage on ROCm
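As a rough illustration of the auto-patching idea behind `rocm_compat.py`, the sketch below registers a fallback module in `sys.modules` when the real one cannot be imported, so a later `import x.y as dr` resolves to the fallback. It is not the PR's code: `install_fallback` and the module name `hypothetical_rasterizer.torch` are made up for this example, and the fallback here is a stub rather than a real renderer.

```python
import importlib
import sys
import types


def install_fallback(module_name: str, fallback: types.ModuleType) -> bool:
    """Register `fallback` as `module_name` if the real module is missing.

    Returns True if the fallback was installed, False if the real module
    imported fine (in which case nothing is changed).
    """
    try:
        importlib.import_module(module_name)
        return False
    except ImportError:
        pass

    parts = module_name.split(".")
    # Create empty stand-in parent packages so dotted imports resolve.
    for i in range(1, len(parts)):
        parent_name = ".".join(parts[:i])
        if parent_name not in sys.modules:
            pkg = types.ModuleType(parent_name)
            pkg.__path__ = []  # an (empty) __path__ marks it as a package
            sys.modules[parent_name] = pkg

    sys.modules[module_name] = fallback
    if len(parts) > 1:
        # `import a.b as x` looks up attribute `b` on package `a`.
        setattr(sys.modules[".".join(parts[:-1])], parts[-1], fallback)
    return True


# Stub adapter standing in for a pure-PyTorch implementation.
fallback = types.ModuleType("pure_python_adapter")
fallback.rasterize = lambda *args, **kwargs: "fallback rasterize called"

installed = install_fallback("hypothetical_rasterizer.torch", fallback)

# Downstream code keeps its usual import line unchanged.
import hypothetical_rasterizer.torch as dr

print(installed, dr.rasterize())
```

In a setup like the one this PR describes, the target name would presumably be `nvdiffrast.torch` and the fallback would be the adapter module, with the patch applied before any renderer code is imported; the exact entry point is not shown here.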
Known Limitations