Pull request overview
Updates AMD inference Dockerfiles to adjust FlashAttention build/install behavior (notably for Wan2.1) and expands the supported ROCm arch list for Mochi.
Changes:
- Replaces the pinned/parameterized FlashAttention wheel build in the Wan2.1 Dockerfile with a direct `setup.py install` from an unpinned ROCm/flash-attention clone.
- Adds `gfx950` to the `PYTORCH_ROCM_ARCH` list in the Mochi inference Dockerfile.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docker/pyt_wan2.1_inference.ubuntu.amd.Dockerfile | Changes FlashAttention installation steps for Wan2.1 image builds. |
| docker/pyt_mochi_inference.ubuntu.amd.Dockerfile | Updates the ROCm architecture list used when building FlashAttention. |
Comment on lines +67 to +87:

```dockerfile
#ARG BUILD_FA="1"
#ARG FA_BRANCH="v3.0.0.r1-cktile"
#ARG FA_REPO="https://github.com/ROCm/flash-attention.git"
#RUN if [ "$BUILD_FA" = "1" ]; then \
#    cd ${WORKSPACE_DIR} \
#    && pip uninstall -y flash-attention \
#    && rm -rf flash-attention \
#    && git clone ${FA_REPO} \
#    && cd flash-attention \
#    && git checkout ${FA_BRANCH} \
#    && git submodule update --init \
#    && GPU_ARCHS=${HIP_ARCHITECTURES} python3 setup.py bdist_wheel --dist-dir=dist \
#    && pip install dist/*.whl \
#    && python -c "import flash_attn; print(f'Flash Attention version == {flash_attn.__version__}')"; \
#    fi

# install flash attention
ENV FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"

RUN git clone https://github.com/ROCm/flash-attention.git &&\
    cd flash-attention &&\
    python setup.py install
```
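Since the new unpinned install relies on the Triton AMD path enabled by `FLASH_ATTENTION_TRITON_AMD_ENABLE`, it may be worth keeping an import check like the one the commented-out wheel build ran. A minimal sketch, not part of the actual diff, assuming the `RUN` step above installs into the image's default `python`:

```dockerfile
# Hypothetical sanity check: reuses the verification line from the
# commented-out wheel build so the image build fails early if the
# setup.py install did not produce an importable flash_attn package.
RUN python -c "import flash_attn; print(f'Flash Attention version == {flash_attn.__version__}')"
```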
Comment on lines 35 to +36:

```diff
 ARG FA_REPO="https://github.com/Dao-AILab/flash-attention.git"
-ARG PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx1100;gfx1101;gfx1200;gfx1201
+ARG PYTORCH_ROCM_ARCH=gfx950;gfx90a;gfx942;gfx1100;gfx1101;gfx1200;gfx1201
```
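Because `PYTORCH_ROCM_ARCH` is declared as a build `ARG`, the list can also be narrowed at build time to compile FlashAttention for a single target and cut build time. A minimal sketch, assuming the Dockerfile path from the review table above; the image tag and the choice of `gfx942` are illustrative:

```sh
# Build the Mochi image for a single arch (gfx942, e.g. MI300-class GPUs),
# overriding the default semicolon-separated ARG list.
docker build \
  --build-arg PYTORCH_ROCM_ARCH="gfx942" \
  -f docker/pyt_mochi_inference.ubuntu.amd.Dockerfile \
  -t mochi-inference:amd .
```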
lcskrishna requested changes on Apr 24, 2026:
```dockerfile
# install flash attention
ENV FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"

RUN git clone https://github.com/ROCm/flash-attention.git &&\
```
Contributor
Please use the FA_BRANCH and FA_REPO arguments. They exist so the image can be built against whatever branch is needed via build arguments. The existing branch is the latest tag from flash-attention; is there a specific reason to remove it?
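A minimal sketch of how the new `setup.py install` step could keep those build arguments; the `FA_BRANCH` default below is carried over from the previously commented-out block and is an assumption, not part of this PR:

```dockerfile
# Sketch: parameterized FlashAttention install keeping FA_REPO/FA_BRANCH.
# Defaults are taken from the commented-out block above; adjust as needed.
ARG FA_REPO="https://github.com/ROCm/flash-attention.git"
ARG FA_BRANCH="v3.0.0.r1-cktile"
RUN git clone ${FA_REPO} &&\
    cd flash-attention &&\
    git checkout ${FA_BRANCH} &&\
    python setup.py install
```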
Motivation
Updated the Wan2.1 Dockerfile with the FlashAttention install steps taken from the ROCm flash-attention repo.
Technical Details
Test Plan
Test Result
Submission Checklist