build: fix the blackwell dockerfile #84
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| # Simple OpenFold3 Dockerfile using NVIDIA PyTorch container | ||
| FROM nvcr.io/nvidia/pytorch:25.02-py3 | ||
| FROM nvcr.io/nvidia/pytorch:25.12-py3 | ||
|
|
||
| # Install system dependencies | ||
| RUN apt-get update && apt-get install -y \ | ||
|
|
@@ -13,15 +13,21 @@ RUN apt-get update && apt-get install -y \ | |
| libxft2 \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
|
|
||
| # Clone OpenFold3 source and modify environment file | ||
| # Install CUTLASS for DeepSpeed Evoformer attention kernel | ||
| # We need only the headers for DeepSpeed JIT, don't need the pip package with bindings | ||
| WORKDIR /opt | ||
| RUN git clone https://github.com/aqlaboratory/openfold-3.git && \ | ||
| cd openfold-3 && \ | ||
| cp -p environments/production-linux-64.yml environments/production.yml.backup && \ | ||
| grep -v "pytorch::pytorch" environments/production.yml > environments/production.yml.tmp && \ | ||
| mv environments/production.yml.tmp environments/production.yml | ||
|
Comment on lines
-18
to
-22
Collaborator
Author
This was completely unused: everything is installed via the system python+pip.
||
| RUN git clone https://github.com/NVIDIA/cutlass --branch v3.6.0 --depth 1 | ||
|
|
||
| # Pre-compile DeepSpeed operations for Blackwell GPUs to avoid runtime compilation | ||
| # Create necessary cache directories | ||
| RUN python3 -c "import os; os.makedirs('/root/.triton/autotune', exist_ok=True)" | ||
|
Collaborator
Author
This is empirically needed in my tests, which is a bit odd.
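For reference, the `RUN python3 -c ...` step above amounts to the following. This is a minimal sketch: the helper name `ensure_cache_dir` is illustrative and not part of the PR, and a temporary directory stands in for `/root` so the snippet can be run outside the container.

```python
import os
import tempfile


def ensure_cache_dir(path: str) -> str:
    """Create `path` and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


# Illustration only: a temporary root stands in for /root inside the image.
root = tempfile.mkdtemp()
autotune_dir = ensure_cache_dir(os.path.join(root, ".triton", "autotune"))

# A second call is a no-op because of exist_ok=True.
ensure_cache_dir(autotune_dir)
print(os.path.isdir(autotune_dir))
```

Creating the directory at build time means it exists before anything tries to write into it at runtime. Why that turned out to be necessary is not explained in the comment.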
||
|
|
||
| WORKDIR /opt/openfold-3 | ||
| # Set environment variables including CUDA architecture for Blackwell | ||
| ENV PYTHONUNBUFFERED=1 \ | ||
| PYTHONDONTWRITEBYTECODE=1 \ | ||
| KMP_AFFINITY=none \ | ||
| CUTLASS_PATH=/opt/cutlass \ | ||
| TORCH_CUDA_ARCH_LIST="12.1" | ||
|
Comment on lines
+25
to
+30
Collaborator
Author
I think we can still remove some of these; all of them could be provided at runtime and are quite specific to the use case here.
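To illustrate what "provided at runtime" could look like, here is a hedged sketch that builds a `docker run` command passing the same variables with `-e`. The image tag `openfold3-blackwell:dev` and the helper `docker_run_argv` are hypothetical; the variable values are copied from the `ENV` block in this diff.

```python
import shlex

# Values taken from the ENV block in this diff; the image tag used below is hypothetical.
RUNTIME_ENV = {
    "KMP_AFFINITY": "none",
    "CUTLASS_PATH": "/opt/cutlass",
    "TORCH_CUDA_ARCH_LIST": "12.1",
}


def docker_run_argv(image: str, env: dict, command: list) -> list:
    """Build a `docker run` argument list that passes each variable with -e."""
    argv = ["docker", "run", "--rm", "--gpus", "all"]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    return argv + [image] + command


argv = docker_run_argv("openfold3-blackwell:dev", RUNTIME_ENV, ["/bin/bash"])
print(shlex.join(argv))
```

With this approach the image stays generic and each user chooses the architecture list and CUTLASS location for their own machine.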
||
|
|
||
| # Install Python dependencies | ||
| RUN pip install --no-cache-dir \ | ||
|
|
@@ -46,36 +52,18 @@ RUN pip install --no-cache-dir \ | |
| awscli \ | ||
| memory_profiler \ | ||
| func_timeout \ | ||
| biotite==1.2.0 \ | ||
| "nvidia-cutlass<4" \ | ||
| "cuda-python<12.9.1" | ||
|
Comment on lines
-50
to
-51
Collaborator
Author
|
||
| biotite==1.2.0 | ||
|
|
||
| # Install CUTLASS for DeepSpeed Evoformer attention kernel | ||
| WORKDIR /opt | ||
| RUN git clone https://github.com/NVIDIA/cutlass --branch v3.6.0 --depth 1 | ||
| COPY pyproject.toml /opt/openfold3/ | ||
| COPY openfold3/__init__.py /opt/openfold3/openfold3/ | ||
| COPY scripts/ /opt/openfold3/scripts/ | ||
|
|
||
| # Install OpenFold3 package itself (provides run_openfold command) | ||
| WORKDIR /opt/openfold-3 | ||
| RUN python3 -m pip install --editable --no-deps . | ||
|
|
||
| # Set environment variables including CUDA architecture for Blackwell | ||
| ENV PYTHONUNBUFFERED=1 \ | ||
| PYTHONDONTWRITEBYTECODE=1 \ | ||
| KMP_AFFINITY=none \ | ||
| CUTLASS_PATH=/opt/cutlass \ | ||
| TORCH_CUDA_ARCH_LIST="12.0" | ||
|
|
||
| # Pre-compile DeepSpeed operations for Blackwell GPUs to avoid runtime compilation | ||
| # Create necessary cache directories | ||
| RUN python3 -c "import os; os.makedirs('/root/.triton/autotune', exist_ok=True)" | ||
| WORKDIR /opt/openfold3 | ||
| RUN python3 -m pip install --no-deps --editable . | ||
|
|
||
| # Create a Python sitecustomize.py to set TORCH_CUDA_ARCH_LIST before any imports | ||
| # This ensures the variable is set before PyTorch's cpp_extension checks it | ||
| RUN mkdir -p /usr/local/lib/python3.12/site-packages && \ | ||
| echo 'import os' > /usr/local/lib/python3.12/site-packages/sitecustomize.py && \ | ||
| echo 'os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "12.0")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py && \ | ||
| echo 'os.environ.setdefault("CUTLASS_PATH", "/opt/cutlass")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py && \ | ||
| echo 'os.environ.setdefault("KMP_AFFINITY", "none")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py | ||
|
Comment on lines
-74
to
-78
Collaborator
Author
All of this can be removed.
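One likely reason the `sitecustomize.py` block is redundant (an inference, not something the comment states): the same variables are already set by the Dockerfile `ENV` instruction, and `os.environ.setdefault` never overrides a value that is already present. A small sketch of that behaviour:

```python
import os

# Simulate the Dockerfile ENV instruction: the variable is already in the
# process environment before Python starts.
os.environ["TORCH_CUDA_ARCH_LIST"] = "12.1"

# What the removed sitecustomize.py did at interpreter startup.
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "12.0")

# setdefault leaves an existing value untouched, so the ENV value wins.
print(os.environ["TORCH_CUDA_ARCH_LIST"])  # prints 12.1

# Only when the variable is missing does setdefault have any effect.
os.environ.pop("KMP_AFFINITY", None)
os.environ.setdefault("KMP_AFFINITY", "none")
print(os.environ["KMP_AFFINITY"])  # prints none
```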
||
| # Copy the entire source tree directly (at the very end for optimal caching) | ||
| COPY . /opt/openfold3 | ||
|
|
||
| # Default command | ||
| CMD ["/bin/bash"] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| # Blackwell (sm_121) on aarch64 environment | ||
| name: openfold3-env | ||
| variables: | ||
| CUDA_HOME: /usr/local/cuda | ||
| PATH: /usr/local/cuda/bin:${PATH} | ||
| LD_LIBRARY_PATH: /usr/local/cuda/lib64:${LD_LIBRARY_PATH} | ||
| # Triton bundles its own ptxas, which does not support sm_121 | ||
| # This forces Triton to use the system ptxas compiler, which is aware of sm_121 | ||
| TRITON_PTXAS_PATH: /usr/local/cuda/bin/ptxas | ||
| # Requires: git clone https://github.com/NVIDIA/cutlass --branch v3.6.0 --depth 1 ~/workspace/cutlass | ||
| CUTLASS_PATH: /home/jandom/workspace/cutlass | ||
| # Note: OMP_NUM_THREADS=1 is required to avoid threading conflicts | ||
| OMP_NUM_THREADS: "1" | ||
|
Comment on lines
+4
to
+13
Collaborator
Author
This is the really ugly part, especially the hard-coded paths specific to my box or $HOME. All of this gets taken care of when using the Docker image from NVIDIA with torch pre-installed.
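As a rough illustration of how the hard-coded paths could be derived instead, the sketch below resolves the CUTLASS checkout and the `ptxas` binary at runtime. The helper names are hypothetical, and the `~/workspace/cutlass` fallback simply mirrors the clone location mentioned in the YAML comment.

```python
import shutil
from pathlib import Path
from typing import Mapping, Optional


def resolve_cutlass_path(env: Mapping[str, str]) -> Path:
    """Prefer an explicit CUTLASS_PATH; otherwise fall back to ~/workspace/cutlass."""
    explicit = env.get("CUTLASS_PATH")
    if explicit:
        return Path(explicit)
    return Path.home() / "workspace" / "cutlass"


def resolve_ptxas(cuda_home: str = "/usr/local/cuda") -> Optional[str]:
    """Return the ptxas under CUDA_HOME if present, else whatever is on PATH (or None)."""
    candidate = Path(cuda_home) / "bin" / "ptxas"
    if candidate.is_file():
        return str(candidate)
    return shutil.which("ptxas")


print(resolve_cutlass_path({"CUTLASS_PATH": "/opt/cutlass"}))
print(resolve_cutlass_path({}))
print(resolve_ptxas())
```

The results could then be exported as `CUTLASS_PATH` and `TRITON_PTXAS_PATH` by a launcher script, which keeps user-specific paths out of the environment file.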
||
|
|
||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - nvidia | ||
| dependencies: | ||
| - python | ||
| - awscli | ||
| - setuptools | ||
| - pip | ||
| - conda-forge::uv | ||
| - pytorch-lightning | ||
| - biopython | ||
| - numpy | ||
| - pandas | ||
| - PyYAML | ||
| - requests | ||
| - scipy | ||
| - tqdm | ||
| - typing-extensions | ||
| - wandb | ||
| - modelcif | ||
| - ml-collections | ||
| - rdkit=2025.09.3 | ||
| - mmseqs2 | ||
| - bioconda::hmmer | ||
| - bioconda::hhsuite | ||
| - bioconda::kalign2 | ||
| - bioconda::snakemake | ||
| - memory_profiler | ||
| - func_timeout | ||
| - boto3 | ||
| - conda-forge::python-lmdb=1.6 | ||
| - conda-forge::ijson | ||
| - pip: | ||
| # PyTorch stable cu130 for aarch64 - works on Blackwell via PTX JIT | ||
| - --extra-index-url https://download.pytorch.org/whl/cu130 | ||
| - torch>=2.9.0 | ||
|
Comment on lines
+50
to
+51
Collaborator
Author
This is important to get a sufficiently high version of torch. A couple of things got removed or moved
|
||
| - biotite==1.2.0 | ||
| - deepspeed | ||
| - pdbeccdutils | ||
This is important: with CUDA 12.9+ we get sm121 support out of the box
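To check what a given machine actually reports, a sketch along these lines may help. The two formatting helpers are illustrative names; the `torch` part runs only if PyTorch is installed and a CUDA device is visible.

```python
def arch_list_entry(major: int, minor: int) -> str:
    """Format a compute capability the way TORCH_CUDA_ARCH_LIST expects, e.g. '12.1'."""
    return f"{major}.{minor}"


def sm_name(major: int, minor: int) -> str:
    """Format a compute capability as an sm_XY architecture name, e.g. 'sm_121'."""
    return f"sm_{major}{minor}"


try:
    import torch
except ImportError:  # PyTorch is optional for this sketch
    torch = None

if torch is not None and torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print("CUDA toolkit torch was built with:", torch.version.cuda)
    print("Device architecture:", sm_name(major, minor))
    print("TORCH_CUDA_ARCH_LIST entry:", arch_list_entry(major, minor))
else:
    # Without a GPU, just show the formatting for the capability targeted in this PR.
    print(sm_name(12, 1), arch_list_entry(12, 1))
```

On the hardware targeted by this PR the reported capability should correspond to the `TORCH_CUDA_ARCH_LIST="12.1"` value set in the Dockerfile.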