Integrating Triangular Op Triton Kernels #672

@GMNGeoffrey

We (AMD) recently introduced Triton kernels for triangular operations, in particular triangle attention, for OpenFold3 (https://www.amd.com/en/blogs/2026/openfold3-meets-amd-instinct-gpus-unlocking-scalable.html, aqlaboratory/openfold-3#166). They work as an alternative to cuequivariance, which is exclusive to (newish) NVIDIA GPUs. I've prototyped integrating the triangle attention kernel into Boltz and see ~2x speedups in end-to-end time on MI300X. Would you be interested in us adding these kernels as an alternative to cuequivariance? We're exploring publishing them as a standalone pip package, but it would also be easy to start by vendoring the single triangle attention file for a quicker integration.
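
To make the "alternative to cuequivariance" idea concrete, here is a rough sketch of how a backend could be chosen at runtime. This is not code from the prototype or from Boltz: the function name `pick_backend`, the backend labels, and the plain-PyTorch fallback are all hypothetical, and the sketch assumes the cuequivariance bindings import as `cuequivariance_torch` and that the vendored kernel needs only the `triton` package.

```python
import importlib.util

# Backend label -> top-level module it needs. Both the labels and the
# assumption that these are the right modules to probe are illustrative.
BACKEND_REQUIREMENTS = {
    "cuequivariance": "cuequivariance_torch",  # assumed import name
    "triton": "triton",                        # needed by the vendored kernel
}
FALLBACK = "pytorch"  # hypothetical label for a plain PyTorch implementation


def module_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_backend(preferred=None, is_available=module_available):
    """Choose a triangle attention backend.

    With no preference, take the first backend whose module is installed,
    falling back to plain PyTorch. With an explicit preference, fail loudly
    rather than silently switching to a different backend.
    """
    if preferred is not None:
        if preferred == FALLBACK:
            return FALLBACK
        if preferred not in BACKEND_REQUIREMENTS:
            raise ValueError(f"unknown backend: {preferred!r}")
        if not is_available(BACKEND_REQUIREMENTS[preferred]):
            raise RuntimeError(f"backend {preferred!r} is not installed")
        return preferred
    for name, module_name in BACKEND_REQUIREMENTS.items():
        if is_available(module_name):
            return name
    return FALLBACK


if __name__ == "__main__":
    print(pick_backend())
```

Checking only whether a package is installed is a simplification. A real integration would probably also need to check the GPU vendor and architecture, since cuequivariance can be installed on a machine whose GPU it does not support.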
