
Conversation

@devdaniel commented Jan 24, 2026

Adds --compile and --compile_mode flags to use torch.compile.

This is a massive performance improvement: inference speed roughly doubles (16 it/s to 32 it/s), tested on an RTX 4090, an RTX PRO 6000, and an A100, taking it from 1:1 real-time inference to 2x real-time.
The change should auto-detect triton/inductor availability and fall back, with a warning, when they are unavailable.
Windows users will need to install triton-windows separately to use this.
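
For anyone curious how the fallback behaves, here is a minimal sketch of the detection logic, assuming a helper like maybe_compile wraps the model at load time (the helper name and structure are illustrative, not the PR's actual code):

    import warnings
    import torch

    def maybe_compile(model, enable: bool, mode: str = "default"):
        # Return a torch.compile-wrapped model, or the original model when
        # compilation is disabled or the toolchain is missing.
        if not enable:
            return model
        try:
            # torch.compile's GPU inductor backend needs a working triton;
            # probe for it up front instead of failing deep inside generation.
            import triton  # noqa: F401
        except ImportError:
            warnings.warn(
                "--compile requested but triton/inductor is unavailable; "
                "falling back to eager execution."
            )
            return model
        return torch.compile(model, mode=mode)

In use, this would be called once after the model is loaded, e.g. model = maybe_compile(model, args.compile, args.compile_mode), where args.compile / args.compile_mode stand in for the new CLI flags (also an assumption about the surrounding code).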

@devdaniel marked this pull request as ready for review January 24, 2026 08:26
@devdaniel (Author)

These dependency versions may also need updating:

    "numpy>=2.0.2",
    "torch>=2.10.0",
    "torchaudio>=2.10.0",
    "torchtune>=0.6.1",
    "torchao>=0.15.0",
    "torchvision>=0.25.0",

And this dependency added:

    "bitsandbytes>=0.49.0",

@hectic-droid

> Adds --compile and --compile_mode flags to use torch.compile.
>
> This is a massive performance improvement: inference speed roughly doubles (16 it/s to 32 it/s), tested on an RTX 4090, an RTX PRO 6000, and an A100, taking it from 1:1 real-time inference to 2x real-time. The change should auto-detect triton/inductor availability and fall back, with a warning, when they are unavailable. Windows users will need to install triton-windows separately to use this.

I tried Triton and it will not run. I am on a Windows 11 system. I tried versions 2.10 and 3.0 found at https://huggingface.co/madbuda/triton-windows-builds. I do have a 5070 Ti, which sometimes causes problems with PyTorch and other installations.

@devdaniel (Author)

> I tried Triton and it will not run. I am on a Windows 11 system. I tried versions 2.10 and 3.0 found at https://huggingface.co/madbuda/triton-windows-builds. I do have a 5070 Ti, which sometimes causes problems with PyTorch and other installations.

For Windows, make sure you are using a triton-windows version that is compatible with your PyTorch version.

pip uninstall triton triton-windows -y
pip install "triton-windows>=3.2,<3.3"

Version compatibility from triton-windows:

| PyTorch | Triton |
|---------|--------|
| 2.6     | 3.2    |
| 2.7     | 3.3    |
| 2.8     | 3.4    |
| 2.9     | 3.5    |
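
As a quick way to pick the matching pin (an illustrative check, not part of this PR), print the installed PyTorch version and read off the row above:

    import torch
    # e.g. "2.9.x" -> pip install "triton-windows>=3.5,<3.6" per the table above
    print(torch.__version__)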

@devdaniel (Author)

I've updated the warning and the README with the recommended triton-windows version for Windows users.

@jdluzen commented Jan 25, 2026

Tried this out with AMD and WSL. It reduces my memory usage significantly and appears to speed things up to 11 it/s on a 7900 XTX.
