[feature] Support Flux TensorRT Pipeline #12218
base: main
Conversation
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
cc @sayakpaul
Left some questions. Thanks for starting this!
examples/flux-tensorrt/README.md
Outdated
engine_transformer_path = "path/to/transformer/engine_trt10.13.2.6.plan"
engine_vae_path = "path/to/vae/engine_trt10.13.2.6.plan"
engine_t5xxl_path = "path/to/t5/engine_trt10.13.2.6.plan"
engine_clip_path = "path/to/clip/engine_trt10.13.2.6.plan"
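As a side note, the version suffix in these file names ties each engine to the TensorRT build that produced it. A small sketch of deriving such paths (the directory layout and the `engine_path` helper are hypothetical, not part of the PR):

```python
from pathlib import PurePosixPath

# Hypothetical helper: each engine file name encodes the TensorRT version it
# was built with, since a .plan file only loads under a matching TensorRT build.
TRT_VERSION = "10.13.2.6"

def engine_path(root: str, component: str, trt_version: str = TRT_VERSION) -> str:
    return str(PurePosixPath(root) / component / f"engine_trt{trt_version}.plan")

engine_transformer_path = engine_path("path/to", "transformer")
engine_vae_path = engine_path("path/to", "vae")
```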
How to derive these? Can we host some of these files on the Hub and supplement in this project?
These `.plan` files depend on the GPUs that we use. I only have them for H100 SXM. Do you need these files to run?
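For anyone following along, a rough sketch of how a prebuilt `.plan` file is typically deserialized with the TensorRT Python API (this assumes `tensorrt` is installed and a GPU compatible with the engine; the guard and function name are illustrative):

```python
import importlib.util

def load_engine(plan_path: str):
    """Deserialize a serialized TensorRT engine (.plan). The file must have
    been built with the same TensorRT version and a compatible GPU, which is
    why engines built on H100 SXM cannot simply be reused elsewhere."""
    if importlib.util.find_spec("tensorrt") is None:
        raise RuntimeError("tensorrt is not installed")
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(plan_path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())
```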
examples/flux-tensorrt/README.md
Outdated
We follow the official [NVIDIA/TensorRT](https://github.com/NVIDIA/TensorRT) repository to build TensorRT.

> **Note:**
> TensorRT was originally built with `diffusers==0.31.1`.
TensorRT's build shouldn't depend on the diffusers version, no?
Yeah, but in the TensorRT repo, it's used with this version. However, after building the TRT files, we can use my script for inference with the current diffusers version.
There may be other ways to build, or I will write a new one. This is just a fast way to build.
examples/flux-tensorrt/README.md
Outdated
> - one **venv** for building, and
> - another **venv** for inference.

(🔜 TODO: Build scripts for the latest `diffusers` will be added later.)
What is needed for this? Can we maybe comment on that so that other contributors could pick it up if interested?
examples/flux-tensorrt/README.md
Outdated
pip install -r requirements.txt

### ⚡ Fast Building with Static Shapes
If this can be run from the NVIDIA/TensorRT repository itself, then what is the purpose of this example?
Just a guide for building .plan files, in case someone has no idea how to build them.
Oh then I think it should live in the NVIDIA TensorRT repository as most of the examples seem to be taken from there.
Okay @sayakpaul, I will remove it and maybe write code to build it in Diffusers in the near future.
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
hey @sayakpaul, I added code to build TRT. You can take a look. I think everything is working well now.
You can convert all ONNX checkpoints to TensorRT engines with a single command:
```bash
python convert_trt.py
```
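For reference, `trtexec` (shipped with TensorRT) is the usual per-model conversion tool, so a driver like `convert_trt.py` presumably loops over the exported ONNX checkpoints along these lines. The component names, paths, and helper functions below are illustrative guesses, not the script's actual contents:

```python
import shutil
import subprocess

# Hypothetical component list and layout for the four Flux engines.
COMPONENTS = ["transformer", "vae", "t5", "clip"]

def trtexec_cmd(component: str, trt_version: str = "10.13.2.6") -> list[str]:
    # --onnx, --saveEngine, and --fp16 are standard trtexec flags.
    return [
        "trtexec",
        f"--onnx=onnx/{component}/model.onnx",
        f"--saveEngine={component}/engine_trt{trt_version}.plan",
        "--fp16",
    ]

def convert_all(dry_run: bool = True) -> list[list[str]]:
    """Build the trtexec command for each component; run them unless dry_run."""
    cmds = [trtexec_cmd(c) for c in COMPONENTS]
    if not dry_run:
        if shutil.which("trtexec") is None:
            raise RuntimeError("trtexec not found on PATH")
        for cmd in cmds:
            subprocess.run(cmd, check=True)
    return cmds
```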
Do we need ONNX checkpoints for TensorRT?
Yeah, in the README, I mentioned it in this line.
It could be nice to have the possibility of taking a diffusers checkpoint and then converting it to TensorRT. Do you think that is possible?
Yeah, I did it, but in my own way. I think I’ll code this for Diffusers and create a new PR.
That would be actually amazing!
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
hey @sayakpaul, is something wrong with this check?
What does this PR do?
This PR addresses issue #12202.
It introduces initial support for the `FluxPipeline` with TensorRT acceleration. For installation steps and usage examples, please check the updated README.md.
Below are sample results comparing PyTorch (before TRT) vs TensorRT (after TRT):
🐱 Example 1:
👧 Example 2: