
Conversation

stevhliu (Member)
Splits off the Models section from Load schedulers and models and creates a dedicated section for models covering device placement, torch dtype, the AutoModel API, and saving as shards.
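The "saving as shards" part can be illustrated with a minimal sketch using plain `torch.save`; the per-tensor split and file names below are purely illustrative (real savers instead group tensors up to a size budget such as `max_shard_size` and write an index file).

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(8, 8)
state = model.state_dict()  # {"weight": ..., "bias": ...}

# Naive sharding: one file per tensor. Real implementations group tensors
# until a size budget (e.g. max_shard_size) is reached and also write an
# index file mapping each tensor name to its shard.
outdir = tempfile.mkdtemp()
for i, (name, tensor) in enumerate(sorted(state.items())):
    torch.save({name: tensor}, os.path.join(outdir, f"shard-{i:05d}.bin"))
```

Loading then walks the shards in order and merges the per-shard dicts back into one state dict.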

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu stevhliu requested a review from sayakpaul August 28, 2025 22:21
@sayakpaul sayakpaul left a comment


Thanks! Left some comments, LMK if they are unclear.

| `device_map` | Description |
|---|---|
| `"cuda"` | places model or pipeline on CUDA device |
| `"balanced"` | evenly distributes model or pipeline across all GPUs |
| `"auto"` | distributes model from fastest device first to slowest |
| `"cuda"` | places pipeline on CUDA device |

"cuda" is just an example. If someone wants to do it for any other supported accelerator, I believe they pass it by name 👀

Suggested change
```diff
-| `"cuda"` | places pipeline on CUDA device |
+| `"cuda"` | places pipeline on CUDA (or supported accelerator) device |
```
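As a quick sketch of the point above, assuming only PyTorch is installed: the device string is not limited to `"cuda"`, and a CPU fallback keeps the snippet runnable on any machine.

```python
import torch

# "cuda" is one accelerator string among several ("mps", "xpu", ...);
# fall back to CPU so the example runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.ones(2, 2).to(device)
```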

Comment on lines +40 to +43
```py
model = AutoModel.from_pretrained(
    "Qwen/Qwen-Image",
    subfolder="transformer"
)
```

Suggested change
```diff
-model = AutoModel.from_pretrained(
-    "Qwen/Qwen-Image",
-    subfolder="transformer"
-)
+model = AutoModel.from_pretrained(
+    "Qwen/Qwen-Image", subfolder="transformer"
+)
```

Comment on lines +55 to +57
```py
    "Qwen/Qwen-Image",
    subfolder="transformer"
    torch_dtype=torch.float16
```

Suggested change
```diff
-    "Qwen/Qwen-Image",
-    subfolder="transformer"
-    torch_dtype=torch.float16
+    "Qwen/Qwen-Image",
+    subfolder="transformer",
+    torch_dtype=torch.bfloat16
```


[torch.Tensor.to](https://docs.pytorch.org/docs/stable/generated/torch.Tensor.to.html) can also convert to a specific data type on the fly. However, it converts *all* weights to the requested data type, unlike `torch_dtype`, which respects `_keep_in_fp32_modules`. This argument preserves layers in `torch.float32` for numerical stability and the best generation quality (see [_keep_in_fp32_modules](https://github.com/huggingface/diffusers/blob/f864a9a352fa4a220d860bfdd1782e3e5af96382/src/diffusers/models/transformers/transformer_wan.py#L374) for an example).
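The difference can be sketched with plain `torch.nn` modules; using `LayerNorm` as the layer kept in `torch.float32` is an illustrative stand-in for whatever a given model's `_keep_in_fp32_modules` list actually names.

```python
import torch
from torch import nn

# Module.to(dtype) converts *every* parameter, norm layers included.
full = nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4)).to(torch.float16)
assert all(p.dtype == torch.float16 for p in full.parameters())

# Selective conversion, mimicking a _keep_in_fp32_modules-style list:
# cast the linear layer but leave the LayerNorm in float32.
mixed = nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4))
mixed[0].to(torch.float16)
assert mixed[1].weight.dtype == torch.float32
```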

Shouldn't it be `nn.Module.to()`?

```py
from diffusers import QwenImageTransformer2DModel

model = QwenImageTransformer2DModel.from_pretrained(
    "Qwen/Qwen-Image",,
```

Suggested change
```diff
-    "Qwen/Qwen-Image",,
+    "Qwen/Qwen-Image",
```

```py
import torch
from diffusers import QwenImageTransformer2DModel

max_memory = {0: "16GB", 1: "16GB"}
```

Umm, what would 0 and 1 denote in this case, though? I think this form of max_memory dict is reserved for the pipelines.

For models, you probably want to specify module names (regex should work, too). Cc: @SunMarc
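For reference, a sketch of the two dict shapes under discussion; the module names below are illustrative, not taken from Qwen-Image.

```python
# Accelerate-style max_memory caps are keyed by device: integer GPU
# indices plus optional "cpu"/"disk" entries.
max_memory = {0: "16GB", 1: "16GB", "cpu": "30GB"}

# A hand-written device_map is keyed by (sub)module name instead,
# mapping each named module to the device it should live on.
device_map = {"transformer_blocks": 0, "norm_out": 1, "proj_out": "cpu"}
```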
