Size not match for 'packed_noisy_model_input' and 'latent_image_ids'. #15

@solitaryTian

When I run train_tdd_adv.sh, I get the following error:

Passing txt_ids 3d torch.Tensor is deprecated. Please remove the batch dimension and pass it as a 2d torch Tensor
Traceback (most recent call last):
  File "/maindata/data/shared/public/songtao.tian/test_code/Target-Driven-Distillation/train/FLUX/train_tdd_adv.py", line 1705, in <module>
    main(args)
  File "/maindata/data/shared/public/songtao.tian/test_code/Target-Driven-Distillation/train/FLUX/train_tdd_adv.py", line 1406, in main
    model_pred = transformer(
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/accelerate/utils/operations.py", line 823, in forward
    return model_forward(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/accelerate/utils/operations.py", line 811, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/transformers/transformer_flux.py", line 512, in forward
    encoder_hidden_states, hidden_states = torch.utils.checkpoint.checkpoint(
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner
    return disable_fn(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn
    return fn(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
    ret = function(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/transformers/transformer_flux.py", line 507, in custom_forward
    return module(*inputs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/transformers/transformer_flux.py", line 180, in forward
    attention_outputs = self.attn(
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 588, in forward
    return self.processor(
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 2318, in __call__
    query = apply_rotary_emb(query, image_rotary_emb)
  File "/maindata/data/shared/public/songtao.tian/anaconda3/envs/tdd/lib/python3.10/site-packages/diffusers/models/embeddings.py", line 1208, in apply_rotary_emb
    out = (x.float() * cos + x_rotated.float() * sin).to(x.dtype)
RuntimeError: The size of tensor a (4464) must match the size of tensor b (16320) at non-singleton dimension 2

I printed the shapes of the tensors passed to the transformer call in train_tdd_adv.py (shown below) and found that latent_image_ids does not match packed_noisy_model_input: packed_noisy_model_input.shape is torch.Size([4, 3952, 64]), while latent_image_ids.shape is torch.Size([15808, 3]).
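
The reported numbers look consistent with latent_image_ids having the batch dimension folded into its first axis. The quick check below uses only the shapes and error sizes quoted in this report. Reading the shared surplus as the number of text tokens is an assumption, not something the log states.

```python
# Shapes and sizes copied from the report above.
batch_size, image_tokens, channels = 4, 3952, 64  # packed_noisy_model_input.shape
id_rows = 15808                                   # latent_image_ids.shape[0]
tensor_a, tensor_b = 4464, 16320                  # sizes named in the RuntimeError

# latent_image_ids has exactly batch_size times as many rows as image tokens.
ratio = id_rows // image_tokens
assert id_rows == batch_size * image_tokens

# Both error sizes exceed their image-side counterpart by the same amount.
# Assumption: this surplus is the text-token count prepended before rotary embedding.
extra_a = tensor_a - image_tokens
extra_b = tensor_b - id_rows

print(ratio, extra_a, extra_b)  # prints: 4 512 512
```

So tensor a matches one sample's sequence length, while tensor b matches a position table built from ids for all 4 samples at once.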

model_pred = transformer(
    hidden_states=packed_noisy_model_input,
    # YiYi notes: divide it by 1000 for now because we scale it by 1000 in the transformer model
    # (we should not keep it but I want to keep the inputs same for the model for testing)
    timestep=timesteps / 1000,
    guidance=guidance,
    pooled_projections=pooled_prompt_embeds,
    encoder_hidden_states=prompt_embeds,
    txt_ids=text_ids,
    img_ids=latent_image_ids,
    return_dict=False,
)[0]
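
The deprecation warning in the log asks for ids without a batch dimension, so one possible workaround is to pass only a single sample's rows as img_ids. The sketch below is a hypothetical helper (first_sample_ids is not part of the training script or of diffusers). It assumes latent_image_ids stacks batch_size identical copies of the per-sample ids along the first axis, which has not been verified against the script.

```python
def first_sample_ids(ids, batch_size):
    """Return the rows belonging to the first sample of batch-flattened ids.

    Hypothetical helper. Assumes `ids` holds batch_size identical copies of
    the per-sample position ids, concatenated along the first axis. It only
    uses len() and slicing, so it accepts nested lists as well as tensors.
    """
    total_rows = len(ids)
    if total_rows % batch_size != 0:
        raise ValueError(
            f"{total_rows} rows cannot be split evenly across {batch_size} samples"
        )
    rows_per_sample = total_rows // batch_size
    return ids[:rows_per_sample]


# Toy example with nested lists: 2 samples, 3 position rows each.
per_sample = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
flattened = per_sample * 2
recovered = first_sample_ids(flattened, batch_size=2)
print(recovered == per_sample)  # prints: True
```

With the shapes reported here, the call would be first_sample_ids(latent_image_ids, 4), giving 3952 rows to match the second dimension of packed_noisy_model_input. If the mismatch instead comes from a diffusers version newer than the one the script was written for, aligning the installed version may be the more appropriate fix.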
