Demo #4

@Bigtable123

I have already downloaded the ROICtrl adapter and MovieAgent-ROICtrl-Frozen into the weights folder. Why is it still trying to download weights when I run run.sh?
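
The warning pasted below is very long, so a quick summary of which modules the listed parameters belong to may help when reading it. The following is a minimal sketch using only the Python standard library; `summarize_missing_keys` and the family labels are hypothetical names introduced for illustration and are not part of MovieAgent, ROICtrl, or diffusers. It extracts the quoted parameter names from the warning text and counts them by coarse module family.

```python
import re
from collections import Counter


def summarize_missing_keys(warning_text: str) -> Counter:
    """Count the quoted parameter names in a loader warning by module family.

    Hypothetical helper for illustration; the family labels are assumptions
    based on the names visible in the pasted warning.
    """
    # Every parameter name in the warning is wrapped in single quotes.
    keys = re.findall(r"'([^']+)'", warning_text)
    families = Counter()
    for key in keys:
        if ".roifuser." in key:
            families["roifuser"] += 1
        elif key.startswith("position_net."):
            families["position_net"] += 1
        else:
            families["other"] += 1
    return families


# Three names copied from the warning, used here as a small sample input.
sample = (
    "Some weights of UNet2DConditionModel were not initialized from the model "
    "checkpoint at ./weight/MovieAgent-ROICtrl-Frozen/FrozenII/combined_model_base "
    "and are newly initialized: "
    "['up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', "
    "'position_net.null_position_feature', "
    "'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_dense']"
)

summary = summarize_missing_keys(sample)
print(dict(summary))
```

Running the same function over the full warning text would show whether any name falls outside the `roifuser` and `position_net` families, which is a quicker check than reading the list by eye.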

Some weights of UNet2DConditionModel were not initialized from the model checkpoint at ./weight/MovieAgent-ROICtrl-Frozen/FrozenII/combined_model_base and are newly initialized: ['up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 
'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'position_net.null_position_feature', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 
'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 
'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 
'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 
'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 
'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 
'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 
'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'position_net.linears.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 
'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'position_net.linears.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 
'mid_block.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'position_net.linears.4.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'position_net.linears.4.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 
'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionEDLoRAPipeline and will be ignored.

Since I’m running the demo on a remote server, downloading there is very slow. My usual approach is to download the weights locally and then upload them to the server, which is much faster. Is there a way to download the weights for these attention layers directly?
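To narrow this down, here is a minimal sketch (my own, not taken from the MovieAgent or ROICtrl code) of how the key names in the warning could be grouped and compared against whatever parameter names a local weight file provides. The helper names `summarize_missing_keys` and `keys_absent_from` are hypothetical, and the sketch assumes the file's key names would match the names in the warning exactly, which may not hold if the adapter stores them under a different prefix. Reading the keys out of the adapter file is left out because I don't know its format.

```python
from collections import Counter


def summarize_missing_keys(keys, marker="roifuser"):
    """Count warned parameter names, grouping every name that contains
    `.marker.` under `marker` and all others under their top-level module."""
    counts = Counter()
    for key in keys:
        group = marker if f".{marker}." in key else key.split(".")[0]
        counts[group] += 1
    return dict(counts)


def keys_absent_from(keys, available_keys):
    """Return the warned names that a given weight file does not provide.

    `available_keys` is any iterable of parameter names read from a local
    file (hypothetical input; exact name matching is an assumption).
    """
    available = set(available_keys)
    return sorted(k for k in keys if k not in available)


# Three names copied from the warning above.
missing = [
    "up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias",
    "mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn",
    "position_net.linears.0.weight",
]

print(summarize_missing_keys(missing))
# → {'roifuser': 2, 'position_net': 1}
```

If `keys_absent_from(missing, adapter_keys)` came back empty for the ROICtrl adapter file I already downloaded, that would suggest the layers are supplied by the adapter in a later loading step rather than fetched separately.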
