I have already downloaded the ROICtrl adapter and MovieAgent-ROICtrl-Frozen into the weights folder. Why is it still trying to download weights when I run run.sh? This is the output I get:
Some weights of UNet2DConditionModel were not initialized from the model checkpoint at ./weight/MovieAgent-ROICtrl-Frozen/FrozenII/combined_model_base and are newly initialized: ['up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 
'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'position_net.null_position_feature', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 
'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 
'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 
'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 
'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 
'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 
'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 
'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'position_net.linears.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 
'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'position_net.linears.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 
'mid_block.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'position_net.linears.4.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'position_net.linears.4.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 
'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionEDLoRAPipeline and will be ignored.
Since I’m running the demo on a remote server, the download process is very slow. My usual approach is to download the weights locally and then upload them to the server, which is much faster. Is there any way to directly download these attention layers?
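To narrow this down, here is a minimal sketch for summarizing which modules the "newly initialized" parameter names belong to. It only parses the warning text pasted above. The function name `summarize_missing_keys` and the grouping into `roifuser`, `position_net`, and `other` are my own assumptions for illustration, not part of the repository or of diffusers. If every reported name falls under `roifuser` or `position_net`, that would suggest (but not prove) that the message concerns the ROI-control layers missing from the base checkpoint rather than a download in progress.

```python
import re
from collections import Counter


def summarize_missing_keys(warning_text: str) -> Counter:
    """Count parameter names from a 'newly initialized: [...]' warning by module group.

    Hypothetical helper for reading the log; the group names are illustrative.
    """
    # Only look inside the bracketed list so that later log lines
    # (e.g. "{'safety_checker': None}") are not counted as parameters.
    match = re.search(r"newly initialized: \[(.*?)\]", warning_text, re.S)
    if match is None:
        return Counter()

    groups = Counter()
    for key in re.findall(r"'([^']+)'", match.group(1)):
        if ".roifuser." in key:
            groups["roifuser"] += 1
        elif key.startswith("position_net."):
            groups["position_net"] += 1
        else:
            groups["other"] += 1
    return groups


# A shortened sample built from names that appear in the log above.
sample = (
    "Some weights of UNet2DConditionModel were not initialized ... "
    "and are newly initialized: "
    "['up_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm2.bias', "
    "'position_net.linears.0.weight', 'position_net.null_position_feature'] "
    "Keyword arguments {'safety_checker': None} are not expected"
)
print(summarize_missing_keys(sample))
```

Running the same function on the full saved log (for example, the captured output of run.sh) would show whether any parameter outside those two groups is affected.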
'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'position_net.linears.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.norm2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 
'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.norm1.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'position_net.linears.4.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_reduction.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_dense', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.alpha_dense', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 'mid_block.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_v.weight', 'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.context_proj.bias', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_v.weight', 
'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.alpha_coord_self_attn', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_q.weight', 'up_blocks.3.attentions.2.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.ff.net.0.proj.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm3.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.norm1.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_q.weight', 'up_blocks.3.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.1.attentions.1.transformer_blocks.0.roifuser.ff.net.2.bias', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.context_proj.weight', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'down_blocks.2.attentions.1.transformer_blocks.0.roifuser.norm3.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.0.transformer_blocks.0.roifuser.ff.net.2.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_k.weight', 'up_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm2.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'down_blocks.2.attentions.0.transformer_blocks.0.roifuser.context_proj.bias', 'mid_block.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.roifuser.norm1.bias', 'down_blocks.0.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_out.0.weight', 'up_blocks.2.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.2.bias', 
'up_blocks.2.attentions.2.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.bias', 'up_blocks.1.attentions.2.transformer_blocks.0.roifuser.ff.net.2.bias', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.roi_self_attn.to_out.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.roiattn_fuser.channel_gate.mlp.0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.3.attentions.1.transformer_blocks.0.roifuser.coord_self_attn.to_k.weight', 'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Keyword arguments {'safety_checker': None} are not expected by StableDiffusionEDLoRAPipeline and will be ignored.

Since I’m running the demo on a remote server, the download process is very slow. My usual approach is to download the weights locally and then upload them to the server, which is much faster. Is there any way to directly download these attention layers?
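
In case it helps with diagnosing this, here is a minimal sketch for summarizing which submodules the warning reports as newly initialized. The function name `summarize_missing_keys` and the grouping rule are my own assumptions for illustration, not part of the MovieAgent or ROICtrl code, and the sample string contains only three keys copied from the log above.

```python
import re
from collections import Counter


def summarize_missing_keys(warning_text):
    """Count the quoted parameter names in the first [...] list by submodule.

    Keys containing '.roifuser.' are grouped under 'roifuser'; every other
    key is grouped by its top-level module name (e.g. 'position_net').
    """
    start = warning_text.index("[")
    end = warning_text.index("]", start)
    keys = re.findall(r"'([^']+)'", warning_text[start:end])

    counts = Counter()
    for key in keys:
        if ".roifuser." in key:
            counts["roifuser"] += 1
        else:
            counts[key.split(".")[0]] += 1
    return counts


# Three keys copied from the warning above, used here only as a sample.
sample = (
    "are newly initialized: ["
    "'up_blocks.2.attentions.0.transformer_blocks.0.roifuser.ff.net.0.proj.bias', "
    "'position_net.linears.4.bias', "
    "'down_blocks.1.attentions.1.transformer_blocks.0.roifuser.alpha_roi_self_attn']"
)
counts = summarize_missing_keys(sample)
print(dict(counts))
```

Running this on the full warning text should show whether every reported key belongs to the `roifuser` and `position_net` modules, which would suggest that these are the ROICtrl adapter weights rather than the base UNet weights.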