
Conversation

Dango233 commented Nov 2, 2025

Recreated from #1566, rebased to align with Upstream.

For those who have installed previously: please delete the previous main branch and pull this one instead.

Dango233 and others added 30 commits October 31, 2025 23:31
…_embeds and total_frames parameters to improve frame-count inference; fix error messages in WanVideoSampler to provide clearer debugging information; adjust global_tokens handling in WanModel to support auto mode.
…frame-count inference; remove the total_frames parameter and instead infer the frame count from image_embeds; adjust global_tokens handling to support the new parameter structure.
…o_or_number parameter validation logic; ensure its value is greater than 0, supporting both ratio and absolute-number modes; adjust the pooled_tokens calculation logic to fit the new parameter structure.
Introduce per-shot lora.
Should work even without Holocine.
- Better lora handling
- Fix a couple of attention discrepancies (not fully aligned yet)
BNP1111 commented Nov 3, 2025

Loading transformer parameters to cuda:0: 100%|███████████████████████████████████| 1095/1095 [00:07<00:00, 140.43it/s]
Using 1053 LoRA weight patches for WanVideo model
Input sequence length: 126360
Sampling 321 frames at 832x480 with 5 steps
0%| | 0/5 [00:00<?, ?it/s]Error during model prediction: self and mat2 must have the same dtype, but got Float and Half
0%| | 0/5 [00:02<?, ?it/s]
Error during sampling: self and mat2 must have the same dtype, but got Float and Half
!!! Exception during processing !!! self and mat2 must have the same dtype, but got Float and Half
Traceback (most recent call last):
File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 510, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 324, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 286, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 3261, in process
raise e
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 3146, in process
noise_pred, noise_pred_ovi, self.cache_state = predict_with_cfg(
^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 1768, in predict_with_cfg
raise e
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 1618, in predict_with_cfg
noise_pred_cond, noise_pred_ovi, cache_state_cond = transformer(
^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 3387, in forward
x, x_ip, lynx_ref_feature, x_ovi = block(x, x_ip=x_ip, lynx_ref_feature=lynx_ref_feature, x_ovi=x_ovi, **kwargs) #run block
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 375, in call
return super().call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 736, in compile_wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 1235, in forward
q, k, v = self.self_attn.qkv_fn(input_x)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 469, in qkv_fn
q = self.norm_q(self.q(x).to(self.norm_q.weight.dtype)).to(x.dtype).view(b, s, n, d)
^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\custom_linear.py", line 107, in forward
weight = self.apply_lora(weight).to(self.compute_dtype)
File "E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\custom_linear.py", line 107, in torch_dynamo_resume_in_forward_at_107
weight = self.apply_lora(weight).to(self.compute_dtype)
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_dynamo\eval_frame.py", line 929, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_functorch\aot_autograd.py", line 1241, in forward
return compiled_fn(full_args)
^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_functorch_aot_autograd\runtime_wrappers.py", line 384, in runtime_wrapper
all_outs = call_func_at_runtime_with_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_functorch_aot_autograd\utils.py", line 126, in call_func_at_runtime_with_args
out = normalize_as_list(f(args))
^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_functorch_aot_autograd\runtime_wrappers.py", line 750, in inner_fn
outs = compiled_fn(args)
^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_functorch_aot_autograd\runtime_wrappers.py", line 556, in wrapper
return compiled_fn(runtime_args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\output_code.py", line 584, in call
return self.current_callable(inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch_inductor\utils.py", line 2716, in run
out = model(new_inputs)
^^^^^^^^^^^^^^^^^
File "E:\ComfyUI_windows_portable\temp\torchinductor_admin\5o\c5o465fsiajxgf3k75dsy2kh2ahcnjidecfkbzddpraerz4ev4kz.py", line 89, in call
extern_kernels.addmm(arg1_1, reinterpret_tensor(arg2_1, (126360, 5120), (5120, 1), 0), reinterpret_tensor(buf0, (5120, 5120), (1, 5120), 0), alpha=1, beta=1, out=buf1)
RuntimeError: self and mat2 must have the same dtype, but got Float and Half

Prompt executed in 12.94 seconds

After updating, I get this error.

Dango233 (Author) commented Nov 3, 2025

Oh, hmm. Try not to use torch compile for now. I'll look into whether we can support compile at all.

a3nima commented Nov 3, 2025

I've tried to recreate the painter scene from HoloCine's project page, with 241 frames (15 sec) at 720x408, but encountered an error. I had disabled torch compile.

23:29:22.643 [Info] Sent Comfy backend direct prompt requested to backend #0 (from user local)
23:29:31.049 [Warning] [ComfyUI-0/STDERR] Traceback (most recent call last):
23:29:31.050 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 510, in execute
23:29:31.050 [Warning] [ComfyUI-0/STDERR] output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
23:29:31.051 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.051 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 324, in get_output_data
23:29:31.052 [Warning] [ComfyUI-0/STDERR] return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
23:29:31.052 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.053 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in _async_map_node_over_list
23:29:31.053 [Warning] [ComfyUI-0/STDERR] await process_inputs(input_dict, i)
23:29:31.054 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 286, in process_inputs
23:29:31.054 [Warning] [ComfyUI-0/STDERR] result = f(**inputs)
23:29:31.055 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^
23:29:31.055 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 3261, in process
23:29:31.056 [Warning] [ComfyUI-0/STDERR] raise e
23:29:31.056 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 3146, in process
23:29:31.057 [Warning] [ComfyUI-0/STDERR] noise_pred, noise_pred_ovi, self.cache_state = predict_with_cfg(
23:29:31.057 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^
23:29:31.058 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 1768, in predict_with_cfg
23:29:31.058 [Warning] [ComfyUI-0/STDERR] raise e
23:29:31.059 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\nodes_sampler.py", line 1618, in predict_with_cfg
23:29:31.059 [Warning] [ComfyUI-0/STDERR] noise_pred_cond, noise_pred_ovi, cache_state_cond = transformer(
23:29:31.060 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^
23:29:31.060 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
23:29:31.060 [Warning] [ComfyUI-0/STDERR] return self._call_impl(*args, **kwargs)
23:29:31.061 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.061 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
23:29:31.062 [Warning] [ComfyUI-0/STDERR] return forward_call(*args, **kwargs)
23:29:31.063 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.064 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 3387, in forward
23:29:31.065 [Warning] [ComfyUI-0/STDERR] x, x_ip, lynx_ref_feature, x_ovi = block(x, x_ip=x_ip, lynx_ref_feature=lynx_ref_feature, x_ovi=x_ovi, **kwargs) #run block
23:29:31.065 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.066 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
23:29:31.066 [Warning] [ComfyUI-0/STDERR] return self._call_impl(*args, **kwargs)
23:29:31.067 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.068 [Warning] [ComfyUI-0/STDERR] File "d:\BILD-AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
23:29:31.068 [Warning] [ComfyUI-0/STDERR] return forward_call(*args, **kwargs)
23:29:31.069 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.069 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 1237, in forward
23:29:31.070 [Warning] [ComfyUI-0/STDERR] q, k = apply_rope_comfy(q, k, freqs)
23:29:31.070 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23:29:31.071 [Warning] [ComfyUI-0/STDERR] File "D:\BILD-AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-WanVideoWrapper\wanvideo\modules\model.py", line 129, in apply_rope_comfy
23:29:31.071 [Warning] [ComfyUI-0/STDERR] xq_out = freqs_cis[..., 0] * xq[..., 0] + freqs_cis[..., 1] * xq[..., 1]
23:29:31.072 [Warning] [ComfyUI-0/STDERR] ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
23:29:31.073 [Warning] [ComfyUI-0/STDERR] RuntimeError: The size of tensor a (71370) must match the size of tensor b (69998) at non-singleton dimension 1
23:29:31.073 [Warning] [ComfyUI-0/STDERR]

Dango233 (Author) commented Nov 4, 2025

I've tried to recreate the painter scene from HoloCine's project page, with 241 frames (15 sec) at 720x408, but encountered an error. I had disabled torch compile.


Would you share your workflow? I can't replicate this...

a3nima commented Nov 4, 2025

caslix commented Nov 5, 2025

Dango233, thank you for your implementation; it is well done. The WanVideo VRAM Management node is not working, though. Can you fix it in the Kijai wrapper?

mr-lab commented Nov 5, 2025

WAN 2.2 T2V HOLOCINE WIP_b.json

There's an issue with LoRA; it was fixed on the main branch, but for now toggle merge LoRA ON.
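
For context, a hedged sketch of why merging presumably helps (this is not the wrapper's actual code): merging folds the low-rank LoRA delta into the base weight once at load time, so the per-step runtime patching path seen in the traceback above (apply_lora in custom_linear.py), where the fp32/fp16 mismatch occurs, is never executed.

```python
import torch

# Illustrative sketch only, assuming standard LoRA shapes:
# lora_a is (rank, in_features), lora_b is (out_features, rank).
def merge_lora(weight: torch.Tensor, lora_a: torch.Tensor,
               lora_b: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    # Fold the delta in once, in fp32 for accuracy, then cast back so every
    # later matmul sees a single consistent dtype.
    delta = scale * (lora_b.float() @ lora_a.float())
    return (weight.float() + delta).to(weight.dtype)
```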

tuolaku commented Nov 6, 2025

In the latest version, with Merge Loras ON, the error is as follows:
assert dtype in [torch.float16, torch.bfloat16], "Input tensors must be in dtype of torch.float16 or torch.bfloat16"
This can be resolved by modifying model.py to add an explicit conversion to half precision:
q, k, v = q.half(), k.half(), v.half()
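
For reference, a minimal sketch of where such a cast could live; the helper name below is an assumption, not the repo's code. The assert quoted above comes from an attention kernel that expects fp16/bf16 inputs:

```python
import torch

# Cast q/k/v to half precision only when an upstream op (e.g. an fp32 LoRA
# patch) has promoted the activations, so the attention assert is satisfied.
def ensure_half(q, k, v):
    if q.dtype not in (torch.float16, torch.bfloat16):
        q, k, v = q.half(), k.half(), v.half()
    return q, k, v
```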

Separately, can multiple different context windows be processed over multiple rounds of inference? Single-pass inference consumes a lot of memory.

a3nima commented Nov 6, 2025

WAN 2.2 T2V HOLOCINE WIP_b.json

There's an issue with LoRA; it was fixed on the main branch, but for now toggle merge LoRA ON.

Thanks for the suggestion, but at least with the LoRA loader in my workflow it didn't change the result; I'm still getting the error. Meanwhile, I've installed a fresh ComfyUI: Python 3.12.10, CUDA 12.8, PyTorch 2.8.

@ivandobskygithub

@a3nima Thanks for sharing your workflow. I tried it and got the same error, but changing the resolution to the one in the upstream HoloCine project (480x832) fixed it for me, at least.
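
That fix lines up with the size mismatch in apply_rope_comfy above. A hedged sanity check, assuming a Wan-style 8x VAE stride plus 2x2 patchify (so both dimensions should be divisible by 16); the helper below is illustrative, not part of the wrapper:

```python
# 720x408 gives 408/16 = 25.5, a non-integer token grid, which can desync
# the precomputed RoPE table from the actual sequence length.
def check_resolution(width: int, height: int, spatial_div: int = 16) -> bool:
    if width % spatial_div or height % spatial_div:
        print(f"{width}x{height}: not divisible by {spatial_div}; "
              f"try {width - width % spatial_div}x{height - height % spatial_div}")
        return False
    return True

check_resolution(720, 408)  # False -> shape mismatch like the RoPE error above
check_resolution(832, 480)  # True  -> the resolution that worked
```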

tuolaku commented Nov 14, 2025

I sincerely hope that kijai can merge this into the main branch, because HoloCine can truly narrow the gap with closed-source models. Please look at this:

WanVideo2_2_I2V_00004.mp4

@kabachuha (Contributor)

@kijai Can you review this? ❤️

5agado commented Nov 14, 2025

If applying a LoRA, I need to use "base_precision" fp32 for the WanVideo Model Loader; otherwise I get:
RuntimeError: self and mat2 must have the same dtype, but got Float and Half
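
The error itself is easy to reproduce in isolation: torch.addmm (visible at the bottom of the compile traceback earlier in this thread) requires all operands to share one dtype. A standalone repro, independent of the wrapper:

```python
import torch

bias = torch.zeros(4, dtype=torch.float32)   # fp32 bias ("self" in addmm)
x = torch.randn(2, 4, dtype=torch.float32)   # fp32 activations
w = torch.randn(4, 4, dtype=torch.float16)   # half-precision weight ("mat2")
try:
    torch.addmm(bias, x, w.t())              # mixed dtypes -> RuntimeError
except RuntimeError as e:
    print(e)  # a dtype-mismatch message like the "Float and Half" one above
```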

railep commented Nov 16, 2025

Since Kijai is swamped with other things, I did a small workaround to use HoloCine alongside Bindweave and the other updates. I downloaded this repo and deleted the node names in the Python node files except for the loader (renamed "HoloCine Loader"), the sampler (also renamed), and the four HoloCine nodes. In the __init__, I deleted most lines except for the model_loader_nodes and the sampler_nodes. Now it works for HoloCine, and everything else works too.
Btw: I had to add the line mentioned above (q, k, v = q.half(), k.half(), v.half()) and change the resolution to make it work.
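
For anyone attempting the same trim: ComfyUI node packs register their nodes through NODE_CLASS_MAPPINGS in __init__.py, so the cut roughly amounts to the sketch below (module and class names here are illustrative assumptions, not the repo's actual ones):

```python
# Hypothetical trimmed __init__.py; import paths are assumptions.
from .nodes_model_loading import WanVideoModelLoader
from .nodes_sampler import WanVideoSampler

NODE_CLASS_MAPPINGS = {
    # Renamed so they don't collide with an existing WanVideoWrapper install.
    "HoloCineModelLoader": WanVideoModelLoader,
    "HoloCineSampler": WanVideoSampler,
}
NODE_DISPLAY_NAME_MAPPINGS = {
    "HoloCineModelLoader": "HoloCine Loader",
    "HoloCineSampler": "HoloCine Sampler",
}
```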

a3nima commented Nov 17, 2025

@railep Can you put the working HoloCine repo up for trying?

railep commented Nov 17, 2025

@railep Can you put the working HoloCine repo up for trying?

Here you go: https://github.com/railep/ComfyUI-WanVideoWrapper-Multishot
Clone the repo into your custom nodes and ignore the warning about two identical repos; they shouldn't interfere with each other, since I lobotomized the repo and deleted every node except the HoloCine ones, renaming them accordingly (so there are technically two WanVideoWrapper samplers, but one is called HoloCine and works with HoloCine). Use the workflow from example_workflows and it should work.

@Dango233 (Author)

I've been too busy these weeks, but I can tidy it up as soon as I have some time...

@kabachuha (Contributor)

@railep The memory usage is too high. I used to fit the example fully in VRAM with Dango's repo.

The float/half issue is also present by default.

kmpoaquests commented Nov 20, 2025

To those experiencing issues with the workflow, make the following changes:

Model loader:
base precision -> fp32
quantization -> fp8 scaled

(I question the default setting of the text encoder, since it's at fp8, not scaled.)

Then, if you have a larger video card like mine (an RTX 6000 Pro Blackwell), set block swap to 0 blocks to swap.

I did 6 steps at 832x480 @ 241 frames. The video followed the prompts, but it seems a bit soft, with grainy artifacts; it could be sharper, and I'm not sure if that's because I need to generate at higher resolutions. I also noticed the enhance node, but it's tied to TeaCache, which as far as I'm aware is only compatible with Wan 2.1, so I'm unsure how to proceed. Lastly, the motion of the subject almost seems too fast or unnatural: taking off a jacket or a hat just peels away like a fruit roll-up.

kmpoaquests commented Dec 2, 2025

@Dango233 Any chance we can have some multi-GPU support here? I modded my computer to house an RTX 5090 along with an RTX 6000 Pro Blackwell (96 GB VRAM). On a second note, the first pass in the KSampler seems insanely long, around 15 minutes, while the second sampler takes about 70-120 seconds, which makes me wonder what the computer is working on in that span of time. FYI, I am working with the default number of scenes, making each one 5 seconds.

a3nima commented Dec 4, 2025

I've been too busy these weeks, but I can tidy it up as soon as I have some time...

Hey Dango, do you think you can make this ready to become part of the wrapper? It would be so nice 🤩

@kmpoaquests

@Dango233 Is this project dead? Also, can you go over how your nodes make HoloCine work while a regular T2V workflow doesn't do as well? I'm trying to wrap my head around the mechanics, beyond the LoRA issue of a LoRA applying to the entire scene, which makes sense to split out per segment.

@Dango233 (Author)

It's not dead... it's just that I'm too busy with my day job.

I'll try to catch up with the issues and sort them.

tuolaku commented Dec 11, 2025

It's not dead... it's just that I'm too busy with my day job.

I'll try to catch up with the issues and sort them.

Please..... >_<

@seitanism

It's not dead... it's just that I'm too busy with my day job.

I'll try to catch up with the issues and sort them.

that would be really awesome!

@camoody1

@Dango233 I'm really hoping you'll be able to get back to this now that Christmas has passed. HoloCine looks like a much better implementation than any of the other long-video options I've seen recently.

@kmpoaquests

I don't understand what problems this project still has. I've used it for a long time and basically haven't encountered any serious issues.

One of the issues I experienced is that if you use a normal T2V workflow with HoloCine and apply LoRAs as you normally would, it kind of ruins the output unless you use the workflow and nodes Dango233 made. At this point it seems easier to use SVI 2.0, which works with Wan 2.2; the amount of time and effort here is rather high.

On another note, the way I originally installed this node seems to have broken my original WanVideoWrapper install; I'm not sure if that's intentional. On an RTX 6000 Pro Blackwell with 96 GB VRAM, I struggled to produce anything at 400 frames. It feels like a giant waste of time, and I'm hoping someone has some rectifying thoughts or opinions.

Dango233 (Author) commented Dec 30, 2025

@Dango233 I'm really hoping you'll be able to get back to this now that Christmas has passed. HoloCine looks like a much better implementation than any of the other long-video options I've seen recently.

Unfortunately, we Asian people don't get a Christmas holiday LOL

@Dango233 (Author)

Converting this to a draft in order not to confuse people.

Dango233 marked this pull request as draft December 30, 2025 14:14