Hello,
I have been testing LongVie with the Wan2.1-I2V-14B model on an RTX 5090 (WSL2 environment). I encountered several blocking issues that required manual code fixes to get the inference running.
Here is a summary of the bugs and the fixes I applied:
**Hardcoded Internal Proxy in Shell Script**

In `sample_longvideo.sh`, the first line exports a private internal proxy (`10.66.70.227`). **Issue:** this causes connection timeouts for anyone outside your internal network. **Fix:** removed the `export http_proxy=...` line.
**Crash Loading Sharded Models (`utils.py`)**

The Wan2.1-I2V-14B model downloads as multiple `.safetensors` files (a sharded checkpoint). **Issue:** `load_state_dict` in `diffsynth/models/utils.py` crashes because it expects a single string path but receives a list of filenames for sharded models. **Fix:** I patched `load_state_dict` to handle lists recursively:
```python
# In diffsynth/models/utils.py
def load_state_dict(file_path, torch_dtype=None, device="cpu"):
    # --- ADDED THIS BLOCK ---
    if isinstance(file_path, list):
        state_dict = {}
        for f in file_path:
            state_dict.update(load_state_dict(f, torch_dtype=torch_dtype, device=device))
        return state_dict
    # ------------------------
    if file_path.endswith(".safetensors"):
        return load_state_dict_from_safetensors(file_path, torch_dtype=torch_dtype, device=device)
    else:
        return load_state_dict_from_bin(file_path, torch_dtype=torch_dtype, device=device)
```
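For reference, here is a minimal sketch of how the patched loader can be called with a shard list. The `glob` collection and the checkpoint path are illustrative only, not how `inference.py` actually builds the list:

```python
import glob
import torch
from diffsynth.models.utils import load_state_dict

# Collect all shards, then let the patched load_state_dict recurse
# over the list and merge them into a single state dict.
shard_paths = sorted(glob.glob("models/Wan2.1-I2V-14B-480P/*.safetensors"))
state_dict = load_state_dict(shard_paths, torch_dtype=torch.bfloat16, device="cpu")
print(f"Merged {len(shard_paths)} shards into {len(state_dict)} tensors.")
```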
**Typo in T5 Encoder Filename (`inference.py`)**

**Issue:** in `inference.py`, the T5 encoder filename is misspelled as `...models_t5_umt05...`. **Fix:** changed `umt05` to `umt5` to match the actual Hugging Face filename (`models_t5_umt5-xxl-enc-bf16.pth`).
**Invalid Device Type `"gpu"` (`inference.py`)**

**Issue:** the VAE config in `inference.py` uses `offload_device="gpu"`. **Error:** `RuntimeError: ... device type at start of device string: gpu`. **Fix:** changed `"gpu"` to `"cuda"`.
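The failure is easy to reproduce in isolation, since PyTorch validates device strings when constructing a `torch.device`:

```python
import torch

torch.device("cuda")  # valid: "cuda" is a recognized device type
try:
    torch.device("gpu")  # invalid: "gpu" is not a PyTorch device type
except RuntimeError as e:
    print(e)  # "Expected one of cpu, cuda, ... device type at start of device string: gpu"
```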
**Missing Dependency**

**Issue:** `import decord` fails. **Fix:** added `decord` to `requirements.txt`.
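After installing it, a quick sanity check that `decord` can open a video (the file path is illustrative):

```python
from decord import VideoReader

vr = VideoReader("sample.mp4")  # any local video file
print(len(vr), vr[0].shape)     # frame count and (H, W, C) of the first frame
```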
**Folder Naming Inconsistencies**

- `download_wan2.1.py` downloads to a folder named `Wan-AI`, but `inference.py` looks for `Wan2.1-I2V-14B-480P` (a symlink workaround is sketched below).
- Scripts reference `models` (plural), but some instructions imply `model` (singular).
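A minimal workaround, assuming the weights land in `Wan-AI/Wan2.1-I2V-14B-480P` relative to the repo root (both paths are assumptions; adjust to your actual layout):

```python
import os

# Hypothetical paths: expose the downloaded folder under the name
# inference.py expects, without duplicating the checkpoint on disk.
src = "Wan-AI/Wan2.1-I2V-14B-480P"
dst = "Wan2.1-I2V-14B-480P"
if not os.path.exists(dst):
    os.symlink(src, dst)
```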
**Environment:**

- GPU: RTX 5090 (requires CUDA 12.8 / PyTorch nightly cu128 / FlashAttention built from source for sm_120)
- OS: WSL2 (Ubuntu 22.04)
Thanks for the great work on the model!