Summary
During video generation with the wan2.1 pipeline, PrepareVideoLatentsBlock is failing with CUDA out-of-memory errors when attempting to allocate ~9 GiB for VAE encoding.
Error Details
Error in block: PrepareVideoLatentsBlock
Error in block: (auto_prepare_latents, AutoPrepareLatentsBlock)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.07 GiB.
GPU 0 has a total capacity of 79.18 GiB of which 4.31 GiB is free.
Including non-PyTorch memory, this process has 74.86 GiB memory in use.
Of the allocated memory 65.06 GiB is allocated by PyTorch, and 9.12 GiB
is reserved by PyTorch but unallocated.
Location
src/scope/core/pipelines/wan2_1/blocks/prepare_video_latents.py line ~82:
latents = components.vae.encode_to_latent(block_state.video)
Frequency
10+ occurrences in the last 12 hours (fal.ai staging logs)
Possible Causes
- Memory fragmentation — PyTorch has 9+ GiB reserved but unallocated
- Large video batch — VAE encoding entire video tensor at once
- No memory cleanup — Previous tensors not garbage collected before this allocation
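The fragmentation hypothesis can be checked directly from the numbers in the error message: PyTorch reports memory it has reserved from the driver separately from memory its tensors actually occupy. A minimal sketch (the helper name is made up for illustration; on a CUDA machine the inputs would come from `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()`):

```python
def fragmentation_gap_gib(allocated_bytes: int, reserved_bytes: int) -> float:
    """Gap between what PyTorch has reserved from the CUDA driver and what
    its live tensors actually occupy. A large gap suggests fragmentation:
    the allocator holds memory but cannot satisfy one large contiguous
    allocation from it."""
    return (reserved_bytes - allocated_bytes) / 2**30


# Numbers mirror the traceback above: 65.06 GiB allocated by PyTorch,
# 9.12 GiB reserved but unallocated.
allocated = int(65.06 * 2**30)
reserved = int((65.06 + 9.12) * 2**30)
gap = fragmentation_gap_gib(allocated, reserved)
print(f"{gap:.2f} GiB reserved but unallocated")  # ~9.12 GiB
```

A gap this close to the failed 9.07 GiB allocation is consistent with fragmentation rather than true exhaustion.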
Suggested Fixes
- Add PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to reduce fragmentation
- Consider chunked VAE encoding for large videos
- Add explicit torch.cuda.empty_cache() before large allocations
- Implement graceful degradation (reduce resolution/frames on memory pressure)
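The chunked-encoding idea could look roughly like the sketch below. It assumes `vae.encode_to_latent` accepts a `[B, C, T, H, W]` video tensor (only the call itself appears in the traceback; the signature is an assumption) and that encoding is roughly independent along the frame axis. A causal/temporal VAE like Wan's may need overlap or stateful handling between chunks, which this sketch omits:

```python
import torch


def encode_in_chunks(vae, video: torch.Tensor, chunk_size: int = 8) -> torch.Tensor:
    """Encode a [B, C, T, H, W] video through the VAE in frame chunks
    instead of one ~9 GiB allocation. `chunk_size` is a tuning knob:
    smaller values lower peak memory at the cost of more kernel launches.
    """
    chunks = []
    for start in range(0, video.shape[2], chunk_size):
        chunk = video[:, :, start:start + chunk_size]
        chunks.append(vae.encode_to_latent(chunk))
        # Release cached allocator blocks before the next chunk; a no-op
        # when CUDA is not in use.
        torch.cuda.empty_cache()
    return torch.cat(chunks, dim=2)
```

For a temporally compressing VAE the chunk boundaries would need to align with the model's temporal stride; the hypothetical helper above sidesteps that detail.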
Related
- After a CUDA OOM during pipeline load you cannot load the pipeline with a smaller resolution that should work or load a different pipeline #136 (OOM during pipeline load — different scenario, but related memory management)
Filed automatically by Tess via Grafana error monitoring