ComfyUI-VRAM-Manager

ComfyUI-VRAM-Manager (formerly ComfyUI-DistorchMemoryManager) is an independent memory management custom node for ComfyUI. It provides Distorch memory management for efficient GPU/CPU memory handling, supports purging of SeedVR2, Qwen3-VL, and Nunchaku models (FLUX/Z-Image/Qwen-Image), and includes a Model Patch Memory Cleaner for ModelPatchLoader workflows.

Overview

This custom node was created to address OOM (Out Of Memory) errors in video generation workflows such as upscaling with WAN2.2. The key point is that these OOM errors are caused by a shortage of system RAM, not VRAM; they can occur even on 64GB RAM systems, depending on resolution and video length.

This is a completely original implementation designed specifically for Distorch memory management. Simply place it in the custom_nodes folder for easy installation and removal.

Features

Four Node Types

Model Patch Memory Cleaner (New in v1.2.0)

  • Description: Memory cleaner specifically for ModelPatcher loaded model patches
  • Features: Clears model patches loaded via ModelPatchLoader to prevent OOM during upscaling
  • Input: Any data type (ANY) passthrough
  • Output: Any data type (ANY) passthrough
  • Options:
    • clear_model_patches: Clear model patches loaded via ModelPatchLoader (default: True)
    • clean_gpu: Clear GPU memory (default: True)
    • force_gc: Force garbage collection (default: True)
  • Use Case: Place this node after using ModelPatchLoader (e.g., Z-Image ControlNet, QwenImage BlockWise ControlNet, SigLIP MultiFeat Proj) and before upscaling operations to prevent OOM errors. This node is designed for patch model format loaded via ModelPatchLoader, which is an exceptional format different from standard ControlNet models.
  • Technical Details:
    • Detects ModelPatcher instances with additional_models or attachments containing model patches
    • Safely unloads model patches from VRAM
    • Performs cleanup_models_gc() to prevent memory leaks
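The detection step described above can be sketched roughly as follows. This is a hedged illustration, not the node's actual code: `find_patch_holders` is a hypothetical helper, and `additional_models`/`attachments` are the ModelPatcher attribute names mentioned above.

```python
import gc

def find_patch_holders(attr_names=("additional_models", "attachments")):
    """Scan live Python objects for ModelPatcher-like instances that
    carry model patches in additional_models or attachments."""
    holders = []
    for obj in gc.get_objects():
        try:
            # An object counts as a holder if any tracked attribute is non-empty.
            if any(getattr(obj, name, None) for name in attr_names):
                holders.append(obj)
        except Exception:
            continue  # some objects raise on attribute access; skip them
    return holders
```

Once holders are found, the node would unload each one from VRAM and run `cleanup_models_gc()`, as listed above.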

Purge VRAM V2 Compatibility (v1.1.0, Enhanced in v1.2.0, v2.0.0, v2.2.0)

  • Description: A restoration of LayerStyle's LayerUtility: Purge VRAM V2 (the original node) inside the Distorch suite, with enhanced model unloading, SeedVR2 support, and Qwen3-VL/Nunchaku model purging
  • Features: Identical UI/behavior keeps legacy workflows working without LayerStyle. Enhanced in v1.2.0 with more aggressive model unloading and improved error handling, in v2.0.0 with Qwen3-VL and Nunchaku model purging, and in v2.2.0 with Nunchaku SDXL support. Now supports purging SeedVR2 DiT and VAE models, Qwen3-VL models, and Nunchaku models (FLUX/Z-Image/Qwen-Image/SDXL).
  • Input: Any data type (ANY) passthrough
  • Options:
    • purge_cache: Run gc.collect(), flush CUDA caches, call torch.cuda.ipc_collect()
    • purge_models: Enhanced model unloading (v1.2.0):
      • Calls cleanup_models() to remove dead models
      • Calls cleanup_models_gc() for garbage collection
      • Marks all models as not currently used
      • Aggressively unloads models via model_unload()
      • Calls soft_empty_cache() if available
    • purge_seedvr2_models: Clear SeedVR2 DiT and VAE models from cache
      • Clears all cached DiT models from SeedVR2's GlobalModelCache
      • Clears all cached VAE models from SeedVR2's GlobalModelCache
      • Clears runner templates
      • Properly releases model memory using SeedVR2's release_model_memory()
    • purge_qwen3vl_models: Clear Qwen3-VL models from GPU memory (v2.0.0)
      • Searches for Qwen3-VL models in sys.modules and gc.get_objects()
      • Handles device_map="auto" case for multi-device models
      • Clears model parameters, buffers, and internal state
    • purge_nunchaku_models: Clear Nunchaku models (FLUX/Z-Image/Qwen-Image/SDXL) from GPU memory (v2.0.0, Enhanced in v2.2.0)
      • Supports NunchakuFluxTransformer2dModel, NunchakuZImageTransformer2DModel, NunchakuQwenImageTransformer2DModel, and NunchakuSDXLUNet2DConditionModel (v2.2.0)
      • Disables CPU offload before clearing models
      • Searches in sys.modules, ComfyUI current_loaded_models, and gc.get_objects()
      • Clears cache and temporary data attributes (v2.2.0)
      • Handles NunchakuSDXL wrapper class with diffusion_model access (v2.2.0)
  • Enhancements in v1.2.0:
    • More aggressive model unloading with proper error handling
    • None checks and callable() checks for all method calls
    • Improved error messages and logging
    • Safe handling of models with None real_model references
    • SeedVR2 model support for clearing DiT and VAE models
  • Enhancements in v2.0.0:
    • Qwen3-VL model purging with device_map="auto" support
    • Nunchaku model purging (FLUX/Z-Image/Qwen-Image) with CPU offload handling
    • Enhanced CUDA cache clearing for all devices
    • Comprehensive debug logging for model detection and purging
    • Fixed any() function name collision with AnyType
    • Changed display name to ComfyUI-VRAM-Manager
  • Enhancements in v2.2.0:
    • Nunchaku SDXL model purging support (NunchakuSDXLUNet2DConditionModel)
    • NunchakuSDXL wrapper class detection and handling
    • Cache and temporary data clearing for all Nunchaku detection methods
    • More aggressive garbage collection (3x gc.collect()) and CUDA cache clearing
    • Improved VRAM release for Nunchaku SDXL models (approximately 2.5GB)
    • Preserves model structure while clearing top-level parameters only
  • Reason: The original LayerStyle node disappeared upstream, so we duplicated it here to keep older workflows alive. Enhanced in v1.2.0 to provide better memory management. SeedVR2 support added to handle SeedVR2's independent model caching system. Enhanced in v2.0.0 to support Qwen3-VL and Nunchaku models, which are not managed by ComfyUI's standard model_management. Enhanced in v2.2.0 to support Nunchaku SDXL models, which require special handling due to their wrapper class structure and need for cache clearing.
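The guarded call sequence behind purge_models can be sketched like this. It is a minimal illustration of the None/callable() checks described for v1.2.0, assuming `mm` is `comfy.model_management` (or any object exposing the helper names listed above); `purge_models` here is a hypothetical stand-in, not the node's actual implementation.

```python
import gc

def purge_models(mm):
    """Call each model_management helper only if it exists and is
    callable, mirroring the None/callable() checks added in v1.2.0."""
    for name in ("cleanup_models", "cleanup_models_gc", "soft_empty_cache"):
        fn = getattr(mm, name, None)
        if callable(fn):
            try:
                fn()
            except Exception as exc:
                # improved error handling: log and continue rather than abort
                print(f"[VRAM-Manager] {name} failed: {exc}")
    gc.collect()
```

Missing helpers (for example on older ComfyUI versions without `soft_empty_cache`) are simply skipped.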

Memory Manager (Advanced)

  • Description: Comprehensive memory management node (for advanced users)
  • Features: Detailed memory management with UI corruption protection
  • Input: Any data type (ANY)
  • Output: Any data type (ANY)
  • Options:
    • clean_gpu: Clear GPU memory
    • clean_cpu: Clear CPU memory (use with caution)
    • force_gc: Force garbage collection
    • reset_virtual_memory: Reset virtual memory
    • restore_original_functions: Restore original functions

Patch Sage Attention DM (New in v2.3.0)

  • Description: Experimental node for patching ComfyUI's attention mechanism to use SageAttention
  • Features: Replaces ComfyUI's standard attention with SageAttention for improved memory efficiency and performance
  • Input: Model (MODEL)
  • Output: Model (MODEL)
  • Options:
    • sage_attention: SageAttention mode selection
      • disabled: Disable SageAttention (restore original attention)
      • auto: Automatic SageAttention implementation
      • sageattn_qk_int8_pv_fp16_cuda: CUDA implementation (QK int8, PV FP16)
      • sageattn_qk_int8_pv_fp16_triton: Triton implementation (QK int8, PV FP16)
      • sageattn_qk_int8_pv_fp8_cuda: CUDA implementation (QK int8, PV FP8)
      • sageattn_qk_int8_pv_fp8_cuda++: CUDA implementation (QK int8, PV FP8, optimized)
      • sageattn3: SageAttention 3 implementation (Blackwell support)
      • sageattn3_per_block_mean: SageAttention 3 implementation (per-block mean version)
    • allow_compile: Allow torch.compile for SageAttention function (requires sageattn 2.2.0 or higher, default: False)
  • Use Case: Use this node to replace ComfyUI's attention mechanism with SageAttention for better memory efficiency and performance. The node patches attention on each model execution and automatically cleans up afterward.
  • Technical Details:
    • Uses ComfyUI's callback system (ON_PRE_RUN, ON_CLEANUP) to patch attention dynamically
    • Automatically detects SageAttention version and logs detailed information
    • Handles Flash-Attention state detection and logging when disabled
    • Compatible with ComfyUI's attention function format via wrap_attn decorator
    • Supports multiple SageAttention implementations (CUDA, Triton, SageAttention 3)
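The patch-on-run, restore-on-cleanup pattern described above can be illustrated with a small sketch. `AttentionPatcher` is a hypothetical class for illustration only; the real node uses ComfyUI's ON_PRE_RUN/ON_CLEANUP callbacks rather than manual calls.

```python
class AttentionPatcher:
    """Swap a module-level attention function for a replacement,
    then restore the original at cleanup (analogous to the
    ON_PRE_RUN / ON_CLEANUP callback pair described above)."""

    def __init__(self, module, attr, replacement):
        self.module = module
        self.attr = attr
        self.replacement = replacement
        self._original = None

    def apply(self):
        # keep a handle to the original so it can be restored later
        self._original = getattr(self.module, self.attr)
        setattr(self.module, self.attr, self.replacement)

    def restore(self):
        if self._original is not None:
            setattr(self.module, self.attr, self._original)
            self._original = None
```

Restoring the original function after each run is what lets `sage_attention: disabled` cleanly revert to ComfyUI's standard attention.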

Installation

  1. Clone or download to ComfyUI/custom_nodes/ directory:
cd ComfyUI/custom_nodes
git clone https://github.com/ussoewwin/ComfyUI-DistorchMemoryManager.git
  2. Install dependencies:
cd ComfyUI-DistorchMemoryManager
pip install -r requirements.txt
  3. Restart ComfyUI
  4. Nodes will appear in the "Memory" category in the node palette

Usage

Basic Usage

  1. Add any memory management node to your workflow
  2. Connect any data to the input
  3. Configure options as needed
  4. Connect output to the next node

Recommended Workflow Placement

For ModelPatchLoader workflows:

[ModelPatchLoader] → [QwenImageDiffsynthControlnet] → [Model Patch Memory Cleaner] → [Upscaling Node]

For general memory management:

[Previous Node] → [Memory Manager] → [Next Node]

Recommended Settings

For ModelPatchLoader workflows (patch model format):

  • Use Model Patch Memory Cleaner
  • clear_model_patches: True
  • clean_gpu: True
  • force_gc: True
  • Place after: ModelPatchLoader usage, before upscaling operations
  • Note: This is for patch model format loaded via ModelPatchLoader (e.g., Z-Image ControlNet, QwenImage BlockWise ControlNet, SigLIP MultiFeat Proj), which is an exceptional format different from standard ControlNet models.

For video generation (WAN2.2, etc.):

  • Use Memory Manager
  • clean_gpu: True
  • force_gc: True
  • reset_virtual_memory: True

For maximum memory release:

  • Use Memory Manager
  • clean_cpu: True (Warning: possible UI corruption)

Troubleshooting

Out of Memory Errors

Solution:

  1. For ModelPatchLoader workflows: Use Model Patch Memory Cleaner after ControlNet usage
  2. For general workflows: Use Memory Manager
  3. Enable clean_gpu and reset_virtual_memory
  4. Enable force_gc if needed

OOM During Upscaling After ModelPatchLoader Usage

Solution:

  1. Add Model Patch Memory Cleaner node after QwenImageDiffsynthControlnet (when using ModelPatchLoader)
  2. Enable clear_model_patches: True
  3. Enable clean_gpu: True
  4. Enable force_gc: True
  5. Note: This applies to patch model format loaded via ModelPatchLoader, not standard ControlNet models

UI Corruption

Solution:

  1. Use Model Patch Memory Cleaner or Memory Manager
  2. Keep clean_cpu disabled (if using Memory Manager)
  3. Enable only essential options

OOM with Qwen3-VL Models

Solution:

  1. Use DisTorchPurgeVRAMV2 node
  2. Enable purge_qwen3vl_models: True to clear Qwen3-VL models from GPU memory
  3. Enable purge_cache: True and purge_models: True for comprehensive cleanup
  4. The node handles device_map="auto" case for multi-device models automatically

OOM with Nunchaku Models (FLUX/Z-Image/Qwen-Image/SDXL)

Solution:

  1. Use DisTorchPurgeVRAMV2 node
  2. Enable purge_nunchaku_models: True to clear Nunchaku models from GPU memory
  3. The node automatically disables CPU offload before clearing models
  4. Enable purge_cache: True and purge_models: True for comprehensive cleanup
  5. Works with NunchakuFluxTransformer2dModel, NunchakuZImageTransformer2DModel, NunchakuQwenImageTransformer2DModel, and NunchakuSDXLUNet2DConditionModel (v2.2.0)
  6. For Nunchaku SDXL models, the node now clears cache and temporary data attributes, releasing approximately 2.5GB of VRAM (v2.2.0)

Technical Details

Implemented Features

  • GPU memory clearing (torch.cuda.empty_cache())
  • GPU synchronization (torch.cuda.synchronize())
  • CPU memory clearing (gc.collect())
  • Virtual memory reset (comfy.model_management.free_memory())
  • Model patch detection and unloading (v1.2.0)
    • Detects ModelPatcher instances with additional_models or attachments containing patch model format
    • Safely unloads model patches via model_unload()
    • Removes from current_loaded_models list
    • Performs cleanup_models_gc() to prevent memory leaks
    • Handles exceptional patch model format loaded via ModelPatchLoader (different from standard ControlNet)
  • Qwen3-VL model purging (v1.4.0)
    • Searches for Qwen3-VL models in sys.modules and gc.get_objects()
    • Handles device_map="auto" case for multi-device models
    • Clears model parameters, buffers, and internal state
    • Supports hf_device_map processing for distributed models
  • Nunchaku model purging (v1.4.0, Enhanced in v2.2.0)
    • Supports NunchakuFluxTransformer2dModel, NunchakuZImageTransformer2DModel, NunchakuQwenImageTransformer2DModel, and NunchakuSDXLUNet2DConditionModel (v2.2.0)
    • Automatically disables CPU offload before clearing models
    • Searches in sys.modules, ComfyUI current_loaded_models, and gc.get_objects()
    • Handles nested model structures (ModelPatcher, ComfyFluxWrapper)
    • Clears offload_manager to release offloaded memory
    • NunchakuSDXL wrapper class detection and diffusion_model access (v2.2.0)
    • Cache and temporary data clearing (_cache, _state_dict_cache, _non_persistent_buffers_set) (v2.2.0)
    • More aggressive garbage collection and CUDA cache clearing for better VRAM release (v2.2.0)
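The GPU and CPU clearing primitives listed above can be combined into one routine. This is a hedged sketch, not the node's actual code: `clear_memory` is a hypothetical helper, and the torch import is guarded so the sketch also runs on CPU-only setups.

```python
import gc

try:
    import torch
except ImportError:  # keep the sketch importable without PyTorch
    torch = None

def clear_memory(clean_gpu=True, force_gc=True):
    """Combine the primitives listed above: gc.collect() for CPU-side
    garbage, then a CUDA cache flush and device synchronization."""
    collected = gc.collect() if force_gc else 0
    if clean_gpu and torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()   # release cached allocator blocks
        torch.cuda.ipc_collect()   # reclaim inter-process CUDA memory
        torch.cuda.synchronize()   # wait for pending kernels to finish
    return collected
```

`gc.collect()` returns the number of unreachable objects found, which is useful for logging how much each pass actually freed.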

Safety Features

  • Safe implementation to prevent UI corruption
  • Error handling with exception processing
  • Gradual memory clearing
  • None checks and callable() checks for all method calls (v1.2.0)
  • Robust error handling in cleanup_models() and is_dead() methods

Additional Tips

  • Expanding paging file size can also reduce OOM occurrences during upscaling
  • Note: For OOM during video generation inference (where VRAM is critical), paging file expansion won't help
  • For ModelPatchLoader workflows: Always use Model Patch Memory Cleaner before upscaling to prevent OOM. Note that patch model format loaded via ModelPatchLoader is an exceptional format different from standard ControlNet models.
  • For Qwen3-VL workflows: Use DisTorchPurgeVRAMV2 with purge_qwen3vl_models: True after Qwen3-VL model usage to prevent OOM. The node automatically handles device_map="auto" case for models distributed across multiple devices.
  • For Nunchaku workflows (FLUX/Z-Image/Qwen-Image/SDXL): Use DisTorchPurgeVRAMV2 with purge_nunchaku_models: True after Nunchaku model usage to prevent OOM. The node automatically disables CPU offload and clears models from all detection locations (sys.modules, ComfyUI model management, and gc.get_objects()). For Nunchaku SDXL models (v2.2.0), the node now includes cache clearing functionality that can release approximately 2.5GB of VRAM.
  • For SageAttention workflows (v2.3.0): Use Patch Sage Attention DM node to replace ComfyUI's attention mechanism with SageAttention for improved memory efficiency and performance. The node supports multiple SageAttention implementations and automatically patches attention on each model execution. To disable SageAttention, run the node again with sage_attention set to disabled.

License

Apache License 2.0 - See LICENSE file for details

Contributing

Bug reports and feature requests are welcome on the GitHub Issues page.

Release History

See CHANGELOG.md for detailed release history.
