Hybrid-Sensitivity-Weighted-Quantization (HSWQ)

High-fidelity FP8 quantization for the SDXL, Flux1.dev, and Z Image Turbo diffusion models. HSWQ uses sensitivity and importance analysis instead of a naive uniform cast, and offers two modes: standard-compatible (V1) and high-performance scaled (V2). V2 requires a dedicated loader and is not usable at present.

Technical details: md/HSWQ_ Hybrid Sensitivity Weighted Quantization.md

SDXL models: Hugging Face — Hybrid-Sensitivity-Weighted-Quantization-SDXL-fp8e4m3


How to quantize

Benchmark results: SDXL (MSE / SSIM)


Overview

| Feature | V1: Standard Compatible | V2: High Performance Scaled |
| --- | --- | --- |
| Compatibility | Full (100%), any FP8 loader | Requires dedicated loader (not usable at present) |
| File format | Standard FP8 (torch.float8_e4m3fn) | Extended FP8 (weights + .scale metadata) |
| Image quality (SSIM) | ~0.98 (max) | Unmeasurable (no dedicated loader) |
| Mechanism | Optimal clipping (smart clipping) | Full-range scaling (dynamic scaling) |
| Benchmark | Measurable | Currently unmeasurable (no dedicated loader) |
| Use case | Distribution, general users | Unavailable until a dedicated loader exists |

File size is reduced by about 30–40% versus FP16 while preserving the best achievable quality for each use case.


Architecture

  1. Dual Monitor System — During calibration, two metrics are collected:

    • Sensitivity (output variance): layers that hurt image quality most if corrupted → top 10–25% kept in FP16 (for SDXL and ZIT, 10% often gives sufficient quality).
    • Importance (input mean absolute value): per-channel contribution → used as weights in the weighted histogram. Technical details: Dual Monitor System — Technical Guide.
  2. Rigorous FP8 Grid Simulation — Uses a physical grid (all 256 possible byte patterns, 0–255, reinterpreted as torch.float8_e4m3fn) instead of theoretical formulas, so the measured MSE matches real runtime behavior.

  3. Weighted MSE Optimization — Finds parameters that minimize quantization error using the importance histogram. Technical details: Weighted Histogram MSE — Technical Guide.


Modes

  • V1 (scaled=False): No scaling; only the clipping threshold (amax) is optimized. Output is standard FP8 weights. Use this mode — full compatibility with any FP8 loader.
  • V2 (scaled=True): Weights are scaled to FP8 range, quantized, and inverse scale S is stored in Safetensors (.scale). Requires a dedicated loader; not usable at the current time.

Recommended Parameters

  • Samples: 32 (recommended) — number of calibration samples.
  • Steps: 25 — number of inference steps per sample during calibration.
  • Keep ratio: 10–25% — keeps critical layers in FP16. For SDXL and ZIT, 10% often gives sufficient quality.
  • Latent: 32–256, default 128 — calibration latent size (H/W). Use --latent 32 for faster calibration, --latent 256 for higher fidelity.
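As an illustration of the keep-ratio parameter, layer selection might look like the sketch below. The layer names and scores are hypothetical, and the real tool ranks layers by the calibrated sensitivity metric described above.

```python
def select_fp16_layers(sensitivity: dict, keep_ratio: float = 0.10) -> set:
    # Rank layers by measured sensitivity (output variance) and keep the
    # top fraction in FP16; the rest become FP8 quantization candidates.
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    n_keep = max(1, round(len(ranked) * keep_ratio))
    return set(ranked[:n_keep])

# Hypothetical sensitivity scores for four layers, keep_ratio = 0.25:
fp16_layers = select_fp16_layers(
    {"down.0": 0.90, "mid": 0.50, "up.0": 0.10, "out": 0.05}, keep_ratio=0.25
)
# fp16_layers now holds the single most sensitive layer
```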

Benchmark (Reference)

| Model | SSIM (Avg) | File size | Compatibility |
| --- | --- | --- | --- |
| Original FP16 | 1.0000 | 100% | High |
| Naive FP8 | 0.75–0.93 | 50% | High |
| HSWQ V1 | 0.86–0.98 | 60–70% (FP16 mixed) | High |
| HSWQ V2 | N/A (currently unmeasurable) | 60–70% (FP16 mixed) | Not usable (no dedicated loader) |

HSWQ V1 gives a clear quality gain over naive FP8 with full compatibility. V2 should offer still higher quality through full-range scaling, but it requires a dedicated loader, so its benchmark is currently unmeasurable and the mode is not yet usable.


Changelog

Version history and release notes are in CHANGELOG.md.


Base Repositories

This project is built upon the following repositories:

  • ComfyUI — The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface by @Comfy-Org.

About

HSWQ is a novel FP8 E4M3 quantization method that combines sensitivity analysis and importance-weighted histogram optimization, achieving superior quality compared to naive uniform quantization while maintaining standard loader compatibility.
