One Small Step in Latent, One Giant Leap for Pixels:
Fast Latent Upscale Adapter for Your Diffusion Models
This repository contains the official implementation of the paper "One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models".
We present the Latent Upscaler Adapter (LUA), a lightweight module that performs super-resolution directly on the generator's latent code before the final VAE decoding step. LUA integrates as a drop-in component, requiring no modifications to the base model or additional diffusion stages. It enables high-resolution synthesis through a single feed-forward pass in latent space, achieving comparable perceptual quality to pixel-space methods while reducing decoding and upscaling time.
git clone https://github.com/vaskers5/LUA.git
cd LUA
pip install -r requirements.txt

LUA weights are hosted on HuggingFace and downloaded automatically on first use.
import torch
from diffusers import FluxPipeline
from lua import load_model, upscale_latent
# Load models
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")
pipe.vae.enable_tiling()
lua_model = load_model(device="cuda") # auto-downloads weights from HF
# Generate base latent at 1024x1024
result = pipe("a cat astronaut", output_type="latent", width=1024, height=1024)
# Unpack to VAE space
latent = pipe._unpack_latents(result.images, 1024, 1024, pipe.vae_scale_factor)
latent = (latent / pipe.vae.config.scaling_factor) + pipe.vae.config.shift_factor  # undo the pipeline's latent normalization before feeding the VAE
# Upscale x2 (1024 -> 2048) or x4 (1024 -> 4096)
upscaled = upscale_latent(lua_model, latent, head="x2")
# Decode to image
image = pipe.vae.decode(upscaled.to(torch.bfloat16), return_dict=False)[0]
image = pipe.image_processor.postprocess(image, output_type="pil")[0]
image.save("output_2k.png")

# 2K image (1024 -> 2048)
python inference.py --prompt "a mountain landscape, cinematic" --head x2
# 4K image (1024 -> 4096)
python inference.py --prompt "a mountain landscape, cinematic" --head x4 --output landscape_4k.png
# Use a local checkpoint
python inference.py --prompt "hello" --weights ./my_checkpoint.pth --head x2

Interactive demo with side-by-side comparison against direct FLUX generation:
python gradio_demo.py

The demo compares the LUA path (FLUX@1024 + LUA upscale) against the Direct path (FLUX@target) at the same output resolution, with interactive magnifying loupes and timing breakdowns.
You can configure the FLUX model via environment variables:
FLUX_MODEL_ID="black-forest-labs/FLUX.1-dev" python gradio_demo.py

| Property | Value |
| --- | --- |
| Architecture | SwinIR-based transformer with multi-head upsampling |
| Parameters | ~250M |
| Input | 16-channel VAE latent (FLUX latent space) |
| Heads | x2 (2x upscaling), x4 (4x upscaling) |
| Output | Upscaled 16-channel VAE latent |
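The shapes implied by the table follow from FLUX's 8x VAE downsampling. A minimal arithmetic sketch (the 8x factor and 16 channels come from the table above; `lua_shapes` is an illustrative helper, not part of the LUA API):

```python
def lua_shapes(image_size: int, scale: int, vae_factor: int = 8, channels: int = 16):
    """Compute (input latent, output latent, output image) sizes for a LUA head.

    Assumes the FLUX VAE downsamples the image by 8x and uses 16 latent channels.
    """
    lat_in = image_size // vae_factor   # base latent side length
    lat_out = lat_in * scale            # after the x2 / x4 LUA head
    img_out = lat_out * vae_factor      # image side length after VAE decode
    return (channels, lat_in, lat_in), (channels, lat_out, lat_out), img_out

# x2 head: 1024px image -> 16x128x128 latent -> 16x256x256 latent -> 2048px image
print(lua_shapes(1024, 2))
# x4 head: 1024px image -> 16x512x512 latent -> 4096px image
print(lua_shapes(1024, 4))
```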
LUA operates entirely in the latent space — it upscales the latent code before the VAE decoder, which means the expensive VAE decode only happens once at the target resolution.
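One way to see the saving: the upscaler itself runs over the latent grid rather than the pixel grid. Comparing raw element counts for a 1024 -> 2048 upscale (illustrative arithmetic only, not measured FLOPs or wall-clock time):

```python
# Elements processed by a pixel-space SR model vs. LUA's latent-space upscaler
# for a 1024 -> 2048 upscale. Element counts only, not a cost model.
pixel_sr_input = 3 * 1024 * 1024    # RGB image at the base resolution
latent_sr_input = 16 * 128 * 128    # FLUX latent (8x VAE downsampling)

ratio = pixel_sr_input / latent_sr_input
print(f"latent input is {ratio:.0f}x smaller")  # 12x smaller
```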
Training code will be released soon.
@article{razin2025lua,
  title={One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models},
  author={Razin, Aleksandr and Kazantsev, Danil and Makarov, Ilya},
  journal={arXiv preprint arXiv:2511.10629},
  year={2025}
}

This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
