SteerFlow: Steering Rectified Flows for Faithful Inversion-Based Image Editing

Thinh Dao, Zhen Wang, Kien T. Pham, Long Chen

TL;DR: A training-free, model-agnostic image editing framework that steers rectified flow velocity fields using amortized fixed-point inversion, trajectory interpolation, and adaptive masking for faithful inversion-based editing. Supports FLUX.1-dev and Stable Diffusion 3.5 Medium.

Abstract

Recent advances in flow-based generative models have enabled training-free, text-guided image editing by inverting an image into its latent noise and regenerating it under new target-conditional guidance. However, existing methods struggle to preserve source fidelity: higher-order solvers incur additional model inferences, truncated inversion constrains editability, and feature-injection methods lack architectural transferability. To address these limitations, we propose SteerFlow, a model-agnostic editing framework with strong theoretical guarantees on source fidelity. In the forward process, we introduce an Amortized Fixed-Point Solver that implicitly straightens the forward trajectory by enforcing velocity consistency across consecutive timesteps, yielding a high-fidelity inverted latent. In the backward process, we introduce Trajectory Interpolation, which adaptively blends target-editing and source-reconstruction velocities to keep the editing trajectory anchored to the source. To further improve background preservation, we introduce an Adaptive Masking mechanism that spatially constrains the editing signal with concept-guided segmentation and source-target velocity differences. Extensive experiments on FLUX.1-dev and Stable Diffusion 3.5 Medium demonstrate that SteerFlow consistently achieves better editing quality than existing methods. Finally, we show that SteerFlow extends naturally to a complex multi-turn editing paradigm without accumulating drift.

Overview

SteerFlow performs text-driven image editing by controlling the velocity field during the ODE denoising process of flow-matching models. The key idea is to:

  1. Invert the source image to noise via an Amortized Fixed-Point Solver that enforces velocity consistency across timesteps.
  2. Steer the Denoising Trajectory with the target prompt, blending source and target velocities via Trajectory Interpolation and Adaptive Masking to ensure source consistency and background preservation.
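The two steps above can be sketched in a few lines of NumPy. This is a minimal, illustrative sketch, not the repository's implementation: `velocity`, `v_src`, `v_tgt`, the inner-iteration count `n_inner`, and the blend weight `lam` are all hypothetical stand-ins for the flow model's velocity field and the paper's hyperparameters.

```python
import numpy as np

def amortized_fixed_point_inversion(x0, velocity, timesteps, n_inner=3):
    """Integrate the flow ODE from image latent toward noise. At each step,
    a few fixed-point iterations re-evaluate the velocity at the predicted
    next point until consecutive timesteps agree (sketch of the idea behind
    the Amortized Fixed-Point Solver; `velocity(x, t)` is a stand-in for the
    flow model)."""
    x = x0.copy()
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        dt = t_next - t_cur
        v = velocity(x, t_cur)
        for _ in range(n_inner):            # fixed-point refinement
            x_next = x + dt * v
            v = velocity(x_next, t_next)    # re-evaluate at predicted point
        x = x + dt * v
    return x

def steered_denoise(xT, v_src, v_tgt, timesteps, lam=0.7, mask=None):
    """Denoise from the inverted latent, blending the source-reconstruction
    velocity with the target-editing velocity (Trajectory Interpolation) and
    optionally confining the edit to a spatial mask (Adaptive Masking)."""
    x = xT.copy()
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        dt = t_next - t_cur
        vs = v_src(x, t_cur)
        v = (1.0 - lam) * vs + lam * v_tgt(x, t_cur)
        if mask is not None:                # edit only inside the mask
            v = mask * v + (1.0 - mask) * vs
        x = x + dt * v
    return x
```

With `lam=0` (or an all-zero mask) the backward pass reduces to pure source reconstruction, which is what anchors the editing trajectory to the source image.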

Implementation

We provide two implementation options:

  • Diffusers-based implementation: Uses the HuggingFace diffusers library. Supports FLUX (black-forest-labs/FLUX.1-dev) and Stable Diffusion 3 (stabilityai/stable-diffusion-3-medium-diffusers).
  • Official FLUX implementation: Built on the original FLUX repository; achieves slightly better editing performance than the diffusers-based FLUX pipeline.

Acknowledgements

We thank UniEdit-Flow, FireFlow, FLUX, and SAM3 for their excellent work.

Citation

@misc{dao2025steerflow,
    title={SteerFlow: Steering Rectified Flows for Faithful Inversion-Based Image Editing},
    author={Thinh Dao and Zhen Wang and Kien T. Pham and Long Chen},
    year={2025},
    eprint={2604.01715},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
