IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting

[teaser figure]

The stories and characters that captivate us as we grow up shape unique fantasy worlds, with images serving as the primary medium for visually experiencing these realms. Personalizing generative models through fine-tuning with theme-specific data has become a prevalent approach in text-to-image generation. However, unlike object customization, which focuses on learning specific objects, theme-specific generation encompasses diverse elements such as characters, scenes, and objects. This diversity introduces a key challenge: how to adaptively generate multi-character, multi-concept, and continuous theme-specific images (TSI). Moreover, fine-tuning approaches often come with significant computational overhead, time costs, and risks of overfitting. This paper explores a fundamental question: Can image generation models directly leverage images as contextual input, similarly to how large language models use text as context? To address this, we present IP-Prompter, a novel training-free method for TSI generation. IP-Prompter introduces visual prompting, a mechanism that integrates reference images into generative models, allowing users to seamlessly specify the target theme without additional training. To further enhance this process, we propose a Dynamic Visual Prompting (DVP) mechanism, which iteratively optimizes visual prompts to improve the accuracy and quality of generated images. Our approach enables diverse applications, including consistent story generation, character design, realistic character generation, and style-guided image generation. Comparative evaluations against state-of-the-art personalization methods demonstrate that IP-Prompter achieves significantly better results, excelling in character identity preservation, style consistency, and text alignment, and offering a robust and flexible solution for theme-specific image generation.

For details see the paper and Project Page.

(back to top)

Getting Started

Installation

Clone the repo

git clone https://github.com/zyxElsa/IP-Prompter.git
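
Then change into the cloned directory so that requirements.txt and inference.py resolve relative to the repository root:

cd IP-Prompter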

(back to top)

Prerequisites

For packages, see requirements.txt

conda create -n ip-prompter
conda activate ip-prompter
pip install -r requirements.txt

We use FLUX.1-Fill-dev as the base generative model.
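
The FLUX.1-Fill-dev weights are not bundled with this repository. One way to fetch them, as a sketch (assuming you have accepted the model license on Hugging Face and have the huggingface_hub CLI available; the target directory below simply mirrors the --model example in the Inference section):

huggingface-cli download black-forest-labs/FLUX.1-Fill-dev --local-dir downloads/models/black-forest-labs/FLUX.1-Fill-dev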

(back to top)

Inference

python inference.py --dataset {path to the reference images, e.g. 'images/tintin'}

Optional

--width {output image width}
--height {output image height}
--output_dir {path to save output images}
--model {path to generative model e.g. 'downloads/models/black-forest-labs/FLUX.1-Fill-dev'}
--seed {set specific seed or -1 for random seeds}
--specific {comma-separated image names for user-specified reference images}
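
For example, a complete invocation combining these options might look as follows (the sizes, output directory, and seed are illustrative choices, not required values):

python inference.py \
    --dataset images/tintin \
    --width 1024 --height 1024 \
    --output_dir outputs/tintin \
    --model downloads/models/black-forest-labs/FLUX.1-Fill-dev \
    --seed 42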

(back to top)

