Inpaint Anything: Segment Anything Meets Image Inpainting

Inpaint Anything can inpaint anything in images, videos and 3D scenes!

Authors: Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng and Zhibo Chen.
Institutes: University of Science and Technology of China; Eastern Institute for Advanced Study.
[Paper] [Website] [Hugging Face Homepage]

TL;DR: Users can select any object in an image by clicking on it. With powerful vision models, e.g., SAM, LaMa and Stable Diffusion (SD), Inpaint Anything is able to remove the object smoothly (i.e., Remove Anything). Further, prompted by user input text, Inpaint Anything can fill the object with any desired content (i.e., Fill Anything) or replace its background arbitrarily (i.e., Replace Anything).

📜 News

[2023/9/15] Remove Anything 3D code is available!
[2023/4/30] Remove Anything Video available! You can remove any object from a video!
[2023/4/24] Local web UI supported! You can run the demo website locally!
[2023/4/22] Website available! You can experience Inpaint Anything through the interface!
[2023/4/22] Remove Anything 3D available! You can remove any 3D object from a 3D scene!
[2023/4/13] Technical report on arXiv available!

🌟 Features

💡 Highlights

Any aspect ratio supported
2K resolution supported
Technical report on arXiv available (🔥NEW)
Website available (🔥NEW)
Local web UI available (🔥NEW)
Multiple modalities (i.e., image, video and 3D scene) supported (🔥NEW)

📌 Remove Anything Video

With a single click on an object in the first video frame, Remove Anything Video can remove the object from the whole video!

Click on an object in the first frame of a video;
SAM segments the object out (with three possible masks);
Select one mask;
A tracking model such as OSTrack is utilized to track the object in the video;
SAM segments the object out in each frame according to tracking results;
A video inpainting model such as STTN is utilized to inpaint the object in each frame.
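
The six steps above can be sketched as a small orchestration function. This is only an illustration of the data flow, not the code in remove_anything_video.py: the callables `segment_point`, `track`, `segment_box` and `inpaint_video` are hypothetical stand-ins for SAM, OSTrack and STTN, and frames and masks are plain nested lists so the sketch runs without any model.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[List[int]]          # toy grayscale frame (rows of pixel values)
Mask = List[List[bool]]          # True where the object is
Box = Tuple[int, int, int, int]  # x0, y0, x1, y1 (inclusive)


def mask_to_box(mask: Mask) -> Box:
    """Tight bounding box around the True pixels (raises ValueError if the mask is empty)."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(xs), min(ys), max(xs), max(ys))


def remove_object(
    frames: Sequence[Frame],
    click_xy: Tuple[int, int],
    mask_idx: int,
    segment_point: Callable[[Frame, Tuple[int, int]], List[Mask]],  # hypothetical SAM wrapper
    track: Callable[[Sequence[Frame], Box], List[Box]],             # hypothetical tracker wrapper
    segment_box: Callable[[Frame, Box], Mask],                      # hypothetical SAM wrapper
    inpaint_video: Callable[[Sequence[Frame], List[Mask]], List[Frame]],  # hypothetical inpainter
) -> List[Frame]:
    candidates = segment_point(frames[0], click_xy)   # steps 1-2: click, candidate masks
    first_mask = candidates[mask_idx]                 # step 3: select one mask
    boxes = track(frames, mask_to_box(first_mask))    # step 4: track the object through the video
    masks = [segment_box(f, b) for f, b in zip(frames, boxes)]  # step 5: per-frame masks
    return inpaint_video(frames, masks)               # step 6: inpaint every frame
```

Passing the models in as callables keeps the sketch independent of any particular tracker or inpainter, in line with the "such as OSTrack" and "such as STTN" wording of the steps.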

Installation

Requires python>=3.8

python -m pip install torch torchvision torchaudio
python -m pip install -e segment_anything
python -m pip install -r lama/requirements.txt
python -m pip install jpeg4py lmdb

Usage

Download the model checkpoints provided in Segment Anything and STTN (e.g., sam_vit_h_4b8939.pth and sttn.pth), and put them into ./pretrained_models. Also download the OSTrack pretrained model from here (e.g., vitb_384_mae_ce_32x4_ep300.pth) and put it into ./pytracking/pretrain. Alternatively, for simplicity, you can go here and download the pretrained_models directory directly, then place it in ./ to get ./pretrained_models. Likewise, download the pretrain directory and place it in ./pytracking to get ./pytracking/pretrain.
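
Before running anything, it can help to confirm that the checkpoints ended up where the paragraph above says they should. The following is a minimal sketch using only the file names given as examples above; the helper name `missing_checkpoints` is hypothetical, and if your downloaded directories are nested differently the expected paths need adjusting.

```python
from pathlib import Path

# Expected layout, taken from the example file names in the paragraph above.
EXPECTED = {
    "pretrained_models": ["sam_vit_h_4b8939.pth", "sttn.pth"],
    "pytracking/pretrain": ["vitb_384_mae_ce_32x4_ep300.pth"],
}


def missing_checkpoints(repo_root="."):
    """Return the expected checkpoint paths (relative, POSIX style) that do not exist."""
    root = Path(repo_root)
    return [
        (Path(directory) / name).as_posix()
        for directory, names in EXPECTED.items()
        for name in names
        if not (root / directory / name).exists()
    ]


if __name__ == "__main__":
    for path in missing_checkpoints():
        print("missing:", path)
```

Run from the repository root, the script prints nothing when all three files are present.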

To use MobileSAM, set sam_model_type to "vit_t" and sam_ckpt to "./weights/mobile_sam.pt". For details on MobileSAM, please refer to the MobileSAM project.

bash script/remove_anything_video.sh

Specify a video, a point, the video FPS and a mask index (indicating which of the first-frame mask results to use), and Remove Anything Video will remove the object from the whole video.

python remove_anything_video.py \
    --input_video ./example/video/paragliding/original_video.mp4 \
    --coords_type key_in \
    --point_coords 652 162 \
    --point_labels 1 \
    --dilate_kernel_size 15 \
    --output_dir ./results \
    --sam_model_type "vit_h" \
    --sam_ckpt ./pretrained_models/sam_vit_h_4b8939.pth \
    --lama_config lama/configs/prediction/default.yaml \
    --lama_ckpt ./pretrained_models/big-lama \
    --tracker_ckpt vitb_384_mae_ce_32x4_ep300 \
    --vi_ckpt ./pretrained_models/sttn.pth \
    --mask_idx 2 \
    --fps 25
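
If you want to launch the command above from Python, for example to process several videos in a loop, the argument list can be built from a dictionary. This is a minimal sketch: the helper `build_command` is hypothetical, and it only uses flags that appear in the command above.

```python
import sys


def build_command(options, script="remove_anything_video.py"):
    """Turn {"flag": value} pairs into an argv list; list values become multiple tokens."""
    argv = [sys.executable, script]
    for flag, value in options.items():
        argv.append(f"--{flag}")
        values = value if isinstance(value, (list, tuple)) else [value]
        argv.extend(str(v) for v in values)
    return argv


options = {
    "input_video": "./example/video/paragliding/original_video.mp4",
    "coords_type": "key_in",
    "point_coords": [652, 162],   # two values for one flag
    "point_labels": 1,
    "dilate_kernel_size": 15,
    "output_dir": "./results",
    "sam_model_type": "vit_h",
    "sam_ckpt": "./pretrained_models/sam_vit_h_4b8939.pth",
    "lama_config": "lama/configs/prediction/default.yaml",
    "lama_ckpt": "./pretrained_models/big-lama",
    "tracker_ckpt": "vitb_384_mae_ce_32x4_ep300",
    "vi_ckpt": "./pretrained_models/sttn.pth",
    "mask_idx": 2,
    "fps": 25,
}
argv = build_command(options)
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the repository root. Passing a list rather than a single shell string avoids quoting problems with paths that contain spaces.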

The --mask_idx is usually set to 2, which is typically the most confident mask result for the first frame. If the object is not segmented well, try the other masks (0 or 1).
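
Instead of hard-coding the index, one could pick the mask by its score. The sketch below assumes you have the per-mask quality scores that SAM returns alongside its three candidate masks; the function names are hypothetical and this is not how remove_anything_video.py chooses the mask (it uses the --mask_idx you pass).

```python
def pick_mask_index(scores, mask_idx=None):
    """Return mask_idx if given, otherwise the index of the highest score."""
    if mask_idx is not None:
        if not 0 <= mask_idx < len(scores):
            raise IndexError(f"mask_idx {mask_idx} out of range for {len(scores)} masks")
        return mask_idx
    return max(range(len(scores)), key=lambda i: scores[i])


def fallback_order(scores):
    """Indices sorted from most to least confident, i.e. the order in which to try masks."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

For example, with made-up scores `[0.71, 0.95, 0.88]`, `fallback_order` suggests trying mask 1 first, then 2, then 0.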

Demo

Acknowledgments

Other Interesting Repositories

Citation

If you find this work useful for your research, please cite us:

@article{yu2023inpaint,
  title={Inpaint Anything: Segment Anything Meets Image Inpainting},
  author={Yu, Tao and Feng, Runseng and Feng, Ruoyu and Liu, Jinming and Jin, Xin and Zeng, Wenjun and Chen, Zhibo},
  journal={arXiv preprint arXiv:2304.06790},
  year={2023}
}
