IP-Adapter FaceID Gradio Interface

An interactive Gradio app for generating Stable Diffusion images guided by a source face embedding (InsightFace) and text prompts. The notebook wires together IP-Adapter FaceID+, the Realistic Vision Stable Diffusion 1.5 checkpoint, and a simple UI for prompt and negative-prompt control.

What it does

  • Detects and aligns a face from an input image with InsightFace.
  • Extracts a FaceID embedding and feeds it to IP-Adapter FaceID+.
  • Generates one or more SD images conditioned on the face, prompt, and negative prompt.
  • Runs as a Gradio interface for easy experimentation.
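For orientation, the first two steps look roughly like the sketch below, following the example code published with the IP-Adapter FaceID models; the input file name is a placeholder and the notebook's variable names may differ.

import cv2
import torch
from insightface.app import FaceAnalysis
from insightface.utils import face_align

# detect and align the face (only the first detected face is used)
app = FaceAnalysis(name="buffalo_l", providers=["CPUExecutionProvider"])
app.prepare(ctx_id=0, det_size=(640, 640))
image = cv2.imread("face.jpg")  # placeholder path; BGR numpy array
faces = app.get(image)

# FaceID embedding plus an aligned 224x224 crop for the FaceID Plus adapter
faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
face_image = face_align.norm_crop(image, landmark=faces[0].kps, image_size=224)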

Requirements

  • Python 3.9+ (tested with a recent PyTorch/Gradio stack)
  • GPU with CUDA (for reasonable speed) and latest NVIDIA drivers
  • Git LFS (for model weight downloads)

Setup

  1. Clone/download this folder and open the notebook IP_adapter_FaceID_Final.ipynb.
  2. In a Python environment with CUDA, install dependencies:
pip install diffusers einops accelerate insightface onnxruntime gradio
pip install huggingface-hub==0.25.2
pip install -U transformers
pip install git+https://github.com/tencent-ailab/IP-Adapter.git
  3. Pull the IP-Adapter FaceID weights (uses Git LFS):
git lfs install
git clone https://huggingface.co/h94/IP-Adapter-FaceID
  4. Ensure the base, VAE, and image-encoder models referenced in the notebook are reachable (they are downloaded automatically from Hugging Face on first run):
  • Base: SG161222/Realistic_Vision_V4.0_noVAE
  • VAE: stabilityai/sd-vae-ft-mse
  • Image encoder: laion/CLIP-ViT-H-14-laion2B-s32B-b79K
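For reference, the published IP-Adapter FaceID Plus examples assemble these pieces roughly as follows; the checkpoint path, scheduler settings, and device string are assumptions and may differ from the notebook.

import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDPlus

base_model = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model = "stabilityai/sd-vae-ft-mse"
image_encoder = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
ip_ckpt = "IP-Adapter-FaceID/ip-adapter-faceid-plus_sd15.bin"  # from the git clone above

noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000, beta_start=0.00085, beta_end=0.012,
    beta_schedule="scaled_linear", clip_sample=False,
    set_alpha_to_one=False, steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model).to(dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    base_model, torch_dtype=torch.float16, scheduler=noise_scheduler,
    vae=vae, feature_extractor=None, safety_checker=None,
)
ip_model = IPAdapterFaceIDPlus(pipe, image_encoder, ip_ckpt, "cuda")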

Running the app

  • Execute the cells in IP_adapter_FaceID_Final.ipynb in order. The final cell launches the Gradio interface.
  • By default the app binds to http://127.0.0.1:7860. Use the public gradio.live link shown in the cell output if one is provided (Gradio prints it when the interface is launched with share=True; see the sketch below).
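A typical launch call, assuming the final cell stores the interface in a variable (here called demo):

demo.launch(server_name="127.0.0.1", server_port=7860, share=True)  # share=True requests a public gradio.live URL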

Interface fields

  • Input Image: source photo containing a clear, single face.
  • Prompt / Negative Prompt: text conditioning.
  • Guidance Scale (s_scale): IP-Adapter guidance strength (0.1–2.0).
  • Seed: random seed for reproducibility.
  • Width / Height: output resolution (default 512x768).
  • Samples: number of images to generate.
  • Steps: diffusion steps per image.
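A hypothetical wiring of these fields with standard Gradio components is sketched below; the notebook's actual component choices, defaults, and callback may differ.

import gradio as gr

def generate(image, prompt, negative_prompt, s_scale, seed, width, height, samples, steps):
    # the notebook extracts the FaceID embedding from `image` here and calls
    # ip_model.generate(...) with these parameters (see the earlier sketches)
    ...

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Image(label="Input Image"),
        gr.Textbox(label="Prompt"),
        gr.Textbox(label="Negative Prompt"),
        gr.Slider(0.1, 2.0, value=1.0, step=0.1, label="Guidance Scale (s_scale)"),
        gr.Number(value=42, precision=0, label="Seed"),
        gr.Slider(256, 1024, value=512, step=64, label="Width"),
        gr.Slider(256, 1024, value=768, step=64, label="Height"),
        gr.Slider(1, 4, value=1, step=1, label="Samples"),
        gr.Slider(10, 60, value=30, step=1, label="Steps"),
    ],
    outputs=gr.Gallery(label="Generated Images"),
)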

Tips

  • Use sharp, front-facing portraits for better face alignment.
  • If multiple faces are detected, only the first is used; crop your input image to control which face drives generation.
  • Keep output sizes moderate (e.g., 512–768 on the longer side) to avoid VRAM exhaustion.
  • Set v2 to True in the notebook if you want the plus-v2 checkpoint (and adjust the ip_ckpt path accordingly; see the sketch below).
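That switch might look like the following minimal sketch; the checkpoint file names are the ones published in the h94/IP-Adapter-FaceID repository.

v2 = True
ip_ckpt = ("IP-Adapter-FaceID/ip-adapter-faceid-plusv2_sd15.bin" if v2
           else "IP-Adapter-FaceID/ip-adapter-faceid-plus_sd15.bin")
# the same flag is passed through at generation time as shortcut=v2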

Troubleshooting

  • CUDA out of memory: lower the width/height, steps, or sample count; make sure the pipeline is loaded with torch_dtype=torch.float16 and that the GPU is not already in use.
  • OnnxRuntime CPU errors: the notebook initializes InsightFace with CPUExecutionProvider; ensure onnxruntime is installed and your CPU supports AVX2.
  • Model download failures: confirm you can reach Hugging Face and that Git LFS is installed and enabled.
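A quick diagnostic for the first two issues (a sketch that only prints information; it does not change any configuration):

import onnxruntime as ort
import torch

print("ONNX Runtime providers:", ort.get_available_providers())  # should list CPUExecutionProvider
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print(f"GPU memory free: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")
else:
    print("No CUDA device visible to PyTorch")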
