An interactive Gradio app for generating Stable Diffusion images guided by a source face embedding (InsightFace) and text prompts. The notebook wires together IP-Adapter FaceID+, Stable Diffusion 1.5 Realistic Vision, and a simple UI for prompt + negative prompt control.
- Detects and aligns a face from an input image with InsightFace.
- Extracts a FaceID embedding and feeds it to IP-Adapter FaceID+.
- Generates one or more SD images conditioned on the face, prompt, and negative prompt.
- Runs as a Gradio interface for easy experimentation.
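The face-selection step of this flow can be sketched in isolation. The snippet below is a standalone illustration, not the notebook's code: the `bbox`/`embedding` attribute names mirror InsightFace's detection objects (an assumption here), and it follows the first-face convention noted in the tips section.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DetectedFace:
    """Minimal stand-in for an InsightFace detection result
    (the `bbox`/`embedding` attribute names are assumptions)."""
    bbox: Sequence[float]        # (x1, y1, x2, y2)
    embedding: Sequence[float]   # raw FaceID embedding vector


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """L2-normalize an embedding so its magnitude no longer matters."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero embedding")
    return [v / norm for v in vec]


def first_face_embedding(faces: List[DetectedFace]) -> List[float]:
    """Take the first detected face (matching the notebook's behavior
    when several faces are present) and return its normalized embedding."""
    if not faces:
        raise ValueError("no face detected in the input image")
    return l2_normalize(faces[0].embedding)
```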
- Python 3.9+ (tested with a recent PyTorch/Gradio stack)
- GPU with CUDA (for reasonable speed) and latest NVIDIA drivers
- Git LFS (for model weight downloads)
- Clone/download this folder and open the notebook IP_adapter_FaceID_Final.ipynb.
- In a Python environment with CUDA, install dependencies:
pip install diffusers einops accelerate insightface onnxruntime gradio
pip install huggingface-hub==0.25.2
pip install -U transformers
pip install git+https://github.com/tencent-ailab/IP-Adapter.git
- Pull the IP-Adapter FaceID weights (uses Git LFS):
git lfs install
git clone https://huggingface.co/h94/IP-Adapter-FaceID
- Ensure access to the base and VAE models referenced in the notebook (they will auto-download on first run):
- Base: SG161222/Realistic_Vision_V4.0_noVAE
- VAE: stabilityai/sd-vae-ft-mse
- Image encoder: laion/CLIP-ViT-H-14-laion2B-s32B-b79K
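Assembling these three checkpoints might look like the sketch below. It is an approximation of the notebook's loading cell, not a copy: the local `ip_ckpt` path is an assumption about where the cloned IP-Adapter-FaceID repo ends up, and the `IPAdapterFaceIDPlus` constructor arguments follow the tencent-ailab IP-Adapter repo's FaceID+ demo.

```python
def load_pipeline(device: str = "cuda"):
    """Assemble the SD 1.5 + FaceID+ pipeline roughly as the notebook does.

    Heavy imports live inside the function so merely defining it stays
    cheap; calling it downloads several GB of weights on first run.
    """
    import torch
    from diffusers import AutoencoderKL, DDIMScheduler, StableDiffusionPipeline
    from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDPlus

    vae = AutoencoderKL.from_pretrained(
        "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16
    )
    pipe = StableDiffusionPipeline.from_pretrained(
        "SG161222/Realistic_Vision_V4.0_noVAE",
        vae=vae,
        torch_dtype=torch.float16,  # halves VRAM use vs. float32
        safety_checker=None,
    )
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

    # Assumed path inside the `git clone .../IP-Adapter-FaceID` checkout
    ip_ckpt = "IP-Adapter-FaceID/ip-adapter-faceid-plus_sd15.bin"
    image_encoder = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
    return IPAdapterFaceIDPlus(pipe, image_encoder, ip_ckpt, device)
```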
- Execute the cells in IP_adapter_FaceID_Final.ipynb in order. The final cell launches the Gradio interface.
- By default it binds to http://127.0.0.1:7860. Use the public link shown in the cell output if Gradio provides one.
- Input Image: source photo containing a clear, single face.
- Prompt / Negative Prompt: text conditioning.
- Guidance Scale (`s_scale`): IP-Adapter guidance strength (0.1–2.0).
- Seed: random seed for reproducibility.
- Width / Height: output resolution (defaults 512x768).
- Samples: number of images to generate.
- Steps: diffusion steps per image.
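The mapping from these UI controls to a generation call can be sketched as a plain helper. The keyword names below (`num_samples`, `num_inference_steps`, etc.) are assumptions based on the IP-Adapter FaceID demo's `generate()` signature; the clamping matches the 0.1–2.0 slider range above.

```python
def build_generate_kwargs(prompt, negative_prompt="", s_scale=1.0,
                          seed=2024, width=512, height=768,
                          samples=1, steps=30):
    """Map the Gradio controls onto keyword arguments for the IP-Adapter
    `generate()` call (argument names assumed from the FaceID+ demo).
    `s_scale` is clamped to the UI's 0.1-2.0 range."""
    s_scale = min(2.0, max(0.1, float(s_scale)))
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "s_scale": s_scale,
        "seed": int(seed),
        "width": int(width),
        "height": int(height),
        "num_samples": int(samples),
        "num_inference_steps": int(steps),
    }
```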
- Use sharp, front-facing portraits for better face alignment.
- If multiple faces are detected, only the first one is used; crop the input image to control which face is picked.
- Keep output sizes moderate (e.g., 512–768 on the longer side) to avoid VRAM exhaustion.
- Set `v2` to `True` in the notebook if you want the plus-v2 checkpoint (adjust the `ip_ckpt` path accordingly).
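The resolution tip above can be made concrete with a small, purely illustrative helper: cap the longer side of the requested output and round each dimension down to a multiple of 8, which the SD 1.5 U-Net expects.

```python
def cap_resolution(width: int, height: int, max_long_side: int = 768):
    """Scale (width, height) down so the longer side is at most
    `max_long_side`, then round each side down to a multiple of 8
    as SD 1.5 expects. Sizes already within the cap pass through."""
    long_side = max(width, height)
    if long_side > max_long_side:
        scale = max_long_side / long_side
        width = int(width * scale)
        height = int(height * scale)
    # never drop below 8 pixels on either side
    return (max(8, width - width % 8), max(8, height - height % 8))
```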
- CUDA out of memory: lower the width/height, steps, or sample count; make sure the pipeline uses `torch_dtype=torch.float16` and the GPU is otherwise free.
- OnnxRuntime CPU errors: the notebook initializes InsightFace with `CPUExecutionProvider`; ensure onnxruntime is installed and your CPU supports AVX2.
- Model download failures: confirm Hugging Face access and that Git LFS is installed/enabled.
- IP-Adapter repo: https://github.com/tencent-ailab/IP-Adapter