I wanted to ask specifically about occlusion handling (e.g., hands in front of the face, glasses, hair strands, objects in or near the mouth). Right now, many face swap models paint over these occluders. Some other SOTA models address this: for example, HiFiVFS preserves occlusions by disentangling identity from attributes, while DreamID (an image-based model) explicitly supervises attribute preservation (including occlusions) through its triplet ID group training.
Have you experimented with adjusting the training pipeline for occlusion? For example:
- Training with occlusion-heavy data (hands, glasses, hair, other objects, etc.)
- Adding segmentation/matting losses to preserve occluders
- Using visibility/occlusion masks during modulation or refinement
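To make the last two ideas concrete, here is a rough NumPy sketch of what I have in mind. It is not taken from CanonSwap or any of the models mentioned above; the function names (`occluder_preservation_loss`, `composite_with_visibility`) and the assumption that a per-pixel occluder mask is available (e.g., from an off-the-shelf segmentation model) are purely hypothetical.

```python
import numpy as np

def occluder_preservation_loss(swapped, target, occluder_mask, eps=1e-8):
    """Mean L1 error restricted to occluder pixels (hypothetical loss term).

    swapped, target: float arrays of shape (H, W, C)
    occluder_mask:   float array of shape (H, W), 1 where an occluder
                     (hand, glasses, hair, ...) covers the face, else 0
    """
    mask = occluder_mask[..., None]              # broadcast over channels
    abs_err = np.abs(swapped - target) * mask    # error only under the mask
    denom = mask.sum() * swapped.shape[-1] + eps # number of masked values
    return float(abs_err.sum() / denom)

def composite_with_visibility(swapped, target, occluder_mask):
    """Keep target pixels where the occluder is, swapped pixels elsewhere."""
    mask = occluder_mask[..., None]
    return mask * target + (1.0 - mask) * swapped

# Toy example: the top row of a 4x4 frame is covered by an occluder.
target = np.zeros((4, 4, 3))
swapped = np.ones((4, 4, 3))
occluder_mask = np.zeros((4, 4))
occluder_mask[0, :] = 1.0

loss_before = occluder_preservation_loss(swapped, target, occluder_mask)
refined = composite_with_visibility(swapped, target, occluder_mask)
loss_after = occluder_preservation_loss(refined, target, occluder_mask)
```

The loss would penalize the model for repainting occluded pixels during training, while the compositing step is the simplest possible "visibility mask at refinement" variant. A soft (matting-style) mask in [0, 1] should work with the same code, though I have not tested either idea on a real model.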
If not, do you have any recommendations on what would work best for CanonSwap, or on approaches you’ve tried that didn’t work out?
Really curious whether you see this as mostly a data issue (need better occlusion-rich datasets) or a pipeline/architecture issue (need to explicitly model visibility). Any insights from your experiments would be super helpful!
Thanks again for releasing this project 🙏