Fine-tuned CLIP-based VLM for few-shot facial attribute classification on WFLW. Clone the repository, run pip install -e ., and launch the main script to train or run inference in seconds.
multi-modal clip vlm few-shot-learning facial-attributes-classification vision-transformer multi-task-architecture prompt-engineering vision-language-model finetuning-llms
Updated May 9, 2025 - Jupyter Notebook