
Cross-domain Distillation for Unsupervised Domain Adaptation with Large Vision-language Models


Highlights

[Figure: Overall architecture of CDU]

Abstract: Large vision-language models (VLMs), incorporating the prompt learning mechanism, have achieved promising results in cross-domain tasks. However, leveraging VLMs to transfer knowledge from the source domain to the target domain remains challenging for unsupervised domain adaptation (UDA). To this end, we propose Cross-domain Distillation for UDA with VLMs (termed CDU). First, CDU trains a source model by embedding the knowledge of the source domain (including each sample and its corresponding class category) into VLMs in a lightweight manner. Second, CDU makes full use of the image and text semantics from the source model to guide the learning of the target model, thereby achieving domain alignment and yielding semantically consistent representations across domains. We conduct extensive experiments on 3 popular UDA datasets: Office-31, Office-Home, and Mini-DomainNet. Experimental results verify that our method consistently surpasses state-of-the-art (SOTA) UDA methods by a large margin, with higher accuracy and lower model complexity on various UDA benchmarks. Taking Office-Home as an example, the average accuracy of CDU exceeds existing methods by at least 3%, while the learnable parameters account for only 17.9% and the inference time for only 4.3% of those of other methods. The code of this paper is available on GitHub: https://github.com/1d1x1w/CDU.

Main Contributions

  • New perspective: To the best of our knowledge, this is the first attempt to leverage both the visual and textual semantic information of VLMs to transfer knowledge from the source domain to the target domain for UDA.

  • Novel method: We introduce a novel UDA approach, CDU, which implements lightweight cross-domain distillation: the image and text semantics of the source domain, generated by VLMs, simultaneously guide image feature generation and text label prediction for the target domain (a minimal sketch of this idea follows this list).

  • High performance: We conduct extensive experiments on 3 popular UDA datasets: Office-31, Office-Home, and Mini-DomainNet. The experimental results validate the superiority of CDU, which achieves higher accuracy with lower model complexity than state-of-the-art (SOTA) UDA methods on various cross-domain tasks.
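
To make the distillation idea concrete, below is a minimal, hypothetical sketch in PyTorch. The function name, the cosine-similarity feature term, the KL logit term, and the temperature/weighting parameters (T, alpha) are illustrative assumptions for exposition, not the exact losses implemented in this repository.

import torch
import torch.nn.functional as F

def cross_domain_distillation_loss(src_img_feat, tgt_img_feat,
                                   src_logits, tgt_logits,
                                   T=2.0, alpha=0.5):
    # Image semantics: pull target image features toward the frozen
    # source-model features (hypothetical cosine-alignment term).
    feat_loss = 1.0 - F.cosine_similarity(
        tgt_img_feat, src_img_feat.detach(), dim=-1).mean()

    # Text semantics: match softened text-label distributions with KL
    # divergence, the standard logit-distillation objective; the T**2
    # factor keeps gradient magnitudes comparable across temperatures.
    kd_loss = F.kl_div(
        F.log_softmax(tgt_logits / T, dim=-1),
        F.softmax(src_logits.detach() / T, dim=-1),
        reduction="batchmean") * (T ** 2)

    # Weighted combination of the two guidance signals.
    return alpha * feat_loss + (1.0 - alpha) * kd_loss

In this view, both the source model's image features and its class logits act as teachers for the target model.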


Results

CDU in comparison with existing prompt-tuning methods

Results reported below show accuracy on 3 UDA datasets with a ViT-B/16 backbone. Our CDU method adopts the multi-modal prompt-tuning paradigm.

| Name       | Office-Home Acc. | Office-31 Acc. |
|------------|------------------|----------------|
| CLIP       | 82.1             | 77.5           |
| CoOp       | 83.9             | 89.4           |
| CoCoOp     | 84.1             | 88.9           |
| VPT-deep   | 83.9             | 89.4           |
| MaPLe      | 84.2             | 89.6           |
| DAPL       | 84.4             | 81.2           |
| PDA        | 85.7             | 91.2           |
| CDU (Ours) | 90.1             | 94.0           |

| Name       | Mini-DomainNet Acc. |
|------------|---------------------|
| DeiT       | 55.1                |
| ViT        | 57.5                |
| CLIP       | 69.3                |
| SSRT       | 65.4                |
| CDTrans    | 63.2                |
| DAPL       | 73.6                |
| PMTrans    | 69.6                |
| PADCLIP    | 74.7                |
| UniMoS     | 76.0                |
| CDU (Ours) | 78.0                |

Installation

For installation and other package requirements, please follow the instructions below. This codebase is tested on Ubuntu 22.04 LTS with Python 3.7. Follow these steps to create the environment and install dependencies.

  • Set up the conda environment.
# Create a conda environment
conda create -y -n cdu python=3.7

# Activate the environment
conda activate cdu

# Install torch (requires version >= 1.8.1) and torchvision
# Please refer to https://pytorch.org/get-started/previous-versions/ if your cuda version is different
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
  • Install the Dassl library.
# Instructions borrowed from https://github.com/KaiyangZhou/Dassl.pytorch#installation

# Clone this repo
git clone https://github.com/KaiyangZhou/Dassl.pytorch.git
cd Dassl.pytorch

# Install dependencies
pip install -r requirements.txt

# Install this library (no need to re-build if the source code is modified)
python setup.py develop
cd ..
  • Clone the CDU code repository and install requirements.
# Clone CDU code base
git clone https://github.com/1d1x1w/CDU.git
cd CDU

# Install requirements
pip install -r requirements.txt
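
Optionally, a quick sanity check (our suggestion, not a step from the repository's instructions) can confirm the environment matches the pinned versions:

# Verify the installed versions match the ones pinned above.
import torch, torchvision
print(torch.__version__)           # expect 1.12.0
print(torchvision.__version__)     # expect 0.13.0
print(torch.cuda.is_available())   # True if the CUDA 11.3 build detects a GPU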

Data preparation

Please follow the instructions below to prepare all datasets: Office-31, Office-Home, and Mini-DomainNet.


Training and Evaluation

Please follow the instructions below for training, evaluation, and reproducing the results. First, update the dataset directory paths to match your setup.

Source Model Training

# Example: trains on the Office-Home dataset with art as the source domain and clipart as the target domain (a-c)
bash scripts/cdu/main_cdusource.sh officehome b32_ep20_officehome CDUSOURCE ViT-L/14 4 a-c 0

Target Model Training

# Example: trains on the Office-Home dataset with art as the source domain and clipart as the target domain (a-c)
bash scripts/cdu/main_cdutarget.sh officehome b32_ep20_officehome CDUTARGET ViT-B/16 4 a-c 0

Evaluation

# Evaluates on the Office-Home dataset with art as the source domain and clipart as the target domain (a-c)
bash scripts/cdu/eval_cdutarget.sh officehome b32_ep20_officehome CDUTARGET ViT-B/16 4 a-c 0

The details for each method are in the corresponding folder under scripts/.

Acknowledgements

The style of our readme refers to PDA, and our code is based on the CoOp, CoCoOp, DAPL, MaPLe, and PDA repositories. We thank the authors for releasing their code. If you use their models and code, please consider citing these works as well. Supported methods are listed below:

| Method       | Paper      | Code |
|--------------|------------|------|
| CoOp         | IJCV 2022  | link |
| CoCoOp       | CVPR 2022  | link |
| VPT          | ECCV 2022  | link |
| IVLP & MaPLe | CVPR 2023  | link |
| DAPL         | TNNLS 2023 | link |
| PDA          | AAAI 2024  | link |
