
FRU-Adapter: Frame Recalibration Unit Adapter for Dynamic Facial Expression Recognition

Myungbeom Her, Hamza Ghulam Nabi, and Ji-Hyeong Han*

Seoul National University of Science and Technology & HCIR Lab

📰 News

[2025.10.08] FRU-Adapter surpasses smart-turn v3 on the Turn Detection task github
[2025.09.24] FRU-Adapter surpasses KoBERT on the Named Entity Recognition (NER) task github
[2025.02.28] FRU-Adapter is published in Electronics
[2025.02.21] FRU-Adapter is accepted for publication in Electronics
[2025.01.16] We release the code of FRU-Adapter

✨ Overview

Dynamic facial expression recognition (DFER) is an important challenge in computer vision, as it plays a crucial role in human-computer interaction. Recently, adapter-based approaches have been introduced to DFER and have achieved remarkable success. However, these adapters still suffer from two problems: they overlook irrelevant frames, and they interfere with pre-trained information. In this paper, we propose a frame recalibration unit adapter (FRU-Adapter), which combines the strengths of a frame recalibration unit (FRU) and temporal self-attention (T-SA) to address these issues.


import torch.nn as nn
from einops import rearrange

# Frame recalibration unit adapter.
# TemporalTransformer (the T-SA block) is defined in models_vit.py.
class FRU_Adapter(nn.Module):
    def __init__(self,
                 channel = 197,
                 embded_dim = 1024,
                 Frame = 16,
                 hidden_dim = 128):
        super().__init__()

        self.Frame = Frame

        # Down- and up-projection around the temporal modules.
        self.linear1 = nn.Linear(embded_dim, hidden_dim)
        self.linear2 = nn.Linear(hidden_dim, embded_dim)

        self.T_linear1 = nn.Linear(Frame, Frame)
        self.softmax = nn.Softmax(dim=1)
        self.ln = nn.LayerNorm(hidden_dim)

        self.TFormer = TemporalTransformer(frame=Frame, emb_dim=hidden_dim)

    # Frame recalibration unit: reweights frames by softmax importance scores.
    def FRU(self, x):
        x1 = x.mean(-1).flatten(1)           # (b, t) per-frame descriptors
        x1 = self.T_linear1(x1)              # (b, t) per-frame scores
        x1 = self.softmax(x1).unsqueeze(-1)  # (b, t, 1) frame weights
        x = x * x1                           # (b, t, d) recalibrated frames
        return x

    def forward(self, x):
        # x: (b*t, n, d) token embeddings for t frames
        bt, n, d = x.shape
        x = rearrange(x, '(b t) n d -> (b n) t d', t=self.Frame, n=n, d=d)

        x = self.linear1(x)  # (b*n, t, hidden_dim)
        x = self.ln(x)

        _, _, down = x.shape

        # Flatten tokens per frame, recalibrate frames, then restore the layout.
        x = rearrange(x, '(b n) t d -> b t (n d)', t=self.Frame, n=n, d=down)
        x = self.FRU(x)
        x = rearrange(x, 'b t (n d) -> (b n) t d', t=self.Frame, n=n, d=down)

        x = self.TFormer(x)  # temporal self-attention (T-SA)
        x = self.linear2(x)  # (b*n, t, embded_dim)

        x = rearrange(x, '(b n) t d -> (b t) n d', t=self.Frame, n=n, d=d)
        return x

The full implementation, including TemporalTransformer, is provided in models_vit.py.
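To make the adapter easy to try in isolation, here is a minimal usage sketch for the class above. The TemporalTransformer below is a hypothetical stand-in (a residual self-attention block over the frame axis) with the same interface as the one in models_vit.py, not the repository's actual implementation:

import torch
import torch.nn as nn

# Hypothetical stand-in for the TemporalTransformer in models_vit.py,
# included only so FRU_Adapter can be exercised end to end.
class TemporalTransformer(nn.Module):
    def __init__(self, frame, emb_dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(emb_dim, num_heads, batch_first=True)

    def forward(self, x):            # x: (b*n, t, d)
        out, _ = self.attn(x, x, x)  # self-attention over the t frames
        return x + out

# 2 videos x 16 frames, 197 ViT tokens of dimension 1024 per frame.
adapter = FRU_Adapter(channel=197, embded_dim=1024, Frame=16, hidden_dim=128)
tokens = torch.randn(2 * 16, 197, 1024)
print(adapter(tokens).shape)  # torch.Size([32, 197, 1024])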

🚀 Main Results

✨ Dynamic Facial Expression Recognition

[Figure: results on the DFEW, FERV39k, and MAFW datasets]

🔨 Installation

Run the following commands to create a virtual environment:

conda create -n FRU_Adapter python=3.7.16
conda activate FRU_Adapter
pip install -r requirements.txt

➡️ Data Preparation

Please follow the files (e.g., dfew.py) in preprocess for data preparation.

Specifically, you need to generate annotations for the dataloader ("<path_to_video> <video_class>" in annotations). The annotations usually include train.csv, val.csv, and test.csv. The format of a *.csv file is as follows:

dataset_root/video_1  label_1
dataset_root/video_2  label_2
dataset_root/video_3  label_3
...
dataset_root/video_N  label_N

An example of train.csv for DFEW fold 1 (fd1) is shown below:

/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02522 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02536 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02578 6
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02581 5

To clarify, DFEW_ZIP/Clip is divided into the 'clip_224x224_16f' and 'clip_224x224' datasets. Here we use 16-frame clips, so we adopt the 'clip_224x224_16f' dataset. In the example above, 'dataset_root' refers to /home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f, 'video_1' refers to 02522, and 'label_1' refers to 5.
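For illustration, a minimal sketch for writing such an annotation file is shown below. The video IDs and labels are taken from the train.csv excerpt above; in practice the labels come from the official DFEW annotations:

import os

dataset_root = "/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f"
# Example (video_id, label) pairs from the train.csv excerpt above.
samples = [("02522", 5), ("02536", 5), ("02578", 6), ("02581", 5)]

with open("train.csv", "w") as f:
    for video_id, label in samples:
        # Each line follows "<path_to_video> <video_class>".
        f.write(f"{os.path.join(dataset_root, video_id)} {label}\n")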

Fine-tune with pre-trained weights

  1. Download the pre-trained weights from Google Drive and move them to saved/model/pretrain.

  2. Run the following command to fine-tune the model on the target dataset.

  • main.sh: 16-frame setting. It uses clip_224x224_16f in DFEW (16-frame uniform sampling). This script is used only for DFEW, because DFEW provides both 16-frame and original-frame clips, while the other datasets provide only original-frame clips.
  • main_org.sh: original-frame setting (e.g., 64 frames, 128 frames, etc.). It uses clip_224x224 in DFEW (2-clip average results).
  • DFEW
scripts/dfew/main_org.sh 
scripts/dfew/main.sh 
  • FERV39k
scripts/FERV39k/main_org.sh 
  • MAFW
scripts/mafw/main_org.sh

📋 Reported Results and Fine-tuned Weights

The fine-tuned checkpoints for DFEW and FERV39k can be downloaded from Google Drive, and the MAFW checkpoint can be downloaded from Google Drive. Move them to the eval ckpts directory.

  • DFEW
scripts/dfew/main_org_eval.sh 
scripts/dfew/main_eval.sh 
  • FERV39k (16-frame uniform sampling)
scripts/FERV39k/main_org_eval.sh 
  • MAFW (16-frame uniform sampling)
scripts/mafw/main_org_eval.sh 
Dataset   16-frame uniform sampling   2-clip average
          UAR       WAR               UAR       WAR
FERV39K   38.65     50.12             41.08     52.70
DFEW01    66.12     77.22             64.28     76.89
DFEW02    63.12     75.13             63.85     74.88
DFEW03    64.79     76.84             65.78     76.37
DFEW04    66.14     77.05             66.39     78.20
DFEW05    69.91     78.55             69.10     78.55
DFEW      66.02     76.96             65.88     76.98
MAFW01    34.70     48.29             38.42     51.82
MAFW02    41.66     55.58             42.13     56.18
MAFW03    49.21     62.41             48.40     62.25
MAFW04    46.58     64.08             49.36     65.23
MAFW05    41.89     58.77             44.40     61.17
MAFW      42.80     57.83             44.54     59.33

☎️ Contact

If you have any questions, please feel free to reach out to me at gblader@naver.com.

👍 Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2025-RS-2022-00156295) supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation).

✏️ Citation

If you think this project is helpful, please feel free to leave a star⭐️ and cite our paper:

@article{her2025fru,
  title={FRU-Adapter: Frame Recalibration Unit Adapter for Dynamic Facial Expression Recognition},
  author={Her, Myungbeom and Nabi, Hamza Ghulam and Han, Ji-Hyeong},
  journal={Electronics},
  volume={14},
  number={5},
  pages={978},
  year={2025},
  publisher={MDPI}
}
