FRU-Adapter: Frame recalibration unit adapter for dynamic facial expression recognition
Myungbeom Her, Hamza Ghulam Nabi, and Ji-Hyeong Han*
Seoul National University of Science and Technology & HCIR Lab
[2025.10.08] FRU-Adapter surpasses smart-turn v3 on the turn detection task github
[2025.09.24] FRU-Adapter surpasses KOBERT on the named entity recognition (NER) task github
[2025.02.28] FRU-Adapter is published in Electronics
[2025.02.21] FRU-Adapter is accepted by Electronics
[2025.01.16] We release the code of FRU-Adapter
Dynamic facial expression recognition (DFER) is one of the most important challenges in computer vision, as it plays a crucial role in human-computer interaction. Recently, adapter-based approaches have been introduced to DFER and have achieved remarkable success. However, these adapters still suffer from two problems: they overlook the influence of irrelevant frames, and they interfere with pre-trained information. In this paper, we propose the frame recalibration unit adapter (FRU-Adapter), which combines the strengths of a frame recalibration unit (FRU) and temporal self-attention (T-SA) to address the aforementioned issues.
```python
import torch.nn as nn
from einops import rearrange

# TemporalTransformer (the temporal self-attention module, T-SA) is defined elsewhere in this repo.

# Frame recalibration unit adapter
class FRU_Adapter(nn.Module):
    def __init__(self,
                 channel=197,      # tokens per frame (ViT patch tokens + CLS)
                 embed_dim=1024,   # token embedding dimension
                 Frame=16,         # frames per clip
                 hidden_dim=128):  # bottleneck dimension
        super().__init__()
        self.Frame = Frame
        self.linear1 = nn.Linear(embed_dim, hidden_dim)  # down-projection
        self.linear2 = nn.Linear(hidden_dim, embed_dim)  # up-projection
        self.T_linear1 = nn.Linear(Frame, Frame)         # per-frame importance scores
        self.softmax = nn.Softmax(dim=1)
        self.ln = nn.LayerNorm(hidden_dim)
        self.TFormer = TemporalTransformer(frame=Frame, emb_dim=hidden_dim)

    # Frame recalibration unit: reweight each frame by a learned importance score
    def FRU(self, x):
        x1 = x.mean(-1).flatten(1)           # b t
        x1 = self.T_linear1(x1)              # b t
        x1 = self.softmax(x1).unsqueeze(-1)  # b t 1
        x = x * x1                           # b t (n d)
        return x

    def forward(self, x):
        # x: (b t) n d
        bt, n, d = x.shape
        x = rearrange(x, '(b t) n d -> (b n) t d', t=self.Frame, n=n, d=d)
        x = self.linear1(x)   # (b n) t hidden_dim
        x = self.ln(x)
        _, _, down = x.shape
        x = rearrange(x, '(b n) t d -> b t (n d)', t=self.Frame, n=n, d=down)
        x = self.FRU(x)       # recalibrate frame importance
        x = rearrange(x, 'b t (n d) -> (b n) t d', t=self.Frame, n=n, d=down)
        x = self.TFormer(x)   # temporal self-attention
        x = self.linear2(x)   # (b n) t d
        x = rearrange(x, '(b n) t d -> (b t) n d', t=self.Frame, n=n, d=d)
        return x
```
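A minimal smoke test, assuming the class above and the repo's TemporalTransformer are importable; the shapes follow the defaults (197 tokens of dimension 1024, 16-frame clips):

```python
import torch

# Batch of 2 clips, 16 frames each, flattened to (b*t, n, d) as forward() expects.
adapter = FRU_Adapter(channel=197, embed_dim=1024, Frame=16, hidden_dim=128)
x = torch.randn(2 * 16, 197, 1024)
out = adapter(x)
print(out.shape)  # torch.Size([32, 197, 1024]); same shape as the input, so the
                  # adapter output can be added residually to the backbone tokens
```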
Run the following commands to create the virtual environment and install dependencies:

```bash
conda create -n FRU_Adapter python=3.7.16
conda activate FRU_Adapter
pip install -r requirements.txt
```

Please follow the files (e.g., dfew.py) in preprocess for data preparation. Specifically, you need to generate annotations for the dataloader ("<path_to_video> <video_class>" in annotations).
The annotations usually include train.csv, val.csv, and test.csv. The format of each *.csv file is:

```
dataset_root/video_1 label_1
dataset_root/video_2 label_2
dataset_root/video_3 label_3
...
dataset_root/video_N label_N
```
An example of train.csv of DFEW fold 1 (fd1) is shown as follows:

```
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02522 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02536 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02578 6
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02581 5
```
To help you understand, DFEW_ZIP/Clip is divided into the 'clip_224x224_16f' and 'clip_224x224' datasets. In this part we use 16-frame clips, so we adopt the 'clip_224x224_16f' dataset. Here, 'dataset_root' refers to /home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f, 'video_1' refers to 02522, and 'label_1' refers to 5. A minimal sketch of how such an annotation file could be generated is shown below.
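This sketch assumes a directory of per-clip folders under dataset_root and a hypothetical `labels` mapping from clip id to class index; fill it from the official label files of your dataset:

```python
import os

# Hypothetical example: dataset_root and the labels dict must match your setup.
dataset_root = "/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f"
labels = {"02522": 5, "02536": 5}  # clip id -> class index (placeholder values)

# Write one "<path_to_video> <video_class>" line per clip, as the dataloader expects.
with open("train.csv", "w") as f:
    for clip_id, label in sorted(labels.items()):
        f.write(f"{os.path.join(dataset_root, clip_id)} {label}\n")
```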
- Download the pre-trained weights from google drive and move them to saved/model/pretrain.
- Run the following scripts to fine-tune the model on the target dataset.
- main.sh: 16-frame uniform sampling. It uses clip_224x224_16f and applies only to DFEW, because only DFEW provides both 16-frame and original-frame clips; the other datasets provide only original-frame clips.
- main_org.sh: original frames (e.g., 64 frames, 128 frames, etc.). It uses clip_224x224 in DFEW and reports 2-clip average results.
- DFEW: `scripts/dfew/main_org.sh`, `scripts/dfew/main.sh`
- FERV39k: `scripts/FERV39k/main_org.sh`
- MAFW: `scripts/mafw/main_org.sh`
The fine-tuned checkpoints for DFEW and FERV39k can be downloaded from google drive, and the MAFW checkpoint can be downloaded from google drive. Move them to the eval ckpts directory.
- DFEW: `scripts/dfew/main_org_eval.sh`, `scripts/dfew/main_eval.sh`
- FERV39k (16-frame uniform sampling): `scripts/FERV39k/main_org_eval.sh`
- MAFW (16-frame uniform sampling): `scripts/mafw/main_org_eval.sh`
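The scripts above handle checkpoint loading; as a quick sanity check that a downloaded file is intact, you can inspect it directly. The file name and the "state_dict" key below are assumptions, not repo guarantees:

```python
import torch

ckpt = torch.load("eval_ckpts/dfew_fd1.pth", map_location="cpu")  # hypothetical file name
state = ckpt.get("state_dict", ckpt)  # some checkpoints nest weights under "state_dict"
print(f"{len(state)} tensors, e.g. {next(iter(state))}")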
UAR denotes unweighted average recall and WAR denotes weighted average recall; the "16f" columns use 16-frame uniform sampling and the "2-clip" columns use the 2-clip average.

| Datasets | UAR (16f) | WAR (16f) | UAR (2-clip) | WAR (2-clip) |
|---|---|---|---|---|
| FERV39K | 38.65 | 50.12 | 41.08 | 52.70 |
| DFEW01 | 66.12 | 77.22 | 64.28 | 76.89 |
| DFEW02 | 63.12 | 75.13 | 63.85 | 74.88 |
| DFEW03 | 64.79 | 76.84 | 65.78 | 76.37 |
| DFEW04 | 66.14 | 77.05 | 66.39 | 78.20 |
| DFEW05 | 69.91 | 78.55 | 69.10 | 78.55 |
| DFEW (avg) | 66.02 | 76.96 | 65.88 | 76.98 |
| MAFW01 | 34.70 | 48.29 | 38.42 | 51.82 |
| MAFW02 | 41.66 | 55.58 | 42.13 | 56.18 |
| MAFW03 | 49.21 | 62.41 | 48.40 | 62.25 |
| MAFW04 | 46.58 | 64.08 | 49.36 | 65.23 |
| MAFW05 | 41.89 | 58.77 | 44.40 | 61.17 |
| MAFW (avg) | 42.80 | 57.83 | 44.54 | 59.33 |
If you have any questions, please feel free to reach out to me at gblader@naver.com.
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2025-RS-2022-00156295) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
If you think this project is helpful, please feel free to leave a star ⭐️ and cite our paper:
@article{her2025fru,
title={FRU-Adapter: Frame Recalibration Unit Adapter for Dynamic Facial Expression Recognition},
author={Her, Myungbeom and Nabi, Hamza Ghulam and Han, Ji-Hyeong},
journal={Electronics},
volume={14},
number={5},
pages={978},
year={2025},
publisher={MDPI}
}
