FRU-Adapter: Frame recalibration unit adapter for dynamic facial expression recognition
Myungbeom Her, Hamza Ghulam Nabi, and Ji-Hyeong Han*
Seoul National University of Science and Technology & HCIR Lab
[2025.10.08] FRU-Adapter surpasses smart-turn v3 on the turn detection task github
[2025.09.24] FRU-Adapter surpasses KOBERT on the named entity recognition (NER) task github
[2025.02.28] FRU-Adapter is published in Electronics
[2025.02.21] FRU-Adapter is accepted by Electronics
[2025.01.16] We release the code of FRU-Adapter
Dynamic facial expression recognition (DFER) is one of the most important challenges in computer vision, as it plays a crucial role in human-computer interaction. Recently, adapter-based approaches have been introduced to DFER and have achieved remarkable success. However, these adapters still suffer from two problems: they overlook the influence of irrelevant frames, and they interfere with pre-trained information. In this paper, we propose the frame recalibration unit adapter (FRU-Adapter), which combines the strengths of a frame recalibration unit (FRU) and temporal self-attention (T-SA) to address the aforementioned issues.
```python
import torch.nn as nn
from einops import rearrange

# TemporalTransformer (the temporal self-attention module, T-SA) is defined elsewhere in this repo.

# Frame recalibration unit adapter
class FRU_Adapter(nn.Module):
    def __init__(self,
                 channel=197,      # tokens per frame (ViT patch tokens + CLS)
                 embed_dim=1024,   # token embedding dimension
                 Frame=16,         # frames per clip
                 hidden_dim=128):  # bottleneck dimension
        super().__init__()
        self.Frame = Frame
        self.linear1 = nn.Linear(embed_dim, hidden_dim)  # down-projection
        self.linear2 = nn.Linear(hidden_dim, embed_dim)  # up-projection
        self.T_linear1 = nn.Linear(Frame, Frame)         # per-frame importance scores
        self.softmax = nn.Softmax(dim=1)
        self.ln = nn.LayerNorm(hidden_dim)
        self.TFormer = TemporalTransformer(frame=Frame, emb_dim=hidden_dim)

    # Frame recalibration unit: reweight each frame by a learned importance score
    def FRU(self, x):
        x1 = x.mean(-1).flatten(1)           # b t
        x1 = self.T_linear1(x1)              # b t
        x1 = self.softmax(x1).unsqueeze(-1)  # b t 1
        x = x * x1                           # b t (n d)
        return x

    def forward(self, x):
        # x: (b t) n d
        bt, n, d = x.shape
        x = rearrange(x, '(b t) n d -> (b n) t d', t=self.Frame, n=n, d=d)
        x = self.linear1(x)   # (b n) t hidden_dim
        x = self.ln(x)
        _, _, down = x.shape
        x = rearrange(x, '(b n) t d -> b t (n d)', t=self.Frame, n=n, d=down)
        x = self.FRU(x)       # recalibrate frame importance
        x = rearrange(x, 'b t (n d) -> (b n) t d', t=self.Frame, n=n, d=down)
        x = self.TFormer(x)   # temporal self-attention
        x = self.linear2(x)   # (b n) t d
        x = rearrange(x, '(b n) t d -> (b t) n d', t=self.Frame, n=n, d=d)
        return x
```
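A minimal smoke test, assuming the class above and the repo's TemporalTransformer are importable; the shapes follow the defaults (197 tokens of dimension 1024, 16-frame clips):

```python
import torch

# Batch of 2 clips, 16 frames each, flattened to (b*t, n, d) as forward() expects.
adapter = FRU_Adapter(channel=197, embed_dim=1024, Frame=16, hidden_dim=128)
x = torch.randn(2 * 16, 197, 1024)
out = adapter(x)
print(out.shape)  # torch.Size([32, 197, 1024]); same shape as the input, so the
                  # adapter output can be added residually to the backbone tokens
```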
Run the following commands to create the virtual environment and install dependencies:

```bash
conda create -n FRU_Adapter python=3.7.16
conda activate FRU_Adapter
pip install -r requirements.txt
```

Please follow the files (e.g., dfew.py) in preprocess for data preparation. Specifically, you need to generate annotations for the dataloader ("<path_to_video> <video_class>" in annotations).
The annotations usually include train.csv, val.csv, and test.csv. The format of each *.csv file is:

```
dataset_root/video_1 label_1
dataset_root/video_2 label_2
dataset_root/video_3 label_3
...
dataset_root/video_N label_N
```
An example of train.csv of DFEW fold 1 (fd1) is shown as follows:

```
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02522 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02536 5
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02578 6
/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f/02581 5
```
To help you understand, DFEW_ZIP/Clip is divided into the 'clip_224x224_16f' and 'clip_224x224' datasets. In this part we use 16-frame clips, so we adopt the 'clip_224x224_16f' dataset. Here, 'dataset_root' refers to /home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f, 'video_1' refers to 02522, and 'label_1' refers to 5. A minimal sketch of how such an annotation file could be generated is shown below.
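This sketch assumes a directory of per-clip folders under dataset_root and a hypothetical `labels` mapping from clip id to class index; fill it from the official label files of your dataset:

```python
import os

# Hypothetical example: dataset_root and the labels dict must match your setup.
dataset_root = "/home/gpuadmin/MB/DFEW_ZIP/Clip/clip_224x224_16f"
labels = {"02522": 5, "02536": 5}  # clip id -> class index (placeholder values)

# Write one "<path_to_video> <video_class>" line per clip, as the dataloader expects.
with open("train.csv", "w") as f:
    for clip_id, label in sorted(labels.items()):
        f.write(f"{os.path.join(dataset_root, clip_id)} {label}\n")
```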
- Download the pre-trained weights from google drive and move them to saved/model/pretrain.
- Run the following scripts to fine-tune the model on the target dataset.
- main.sh: 16-frame uniform sampling. It uses clip_224x224_16f and applies only to DFEW, because only DFEW provides both 16-frame and original-frame clips; the other datasets provide only original-frame clips.
- main_org.sh: original frames (e.g., 64 frames, 128 frames, etc.). It uses clip_224x224 in DFEW and reports 2-clip average results.
- DFEW: `scripts/dfew/main_org.sh`, `scripts/dfew/main.sh`
- FERV39k: `scripts/FERV39k/main_org.sh`
- MAFW: `scripts/mafw/main_org.sh`
The fine-tuned checkpoints for DFEW and FERV39k can be downloaded from google drive, and the MAFW checkpoint can be downloaded from google drive. Move them to the eval ckpts directory.
- DFEW: `scripts/dfew/main_org_eval.sh`, `scripts/dfew/main_eval.sh`
- FERV39k (16-frame uniform sampling): `scripts/FERV39k/main_org_eval.sh`
- MAFW (16-frame uniform sampling): `scripts/mafw/main_org_eval.sh`
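The scripts above handle checkpoint loading; as a quick sanity check that a downloaded file is intact, you can inspect it directly. The file name and the "state_dict" key below are assumptions, not repo guarantees:

```python
import torch

ckpt = torch.load("eval_ckpts/dfew_fd1.pth", map_location="cpu")  # hypothetical file name
state = ckpt.get("state_dict", ckpt)  # some checkpoints nest weights under "state_dict"
print(f"{len(state)} tensors, e.g. {next(iter(state))}")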
UAR denotes unweighted average recall and WAR denotes weighted average recall; the "16f" columns use 16-frame uniform sampling and the "2-clip" columns use the 2-clip average.

| Datasets | UAR (16f) | WAR (16f) | UAR (2-clip) | WAR (2-clip) |
|---|---|---|---|---|
| FERV39K | 38.65 | 50.12 | 41.08 | 52.70 |
| DFEW01 | 66.12 | 77.22 | 64.28 | 76.89 |
| DFEW02 | 63.12 | 75.13 | 63.85 | 74.88 |
| DFEW03 | 64.79 | 76.84 | 65.78 | 76.37 |
| DFEW04 | 66.14 | 77.05 | 66.39 | 78.20 |
| DFEW05 | 69.91 | 78.55 | 69.10 | 78.55 |
| DFEW (avg) | 66.02 | 76.96 | 65.88 | 76.98 |
| MAFW01 | 34.70 | 48.29 | 38.42 | 51.82 |
| MAFW02 | 41.66 | 55.58 | 42.13 | 56.18 |
| MAFW03 | 49.21 | 62.41 | 48.40 | 62.25 |
| MAFW04 | 46.58 | 64.08 | 49.36 | 65.23 |
| MAFW05 | 41.89 | 58.77 | 44.40 | 61.17 |
| MAFW (avg) | 42.80 | 57.83 | 44.54 | 59.33 |
If you have any questions, please feel free to reach out to me at gblader@naver.com.
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2025-RS-2022-00156295) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
If you think this project is helpful, please feel free to leave a star ⭐️ and cite our paper:
@article{her2025fru,
title={FRU-Adapter: Frame Recalibration Unit Adapter for Dynamic Facial Expression Recognition},
author={Her, Myungbeom and Nabi, Hamza Ghulam and Han, Ji-Hyeong},
journal={Electronics},
volume={14},
number={5},
pages={978},
year={2025},
publisher={MDPI}
}
