This is an official implementation for FakeFormer! [📜Paper]
Contact: dat.nguyen@uni.lu. Any questions or discussions are welcome!
- 27/11/2025: Released the official code and pretrained weights 🌈.
- 06/09/2024: First version of this open-source code pre-released 🌱.
Recently, Vision Transformers (ViTs) have achieved unprecedented effectiveness in the general domain of image classification. Nonetheless, these models remain underexplored in the field of deepfake detection, given their lower performance compared to Convolutional Neural Networks (CNNs) in that specific context. In this paper, we start by investigating why plain ViT architectures exhibit suboptimal performance when dealing with the detection of facial forgeries. Our analysis reveals that, compared to CNNs, ViTs struggle to model the localized forgery artifacts that typically characterize deepfakes. Based on this observation, we propose a deepfake detection framework called FakeFormer, which extends ViTs to enforce the extraction of subtle inconsistency-prone information. For that purpose, an explicit attention learning guided by artifact-vulnerable patches and tailored to ViTs is introduced. Extensive experiments are conducted on diverse well-known datasets, including FF++, Celeb-DF, WildDeepfake, DFD, DFDCP, and DFDC. The results show that FakeFormer outperforms the state of the art in terms of generalization and computational cost, without the need for large-scale training datasets.
Results on 6 datasets (CDF1, CDF2, DFW, DFD, DFDC, DFDCP) under the cross-dataset evaluation setting, reported in AP and AUC.
For experimental purposes, we encourage installing the following libraries. Either Conda or a Python virtual environment should work.
- CUDA: 11.4
- Python: >= 3.8.x
- PyTorch: 1.8.0
- TensorboardX: 2.5.1
- ImgAug: 0.4.0
- Scikit-image: 0.17.2
- Torchvision: 0.9.0
- Albumentations: 1.1.0
- mmcv: 1.6.1
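As an illustration, one possible way to set up such an environment with Conda and pip, using the versions listed above (this is a sketch, not an official script; the exact CUDA-enabled PyTorch wheel depends on your system):

```bash
# Illustrative environment setup; pick the PyTorch/CUDA wheel matching your driver.
conda create -n fakeformer python=3.8 -y
conda activate fakeformer
pip install torch==1.8.0 torchvision==0.9.0
pip install tensorboardX==2.5.1 imgaug==0.4.0 scikit-image==0.17.2 albumentations==1.1.0
# mmcv 1.6.1 is best built from source; see the Preparation section below.
```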
- 📌 The pre-trained weights of FakeFormer and FakeSwin can be found here
We further provide an optional Dockerfile that can be used to build a working environment with Docker. More detailed steps can be found here.
- Install Docker on the system (skip this step if Docker has already been installed):

  ```bash
  sudo apt install docker
  ```

- To start building your Docker environment, go to the folder `dockerfiles`:

  ```bash
  cd dockerfiles
  ```

- Create a Docker image (you can use any name you want):

  ```bash
  docker build --tag 'fakeformer' .
  ```
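  Once the image is built, a container can be started from it. The following is a minimal sketch only: the GPU flag and mount paths are assumptions about a typical setup, not part of the official instructions.

  ```bash
  # Illustrative: adjust mounts and GPU options to your system.
  docker run -it --gpus all \
    -v /path/to/datasets:/data \
    -v "$(pwd)":/workspace \
    fakeformer /bin/bash
  ```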
- Preparation

  - Prepare environment

    Install the main packages listed in the recommended environment above. Note that we recommend building mmcv from source, as below:

    ```bash
    git clone https://github.com/open-mmlab/mmcv.git
    cd mmcv
    git checkout v1.6.1
    MMCV_WITH_OPS=1 pip install -e .
    ```
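    As a quick, optional sanity check that the build succeeded (this one-liner is a suggestion, not part of the official scripts):

    ```bash
    python -c "import mmcv; print(mmcv.__version__)"  # should print 1.6.1
    ```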
  - Prepare dataset
    - Download the FF++ Original dataset for training data preparation. Following the original split convention, it is first used to randomly extract frames and facial crops:

      ```bash
      python package_utils/images_crop.py -d {dataset} \
        -c {compression} \
        -n {num_frames} \
        -t {task}
      ```

      (This script can also be used for cropping faces in other datasets such as CDF1, CDF2, DFD, DFDCP, and DFDC for the cross-evaluation test. You do not need to run cropping for DFW, as that data is already preprocessed.)
      | Parameter | Value | Definition |
      |---|---|---|
      | `-d` | Subfolder in each dataset, e.g. `['Face2Face','Deepfakes','FaceSwap','NeuralTextures', ...]` | You can use one of these datasets |
      | `-c` | `['raw','c23','c40']` | You can use one of these compression levels |
      | `-n` | 128 | Number of frames (default 32 for val/test and 128 for train) |
      | `-t` | `['train', 'val', 'test']` | Default `train` |

      The cropped faces are saved for online pseudo-fake generation during training, following the data structure below:
      ```
      ROOT = '/data/deepfake_cluster/datasets_df'
      ├── Celeb-DFv2
      │   └── ...
      └── FF++
          └── c0
              ├── test
              │   ├── frames
              │   │   ├── Deepfakes
              │   │   │   ├── 000_003
              │   │   │   ├── 044_945
              │   │   │   ├── 138_142
              │   │   │   └── ...
              │   │   ├── Face2Face
              │   │   ├── FaceSwap
              │   │   ├── NeuralTextures
              │   │   └── original
              │   └── videos
              ├── train
              │   ├── frames
              │   │   ├── aligned
              │   │   │   ├── 001
              │   │   │   ├── 002
              │   │   │   └── ...
              │   │   └── original
              │   │       ├── 001
              │   │       ├── 002
              │   │       └── ...
              │   └── videos
              └── val
                  ├── frames
                  │   ├── aligned
                  │   └── original
                  └── videos
      ```
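      For instance, assuming the original FF++ videos sit under the `original` subfolder shown above, a concrete call for the training split could look like the following (the compression level and frame count are illustrative; adjust them to your setup):

      ```bash
      python package_utils/images_crop.py -d original \
        -c c23 \
        -n 128 \
        -t train
      ```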
    - Download the pretrained Dlib facial landmark detectors [68] [81] and place them into `/pretrained/`. The 68- and 81-landmark models are used for the BI and SBI synthesis, respectively.
    - Landmark detection and alignment. While landmarks are extracted, a folder for aligned images (`aligned`) is automatically created with the same directory tree as the original one. After the following script finishes, a file storing the metadata of the data is saved at `processed_data/c0/{SPLIT}_<n_landmarks>_FF++_processed.json`:

      ```bash
      python package_utils/geo_landmarks_extraction.py \
        --config configs/data_preprocessing_c0.yaml \
        --extract_landmarks \
        --save_aligned
      ```
    - (Optional) Finally, if using BI synthesis for the online pseudo-fake generation scheme, 30 similar landmarks are searched beforehand for each facial query image:

      ```bash
      python package_utils/bi_online_generation.py \
        -t search_similar_lms \
        -f processed_data/c0/{SPLIT}_68_FF++_processed.json
      ```

      The final annotation file for training is created as `processed_data/c0/dynamic_{SPLIT}BI_FF.json`.
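      For example, for the training split (using the metadata file produced in the previous step; the split name is illustrative):

      ```bash
      python package_utils/bi_online_generation.py \
        -t search_similar_lms \
        -f processed_data/c0/train_68_FF++_processed.json
      ```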
- Training script
  We offer a number of config files for specific data synthesis. For FakeFormer with BI, open `configs/spatial/vit_bi_small.yaml`, make sure you set `TRAIN: True` and `FROM_FILE: True`, and run:

  ```bash
  ./scripts/vit_bi.sh
  ```

  Otherwise, for SBI, use the config file `configs/spatial/vit_sbi_small.yaml` and run:

  ```bash
  ./scripts/vit_sbi.sh
  ```

  You can also find other configs for FakeSwin in the `configs/` folder.
- Testing script
  For FakeFormer with BI, open `configs/spatial/vit_bi_small.yaml`, set `subtask: eval` in the test section to enable evaluation mode, set `TRAIN: False` and `FROM_FILE: False`, and run:

  ```bash
  ./scripts/test_bi.sh
  ```

  Otherwise, for SBI:

  ```bash
  ./scripts/test_sbi.sh
  ```

  ⚠️ Please make sure you set the correct path to your downloaded pre-trained weights in the config files.

  ℹ️ Flip test can be enabled by setting `flip_test: True`.

  ℹ️ A single-image inference mode is also provided: set `sub_task: test_image` and pass an image path as an argument to `test.py`.
Please contact dat.nguyen@uni.lu. Any questions or discussions are welcome!
This software is © University of Luxembourg and is licensed under the SnT academic license. See LICENSE.
We acknowledge the excellent implementation from OpenMMLab (mmengine, mmcv), BI, SBI, and LAA-Net.
Please consider citing our paper in your publications.
@article{nguyen2024fakeformer,
title={Fakeformer: Efficient vulnerability-driven transformers for generalisable deepfake detection},
author={Nguyen, Dat and Astrid, Marcella and Ghorbel, Enjie and Aouada, Djamila},
journal={arXiv preprint arXiv:2410.21964},
year={2024}
}

