This project implements a face verification system using a model inspired by Siamese Neural Networks (SNN), trained with contrastive loss.
- Predict whether two input face images represent the same person.
- The model learns to embed pairs of pictures so that two embedded images are close to each other in the vector space if they come from the same person, and far from each other if they come from different people.
The model consists of three primary stages: image preprocessing, image embedding, and distance-based comparison.
Every image taken from the dataset undergoes the following steps:
- MTCNN: crops faces with a margin of 30 so that faces are properly captured, then resizes images to 105 x 105 px to fit the model's input.
- Transformation Pipeline:
  - Converts the image to float in range [0, 1].
  - Normalizes the image using the mean and standard deviation computed from the training dataset (helps stabilize training).
  - Applies `RandomHorizontalFlip()` for data augmentation (training only).
Architecture:
- 4 convolution blocks with `ReLU` and `MaxPool` (the 4th block has no max-pooling).
- 1 linear layer after flattening.
- Output: a 4096-dimensional vector.
- Euclidean distance layer: computes the distance between the 2 embedding vectors.
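A sketch of an embedding network fitting these constraints; the filter counts and kernel sizes are assumptions (borrowed from the classic Koch et al. Siamese architecture), since the text only fixes the block count, input size, and output size:

```python
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """4 conv blocks + 1 linear layer -> 4096-d embedding.
    Filter counts and kernel sizes are assumptions, not the project's exact values."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 10), nn.ReLU(), nn.MaxPool2d(2),   # 105 -> 96 -> 48
            nn.Conv2d(64, 128, 7), nn.ReLU(), nn.MaxPool2d(2),  # 48 -> 42 -> 21
            nn.Conv2d(128, 128, 4), nn.ReLU(), nn.MaxPool2d(2), # 21 -> 18 -> 9
            nn.Conv2d(128, 256, 4), nn.ReLU(),                  # 4th block: no max-pooling
        )
        self.fc = nn.Linear(256 * 6 * 6, 4096)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def embedding_distance(net, x1, x2):
    # Euclidean distance layer: compare the two embedding vectors.
    return torch.pairwise_distance(net(x1), net(x2))
```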
- The model is trained using contrastive loss.
- L2 regularization is added to the loss function to mitigate overfitting.
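The training objective can be sketched as the standard contrastive loss; the margin value and the label convention (1 for same-person pairs) are assumptions:

```python
import torch

def contrastive_loss(distance, label, margin=1.0):
    """Contrastive loss over a batch of pair distances.
    label = 1 for same-person pairs, 0 otherwise (an assumed convention).
    Same-person pairs are pulled together; different-person pairs are
    pushed until their distance exceeds the margin."""
    pos = label * distance.pow(2)                                     # pull together
    neg = (1 - label) * torch.clamp(margin - distance, min=0).pow(2)  # push apart
    return (pos + neg).mean()
```

In PyTorch, the L2 regularization mentioned above is often supplied through the optimizer's `weight_decay` argument rather than by modifying the loss expression itself.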
To train the model, simply run:
```
python3 train.py
```

This program automatically:
- Loads the training and validation datasets from `dataset/train_ds.npy` and `dataset/val_ds.npy`
- Caches the preprocessed images
- Optimizes the model and saves the model's state (checkpoint) after each epoch.
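The per-epoch checkpointing could be sketched like this; the field names and the `ckpt_epoch_*.pt` naming are assumptions, not the project's actual layout:

```python
import os
import torch

def save_checkpoint(model, optimizer, epoch, directory="checkpoints"):
    """Save everything needed to resume training after `epoch`."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"ckpt_epoch_{epoch}.pt")
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)
    return path

def load_checkpoint(model, optimizer, path):
    """Restore model/optimizer state; return the epoch to resume from."""
    ckpt = torch.load(path, weights_only=False)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1
```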
If the `batches/` folder does not exist, please read the notice below.

Notice

Uncomment these 2 lines if this is the first time running this program, or if you want to overwrite both the `batches/train` and `batches/validate` folders:

```
cache_images(dataloader_train, mode="train")
cache_images(dataloader_val, mode="validate")
```

To speed up training and avoid redundant image preprocessing in every epoch, all preprocessed images are cached on the first run. After that, you can comment these lines out again, making sure the cache remains intact.
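A plausible sketch of what `cache_images` does is saving each preprocessed batch to disk so later epochs can load tensors directly; the file layout here is an assumption:

```python
import os
import torch

def cache_images(dataloader, mode="train", root="batches"):
    """Save every preprocessed batch under batches/<mode>/ so subsequent
    epochs can skip MTCNN cropping and the transform pipeline."""
    out_dir = os.path.join(root, mode)
    os.makedirs(out_dir, exist_ok=True)
    for i, batch in enumerate(dataloader):
        torch.save(batch, os.path.join(out_dir, f"batch_{i}.pt"))
```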
To evaluate the model on the validation set, run:

```
python3 evaluate.py
```

- Evaluates the trained model over a range of thresholds (from 0.4 to 0.8) on the validation dataset.
- Prints the following metrics for each threshold:
- F1-score
- Accuracy
- Precision
- Recall
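The evaluation loop can be sketched as a threshold sweep over precomputed pair distances, where pairs closer than the threshold are predicted "same person":

```python
import torch

def sweep_thresholds(distances, labels, thresholds):
    """distances: pair distances; labels: 1 = same person, 0 = different.
    Returns the four metrics printed by evaluate.py for each threshold."""
    results = {}
    for t in thresholds:
        pred = (distances < t).float()          # predict "same" below threshold
        tp = ((pred == 1) & (labels == 1)).sum().item()
        fp = ((pred == 1) & (labels == 0)).sum().item()
        fn = ((pred == 0) & (labels == 1)).sum().item()
        tn = ((pred == 0) & (labels == 0)).sum().item()
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        accuracy = (tp + tn) / len(labels)
        results[round(t, 2)] = {"f1": f1, "accuracy": accuracy,
                                "precision": precision, "recall": recall}
    return results
```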
To test the model with the test dataset, run:

```
python3 inference.py
```

- Tests the model on the test dataset with a fixed threshold = 0.459.
- Prints the following metrics:
- F1-score
- Accuracy
- Precision
- Recall
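The decision rule behind `inference.py` reduces to a single comparison against the fixed threshold:

```python
def verify(distance, threshold=0.459):
    """Two faces are declared the same person exactly when their
    embedding distance falls below the threshold."""
    return distance < threshold
```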
To install dependencies, run:

```
pip install -r requirements.txt
```

- Download the VGGFace2 dataset (https://www.kaggle.com/datasets/hearfool/vggface2?select=train). Make sure to note the name of the downloaded folder.
- To split this newly downloaded dataset into train/validate/test sets, run:

```
python3 prepare_dataset.py --split <FOLDER'S_NAME>
```

- (Optional) After getting the 3 new folders `data/train`, `data/validate`, and `data/test`, to remove images that do not contain a proper human face, run:

```
python3 prepare_dataset.py --remove-bad
```

- To generate the train dataset with labelled data, run:

```
python3 prepare_dataset.py --train-dataset-generate
```

For more information regarding this and other steps, run:

```
python3 prepare_dataset.py --help
```

- To generate the validation dataset with labelled data, run:

```
python3 prepare_dataset.py --validate-dataset-generate
```

- To generate the test dataset with labelled data, run:

```
python3 prepare_dataset.py --test-dataset-generate
```

After these steps, there are 2 folders:
- `data` contains the 3 sets; within each set the data is unlabelled.
- `dataset` contains 3 `.npy` files, representing the 3 different sets; the data is labelled.
To compute the `mean` and `standard deviation` from the train dataset, simply run:

```
python3 compute_mean_std.py
```

This creates a file named `mean_std.pt` stored in `data/`.
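What `compute_mean_std.py` computes can be sketched as per-channel statistics over the stacked training images; the `(N, 3, H, W)` tensor layout and function name are assumptions:

```python
import torch

def compute_mean_std(images):
    """images: float tensor of shape (N, 3, H, W) with values in [0, 1].
    Returns the per-channel mean and std over every pixel of every image."""
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3))
    return mean, std
```

The script would then persist the result, e.g. with `torch.save`, so the normalization step of the transformation pipeline can load it from `data/mean_std.pt`.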