This project implements a face verification system using a model inspired by Siamese Neural Networks (SNN), trained with contrastive loss.
- Predict whether two input face images represent the same person.
- The model learns to embed pairs of pictures so that two embedded images are close to each other in the vector space if they come from the same person, and far from each other if they come from different people.
The model consists of three primary stages: image preprocessing, image embedding, and distance-based comparison.
Every image taken from the dataset undergoes the following steps:
- MTCNN: crops faces with a margin of 30 so that faces are properly captured, then resizes images to 105 x 105 px to fit the model's input.
- Transformation Pipeline:
  - Converts the image to float in range [0, 1].
  - Normalizes the image using the mean and standard deviation computed from the training dataset (helps stabilize training).
  - Applies `RandomHorizontalFlip()` for data augmentation (training only).
Architecture:
- 4 convolution blocks with `ReLU` and `MaxPool` (the 4th block has no max-pooling).
- 1 linear layer after flattening.
- Output: a 4096-dimensional vector.
- Euclidean distance layer: computes the distance between the 2 embedding vectors.
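A sketch of an embedding network fitting these constraints; the filter counts and kernel sizes are assumptions (borrowed from the classic Koch et al. Siamese architecture), since the text only fixes the block count, input size, and output size:

```python
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """4 conv blocks + 1 linear layer -> 4096-d embedding.
    Filter counts and kernel sizes are assumptions, not the project's exact values."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 10), nn.ReLU(), nn.MaxPool2d(2),   # 105 -> 96 -> 48
            nn.Conv2d(64, 128, 7), nn.ReLU(), nn.MaxPool2d(2),  # 48 -> 42 -> 21
            nn.Conv2d(128, 128, 4), nn.ReLU(), nn.MaxPool2d(2), # 21 -> 18 -> 9
            nn.Conv2d(128, 256, 4), nn.ReLU(),                  # 4th block: no max-pooling
        )
        self.fc = nn.Linear(256 * 6 * 6, 4096)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def embedding_distance(net, x1, x2):
    # Euclidean distance layer: compare the two embedding vectors.
    return torch.pairwise_distance(net(x1), net(x2))
```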
- The model is trained using contrastive loss.
- L2 regularization is added to the loss function to mitigate overfitting.
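The training objective can be sketched as the standard contrastive loss; the margin value and the label convention (1 for same-person pairs) are assumptions:

```python
import torch

def contrastive_loss(distance, label, margin=1.0):
    """Contrastive loss over a batch of pair distances.
    label = 1 for same-person pairs, 0 otherwise (an assumed convention).
    Same-person pairs are pulled together; different-person pairs are
    pushed until their distance exceeds the margin."""
    pos = label * distance.pow(2)                                     # pull together
    neg = (1 - label) * torch.clamp(margin - distance, min=0).pow(2)  # push apart
    return (pos + neg).mean()
```

In PyTorch, the L2 regularization mentioned above is often supplied through the optimizer's `weight_decay` argument rather than by modifying the loss expression itself.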
To train the model, simply run:
```
python3 train.py
```

This program automatically:
- Loads the training and validation datasets from `dataset/train_ds.npy` and `dataset/val_ds.npy`
- Caches the preprocessed images
- Optimizes the model and saves the model's state (checkpoint) after each epoch.
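The per-epoch checkpointing could be sketched like this; the field names and the `ckpt_epoch_*.pt` naming are assumptions, not the project's actual layout:

```python
import os
import torch

def save_checkpoint(model, optimizer, epoch, directory="checkpoints"):
    """Save everything needed to resume training after `epoch`."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"ckpt_epoch_{epoch}.pt")
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)
    return path

def load_checkpoint(model, optimizer, path):
    """Restore model/optimizer state; return the epoch to resume from."""
    ckpt = torch.load(path, weights_only=False)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1
```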
If the `batches/` folder does not exist, please read the notice below.

Notice

Uncomment these 2 lines if this is the first time running this program, or if you want to overwrite both the `batches/train` and `batches/validate` folders:

```
cache_images(dataloader_train, mode="train")
cache_images(dataloader_val, mode="validate")
```

To speed up training and avoid redundant image preprocessing in every epoch, all preprocessed images are cached on the first run. After that, you can comment these lines out again, making sure the cache remains intact.
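A plausible sketch of what `cache_images` does is saving each preprocessed batch to disk so later epochs can load tensors directly; the file layout here is an assumption:

```python
import os
import torch

def cache_images(dataloader, mode="train", root="batches"):
    """Save every preprocessed batch under batches/<mode>/ so subsequent
    epochs can skip MTCNN cropping and the transform pipeline."""
    out_dir = os.path.join(root, mode)
    os.makedirs(out_dir, exist_ok=True)
    for i, batch in enumerate(dataloader):
        torch.save(batch, os.path.join(out_dir, f"batch_{i}.pt"))
```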
To evaluate the model on the validation set, run:

```
python3 evaluate.py
```

- Evaluates the trained model over a range of thresholds (from 0.4 to 0.8) on the validation dataset.
- Prints the following metrics for each threshold:
- F1-score
- Accuracy
- Precision
- Recall
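The evaluation loop can be sketched as a threshold sweep over precomputed pair distances, where pairs closer than the threshold are predicted "same person":

```python
import torch

def sweep_thresholds(distances, labels, thresholds):
    """distances: pair distances; labels: 1 = same person, 0 = different.
    Returns the four metrics printed by evaluate.py for each threshold."""
    results = {}
    for t in thresholds:
        pred = (distances < t).float()          # predict "same" below threshold
        tp = ((pred == 1) & (labels == 1)).sum().item()
        fp = ((pred == 1) & (labels == 0)).sum().item()
        fn = ((pred == 0) & (labels == 1)).sum().item()
        tn = ((pred == 0) & (labels == 0)).sum().item()
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        accuracy = (tp + tn) / len(labels)
        results[round(t, 2)] = {"f1": f1, "accuracy": accuracy,
                                "precision": precision, "recall": recall}
    return results
```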
To test the model with the test dataset, run:

```
python3 inference.py
```

- Tests the model on the test dataset with a fixed threshold = 0.459.
- Prints the following metrics:
- F1-score
- Accuracy
- Precision
- Recall
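The decision rule behind `inference.py` reduces to a single comparison against the fixed threshold:

```python
def verify(distance, threshold=0.459):
    """Two faces are declared the same person exactly when their
    embedding distance falls below the threshold."""
    return distance < threshold
```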
To install dependencies, run:

```
pip install -r requirements.txt
```

- Download the VGGFace2 dataset (https://www.kaggle.com/datasets/hearfool/vggface2?select=train). Make sure to note the name of the downloaded folder.
- To split this newly downloaded dataset into train/validate/test sets, run:

```
python3 prepare_dataset.py --split <FOLDER'S_NAME>
```

- (Optional) After getting the 3 new folders `data/train`, `data/validate`, and `data/test`, to remove images that do not contain a proper human face, run:

```
python3 prepare_dataset.py --remove-bad
```

- To generate the train dataset with labelled data, run:

```
python3 prepare_dataset.py --train-dataset-generate
```

For more information regarding this and other steps, run:

```
python3 prepare_dataset.py --help
```

- To generate the validation dataset with labelled data, run:

```
python3 prepare_dataset.py --validate-dataset-generate
```

- To generate the test dataset with labelled data, run:

```
python3 prepare_dataset.py --test-dataset-generate
```

After these steps, there are 2 folders:
- `data` contains the 3 sets; within each set the data is unlabelled.
- `dataset` contains 3 `.npy` files, representing the 3 different sets; the data is labelled.
To compute the `mean` and `standard deviation` from the train dataset, simply run:

```
python3 compute_mean_std.py
```

This creates a file named `mean_std.pt` stored in `data/`.
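What `compute_mean_std.py` computes can be sketched as per-channel statistics over the stacked training images; the `(N, 3, H, W)` tensor layout and function name are assumptions:

```python
import torch

def compute_mean_std(images):
    """images: float tensor of shape (N, 3, H, W) with values in [0, 1].
    Returns the per-channel mean and std over every pixel of every image."""
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3))
    return mean, std
```

The script would then persist the result, e.g. with `torch.save`, so the normalization step of the transformation pipeline can load it from `data/mean_std.pt`.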