Bachelor Project: Automated Early Detection of Diabetic Retinopathy in Retinal Fundus Photographs using Deep Learning
Welcome to the 'Bachelor Project: Automated Early Detection of Diabetic Retinopathy in Retinal Fundus Photographs using Deep Learning' repository! This is a Bachelor graduation project at Ghent University.
There are nine Python scripts (.py) and two Jupyter notebooks (.ipynb), in the folders py and ipynb respectively. The entire procedure is divided into five steps.
- Image preparation
- Split in 5 Folds
- Image Classification
- Class Activation Mapping (CAM)
- Visualization
Resizing_dataset.py resizes every image to the same size, 224 × 224 pixels (width × height).
(a) A sample image showing the retina partially (2416 x 1736) (b) Resized image of (a) (224 x 224) (c) A sample image showing the retina entirely (2048 x 1536) (d) Resized image of (c) (224 x 224)
fold_split.py splits the initial dataset into five folds for 5-fold cross-validation and writes a CSV file with the columns image name, class, and fold.
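The fold assignment can be sketched roughly as follows. This is a minimal illustration, not the contents of fold_split.py: the function names, the column headers `image_name`/`class`/`fold`, the 1-based fold numbering, the per-class (stratified) assignment, and the sample labels are all assumptions made for the example.

```python
import csv
import io
import random
from collections import defaultdict


def assign_folds(labels, n_folds=5, seed=0):
    """Map each image name to a fold, spreading every class evenly over the folds."""
    by_class = defaultdict(list)
    for name, cls in labels.items():
        by_class[cls].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    folds = {}
    for cls in sorted(by_class):
        names = sorted(by_class[cls])
        rng.shuffle(names)
        for i, name in enumerate(names):
            folds[name] = i % n_folds + 1  # assumed numbering: folds 1..n_folds
    return folds


def write_fold_csv(labels, folds, fileobj):
    """Write one row per image with its name, class, and fold."""
    writer = csv.writer(fileobj)
    writer.writerow(["image_name", "class", "fold"])
    for name in sorted(labels):
        writer.writerow([name, labels[name], folds[name]])


# Hypothetical labels: 50 images, 5 DR classes (0..4), 10 images per class.
labels = {f"img_{i:03d}.jpeg": i % 5 for i in range(50)}
folds = assign_folds(labels)

buffer = io.StringIO()  # stands in for a real CSV file on disk
write_fold_csv(labels, folds, buffer)
```

Assigning folds per class keeps the class proportions similar in every fold, which matters when some DR grades are rare. Whether the project stratifies in this way is not stated here.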
train.py, Test_dataset.py, SubPolicy.py, and CIFAR10Policy.py are used for training the model.
model_alexnet.py, models_vgg.py, Create_CAM_Final.py, and Test_dataset.py are used for creating class activation maps (CAM). (The first two scripts extract CAMs with the AlexNet and VGGNet models. In this study, CAMs were extracted only with ResNet, because ResNet152 was the best model for the dataset.)
cm_plot_jupyter.ipynb creates the confusion matrix, and Transforms vs Albumentations.ipynb measures the image processing time when using the transforms from torchvision versus the open-source library albumentations.
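For readers unfamiliar with the confusion matrix, the counting behind it can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's code: the function name, the row/column convention (rows are true classes, columns are predicted classes), and the sample labels are assumptions.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_cls, pred_cls in zip(y_true, y_pred):
        matrix[true_cls][pred_cls] += 1
    return matrix


# Hypothetical labels for 8 test images and 5 DR classes (0..4).
y_true = [0, 0, 1, 2, 2, 3, 4, 4]
y_pred = [0, 1, 1, 2, 2, 3, 4, 3]

cm = confusion_matrix(y_true, y_pred, n_classes=5)
for row in cm:
    print(row)
```

Entries on the diagonal are correct predictions. Off-diagonal entries show which classes are confused with each other.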
The micro-averaged F1 score was highest when the ResNet152 model was trained on the dataset. In particular, the highest micro-averaged F1 score was obtained when Fold 3 was used as the test set. Moreover, the standard deviation is very low for every model, indicating that the micro-averaged F1 scores of the individual folds lie close to the mean.
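As a reminder of how these numbers are obtained, the sketch below computes a micro-averaged F1 score from pooled true positives, false positives, and false negatives, and then summarizes per-fold scores by mean and standard deviation. The function name, the sample labels, and the per-fold scores are hypothetical and are not results from this project.

```python
from statistics import mean, stdev


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for cls in set(y_true) | set(y_pred):
        for true_cls, pred_cls in zip(y_true, y_pred):
            if pred_cls == cls and true_cls == cls:
                tp += 1
            elif pred_cls == cls:
                fp += 1
            elif true_cls == cls:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels for 8 test images and 5 DR classes (0..4).
y_true = [0, 0, 1, 2, 2, 3, 4, 4]
y_pred = [0, 1, 1, 2, 2, 3, 4, 3]
score = micro_f1(y_true, y_pred)

# Hypothetical per-fold scores, used only to show the summary statistics.
fold_scores = [0.80, 0.81, 0.83, 0.80, 0.81]
print(f"micro-F1: {score:.2f}")
print(f"mean: {mean(fold_scores):.3f}, std: {stdev(fold_scores):.3f}")
```

When every image has exactly one label, each misclassification counts once as a false positive and once as a false negative, so the micro-averaged F1 score equals the overall accuracy.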
When the selected best model is evaluated with Fold 3 as the test set, the confusion matrix is as follows.
Among the CAM results for the best-performing ResNet152 model, the following two are for 'class 2: Moderate' and 'class 3: Severe', respectively. The regions marked with a red ellipse contain the features that are important when DR is classified into each class.
The features are described in the discussion section of the paper.












