This repository contains the code for the paper: Automated Detection of Benign and Malignant Skin Lesions from Reflectance Confocal Microscopy Images Using Deep Learning
This project implements deep learning models for analyzing Reflectance Confocal Microscopy (RCM) images, specifically:
- Layer Classification: ResNet18-based model for identifying anatomical skin layers (epidermis, DEJ, dermis, poor quality)
- Lesion Classification: ResNet34+GRU model for benign/malignant lesion classification
```
rcm-analysis-paper/
├── README.md                          # Project documentation
├── requirements.txt                   # Python dependencies
├── .gitignore                         # Git ignore file
├── src/                               # Source code
│   ├── data/
│   │   ├── __init__.py
│   │   ├── layer_dataset.py           # Dataset for layer classification
│   │   └── lesion_dataset.py          # Dataset for lesion classification (HDF5)
│   ├── models/
│   │   ├── __init__.py
│   │   ├── layer_classifier.py        # ResNet18 for layer classification
│   │   ├── lesion_classifier.py       # ResNet34+GRU for lesion classification
│   │   ├── bestRes18model_epoch.ckpt  # Best ResNet18 checkpoint (layer identification task)
│   │   └── bestlesion_GRU_model.ckpt  # Best GRU checkpoint (lesion classification task)
│   ├── training/
│   │   ├── __init__.py
│   │   ├── train_layers.py            # Layer pre-training (DDP)
│   │   ├── finetune_layers.py         # Layer fine-tuning (4th gen)
│   │   ├── layer_finetune_lightning.py # Lightning module for layer fine-tuning
│   │   ├── train_lesions.py           # Lesion training (Lightning)
│   │   ├── finetune_lesions.py        # Lesion fine-tuning (K-fold CV)
│   │   ├── lesion_lightning_module.py # Lightning module for lesion training
│   │   └── lesion_finetune_lightning.py # Lightning module for lesion fine-tuning
│   ├── evaluation/
│   │   ├── __init__.py
│   │   ├── evaluate_layers.py         # Layer evaluation with ROC curves
│   │   └── evaluate_lesions.py        # Lesion evaluation with ablation study
│   └── utils/
│       └── __init__.py
├── configs/                           # Configuration files
│   ├── layer_config.yaml              # Layer pre-training config
│   ├── layer_finetune_config.yaml     # Layer fine-tuning config
│   ├── layer_evaluation_config.yaml   # Layer evaluation config
│   ├── lesion_config.yaml             # Lesion pre-training config
│   ├── lesion_finetune_config.yaml    # Lesion fine-tuning config
│   └── lesion_evaluation_config.yaml  # Lesion evaluation config
├── scripts/                           # Shell scripts
│   ├── run_layer_training.sh          # Layer pre-training runner
│   ├── run_layer_finetune.sh          # Layer fine-tuning runner
│   ├── run_layer_evaluation.sh        # Layer evaluation runner
│   ├── run_lesion_training.sh         # Lesion training runner
│   ├── run_lesion_finetune.sh         # Lesion fine-tuning runner
│   └── run_lesion_evaluation.sh       # Lesion evaluation runner
└── notebooks/
    └── RCM_Analysis_Demo.ipynb        # Demonstration notebook
```
- Python 3.8.16
- CUDA-capable GPU(s)
- PyTorch 1.13.1+cu116
- Clone the repository:

```bash
git clone https://github.com/Tofunmi19/rcm-analysis-paper
cd rcm-analysis-paper
```

- Create a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```
- Prepare your data: Ensure your CSV file contains `path` and `labels` columns, where `labels` is a string representation of a 4-element binary vector.
- Update configuration: Edit `configs/layer_config.yaml` with your data paths and training parameters.
- Train the model:

```bash
bash scripts/run_layer_training.sh
```
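The string-encoded `labels` column described above can be parsed into a usable 4-element vector before training; a minimal sketch (column names from this README, sample values invented):

```python
import ast

import pandas as pd

# Toy rows mimicking the expected CSV: a `path` column and a `labels` column
# holding a string-encoded 4-element binary vector (values invented).
df = pd.DataFrame({
    "path": ["images/img_001.png", "images/img_002.png"],
    "labels": ["[1, 0, 0, 0]", "[0, 1, 1, 0]"],
})

# Safely turn each string into a list of ints, e.g. for a multi-label target.
df["label_vec"] = df["labels"].apply(ast.literal_eval)
print(df["label_vec"].tolist())  # [[1, 0, 0, 0], [0, 1, 1, 0]]
```

`ast.literal_eval` is preferable to `eval` here because it only accepts Python literals, so a malformed CSV cell fails loudly instead of executing arbitrary code.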
- Prepare your 4th generation data: Update `configs/layer_finetune_config.yaml` with your 4th generation data paths.
- Fine-tune from the pre-trained model:

```bash
# Fine-tune from pre-trained checkpoint
bash scripts/run_layer_finetune.sh --pretrained_model path/to/pretrained/model.ckpt

# Test run with reduced epochs
bash scripts/run_layer_finetune.sh --pretrained_model path/to/pretrained/model.ckpt --test
```
- Prepare your data: Ensure your HDF5 file contains lesion groups with `images` and `label` datasets.
- Update configuration: Edit `configs/lesion_config.yaml` with your data paths and training parameters.
- Train the model:

```bash
# Regular training
bash scripts/run_lesion_training.sh

# Test run with small dataset
bash scripts/run_lesion_training.sh --test

# Fine-tuning from pre-trained checkpoint
bash scripts/run_lesion_training.sh --fine_tune path/to/checkpoint.ckpt
```
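The HDF5 layout described above (one group per lesion, each with `images` and `label` datasets) could be created and inspected roughly as follows; group/dataset names come from this README, while shapes, label encoding, and values are assumptions for illustration:

```python
import h5py
import numpy as np

# Build a toy file with the assumed layout: one group per lesion, each holding
# an `images` stack and a scalar `label` (0 = benign, 1 = malignant; assumed).
with h5py.File("toy_lesions.h5", "w") as f:
    for i, (n_imgs, label) in enumerate([(5, 0), (11, 1)]):
        g = f.create_group(f"lesion_{i:04d}")
        g.create_dataset("images", data=np.zeros((n_imgs, 224, 224), dtype=np.uint8))
        g.create_dataset("label", data=label)

# Iterate lesions the way a Dataset class might.
with h5py.File("toy_lesions.h5", "r") as f:
    for name, grp in f.items():
        print(name, grp["images"].shape, int(grp["label"][()]))
```

Storing each lesion as its own group keeps variable-length image stacks natural: every `images` dataset can have a different first dimension.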
- `data.csv_file`: Path to the 3rd generation training CSV file
- `data.image_dir`: Directory containing 3rd generation RCM images
- `training.batch_size`: Batch size per GPU
- `training.num_epochs`: Number of training epochs
- `training.learning_rate`: Learning rate
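Given the dotted key names above, `configs/layer_config.yaml` presumably nests them under `data` and `training` sections; a sketch with purely illustrative values:

```yaml
data:
  csv_file: /path/to/gen3_train.csv   # path and filename are placeholders
  image_dir: /path/to/gen3_images
training:
  batch_size: 32
  num_epochs: 100
  learning_rate: 0.0001
```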
-
data.csv_file: Path to 4th generation CSV file
-
data.image_dir: Directory containing 4th generation RCM images
-
data.test_size: Fixed test set size (default: 450)
-
fine_tuning.freeze_backbone: Whether to freeze backbone during fine-tuning
-
data.hdf5_file: Path to HDF5 dataset file
-
training.batch_size: Number of lesions per batch (each lesion contains multiple images)
-
model.hidden_size: GRU hidden dimension
-
sequence.max_length: Maximum sequence length per lesion
- Base: ResNet18
- Modifications:
  - First convolutional layer adapted for grayscale input (1 channel)
  - Custom classifier head for 4-class multi-label classification
- Input: 128×128 grayscale images
- Output: 4-dimensional binary vector `[epidermis, DEJ, dermis, poor_quality]`
- Base: ResNet34 + GRU
- Modifications:
  - First convolutional layer adapted for grayscale input (1 channel)
  - ResNet34 backbone for feature extraction (512-dim features)
  - GRU layer for sequence modeling (256 hidden units)
  - Custom classifier head for binary classification
- Input: Sequences of up to 11 RCM images per lesion (224×224)
- Output: Binary classification (benign/malignant)
- Training Framework: PyTorch Lightning
- Distributed Training: Single- or multi-GPU training with PyTorch Lightning
- Mixed Precision: Automatic Mixed Precision (AMP) for faster training
- Augmentations: Random horizontal flip, rotation, affine transformations
- Loss Function: Cross-entropy loss with optional class weighting
- Optimizer: AdamW with step learning rate decay
- Sequence Handling: Variable-length sequences with padding and packing
- Metrics: Accuracy, precision, recall, AUC
- Checkpointing: Save best models based on multiple metrics
- Visualization: Confusion matrices and ROC curves
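The padding-and-packing step mentioned above is standard PyTorch; a minimal sketch with illustrative lengths and feature dimensions (the repository's exact handling may differ):

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three lesions with different numbers of per-image feature vectors.
seqs = [torch.randn(n, 512) for n in (11, 7, 3)]
lengths = torch.tensor([s.shape[0] for s in seqs])

# Pad to a common length, then pack so the GRU never sees the padding.
padded = pad_sequence(seqs, batch_first=True)            # (3, 11, 512)
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

gru = torch.nn.GRU(512, 256, batch_first=True)
_, h = gru(packed)                                        # h: (1, 3, 256)
```

Packing matters for correctness, not just speed: without it, the final hidden state of a short sequence would be computed from padding steps rather than from its last real image.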
- Minimum: 1 CUDA-capable GPU with 8 GB+ VRAM
- Recommended: Multiple GPUs for distributed training
- RAM: 16 GB+ system RAM recommended
If you use this code in your research, please cite:
MIT License
For questions, please reach out to the authors of the paper.
- PyTorch team and PyTorch Lightning team for their frameworks
- Original ResNet authors
- Google
- Stanford Institute for Human-Centered Artificial Intelligence (HAI) grant