This is the codebase for preprocessing the data and training the unitask and multitask models. The data is hosted at figshare at the following link:
- The `Process_data` folder contains the code for reading the raw input images and linking them with the three metadata files to create the labels for anatomical landmarks and lesions.
- The `Unet2+_basedata` folder contains the code for training unitask and multitask models with the UNet++ architecture.
- The `ESFPNet_basedata` folder contains the code for training unitask and multitask models with the ESFPNet architecture.
Step 1: Use the script `to_json_label_{Anatomical_landmark, Lesions}.py` to generate labels from the datasets for both cancer and non-cancer types.
The input for the script includes the following files:
- `annotation.json`
- `labels.json`
- `objects.json`
The code generates two separate sets of labels for the tasks, categorizes the labels by cancer and non-cancer type, and saves the outputs in the following JSON files:
- `labels_Lung_lesions.json`
- `labels_Anatomical_landmarks.json`
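For orientation, the per-task grouping this step performs can be sketched as follows. The field names (`image_id`, `object_id`) and the dictionary shapes are assumptions for illustration; the real schema of `annotation.json`, `objects.json`, and `labels.json` may differ.

```python
from collections import defaultdict

def build_labels(annotations, objects, task_classes):
    """Map each image id to the labels of one task annotated on it.

    annotations  : list of {"image_id": ..., "object_id": ...} records
    objects      : dict mapping object_id -> class name
    task_classes : set of class names belonging to this task
                   (e.g. lesion classes or anatomical-landmark classes)
    """
    labels = defaultdict(list)
    for ann in annotations:
        cls = objects[ann["object_id"]]
        if cls in task_classes:  # keep only this task's classes
            labels[ann["image_id"]].append(cls)
    return dict(labels)
```

Running this once with the lesion classes and once with the landmark classes would yield the two label files described above.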
Example scripts:
```shell
python to_json_label_Anatomical_landmark.py --data_annots ./data/Lung_cancer/annotation.json --data_objects ./data/Lung_cancer/objects.json --data_labels ./data/Lung_cancer/labels.json --path_save ./data/Lung_cancer/labels_Anatomical_landmarks.json
```

Step 2: Execute the script `combine_json.py` to combine the labels from both cancer and non-cancer cases for the Lesions and Anatomical Landmarks tasks.
The script requires the following input JSON files:
- `labels_Lung_lesions.json` (cancer)
- `labels_Lung_lesions.json` (non-cancer)
- `labels_Anatomical_landmarks.json` (cancer)
- `labels_Anatomical_landmarks.json` (non-cancer)
The code merges the labels for the cancer and non-cancer cases for each task and saves the combined outputs in the following JSON files:
- `labels_Lung_lesions_final.json`
- `labels_Anatomical_landmarks_final.json`
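The merge itself is conceptually simple; a minimal sketch, assuming the label files are `{image_id: labels}` mappings (an assumption, not the confirmed schema):

```python
def combine_labels(cancer_labels, non_cancer_labels):
    """Merge two {image_id: labels} mappings into one.

    Assumes image ids are unique across the cancer and non-cancer
    cohorts; if they were not, non-cancer entries would overwrite
    cancer entries with the same id.
    """
    merged = dict(cancer_labels)
    merged.update(non_cancer_labels)
    return merged
```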
Example scripts:
```shell
python combine_json.py --data_json_labels_cancer ./data/Lung_cancer/labels_Lung_lesions.json --data_json_labels_non_cancer ./data/Non_lung_cancer/labels_Lung_lesions.json --path_save ./data/labels_Lung_lesions_final.json
```

Step 3: Run the script `annots_to_masks.py` to convert the annotations into ground-truth images for both the Lesions and Anatomical Landmarks tasks, covering cancer and non-cancer types.
The script requires the following inputs:
- `annotation.json`
- `labels.json`
- `objects.json`
- Task type (specify either "lesions" or "anatomical landmarks")
Based on the specified task type, the script generates masks (ground truth) for image segmentation for both the cancer and non-cancer cases, and saves the resulting masks as outputs.
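The core of annotation-to-mask conversion is rasterizing each annotated region into a binary image. A dependency-free sketch for polygon annotations is shown below; it assumes the annotations are vertex lists, which the actual script's format may not match exactly:

```python
def polygon_to_mask(polygon, height, width):
    """Rasterize a polygon (list of (x, y) vertices) into a 0/1 mask
    using an even-odd scanline fill sampled at pixel centers."""
    mask = [[0] * width for _ in range(height)]
    n = len(polygon)
    for row in range(height):
        yc = row + 0.5  # scanline through this row's pixel centers
        xs = []
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            # Half-open test avoids double-counting shared vertices
            if (y1 <= yc < y2) or (y2 <= yc < y1):
                t = (yc - y1) / (y2 - y1)
                xs.append(x1 + t * (x2 - x1))
        xs.sort()
        # Fill pixels between each pair of crossings
        for left, right in zip(xs[0::2], xs[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask
```

In practice the repository's script would write such masks out as image files in the folders shown below.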
```
|-- Lung_cancer
|   |-- imgs
|   |   |-- images
|   |-- masks_Lung_lesions
|   |   |-- masks
|   |-- masks_Anatomical_landmarks
|   |   |-- masks
|-- Non_lung_cancer
|   |-- imgs
|   |   |-- images
|   |-- masks_Lung_lesions
|   |   |-- masks
|   |-- masks_Anatomical_landmarks
|   |   |-- masks
```

Example scripts:
```shell
python annots_to_masks.py --data_annots ./data/Lung_cancer/annotation.json --data_objects ./data/Lung_cancer/objects.json --data_labels ./data/Lung_cancer/labels.json --path_save ./data/Lung_cancer/masks_Lung_lesions --type label_Lesions
```

After all steps of the data-processing phase, your data structure looks like this:
```
|-- Lung_cancer
|   |-- imgs
|   |   |-- images
|   |-- masks_Lung_lesions                   <-- After 3rd step
|   |   |-- masks
|   |-- masks_Anatomical_landmarks           <-- After 3rd step
|   |   |-- masks
|   |-- labels_Lung_lesions.json             <-- After 1st step
|   |-- labels_Anatomical_landmarks.json     <-- After 1st step
|   |-- annotations.json
|   |-- objects.json
|   |-- labels.json
|-- Non_lung_cancer
|   |-- imgs
|   |   |-- images
|   |-- masks_Lung_lesions                   <-- After 3rd step
|   |   |-- masks
|   |-- masks_Anatomical_landmarks           <-- After 3rd step
|   |   |-- masks
|   |-- labels_Lung_lesions.json             <-- After 1st step
|   |-- labels_Anatomical_landmarks.json     <-- After 1st step
|   |-- annotations.json
|   |-- objects.json
|   |-- labels.json
|-- labels_Lung_lesions_final.json           <-- After 2nd step
|-- labels_Anatomical_landmarks_final.json   <-- After 2nd step
```

Step 4: Execute the script `split_dataset.py` to split the dataset of images and masks for the Anatomical Landmarks or Lung Lesions task.
The script requires the following input parameters:
- Labels JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing cancer images (`./Lung_cancer/imgs`)
- Folder containing cancer masks (`./Lung_cancer/masks_Lung_lesions` or `./Lung_cancer/masks_Anatomical_landmarks`)
- Folder containing non-cancer images (`./Non_lung_cancer/imgs`)
- Folder containing non-cancer masks (`./Non_lung_cancer/masks_Lung_lesions` or `./Non_lung_cancer/masks_Anatomical_landmarks`)
The dataset is split into training, validation, and test sets. The code organizes the outputs into a "dataset" folder with train, val, and test subfolders; each of these comprises two subdirectories, one for images and one for masks.
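The splitting logic can be sketched roughly as below. The 70/15/15 ratios and the fixed seed are assumptions for illustration; the actual `split_dataset.py` may use different ratios or stratify by class.

```python
import random

def split_dataset(image_ids, val_frac=0.15, test_frac=0.15, seed=42):
    """Deterministically shuffle the ids and cut them into
    train / val / test partitions (assumed ratios, assumed seed)."""
    ids = sorted(image_ids)          # fixed order before shuffling
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "test": ids[:n_test],
        "val": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }
```

Sorting before shuffling with a seeded `random.Random` makes the split reproducible across runs, which matters when images and masks must land in matching partitions.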
Example scripts:
```shell
python split_dataset.py --label_json_path ./data/labels_Lung_lesions_final.json --path_cancer_imgs ./data/Lung_cancer/imgs --path_non_cancer_imgs ./data/Non_lung_cancer/imgs --path_cancer_masks ./data/Lung_cancer/masks_Lung_lesions --path_non_cancer_masks ./data/Non_lung_cancer/masks_Lung_lesions
```

- Download the pretrained MixTransformer from this link: Pretrained Model
- Put the pretrained models under the "Pretrained" folder.
Use the `train_joint_{Anatomical_Landmarks or Lung_lesions}_ESFPNet.py` file to train a joint model of the ESFPNet baseline.
The script requires the following input parameters:
- Label JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing the split dataset (`./dataset/Anatomical_landmarks` or `./dataset/Lung_lesions`)
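A joint (multitask) model is trained against a combined objective: a segmentation loss plus a classification loss. The toy sketch below illustrates that idea with a soft Dice loss and binary cross-entropy; the losses and weights actually used by the training scripts are not stated here and may differ.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flattened mask values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 1 - (2 * inter + eps) / (sum(pred) + sum(target) + eps)

def bce_loss(prob, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability."""
    prob = min(max(prob, eps), 1 - eps)  # clamp for numerical safety
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def joint_loss(seg_pred, seg_target, clf_prob, clf_label, w_seg=1.0, w_clf=1.0):
    """Weighted sum of the segmentation and classification losses,
    as optimized (conceptually) by a joint model."""
    return w_seg * dice_loss(seg_pred, seg_target) + w_clf * bce_loss(clf_prob, clf_label)
```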
Example scripts:
```shell
python train_joint_Anatomical_Landmarks_ESFPNet.py --label_json_path ./data/labels_Anatomical_landmarks_final.json --dataset ./dataset/Anatomical_landmarks
```

Use the `train_segment_ESFPNet.py` file to train a segmentation model of the ESFPNet baseline.
The script requires the following input parameters:
- Label JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing the split dataset (`./dataset/Anatomical_landmarks` or `./dataset/Lung_lesions`)
- Task for the saved model (`Anatomical_landmarks` or `Lung_lesions`)

Example scripts:
```shell
python train_segment_ESFPNet.py --dataset ./dataset/Anatomical_landmarks --task Anatomical_landmarks
```

Use the `train_clf_{Anatomical_Landmarks or Lung_lesions}_ESFPNet.py` file to train a classification model of the ESFPNet baseline.
The script requires the following input parameters:
- Label JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing the split dataset (`./dataset/Anatomical_landmarks` or `./dataset/Lung_lesions`)
Example scripts:
```shell
python train_clf_Anatomical_Landmarks_ESFPNet.py --label_json_path ./data/labels_Anatomical_landmarks_final.json --dataset ./dataset/Anatomical_landmarks
```

Use the `infer_joint_{Anatomical_Landmarks or Lung_lesions}.py` file to perform inference with a joint model of the ESFPNet baseline.
The script requires the following input parameters:
- Label JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing the test-set images (`./dataset/Anatomical_landmarks/test/imgs` or `./dataset/Lung_lesions/test/imgs`)
- Folder containing the test-set masks (`./dataset/Anatomical_landmarks/test/masks` or `./dataset/Lung_lesions/test/masks`)
- Path to the saved model (`./SaveModel/Anatomical_Landmarks_multimodel/Mean_best.pt`, ...)
- Path to save the output images (`./output_dir`)
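Inference scripts like this one are typically scored by comparing predicted masks against the ground truth. A minimal sketch of one common segmentation metric (IoU) is shown below; the metrics the repository's scripts actually report are not stated here.

```python
def iou_score(pred_mask, gt_mask):
    """Intersection-over-union between two flattened binary masks.
    Returns 1.0 when both masks are empty (nothing to segment)."""
    inter = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    union = sum(1 for p, g in zip(pred_mask, gt_mask) if p or g)
    return inter / union if union else 1.0
```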
Example scripts:
```shell
python infer_joint_Anatomical_Landmarks.py --label_json_path ./data/labels_Anatomical_landmarks_final.json --path_imgs_test ./dataset/Anatomical_landmarks/test/imgs --path_masks_test ./dataset/Anatomical_landmarks/test/masks --saved_model ./SaveModel/Anatomical_Landmarks_multimodel/Mean_best.pt --log_dir ./output_dir
```

Use the `infer_segment.py` file to perform inference with a segmentation model of the ESFPNet baseline.
The script requires the following input parameters:
- Folder containing the test-set images (`./dataset/Anatomical_landmarks/test/imgs` or `./dataset/Lung_lesions/test/imgs`)
- Folder containing the test-set masks (`./dataset/Anatomical_landmarks/test/masks` or `./dataset/Lung_lesions/test/masks`)
- Path to the saved model (`./SaveModel/Anatomical_landmarks/Segmentation_model.pt`, ...)
- Path to save the output images (`./output_dir`)
Example scripts:
```shell
python infer_segment.py --path_imgs_test ./dataset/Anatomical_landmarks/test/imgs --path_masks_test ./dataset/Anatomical_landmarks/test/masks --saved_model ./SaveModel/Anatomical_landmarks/Segmentation_model.pt --log_dir ./output_dir
```

Use the `infer_clf_{Anatomical_Landmarks or Lung_lesions}.py` file to perform inference with a classification model of the ESFPNet baseline.
The script requires the following input parameters:
- Label JSON file (`labels_Lung_lesions_final.json` or `labels_Anatomical_landmarks_final.json`)
- Folder containing the test-set images (`./dataset/Anatomical_landmarks/test/imgs` or `./dataset/Lung_lesions/test/imgs`)
- Folder containing the test-set masks (`./dataset/Anatomical_landmarks/test/masks` or `./dataset/Lung_lesions/test/masks`)
- Path to the saved model (`./SaveModel/Anatomical_landmarks/Classification_model.pt`, ...)
Example scripts:
```shell
python infer_clf_Anatomical_Landmarks.py --label_json_path ./data/labels_Anatomical_landmarks_final.json --path_imgs_test ./dataset/Anatomical_landmarks/test/imgs --path_masks_test ./dataset/Anatomical_landmarks/test/masks --saved_model ./SaveModel/Anatomical_landmarks/Classification_model.pt
```

Use the same scripts as for the ESFPNet-based model.