LIDC-IDRI dataset harmonization

The LIDC-IDRI dataset is a large-scale collection of annotated CT images. The original LIDC data is stored as DICOM series, and the corresponding annotations are stored as XML files. The standard PyLIDC library was designed to read the data and extract the images, segmentation masks, and some labeled features. Researchers have used this dataset in different ways across studies. For example, some compute the overlapping region of the masks annotated by different experts, while others use the union region as the segmentation mask. Another difference is the number of manual annotations required for each nodule: some publications include only nodules annotated by at least two experts, while others include all nodules. This simple script aims to extract comprehensive data from LIDC in a harmonized way for further analyses. Specifically, the script returns:

- For each patient ID, a directory containing:
  - Original volume in .nii.gz format
  - Segmentation masks in .nii.gz format for all the labeled nodules (both overlapped and union masks)
  - Images and masks are saved at full image size and in the same spacing system
- A CSV file containing:
  - Name of the saved images
  - Name of the saved segmentation masks (both overlapped and union masks)
  - Number of annotations for each nodule (mask)
  - Malignancy score
  - Calcification score
  - Nodule volume
  - Nodule diameter
  - Lobulation score
  - Spiculation score
  - Sphericity score
  - Subtlety score
  - Nodule surface
  - Texture score
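
To make the difference between the overlapped and union masks concrete, here is a minimal NumPy sketch. It is not taken from this repository: the function name `consensus_masks` and the toy arrays are hypothetical, and the script itself may build its masks differently.

```python
import numpy as np


def consensus_masks(annotation_masks):
    """Return (overlap, union) boolean masks from same-shaped binary masks.

    Hypothetical helper for illustration only; each input mask stands for
    one expert's annotation of the same nodule.
    """
    stack = np.stack([np.asarray(mask, dtype=bool) for mask in annotation_masks])
    overlap = np.logical_and.reduce(stack, axis=0)  # voxels marked by every expert
    union = np.logical_or.reduce(stack, axis=0)     # voxels marked by at least one expert
    return overlap, union


# Toy 2D example with two experts (real masks are 3D volumes).
reader_a = np.array([[0, 1, 1],
                     [0, 1, 0]])
reader_b = np.array([[0, 0, 1],
                     [0, 1, 1]])

overlap, union = consensus_masks([reader_a, reader_b])
print(int(overlap.sum()), int(union.sum()))  # prints "2 4"
```

The overlap mask is always contained in the union mask, so the choice between them changes the apparent nodule size that downstream analyses see.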

Set-up

Clone the git project:

git clone https://github.com/Astarakee/lidc.git

Change the directory to the cloned folder:

cd lidc

Install the required libraries:

pip install -r requirements.txt

Create the PyLIDC configuration file and save it in the home directory as /home/[user]/.pylidcrc. This config file should contain:

[dicom]
path = absolute_path_to_original_LIDC_dir
warn = True
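
If you prefer to generate the configuration file from Python, the following sketch writes the same three lines using the standard-library `configparser`. The helper name `write_pylidc_config` is hypothetical and not part of this repository or of PyLIDC.

```python
import configparser
from pathlib import Path


def write_pylidc_config(dicom_dir, config_path=None):
    """Write a [dicom] section with `path` and `warn` entries.

    Hypothetical helper. By default the file goes to ~/.pylidcrc,
    the location named in the set-up step above.
    """
    config_path = Path(config_path) if config_path else Path.home() / ".pylidcrc"
    # interpolation=None keeps any '%' characters in the path literal.
    config = configparser.ConfigParser(interpolation=None)
    config["dicom"] = {"path": str(dicom_dir), "warn": "True"}
    with open(config_path, "w") as handle:
        config.write(handle)
    return config_path


# Example call (commented out so that importing this file has no side effects):
# write_pylidc_config("absolute_path_to_original_LIDC_dir")
```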

Execute the script:

python main.py -i absolute_path_to_original_LIDC_dir -o absolute_path_to_saving_dir
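
The config file and the `-i` argument use the same placeholder, which suggests they should point at the same directory. Assuming that is the case, a small pre-flight check such as the sketch below can catch a mismatch before a long run. The function `check_pylidc_config` is hypothetical and not part of this repository.

```python
import configparser
from pathlib import Path


def check_pylidc_config(config_path, input_dir):
    """Return a list of problems; an empty list means the check passed.

    Hypothetical helper: compares the [dicom] path in the config file
    with the directory that will be passed to main.py via -i.
    """
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(config_path):
        return [f"config file not found: {config_path}"]
    if not config.has_option("dicom", "path"):
        return ["config file has no [dicom] path entry"]

    problems = []
    configured = Path(config.get("dicom", "path"))
    if not configured.is_absolute():
        problems.append("[dicom] path is not absolute")
    if configured != Path(input_dir):
        problems.append("[dicom] path differs from the -i argument")
    if not configured.is_dir():
        problems.append(f"DICOM directory does not exist: {configured}")
    return problems


# Example call (commented out so that importing this file has no side effects):
# print(check_pylidc_config(Path.home() / ".pylidcrc", "absolute_path_to_original_LIDC_dir"))
```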

TODO

Add requirements.txt
