- Aim:
- preprocessing of medical neuroimages and extraction of brain features for further numerical analyses;
- use of standard toolboxes (SPM12, FSL, MRtrix, AMICO, AFQ) embedded in a user-friendly interface;
- coverage of different image types: structural imaging (T1w), quantitative imaging (MT, PD, ...), and diffusion imaging.
- Quick overview of main functions:
- DICOM to NIFTI image conversion;
- Creation of multiparametric maps (MT, PD, ...);
- Image segmentation;
- Creation of neuromorphometric atlases based on the structural image;
- Projection of quantitative maps on the neuromorphometric atlas;
- DWI: whole brain tractography, connectome creation, automated fiber quantification (AFQ).
- Concept:
At the top level, the pipeline consists of four main parts (Figure 1):
- Configuration file: sets the input/output folders (generic paths under which the subject-specific folders can be found) and the step settings (e.g. the templates to use);
- Protocol file: lists all the valid imaging protocols that the pipeline can process;
- Code that reads the configuration and protocol files and sets up the paths and settings for a particular subject (the Subject ID is entered manually or found automatically);
- Code that runs the analyses.
From the developer's point of view, this separation helps to keep the code clean, to set up the computation for a new subject easily, and to call sub-functions.
Figure 1 Concept of pipeline
- Scheduler: The pipeline is implemented in the ProActive scheduler. Its advantages are tracking of job progress, access to log files, and automatic parallelization of jobs when several subjects are processed.
- Prerequisites:
The code is written in MATLAB R2016b on a Linux system.
The following software should be installed:
- SPM12;
- MRtrix3 (no later than version 3.0_RC3-98-gc44a9940) (for diffusion);
- FSL (no later than version 5.0) (for diffusion);
TODO: provide a Docker image for PIP so that no pre-installation is needed.
- Data structure: The data of each subject are expected to be in a folder named after the subject ID (Subject_ID_folder). The structure of Subject_ID_folder is as follows: ./session_number/protocol_name/repeat_number/images
The list of protocol names is provided in the Protocols file (described later). The Subject_ID_folder of the raw images (the starting point of the pipeline) is expected to be inside a folder named after the scan date (Date_folder).
Example: Path to the raw (DICOM) images on the LREN server for subject ‘PR00906_SB070399’ and protocol ‘al_mepi2d_3mm_dropout’: /mnt/HDD2/LREN/Colab/pipeline-dev/Strorage/DataRemoteServer/IRMMP16/prisma/2014/20150709/PR00906_SB070399/1/al_mepi2d_3mm_dropout/2/
The root ‘/mnt/HDD2/LREN/Colab/pipeline-dev/Strorage/DataRemoteServer/IRMMP16/prisma/2014/’ contains the Date_folders of all subjects. For this particular subject, the Date_folder is ‘20150709’, the Subject_ID_folder is ‘PR00906_SB070399’, the session_number is ‘1’, the protocol_name is ‘al_mepi2d_3mm_dropout’, and the repeat_number is ‘2’.
TODO: remove the dependency on the Date_folder.
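As an illustration of this layout, the example path can be assembled from its components and split back into them. This is a minimal sketch using only built-in MATLAB functions; the variable names are hypothetical and are not taken from the pipeline code.

```matlab
% Illustrative only: variable names are hypothetical, not pipeline code.
raw_root      = '/mnt/HDD2/LREN/Colab/pipeline-dev/Strorage/DataRemoteServer/IRMMP16/prisma/2014';
date_folder   = '20150709';
subject_id    = 'PR00906_SB070399';
session_num   = '1';
protocol_name = 'al_mepi2d_3mm_dropout';
repeat_num    = '2';

% Assemble the folder that holds the raw images of one repeat.
% strjoin with '/' is used so the result looks the same on any OS.
image_dir = strjoin({raw_root, date_folder, subject_id, ...
                     session_num, protocol_name, repeat_num}, '/');

% Split the path again and read the components from the end.
parts  = strsplit(image_dir, '/');
layout = struct('Date_folder',       parts{end-4}, ...
                'Subject_ID_folder', parts{end-3}, ...
                'session_number',    parts{end-2}, ...
                'protocol_name',     parts{end-1}, ...
                'repeat_number',     parts{end});
disp(layout);
```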
- Configuration file:
The aim of the configuration file is to provide all the generic settings to the pipeline, mainly the roots under which the Date_folder or Subject_ID_folder can be found.
The file is a plain txt file and can be edited in any text editor. The code reads the file and converts it into a MATLAB structure variable (Figure 2).
Figure 2 Snapshot of the configuration file in txt format (top) and converted into a MATLAB structure variable (bottom).
Each pipeline step has its own subfield in the configuration structure (Figure 3).
Figure 3 - Snapshot of configuration settings for “DICOM to NIFTI” and “MPMs” calculation.
Each pipeline step has at least three variables (paths) to adapt to your images: InputFolder, TempCalcFolder, OutputFolder.
The reasoning behind these three paths is as follows. First, you may want to keep your input and output images in different folders (InputFolder, OutputFolder). Second, if you run the analysis on a computer other than the one that hosts InputFolder, it is more efficient (and also safer) to create TempCalcFolder on the computer that runs the analysis. The three folders do not have to be different paths.
IMPORTANT: TempCalcFolder is cleared before each analysis unless it coincides with InputFolder, and after each successful analysis unless it coincides with OutputFolder.
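The clearing rule can be written as two simple comparisons. The sketch below only restates the rule; it is not the pipeline's own code. The paths are placeholders, the helper names are hypothetical, and the field name OutputFolder follows the spelling used in cfg.(step).OutputFolder.

```matlab
% Hypothetical settings for one pipeline step (placeholder paths).
separate = struct('InputFolder',    '/data/nifti', ...
                  'TempCalcFolder', '/scratch/tmp_calc', ...
                  'OutputFolder',   '/data/mpm');
% The same folder used for all three roles, which is also allowed.
shared   = struct('InputFolder',    '/data/work', ...
                  'TempCalcFolder', '/data/work', ...
                  'OutputFolder',   '/data/work');

% Rule from the text: clear before the analysis unless TempCalcFolder is
% the InputFolder; clear after success unless it is the OutputFolder.
clear_before = @(c) ~strcmp(c.TempCalcFolder, c.InputFolder);
clear_after  = @(c) ~strcmp(c.TempCalcFolder, c.OutputFolder);

fprintf('separate: before=%d, after=%d\n', clear_before(separate), clear_after(separate));
fprintf('shared:   before=%d, after=%d\n', clear_before(shared),   clear_after(shared));
```

With separate folders the temporary folder is cleared at both points; when it coincides with the input or output folder, the images stored there are left untouched.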
All the provided roots should be generic (not subject-dependent). The subject-specific Date_folder or Subject_ID_folder is found or written automatically under these paths.
Currently: pip_config_file.txt
- Protocols file:
The protocols file is a txt file that contains all the protocol types that can be used in the pipeline (Figure 4).
Figure 4 Snapshot of protocols file.
The file is divided into subsections: Dicom2Nifti, DWI, MPM, MPMOutputs, MPRAGE, ATLASING, fMRI.
IMPORTANT: The following sections provide a list of protocols or names, and the order of the list is not important: Dicom2Nifti, MPMOutputs, MPRAGE, ATLASING. However, in the remaining sections (DWI, MPM, fMRI), the order of the items is important! For example, the items in the DWI subsections [diffusion], [fieldmap], and [ReadoutTimes] must match position by position.
TODO: The current txt file will be replaced by a MATLAB structure with clearer field names that make the protocol files easier to understand and update.
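The position-by-position matching in the DWI section can be checked with a few lines of MATLAB. This is a sketch under assumptions: the struct below is a hypothetical in-memory form of the DWI section, and all entries are placeholders rather than real protocol names or readout times.

```matlab
% Hypothetical in-memory form of the DWI section; entries are placeholders.
dwi = struct();
dwi.diffusion    = {'dwi_protocol_A', 'dwi_protocol_B'};
dwi.fieldmap     = {'fieldmap_for_A', 'fieldmap_for_B'};
dwi.ReadoutTimes = {'readout_for_A',  'readout_for_B'};

% All subsections must have the same number of items ...
n_items = structfun(@numel, dwi);
lists_aligned = all(n_items == n_items(1));

% ... because the k-th items of each subsection belong together.
idx = find(strcmp(dwi.diffusion, 'dwi_protocol_B'));
matched_fieldmap = dwi.fieldmap{idx};
matched_readout  = dwi.ReadoutTimes{idx};

fprintf('aligned=%d, fieldmap=%s, readout=%s\n', ...
        lists_aligned, matched_fieldmap, matched_readout);
```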
Currently: Protocols_definition.txt
- Setting up analysis:
The MATLAB file TemplateCommands.m contains the standard settings for running the pipeline.
- First, set the working directory to the folder that contains the pipeline. Currently:
cd('/mnt/HDD2/LREN/Colab/pipeline-dev/Compute/Code/ProcessingPipeline/Automatic_Computation_LREN/Adeliya/new_code/v4_12.04.2019');
- Next, add the pipeline and SPM12 directories to the MATLAB path. You may use the function pip_add_path.m. Currently:
pip_add_path('/mnt/HDD2/LREN/Colab/pipeline-dev/Compute/Code/spm12',...
'/mnt/HDD2/LREN/Colab/pipeline-dev/Compute/Code/ProcessingPipeline/Automatic_Computation_LREN/Adeliya/new_code/v4_12.04.2019');
- Load the configuration file:
cfg = pip_init_cfg('pip_config_file.txt');
If the folder that contains the configuration file is on the MATLAB path, the file name alone is sufficient (make sure that no other file with the same name is on the path). Otherwise, provide the full path.
- Set the overwrite variable:
owrite = 1; in case you want any existing file to be overwritten, or
owrite = 0; otherwise
- Choose subjects for calculation: Currently, there are two options: provide the list of subjects manually, or automatically choose all subjects found in the input directory.
Manually:
Subj_to_compute = pip_init({'PR03536_LG190399', 'PR03536_LG190400'}, cfg, owrite);
Automatically:
Subj_to_compute = pip_init(cfg, owrite);
IMPORTANT: Subject IDs are searched for in the folder given by cfg.RawImagesFolder. Each Subject_ID_folder is expected to be inside a Date_folder, and cfg.RawImagesFolder must contain only the path up to (not including) the Date_folder! For the automatic choice, the code finds all the IDs under cfg.RawImagesFolder and under cfg.(step).OutputFolder and takes the set difference between them: if the output folder of a subject already exists, the analysis is assumed to have been run. If owrite == 1, all the IDs from cfg.RawImagesFolder are chosen for analysis.
TODO (LREN): introduce a database of LREN subjects that records which subjects were processed, succeeded, failed, or remain to be processed.
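The selection rule can be illustrated with the built-in setdiff. This is a minimal sketch and not the code of pip_init: the two ID lists are typed in by hand here, whereas the pipeline would obtain them by listing the folders, and the variable names are hypothetical.

```matlab
% IDs assumed to have been found under cfg.RawImagesFolder (all Date_folders).
raw_ids  = {'PR00906_SB070399', 'PR03536_LG190399', 'PR03536_LG190400'};
% IDs assumed to already have a folder under cfg.(step).OutputFolder.
done_ids = {'PR00906_SB070399'};

% owrite == 0: only subjects without an output folder are selected.
subj_new = setdiff(raw_ids, done_ids);

% owrite == 1: every subject found in the raw folder is selected.
subj_all = unique(raw_ids);

fprintf('owrite=0 -> %d subject(s), owrite=1 -> %d subject(s)\n', ...
        numel(subj_new), numel(subj_all));
```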
- Running analysis:
Ns = numel(Subj_to_compute); % number of subjects
for i = 1:Ns
    Pipeline_Computation(Subj_to_compute{i}, cfg, 'dcm2nii', owrite);
    Pipeline_Computation(Subj_to_compute{i}, cfg, 'mpm', owrite);
    Pipeline_Computation(Subj_to_compute{i}, cfg, 'atlasing', owrite);
    Pipeline_Computation(Subj_to_compute{i}, cfg, 'atlas_map_projection', owrite);
    Pipeline_Computation(Subj_to_compute{i}, cfg, 'diffusion', owrite);
end
If the MATLAB Parallel Computing Toolbox is available, a “parfor” loop can be used instead of the “for” loop.
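A possible shape of the parallel loop is sketched below, assuming that subjects can be processed independently of each other. To keep the sketch runnable on its own, the hypothetical handle run_step stands in for Pipeline_Computation; in the pipeline the call would be Pipeline_Computation(id, cfg, step, owrite).

```matlab
% Stand-in for Pipeline_Computation (hypothetical, for illustration only).
run_step = @(id, step) sprintf('%s:%s', id, step);

Subj_to_compute = {'PR03536_LG190399', 'PR03536_LG190400'};
steps   = {'dcm2nii', 'mpm', 'atlasing', 'atlas_map_projection', 'diffusion'};
Ns      = numel(Subj_to_compute);   % number of subjects
n_steps = numel(steps);             % number of steps per subject
last_step_done = cell(1, Ns);

% Subjects are distributed over the workers; the steps of one subject
% stay in the same order as in the serial loop.
parfor i = 1:Ns
    id  = Subj_to_compute{i};
    msg = '';
    for s = 1:n_steps
        msg = feval(run_step, id, steps{s});
    end
    last_step_done{i} = msg;
end

disp(last_step_done);
```

Keeping the loop over steps inside the loop over subjects preserves the order of the processing steps for each subject.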





