HALFpipe is user-friendly software that facilitates reproducible
analysis of fMRI data, including preprocessing, single-subject, and
group analysis. It provides state-of-the-art preprocessing using
fmriprep, but removes the
necessity to convert data to the BIDS format. Common
resting-state and task-based fMRI features can then be calculated on the
fly using FSL and nipype for statistics.
Subscribe to our mailing list to stay up to date with new developments and releases.
If you encounter issues, please see the troubleshooting section of this document.
Some sections of this document are marked as outdated. While we are working on updating them, the paper and the analysis manual should be able to answer most questions.
- Getting started
- User interface
- Running on a high-performance computing cluster
- Quality checks
- Outputs
- Troubleshooting
- Command line flags
- Contact
HALFpipe is distributed as a container, meaning that all required
software comes bundled in a monolithic file, the container. This allows
for easy installation on new systems, and makes data analysis more
reproducible, because software versions are guaranteed to be the same
for all users.
The first step is to install one of the supported container platforms. If you’re using a high-performance computing cluster, more often than not Singularity will already be available.
If not, we recommend using the latest version of Singularity. However, it can be somewhat cumbersome to install, as it needs to be built from source.
The NeuroDebian package repository provides an older version of Singularity for some Linux distributions.
If you are running macOS, then you should be able to run the
container with Docker Desktop.
If you are running Windows, you can also try running with Docker
Desktop, but we have not done any compatibility testing yet, so issues
may occur, for example with respect to file systems.
| Container platform | Version | Installation |
|---|---|---|
| Singularity | 3.x | https://sylabs.io/guides/3.8/user-guide/quick_start.html |
| Singularity | 2.x | sudo apt install singularity-container |
| Docker | | See https://docs.docker.com/engine/install/ |
The second step is to download the HALFpipe container to your computer. This
requires approximately 5 gigabytes of storage.
| Container platform | Version | Download |
|---|---|---|
| Singularity | 3.x | https://download.fmri.science/singularity/halfpipe-halfpipe-latest.sif |
| Singularity | 2.x | https://download.fmri.science/singularity/halfpipe-halfpipe-latest.simg |
| Docker | | docker pull halfpipe/halfpipe:latest |
Singularity version 3.x creates a container image file called
HALFpipe_{version}.sif in the directory where you run the pull
command. For Singularity version 2.x the file is named
halfpipe-halfpipe-master-latest.simg. Whenever you want to use the
container, you need to pass Singularity the path to this file.
NOTE: Singularity may store a copy of the container in its cache directory. The cache directory is located by default in your home directory at `~/.singularity`. If you need to save disk space in your home directory, you can safely delete the cache directory after downloading, i.e. by running `rm -rf ~/.singularity`. Alternatively, you could move the cache directory somewhere with more free disk space using a symlink. This way, files will automatically be stored there in the future. For example, if you have a lot of free disk space in `/mnt/storage`, then you could first run `mv ~/.singularity /mnt/storage` to move the cache directory, and then `ln -s /mnt/storage/.singularity ~/.singularity` to create the symlink.
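For reference, the commands mentioned in the note, collected in one place (replace `/mnt/storage` with whatever location has free space on your system):

```bash
# delete the Singularity cache after the download has finished
rm -rf ~/.singularity

# or: move the cache to a larger disk and leave a symlink in its place
mv ~/.singularity /mnt/storage
ln -s /mnt/storage/.singularity ~/.singularity
```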
Docker will store the container in its storage base directory, so it
does not matter from which directory you run the pull command.
The third step is to run the downloaded container. You may need to
replace halfpipe-halfpipe-latest.simg with the actual path and
filename where Singularity downloaded your container.
| Container platform | Command |
|---|---|
| Singularity | singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.simg |
| Docker | docker run --interactive --tty --volume /:/ext halfpipe/halfpipe |
You should now see the user interface.
Containers are by default isolated from the host computer. This adds
security, but also means that the container cannot access the data it
needs for analysis. HALFpipe expects all inputs (e.g., image files
and spreadsheets) and outputs (the working directory) to be placed inside
the path /ext (see also [--fs-root](#data-file-system-root---fs-root)). Using the option --bind
/:/ext, we instruct Singularity to map all of the host file system
(/) to that path (/ext). You can also run HALFpipe and map only
part of the host file system, but keep in mind that any
directories that are not mapped will not be visible later.
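For example, if all of your input data and the working directory live under a single directory, a possible alternative (with `/mnt/study` standing in for whatever path you actually use) would be to bind only that directory, keeping the same path layout inside the container:

```bash
# only /mnt/study from the host will be visible inside the container
singularity run --containall --bind /mnt/study:/ext/mnt/study halfpipe-halfpipe-latest.simg
```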
Singularity passes the host shell environment to the container by
default. This means that in some cases, the host computer’s
configuration can interfere with the software. To avoid this, we need to
pass the option --containall. Docker does not pass the host
shell environment by default, so we don’t need to pass an option.
Outdated
The user interface asks a series of questions about your data and the
analyses you want to run. In each question, you can press Control+C
to cancel the current question and go back to the previous one.
Control+D exits the program without saving. Note that these keyboard
shortcuts are the same on Mac.
To run preprocessing, at least a T1-weighted structural image and a BOLD image file are required. Preprocessing and data analysis proceed automatically. However, to be able to run automatically, data files need to be input in a way suitable for automation.
For this kind of automation, HALFpipe needs to know the
relationships between files, such as which files belong to the same
subject. However, even though it would be obvious for a human, a program
cannot easily assign a file name to a subject, and this will be true as
long as there are differences in naming between different researchers or
labs. One researcher may name the same file subject_01_rest.nii.gz
and another subject_01/scan_rest.nii.gz.
In HALFpipe, we solve this issue by inputting file names in a
specific way. For example, instead of subject_01/scan_rest.nii.gz,
HALFpipe expects you to input {subject}/scan_rest.nii.gz.
HALFpipe can then find all files on disk that match this naming
schema and extract the subject ID subject_01. Using the extracted
subject ID, other files can now be matched to this image. If all input
files are available in BIDS format, then this step can be skipped.
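For example, if your files were stored under a hypothetical layout like `/data/study/subject_01/scan_rest.nii.gz` and `/data/study/subject_02/scan_rest.nii.gz`, you would enter the path with a placeholder for the part that varies between subjects:

```
/data/study/{subject}/scan_rest.nii.gz
```

HALFpipe would then match both files and extract the subject IDs subject_01 and subject_02.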
- Specify working directory — All intermediate files and outputs of HALFpipe will be placed in the working directory. Keep in mind to choose a location with sufficient free disk space, as intermediates can be multiple gigabytes in size for each subject.
- Is the data available in BIDS format?
  - Yes
    - Specify the path of the BIDS directory
  - No
    1. Specify anatomical/structural data
    2. Specify the path of the T1-weighted image files
    3. Specify functional data
    4. Specify the path of the BOLD image files
    5. Check repetition time values / Specify repetition time in seconds
    6. Add more BOLD image files?
       - Yes: Loop back to 2
       - No: Continue
- Do slice timing?
  - Yes: Check slice acquisition direction values, then check slice timing values
  - No: Skip this step
- Specify field maps? (If the data was imported from a BIDS directory, this step will be omitted.)
  - Yes
    1. Specify the type of the field maps
       - EPI (blip-up blip-down)
         - Specify the path of the blip-up blip-down EPI image files
       - Phase difference and magnitude (used by Siemens scanners)
         - Specify the path of the magnitude image files
         - Specify the path of the phase/phase difference image files
         - Specify echo time difference in seconds
       - Scanner-computed field map and magnitude (used by GE / Philips scanners)
         - Specify the path of the magnitude image files
         - Specify the path of the field map image files
    2. Add more field maps? Loop back to 1
    3. Specify effective echo spacing for the functional data in seconds
    4. Specify phase encoding direction for the functional data
  - No: Skip this step
Features are analyses that are carried out on the preprocessed data, in other words, first-level analyses.
- Specify first-level features?
  - Yes: Specify the feature type
    - Task-based
      1. Specify feature name
      2. Specify images to use
      3. Specify the event file type (an example event file is shown after this list)
         - SPM multiple conditions: A MATLAB .mat file containing three arrays: names (condition), onsets and durations
         - FSL 3-column: One text file for each condition. Each file has its corresponding condition in the filename. The first column specifies the event onset, the second the duration. The third column of the files is ignored, so parametric modulation is not supported
         - BIDS TSV: A tab-separated table with named columns trial_type (condition), onset and duration. While BIDS supports defining additional columns, HALFpipe will currently ignore these
      4. Specify the path of the event files
      5. Select conditions to add to the model
      6. Specify contrasts
         1. Specify contrast name
         2. Specify contrast values
         3. Add another contrast?
            - Yes: Loop back to 1
            - No: Continue
      7. Apply a temporal filter to the design matrix? A separate temporal filter can be specified for the design matrix. In contrast, the temporal filtering of the input image and any confound regressors added to the design matrix is specified in step 10 below. In general, the two settings should match
      8. Apply smoothing?
         - Yes: Specify smoothing FWHM in mm
         - No: Continue
      9. Grand mean scaling will be applied with a mean of 10000.000000
      10. Temporal filtering will be applied using a gaussian-weighted filter / Specify the filter width in seconds
      11. Remove confounds?
    - Seed-based connectivity
      1. Specify feature name
      2. Specify images to use
      3. Specify binary seed mask file(s)
         1. Specify the path of the binary seed mask image files
         2. Check space values
         3. Add binary seed mask image file
    - Dual regression
      1. Specify feature name
      2. Specify images to use
      3. TODO
    - Atlas-based connectivity matrix
      1. Specify feature name
      2. Specify images to use
      3. TODO
    - ReHo
      1. Specify feature name
      2. Specify images to use
      3. TODO
    - fALFF
      1. Specify feature name
      2. Specify images to use
      3. TODO
  - No: Skip this step
- Add another first-level feature?
  - Yes: Loop back to 1
  - No: Continue
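As an illustration of the BIDS TSV event file format mentioned above, a minimal, made-up events file could look like this (columns are tab-separated; any columns other than onset, duration and trial_type are currently ignored by HALFpipe):

```
onset	duration	trial_type
10.0	2.0	faces
30.5	2.0	houses
52.0	2.0	faces
```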
- Output a preprocessed image?
  - Yes
    1. Specify setting name
    2. Specify images to use
    3. Apply smoothing?
       - Yes: Specify smoothing FWHM in mm
       - No: Continue
    4. Do grand mean scaling?
       - Yes: Specify grand mean
       - No: Continue
    5. Apply a temporal filter?
       - Yes: Specify the type of temporal filter (Gaussian-weighted or Frequency-based)
       - No: Continue
    6. Remove confounds?
  - No: Continue
Models are statistical analyses that are carried out on the features.
TODO
1. Log in to your cluster’s head node
2. Request an interactive job. Refer to your cluster’s documentation for how to do this
3. In the interactive job, run the HALFpipe user interface, but add the flag `--use-cluster` to the end of the command. For example, `singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --use-cluster`
4. As soon as you finish specifying all your data, features and models in the user interface, HALFpipe will now generate everything needed to run on the cluster. For hundreds of subjects, this can take up to a few hours.
5. When HALFpipe exits, edit the generated submit script `submit.slurm.sh` according to your cluster’s documentation and then run it. This submit script will calculate everything except group statistics.
6. As soon as all processing has been completed, you can run group statistics. This is usually very fast, so you can do this in an interactive session. Run `singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --only-model-chunk` and then select Run without modification in the user interface.
A common issue with remote work via secure shell is that the connection may break after a few hours. For batch jobs this is not an issue, but for interactive jobs this can be quite frustrating. When the connection is lost, the node you were connected to will automatically quit all programs you were running. To prevent this, you can run interactive jobs within `screen` or `tmux` (whichever is available). These commands allow you to open sessions in the terminal that will continue running in the background even when you close or disconnect. Here’s a quick overview of how to use the commands (more in-depth documentation is available for example at http://www.dayid.org/comp/tm.html).
1. Open a new screen/tmux session on the head node by running either `screen` or `tmux`
2. Request an interactive job from within the session, for example with `srun --pty bash -i`
3. Run the command that you want to run
4. Detach from the screen/tmux session, meaning disconnecting with the ability to re-connect later. For screen, this is done by first pressing Control+a, then letting go, and then pressing d on the keyboard. For tmux, it’s Control+b instead of Control+a. Note that this is always Control, even if you’re on a Mac.
5. Close your connection to the head node with Control+d. screen/tmux will remain running in the background
6. Later, connect again to the head node. Run `screen -r` or `tmux attach` to check back on the interactive job. If everything went well and the command you wanted to run finished, close the interactive job with Control+d and then the screen/tmux session with Control+d again. If the command hasn’t finished yet, detach as before and come back later
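Put together, a typical interactive session might look like the following sketch (the srun command is the Slurm example from above; use your cluster's equivalent if it differs):

```bash
screen                    # 1. open a new session on the head node (or: tmux)
srun --pty bash -i        # 2. request an interactive job (Slurm example)
singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --use-cluster
# 4. detach with Control+a, then d (tmux: Control+b, then d)
# 6. later, re-attach with `screen -r` (or `tmux attach`) to check on the job
```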
Are you getting a "missing dependencies" error? Some clusters configure Singularity with an option called `mount hostfs` that will bind all cluster file systems into the container. These file systems may in some cases have paths that conflict with where software is installed in the HALFpipe container, effectively overwriting that software. You can disable this by adding the option `--no-mount hostfs` right after `singularity run`.
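With the full run command from above, this would look something like:

```bash
singularity run --no-mount hostfs --containall --bind /:/ext halfpipe-halfpipe-latest.sif
```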
Please see the manual
Outdated
- A visual report page (`reports/index.html`)
- A table with image quality metrics (`reports/reportvals.txt`)
- A table containing the preprocessing status (`reports/reportpreproc.txt`)
- The untouched fmriprep derivatives (`derivatives/fmriprep`). Some files have been omitted to save disk space. fmriprep is very strict about only processing data that is compliant with the BIDS standard. As such, we may need to format subject names for compliance. For example, an input subject named `subject_01` will appear as `subject01` in the fmriprep derivatives.
- For task-based, seed-based connectivity and dual regression features, HALFpipe outputs the statistical maps for the effect, the variance, the degrees of freedom of the variance and the z-statistic. In FSL, the effect and variance are also called cope and varcope.
  - `derivatives/halfpipe/sub-.../func/..._stat-effect_statmap.nii.gz`
  - `derivatives/halfpipe/sub-.../func/..._stat-variance_statmap.nii.gz`
  - `derivatives/halfpipe/sub-.../func/..._stat-dof_statmap.nii.gz`
  - `derivatives/halfpipe/sub-.../func/..._stat-z_statmap.nii.gz`

  The design and contrast matrix used for the final model will be output alongside the statistical maps.
  - `derivatives/halfpipe/sub-.../func/sub-..._task-..._feature-..._desc-design_matrix.tsv`
  - `derivatives/halfpipe/sub-.../func/sub-..._task-..._feature-..._desc-contrast_matrix.tsv`
- ReHo and fALFF are not calculated based on a linear model. As such, only one statistical map of the z-scaled values will be output.
  - `derivatives/halfpipe/sub-.../func/..._alff.nii.gz`
  - `derivatives/halfpipe/sub-.../func/..._falff.nii.gz`
  - `derivatives/halfpipe/sub-.../func/..._reho.nii.gz`
- For every feature, a `.json` file containing a summary of the preprocessing settings and a list of the raw data files that were used for the analysis (`RawSources`).
  - `derivatives/halfpipe/sub-.../func/....json`
- For every feature, the corresponding brain mask is output beside the statistical maps. Masks do not differ between different features calculated, they are only copied out repeatedly for convenience.
  - `derivatives/halfpipe/sub-.../func/...desc-brain_mask.nii.gz`
- Atlas-based connectivity outputs the time series and the full covariance and correlation matrices as text files.
  - `derivatives/halfpipe/sub-.../func/..._timeseries.txt`
  - `derivatives/halfpipe/sub-.../func/..._desc-covariance_matrix.txt`
  - `derivatives/halfpipe/sub-.../func/..._desc-correlation_matrix.txt`
- Masked, preprocessed BOLD image
  - `derivatives/halfpipe/sub-.../func/..._bold.nii.gz`
- Just like for features, a `.json` file with a summary of the preprocessing settings
  - `derivatives/halfpipe/sub-.../func/..._bold.json`
- Just like for features, the corresponding brain mask
  - `derivatives/halfpipe/sub-.../func/sub-..._task-..._setting-..._desc-brain_mask.nii.gz`
- Filtered confounds time series, where all filters that are applied to the BOLD image are applied to the regressors as well. Note that this means that when grand mean scaling is active, confounds time series are also scaled, meaning that values such as framewise displacement can not be interpreted in terms of their original units anymore.
  - `derivatives/halfpipe/sub-.../func/sub-..._task-..._setting-..._desc-confounds_regressors.tsv`
grouplevel/...
- If an error occurs, this will be output to the command line and simultaneously to the `err.txt` file in the working directory
- If the error occurs while running, usually a text file detailing the error will be placed in the working directory. These are text files and their file names start with `crash`
  - Usually, the last line of these text files contains the error message. Please read this carefully, as it may allow you to understand the error
  - For example, consider the following error message: `ValueError: shape (64, 64, 33) for image 1 not compatible with first image shape (64, 64, 34) with axis == None`. This error message may seem cryptic at first. However, looking at the message more closely, it suggests that two input images have different, incompatible dimensions. In this case, HALFpipe correctly recognized this issue, and there is no need for concern. The images in question will simply be excluded from preprocessing and/or analysis
  - In some cases, the cause of the error can be a bug in the HALFpipe code. Please check whether a similar issue has already been reported on GitHub. If not, please submit an issue.
--verbose

By default, only errors and warnings will be output to the command line.
This makes it easier to see when something goes wrong, because there is
less output. However, if you want to be able to inspect what is being
run, you can add the --verbose flag to the end of the command used
to call HALFpipe.

Verbose logs are always written to the log.txt file in the working
directory, so going back and inspecting this log is always possible,
even if the --verbose flag was not specified.
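For example, with the Singularity command from the getting-started section:

```bash
singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --verbose
```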
Specifying the flag --debug will print additional, fine-grained
messages. It will also automatically start the Python Debugger when an error occurs.
You should only use --debug if you know what you’re doing.
--keep

HALFpipe saves intermediate files for each pipeline step. This
speeds up re-running with different settings, or resuming a job
after it was cancelled. The intermediate files are saved by the nipype workflow engine, which is what
HALFpipe uses internally. nipype saves the intermediate files in
the nipype folder in the working directory.
In environments with limited disk capacity, this can be problematic. To
limit disk usage, HALFpipe can delete intermediate files as soon as
they are not needed anymore. This behavior is controlled with the
--keep flag.
The default option --keep some keeps all intermediate files from
fMRIPrep and MELODIC, which would take the longest to re-run. We believe
this is a good tradeoff between disk space and computer time. --keep
all turns off all deletion of intermediate files. --keep none
deletes as much as possible, meaning that the smallest possible amount
of disk space will be used.
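For example, to use as little disk space as possible:

```bash
singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --keep none
```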
--nipype-<omp-nthreads|memory-gb|n-procs|run-plugin>

HALFpipe chooses sensible defaults for all of these values.
Outdated
--<only|skip>-<spec-ui|workflow|run|model-chunk>

A HALFpipe run is divided internally into three stages: spec-ui,
workflow, and run.
- The `spec-ui` stage is where you specify things in the user interface. It creates the `spec.json` file that contains all the information needed to run HALFpipe. To only run this stage, use the option `--only-spec-ui`. To skip this stage, use the option `--skip-spec-ui`
- The `workflow` stage is where HALFpipe uses the `spec.json` data to search for all the files that match what was input in the user interface. It then generates a nipype workflow for preprocessing, feature extraction and group models. nipype then validates the workflow and prepares it for execution. This usually takes a couple of minutes and cannot be parallelized. For hundreds of subjects, this may even take a few hours. This stage has the corresponding options `--only-workflow` and `--skip-workflow`.
  - This stage saves several intermediate files. These are named `workflow.{uuid}.pickle.xz`, `execgraph.{uuid}.pickle.xz` and `execgraph.{n_chunks}_chunks.{uuid}.pickle.xz`. The `uuid` in the file name is a unique identifier generated from the `spec.json` file and the input files. It is re-calculated every time we run this stage. The uuid algorithm produces a different output if there are any changes (such as when new input files for new subjects become available, or the `spec.json` is changed, for example to add a new feature or group model). Otherwise, the `uuid` stays the same. Therefore, if a workflow file with the calculated `uuid` already exists, then we do not need to run this stage. We can simply re-use the workflow from the existing file, and save some time.
  - In this stage, we can also decide to split the execution into chunks. The flag `--subject-chunks` creates one chunk per subject. The flag `--use-cluster` automatically activates `--subject-chunks`. The flag `--n-chunks` allows the user to specify a specific number of chunks. This is useful if the execution should be spread over a set number of computers. In addition to these, a model chunk is generated.
- The `run` stage loads the `execgraph.{n_chunks}_chunks.{uuid}.pickle.xz` file generated in the previous step and runs it. This file usually contains two chunks, one for the subject-level preprocessing and feature extraction (“subject level chunk”), and one for group statistics (“model chunk”). To run a specific chunk, you can use the flags `--only-chunk-index ...` and `--only-model-chunk`.
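For example, the following two calls (using the Singularity command from the getting-started section) first split subject-level processing into one chunk per subject, and later run only the group-statistics chunk:

```bash
# split subject-level processing into one chunk per subject
singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --subject-chunks

# afterwards, run only the group statistics ("model") chunk
singularity run --containall --bind /:/ext halfpipe-halfpipe-latest.sif --only-model-chunk
```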
--workdir

TODO
--fs-root

The HALFpipe container, or really most containers, contains the entire base system needed to run the bundled software, and this container file system is separate from the host file system. As described in the getting-started section, HALFpipe therefore expects host data to appear under the path /ext inside the container; the --fs-root option sets this data file system root (by default /ext, matching the --bind /:/ext option shown above).
For questions or support, please submit an issue or contact us via e-mail at enigma@charite.de.