Spatial self-supervised Peak Learning and correlation-based Evaluation of peak picking in Mass Spectrometry Imaging
S3PL is a spatial self-supervised peak learning autoencoder network which performs spatially structured peak picking on profile mass spectrometry imaging (MSI) data. This repository contains the source code for the corresponding paper:
Philipp Weigand, Nikolas Ebert, Shad A. Mohammed, Denis Abu Sammour, Carsten Hopf & Oliver Wasenmüller
CeMOS - Research and Transfer Center, University of Applied Sciences Mannheim
We provide two ways for testing our code with minimal effort:
- Google Colab demo for the Colorectal Adenocarcinoma tissue sections.
- Docker image on Docker Hub, where the environment and the Colorectal Adenocarcinoma dataset are already set up. To execute the script, install Docker and run one of the following commands, depending on your operating system:

    # Linux / macOS
    # CPU
    docker run --rm -v $(pwd)/results:/workspace/results philippweigand/s3pl:latest
    # GPU
    docker run --gpus all --rm -v $(pwd)/results:/workspace/results philippweigand/s3pl:latest

    # Windows PowerShell
    # CPU
    docker run --rm -v ${PWD}/results:/workspace/results philippweigand/s3pl:latest
    # GPU
    docker run --gpus all --rm -v ${PWD}/results:/workspace/results philippweigand/s3pl:latest

    # Windows cmd.exe
    # CPU
    docker run --rm -v %cd%/results:/workspace/results philippweigand/s3pl:latest
    # GPU
    docker run --gpus all --rm -v %cd%/results:/workspace/results philippweigand/s3pl:latest
For reproducibility and a clean setup, we recommend the Docker setup described below. Alternatively, set up a local environment.
- Clone our repository using git:

    git clone https://github.com/CeMOS-IS/S3PL.git
    cd S3PL/

- Build the Docker image by running the following command in a terminal in the project folder:

    docker build -t s3pl .

- Run the Docker container:

    docker run -it --rm --gpus '"device=0"' -v .:/workspace s3pl

  or, if you are not in the S3PL directory:

    docker run -it --rm -v /path/to/S3PL:/workspace s3pl

Depending on your GPU, you might have to change the base image in the first line of the Dockerfile.
- Clone our repository using git:

    git clone https://github.com/CeMOS-IS/S3PL.git

- Set up a virtual environment, for example with Anaconda:

    conda create --name s3pl python=3.11.5
    conda activate s3pl

- Install the required Python packages:

    pip install -r requirements.txt
On first use, create the data/ folder within the s3pl/ folder. For each dataset, create a new folder inside data/. As example data, you can use the selected tissue sections from the CAC dataset, which we also use in our paper. It is available here
Each dataset folder should contain the .imzML files and the corresponding .ibd files. If segmentation masks are available, create a masks/ folder inside the dataset folder and place the masks there, named after the corresponding file (filename_mask.npy).
The folder structure should be as follows:
s3pl
├── data
│   ├── dataset1
│   │   ├── masks
│   │   │   ├── file1_mask.npy
│   │   │   ├── file2_mask.npy
│   │   │   └── ...
│   │   ├── file1.imzML
│   │   ├── file1.ibd
│   │   ├── file2.imzML
│   │   ├── file2.ibd
│   │   └── ...
│   ├── cac
│   │   ├── masks
│   │   │   ├── 40TopL_mask.npy
│   │   │   └── ...
│   │   ├── 40TopL.imzML
│   │   ├── 40TopL.ibd
│   │   └── ...
│   └── ...
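The layout above can be checked before training with a short script. This is a minimal sketch that assumes only the conventions stated here (each .imzML file has a matching .ibd file, and masks are named filename_mask.npy inside an optional masks/ folder). The helper name `check_dataset` is hypothetical and not part of the S3PL code base.

```python
from pathlib import Path


def check_dataset(dataset_dir):
    """Return a list of problems found in one dataset folder (hypothetical helper)."""
    dataset_dir = Path(dataset_dir)
    problems = []
    imzml_files = sorted(dataset_dir.glob("*.imzML"))
    if not imzml_files:
        problems.append(f"no .imzML files in {dataset_dir}")
    masks_dir = dataset_dir / "masks"
    for imzml in imzml_files:
        # Every .imzML file needs its binary .ibd companion next to it.
        if not imzml.with_suffix(".ibd").exists():
            problems.append(f"missing {imzml.stem}.ibd")
        # Masks are optional, so they are only checked when masks/ exists.
        if masks_dir.is_dir() and not (masks_dir / f"{imzml.stem}_mask.npy").exists():
            problems.append(f"missing masks/{imzml.stem}_mask.npy")
    return problems
```

For example, `check_dataset("s3pl/data/cac")` returns an empty list when every file in the cac folder has its companion files.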
To use S3PL for peak picking, specify the .imzML file as the data_dir in the s3pl/config.json file.
Then, run the s3pl/main.py file:
python s3pl/main.py \
--data_dir "/data/cac/40TopL.imzML" \
--number_classes 3 \
--eval_picking True \
--peaks_per_spectral_patch 256 \
--spectral_patch_size 9 \
--kernel_depth_d1 51 \
--kernel_depth_d2 1 \
--n_epochs 10 \
--batch_size 16 \
--learning_rate 1e-2 \
--dropout 0 \
--random_seed 1
The resulting peak list will be stored in a .csv file in the folder
results/<name_of_the_training>/peak_evaluation_<filename>.txt
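To run the same call for several files or seeds, the command can be assembled programmatically. The following is a rough sketch, not part of the repository: the flag names and default values are copied from the example call above, while `build_command` and `DEFAULT_OPTIONS` are hypothetical names introduced for illustration.

```python
import shlex
import sys

# Flag names and values copied from the example call in this README.
DEFAULT_OPTIONS = {
    "number_classes": "3",
    "eval_picking": "True",
    "peaks_per_spectral_patch": "256",
    "spectral_patch_size": "9",
    "kernel_depth_d1": "51",
    "kernel_depth_d2": "1",
    "n_epochs": "10",
    "batch_size": "16",
    "learning_rate": "1e-2",
    "dropout": "0",
    "random_seed": "1",
}


def build_command(data_dir, **overrides):
    """Return the argument list for one s3pl/main.py run (hypothetical helper)."""
    options = {**DEFAULT_OPTIONS, **{k: str(v) for k, v in overrides.items()}}
    command = [sys.executable, "s3pl/main.py", "--data_dir", str(data_dir)]
    for name, value in options.items():
        command += [f"--{name}", value]
    return command


# Print the command for one file with a different seed.
# Pass the list to subprocess.run(...) instead to actually start the run.
print(shlex.join(build_command("/data/cac/40TopL.imzML", random_seed=2)))
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.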
Available configurations in config.json:
- data_dir: relative path to the .imzML file
- number_classes: the number of classes in the segmentation mask (e.g. 3 classes: background, tumor, healthy tissue)
- evaluate_peak_picking: whether peak picking should be evaluated using our correlation-based procedure (segmentation masks are required for this)
Parameters in config.json:
- number_peaks ($n$): the total number of peaks to be picked
- peaks_per_spectral_patch ($z$): the number of peaks per spectral patch
- spectral_patch_size ($p$): size of the spectral patch, $p = h = w$
- kernel_depth_d1 ($d_1$): depth $d_1$ of the 3D convolution
- kernel_depth_d2 ($d_2$): depth $d_2$ of the 3D transposed convolution
The total number of peaks is set with the number_peaks parameter in the s3pl/config.json file. To use different numbers of peaks, specify them in the s3pl/data_configs.py file and assign null to the number_peaks parameter. The same applies to the number of classes and the number_classes parameter.
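Assigning null can be done by hand or with a few lines of Python. The sketch below assumes that s3pl/config.json is a flat JSON object whose keys match the parameter names listed above; that layout is an assumption, and `update_config` is a hypothetical helper rather than part of the repository. Python's `None` is written out as JSON `null`.

```python
import json
from pathlib import Path


def update_config(config_path, **changes):
    """Load a JSON config, apply key/value changes, and write it back."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    config.update(changes)  # None values are serialized as JSON null
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

A call such as `update_config("s3pl/config.json", number_peaks=None, number_classes=None)` would then set both parameters to null while leaving all other entries unchanged.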
Email: p.weigand@doktoranden.th-mannheim.de
If you use this code in your research, please cite our paper:
@article{,
title={},
author={},
journal={},
volume={},
number={},
pages={},
year={},
publisher={}
}
