PiDeeL: metabolic pathway-informed deep learning model for survival analysis and pathological classification of gliomas
Official implementation of PiDeeL (Pathway-Informed Deep Learning Model), a deep learning framework that incorporates metabolic pathway information for improved survival prediction in glioma patients using HRMAS NMR spectroscopy data.
Published in Bioinformatics, Volume 39, Issue 11, November 2023
PiDeeL integrates metabolic pathway knowledge into the neural network architecture to predict patient survival outcomes. By constraining the network's weights according to known metabolite-pathway relationships, PiDeeL aims for better interpretability and predictive performance than an unconstrained network of the same depth (the DeepSurv baseline trained by the no_pathway scripts in this repository).
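To make the idea of pathway-constrained weights concrete, here is a minimal, framework-free sketch of a single masked layer. It is not the code in `model.py`: the function name `masked_linear`, the toy membership mask, and all numbers are assumptions chosen for illustration. Each output node stands for one pathway and only receives input from metabolites that the mask assigns to it.

```python
from typing import List

Matrix = List[List[float]]


def masked_linear(x: List[float], weights: Matrix, mask: Matrix) -> List[float]:
    """One pathway-constrained layer: out[p] = sum_m x[m] * W[p][m] * mask[p][m]."""
    outputs = []
    for w_row, m_row in zip(weights, mask):
        # Weights whose mask entry is 0 never contribute to this pathway node.
        outputs.append(sum(xi * w * m for xi, w, m in zip(x, w_row, m_row)))
    return outputs


# Hypothetical toy membership: 2 pathways (rows) x 3 metabolites (columns).
mask = [
    [1.0, 1.0, 0.0],  # pathway A contains metabolites 0 and 1
    [0.0, 1.0, 1.0],  # pathway B contains metabolites 1 and 2
]
weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.25, -0.5],
]
x = [2.0, 4.0, 6.0]  # made-up metabolite concentrations

pathway_activations = masked_linear(x, weights, mask)
print(pathway_activations)  # [-3.0, -2.0]
```

In a PyTorch model the same effect is usually obtained by multiplying the weight matrix elementwise with a fixed binary mask before the matrix product, so masked connections stay at zero throughout training.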
- Clone the repo:
    git clone https://github.com/ciceklab/PiDeeL/
    cd PiDeeL
- Create conda environment:
    conda env create --name PiDeeL --file PiDeeL.yml
    conda activate PiDeeL
- PyTorch 2.0+
- pycox (for survival analysis)
- scikit-survival
- scikit-learn
- pandas, numpy, matplotlib
- shap
PiDeeL/
├── README.md
├── PiDeeL.yml                  # Conda environment file
├── system_fig.png
│
├── run/                        # Inference with pretrained models
│   ├── predict.py              # Main prediction script
│   ├── model.py                # Model architecture
│   ├── model_utils.py          # Utility functions
│   ├── PiDeeL_2layer.pth       # Pretrained 2-layer model
│   ├── PiDeeL_3layer.pth       # Pretrained 3-layer model
│   ├── PiDeeL_4layer.pth       # Pretrained 4-layer model
│   └── sample_quant.pickle     # Sample input data
│
└── reproduction/               # Full experiment reproduction
    ├── config.py               # Central configuration
    ├── scripts/                # Training scripts
    │   ├── load_targeted_data.py
    │   ├── model_utils.py
    │   ├── 1layer/, 2layer/, 3layer/, 4layer/
    │   └── baseline/           # Cox-PH, CWGB, RF
    ├── figures/                # Paper figure generation
    │   ├── run_fig2.py, run_fig4.py, ...
    │   └── run_all_figures.py
    ├── models/                 # Saved model weights
    ├── logs/                   # C-Index results
    ├── plots/                  # Training loss plots
    └── pideel_data/            # Data directory
Use pretrained models to predict survival risk scores for new samples.
- Navigate to the run directory:
    cd run/
- Prepare your input data:
    - Use the provided sample_quant.pickle as a template
    - Or use the automated metabolite quantification pipeline from Cakmakci et al. to quantify your HRMAS NMR spectroscopy data
- Run prediction:
    python predict.py --layer 4 --dev gpu
  Options:
    - --layer: Select model architecture (2, 3, or 4)
    - --dev: Device to use (gpu or cpu)
- Output: Risk scores printed to terminal
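Before preparing your own input, it can help to look at what the template file contains. The layout of `sample_quant.pickle` is not documented here, so the sketch below makes no assumption about it: the helper `describe_pickle` is a hypothetical name, and it only reports the type and size of whatever object the file holds.

```python
import pickle
from pathlib import Path
from typing import Any


def describe_pickle(path: str) -> str:
    """Load a pickle file and return a one-line summary of the top-level object."""
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with Path(path).open("rb") as handle:
        obj: Any = pickle.load(handle)

    summary = type(obj).__name__
    if hasattr(obj, "shape"):        # e.g. NumPy arrays, pandas DataFrames
        summary += f", shape={obj.shape}"
    elif hasattr(obj, "__len__"):    # e.g. lists, dicts, tuples
        summary += f", len={len(obj)}"
    if isinstance(obj, dict):
        summary += f", keys={list(obj)[:5]}"  # show at most five keys
    return summary


if __name__ == "__main__":
    sample = Path("sample_quant.pickle")  # run from inside run/
    if sample.exists():
        print(describe_pickle(str(sample)))
```

If the file holds NumPy or pandas objects, run this inside the PiDeeL conda environment so those libraries are available when unpickling.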
The easiest way to reproduce the main comparison figure is using the Jupyter notebook at the repo root:
jupyter notebook reproduce_main_comparison.ipynb
By default, the notebook uses pretrained model logs included in the repository to generate Figure 2 immediately.
To retrain models from scratch, set the RETRAIN flag in the notebook:
RETRAIN = True # Set to True to retrain all models instead of using pretrained logs
The preprocessed samples are already included in the repository under reproduction/pideel_data/targeted/. No additional download is required to reproduce the results.
If you want access to the raw HRMAS NMR spectroscopy data, you can download it from Zenodo:
https://zenodo.org/record/7228791
Navigate to the reproduction directory:
cd reproduction/
PiDeeL and DeepSurv models:
# 2-layer
python scripts/2layer/no_pathway/main.py # DeepSurv
python scripts/2layer/pathway/main.py # PiDeeL
# 3-layer
python scripts/3layer/no_pathway/main.py
python scripts/3layer/pathway/main.py
# 4-layer
python scripts/4layer/no_pathway/main.py
python scripts/4layer/pathway/main.py
Baseline models:
python scripts/baseline/coxph/main.py # Cox Proportional Hazards
python scripts/baseline/cwgb/main.py # Component-wise Gradient Boosting
python scripts/baseline/rf/main.py # Random Survival Forest
cd reproduction/figures/
# Generate all figures
python run_all_figures.py
# Or generate individual figures
python run_fig2.py # Main comparison (Fig. 2)
python run_fig4.py # Random connections ablation
python run_fig5.py # Dropout analysis
python run_fig6.py # External validation
- C-Index results: reproduction/logs/*/c_indices.txt
- Generated figures: reproduction/figures/
- Training plots: reproduction/plots/
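The C-Index values written to the logs measure how well predicted risk scores order patients by survival time. As a reading aid, the sketch below computes Harrell's concordance index on made-up data in plain Python. It is a simplified illustration (pairs with tied times are ignored), not the evaluation code used by the training scripts, and the function name and toy values are assumptions.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's event or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk failed first: correct order
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: survival times, event flags (1 = death observed,
# 0 = censored), and predicted risk scores for four patients.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risks)
print(c_index)  # 0.8 (4 of 5 comparable pairs are ordered correctly)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. For real evaluation, a library routine such as `concordance_index_censored` from scikit-survival is the safer choice, since it also handles tied times.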
The code automatically detects GPU availability:
- GPU available: Uses CUDA for training (recommended)
- CPU only: Falls back to CPU (slower but functional)
No code changes needed - device selection is automatic via config.py.
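The contents of `config.py` are not reproduced here, so the following is only a sketch of the common PyTorch idiom for this kind of automatic fallback. The function name `pick_device` and its `prefer_gpu` parameter are hypothetical and may differ from what the repository actually does.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The returned string can be passed to `torch.device(...)` or to a tensor's or module's `.to(...)` method.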
If you use this code in your research, please cite:
@article{kaynar2023pideel,
title={PiDeeL: metabolic pathway-informed deep learning model for survival analysis and pathological classification of gliomas},
author={Kaynar, Gun and Cakmakci, Doruk and Bund, Caroline and Todeschi, Julien and Namer, Izzie Jacques and Cicek, A Ercument},
journal={Bioinformatics},
volume={39},
number={11},
pages={btad684},
year={2023},
publisher={Oxford University Press}
}
Distributed under the MIT License.
Gun Kaynar - gunkaynar.com
A. Ercument Cicek - http://ciceklab.cs.bilkent.edu.tr/ercument
This work was supported by grants from BPI France (ExtempoRMN Project), Hôpitaux Universitaires de Strasbourg, Bruker BioSpin, Univ. de Strasbourg and the Centre National de la Recherche Scientifique; also by TUBA GEBIP, Bilim Akademisi BAGEP and TUSEB Research Incentive awards to AEC.
