This repository contains the model and training code for Altimeter, a transformer model for peptide spectrum prediction.
Altimeter models are hosted on Koina and can be queried via its REST API:
- Altimeter_2024_splines_index
- Altimeter_2024_splines
- Altimeter_2024_intensities
- Altimeter_2024_isotopes
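Koina exposes models through the KServe/Triton open inference protocol, so a prediction can be requested with a plain JSON POST. The sketch below builds such a request body; the input tensor names (`peptide_sequences`, `precursor_charges`, `collision_energies`) follow other Koina intensity models and are assumptions here, so check the Altimeter model cards on Koina for the exact signature.

```python
import json

# Standard Koina endpoint pattern (model name is substituted in).
KOINA_URL = "https://koina.wilhelmlab.org/v2/models/{model}/infer"

def build_infer_payload(peptides, charges, nces):
    """Build a KServe/Triton-style inference request body.

    The input tensor names below are assumptions based on other Koina
    intensity models; verify them against the Altimeter model card.
    """
    n = len(peptides)
    return {
        "id": "0",
        "inputs": [
            {"name": "peptide_sequences", "shape": [n, 1],
             "datatype": "BYTES", "data": peptides},
            {"name": "precursor_charges", "shape": [n, 1],
             "datatype": "INT32", "data": charges},
            {"name": "collision_energies", "shape": [n, 1],
             "datatype": "FP32", "data": nces},
        ],
    }

payload = build_infer_payload(["PEPTIDEK"], [2], [30.0])
body = json.dumps(payload)
# To actually send the request (requires the `requests` package):
#   import requests
#   r = requests.post(KOINA_URL.format(model="Altimeter_2024_intensities"), data=body)
#   print(r.json()["outputs"])
```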
- Data: trained on a reprocessed ProteomeTools dataset covering tryptic and non-tryptic peptides (trypsin, LysC, AspN, HLA ligands) acquired on an Orbitrap Fusion Lumos with NCEs from 20–40; only methionine oxidation and cysteine carbamidomethylation were considered.
- Model: predicts fragment ion intensities across NCEs using cubic B-splines; an isotopes variant re-creates full fragment isotope patterns based on isolation efficiencies; validated for HCD on Orbitrap instruments only.
- Performance: achieves a median normalized spectral angle of 0.941 on a held-out test set and performs consistently across proteases, with slight degradation for long peptides or extreme charge states.
- Input: peptides 6–40 amino acids long (6–30 recommended), charge 1–7 (1–4 recommended), NCE between 20–40, with variable methionine oxidation and static carbamidomethylated cysteines.
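It can be worth validating inputs against these limits before requesting predictions. A minimal sketch, assuming UNIMOD bracket notation for the two supported modifications (the notation Altimeter actually expects should be checked against its Koina model card):

```python
import re

# Allowed residues: the 20 standard amino acids, plus oxidized methionine
# and carbamidomethylated cysteine written in UNIMOD bracket notation.
# This notation is an assumption; confirm it on the Koina model card.
VALID_PEPTIDE = re.compile(
    r"^(?:[ACDEFGHIKLMNPQRSTVWY]|M\[UNIMOD:35\]|C\[UNIMOD:4\])+$"
)

def check_input(peptide, charge, nce):
    """Return a list of problems for one input row (empty list = OK)."""
    problems = []
    if not VALID_PEPTIDE.match(peptide):
        problems.append("unsupported residue or modification")
    # Count residue positions, treating a modified residue as one position.
    length = len(re.findall(r"[A-Z](?:\[UNIMOD:\d+\])?", peptide))
    if not 6 <= length <= 40:
        problems.append(f"length {length} outside 6-40")
    if not 1 <= charge <= 7:
        problems.append(f"charge {charge} outside 1-7")
    if not 20 <= nce <= 40:
        problems.append(f"NCE {nce} outside 20-40")
    return problems

print(check_input("PEPTIDEK", 2, 30.0))                   # []
print(check_input("M[UNIMOD:35]PEPC[UNIMOD:4]K", 8, 45))  # charge and NCE problems
```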
The training data is available from Zenodo: https://zenodo.org/records/15875054
Download and unpack the archive into a working directory, e.g. ~/altimeter_data:
```bash
wget "https://zenodo.org/records/15875054/files/Altimeter_training_data.tar.gz?download=1" \
  -O Altimeter_training_data.tar.gz
mkdir -p ~/altimeter_data
tar -xzf Altimeter_training_data.tar.gz -C ~/altimeter_data
```

After extraction, update config/data.yaml so the paths point to your dataset location:
```yaml
base_path: /path/to/altimeter_data/
ion_dictionary_path: /path/to/altimeter_data/saved_model/ion_dictionary.txt
dataset_path: datasets/
position_path: txt_pos/
label_path: labels/
saved_model_path: saved_model/
```

To train inside Docker, first pull the training image:

```bash
docker pull dennisgoldfarb/pytorch_ris:lightning
```
Then launch training with the repository mounted into the container:

```bash
docker run --gpus all -v $PWD:/workspace/Altimeter \
  -v /path/to/altimeter_data:/data \
  -w /workspace/Altimeter/altimeter \
  dennisgoldfarb/pytorch_ris:lightning \
  python3 train.py ../config/data.yaml
```

To train without Docker, create a Python environment with PyTorch, PyTorch Lightning, and the repository's dependencies, then run:
```bash
python train.py ../config/data.yaml
```

The repository provides export.py to serialize the trained model for serving. Run the script inside the Docker image used for training:
```bash
docker run --gpus all -v $PWD:/workspace/Altimeter \
  -v /path/to/altimeter_data:/data \
  -w /workspace/Altimeter/altimeter \
  dennisgoldfarb/pytorch_ris:lightning \
  python3 export.py \
    /data/model.ts \
    /data/model.onnx \
    --dic-config ../config/data.yaml \
    --model-config /data/saved_model/model_config.yaml \
    --model-ckpt /data/saved_model/checkpoint.ckpt  # replace with your checkpoint
```

The first argument is the output path for the TorchScript model and the second specifies the ONNX file for the spline model. Adjust the paths to match your environment.