This repository contains the model and training code for Altimeter, a transformer model for peptide spectrum prediction.
Altimeter models are hosted on Koina and can be queried via its REST API:
- Altimeter_2024_splines_index
- Altimeter_2024_splines
- Altimeter_2024_intensities
- Altimeter_2024_isotopes
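Koina exposes models through the KServe/Triton open inference protocol, so a prediction can be requested with a plain JSON POST. The sketch below builds such a request body; the input tensor names (`peptide_sequences`, `precursor_charges`, `collision_energies`) follow other Koina intensity models and are assumptions here, so check the Altimeter model cards on Koina for the exact signature.

```python
import json

# Standard Koina endpoint pattern (model name is substituted in).
KOINA_URL = "https://koina.wilhelmlab.org/v2/models/{model}/infer"

def build_infer_payload(peptides, charges, nces):
    """Build a KServe/Triton-style inference request body.

    The input tensor names below are assumptions based on other Koina
    intensity models; verify them against the Altimeter model card.
    """
    n = len(peptides)
    return {
        "id": "0",
        "inputs": [
            {"name": "peptide_sequences", "shape": [n, 1],
             "datatype": "BYTES", "data": peptides},
            {"name": "precursor_charges", "shape": [n, 1],
             "datatype": "INT32", "data": charges},
            {"name": "collision_energies", "shape": [n, 1],
             "datatype": "FP32", "data": nces},
        ],
    }

payload = build_infer_payload(["PEPTIDEK"], [2], [30.0])
body = json.dumps(payload)
# To actually send the request (requires the `requests` package):
#   import requests
#   r = requests.post(KOINA_URL.format(model="Altimeter_2024_intensities"), data=body)
#   print(r.json()["outputs"])
```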
- Data: trained on a reprocessed ProteomeTools dataset covering tryptic and non-tryptic peptides (trypsin, LysC, AspN, HLA ligands) acquired on an Orbitrap Fusion Lumos with NCEs from 20–40; only methionine oxidation and cysteine carbamidomethylation were considered.
- Model: predicts fragment ion intensities across NCEs using cubic B-splines; an isotopes variant re-creates full fragment isotope patterns based on isolation efficiencies; validated for HCD on Orbitrap instruments only.
- Performance: achieves a median normalized spectral angle of 0.941 on a held-out test set and performs consistently across proteases, with slight degradation for long peptides or extreme charge states.
- Input: peptides 6–40 amino acids long (6–30 recommended), charge 1–7 (1–4 recommended), NCE between 20–40, with variable methionine oxidation and static carbamidomethylated cysteines.
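It can be worth validating inputs against these limits before requesting predictions. A minimal sketch, assuming UNIMOD bracket notation for the two supported modifications (the notation Altimeter actually expects should be checked against its Koina model card):

```python
import re

# Allowed residues: the 20 standard amino acids, plus oxidized methionine
# and carbamidomethylated cysteine written in UNIMOD bracket notation.
# This notation is an assumption; confirm it on the Koina model card.
VALID_PEPTIDE = re.compile(
    r"^(?:[ACDEFGHIKLMNPQRSTVWY]|M\[UNIMOD:35\]|C\[UNIMOD:4\])+$"
)

def check_input(peptide, charge, nce):
    """Return a list of problems for one input row (empty list = OK)."""
    problems = []
    if not VALID_PEPTIDE.match(peptide):
        problems.append("unsupported residue or modification")
    # Count residue positions, treating a modified residue as one position.
    length = len(re.findall(r"[A-Z](?:\[UNIMOD:\d+\])?", peptide))
    if not 6 <= length <= 40:
        problems.append(f"length {length} outside 6-40")
    if not 1 <= charge <= 7:
        problems.append(f"charge {charge} outside 1-7")
    if not 20 <= nce <= 40:
        problems.append(f"NCE {nce} outside 20-40")
    return problems

print(check_input("PEPTIDEK", 2, 30.0))                   # []
print(check_input("M[UNIMOD:35]PEPC[UNIMOD:4]K", 8, 45))  # charge and NCE problems
```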
The training data is available from Zenodo: https://zenodo.org/records/15875054
Download and unpack the archive into a working directory, e.g. ~/altimeter_data:
```bash
wget "https://zenodo.org/records/15875054/files/Altimeter_training_data.tar.gz?download=1" \
  -O Altimeter_training_data.tar.gz
mkdir -p ~/altimeter_data
tar -xzf Altimeter_training_data.tar.gz -C ~/altimeter_data
```

After extraction, update config/data.yaml so the paths point to your dataset location:
```yaml
base_path: /path/to/altimeter_data/
ion_dictionary_path: /path/to/altimeter_data/saved_model/ion_dictionary.txt
dataset_path: datasets/
position_path: txt_pos/
label_path: labels/
saved_model_path: saved_model/
```

To train inside Docker, first pull the training image:

```bash
docker pull dennisgoldfarb/pytorch_ris:lightning
```
Then launch training with the repository mounted into the container:

```bash
docker run --gpus all -v $PWD:/workspace/Altimeter \
  -v /path/to/altimeter_data:/data \
  -w /workspace/Altimeter/altimeter \
  dennisgoldfarb/pytorch_ris:lightning \
  python3 train.py ../config/data.yaml
```

To train without Docker, create a Python environment with PyTorch, PyTorch Lightning, and the repository's dependencies, then run:
```bash
python train.py ../config/data.yaml
```

The repository provides export.py to serialize the trained model for serving. Run the script inside the Docker image used for training:
```bash
docker run --gpus all -v $PWD:/workspace/Altimeter \
  -v /path/to/altimeter_data:/data \
  -w /workspace/Altimeter/altimeter \
  dennisgoldfarb/pytorch_ris:lightning \
  python3 export.py \
    /data/model.ts \
    /data/model.onnx \
    --dic-config ../config/data.yaml \
    --model-config /data/saved_model/model_config.yaml \
    --model-ckpt /data/saved_model/checkpoint.ckpt  # replace with your checkpoint
```

The first argument is the output path for the TorchScript model and the second specifies the ONNX file for the spline model. Adjust the paths to match your environment.