
GoldfarbLab/Altimeter


Altimeter Logo

This repository contains the model and training code for Altimeter, a transformer model for peptide spectrum prediction.

Koina deployment

Altimeter models are hosted on Koina and can be queried via its REST API.
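Koina serves models through a Triton-style (KServe v2) inference endpoint, so a request is a JSON payload of named input tensors. The sketch below builds such a payload with the standard library only; the input tensor names and the exact Altimeter model name on Koina are assumptions here — check the Koina site for the published names before sending a request.

```python
import json

# Koina exposes Triton-style endpoints of the form:
#   POST https://koina.wilhelmlab.org/v2/models/<model_name>/infer
# The Altimeter model name and input names below are assumptions.
KOINA_URL = "https://koina.wilhelmlab.org/v2/models/{model}/infer"

def build_request(peptides, charges, nces):
    """Assemble a KServe v2 inference payload for a batch of peptides."""
    n = len(peptides)
    return {
        "id": "0",
        "inputs": [
            {"name": "peptide_sequences", "shape": [n, 1],
             "datatype": "BYTES", "data": peptides},
            {"name": "precursor_charges", "shape": [n, 1],
             "datatype": "INT32", "data": charges},
            {"name": "collision_energies", "shape": [n, 1],
             "datatype": "FP32", "data": nces},
        ],
    }

payload = build_request(["PEPTIDEK"], [2], [30.0])
print(json.dumps(payload, indent=2))
# To actually query Koina (requires network access):
# import requests
# resp = requests.post(KOINA_URL.format(model="Altimeter"), json=payload)
# outputs = resp.json()["outputs"]
```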

Model overview

  • Data: trained on a reprocessed ProteomeTools dataset covering tryptic and non-tryptic peptides (trypsin, LysC, AspN, HLA ligands) acquired on an Orbitrap Fusion Lumos with NCEs from 20–40; only methionine oxidation and cysteine carbamidomethylation were considered.
  • Model: predicts fragment ion intensities across NCEs using cubic B-splines; an isotopes variant re-creates full fragment isotope patterns based on isolation efficiencies; validated for HCD on Orbitrap instruments only.
  • Performance: achieves a median normalized spectral angle of 0.941 on a held-out test set and performs consistently across proteases, with slight degradation for long peptides or extreme charge states.
  • Input: peptides 6–40 amino acids long (6–30 recommended), charge 1–7 (1–4 recommended), NCE between 20 and 40, with variable methionine oxidation and static carbamidomethylated cysteines.
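The supported and recommended input ranges above can be checked programmatically before submitting peptides. The helper below is hypothetical (not part of the repository), and for simplicity it only accepts plain sequences of the 20 standard residues — real inputs with oxidized methionine would need whatever modification syntax the serving endpoint expects.

```python
import re

def validate_altimeter_input(peptide: str, charge: int, nce: float) -> list:
    """Return a list of issues for a (peptide, charge, NCE) triple;
    an empty list means the input is within Altimeter's stated ranges.
    Hypothetical helper based on the README's input constraints."""
    issues = []
    # Assumption: plain 20-residue sequences; modified-residue syntax
    # (e.g. oxidized Met) is not handled by this simple check.
    if not re.fullmatch(r"[ACDEFGHIKLMNPQRSTVWY]+", peptide):
        issues.append("unsupported residues or modification syntax")
    if not 6 <= len(peptide) <= 40:
        issues.append("length outside supported 6-40 range")
    elif len(peptide) > 30:
        issues.append("warning: length >30, outside recommended range")
    if not 1 <= charge <= 7:
        issues.append("charge outside supported 1-7 range")
    elif charge > 4:
        issues.append("warning: charge >4, outside recommended range")
    if not 20 <= nce <= 40:
        issues.append("NCE outside supported 20-40 range")
    return issues

print(validate_altimeter_input("PEPTIDEK", 2, 30.0))  # []
```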

Dataset

The training data is available from Zenodo: https://zenodo.org/records/15875054

Download and unpack the archive into a working directory, e.g. ~/altimeter_data:

wget "https://zenodo.org/records/15875054/files/Altimeter_training_data.tar.gz?download=1" -O Altimeter_training_data.tar.gz
mkdir -p ~/altimeter_data
tar -xzf Altimeter_training_data.tar.gz -C ~/altimeter_data

After extraction, update config/data.yaml so the paths point to your dataset location:

base_path: /path/to/altimeter_data/
ion_dictionary_path: /path/to/altimeter_data/saved_model/ion_dictionary.txt
dataset_path: datasets/
position_path: txt_pos/
label_path: labels/
saved_model_path: saved_model/
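A quick sanity check before training is to confirm that each configured directory actually exists under base_path. The snippet below mirrors the YAML fragment above as a plain Python dict (keeping the same key names) so it needs no extra dependencies; with the placeholder base_path every directory is reported missing.

```python
import os

# Mirror of config/data.yaml (same keys as the fragment above).
cfg = {
    "base_path": "/path/to/altimeter_data/",
    "dataset_path": "datasets/",
    "position_path": "txt_pos/",
    "label_path": "labels/",
    "saved_model_path": "saved_model/",
}

# Relative entries are resolved against base_path; collect any that
# do not exist on disk yet.
missing = [
    key
    for key in ("dataset_path", "position_path", "label_path", "saved_model_path")
    if not os.path.isdir(os.path.join(cfg["base_path"], cfg[key]))
]
print("missing directories:", missing)
```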

Training

Using Docker

docker pull dennisgoldfarb/pytorch_ris:lightning
docker run --gpus all -v $PWD:/workspace/Altimeter \
    -v /path/to/altimeter_data:/data \
    -w /workspace/Altimeter/altimeter \
    dennisgoldfarb/pytorch_ris:lightning \
    python3 train.py ../config/data.yaml

Without Docker

Create a Python environment with PyTorch, PyTorch Lightning, and the repository's dependencies, then run from the altimeter/ directory:

python train.py ../config/data.yaml

Export to ONNX and TorchScript

The repository provides export.py to serialize the model for serving. Run the script inside the Docker image used for training:

docker run --gpus all -v $PWD:/workspace/Altimeter \
    -v /path/to/altimeter_data:/data \
    -w /workspace/Altimeter/altimeter \
    dennisgoldfarb/pytorch_ris:lightning \
    python3 export.py \
        /data/model.ts \
        /data/model.onnx \
        --dic-config ../config/data.yaml \
        --model-config /data/saved_model/model_config.yaml \
        --model-ckpt /data/saved_model/checkpoint.ckpt  # replace with your checkpoint
The first argument is the output path for the TorchScript model and the second argument specifies the ONNX file for the spline model. Adjust the paths to match your environment.
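After export, the ONNX file can be inspected to confirm it serialized correctly. This is a minimal sketch using ONNX Runtime (pip install onnxruntime); the model path matches the export command above, and the helper name is hypothetical. Inspecting sess.get_inputs() also reveals the real input names and shapes, which is useful when wiring the model into a serving stack.

```python
import os

MODEL_PATH = "/data/model.onnx"  # output path from the export command above

def inspect_onnx(path):
    """Return (name, shape) pairs for the exported model's inputs.
    Requires onnxruntime; imported lazily so this file loads without it."""
    import onnxruntime as ort
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    return [(i.name, i.shape) for i in sess.get_inputs()]

if os.path.exists(MODEL_PATH):
    for name, shape in inspect_onnx(MODEL_PATH):
        print(name, shape)
else:
    print("model not exported yet")
```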
