Mass Spec Coding Club

The Mass Spec Coding Club (MSCC) is a community dedicated to teaching computer coding for mass spectrometry applications. Our goal is to make coding accessible to mass spectrometry researchers and to provide free resources and open-source examples.

As the community develops, we will continue to post more content, and we welcome contributions from anyone.

Discord Server

Want to chat with community members and join meetings? Join us on the Mass Spec Coding Club Discord Server. It's easy to set up, and you can run it from a browser if you'd like. We will pick a time soon and start hosting meetings/office hours there.

In the meantime, feel free to post questions to the text channels there, and people can answer.

Learning Modules

Module 0: Setting Up Python and Plotting A Spectrum

This series of lessons covers how to set up Python from scratch and write a simple script to plot a mass spectrum. Skills and learning outcomes are outlined below each video.

Lesson 0.0: Setting up Python from Scratch
- How to set up and run Python
- Setting variables
- Printing to the terminal
Lesson 0.1: Loading Data Into Python
- Importing libraries
- Reading from text files into NumPy arrays
- Intro to array slicing
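
As a rough sketch of the skills listed for Lesson 0.1 (not the code from the video), the example below reads two-column text into a NumPy array and slices out each column. The data values and the variable names are made up for illustration, and the text is read from an in-memory string so the snippet runs on its own.

```python
import io

import numpy as np

# Hypothetical two-column spectrum text: m/z and intensity, separated by spaces.
# With a real file, pass the file path to np.loadtxt instead of a StringIO object.
example_text = """\
500.0 10.0
500.5 250.0
501.0 1000.0
501.5 400.0
502.0 20.0
"""

# np.loadtxt returns a 2D array with one row per data point
data = np.loadtxt(io.StringIO(example_text))

mz = data[:, 0]         # all rows, first column
intensity = data[:, 1]  # all rows, second column

print(data.shape)       # (number of rows, number of columns)
print(mz[:3])           # first three m/z values
print(intensity[-1])    # last intensity value
```

The colon in `data[:, 0]` means "every row", so the same pattern works for a spectrum of any length.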
Lesson 0.2: Plotting a Spectrum
- Plotting a spectrum with Matplotlib
- Normalizing the y-axis
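
The two skills in Lesson 0.2 might look something like the sketch below, which assumes the m/z and intensity values are already in NumPy arrays. The numbers, the axis labels, and the output filename `spectrum.png` are placeholders rather than anything taken from the lesson files.

```python
import matplotlib

matplotlib.use("Agg")  # draw to a file; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Made-up example data standing in for a loaded spectrum
mz = np.array([500.0, 500.5, 501.0, 501.5, 502.0])
intensity = np.array([10.0, 250.0, 1000.0, 400.0, 20.0])

# Normalize the y-axis so the tallest peak has a relative intensity of 1
norm_intensity = intensity / np.max(intensity)

fig, ax = plt.subplots()
ax.plot(mz, norm_intensity)
ax.set_xlabel("m/z")
ax.set_ylabel("Relative intensity")
fig.savefig("spectrum.png")  # hypothetical output filename
```

Dividing by the maximum keeps the peak shapes the same and only rescales the axis, which makes spectra with different absolute signal easier to compare.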
Lesson 0.3: Too Fast, Go Back - Review and Background from Module 0
- Fundamentals of how computers work
- Basics of code concepts
- Discussion of variables, functions, and classes
- How to define functions

The data files, Python code, and notes used in this module are available in the "Module 0" folder.

Module 1: Calculating Masses

The goal of Module 1 is to show how Python can be used to predict masses of various molecules, starting with proteins.

Lesson 1.0: Calculating Protein Masses
- Using a Dictionary
- Creating a function
- Looping through protein sequence to calculate the protein mass
Lesson 1.1: Improving Our Protein Mass Calculator
- String manipulation
- Passing variables through functions
- If/then statements
- Monoisotopic mass calculation for protein
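
One way the ideas from Lessons 1.0 and 1.1 could fit together is sketched below: a dictionary of residue masses, a function that loops through the sequence, some string cleanup, and an if/else switch between monoisotopic and average masses. The function name `calc_protein_mass` and its `monoisotopic` option are illustrative and may differ from the code in the Module 1 folder; the masses are standard residue values rounded to a few decimal places.

```python
# Amino acid residue masses in Da, stored as (monoisotopic, average)
RESIDUE_MASSES = {
    "G": (57.02146, 57.0519), "A": (71.03711, 71.0788),
    "S": (87.03203, 87.0782), "P": (97.05276, 97.1167),
    "V": (99.06841, 99.1326), "T": (101.04768, 101.1051),
    "C": (103.00919, 103.1388), "L": (113.08406, 113.1594),
    "I": (113.08406, 113.1594), "N": (114.04293, 114.1038),
    "D": (115.02694, 115.0886), "Q": (128.05858, 128.1307),
    "K": (128.09496, 128.1741), "E": (129.04259, 129.1155),
    "M": (131.04049, 131.1926), "H": (137.05891, 137.1411),
    "F": (147.06841, 147.1766), "R": (156.10111, 156.1875),
    "Y": (163.06333, 163.1760), "W": (186.07931, 186.2132),
}
WATER_MASS = (18.01056, 18.01528)  # (monoisotopic, average)


def calc_protein_mass(sequence, monoisotopic=False):
    """Sum the residue masses and add one water for the two termini."""
    # If/else: choose which value of each (monoisotopic, average) pair to use
    if monoisotopic:
        index = 0
    else:
        index = 1

    # String manipulation: remove whitespace and convert to uppercase
    sequence = "".join(sequence.split()).upper()

    mass = WATER_MASS[index]
    for residue in sequence:
        if residue not in RESIDUE_MASSES:
            raise ValueError(f"Unknown residue: {residue}")
        mass += RESIDUE_MASSES[residue][index]
    return mass


print(calc_protein_mass("ga"))                     # average mass
print(calc_protein_mass("GA", monoisotopic=True))  # monoisotopic mass
```

Raising an error for an unknown letter is a design choice made here; skipping unknown characters with a warning would be another reasonable option.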
Lesson 1.2: Calculating Masses from Glycans, SMILES, and Formulas
- Using Glypy to calculate masses from GlycoCT strings
- Using molmass to calculate masses from formulas
- Using RDKit to calculate masses from SMILES strings
Lesson 1.3: Too Fast, Go Back - For Loops, If/Then, and Function Options
- Writing For loops
- If/Then statements and Boolean tests
- Passing options to functions
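
To illustrate the three review topics in Lesson 1.3 together, here is a small sketch that converts a neutral mass into m/z values over a range of charge states. The function `calc_mz_values` and its keyword options are hypothetical names chosen for this example, and it assumes that charge comes only from gaining or losing protons.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def calc_mz_values(mass, min_charge=1, max_charge=3, positive_mode=True):
    """Return a list of m/z values for each charge state in the range."""
    mz_values = []
    # For loop: range stops before its end value, so add 1 to include max_charge
    for z in range(min_charge, max_charge + 1):
        # If/else with a Boolean test on the positive_mode option
        if positive_mode:
            mz = (mass + z * PROTON_MASS) / z  # protons added
        else:
            mz = (mass - z * PROTON_MASS) / z  # protons removed
        mz_values.append(mz)
    return mz_values


# Options not given here fall back to the defaults in the function definition
print(calc_mz_values(1000.0))
print(calc_mz_values(1000.0, max_charge=2, positive_mode=False))
```

Giving each option a default value means the function can be called with just a mass, while still allowing any single option to be overridden by name.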
Homework 1
- For those who want to test their skills and calculate some RNA masses, check out homework1.py in Module 1.

Module 2: Parsing FASTA and Excel Files

The goal of Module 2 is to parse larger FASTA and Excel files to pull in the desired information and write it to an Excel file output.

Lesson 2.0: Importing FASTA Files
- Importing Pyteomics library
- Importing our own functions from Module 1
- Looping through a FASTA file to calculate masses
Lesson 2.1: Parsing FASTA Descriptions and Writing to Excel
- Using string splitting to parse protein names and entries
- Creating lists and adding to a Pandas DataFrame
- Exporting DataFrames
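
The string splitting and DataFrame steps in Lesson 2.1 might be sketched as follows. This example assumes UniProt-style description lines with the leading ">" already removed, and the column names are chosen for illustration. It starts from a plain list of descriptions rather than reading a FASTA file, so it does not show the Pyteomics step.

```python
import pandas as pd

# Example UniProt-style FASTA descriptions (without the leading ">")
descriptions = [
    "sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2",
    "sp|P68871|HBB_HUMAN Hemoglobin subunit beta OS=Homo sapiens OX=9606 GN=HBB PE=1 SV=2",
]

rows = []
for description in descriptions:
    # Split once at the first space: identifier block, then everything else
    identifier, rest = description.split(" ", 1)
    # The identifier block has three fields separated by "|"
    database, accession, entry_name = identifier.split("|")
    # The protein name is the text before the " OS=" organism tag
    protein_name = rest.split(" OS=")[0]
    rows.append([accession, entry_name, protein_name])

# Build the DataFrame from the list of rows
df = pd.DataFrame(rows, columns=["Accession", "Entry", "Protein"])
print(df)
```

From here, `df.to_excel("output.xlsx", index=False)` would write the table to an Excel file, provided an Excel writer such as openpyxl is installed; the filename is a placeholder.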
Lesson 2.2: Reading, Parsing, and Exporting DataFrames - DDA to MRM
- Reading Excel files to Pandas DataFrames
- Parsing DataFrame rows
- Extracting the position of the most abundant product ion
- Exporting Transition List to Excel
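
A minimal sketch of the DDA-to-MRM idea in Lesson 2.2 is shown below, assuming each row stores its product ion m/z values and intensities as comma-separated text. The table contents, the column names, and the Q1/Q3 labels are all invented for this example and will not match the Excel files used in the lesson.

```python
import numpy as np
import pandas as pd

# Made-up DDA-style results table; in practice this could come from pd.read_excel
df = pd.DataFrame({
    "Peptide": ["Peptide A", "Peptide B"],
    "Precursor m/z": [450.2, 612.8],
    "Product m/z": ["175.1,262.1,375.2", "147.1,276.2,405.2"],
    "Product intensity": ["200,1500,800", "950,300,120"],
})

datalist = []
for _, row in df.iterrows():
    # Parse the comma-separated text in this row into NumPy arrays
    product_mz = np.array([float(x) for x in row["Product m/z"].split(",")])
    product_int = np.array([float(x) for x in row["Product intensity"].split(",")])

    # np.argmax gives the position of the most abundant product ion
    top = np.argmax(product_int)
    datalist.append([row["Peptide"], row["Precursor m/z"], product_mz[top]])

# One transition (precursor and product m/z pair) per peptide
mrm = pd.DataFrame(datalist, columns=["Peptide", "Q1 m/z", "Q3 m/z"])
print(mrm)
```

Because `np.argmax` returns a position rather than a value, the same index can be used to look up the matching m/z in the other array.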
Lesson 2.3: Too Fast, Go Back - NumPy Array Slicing and Boolean Indexing
- Slicing NumPy arrays
- Using Boolean indexing for more sophisticated array slicing
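
The difference between plain slicing and Boolean indexing from Lesson 2.3 can be sketched with a short peak list. The values below are made up, and the 0.5 relative intensity threshold is arbitrary.

```python
import numpy as np

# Made-up peak list
mz = np.array([147.1, 175.1, 262.1, 375.2, 488.3])
intensity = np.array([950.0, 200.0, 1500.0, 800.0, 60.0])

# Slicing selects by position
print(mz[1:4])   # elements at index 1, 2, and 3
print(mz[::2])   # every second element

# Boolean indexing selects by condition
relative = intensity / intensity.max()
mask = relative > 0.5  # array of True/False values, one per peak
print(mask)
print(mz[mask])        # m/z values of the peaks above the threshold

# Conditions can be combined with & (and) or | (or); the parentheses are required
window = (mz > 150) & (mz < 400)
print(intensity[window])
```

A mask made from one array can be applied to any other array of the same length, which is what keeps m/z and intensity values paired up.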
Homework 2
- For those who want more practice, follow up on my suggestion from Lesson 2.2 and turn the MRM list into a PRM list. There are several ways to do this. You could pick the top N peaks in the product ion spectrum (np.sort might be useful here), or you could pick every peak above a specific relative intensity threshold (Boolean indexing might be useful here). Add each selected peak as a new row in the datalist, keeping the other values in the row the same.

Check back for more videos, and reach out at mtmarty@utexas.edu if you like these.

Ideas for Future Tutorials

Here are some ideas that users have suggested. If you have other suggestions, please enter them in the "What Projects Would You Like to See?" discussion. If you would like to volunteer to make a module on one of these topics, please add your name here.

Plotting multiple spectra with for loops and string parsing (Michael Marty)
Reading vendor files
Writing to different output files
Exploring other Python MS packages
How to use public databases (Ming?)
Applications to polymers and oligonucleotides
Ion mobility
Using Git and GitHub
Gasp, R!
- There are a lot of great R resources for MS already, so maybe we could organize and link those here too.

Funding

Funding is provided by the National Science Foundation: CHE-1845230.

