An R script for the annotation of small-molecule LC-MS data
This repository contains a full workflow application for annotating a full signal intensity matrix, as produced by e.g. SLAW (see https://github.com/zamboni-lab/SLAW for further details), with MS1/MS2 annotations using the MetaboAnnotation package (see https://github.com/rformassspectrometry/MetaboAnnotation).
An example workflow can be found in `MetaboliteAnnotationWorkflow.R` and run from the command line as:
Rscript MetaboliteAnnotationWorkflow.R settings.yaml
All required input parameters are stored in a YAML file. See `test_input/settings.yaml` for an example settings file.
The following parameters need to be provided:
- `cores`: the number of cores on which the calculation is performed.
- `tolerance_MS1` and `tolerance_MS2`: the absolute tolerance allowed for annotation.
- `ppm_MS1` and `ppm_MS2`: the relative tolerance allowed for annotation.
- `adducts_pos` and `adducts_neg`: the adducts which are calculated for MS1 annotation.
- `dp_tresh`: the dot product score threshold applied for annotation.
- `int_tresh`: the relative intensity threshold below which peaks are removed from the MS2 query spectra.
- `toleranceRT_MS1` and `toleranceRT_MS2`: the retention time window in seconds used for annotation.
- `output_dir`: the directory in which the result data are stored.
- `save_rds`: a logical (TRUE/FALSE) defining whether all calculated objects are stored. These can easily be loaded again, e.g. into an R session.
- `save_tsv`: a logical (TRUE/FALSE) defining whether match results are provided as tsv.
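To make the role of the tolerance and threshold settings more concrete, the following base R sketch shows how such parameters are commonly applied. It is not the workflow's actual code. The helper names (`mz_matches`, `rt_matches`, `filter_relative_intensity`, `normalised_dot_product`) are hypothetical, and it assumes that the absolute and relative tolerances are added together, that `int_tresh` is a fraction of the base peak intensity, and that the two spectra are already aligned peak by peak. The scoring used by MetaboAnnotation may weight and align peaks differently.

```r
# Proton mass in Da, used here for the [M+H]+ and [M-H]- adducts
proton <- 1.007276

# Hypothetical helper: TRUE if a measured m/z lies within the allowed window.
# Assumption: absolute tolerance and ppm tolerance are added together.
mz_matches <- function(mz_query, mz_target, tolerance = 0, ppm = 0) {
  abs(mz_query - mz_target) <= tolerance + ppm * mz_target / 1e6
}

# Hypothetical helper: TRUE if two retention times (seconds) are within the window
rt_matches <- function(rt_query, rt_target, tolerance_rt) {
  abs(rt_query - rt_target) <= tolerance_rt
}

# Hypothetical helper: drop peaks below a relative intensity threshold.
# `peaks` is a matrix with columns "mz" and "intensity";
# `int_tresh` is assumed to be a fraction of the most intense peak.
filter_relative_intensity <- function(peaks, int_tresh) {
  keep <- peaks[, "intensity"] >= int_tresh * max(peaks[, "intensity"])
  peaks[keep, , drop = FALSE]
}

# Hypothetical helper: plain normalised dot product of two aligned intensity vectors
normalised_dot_product <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# MS1: glucose (C6H12O6), monoisotopic mass 180.06339 Da
exact_mass <- 180.06339
mz_pos <- exact_mass + proton  # [M+H]+
mz_neg <- exact_mass - proton  # [M-H]-
hit  <- mz_matches(181.0710, mz_pos, ppm = 10)  # inside the 10 ppm window
miss <- mz_matches(181.0800, mz_pos, ppm = 10)  # outside the 10 ppm window

# MS2: made-up query peaks; the weakest peak falls below 5 % of the base peak
query <- cbind(mz = c(50.1, 77.0, 105.0, 122.1), intensity = c(2, 40, 100, 15))
filtered <- filter_relative_intensity(query, int_tresh = 0.05)

# Made-up library intensities, aligned to the three remaining query peaks
library_intensity <- c(38, 100, 0)
score <- normalised_dot_product(filtered[, "intensity"], library_intensity)
```

A match would then be reported only if `score` is at least `dp_tresh` and, for inhouse libraries, `rt_matches()` also returns `TRUE`.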
The settings file also defines the paths to the following input files:
- The slaw output containing the data matrix and the fused mgf file to use for both ionization modes.
- `studydesign_pos` and `studydesign_neg`: a `*.csv` file containing the sample metadata to be stored in the `colData` of the final `SummarizedExperiment` (optional if parameter `samplegroup=TRUE`).
- Input library directories (positive and negative mode) containing the MS1 annotation lists. Two different directories can be provided, distinguishing between inhouse data (`inhouse`) and external data (`ext`). For inhouse data, an additional retention time matching is performed. The `*.csv` files should contain the following columns:
| id | name | formula | exact_mass | rt |
|---|---|---|---|---|
The retention time column (`rt`) is only needed for the inhouse files.
- Input library directories (positive and negative mode) containing the MS2 spectral libraries. The following input formats are supported:
  - `*.mb` as MassBank record file
  - `*.msp` as MS2 MSP file
  - `*.rds` as a Spectra object containing the library spectra
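As an illustration of the MS1 library layout described above, the following sketch writes a small inhouse list with the five expected columns. The identifiers, retention times, and file name are hypothetical. It also assumes that the file is comma-separated and that `rt` is given in seconds, which the settings description suggests but does not state for the library files.

```r
# Two example compounds with their monoisotopic masses
inhouse <- data.frame(
  id         = c("CPD0001", "CPD0002"),       # hypothetical identifiers
  name       = c("Glucose", "Caffeine"),
  formula    = c("C6H12O6", "C8H10N4O2"),
  exact_mass = c(180.06339, 194.08038),
  rt         = c(45.2, 312.7)                 # hypothetical retention times
)

# Write to a temporary location; a real file would go into the inhouse library directory
lib_file <- file.path(tempdir(), "inhouse_library.csv")
write.csv(inhouse, lib_file, row.names = FALSE)

# Read the file back to confirm the column layout
library_columns <- names(read.csv(lib_file))
```

For an external list, the `rt` column can be left out, since no retention time matching is performed for those files.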
In the output directory, the following directories will be created:
- `Annotation_MS1_external`: containing the result files of the MS1 annotation without retention time.
- `Annotation_MS1_inhouse`: containing the result files of the MS1 annotation with retention time.
- `Annotation_MS2_external`: containing the result files of the MS2 annotation without retention time.
- `Annotation_MS2_inhouse`: containing the result files of the MS2 annotation with retention time.
- `QFeatures_MS1`: containing `*.rds` files with the QFeatures SummarizedExperiment objects.
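Stored `*.rds` results can be read back into an R session with `readRDS()`. The sketch below uses a hypothetical helper, `load_rds_results`, and demonstrates it on a temporary directory holding a placeholder object, because the names of the real result files are not specified here. Loading actual results would also require attaching the packages that define the stored classes (for example QFeatures) beforehand.

```r
# Hypothetical helper: read every *.rds file in one result subdirectory
load_rds_results <- function(output_dir, subdir) {
  files <- list.files(file.path(output_dir, subdir),
                      pattern = "\\.rds$", full.names = TRUE)
  # Return a list of objects named after their files
  setNames(lapply(files, readRDS), basename(files))
}

# Demonstration on a temporary directory with a placeholder object
demo_dir <- file.path(tempdir(), "demo_output")
dir.create(file.path(demo_dir, "QFeatures_MS1"),
           recursive = TRUE, showWarnings = FALSE)
saveRDS(list(placeholder = TRUE),
        file.path(demo_dir, "QFeatures_MS1", "example.rds"))

results <- load_rds_results(demo_dir, "QFeatures_MS1")
```

In practice, `demo_dir` would be replaced by the `output_dir` given in the settings file.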
If the `MatchedSpectra` objects are stored (settings parameter `save_rds = TRUE`), they can be reviewed using the command:
Rscript ShinyMetaboAnnotation/app.R [matchedObject.rds]
The `MatchedSpectra` object to review can either be given as a command-line argument or loaded later in the Shiny application.
Alternatively, the application can be opened in RStudio and started with the run application button at the top of the source editor.
By clicking on the Browse... button, an rds file containing a `MatchedSpectra` object can be loaded.
The side panel lists all features from the MS2 query data that have matches, identified by mass and retention time. Clicking on a side panel entry loads the spectra. The table at the bottom contains all matches found for that feature. Note that a library spectrum is only loaded once its table row is selected by clicking on it.
At the bottom, you can mark the match as verified or as a false positive.
Once you have finalized your selection, click on the save verification button if you loaded the file via the command line. This will generate a new `MatchedSpectra` object containing only the TRUE annotations in a `*_verified.rds` file.
If you loaded the object through the Browse... button, please note that you should use the Select file location button for storage.
