[Discussion] Deisotoping parameters

[WIP]

@mobiusklein, from https://github.com/mobiusklein/ms_deisotope, got in touch and raised a couple of points about how we currently use deisotoping.

I will try to distill the contents and implications here:

1. Numpy array support

> Reading your code, I noticed you weren’t relying on deconvolute_peaks to do input coercion for you, which it turns out was because I wasn’t calling prepare_peaklist before passing the peak list into the deconvoluter itself. I’ve fixed that. I’ve also made it so prepare_peaklist will work with a pair of numpy arrays for m/z and intensity without needing to zip them together yourself first. This fix will be live in version v0.0.46, which I’ll release tonight.

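This entails bumping the ms_deisotope version to v0.0.46 or later and using the new API, i.e. passing the m/z and intensity arrays directly instead of zipping them into tuples first.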


> There were two things I wanted to ask about though.
>
> https://github.com/TalusBio/diadem/blob/113521ff7cf5ecb807695f1d706319b7a4ebb053/diadem/mzml.py#L565-L566
> The first is where you intend to use the deisotoped output? The way you’re using it, you’re letting ms_deisotope strip out all the isotopic peaks, but then you’re keeping the charge state-specific m/z values. You probably want to work with all singly charged values downstream in your code, which means you should pass your deconvoluted peaks to ms_deisotope.decharge first, which transforms all peaks to be singly charged. Otherwise, your downstream code will miss out on those multiply charged ions unless you search for their m/zs explicitly but you’ll have discarded all the evidence for those charge state assignments.
>
> The second is w.r.t. the comment “I do not really have a reason to use one scorer over other rn”. I think your choice of MSDeconVFitter is probably safe, especially for MS2 data. If you’re finding you’re missing peaks downstream, you can safely lower the threshold from 10 to 0 and/or pass retention_strategy=ms_deisotope.deconvolution.TopNRetentionStrategy(50) to deconvolute_peaks. That will keep the top 50 most abundant peaks as singly charged even if they didn’t pass the deconvolution score threshold. Setting the threshold to 0 means the deconvoluter will reject outright bad matches, but will accept more truncated isotopic patterns. This is especially true of Orbitrap data which will discard low abundance isotopic peaks.
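To make the decharging point concrete, here is a minimal sketch of the arithmetic behind converting a multiply charged peak to its singly charged m/z. It is not ms_deisotope's implementation: the function name `to_singly_charged_mz` and the example peaks are hypothetical, and it assumes positive-mode ions where the charge carriers are protons.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da


def to_singly_charged_mz(mz: float, charge: int) -> float:
    """Return the m/z the same ion would have at charge 1 (positive mode).

    Hypothetical helper for illustration only, not part of ms_deisotope.
    """
    # Remove the charge carriers to get the neutral mass ...
    neutral_mass = (mz - PROTON_MASS) * charge
    # ... then add a single proton back.
    return neutral_mass + PROTON_MASS


# Made-up peaks: (observed m/z, assigned charge)
peaks = [(500.5, 2), (334.0024, 3), (300.0, 1)]
for mz, z in peaks:
    print(f"{mz:.4f} ({z}+) -> {to_singly_charged_mz(mz, z):.4f} (1+)")
```

Without this step, a fragment observed at 2+ keeps an m/z of roughly half its singly charged value, so downstream matching against 1+ theoretical fragments would not find it.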

At the moment I am using it as:
```python
peaks = prepare_peaklist(
    [
        (mz, inten)
        for mz, inten in zip(curr_spec.mz, curr_spec.intensity)
    ]
)
deconvoluted_peaks, _ = ms_deisotope.deconvolute_peaks(
    peaks,
    averagine=ms_deisotope.peptide,
    scorer=ms_deisotope.MSDeconVFitter(10.0),
    charge_range=(1, 3),
)
```

Following the suggestion on the score threshold and retention strategy, this would entail changing to:

```python
deconvoluted_peaks, _ = ms_deisotope.deconvolute_peaks(
    peaks,
    averagine=ms_deisotope.peptide,
    scorer=ms_deisotope.MSDeconVFitter(0),
    retention_strategy=ms_deisotope.deconvolution.TopNRetentionStrategy(50),
    charge_range=(1, 3),
)
```
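For intuition about what a top-N retention strategy does, here is a simplified sketch of the idea as described in the quote: peaks that did not end up in an accepted isotopic fit are ranked by intensity and the N most abundant are kept as singly charged. The names `Peak`, `RetainedPeak`, and `retain_top_n` are hypothetical, the peak values are made up, and the actual `TopNRetentionStrategy` may apply additional filters that this sketch ignores.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    mz: float
    intensity: float


class RetainedPeak(NamedTuple):
    mz: float
    intensity: float
    charge: int


def retain_top_n(leftover: List[Peak], n: int) -> List[RetainedPeak]:
    """Keep the n most intense leftover peaks, labelled as charge 1.

    Hypothetical illustration, not the ms_deisotope implementation.
    """
    ranked = sorted(leftover, key=lambda p: p.intensity, reverse=True)
    kept = [RetainedPeak(p.mz, p.intensity, 1) for p in ranked[:n]]
    # Return in m/z order, as peak lists usually are.
    return sorted(kept, key=lambda p: p.mz)


# Made-up peaks that no isotopic fit accounted for.
leftover = [
    Peak(175.119, 1200.0),
    Peak(262.151, 90.0),
    Peak(401.287, 4300.0),
    Peak(512.330, 15.0),
]
retained = retain_top_n(leftover, n=2)
for peak in retained:
    print(peak)
```

The trade-off is that retained peaks carry an assumed charge of 1 with no isotopic evidence behind it, so they recover signal at the cost of some confidence in the charge assignment.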


