[Discussion] Deisotoping parameters #6

Open
jspaezp opened this issue Jan 26, 2023 · 0 comments

jspaezp commented Jan 26, 2023

[WIP]

@mobiusklein, from https://github.com/mobiusklein/ms_deisotope, reached out with a couple of comments on our current use of deisotoping.

I will try to distill the contents and implications here:

  1. NumPy array support

Reading your code, I noticed you weren’t relying on deconvolute_peaks to do input coercion for you, which it turns out was because I wasn’t calling prepare_peaklist before passing the peak list into the deconvoluter itself. I’ve fixed that. I’ve also made it so prepare_peaklist will work with a pair of numpy arrays for m/z and intensity without needing to zip them together yourself first. This fix will be live in version v0.0.46, which I’ll release tonight.

This entails bumping our ms_deisotope version to v0.0.46 or later and using the new API.

There were two things I wanted to ask about though.

https://github.com/TalusBio/diadem/blob/113521ff7cf5ecb807695f1d706319b7a4ebb053/diadem/mzml.py#L565-L566
The first is where you intend to use the deisotoped output. The way you’re using it, you’re letting ms_deisotope strip out all the isotopic peaks, but then you’re keeping the charge state-specific m/z values. You probably want to work with all singly charged values downstream in your code, which means you should pass your deconvoluted peaks to ms_deisotope.decharge first, which transforms all peaks to be singly charged. Otherwise, your downstream code will miss those multiply charged ions unless you search for their m/z values explicitly, and by then you’ll have discarded all the evidence for those charge state assignments.
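To make the decharging point concrete, here is a minimal sketch of the arithmetic involved. It is not ms_deisotope's implementation: the function name `to_singly_charged_mz` and the constant `PROTON_MASS` are made up for illustration, and it assumes positive-mode, protonated ions.

```python
PROTON_MASS = 1.00727646688  # approximate proton mass in Da (assumed constant)


def to_singly_charged_mz(mz: float, charge: int) -> float:
    """Convert an m/z observed at `charge` to the equivalent 1+ m/z.

    Hypothetical helper for illustration only; assumes protonated ions.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Remove all charge carriers to get the neutral mass ...
    neutral_mass = charge * (mz - PROTON_MASS)
    # ... then add a single proton back.
    return neutral_mass + PROTON_MASS


# A fragment observed at 2+ (m/z 500.0) maps to a much larger 1+ m/z,
# which is where a singly-charged-only search would look for it.
doubly = to_singly_charged_mz(500.0, 2)
# A peak that is already 1+ is left unchanged by the conversion.
singly = to_singly_charged_mz(doubly, 1)
```

Without this conversion, a search that only generates 1+ theoretical fragments would never match the peak left at m/z 500.0.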

The second is w.r.t. the comment “I do not really have a reason to use one scorer over other rn”. I think your choice of MSDeconVFitter is probably safe, especially for MS2 data. If you’re finding you’re missing peaks downstream, you can safely lower the threshold from 10 to 0 and/or pass retention_strategy=ms_deisotope.deconvolution.TopNRetentionStrategy(50) to deconvolute_peaks. That will keep the top 50 most abundant peaks as singly charged even if they didn’t pass the deconvolution score threshold. Setting the threshold to 0 means the deconvoluter will still reject outright bad matches, but will accept more truncated isotopic patterns. This is especially relevant for Orbitrap data, which tends to discard low-abundance isotopic peaks.
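As a rough illustration of the retention idea described above, the sketch below keeps the N most intense peaks that were not assigned to any isotopic pattern and labels them as charge 1. The function `retain_top_n` and the tuple layout are hypothetical; the real TopNRetentionStrategy in ms_deisotope may apply additional filters that are not modelled here.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def retain_top_n(unassigned: List[Peak], n: int) -> List[Tuple[float, float, int]]:
    """Keep the n most intense unassigned peaks, labelled as charge 1.

    Hypothetical helper for illustration; not the ms_deisotope implementation.
    """
    # Rank leftover peaks by intensity, most intense first, and keep n of them.
    top = sorted(unassigned, key=lambda peak: peak[1], reverse=True)[:n]
    # Return the survivors in m/z order, each tagged with an assumed charge of 1.
    return sorted((mz, intensity, 1) for mz, intensity in top)


# Peaks left over after deconvolution (made-up values).
leftover = [(200.1, 50.0), (350.2, 900.0), (410.3, 10.0), (512.4, 300.0)]
retained = retain_top_n(leftover, 2)
```

With N = 2, only the peaks at m/z 350.2 and 512.4 survive; the two weakest leftovers are dropped.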

At the moment I am using it as:

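import ms_deisotope
from ms_deisotope.deconvolution.utils import prepare_peaklist

# Pair each m/z with its intensity before handing the peaks to ms_deisotope
peaks = prepare_peaklist(
    [
        (mz, inten)
        for mz, inten in zip(curr_spec.mz, curr_spec.intensity)
    ]
)
deconvoluted_peaks, _ = ms_deisotope.deconvolute_peaks(
    peaks,
    averagine=ms_deisotope.peptide,
    scorer=ms_deisotope.MSDeconVFitter(10.0),  # minimum score of 10
    charge_range=(1, 3),
)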

This would entail changing the call to:

deconvoluted_peaks, _ = ms_deisotope.deconvolute_peaks(
    peaks,
    averagine=ms_deisotope.peptide,
    scorer=ms_deisotope.MSDeconVFitter(0),
    retention_strategy=ms_deisotope.deconvolution.TopNRetentionStrategy(50),
    charge_range=(1, 3),
)
@wfondrie wfondrie added this to the Alpha Release milestone Mar 27, 2023
@wfondrie wfondrie added the question Further information is requested label Mar 27, 2023