Find all isotope peaks in a spectrum #185

Open
jorainer opened this issue Mar 12, 2021 · 8 comments

@jorainer
Member

Given a spectrum, find all sets of peaks that could represent isotope groups (e.g. C12, C13 peaks). This functionality could then be used e.g. in a filterIsotopes function or another function to extract just isotope peaks from a Spectra (e.g. to pass it to functions to predict the formula based on the isotope pattern).

@ococrook

@jorainer I have highly accurate isotope distributions and background proportions if you need them, as well as code to simulate the isotope distribution given the sequence.

@jorainer
Member Author

@andreavicini is currently calculating distributions based on all chemical formulas of metabolites from HMDB (human metabolome database). On what did you calculate that?

Our approach is currently simpler than isotope distribution simulation: we're essentially looking for peaks whose difference in m/z matches the expected difference for an isotope (e.g. C12, C13), allowing a user-defined ppm tolerance, and checking that the intensity is lower than a certain threshold.
Would you have a different idea to identify isotope peaks in a peak matrix (i.e. m/z values and intensities from one spectrum)?
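
To make the approach described above concrete, here is a minimal sketch in Python (not the Spectra implementation). It assumes m/z values sorted in ascending order, considers only the 13C - 12C mass difference, and reads the "intensity threshold" as a maximum ratio relative to the preceding peak in the group. The function name `find_isotope_groups` and the parameters `ppm`, `charge` and `max_ratio` are hypothetical and only serve the illustration.

```python
# Mass difference between 13C and 12C in Da (13.0033548378 - 12.0).
C13_C12_DIFF = 1.0033548378


def find_isotope_groups(mz, intensity, ppm=10.0, charge=1, max_ratio=1.0):
    """Group peaks that could be isotope peaks of the same (unknown) compound.

    mz must be sorted in ascending order. Returns a list of index lists,
    one per candidate isotope group. All parameter names are illustrative.
    """
    n = len(mz)
    used = [False] * n
    groups = []
    step = C13_C12_DIFF / charge  # expected m/z spacing for this charge
    for i in range(n):
        if used[i]:
            continue
        group = [i]
        last = i
        for j in range(i + 1, n):
            if used[j]:
                continue
            expected = mz[last] + step
            tol = expected * ppm * 1e-6
            if mz[j] > expected + tol:
                break  # sorted input: no later peak can match
            # Accept the peak if its m/z matches within the ppm tolerance and
            # its intensity is below the (assumed) ratio threshold.
            if abs(mz[j] - expected) <= tol and intensity[j] < max_ratio * intensity[last]:
                group.append(j)
                last = j
        if len(group) > 1:
            for k in group:
                used[k] = True
            groups.append(group)
    return groups


mz = [100.0, 101.00335, 102.0067, 150.0, 200.0, 201.5]
intensity = [1000.0, 110.0, 8.0, 500.0, 300.0, 200.0]
groups = find_isotope_groups(mz, intensity, ppm=10.0)
```

With this toy peak list, only the first three peaks form a candidate group. A fixed `max_ratio` is a simplification: for compounds with many carbons the M+1 peak can be more intense than the monoisotopic peak, so the threshold would have to be chosen accordingly.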

@ococrook

ococrook commented Mar 29, 2021

So currently I'm using, for example for 12C and 13C, the masses c(12.0000000, 13.0033548378) and the proportions prob = c(0.9893, 0.0107), etc. I can then take any sequence and charge, simulate what the isotope distribution looks like as a spectrum, and match the peaks within 2 ppm error of each peak in the reference.
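
As a rough illustration of this idea (not the actual code mentioned above), the following Python sketch simulates a carbon-only isotope pattern from the masses and proportions quoted here and matches observed peaks within 2 ppm. It is a deliberate simplification: only carbon is considered, the abundances follow a binomial distribution over the number of carbon atoms, and the helper names `carbon_isotope_pattern` and `match_within_ppm` are hypothetical.

```python
from math import comb

# Values quoted in the comment above.
MASS_12C, MASS_13C = 12.0, 13.0033548378
PROB_12C, PROB_13C = 0.9893, 0.0107


def carbon_isotope_pattern(mono_mz, n_carbon, charge=1, n_peaks=4):
    """Return (m/z, relative abundance) pairs for the first isotope peaks.

    Assumption: only carbon contributes, so the abundance of the peak with
    k 13C atoms is the binomial probability of k successes in n_carbon trials.
    """
    spacing = (MASS_13C - MASS_12C) / charge
    pattern = []
    for k in range(min(n_peaks, n_carbon + 1)):
        abundance = comb(n_carbon, k) * PROB_13C**k * PROB_12C ** (n_carbon - k)
        pattern.append((mono_mz + k * spacing, abundance))
    return pattern


def match_within_ppm(reference, observed_mz, ppm=2.0):
    """For each reference peak, return the index of the closest observed
    peak within the ppm tolerance, or None if there is none."""
    matches = []
    for ref_mz, _ in reference:
        tol = ref_mz * ppm * 1e-6
        candidates = [i for i, x in enumerate(observed_mz) if abs(x - ref_mz) <= tol]
        if candidates:
            matches.append(min(candidates, key=lambda i: abs(observed_mz[i] - ref_mz)))
        else:
            matches.append(None)
    return matches


# Hypothetical example: a singly charged ion with 6 carbons.
pattern = carbon_isotope_pattern(181.0707, n_carbon=6, charge=1, n_peaks=3)
observed = [181.0707, 182.0741, 183.2]
matches = match_within_ppm(pattern, observed, ppm=2.0)
```

A full simulation for a peptide sequence would combine the distributions of all elements (C, H, N, O, S); the carbon-only version just shows the principle.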

It looks like your use case is slightly different, but I thought I'd share in case it's useful to discuss.

@jorainer
Member Author

OK, if I get you correctly, in your case the sequence (=chemical formula) and the charge are known beforehand. That's definitely also a good use case. Is that somewhat similar to what envipat and Rdisop are doing?

My use case at present is a completely unsupervised one: given a spectrum, identify groups of peaks that could represent isotope peaks of a (yet unknown) compound.

@ococrook

Yep, exactly, mine is more similar to envipat, just that it returns a spectra object so it's easier to use. Though it would also be cool if your unsupervised approach could identify a glyco or phospho group (because that is unknown for us).

Thanks for the clarification - looking forward to the development!

@sgibb
Member

sgibb commented Mar 29, 2021 via email

@jorainer
Member Author

Thanks @sgibb ! I completely forgot about that one!

@hechth

hechth commented Nov 22, 2021

Maybe you can get inspired here: https://github.com/RECETOX/recetox-xMSannotator/blob/main/xmsannotator/R/compute_isotopes.R

The rdkit chem library gives you the isotope pattern, so with some spectral matching you could maybe identify those peaks.


4 participants