Allow rescoring of multiple hits per spectrum #83

Closed
JB91451 opened this issue Oct 26, 2022 · 2 comments
Labels
feature new feature

Comments


JB91451 commented Oct 26, 2022

Dear all,

Thank you for creating this nice tool.

I recently tried to rescore some Comet results (pin files). During the searches I usually set "num_output_lines" to a value greater than 1 so that lower-scoring hits are exported as well, which often improves score adjustment by the TPP/Prophets pipeline. Unfortunately, these lower-ranked hits are also written to the Percolator files and cause the error below. The same might be true for other search engines that can output such hits.
The error traces back to the _get_spectrum_index_column method in percolator.py, where the pattern string discards the charge and hit-rank information from the spectrum identifier (e.g. scans such as ..._623_2_1, ..._623_2_2, ... all become spec id 623).
Would it be possible to discard these lower-ranking hits automatically and emit a warning instead? I guess this would be the cleanest solution, as I am not sure whether Percolator can handle the information properly.
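
To illustrate the collision described above, here is a minimal sketch. It assumes the spectrum identifiers end in `_<scan>_<charge>_<rank>`; the pattern, the `parse_spec_id` helper, and the example IDs (with a made-up `run1` prefix) are hypothetical and are not taken from percolator.py.

```python
import re
from collections import Counter

# Assumed layout: the identifier ends in _<scan>_<charge>_<rank>.
SPEC_ID_PATTERN = re.compile(r"_(?P<scan>\d+)_(?P<charge>\d+)_(?P<rank>\d+)$")


def parse_spec_id(spec_id):
    """Split a spectrum identifier into (scan, charge, rank)."""
    match = SPEC_ID_PATTERN.search(spec_id)
    if match is None:
        raise ValueError(f"Unexpected spectrum identifier: {spec_id!r}")
    return int(match["scan"]), int(match["charge"]), int(match["rank"])


# Hypothetical identifiers: two hits for scan 623, one hit for scan 624.
spec_ids = ["run1_623_2_1", "run1_623_2_2", "run1_624_3_1"]

# Keeping only the scan number throws away charge and rank,
# so both hits for scan 623 end up with the same ID.
scans = [parse_spec_id(spec_id)[0] for spec_id in spec_ids]
duplicates = [scan for scan, count in Counter(scans).items() if count > 1]
print(duplicates)
```

Keeping the rank alongside the scan number, for example as a `(scan, rank)` pair, would keep the two hits distinguishable.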

Best,
Juergen

The error is:
Traceback (most recent call last):
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\__main__.py", line 15, in main
    rescore.run()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\__init__.py", line 233, in run
    peprec = self.pipeline.get_peprec()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\id_file_parser.py", line 224, in get_peprec
    return self.peprec_from_pin()
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\id_file_parser.py", line 179, in peprec_from_pin
    peprec = self.original_pin.to_peptide_record(
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\percolator.py", line 470, in to_peptide_record
    peprec_df["spec_id"] = self._get_spectrum_index_column(
  File "C:\Programs\Python310\lib\site-packages\ms2rescore\percolator.py", line 270, in _get_spectrum_index_column
    raise PercolatorInError("Issue in matching spectrum IDs, duplicates found.")
ms2rescore.percolator.PercolatorInError: Issue in matching spectrum IDs, duplicates found.
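
As a possible stopgap for this error, the lower-rank rows could be dropped from the pin file before rescoring, with a warning, along the lines requested above. The sketch below is not part of MS²Rescore. It assumes a tab-separated pin file with a header row containing a `SpecId` column whose values end in `_<rank>`; the `keep_top_rank` helper is hypothetical. Lines are handled as plain strings because pin rows can have a variable number of protein columns.

```python
import re
import warnings

# Assumed: the SpecId value ends in _<rank>, e.g. ..._623_2_2 is rank 2.
RANK_SUFFIX = re.compile(r"_(\d+)$")


def keep_top_rank(pin_lines, spec_id_column="SpecId"):
    """Return the header plus all rows whose SpecId does not end in a rank > 1."""
    lines = iter(pin_lines)
    header = next(lines).rstrip("\n")
    spec_idx = header.split("\t").index(spec_id_column)

    kept, dropped = [header], 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        spec_id = line.split("\t")[spec_idx]
        match = RANK_SUFFIX.search(spec_id)
        if match and int(match.group(1)) > 1:
            dropped += 1  # lower-rank hit: discard
            continue
        kept.append(line)  # rank 1, or a row without a rank suffix

    if dropped:
        warnings.warn(
            f"Dropped {dropped} lower-rank PSM(s); only rank 1 hits were kept."
        )
    return kept
```

Usage would be something like `keep_top_rank(open("results.pin"))`, writing the returned lines to a new file; the file name here is only a placeholder.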

@ArthurDeclercq
Collaborator

Hi @JB91451,

Thank you for using MS²Rescore! We are aware that rescoring multiple ranks per spectrum is currently not possible. We are working on a major refactoring of MS²Rescore that will address this, so you will be able to provide lower-rank PSMs as well without getting an error!

Thank you for your patience!

@ArthurDeclercq ArthurDeclercq added the feature new feature label Nov 17, 2022
@RalfG RalfG changed the title from "Error when parsing pin files with multiple search hits (e.g. from comet)" to "Allow rescoring of multiple hits per spectrum" Oct 15, 2023
@RalfG
Member

RalfG commented Jul 31, 2024

Control of multi-rank PSM rescoring is now fully implemented in v3.1.0:

https://ms2rescore.readthedocs.io/en/v3.1.0/userguide/configuration/#multi-rank-rescoring

@RalfG RalfG closed this as completed Jul 31, 2024
@RalfG RalfG added this to the v3.1.0 milestone Jul 31, 2024