Support for Percolator output files #103

@wsnoble

It would be great if Percolator PSM-level tab-delimited output files could be supported by FlashLFQ. I looked into this, and there are two major and several minor challenges associated with making this happen.

The first major issue is that Percolator does not include the mzML file name, but only an integer file index, in the output file. This is obviously problematic, and it's something we can look into fixing on the Percolator end of things.

The second major issue is that Percolator does not include retention time. This is harder for us to fix, because this information is also not included in the outputs of many common search engines. It seems like, if you have the scan numbers in the Percolator output, it should be feasible for FlashLFQ to grab the RT from the mzML file. Is this doable?
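To make the question concrete, here is a rough Python sketch of the lookup, meant only as an illustration and not as FlashLFQ code. It builds a scan-number-to-RT map from an mzML file using the PSI-MS `scan start time` term (`MS:1000016`). It assumes Thermo-style spectrum IDs that contain `scan=N`; other vendors use different native ID formats and would need their own parsing. The function name `scan_to_rt_minutes` is made up for this example.

```python
import re
import xml.etree.ElementTree as ET

SCAN_START_TIME = "MS:1000016"  # PSI-MS accession for "scan start time"
UNIT_SECOND = "UO:0000010"      # Unit Ontology accession for seconds


def local_name(tag):
    """Drop the XML namespace prefix, e.g. '{ns}spectrum' -> 'spectrum'."""
    return tag.rsplit("}", 1)[-1]


def scan_to_rt_minutes(mzml_source):
    """Return {scan number: retention time in minutes} for an mzML file.

    mzml_source may be a path or a binary file-like object.
    Assumption: spectrum ids look like '... scan=1234' (Thermo-style).
    """
    rt_by_scan = {}
    for _, elem in ET.iterparse(mzml_source, events=("end",)):
        if local_name(elem.tag) != "spectrum":
            continue
        match = re.search(r"scan=(\d+)", elem.get("id", ""))
        if match:
            for child in elem.iter():
                if (local_name(child.tag) == "cvParam"
                        and child.get("accession") == SCAN_START_TIME):
                    rt = float(child.get("value"))
                    # Convert to minutes when the file reports seconds.
                    if child.get("unitAccession") == UNIT_SECOND:
                        rt /= 60.0
                    rt_by_scan[int(match.group(1))] = rt
                    break
        elem.clear()  # release peak data we do not need
    return rt_by_scan
```

With a map like this per mzML file, each Percolator PSM could be assigned an RT from its scan number, provided the file index can first be resolved to the right mzML file.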

The minor issues are that Percolator does not have a "Base Sequence" column and that it uses comma-delimited protein ID lists rather than semicolon-delimited ones. These, along with differences in column naming, should be easy to handle.
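As a sketch of how small these conversions could be, the two Python helpers below derive a base sequence from a modified peptide string and re-delimit a protein list. They are illustrative only: they assume modifications are written in square brackets (for example `PEPT[79.97]IDE`), optionally with flanking residues such as `K.PEPTIDE.R`, and that protein IDs do not themselves contain commas. The helper names are hypothetical.

```python
import re


def base_sequence(peptide):
    """Strip modifications and flanking residues from a peptide string.

    Assumption: modifications appear as '[...]' annotations.
    """
    # Remove bracketed modification annotations first, since mass
    # values such as 79.97 contain a '.' that would confuse the
    # flanking-residue check below.
    no_mods = re.sub(r"\[[^\]]*\]", "", peptide)
    # Drop flanking residues of the form 'K.SEQUENCE.R' or '-.SEQUENCE.A'.
    flanked = re.fullmatch(r"[A-Z\-]\.(.+)\.[A-Z\-]", no_mods)
    if flanked:
        no_mods = flanked.group(1)
    # Keep only amino acid letters.
    return re.sub(r"[^A-Z]", "", no_mods)


def to_semicolon_list(protein_ids):
    """Convert 'A,B,C' to 'A;B;C', trimming whitespace and empty entries."""
    parts = (p.strip() for p in protein_ids.split(","))
    return ";".join(p for p in parts if p)
```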

Here is a tiny sample Percolator output file.
sample.txt
