Update the Readme
alchemistmatt committed Apr 23, 2020
1 parent 249809c commit fb62175
Showing 1 changed file with 21 additions and 21 deletions.
42 changes: 21 additions & 21 deletions Readme.md
@@ -4,12 +4,12 @@

Converts an .mzid file created by MS-GF+ to a tab-delimited text file.

-Although MS-GF+ has this option (see [MzidToTsv.html](http://htmlpreview.github.io/?https://github.com/sangtaekim/msgfplus/blob/master/doc/MzidToTsv.html) ),
+Although MS-GF+ has this option (see [MzidToTsv.html](https://msgfplus.github.io/msgfplus/MzidToTsv.html) ),
MzidToTsvConverter.exe can convert the .mzid file faster, using less memory.

## Details

-MzidToTsvConverter reads in the mzid from MS-GF+ and creates a nearly identical tsv file as the MS-GF+ converter -- nearly identical because the number formatting is slightly different.
+MzidToTsvConverter reads in the .mzid created by MS-GF+ and creates a nearly identical tsv file as the MS-GF+ converter -- nearly identical because the number formatting is slightly different.

MzidToTsvConverter uses PSI_Interface.dll to read the mzid file.

@@ -65,25 +65,25 @@ MzidToTsvConverter uses PSI_Interface.dll to read the mzid file.

The columns in the .tsv file created by the MzidToTsvConverter are:

-|Column | Description | Example |
-|--------------|--------------|-----------|
-| #SpecFile | Spectrum file name | Dataset.mzML |
-| SpecID | Spectrum ID | controllerType=0 controllerNumber=1 scan=16231 |
-| ScanNum | Scan number | 16231 |
-| ScanTime(Min) | (Can be disable with switch `-ne`) Scan Start time, minutes | 52.534 |
-| FragMethod | Fragmentation method for the given MS/MS spectrum. Will be CID, ETD, or HCD. However, when spectra from the same precursor are merged, fragmentation methods of merged spectra will be shown in the form "FragMethod1/FragMethod2/..." (e.g. CID/ETD, CID/HCD/ETD). | HCD |
-| Precursor | m/z value of the precursor ion | 767.04388 |
-| IsotopeError | Isotope Error, indicating which isotope in the isotopic distribution the parent ion m/z corresponds to. Typically 0, indicating the first isotope. If 1, that means the second isotope was chosen for fragmentation. | 0 |
-| PrecursorError(ppm) | Mass Difference (in ppm) between the observed parent ion and the computed mass of the identified peptide. This value is automatically corrected if the second or third isotope is chosen for fragmentation. | -0.8753 |
-| Charge | Charge state of the parent ion | 3 |
-| Peptide | The identified peptide, with prefix and suffix residues. Also includes a numeric representation of both static and dynamic post translational modifications. | K.VPPAPVPC+57.021PPPS+79.966PGPSAVPSSPK.S |
-| Protein | Name of the protein this peptide comes from | BAG3_HUMAN |
-| DeNovoScore | The MSGFScore of the optimal scoring peptide. Larger scores are better. | 110 |
-| MSGFScore | This is MS-GF+'s main scoring value for the identified peptide. Larger scores are better. | 99 |
-| SpecEValue | This is MS-GF+'s main scoring value related to peptide confidence (spectrum level e-value) of the peptide-spectrum match. MS-GF+ assumes that the peptide with the lowest SpecEValue value (closest to 0) is correct, and all others are incorrect. | 4.23E-21 |
-| EValue | Probability that a match with this SpecEValue is spurious; the lower this number (closer to 0), the better the match. This is a database level e-value, representing the probability that a random PSM has an equal or better score against a random database of the same size. | 9.29E-14 |
-| QValue | If MS-GF+ searches a target/decoy database, the QValue (FDR) is computed based on the distribution of SpecEValue values for forward and reverse hits. If the target/decoy search was not used, this column will be EFDR and is an estimated FDR. | 0 |
-| PepQValue | Peptide-level QValue (FDR) estimated using the target-decoy approach; only shown if a target/decoy search was used. If multiple spectra are matched to the same peptide, only the best-scoring match is retained and used to compute FDR. | 0 |
+|Column | Description | Example |
+|----------------|-------------------------------------------------------------------------|-----------------------|
+| #SpecFile | Spectrum file name | Dataset.mzML |
+| SpecID | Spectrum ID | controllerType=0 controllerNumber=1 scan=16231 |
+| ScanNum | Scan number | 16231 |
+| ScanTime(Min) | (Can be disable with switch `-ne`) Scan Start time, minutes | 52.534 |
+| FragMethod | Fragmentation method for the given MS/MS spectrum. Will be CID, ETD, or HCD. However, when spectra from the same precursor are merged, fragmentation methods of merged spectra will be shown in the form "FragMethod1/FragMethod2/..." (e.g. CID/ETD, CID/HCD/ETD). | HCD |
+| Precursor | m/z value of the precursor ion | 767.04388 |
+| IsotopeError | Isotope Error, indicating which isotope in the isotopic distribution the parent ion m/z corresponds to. Typically 0, indicating the first isotope. If 1, that means the second isotope was chosen for fragmentation. | 0 |
+| PrecursorError(ppm) | Mass Difference (in ppm) between the observed parent ion and the computed mass of the identified peptide. This value is automatically corrected if the second or third isotope is chosen for fragmentation. | -0.8753 |
+| Charge | Charge state of the parent ion | 3 |
+| Peptide | The identified peptide, with prefix and suffix residues. Also includes a numeric representation of both static and dynamic post translational modifications. | K.VPPAPVPC+57.021PPPS+79.966PGPSAVPSSPK.S |
+| Protein | Name of the protein this peptide comes from | BAG3_HUMAN |
+| DeNovoScore | The MSGFScore of the optimal scoring peptide. Larger scores are better. | 110 |
+| MSGFScore | This is MS-GF+'s main scoring value for the identified peptide. Larger scores are better. | 99 |
+| SpecEValue | This is MS-GF+'s main scoring value related to peptide confidence (spectrum level e-value) of the peptide-spectrum match. MS-GF+ assumes that the peptide with the lowest SpecEValue value (closest to 0) is correct, and all others are incorrect. | 4.23E-21 |
+| EValue | Probability that a match with this SpecEValue is spurious; the lower this number (closer to 0), the better the match. This is a database level e-value, representing the probability that a random PSM has an equal or better score against a random database of the same size. | 9.29E-14 |
+| QValue | If MS-GF+ searches a target/decoy database, the QValue (FDR) is computed based on the distribution of SpecEValue values for forward and reverse hits. If the target/decoy search was not used, this column will be EFDR and is an estimated FDR. | 0 |
+| PepQValue | Peptide-level QValue (FDR) estimated using the target-decoy approach; only shown if a target/decoy search was used. If multiple spectra are matched to the same peptide, only the best-scoring match is retained and used to compute FDR. | 0 |

Notes on QValue and PepQValue
* QValue is defined as the minimum false discovery rate (FDR) at which the test may be called significant
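
For readers who want to use the columns documented in this hunk, here is a minimal Python sketch of filtering a converted .tsv file by QValue. It is not part of the commit or of MzidToTsvConverter. The sample rows, the `filter_by_qvalue` helper, and the 0.01 threshold are assumptions made for illustration; the first row reuses the example values from the table and the second row is made up. It also assumes a target/decoy search was used, so the column is named `QValue` rather than `EFDR`.

```python
import csv
import io

# Hypothetical sample standing in for a converted .tsv file. Only a subset of
# the documented columns is included; the second row is invented.
SAMPLE_TSV = (
    "#SpecFile\tScanNum\tCharge\tPeptide\tProtein\tSpecEValue\tQValue\n"
    "Dataset.mzML\t16231\t3\tK.VPPAPVPC+57.021PPPS+79.966PGPSAVPSSPK.S\t"
    "BAG3_HUMAN\t4.23E-21\t0\n"
    "Dataset.mzML\t16240\t2\tR.PEPTIDEK.A\tEXAMPLE_PROTEIN\t1.5E-05\t0.25\n"
)


def filter_by_qvalue(tsv_file, max_qvalue=0.01):
    """Return the rows whose QValue is at or below max_qvalue."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row for row in reader if float(row["QValue"]) <= max_qvalue]


# Keep only the matches that pass the (assumed) 1% FDR cutoff.
kept = filter_by_qvalue(io.StringIO(SAMPLE_TSV))
for row in kept:
    print(row["ScanNum"], row["Peptide"], row["QValue"])
```

To run this against a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="")` where `path` points at the converter's output.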