
MSMS (DDA) workflow

Shubhra Agrawal edited this page Apr 29, 2019 · 1 revision

DDA, or data-dependent acquisition, is a standard way of running MS/MS experiments. An MS1 full scan surveys the metabolite intensities and identifies high-intensity metabolites to fragment. A series of MS1 and MS2 scans then runs in parallel, in which the selected precursors are isolated and fragmented to collect their product ion spectra.

These fragmentation patterns depend on the structure of the metabolite and can therefore differ between two ions of the same m/z. Comparing the fragmentation pattern of a metabolite against publicly available spectral libraries can help identify metabolites much more accurately than LC-MS experiments alone.
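The survey-then-fragment idea can be illustrated with a short Python sketch. It is not the acquisition logic of any particular instrument: the function name `select_precursors`, the `top_n` and `min_intensity` parameters, and the peak values are all made up for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def select_precursors(ms1_peaks: List[Peak], top_n: int = 3,
                      min_intensity: float = 0.0) -> List[float]:
    """Return the m/z values of the top_n most intense MS1 peaks.

    Hypothetical helper: real instruments apply further rules
    (dynamic exclusion, charge-state filters) that are ignored here.
    """
    candidates = [peak for peak in ms1_peaks if peak[1] >= min_intensity]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in candidates[:top_n]]


# Invented survey scan: (m/z, intensity) pairs.
survey_scan = [
    (89.0244, 1.2e5),
    (132.0302, 8.4e6),
    (146.0459, 3.1e6),
    (175.1190, 4.0e4),
    (180.0634, 9.9e5),
]

precursors = select_precursors(survey_scan, top_n=3, min_intensity=1e5)
print(precursors)
```

Each selected m/z would then be isolated and fragmented in its own MS2 scan.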

This is a step-by-step tutorial for the complete DDA workflow to help you get the best results from your MS/MS data.

1. Launch El-MAVEN

image

2. Set import filters

image

Go to the Options dialog and move to the Import tab. These settings need to be changed at the start of the analysis, before any samples are loaded. If the data was collected for different polarities, processing should be done in two different sessions: one with the "Positive Scans Only" option and another with the "Negative Scans Only" option selected from the drop-down.

Additional filters are also available to reduce the amount of data imported into El-MAVEN.

2. Import samples

image

El-MAVEN accepts the following input formats:

  • mzXML
  • mzML
  • CDF
  • emDB (project format)
  • mzrollDB (MAVEN project format)

Click on the "Open" button, select all samples from an experiment and open them. If you have an mzrollDB file from MAVEN, you can import that into El-MAVEN and continue your session as well.
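Before opening a large batch, it can help to check which files have a recognised extension. The sketch below is only a convenience script run outside El-MAVEN. The extension set is inferred from the format list above (`.cdf` in particular is an assumption), and the file names are invented.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Assumed extensions for the formats listed above; ".cdf" is a guess.
SUPPORTED_EXTENSIONS = {".mzxml", ".mzml", ".cdf", ".emdb", ".mzrolldb"}


def split_by_support(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names by whether their extension is in the assumed set."""
    result: Dict[str, List[str]] = {"supported": [], "unsupported": []}
    for path in paths:
        suffix = Path(path).suffix.lower()  # case-insensitive comparison
        key = "supported" if suffix in SUPPORTED_EXTENSIONS else "unsupported"
        result[key].append(path)
    return result


# Invented file names for illustration.
files = ["blank_01.mzML", "sample_A.mzXML", "sample_B.raw", "old_session.mzrollDB"]
report = split_by_support(files)
print(report)
```

Files in the "unsupported" group would need to be converted to one of the listed formats before import.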

image

You can see the loaded samples in the sample widget on the left. If you have the cohort information for this dataset, you can enter that information in the Set column and sort your samples by cohort.

3. Load spectral library

Reference spectral libraries are created by fragmenting known metabolites under controlled conditions. One source of such libraries is the MoNA (MassBank of North America) database. You can download the MS/MS library of your choice for both positive and negative polarity datasets, or create one for your lab.

image

The NIST format (.msp) is human-readable, and contains general metabolite information like name, chemical formula, molecular structure in the form of SMILES, along with experimental information like collision energy, polarity, instrument type and most importantly, the fragments and their relative abundance.
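Because the format is plain text, a record is easy to inspect with a few lines of Python. The sketch below assumes the common layout of `Key: value` lines followed by a `Num Peaks` line and one `m/z intensity` pair per line. Field names vary between libraries, and the record shown is invented, so treat this as an illustration rather than a complete parser.

```python
from typing import Dict, List, Tuple


def parse_msp_record(text: str) -> Tuple[Dict[str, str], List[Tuple[float, float]]]:
    """Split one .msp record into metadata fields and (m/z, intensity) pairs."""
    metadata: Dict[str, str] = {}
    peaks: List[Tuple[float, float]] = []
    in_peaks = False
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if in_peaks:
            # Peak lines: whitespace-separated m/z and intensity.
            mz, intensity = line.split()[:2]
            peaks.append((float(mz), float(intensity)))
        else:
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip()
            if key.strip().lower() == "num peaks":
                in_peaks = True  # everything after this line is peak data
    return metadata, peaks


# Invented record; names and values are placeholders, not real library data.
record = """
Name: Example metabolite
Formula: C6H12O6
Ion_mode: N
Collision_energy: 20
Num Peaks: 3
59.0133 100
71.0133 45
89.0239 80
"""

metadata, peaks = parse_msp_record(record)
print(metadata["Name"], len(peaks))
```

A full library contains many such records, usually separated by blank lines.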

You can import the spectral library into El-MAVEN in the same way as a regular compound database. Open the compound widget, click on 'Open' and navigate to the folder containing the library, as shown below.

image

Once the library has been loaded, you can browse through the metabolites. Double-clicking on a metabolite in this list will display the formula and expected m/z of different fragments.

4. Automated peak detection

Automated peak detection finds peak groups based on the selected compound database or spectral library and generates a Peak Table with group statistics. In the Peaks dialog, you can also set a number of parameters to filter out groups that are not relevant to your analysis.

image

The Match Fragmentation section is enabled for MS/MS data when a spectral library is selected. Switching it on ensures that a metabolite is only assigned to a group if the fragmentation spectrum of the group matches the reference spectrum for that metabolite. You can tweak the following parameters to determine what qualifies as a match:

  • Fragment Mass Tolerance: This value depends on the mass resolution of the instrument the data was collected on. It could be the same as the precursor mass resolution or lower.
  • Match at least X peaks: X denotes the minimum number of fragments that must be in common with the reference spectrum for the metabolite to be considered a match.
  • Minimum Score: A match score is calculated for every spectrum, based on the number of fragments in the group and reference spectra as well as the number of matches and mismatches between the two. The scoring algorithm is based on the hypergeometric distribution. The higher the score, the greater the probability of a genuine match. You can set the minimum threshold for the MS2 score based on observation.
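The three parameters above can be illustrated with a minimal Python sketch. It is not El-MAVEN's scoring code: it assumes a ppm-based tolerance and a simple hypergeometric tail probability over a hypothetical number of m/z bins (`n_bins`). All m/z values and thresholds are invented.

```python
import math
from typing import Sequence


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Absolute mass error in parts per million."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


def count_matches(observed: Sequence[float], reference: Sequence[float],
                  tolerance_ppm: float) -> int:
    """Count reference fragments with an observed fragment within tolerance."""
    return sum(
        any(ppm_error(obs, ref) <= tolerance_ppm for obs in observed)
        for ref in reference
    )


def hypergeometric_score(n_bins: int, n_reference: int,
                         n_observed: int, n_matches: int) -> float:
    """-log10 of the probability of at least n_matches occurring by chance.

    Assumed model: n_observed fragments are drawn at random from n_bins
    possible m/z bins, n_reference of which hold a reference fragment.
    """
    upper = min(n_reference, n_observed)
    n_matches = min(n_matches, upper)
    total = math.comb(n_bins, n_observed)
    tail = sum(
        math.comb(n_reference, k) * math.comb(n_bins - n_reference, n_observed - k)
        for k in range(n_matches, upper + 1)
    )
    return -math.log10(tail / total)


# Invented spectra (m/z only) and arbitrary thresholds.
reference = [59.0133, 71.0133, 89.0239, 101.0239]
observed = [59.0134, 71.0131, 95.0500, 101.0245]

matches = count_matches(observed, reference, tolerance_ppm=10.0)
score = hypergeometric_score(n_bins=1000, n_reference=len(reference),
                             n_observed=len(observed), n_matches=matches)
is_match = matches >= 2 and score >= 1.0  # "Match at least X peaks", "Minimum Score"
print(matches, round(score, 2), is_match)
```

Tightening the tolerance to 5 ppm drops one of the three matches in this example, which shows why the tolerance should reflect the instrument's actual mass resolution.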

For labeled experiments, you can switch on Report Isotopic Peaks and select the relevant labels from the Isotope Detection Options within it.

Group filtering settings are used to reduce the number of irrelevant groups based on intensity and quality thresholds, baseline intensity and more.

Click on Find Peaks once the parameters have been set to your preference and a Peak Table will be generated with the list of groups that qualify for manual curation.

5. Peak curation

image

Manual curation is an important step in processing data through El-MAVEN. Even if you set high thresholds for MS2 score and group filtration, there might be some metabolites that have more than one peak group assigned to them. It is recommended that you go through such groups to determine the best match for a metabolite using the MS1 peaks, MS2 scores and the corresponding fragmentation spectra.

image

Clicking on a group in the Peak Table will bring up the Fragmentation Spectra widget, where you can see the reference spectrum overlaid with the average observed spectrum for the group. The matching reference fragments show up in blue and the mismatches are colored red. The fewer the mismatches, the greater the probability of a correct annotation.

Tip: The Fragmentation Spectra widget shows a purity percentage in the title. A low purity percentage denotes that there were two or more different precursors during the fragmentation event. This information can be very valuable when evaluating a spectral match. For example, most of the reference fragments might have a match, but the observed spectrum contains additional fragments without any corresponding fragments in the reference. If the purity of this spectrum is low, you can be reasonably sure that the extra fragments are coming from some other precursor.
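One common way to think about purity is as the share of ion intensity inside the isolation window that belongs to the intended precursor. The sketch below follows that definition. It is an assumption for illustration and may differ from how El-MAVEN computes the value. The function name, the window width, and the peak values are hypothetical.

```python
from typing import List, Tuple


def precursor_purity(ms1_peaks: List[Tuple[float, float]], precursor_mz: float,
                     isolation_width: float = 1.0,
                     tolerance_ppm: float = 10.0) -> float:
    """Fraction of in-window intensity attributable to the target precursor."""
    half_width = isolation_width / 2
    in_window = [(mz, inten) for mz, inten in ms1_peaks
                 if abs(mz - precursor_mz) <= half_width]
    total = sum(inten for _, inten in in_window)
    if total == 0:
        return 0.0  # nothing was isolated
    target = sum(
        inten for mz, inten in in_window
        if abs(mz - precursor_mz) / precursor_mz * 1e6 <= tolerance_ppm
    )
    return target / total


# Invented MS1 peaks: the second one co-isolates with the target precursor,
# the third lies outside the isolation window.
ms1_peaks = [(180.0634, 8.0e5), (180.3000, 2.0e5), (182.0000, 5.0e5)]
purity = precursor_purity(ms1_peaks, precursor_mz=180.0634)
print(f"{purity:.0%}")
```

Under this definition, the co-isolated ion lowers the purity, and its fragments would appear as unmatched peaks in the observed spectrum.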

Other factors to consider:
image

Since the average group spectrum is created from multiple fragmentation events across samples, the number of MS2 events and their retention times can help during curation. For DDA experiments, you will see markers (solid triangles) on the EIC x-axis. Each marker denotes a fragmentation event (or MS2 event) for the corresponding sample. Clicking on a marker will bring up the individual spectrum obtained during that fragmentation event. A spectrum is more reliable if its marker is closer to the peak top. The MS2 Events List lists all the fragmentation events for the selected EIC and can be toggled from the right-hand toolbar.
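The "closer to the peak top" heuristic can be expressed as a simple ranking by retention-time distance. The sketch below uses invented retention times (in minutes) and a hypothetical helper name. It only illustrates the reasoning and does not reproduce anything El-MAVEN does internally.

```python
from typing import List, Optional


def rank_ms2_events(event_rts: List[float], apex_rt: float,
                    max_distance: Optional[float] = None) -> List[float]:
    """Sort MS2 event retention times by their distance from the peak apex.

    If max_distance is given, events further than that from the apex are dropped.
    """
    ranked = sorted(event_rts, key=lambda rt: abs(rt - apex_rt))
    if max_distance is not None:
        ranked = [rt for rt in ranked if abs(rt - apex_rt) <= max_distance]
    return ranked


# Invented MS2 event retention times for one EIC whose peak apex is at 5.20 min.
events = [5.02, 5.31, 5.18, 6.40]
ranked = rank_ms2_events(events, apex_rt=5.20)
near_apex = rank_ms2_events(events, apex_rt=5.20, max_distance=0.25)
print(ranked, near_apex)
```

Events at the top of the ranking are the ones worth inspecting first. An event far from the apex, like the last one here, may belong to a different peak altogether.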

6. Manual bookmarking

Metabolites in the spectral library are marked green if one or more groups in the Peak Table have been annotated as that metabolite. Once you have selected one group per metabolite in your peak list, you can go through the library to manually bookmark metabolites that might have been missed during automated peak detection.

Open the Fragmentation Spectra widget from the right-hand toolbar. Clicking on any peak group with MS2 events should pull up the fragmentation and reference spectra. If the match seems good, you can double-click the peak highlights. This will open a new Bookmark Table with the selected group added to the list.

Once all metabolites have been found, the Bookmark Table can be merged with your Peak Table by clicking on the corresponding icon in the toolbar and selecting the desired Peak Table.

6. Export

Once the peak list has been curated, you can export the data for further analysis. There are multiple export formats in El-MAVEN:

  • Export to Spreadsheet: You can export your Peak Table to a spreadsheet by clicking on the corresponding icon in the toolbar. This format is a condensed version of your curated data in a comma-separated file, usually preferred for its concise nature. Some of the important data points are compound name, retention time, MS2 scores, sample-wise intensities and isotope abundance.
  • Export EICs to JSON: This option is available in the Peak Table toolbar and exports the EIC data points to a JSON file. Since this format is very data-heavy, it is usually exported as an input to downstream pipelines.
  • Export as SQLite: You can save your session as a SQLite database file by clicking on the Save Project option under the File menu. This format is recommended for automating downstream pipelines, as SQLite databases are much easier to parse than JSON files.
  • Export to Polly: Polly is a cloud platform for metabolomic data analysis and visualisation. Data from El-MAVEN can be used directly for flux analysis or for absolute quantification. You can find more information on Polly and its services here.
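For the SQLite export, Python's built-in `sqlite3` module is enough to start exploring a saved file. The sketch below makes no assumption about the tables El-MAVEN writes: it simply lists whatever tables and columns it finds. The demo database and its `demo_groups` table are invented so that the example runs on its own.

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import Dict, List


def describe_sqlite(path: str) -> Dict[str, List[str]]:
    """Map every table in a SQLite file to its column names."""
    with closing(sqlite3.connect(path)) as conn:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # Column name is the second field of each PRAGMA table_info row.
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }


# Build a throwaway database with a made-up table, then inspect it.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.sqlite")
    with closing(sqlite3.connect(demo_path)) as conn:
        conn.execute("CREATE TABLE demo_groups (id INTEGER, name TEXT, rt REAL)")
        conn.commit()
    schema = describe_sqlite(demo_path)

print(schema)
```

Pointing `describe_sqlite` at your own saved project file shows which tables are available before you write any queries against them.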

7. Save session

El-MAVEN allows you to save your project at any point in your analysis. This project stores all generated peak tables, relevant file paths and user settings in a SQLite file (.emDB). Importing this project in El-MAVEN will restore your analysis to its previous state.

The project can be saved using the Save Project option under the File menu, or by pressing Ctrl+S or Command+S.

Note: Application layout is not saved as part of the project. The state or position of your widgets will not be restored on loading a project.