Implementation of the LC-MS²Struct model published in the manuscript "Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data" by Bach et al.

aalto-ics-kepaco/msms_rt_ssvm

LC-MS²Struct

This package implements a Structured Support Vector Machine (SSVM) model for predicting molecular structures from liquid chromatography (LC) tandem mass spectrometry (MS²) data. This work is part of the publication:

"Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data",

Eric Bach, Emma L. Schymanski and Juho Rousu, 2022

We treat the output of an LC-MS² experiment as a structured output. The structure is assumed to be imposed by the observed retention orders (RO) of the MS features, where each MS feature comprises MS¹ information, an MS² spectrum, and a retention time (RT). We assume that a set of potential molecular structures, the so-called candidate set, can be generated for each MS feature. The idea is to predict a ranking of the candidate structures associated with each feature. The SSVM framework allows us to predict rankings that are not independent of each other but take the observed ROs into account; the ROs are assumed to provide additional constraints that improve the ranking.
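The idea of coupling the candidate rankings through retention orders can be illustrated with a small toy sketch. It is not the package's API or its inference algorithm: the data, the field names (`ms2`, `ret`), the `weight` parameter and both functions are hypothetical. Every joint assignment of candidates is enumerated exhaustively, which is only feasible for tiny examples. Each candidate is then ranked by its max-marginal, i.e. the best joint score among all assignments that contain it.

```python
from itertools import product

# Hypothetical toy data: two MS features, each with its own candidate set.
# "ms2" is an assumed MS² matching score; "ret" is an assumed predicted
# retention score (larger means the candidate is predicted to elute later).
candidates = [
    [{"name": "A1", "ms2": 0.6, "ret": 5.0}, {"name": "A2", "ms2": 0.5, "ret": 1.0}],
    [{"name": "B1", "ms2": 0.7, "ret": 3.0}, {"name": "B2", "ms2": 0.3, "ret": 0.5}],
]
# Observed retention times of the two features (feature 0 elutes first).
observed_rt = [2.1, 7.4]


def joint_score(assignment, observed_rt, weight=1.0):
    """Sum of MS² scores plus a reward for each feature pair whose
    predicted retention order agrees with the observed one."""
    ms2_total = sum(c["ms2"] for c in assignment)
    order_total = 0.0
    n = len(assignment)
    for i in range(n):
        for j in range(i + 1, n):
            observed = observed_rt[j] - observed_rt[i]
            predicted = assignment[j]["ret"] - assignment[i]["ret"]
            if observed * predicted > 0:  # same sign: orders agree
                order_total += 1.0
    return ms2_total + weight * order_total


def max_marginal_rankings(candidates, observed_rt, weight=1.0):
    """Rank the candidates of each feature by their max-marginal score,
    computed by brute force over all joint assignments."""
    best = [{c["name"]: float("-inf") for c in cands} for cands in candidates]
    for assignment in product(*candidates):
        score = joint_score(assignment, observed_rt, weight)
        for i, c in enumerate(assignment):
            best[i][c["name"]] = max(best[i][c["name"]], score)
    return [sorted(b, key=b.get, reverse=True) for b in best]


if __name__ == "__main__":
    print("MS² only:       ", max_marginal_rankings(candidates, observed_rt, weight=0.0))
    print("MS² + ret. order:", max_marginal_rankings(candidates, observed_rt, weight=1.0))
```

In this toy setting, candidate A1 ranks first for feature 0 when only the MS² scores are used. Once retention order agreement is rewarded, A2 moves to the top, because only A2 is consistent with feature 0 eluting before feature 1.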

Installation

To install the package, follow these steps:

  1. Clone the package and change to the directory:
git clone https://github.com/aalto-ics-kepaco/msms_rt_ssvm
cd msms_rt_ssvm
  2. Create a conda environment and install dependencies:
conda env create -f environment.yml
conda activate lcms2struct
  3. Install the package:
pip install .
  4. Leave the package directory:
cd ..
  5. Clone the dependency "msmsrt_scorer", which implements the max-marginal inference (see the paper), and change to its directory:
git clone https://github.com/aalto-ics-kepaco/msms_rt_score_integration
cd msms_rt_score_integration
  6. Install the "msmsrt_scorer" package (it is assumed that the conda environment is active):
pip install .
  7. (Optional) Change back to the msms_rt_ssvm directory and test the package:
cd ../msms_rt_ssvm

# Unpack test databases
gunzip --keep ssvm/tests/Bach2020_test_db.sqlite.gz
gunzip --keep ssvm/tests/Massbank_test_db.sqlite.gz

# Run the tests
python -m unittest discover -s ssvm/tests -p 'unittests*.py'

## Expected output ##
# .............s................s.....................s...................s.....s..................................s......
# ----------------------------------------------------------------------
# Ran 121 tests in 99.599s
# 
# OK (skipped=6)
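
As a quick sanity check after installation, the following sketch tests whether both packages can be located from the active environment. The import names "ssvm" and "msmsrt_scorer" are assumptions, inferred from the test directory path and the package name mentioned above. The helper `check_installed` is hypothetical and not part of either package.

```python
from importlib.util import find_spec


def check_installed(module_names=("ssvm", "msmsrt_scorer")):
    """Return a dict mapping each top-level module name to True if it
    can be found in the current Python environment, else False."""
    return {name: find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    for name, ok in check_installed().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If one of the modules is reported as missing, the conda environment may not be active or the corresponding "pip install ." step may have been skipped.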

All code was developed and tested in a Linux environment. Other operating systems are not supported.

Usage

Usage examples can be found in the repository of the experiments done for the manuscript.

Cite the package

If you use this package, please cite our original publication:

@article {Bach2022,
  author = {Bach, Eric and Schymanski, Emma L. and Rousu, Juho},
  title = {Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data},
  elocation-id = {2022.02.11.480137},
  year = {2022},
  doi = {10.1101/2022.02.11.480137}, 
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2022/04/27/2022.02.11.480137},
  eprint = {https://www.biorxiv.org/content/early/2022/04/27/2022.02.11.480137.full.pdf},
  journal = {bioRxiv}
}

Software citation: DOI
