Chemical acetylation of ligands and two-step digestion protocol for reducing co-digestion in affinity purification-mass spectrometry
David M. Hollenstein, Margarita Maurer-Granofszky, Wolfgang Reiter, Dorothea Anrather, Thomas Gossenreiter, Riccardo Babic, Natascha Hartl, Claudine Kraft, Markus Hartl
This repository provides the complete source code that was used for downstream processing of the mass spectrometry data used in this study and for creating the plots and tables shown in the publication. Please refer to the sections below for detailed instructions on how to reproduce the data analysis from the publication.
The repository includes several folders with the following content:
- database: Contains FASTA files used for peptide spectrum matching with FragPipe (needs to be downloaded from PRIDE).
- distributions: Contains distribution files for the xlsxreport and msreport Python libraries.
- manuscript_analysis: Contains Jupyter notebooks used to generate the plots shown in the manuscript.
- ms_data: Contains RAW files and FragPipe output folders (needs to be downloaded from PRIDE).
- msreport_analysis: Contains Jupyter notebooks used for the downstream processing of the FragPipe output.
- plots: Output folder for files generated by the Jupyter notebooks from the manuscript_analysis folder.
- python_scripts: Contains an additional Python module used to generate TIC plots.
- qtable_data: Contains files exported during the downstream processing of the FragPipe output. These files are used for further analysis and for generating the plots shown in the manuscript.
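As a quick sanity check after cloning, the folder layout described above can be verified with a short script. This is a minimal sketch: the function name `find_missing_folders` is hypothetical and not part of the repository, and it only checks that the folders exist, not their contents.

```python
from pathlib import Path

# Folder names as listed in this README.
EXPECTED_FOLDERS = [
    "database",
    "distributions",
    "manuscript_analysis",
    "ms_data",
    "msreport_analysis",
    "plots",
    "python_scripts",
    "qtable_data",
]


def find_missing_folders(repo_root):
    """Return the expected folder names that are not present under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from inside the 2023_hollenstein_chemical-ligand-acetylation folder.
    missing = find_missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```

If a folder such as ms_data or database is reported as missing, it may simply need to be created before the PRIDE downloads are copied into it.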
To prepare the local repository, follow these steps:
- Clone the repository from GitHub into a local folder that is accessible by JupyterLab (by default somewhere in your user directory).
git clone https://github.com/maxperutzlabs-ms/Publication_Resources
- Navigate to the 2023_hollenstein_chemical-ligand-acetylation folder.
- Download the five zipped FragPipe SEARCH folders from the PRIDE repository and extract them into the ms_data folder.
- Download the two FASTA files from the PRIDE repository and place them into the database folder.
- Optional (only required for generating the TIC and base peak plots):
  - Download the RAW files from the PRIDE repository and copy them into the respective subfolder in the ms_data folder. Information about which RAW files belong to which FragPipe folder can be found in the rawfile_annotation.xlsx table in the PRIDE repository.
  - Download and install ProteoWizard.
  - Use MsConvert from ProteoWizard to convert the RAW files to mzML (enable peak picking on MS1 and MS2 level with the “Prefer Vendor” option).
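For the optional conversion step, the command-line version of MsConvert can be used instead of the graphical interface. The sketch below builds such a call. It assumes that `msconvert` is on the system path and that the `peakPicking vendor msLevel=1-2` filter corresponds to the “Prefer Vendor” peak picking option for MS1 and MS2. The helper name, the example file name, and the example subfolder are hypothetical.

```python
def build_msconvert_command(raw_file, output_dir, executable="msconvert"):
    """Build an msconvert call that converts one RAW file to mzML.

    Vendor peak picking is requested for MS levels 1 and 2 (assumed to
    match the "Prefer Vendor" option in the MsConvert GUI).
    """
    return [
        executable,
        str(raw_file),
        "--mzML",
        "--filter",
        "peakPicking vendor msLevel=1-2",
        "-o",
        str(output_dir),
    ]


# Hypothetical file and folder names, for illustration only.
command = build_msconvert_command(
    "ms_data/example_folder/example.raw", "ms_data/example_folder"
)
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` to run the conversion for each RAW file.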
To set up the Python and R environments, follow these instructions:
- Install Python version 3.9 (we recommend installing Python into a fresh virtual environment).
- Install the required Python libraries using pip and the provided requirements.txt file.
pip install -r ./requirements.txt
- Install the xlsxreport and msreport Python libraries using the files from the distributions directory.
pip install ./distributions/xlsxreport-0.0.6-py3-none-any.whl
pip install ./distributions/msreport-0.0.13-py3-none-any.whl
- Set up a local xlsxreport app folder by running the following script in your terminal.
xlsxreport_setup
- Install JupyterLab by following the installation instructions from the JupyterLab website.
- Install R version 4.2.1 and the LIMMA package version 3.54.2.
- Note that it might be necessary to define the R_HOME system environment variable for running the Jupyter notebooks.
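Before launching JupyterLab, it can be useful to confirm that the environment matches the setup described above. The following is a minimal sketch with a hypothetical helper name (`check_environment`). It only inspects the Python version and whether R_HOME is defined, and does not verify the R or LIMMA versions.

```python
import os
import sys


def check_environment(version_info=None, environ=None):
    """Return a list of warnings about the local Python and R setup."""
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ

    warnings = []
    # The instructions above ask for Python 3.9.
    if tuple(version_info[:2]) != (3, 9):
        found = f"{version_info[0]}.{version_info[1]}"
        warnings.append(f"Python 3.9 is expected, found {found}.")
    # R_HOME may be needed so that the notebooks can locate the R installation.
    if not environ.get("R_HOME"):
        warnings.append("R_HOME is not defined.")
    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("Warning:", message)
```

If R_HOME is reported as undefined, it can be set as a system environment variable pointing to the R 4.2.1 installation directory.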
- Launch JupyterLab by executing the jupyter lab command in your terminal.
- Once JupyterLab is running, navigate to the msreport_analysis folder within JupyterLab's file browser.
- Execute the code from the Jupyter notebooks located in the msreport_analysis folder. The order of execution does not matter.
- Running these notebooks creates the protein and peptide tables that are used in the Jupyter notebooks from the manuscript_analysis folder and saves them into the qtable_data folder.
- The qtable_data folder already contains those output files; running the Jupyter notebooks simply overwrites them.
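As an alternative to running each notebook by hand in JupyterLab, the notebooks could in principle be executed from the command line with `jupyter nbconvert`. The sketch below only builds the commands. The helper name is hypothetical, and it is an assumption that the notebooks run unattended without interactive input.

```python
from pathlib import Path


def build_nbconvert_commands(notebook_folder):
    """Return one 'jupyter nbconvert' call per notebook in notebook_folder."""
    notebooks = sorted(Path(notebook_folder).glob("*.ipynb"))
    return [
        # --execute runs the notebook, --inplace writes the result back to it.
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)]
        for nb in notebooks
    ]


if __name__ == "__main__":
    for command in build_nbconvert_commands("msreport_analysis"):
        print(" ".join(command))
```

Because the order of execution does not matter for the msreport_analysis notebooks, the alphabetical order used here is arbitrary.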
- Launch JupyterLab by executing the jupyter lab command in your terminal.
- Once JupyterLab is running, navigate to the manuscript_analysis folder within JupyterLab's file browser.
- Execute the code from the Jupyter notebooks located in the manuscript_analysis folder.
- Running these notebooks generates the plots shown in the manuscript and saves them into the plots folder.
- The plots folder already contains the graphics used in the manuscript; running the Jupyter notebooks simply overwrites them.
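Since the notebooks overwrite the files shipped with the repository, it may be worth keeping a copy of the original output folders before running them. This is a minimal sketch, with a hypothetical helper name and backup suffix, that copies a folder such as plots or qtable_data next to the original.

```python
import shutil
from pathlib import Path


def backup_folder(folder, suffix="_backup"):
    """Copy folder to a sibling folder named '<folder><suffix>' and return its path.

    shutil.copytree raises FileExistsError if the backup already exists,
    so an earlier backup is never silently replaced.
    """
    source = Path(folder)
    target = source.with_name(source.name + suffix)
    shutil.copytree(source, target)
    return target
```

For example, calling `backup_folder("plots")` from the repository folder would create a plots_backup folder that can be compared against the regenerated plots.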