HLAProphet

HLAProphet is a tool for personalized quantification of HLA proteins in TMT MS/MS data using FragPipe. HLAProphet takes a list of known HLA types for all samples in an experiment and creates a fasta file containing a harmonized list of protein sequences. This HLA fasta file is then appended to an existing reference proteome for use as an augmented search database with FragPipe. After FragPipe has run, HLAProphet uses a modified version of the TMT-Integrator algorithm to quantify HLA proteins at the gene and allele level.

Setup

Install the HLAProphet conda environment
```
conda env create --file HLAProphet.yml
```
Install FragPipe: https://fragpipe.nesvilab.org/
Install MSFragger: https://msfragger.nesvilab.org/
Install Philosopher: https://philosopher.nesvilab.org/

HLAProphet workflow part 1:

Generate HLA types for all samples using any method, formatted as seen in examples/types.csv

Create a local version of the IMGT/HLA protein database using scripts/make_imgt_database.py

```
# DOWNLOAD_DIR is a folder where the source IMGT reference files will be downloaded
# IMGT_FASTA is the filename of the final combined fasta file containing all HLA sequences
python scripts/make_imgt_database.py \
    $DOWNLOAD_DIR \
    $IMGT_FASTA
```

Create an HLA fasta reference using scripts/make_hla_fasta.py. If two separate HLA types produce the same protein product, the protein is only included once in the output database. A relationship table is produced to tie original HLA types to the matching sequence in the HLA fasta, after clashes are resolved.

```
# HLA_TYPES is a table containing HLA types of all samples in an experiment
# IMGT_FASTA is the combined IMGT database fasta file created in the previous step
# HLA_FASTA is the output filename for the fasta file containing all HLA sequences for the experiment
# HLA_RELATIONSHIPS is a relationship table matching original HLA types in $HLA_TYPES to the condensed HLA types in $HLA_FASTA
python scripts/make_hla_fasta.py \
    $HLA_TYPES \
    $IMGT_FASTA \
    $HLA_FASTA \
    $HLA_RELATIONSHIPS
```
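
The condensing of identical protein products described above can be pictured with a small sketch. This is not the code in `scripts/make_hla_fasta.py`: the function name `condense_hla_sequences`, the choice of the first-seen type as the representative, and the toy sequences are all assumptions made for illustration.

```python
def condense_hla_sequences(typed_sequences):
    """Collapse HLA types that share an identical protein sequence.

    typed_sequences: iterable of (hla_type, protein_sequence) pairs.
    Returns (fasta_records, relationships), where
      fasta_records maps each representative type to its sequence
        (one entry per unique sequence), and
      relationships maps every original type to its representative type.
    """
    representative_for_sequence = {}
    fasta_records = {}
    relationships = {}
    for hla_type, sequence in typed_sequences:
        # Assumption: the first type seen for a sequence names the shared entry.
        representative = representative_for_sequence.setdefault(sequence, hla_type)
        fasta_records.setdefault(representative, sequence)
        relationships[hla_type] = representative
    return fasta_records, relationships


# Toy input: the sequences are placeholders, not real HLA proteins.
typed = [
    ("A*01:01:01:01", "MAVMAPRTLLK"),
    ("A*01:01:01:02", "MAVMAPRTLLK"),  # same protein product as the entry above
    ("B*07:02:01:01", "MLVMAPRTVLK"),
]
fasta_records, relationships = condense_hla_sequences(typed)
for hla_type, representative in relationships.items():
    print(hla_type, "->", representative)
```

Keeping the mapping from every original type to its representative is what would let downstream quantification report results under the types supplied in the input table.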

Predict tryptic peptides using scripts/tryptic_peptides.py

```
# HLA_FASTA is the HLA fasta file created in the previous step
# N_MISSED_CLEAVAGE is the number of allowed missed cleavages by trypsin, suggested value is 2
# MIN_PEPTIDE_LENGTH is the minimum tryptic peptide length to keep, suggested value is 7
# MAX_PEPTIDE_LENGTH is the maximum tryptic peptide length to keep, suggested value is 50
# TRYPTIC_PEPTIDES is the filename of the final output file containing all predicted tryptic peptides for all HLA sequences
python scripts/tryptic_peptides.py \
    $HLA_FASTA \
    $N_MISSED_CLEAVAGE \
    $MIN_PEPTIDE_LENGTH \
    $MAX_PEPTIDE_LENGTH \
    $TRYPTIC_PEPTIDES
```
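
To show how the three numeric parameters interact, here is a minimal sketch of in silico tryptic digestion. It assumes the common rule that trypsin cleaves after K or R unless the next residue is P; whether `scripts/tryptic_peptides.py` applies exactly this rule is not stated here, and the function names below are hypothetical.

```python
def cleavage_fragments(sequence):
    """Split a protein after each K or R that is not followed by P (assumed rule)."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(sequence, n_missed_cleavage=2, min_length=7, max_length=50):
    """Return the sorted unique peptides allowed by the missed-cleavage and length limits."""
    fragments = cleavage_fragments(sequence)
    peptides = set()
    for start in range(len(fragments)):
        # Joining k + 1 consecutive fragments corresponds to k missed cleavages.
        for missed in range(n_missed_cleavage + 1):
            end = start + missed + 1
            if end > len(fragments):
                break
            peptide = "".join(fragments[start:end])
            if min_length <= len(peptide) <= max_length:
                peptides.add(peptide)
    return peptides and sorted(peptides) or []


# Toy sequence, not a real HLA protein.
print(tryptic_peptides("AAAKCCCRDDDD", n_missed_cleavage=1, min_length=1, max_length=50))
```

Raising the missed-cleavage limit adds longer peptides that span cleavage sites, while the length bounds discard peptides that are too short or too long to be useful.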

FragPipe workflow

Create a personalized database using Philosopher. This step combines a standard reference proteome (e.g. GENCODE) with the cohort-personalized HLA reference produced by HLAProphet. Be sure to remove existing HLA sequences from the reference proteome before combining.
```
cd examples
philosopher workspace --init
philosopher database --custom $GENCODE --add example_HLA.fa --contam
philosopher workspace --clean
cd ../
```
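
The reference proteome should be stripped of its own HLA entries before the `philosopher database` command above is run. The sketch below shows one way to do that. It assumes that HLA records can be recognized by the substring `|HLA-` in the FASTA header, as in GENCODE-style pipe-delimited headers; the function name `drop_hla_records` and the marker are illustrative choices, so check the headers of your own reference and the genes covered by your types table before relying on it.

```python
def drop_hla_records(fasta_lines, marker="|HLA-"):
    """Yield FASTA lines, skipping every record whose header contains marker."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            # A header decides whether the record (header plus sequence lines) is kept.
            keep = marker not in line
        if keep:
            yield line


# Toy records with placeholder identifiers and sequences.
reference = [
    ">ENSP1|ENST1|ENSG1|-|-|HLA-A-201|HLA-A|365\n",
    "MAVMAPRTLLK\n",
    ">ENSP2|ENST2|ENSG2|-|-|ACTB-201|ACTB|375\n",
    "MDDDIAALVV\n",
]
for line in drop_hla_records(reference):
    print(line, end="")
```

Because the function works on any iterable of lines, it can be applied to an open file handle and the result written to a new FASTA file that is then passed as `$GENCODE`.
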
Save sample manifest in FragPipe using the GUI.
Save experiment workflow in FragPipe using the GUI.

Run FragPipe in headless mode

```
fragpipe --headless \
    --manifest $MANIFEST \
    --workflow $WORKFLOW \
    --threads $NCORES \
    --workdir $FRAGPIPE_WORKDIR \
    --config-msfragger $MSFRAGGER \
    --config-philosopher $PHILOSOPHER
```

HLAProphet workflow part 2:

Run HLA quantification using scripts/hla_quant.py

```
# HLA_RELATIONSHIPS is the relationship table created by make_hla_fasta.py
# TRYPTIC_PEPTIDES is the list of tryptic peptides created by tryptic_peptides.py
# FRAGPIPE_WORKDIR is the folder containing FragPipe outputs
# REF_PATTERN is the pattern used to identify reference pool samples
# PLEX_SIZE is the number of samples per plex
# POOL_N is the number of samples contributing to the reference pool
# OUTDIR is the directory to save output files
# OUT_PREFIX is the prefix to use for output filenames
python scripts/hla_quant.py \
    $HLA_RELATIONSHIPS \
    $TRYPTIC_PEPTIDES \
    $FRAGPIPE_WORKDIR \
    $REF_PATTERN \
    $PLEX_SIZE \
    $POOL_N \
    $OUTDIR \
    $OUT_PREFIX
```
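
For orientation, the sketch below illustrates the general ratio-to-reference idea behind TMT quantification: each PSM's reporter intensities are expressed as log2 ratios to the reference pool channel, and the ratios are summarized per sample by their median. It is a simplified illustration only and does not reproduce the modified algorithm in `scripts/hla_quant.py`; the function name, the input layout, and the channel names are assumptions.

```python
import math
from statistics import median


def log2_ratios_to_reference(psm_intensities, reference_channel):
    """Summarize one protein in one plex as median log2(sample / reference) per channel.

    psm_intensities: list of dicts mapping channel name -> reporter intensity,
    one dict per PSM (hypothetical layout chosen for this sketch).
    """
    ratios = {}
    for psm in psm_intensities:
        reference = psm[reference_channel]
        if reference <= 0:
            continue  # a ratio is undefined without a reference signal
        for channel, intensity in psm.items():
            if channel == reference_channel or intensity <= 0:
                continue
            ratios.setdefault(channel, []).append(math.log2(intensity / reference))
    return {channel: median(values) for channel, values in ratios.items()}


# Toy intensities for two PSMs in a plex with one reference pool channel.
psms = [
    {"pool": 100.0, "sample_1": 200.0, "sample_2": 50.0},
    {"pool": 100.0, "sample_1": 400.0, "sample_2": 50.0},
]
print(log2_ratios_to_reference(psms, reference_channel="pool"))
```

In this picture, `$REF_PATTERN` plays the role of identifying which channel is the reference pool in each plex.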
