Metagenomics pipeline for human diet analysis using organelle genomes

maxibor/organdiet


Introduction

OrganDiet is a Nextflow pipeline that infers human diet from shotgun metagenomics data.

OrganDiet is currently in development. For now, it can only be run on a Linux-based machine.

Dependencies

Quick start

Assuming you already have all databases and the conda environment installed:

conda activate organdiet
nextflow run maxibor/organdiet --reads '*_R{1,2}.fastq.gz' -with-report run_report.html -with-dag flowchart.png
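The `--reads` pattern is quoted so that the shell passes it to Nextflow unexpanded, and it expects paired files ending in `_R1.fastq.gz` and `_R2.fastq.gz`. As a minimal sketch, the hypothetical helper below (`missing_mates` is not part of OrganDiet) lists the R1 files in a directory that have no matching R2 file, which can be worth checking before launching a run:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OrganDiet.
# Prints every <sample>_R1.fastq.gz in a directory that lacks a
# matching <sample>_R2.fastq.gz.
missing_mates() {
    local dir="$1" r1 r2
    for r1 in "$dir"/*_R1.fastq.gz; do
        # When the glob matches nothing, the literal pattern comes back; skip it.
        [ -e "$r1" ] || continue
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        [ -e "$r2" ] || printf '%s\n' "$r1"
    done
}

# Example call for the current directory:
# missing_mates .
```

An empty output means every forward read file has a mate.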

Installation

1. Set up conda environments

wget https://github.com/maxibor/organdiet/archive/v0.2.2.zip
unzip v0.2.2.zip
cd organdiet-0.2.2
conda env create -f envs/organdiet.yml
conda activate organdiet
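Once the environment is active, a quick check that the command-line tools used in this README are on the `PATH` can catch a broken install early. The sketch below uses a hypothetical helper (`missing_tools`); whether the conda environment itself provides each of these tools is an assumption, so treat the tool list in the example call as illustrative:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OrganDiet.
# Prints each command name given as argument that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call with the tools invoked elsewhere in this README:
# missing_tools nextflow bowtie2-build diamond 7z wget ktUpdateTaxonomy.sh
```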

2. Set up Taxonomy database

  • Install taxonomy database: ./bin/basta taxonomy -o ./taxonomy

3. Download the Bowtie2 index for the host genome

From Illumina iGenomes:

mkdir hs_genome
cd hs_genome
wget ftp://igenome:[email protected]/Homo_sapiens/Ensembl/GRCh37/Homo_sapiens_Ensembl_GRCh37.tar.gz
tar -xvzf Homo_sapiens_Ensembl_GRCh37.tar.gz
cd ..
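The extracted archive is a directory tree, and the Bowtie2 index sits somewhere inside it. Since the exact location is not given here, the following sketch searches for it rather than assuming a path. `find_bowtie2_prefixes` is a hypothetical helper; it relies only on the fact that a Bowtie2 index includes a file named `<prefix>.1.bt2`:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OrganDiet.
# Prints the prefix of every Bowtie2 index found under a directory,
# i.e. the path of each <prefix>.1.bt2 file without its suffix.
find_bowtie2_prefixes() {
    find "$1" -type f -name '*.1.bt2' ! -name '*.rev.1.bt2' \
        | sed 's/\.1\.bt2$//' \
        | sort
}

# Example call after extraction:
# find_bowtie2_prefixes hs_genome
```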

4. Download the organellome database and build Bowtie2 index

From NCBI RefSeq organelle genomes:

./bin/download_organellome_db.sh
bowtie2-build organellome_db/organellome.fa organellome_db/organellome
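`bowtie2-build` writes six index files next to the given prefix (`.1.bt2`, `.2.bt2`, `.3.bt2`, `.4.bt2`, `.rev.1.bt2`, `.rev.2.bt2`, or the same names ending in `.bt2l` for a large index). If the build is interrupted, some of them may be missing. A minimal sketch of a completeness check, using a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OrganDiet.
# Returns 0 if all six Bowtie2 index files exist for the given prefix,
# 1 otherwise. Handles both small (.bt2) and large (.bt2l) indexes.
bowtie2_index_complete() {
    local prefix="$1" ext suffix
    if [ -e "$prefix.1.bt2l" ]; then ext=bt2l; else ext=bt2; fi
    for suffix in 1 2 3 4 rev.1 rev.2; do
        [ -e "$prefix.$suffix.$ext" ] || return 1
    done
    return 0
}

# Example call for the index built above:
# bowtie2_index_complete organellome_db/organellome && echo "index OK"
```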

5. nt/nr database setup: two options

Case 1: You plan on using the nr database

Case 1, step 1: Set up the nr database for Diamond
mkdir nr_diamond_db
cd nr_diamond_db
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
gunzip nr.gz
mv nr nr.fa
diamond makedb --in nr.fa -d nr
cd ..

Case 1, step 2: Set up the TaxID mapping database

  • Install prot database:

    ./bin/basta download prot -d ./taxonomy

Case 2: You plan on using the nt database

Case 2, step 1: Download and extract the centrifuge database
mkdir nt_db
cd nt_db
wget http://som1.ific.uv.es/nt/nt.cf.7z
7z e nt.cf.7z
cd ..

Case 2, step 2: Set up the krona mapping database

  • Install krona database:

    ktUpdateTaxonomy.sh ./taxonomy
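The installation steps create a different set of directories depending on whether the nr or the nt database is used. As a rough sketch, the hypothetical helpers below list which of those directories are still missing under a base directory. The directory names are taken from the commands in this section; whether the pipeline expects them at exactly these locations by default is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of OrganDiet.

# Prints the directories created by the installation steps for a
# given database choice ("nr" or "nt").
required_dirs() {
    case "$1" in
        nr) printf '%s\n' taxonomy hs_genome organellome_db nr_diamond_db ;;
        nt) printf '%s\n' taxonomy hs_genome organellome_db nt_db ;;
        *)  echo "unknown database type: $1" >&2; return 1 ;;
    esac
}

# Prints the required directories that do not exist under a base directory.
# $1 = base directory, $2 = "nr" or "nt"
missing_dirs() {
    local base="$1" d
    required_dirs "$2" | while IFS= read -r d; do
        [ -d "$base/$d" ] || printf '%s\n' "$d"
    done
}

# Example call from the installation directory:
# missing_dirs . nr
```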

Get help

nextflow run maxibor/organdiet --help

An example workflow for this pipeline

Credits

The OrganDiet pipeline uses many tools listed below:

The author of OrganDiet also got some inspiration and help from the following awesome developers: