EVTA is a bioinformatics pipeline for analyzing evolutionary changes in genomic variants across time intervals. It takes paired-end sequencing files (FASTQ) from two timepoints and uses the first timepoint as the reference to identify mutations that arose between the two timepoints.
The pipeline consists of the following steps:
- Read QC and Merging: The raw paired sequencing files (FASTQs) are quality-controlled and merged into a single file using bbduk and bbmerge.
- Map inoculum sample against SIVmac239M: The inoculum sample, if provided, is mapped against the SIVmac239M reference genome to create an Inoculum Consensus FASTA. If no inoculum sample is provided, the GenBank reference is used instead.
- Map Timepoint 1 FASTQs to Inoculum reference: The merged FASTQs from Timepoint 1 are mapped to the Inoculum Consensus FASTA to generate a Timepoint 1 Consensus FASTA.
- Map Timepoint 2 FASTQs to Timepoint 1 consensus: The merged FASTQs from Timepoint 2 are mapped to the Timepoint 1 Consensus FASTA.
- Transfer SIVmac239M annotations to new consensus: The annotations from the SIVmac239M reference are transferred to the Timepoint 2 BAM file using liftoff, creating a Revised GFF Annotation File.
- Call variants with the Revised Annotation File: Variants are called using the Revised GFF Annotation File, generating a Variant Call Format (VCF) file.
Optional: The barcode region from SIVmac239M can be removed using bedtools. This is configured in the nextflow.config file.
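The mapping steps above form a chain in which each consensus becomes the reference for the next sample. The following is a minimal sketch of that chain, not the pipeline's actual code: it only prints which reads are mapped against which reference. All file names and the function name `evta_mapping_plan` are hypothetical placeholders, and it assumes that when no inoculum sample is given, the SIVmac239M GenBank reference is used directly as the reference for Timepoint 1.

```shell
#!/usr/bin/env bash
# Sketch of the EVTA reference chain. File names are hypothetical placeholders,
# not the names the pipeline actually writes.
# Prints one tab-separated line per mapping step: reads, reference, output.

evta_mapping_plan() {
  # $1: path to merged inoculum reads, or an empty string if none was provided
  local inoculum_reads="$1"
  local base_ref="SIVmac239M.fasta"
  local inoculum_ref="$base_ref"   # assumed fallback when no inoculum is given

  if [ -n "$inoculum_reads" ]; then
    # Inoculum reads are mapped to SIVmac239M to build the inoculum consensus
    printf '%s\t%s\t%s\n' "$inoculum_reads" "$base_ref" "inoculum_consensus.fasta"
    inoculum_ref="inoculum_consensus.fasta"
  fi

  # Timepoint 1 is mapped to the inoculum reference
  printf '%s\t%s\t%s\n' "timepoint1_merged.fastq" "$inoculum_ref" "timepoint1_consensus.fasta"
  # Timepoint 2 is mapped to the Timepoint 1 consensus
  printf '%s\t%s\t%s\n' "timepoint2_merged.fastq" "timepoint1_consensus.fasta" "timepoint2.bam"
}

evta_mapping_plan "inoculum_merged.fastq"
```

Calling `evta_mapping_plan ""` prints only the two timepoint steps, with Timepoint 1 mapped against the fallback reference.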
This repository contains sample data and sample output files to demonstrate the pipeline's functionality. To run the pipeline on your own data, follow these steps:
- Clone the repository
- Configure the Docker image
- Place your paired-end sequencing files (FASTQs) in the data/ directory
- Mount volumes to the Docker container
- Configure your paths and settings in nextflow.config
- Run the main.nf program
The pipeline will process your data, and the output files will be generated in the configured results directory.
To build the Docker image locally:
docker build -t evta .
OR pull the published image:
docker pull willgardnerbiotech/evta_1.0
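Once an image is available, the mount and run steps might look roughly like the sketch below. It is a dry run that prints a candidate command for review rather than executing it. The container paths `/evta/data` and `/evta/results`, the `results` host directory, the function name `evta_run_cmd`, and the assumption that Nextflow is installed inside the image are all illustrative guesses, not details confirmed by this repository. Adjust them to match your nextflow.config.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints a docker command instead of running it.
# Assumptions (not confirmed by the repository): the container mount points
# /evta/data and /evta/results, and that Nextflow is available in the image.

IMAGE="${IMAGE:-willgardnerbiotech/evta_1.0}"
DATA_DIR="${DATA_DIR:-$PWD/data}"           # host directory holding the FASTQs
RESULTS_DIR="${RESULTS_DIR:-$PWD/results}"  # host directory for pipeline output

evta_run_cmd() {
  # Paths are inserted unquoted, so this sketch assumes they contain no spaces.
  printf 'docker run --rm -v %s:/evta/data -v %s:/evta/results %s nextflow run main.nf\n' \
    "$DATA_DIR" "$RESULTS_DIR" "$IMAGE"
}

evta_run_cmd
```

Setting `IMAGE=evta` before calling the function targets a locally built image instead of the pulled one.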
Contributions are welcome! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License.