From 697f585ee9db1f839356e7a1242742445e907ba6 Mon Sep 17 00:00:00 2001
From: SherineAwad <6249447+SherineAwad@users.noreply.github.com>
Date: Sat, 19 Feb 2022 19:19:34 +0000
Subject: [PATCH] Update README

---
 README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 65eefcb..694f7e7 100644
--- a/README.md
+++ b/README.md
@@ -10,16 +10,17 @@ Snakemake Workflow for Variant Calling
 
 This is a GATK variant calling snakemake pipeline written by Sherine Awad.
 
-We are using GATK4 GVCF mode. To run the pipeline, edit the config file to match your samples file name and reference genome.
-Your samples names should be listed by default in samples.tsv file. You can change this file name in *config file* if neededi by editing the **SAMPLES** entry in the config file.
+We are using GATK4 GVCF mode. To run the pipeline, edit the *config file* to match your samples names and other parameters.
+
+Your samples names should be listed by default in **samples.tsv** file. You can change this file name in *config file* if needed by editing the **SAMPLES** entry in the *config file*.
 
 The pipeline expects samples with suffix ".r_1.fq.gz" and ".r_2.fq.gz" if samples are paired-end. Any prefix before this suffix is the sample name and to be written in the "samples.tsv".
 For single-end reads, the samples suffix is ".fq.gz" and any prefix before this suffix is written in the **"samples.tsv"**.
 For example, if your sample name is sample1.s_1.r_1.fq.gz, then your sample name in the samples file should be sample1.s_1.
 
-You need to update the config file with whether your samples are paired-end or single reads. If your samples are paired-end, then the **PAIRD** entry in the config file should be set to TRUE, otherwise, set the **PAIRED** entry in the config file to FALSE. You can change the **samples.tsv** name in the config file.
+You need to update the *config file* with whether your samples are paired-end or single reads. If your samples are paired-end, then the **PAIRD** entry in the config file should be set to TRUE, otherwise, set the **PAIRED** entry in the config file to FALSE. You can change the **samples.tsv** name in the *config file*.
 
-You need to update your interval list, by editing the **intervals.list** file to list only the chromosomes of interest. You can change the name of this file by editing the config file entry INTERVALS.
+You need to update your interval list, by editing the **intervals.list** file to list only the chromosomes of interest. You can change the name of this file by editing the *config file* entry **INTERVALS**.
 
 The pipeline pulls automatically the resources needed by GATK from Broad Institute resource bundles.
 The pipeline uses **Annovar** for annotations.