getting started and run pipeline
salvidm committed Dec 14, 2023
1 parent 5fe124d commit 1935820
Showing 2 changed files with 5 additions and 4 deletions.
3 changes: 2 additions & 1 deletion docs/getting-started.md
@@ -7,13 +7,14 @@ nav_order: 2
## Getting Started

1. Install [Nextflow](https://www.nextflow.io/docs/latest/getstarted.html#installation) (>=21.04.0).
+ **Windows users**: this [step-by-step](https://www.nextflow.io/blog/2021/setup-nextflow-on-windows.html) tutorial could make your life much easier.

2. Install [Docker](https://docs.docker.com/get-docker/) or [Singularity](https://sylabs.io/).

3. Run the pipeline on a test dataset using Docker to validate your installation.

```
- nextflow run genepi/nf-gwas -r v1.0.0 -profile test,<docker,singularity>
+ nextflow run genepi/nf-gwas -r v<[latest tag](https://github.com/genepi/nf-gwas/tags)> -profile test,<docker,singularity>
```
### Run the pipeline on your data
6 changes: 3 additions & 3 deletions docs/gwas-regenie-101/run-pipeline.md
@@ -7,11 +7,11 @@ nav_order: 3

### Running the nf-gwas pipeline

- To run the pipeline on your data, prepare the phenotype and (optional) covariate files as described [here](https://rgcgithub.github.io/regenie/options/#input)). In addition, you need the genotyping data for step 1 in bim,bed,fam format and your imputed genotypes in VCF or BGEN format. Transfer all these files using FileZilla to the folder of your choice on the server.
+ To run the pipeline on your data, prepare the phenotype and (optional) covariate files as described [here](https://rgcgithub.github.io/regenie/options/#input)). In addition, you need the genotyping data for step 1 in bim,bed,fam format and your imputed genotypes in VCF or BGEN format. Transfer all these files using FileZilla to the folder of your choice on the server.

- Now, you have to prepare a configuration file for the pipeline. For this, you can use any text editor but for example the text editor [Atom](https://atom.io/) is very convenient since it can also highlight different kinds of codes etc. The required and optional parameters for the configuration file are all listed [here](../params/params) of the pipeline. To make your own config file, it is the easiest to copy one of the exemplary [config files](https://github.com/genepi/nf-gwas/tree/main/conf/tests). Adapt all the paths and parameters to fit your data and save the file (e.g. as: first-gwas.config). If you added additional parameters, just make sure, that they are within the curly brackets.
+ Now, we need to prepare a configuration file for the pipeline. You can use any text editor! For example, we use the IDE [Visual Studio Code](https://code.visualstudio.com/), which has some very convenient features, including highlighting different code elements. The required and optional parameters for the configuration file are all listed [here](../params/params) of the pipeline. To make your own config file, it is the easiest to copy one of the exemplary [config files](https://github.com/genepi/nf-gwas/blob/main/conf/test.config). Adapt all the paths and parameters to fit your data and save the file (e.g. as: first-gwas.config). If you've used additional parameters, just make sure that they are within the curly brackets.

- Just one possibly helpful fact on the side here: as indicated on the GitHub repository, the genotypes have to be a single merged file but the imputed genotypes can also be one file per chromosome. If we have them in single files per chromosome we can put the path for example as follows into the configuration file `/home/myHome/GWAS/imputed\_data/\*vcf.gz`. The asterisk (\*) is a wildcard. So it will take all the files from the imputed\_data folder that end with `vcf.gz`.
+ Useful tip: as indicated on the GitHub repository, the genotypes have to be a single merged file but the imputed genotypes can also be one file per chromosome. If we have them in single files per chromosome we can put the path for example as follows into the configuration file `/home/myHome/GWAS/imputed\_data/\*vcf.gz`. The asterisk (\*) is a wildcard. So it will take all the files from the imputed\_data folder that end with `vcf.gz`.
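
To preview which files a wildcard like `*vcf.gz` would pick up, a quick shell check can help before writing the path into the config. The sketch below is only an illustration: it builds a scratch directory with made-up file names (standing in for `/home/myHome/GWAS/imputed_data`) and assumes that the pipeline resolves this simple pattern the same way the shell does.

```shell
# Hypothetical scratch directory standing in for the imputed_data folder
data_dir="$(mktemp -d)"

# Made-up example files: three per-chromosome VCFs plus two files that should NOT match
touch "$data_dir/chr1.vcf.gz" "$data_dir/chr2.vcf.gz" "$data_dir/chr22.vcf.gz"
touch "$data_dir/chr1.vcf.gz.tbi" "$data_dir/notes.txt"

# Same pattern as in the config path: every file whose name ends in vcf.gz
matched=0
for f in "$data_dir"/*vcf.gz; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern if nothing matches
    basename "$f"
    matched=$((matched + 1))
done
echo "matched $matched file(s)"

# Clean up the scratch directory
rm -r "$data_dir"
```

On a real server, running `ls /home/myHome/GWAS/imputed_data/*vcf.gz` gives the same kind of preview; index files such as `.tbi` are not matched because their names do not end in `vcf.gz`.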

Now you can transfer the file via FileZilla to your folder of choice on the server (as an example let's say we put the `first-gwas.config` into the folder `/home/myHome/GWAS`).
