diff --git a/.gitattributes b/.gitattributes index 7fe55006..050bb120 100644 --- a/.gitattributes +++ b/.gitattributes @@ -1 +1,3 @@ *.config linguist-language=nextflow +modules/nf-core/** linguist-generated +subworkflows/nf-core/** linguist-generated diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index d1b57a92..b4bff9b6 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -68,16 +68,13 @@ If you wish to contribute a new step, please use the following coding standards: 1. Define the corresponding input channel into your new process from the expected previous process channel 2. Write the process block (see below). 3. Define the output channel if needed (see below). -4. Add any new flags/options to `nextflow.config` with a default (see below). -5. Add any new flags/options to `nextflow_schema.json` with help text (with `nf-core schema build`). -6. Add any new flags/options to the help message (for integer/text parameters, print to help the corresponding `nextflow.config` parameter). -7. Add sanity checks for all relevant parameters. -8. Add any new software to the `scrape_software_versions.py` script in `bin/` and the version command to the `scrape_software_versions` process in `main.nf`. -9. Do local tests that the new code works properly and as expected. -10. Add a new test command in `.github/workflow/ci.yml`. -11. If applicable add a [MultiQC](https://https://multiqc.info/) module. -12. Update MultiQC config `assets/multiqc_config.yaml` so relevant suffixes, name clean up, General Statistics Table column order, and module figures are in the right order. -13. Optional: Add any descriptions of MultiQC report sections and output files to `docs/output.md`. +4. Add any new parameters to `nextflow.config` with a default (see below). +5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core schema build` tool). +6. Add sanity checks and validation for all relevant parameters. +7. Perform local tests to validate that the new code works as expected. +8. If applicable, add a new test command in `.github/workflow/ci.yml`. +9. Update MultiQC config `assets/multiqc_config.yaml` so relevant suffixes, file name clean-up and module plots are in the appropriate order. If applicable, add a [MultiQC](https://multiqc.info/) module. +10. Add a description of the output files and, if relevant, any appropriate images from the MultiQC report to `docs/output.md`. ### Default values @@ -102,27 +99,6 @@ Please use the following naming schemes, to make it easy to understand what is g If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core bump-version --nextflow . [min-nf-version]` -### Software version reporting - -If you add a new tool to the pipeline, please ensure you add the information of the tool to the `get_software_version` process. - -Add to the script block of the process, something like the following: - -```bash - --version &> v_.txt 2>&1 || true -``` - -or - -```bash - --help | head -n 1 &> v_.txt 2>&1 || true -``` - -You then need to edit the script `bin/scrape_software_versions.py` to: - -1. Add a Python regex for your tool's `--version` output (as in stored in the `v_.txt` file), to ensure the version is reported as a `v` and the version number e.g. `v2.1.1` -2. Add a HTML entry to the `OrderedDict` for formatting in MultiQC.
- ### Images and figures For overview images and other documents we follow the nf-core [style guidelines and examples](https://nf-co.re/developers/design_guidelines). diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md deleted file mode 100644 index 429da641..00000000 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ /dev/null @@ -1,63 +0,0 @@ ---- -name: Bug report -about: Report something that is broken or incorrect -labels: bug ---- - - - -## Check Documentation - -I have checked the following places for your error: - -- [ ] [nf-core website: troubleshooting](https://nf-co.re/usage/troubleshooting) -- [ ] [nf-core/viralrecon pipeline documentation](https://nf-co.re/viralrecon/usage) - -## Description of the bug - - - -## Steps to reproduce - -Steps to reproduce the behaviour: - -1. Command line: -2. See error: - -## Expected behaviour - - - -## Log files - -Have you provided the following extra information/files: - -- [ ] The command used to run the pipeline -- [ ] The `.nextflow.log` file - -## System - -- Hardware: -- Executor: -- OS: -- Version - -## Nextflow Installation - -- Version: - -## Container engine - -- Engine: -- version: - -## Additional context - - diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 00000000..f332a0b2 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,52 @@ + +name: Bug report +description: Report something that is broken or incorrect +labels: bug +body: + + - type: markdown + attributes: + value: | + Before you post this issue, please check the documentation: + + - [nf-core website: troubleshooting](https://nf-co.re/usage/troubleshooting) + - [nf-core/viralrecon pipeline documentation](https://nf-co.re/viralrecon/usage) + + - type: textarea + id: description + attributes: + label: Description of the bug + description: A clear and concise description of what the bug is. + validations: + required: true + + - type: textarea + id: command_used + attributes: + label: Command used and terminal output + description: Steps to reproduce the behaviour. Please paste the command you used to launch the pipeline and the output from your terminal. + render: console + placeholder: | + $ nextflow run ... + + Some output where something broke + + - type: textarea + id: files + attributes: + label: Relevant files + description: | + Please drag and drop the relevant files here. Create a `.zip` archive if the extension is not allowed. + Your verbose log file `.nextflow.log` is often useful _(this is a hidden file in the directory where you launched the pipeline)_ as well as custom Nextflow configuration files. + + - type: textarea + id: system + attributes: + label: System information + description: | + * Nextflow version _(eg. 21.10.3)_ + * Hardware _(eg. HPC, Desktop, Cloud)_ + * Executor _(eg. slurm, local, awsbatch)_ + * Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter or Charliecloud)_ + * OS _(eg. CentOS Linux, macOS, Linux Mint)_ + * Version of nf-core/viralrecon _(eg. 
1.1, 1.5, 1.8.2)_ diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml index 0481d615..040a0ddb 100644 --- a/.github/ISSUE_TEMPLATE/config.yml +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -1,4 +1,3 @@ -blank_issues_enabled: false contact_links: - name: Join nf-core url: https://nf-co.re/join diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md deleted file mode 100644 index 2a55ff47..00000000 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ /dev/null @@ -1,32 +0,0 @@ ---- -name: Feature request -about: Suggest an idea for the nf-core/viralrecon pipeline -labels: enhancement ---- - - - -## Is your feature request related to a problem? Please describe - - - - - -## Describe the solution you'd like - - - -## Describe alternatives you've considered - - - -## Additional context - - diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 00000000..b29e283d --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,11 @@ +name: Feature request +description: Suggest an idea for the nf-core/viralrecon pipeline +labels: enhancement +body: + - type: textarea + id: description + attributes: + label: Description of feature + description: Please describe your suggestion for a new feature. It might help to describe a problem or use case, plus any alternatives that you have considered. + validations: + required: true diff --git a/.github/workflows/awsfulltest.yml b/.github/workflows/awsfulltest.yml index b265065d..f5c7195f 100644 --- a/.github/workflows/awsfulltest.yml +++ b/.github/workflows/awsfulltest.yml @@ -18,10 +18,11 @@ jobs: platform: ["illumina", "nanopore"] steps: - name: Launch workflow via tower - uses: nf-core/tower-action@master + uses: nf-core/tower-action@v2 + with: workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }} - bearer_token: ${{ secrets.TOWER_BEARER_TOKEN }} + access_token: ${{ secrets.TOWER_ACCESS_TOKEN }} compute_env: ${{ secrets.TOWER_COMPUTE_ENV }} pipeline: ${{ github.repository }} revision: ${{ github.sha }} @@ -30,4 +31,4 @@ jobs: { "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/viralrecon/results-${{ github.sha }}/platform_${{ matrix.platform }}" } - profiles: '[ "test_full_${{ matrix.platform }}", "aws_tower" ]' + profiles: test_full_${{ matrix.platform }},aws_tower diff --git a/.github/workflows/awstest.yml b/.github/workflows/awstest.yml index 4b1b78d8..b20a19c2 100644 --- a/.github/workflows/awstest.yml +++ b/.github/workflows/awstest.yml @@ -11,17 +11,17 @@ jobs: runs-on: ubuntu-latest steps: - name: Launch workflow via tower - uses: nf-core/tower-action@master + uses: nf-core/tower-action@v2 with: workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }} - bearer_token: ${{ secrets.TOWER_BEARER_TOKEN }} + access_token: ${{ secrets.TOWER_ACCESS_TOKEN }} compute_env: ${{ secrets.TOWER_COMPUTE_ENV }} pipeline: ${{ github.repository }} revision: ${{ github.sha }} workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/viralrecon/work-${{ github.sha }} parameters: | { - "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/viralrecon/results-${{ github.sha }}" + "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/viralrecon/results-test-${{ github.sha }}" } - profiles: '[ "test", "aws_tower" ]' + profiles: test,aws_tower diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index ecd5edef..2c999500 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -8,26 +8,36 @@ on: release: types: [published] +env: + NXF_ANSI_LOG: 
false + CAPSULE_LOG: none + jobs: test: name: Run workflow tests # Only run on push if this is the nf-core dev branch (merged PRs) if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/viralrecon') }} runs-on: ubuntu-latest - env: - NXF_VER: ${{ matrix.nxf_ver }} - NXF_ANSI_LOG: false strategy: matrix: - # Nextflow versions: check pipeline minimum and current latest - nxf_ver: ["21.04.0", ""] + # Nextflow versions + include: + # Test pipeline minimum Nextflow version + - NXF_VER: '21.10.3' + NXF_EDGE: '' + # Test latest edge release of Nextflow + - NXF_VER: '' + NXF_EDGE: '1' steps: - name: Check out pipeline code uses: actions/checkout@v2 - name: Install Nextflow env: - CAPSULE_LOG: none + NXF_VER: ${{ matrix.NXF_VER }} + # Uncomment only if the edge release is more recent than the latest stable release + # See https://github.com/nextflow-io/nextflow/issues/2467 + # NXF_EDGE: ${{ matrix.NXF_EDGE }} run: | wget -qO- get.nextflow.io | bash sudo mv nextflow /usr/local/bin/ @@ -40,29 +50,24 @@ jobs: name: Test workflow parameters if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/viralrecon') }} runs-on: ubuntu-latest - env: - NXF_VER: ${{ matrix.nxf_ver }} - NXF_ANSI_LOG: false strategy: matrix: parameters: - [ - --skip_fastp, - --skip_variants, - --skip_cutadapt, - --skip_kraken2, - --skip_assembly, - "--spades_mode corona", - "--spades_mode metaviral", - "--skip_plasmidid false --skip_asciigenome", - ] + - "--consensus_caller ivar" + - "--variant_caller bcftools --consensus_caller ivar" + - "--skip_fastp --skip_pangolin" + - "--skip_variants" + - "--skip_cutadapt --skip_snpeff" + - "--skip_kraken2" + - "--skip_assembly" + - "--spades_mode corona" + - "--spades_mode metaviral" + - "--skip_plasmidid false --skip_asciigenome" steps: - name: Check out pipeline code uses: actions/checkout@v2 - name: Install Nextflow - env: - CAPSULE_LOG: none run: | wget -qO- get.nextflow.io | bash sudo mv nextflow /usr/local/bin/ @@ -75,19 +80,16 @@ jobs: name: Test SISPA workflow if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/viralrecon') }} runs-on: ubuntu-latest - env: - NXF_VER: ${{ matrix.nxf_ver }} - NXF_ANSI_LOG: false strategy: matrix: - parameters: [--gff false, "--genome 'NC_045512.2'"] + parameters: + - "--gff false" + - "--genome 'NC_045512.2'" steps: - name: Check out pipeline code uses: actions/checkout@v2 - name: Install Nextflow - env: - CAPSULE_LOG: none run: | wget -qO- get.nextflow.io | bash sudo mv nextflow /usr/local/bin/ @@ -100,26 +102,19 @@ jobs: name: Test Nanopore workflow if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/viralrecon') }} runs-on: ubuntu-latest - env: - NXF_VER: ${{ matrix.nxf_ver }} - NXF_ANSI_LOG: false strategy: matrix: parameters: - [ - --gff false, - --input false, - --min_barcode_reads 10000, - --min_guppyplex_reads 10000, - "--artic_minion_caller medaka --sequencing_summary false --fast5_dir false", - ] + - "--gff false" + - "--input false" + - "--min_barcode_reads 10000" + - "--min_guppyplex_reads 10000" + - "--artic_minion_caller medaka --sequencing_summary false --fast5_dir false" steps: - name: Check out pipeline code uses: actions/checkout@v2 - name: Install Nextflow - env: - CAPSULE_LOG: none run: | wget -qO- get.nextflow.io | bash sudo mv nextflow /usr/local/bin/ diff --git a/.github/workflows/linting_comment.yml b/.github/workflows/linting_comment.yml 
index 90f03c6f..44d72994 100644 --- a/.github/workflows/linting_comment.yml +++ b/.github/workflows/linting_comment.yml @@ -15,6 +15,7 @@ jobs: uses: dawidd6/action-download-artifact@v2 with: workflow: linting.yml + workflow_conclusion: completed - name: Get PR number id: pr_number diff --git a/CHANGELOG.md b/CHANGELOG.md index 9bd587a8..bde3bd6a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,7 +3,74 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). -## [[2.2](https://github.com/nf-core/rnaseq/releases/tag/2.2)] - 2021-07-29 +## [[2.3](https://github.com/nf-core/viralrecon/releases/tag/2.3)] - 2022-02-04 + +### :warning: Major enhancements + +* Please see [Major updates in v2.3](https://github.com/nf-core/viralrecon/issues/271) for a more detailed list of changes added in this version. +* When using `--protocol amplicon`, in the previous release, iVar was used for both the variant calling and consensus sequence generation. The pipeline will now perform the variant calling and consensus sequence generation with iVar and BCFTools/BEDTools, respectively. +* Bump minimum Nextflow version from `21.04.0` -> `21.10.3` + +### Enhancements & fixes + +* Port pipeline to the updated Nextflow DSL2 syntax adopted on nf-core/modules +* Updated pipeline template to [nf-core/tools 2.2](https://github.com/nf-core/tools/releases/tag/2.2) +* [[#209](https://github.com/nf-core/viralrecon/issues/209)] - Check that contig in primer BED and genome fasta match +* [[#218](https://github.com/nf-core/viralrecon/issues/218)] - Support for compressed FastQ files for Nanopore data +* [[#232](https://github.com/nf-core/viralrecon/issues/232)] - Remove duplicate variants called by ARTIC ONT pipeline +* [[#235](https://github.com/nf-core/viralrecon/issues/235)] - Nextclade version bump +* [[#244](https://github.com/nf-core/viralrecon/issues/244)] - Fix BCFtools consensus generation and masking +* [[#245](https://github.com/nf-core/viralrecon/issues/245)] - Mpileup file as output +* [[#246](https://github.com/nf-core/viralrecon/issues/246)] - Option to generate consensus with BCFTools / BEDTools using iVar variants +* [[#247](https://github.com/nf-core/viralrecon/issues/247)] - Add strand-bias filtering option and codon fix in consecutive positions in ivar tsv conversion to vcf +* [[#248](https://github.com/nf-core/viralrecon/issues/248)] - New variants reporting table + +### Parameters + +| Old parameter | New parameter | +|-------------------------------|---------------------------------------| +| | `--nextclade_dataset` | +| | `--nextclade_dataset_name` | +| | `--nextclade_dataset_reference` | +| | `--nextclade_dataset_tag` | +| | `--skip_consensus_plots` | +| | `--skip_variants_long_table` | +| | `--consensus_caller` | +| `--callers` | `--variant_caller` | + +> **NB:** Parameter has been __updated__ if both old and new parameter information is present. +> **NB:** Parameter has been __added__ if just the new parameter information is present. +> **NB:** Parameter has been __removed__ if new parameter information isn't present. + +### Software dependencies + +Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. 
However, the overall software dependency changes compared to the last release have been listed below for reference. + +| Dependency | Old version | New version | +|-------------------------------|-------------|-------------| +| `bcftools` | 1.11 | 1.14 | +| `blast` | 2.10.1 | 2.12.0 | +| `bowtie2` | 2.4.2 | 2.4.4 | +| `cutadapt` | 3.2 | 3.5 | +| `fastp` | 0.20.1 | 0.23.2 | +| `kraken2` | 2.1.1 | 2.1.2 | +| `minia` | 3.2.4 | 3.2.6 | +| `mosdepth` | 0.3.1 | 0.3.2 | +| `nanoplot` | 1.36.1 | 1.39.0 | +| `nextclade` | | 1.10.2 | +| `pangolin` | 3.1.7 | 3.1.19 | +| `picard` | 2.23.9 | 2.26.10 | +| `python` | 3.8.3 | 3.9.5 | +| `samtools` | 1.10 | 1.14 | +| `spades` | 3.15.2 | 3.15.3 | +| `tabix` | 0.2.6 | 1.11 | +| `vcflib` | | 1.0.2 | + +> **NB:** Dependency has been __updated__ if both old and new version information is present. +> **NB:** Dependency has been __added__ if just the new version information is present. +> **NB:** Dependency has been __removed__ if new version information isn't present. + +## [[2.2](https://github.com/nf-core/viralrecon/releases/tag/2.2)] - 2021-07-29 ### Enhancements & fixes @@ -26,7 +93,7 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi > **NB:** Dependency has been __added__ if just the new version information is present. > **NB:** Dependency has been __removed__ if new version information isn't present. -## [[2.1](https://github.com/nf-core/rnaseq/releases/tag/2.1)] - 2021-06-15 +## [[2.1](https://github.com/nf-core/viralrecon/releases/tag/2.1)] - 2021-06-15 ### Enhancements & fixes @@ -67,7 +134,7 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi > **NB:** Dependency has been __added__ if just the new version information is present. > **NB:** Dependency has been __removed__ if new version information isn't present. -## [[2.0](https://github.com/nf-core/rnaseq/releases/tag/2.0)] - 2021-05-13 +## [[2.0](https://github.com/nf-core/viralrecon/releases/tag/2.0)] - 2021-05-13 ### :warning: Major enhancements @@ -220,7 +287,7 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi > **NB:** Dependency has been __added__ if just the new version information is present. > **NB:** Dependency has been __removed__ if new version information isn't present. -## [[1.1.0](https://github.com/nf-core/rnaseq/releases/tag/1.1.0)] - 2020-06-23 +## [[1.1.0](https://github.com/nf-core/viralrecon/releases/tag/1.1.0)] - 2020-06-23 ### Added @@ -263,7 +330,7 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi * Update minia `3.2.3` -> `3.2.4` * Update plasmidid `1.5.2` -> `1.6.3` -## [[1.0.0](https://github.com/nf-core/rnaseq/releases/tag/1.0.0)] - 2020-06-01 +## [[1.0.0](https://github.com/nf-core/viralrecon/releases/tag/1.0.0)] - 2020-06-01 Initial release of nf-core/viralrecon, created with the [nf-core](http://nf-co.re/) template. diff --git a/CITATIONS.md b/CITATIONS.md index 6a3f41ef..ca45fbab 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -91,6 +91,9 @@ * [Unicycler](https://www.ncbi.nlm.nih.gov/pubmed/28594827/) > Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017 Jun 8;13(6):e1005595. doi: 10.1371/journal.pcbi.1005595. eCollection 2017 Jun. PubMed PMID: 28594827; PubMed Central PMCID: PMC5481147. +* [Vcflib](https://www.biorxiv.org/content/early/2021/05/23/2021.05.21.445151) + > Garrison E, Kronenberg ZN, Dawson ET, Pedersen BS, Prins P.
Vcflib and tools for processing the VCF variant call format. bioRxiv 2021 May. doi: 10.1101/2021.05.21.445151. + ## Software packaging/containerisation tools * [Anaconda](https://anaconda.com) diff --git a/README.md b/README.md index 417b8336..f915e315 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,11 @@ -# ![nf-core/viralrecon](docs/images/nf-core-viralrecon_logo.png) +# ![nf-core/viralrecon](docs/images/nf-core-viralrecon_logo_light.png#gh-light-mode-only) ![nf-core/viralrecon](docs/images/nf-core-viralrecon_logo_dark.png#gh-dark-mode-only) [![GitHub Actions CI Status](https://github.com/nf-core/viralrecon/workflows/nf-core%20CI/badge.svg)](https://github.com/nf-core/viralrecon/actions?query=workflow%3A%22nf-core+CI%22) [![GitHub Actions Linting Status](https://github.com/nf-core/viralrecon/workflows/nf-core%20linting/badge.svg)](https://github.com/nf-core/viralrecon/actions?query=workflow%3A%22nf-core+linting%22) [![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/viralrecon/results) [![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.3901628-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.3901628) -[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A521.04.0-23aa62.svg?labelColor=000000)](https://www.nextflow.io/) +[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A521.10.3-23aa62.svg?labelColor=000000)](https://www.nextflow.io/) [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) @@ -28,6 +28,8 @@ The pipeline has numerous options to allow you to run only specific aspects of t The SRA download functionality has been removed from the pipeline (`>=2.1`) and ported to an independent workflow called [nf-core/fetchngs](https://nf-co.re/fetchngs). You can provide `--nf_core_pipeline viralrecon` when running nf-core/fetchngs to download and auto-create a samplesheet containing publicly available samples that can be accepted directly by the Illumina processing mode of nf-core/viralrecon. +A number of improvements were made to the pipeline recently, mainly with regard to the variant calling. Please see [Major updates in v2.3](https://github.com/nf-core/viralrecon/issues/271) for a more detailed description. + ### Illumina 1. Merge re-sequenced FastQ files ([`cat`](http://www.linfo.org/cat.html)) @@ -41,13 +43,14 @@ 4. Duplicate read marking ([`picard`](https://broadinstitute.github.io/picard/); *optional*) 5. Alignment-level QC ([`picard`](https://broadinstitute.github.io/picard/), [`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/)) 6. Genome-wide and amplicon coverage QC plots ([`mosdepth`](https://github.com/brentp/mosdepth/)) - 7. Choice of multiple variant calling and consensus sequence generation routes ([`iVar variants and consensus`](https://github.com/andersen-lab/ivar); *default for amplicon data* *||* [`BCFTools`](http://samtools.github.io/bcftools/bcftools.html), [`BEDTools`](https://github.com/arq5x/bedtools2/); *default for metagenomics data*) + 7.
Choice of multiple variant callers ([`iVar variants`](https://github.com/andersen-lab/ivar); *default for amplicon data* *||* [`BCFTools`](http://samtools.github.io/bcftools/bcftools.html); *default for metagenomics data*) * Variant annotation ([`SnpEff`](http://snpeff.sourceforge.net/SnpEff.html), [`SnpSift`](http://snpeff.sourceforge.net/SnpSift.html)) + * Individual variant screenshots with annotation tracks ([`ASCIIGenome`](https://asciigenome.readthedocs.io/en/latest/)) + 8. Choice of multiple consensus callers ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html), [`BEDTools`](https://github.com/arq5x/bedtools2/); *default for both amplicon and metagenomics data* *||* [`iVar consensus`](https://github.com/andersen-lab/ivar)) * Consensus assessment report ([`QUAST`](http://quast.sourceforge.net/quast)) * Lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)) * Clade assignment, mutation calling and sequence quality checks ([`Nextclade`](https://github.com/nextstrain/nextclade)) - * Individual variant screenshots with annotation tracks ([`ASCIIGenome`](https://asciigenome.readthedocs.io/en/latest/)) - 8. Intersect variants across callers ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html)) + 9. Create variants long format table collating per-sample information for individual variants ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html)), functional effect prediction ([`SnpSift`](http://snpeff.sourceforge.net/SnpSift.html)) and lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)) 6. _De novo_ assembly 1. Primer trimming ([`Cutadapt`](https://cutadapt.readthedocs.io/en/stable/guide.html); *amplicon data only*) 2. Choice of multiple assembly tools ([`SPAdes`](http://cab.spbu.ru/software/spades/) *||* [`Unicycler`](https://github.com/rrwick/Unicycler) *||* [`minia`](https://github.com/GATB/minia)) @@ -72,22 +75,26 @@ The SRA download functionality has been removed from the pipeline (`>=2.1`) and * Lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)) * Clade assignment, mutation calling and sequence quality checks ([`Nextclade`](https://github.com/nextstrain/nextclade)) * Individual variant screenshots with annotation tracks ([`ASCIIGenome`](https://asciigenome.readthedocs.io/en/latest/)) + * Create variants long format table collating per-sample information for individual variants ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html)), functional effect prediction ([`SnpSift`](http://snpeff.sourceforge.net/SnpSift.html)) and lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)) 8. Present QC, visualisation and custom reporting for sequencing, raw reads, alignment and variant calling results ([`MultiQC`](http://multiqc.info/)) ## Quick Start -1. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=21.04.0`) +1. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=21.10.3`) 2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_ 3. 
Download the pipeline and test it on a minimal dataset with a single command: ```console - nextflow run nf-core/viralrecon -profile test, + nextflow run nf-core/viralrecon -profile test,YOURPROFILE ``` + Note that some form of configuration will be needed so that Nextflow knows how to fetch the required software. This is usually done in the form of a config profile (`YOURPROFILE` in the example command above). You can chain multiple config profiles in a comma-separated string. + + > * The pipeline comes with config profiles called `docker`, `singularity`, `podman`, `shifter`, `charliecloud` and `conda` which instruct the pipeline to use the named tool for software management. For example, `-profile test,docker`. > * Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile ` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment. - > * If you are using `singularity` then the pipeline will auto-detect this and attempt to download the Singularity images directly as opposed to performing a conversion from Docker images. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to use the [`nf-core download`](https://nf-co.re/tools/#downloading-pipelines-for-offline-use) command to pre-download all of the required containers before running the pipeline and to set the [`NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`](https://www.nextflow.io/docs/latest/singularity.html?#singularity-docker-hub) Nextflow options to be able to store and re-use the images from a central location for future pipeline runs. + > * If you are using `singularity` and are persistently observing issues downloading Singularity images directly due to timeout or network issues, then you can use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead. Alternatively, you can use the [`nf-core download`](https://nf-co.re/tools/#downloading-pipelines-for-offline-use) command to download images first, before running the pipeline. Setting the [`NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`](https://www.nextflow.io/docs/latest/singularity.html?#singularity-docker-hub) Nextflow options enables you to store and re-use the images from a central location for future pipeline runs. > * If you are using `conda`, it is highly recommended to use the [`NXF_CONDA_CACHEDIR` or `conda.cacheDir`](https://www.nextflow.io/docs/latest/conda.html) settings to store the environments in a central location for future pipeline runs. 4. Start running your own analysis! @@ -138,7 +145,14 @@ The SRA download functionality has been removed from the pipeline (`>=2.1`) and ./fastq_dir_to_samplesheet.py samplesheet.csv ``` - * You can find the default keys used to specify `--genome` in the [genomes config file](https://github.com/nf-core/configs/blob/master/conf/pipeline/viralrecon/genomes.config). Where possible we are trying to collate links and settings for standard primer sets to make it easier to run the pipeline with standard keys; see [usage docs](https://nf-co.re/viralrecon/usage#illumina-primer-sets). 
+ + * You can find the default keys used to specify `--genome` in the [genomes config file](https://github.com/nf-core/configs/blob/master/conf/pipeline/viralrecon/genomes.config). This provides default options for: + * Reference genomes (including SARS-CoV-2) + * Genome-associated primer sets + * [Nextclade datasets](https://docs.nextstrain.org/projects/nextclade/en/latest/user/datasets.html) + + The Pangolin and Nextclade lineage and clade definitions change regularly as new SARS-CoV-2 lineages are discovered. For instructions to use more recent versions of lineage analysis tools like Pangolin and Nextclade, please refer to the [updating containers](https://nf-co.re/viralrecon/usage#updating-containers) section in the usage docs. + + Where possible we are trying to collate links and settings for standard primer sets to make it easier to run the pipeline with standard keys; see [usage docs](https://nf-co.re/viralrecon/usage#illumina-primer-sets). ## Documentation The nf-core/viralrecon pipeline comes with documentation about the pipeline [usa ## Credits -These scripts were originally written by [Sarai Varona](https://github.com/svarona), [Miguel JuliĆ”](https://github.com/MiguelJulia) and [Sara Monzon](https://github.com/saramonzon) from [BU-ISCIII](https://github.com/BU-ISCIII) and co-ordinated by Isabel Cuesta for the [Institute of Health Carlos III](https://eng.isciii.es/eng.isciii.es/Paginas/Inicio.html), Spain. Through collaboration with the nf-core community the pipeline has now been updated substantially to include additional processing steps, to standardise inputs/outputs and to improve pipeline reporting; implemented primarily by [Harshil Patel](https://github.com/drpatelh) from [The Bioinformatics & Biostatistics Group](https://www.crick.ac.uk/research/science-technology-platforms/bioinformatics-and-biostatistics/) at [The Francis Crick Institute](https://www.crick.ac.uk/), London. +These scripts were originally written by [Sarai Varona](https://github.com/svarona), [Miguel JuliĆ”](https://github.com/MiguelJulia), [Erika Kvalem](https://github.com/ErikaKvalem) and [Sara Monzon](https://github.com/saramonzon) from [BU-ISCIII](https://github.com/BU-ISCIII) and co-ordinated by Isabel Cuesta for the [Institute of Health Carlos III](https://eng.isciii.es/eng.isciii.es/Paginas/Inicio.html), Spain. Through collaboration with the nf-core community the pipeline has now been updated substantially to include additional processing steps, to standardise inputs/outputs and to improve pipeline reporting; implemented and maintained primarily by Harshil Patel ([@drpatelh](https://github.com/drpatelh)) from [Seqera Labs, Spain](https://seqera.io/). The key steps in the Nanopore implementation of the pipeline are carried out using the [ARTIC Network's field bioinformatics pipeline](https://github.com/artic-network/fieldbioinformatics) and were inspired by the amazing work carried out by contributors to the [connor-lab/ncov2019-artic-nf pipeline](https://github.com/connor-lab/ncov2019-artic-nf) originally written by [Matt Bull](https://github.com/m-bull) for use by the [COG-UK](https://github.com/COG-UK) project. Thank you for all of your incredible efforts during this pandemic!
@@ -157,6 +171,7 @@ Many thanks to others who have helped out and contributed along the way too, inc | [Aengus Stewart](https://github.com/stewarta) | [The Francis Crick Institute, UK](https://www.crick.ac.uk/) | | [Alexander Peltzer](https://github.com/apeltzer) | [Boehringer Ingelheim, Germany](https://www.boehringer-ingelheim.de/) | | [Alison Meynert](https://github.com/ameynert) | [University of Edinburgh, Scotland](https://www.ed.ac.uk/) | +| [Anthony Underwood](https://github.com/antunderwood) | [Centre for Genomic Pathogen Surveillance](https://www.pathogensurveillance.net) | | [Anton Korobeynikov](https://github.com/asl) | [Saint Petersburg State University, Russia](https://english.spbu.ru/) | | [Artem Babaian](https://github.com/ababaian) | [University of British Columbia, Canada](https://www.ubc.ca/) | | [Dmitry Meleshko](https://github.com/1dayac) | [Saint Petersburg State University, Russia](https://english.spbu.ru/) | diff --git a/assets/dummy_file.txt b/assets/dummy_file.txt deleted file mode 100644 index e69de29b..00000000 diff --git a/assets/multiqc_config_illumina.yaml b/assets/multiqc_config_illumina.yaml index 48bf55e8..3cff53aa 100644 --- a/assets/multiqc_config_illumina.yaml +++ b/assets/multiqc_config_illumina.yaml @@ -24,7 +24,6 @@ run_modules: module_order: - fastqc: name: "PREPROCESS: FastQC (raw reads)" - anchor: "fastqc_raw" info: "This section of the report shows FastQC results for the raw reads before adapter trimming." path_filters: - "./fastqc/*.zip" @@ -59,53 +58,26 @@ module_order: name: "VARIANTS: mosdepth" info: "This section of the report shows genome-wide coverage metrics generated by mosdepth." - pangolin: - name: "VARIANTS: Pangolin (iVar)" - anchor: "pangolin_ivar" - info: "This section of the report shows Pangolin lineage analysis results for variants called by iVar." + name: "VARIANTS: Pangolin" + info: "This section of the report shows Pangolin lineage analysis results for the called variants." path_filters: - - "./variants_ivar/*.pangolin.csv" + - "./variants/*.pangolin.csv" - bcftools: - name: "VARIANTS: BCFTools (iVar)" - anchor: "bcftools_ivar" - info: "This section of the report shows BCFTools stats results for variants called by iVar." + name: "VARIANTS: BCFTools" + info: "This section of the report shows BCFTools stats results for the called variants." path_filters: - - "./variants_ivar/*.txt" + - "./variants/*.txt" - snpeff: - name: "VARIANTS: SnpEff (iVar)" - anchor: "snpeff_ivar" - info: "This section of the report shows SnpEff results for variants called by iVar." + name: "VARIANTS: SnpEff" + info: "This section of the report shows SnpEff results for the called variants." path_filters: - - "./variants_ivar/*.csv" + - "./variants/*.csv" - quast: - name: "VARIANTS: QUAST (iVar)" - anchor: "quast_ivar" - info: "This section of the report shows QUAST results for consensus sequences generated from variants with iVar." + name: "VARIANTS: QUAST" + anchor: "quast_variants" + info: "This section of the report shows QUAST QC results for the consensus sequence." path_filters: - - "./variants_ivar/*.tsv" - - pangolin: - name: "VARIANTS: Pangolin (BCFTools)" - anchor: "pangolin_bcftools" - info: "This section of the report shows Pangolin lineage analysis results for variants called by BCFTools." - path_filters: - - "./variants_bcftools/*.pangolin.csv" - - bcftools: - name: "VARIANTS: BCFTools (BCFTools)" - anchor: "bcftools_bcftools" - info: "This section of the report shows BCFTools stats results for variants called by BCFTools." 
- path_filters: - - "./variants_bcftools/*.txt" - - snpeff: - name: "VARIANTS: SnpEff (BCFTools)" - anchor: "snpeff_bcftools" - info: "This section of the report shows SnpEff results for variants called by BCFTools." - path_filters: - - "./variants_bcftools/*.csv" - - quast: - name: "VARIANTS: QUAST (BCFTools)" - anchor: "quast_bcftools" - info: "This section of the report shows QUAST results for consensus sequence generated from BCFTools variants." - path_filters: - - "./variants_bcftools/*.tsv" + - "./variants/*.tsv" - cutadapt: name: "ASSEMBLY: Cutadapt (primer trimming)" info: "This section of the report shows Cutadapt results for reads after primer sequence trimming." @@ -210,38 +182,22 @@ custom_data: "% Coverage > 10x": description: "Coverage > 10x calculated by mosdepth" format: "{:,.2f}" - "# SNPs (iVar)": - description: "Total number of SNPs called by iVar" - format: "{:,.0f}" - "# INDELs (iVar)": - description: "Total number of INDELs called by iVar" - format: "{:,.0f}" - "# Missense variants (iVar)": - description: "Total number of variants called by iVar and identified as missense mutations with SnpEff" - format: "{:,.0f}" - "# Ns per 100kb consensus (iVar)": - description: "Number of N bases per 100kb in consensus sequence generated by iVar" - format: "{:,.2f}" - "Pangolin lineage (iVar)": - description: "Pangolin lineage inferred from the consensus sequence generated by iVar" - "Nextclade clade (iVar)": - description: "Nextclade clade inferred from the consensus sequence generated by iVar" - "# SNPs (BCFTools)": - description: "Total number of SNPs called by BCFTools" + "# SNPs": + description: "Total number of SNPs" format: "{:,.0f}" - "# INDELs (BCFTools)": - description: "Total number of INDELs called by BCFTools" + "# INDELs": + description: "Total number of INDELs" format: "{:,.0f}" - "# Missense variants (BCFTools)": - description: "Total number of variants called by BCFTools and identified as missense mutations with SnpEff" + "# Missense variants": + description: "Total number of variants identified as missense mutations with SnpEff" format: "{:,.0f}" - "# Ns per 100kb consensus (BCFTools)": - description: "Number of N bases per 100kb in consensus sequence generated by BCFTools" + "# Ns per 100kb consensus": + description: "Number of N bases per 100kb in consensus sequence" format: "{:,.2f}" - "Pangolin lineage (BCFTools)": - description: "Pangolin lineage inferred from the consensus sequence generated by BCFTools" - "Nextclade clade (BCFTools)": - description: "Nextclade clade inferred from the consensus sequence generated by BCFTools" + "Pangolin lineage": + description: "Pangolin lineage inferred from the consensus sequence" + "Nextclade clade": + description: "Nextclade clade inferred from the consensus sequence" pconfig: id: "summary_variants_metrics_plot" table_title: "Variant calling metrics" @@ -326,6 +282,7 @@ custom_data: extra_fn_clean_exts: - ".markduplicates" - ".unclassified" + - "_MN908947.3" extra_fn_clean_trim: - "Consensus_" diff --git a/assets/nf-core-viralrecon_logo.png b/assets/nf-core-viralrecon_logo.png deleted file mode 100644 index 4fa3bed7..00000000 Binary files a/assets/nf-core-viralrecon_logo.png and /dev/null differ diff --git a/assets/nf-core-viralrecon_logo_light.png b/assets/nf-core-viralrecon_logo_light.png new file mode 100644 index 00000000..079aaf1c Binary files /dev/null and b/assets/nf-core-viralrecon_logo_light.png differ diff --git a/assets/samplesheet.csv b/assets/samplesheet.csv deleted file mode 100644 index 
5f653ab7..00000000 --- a/assets/samplesheet.csv +++ /dev/null @@ -1,3 +0,0 @@ -sample,fastq_1,fastq_2 -SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz -SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz, diff --git a/assets/schema_input.json b/assets/schema_input.json index d44d187f..9e9d343e 100644 --- a/assets/schema_input.json +++ b/assets/schema_input.json @@ -37,8 +37,7 @@ } }, "required": [ - "sample", - "fastq_1" + "sample" ] } } diff --git a/assets/sendmail_template.txt b/assets/sendmail_template.txt index 15f1ccd6..4d7f5463 100644 --- a/assets/sendmail_template.txt +++ b/assets/sendmail_template.txt @@ -12,9 +12,9 @@ $email_html Content-Type: image/png;name="nf-core-viralrecon_logo.png" Content-Transfer-Encoding: base64 Content-ID: -Content-Disposition: inline; filename="nf-core-viralrecon_logo.png" +Content-Disposition: inline; filename="nf-core-viralrecon_logo_light.png" -<% out << new File("$projectDir/assets/nf-core-viralrecon_logo.png"). +<% out << new File("$projectDir/assets/nf-core-viralrecon_logo_light.png"). bytes. encodeBase64(). toString(). diff --git a/bin/ivar_variants_to_vcf.py b/bin/ivar_variants_to_vcf.py index 4abd3453..d3f126ea 100755 --- a/bin/ivar_variants_to_vcf.py +++ b/bin/ivar_variants_to_vcf.py @@ -1,38 +1,144 @@ #!/usr/bin/env python + import os import sys import re import errno import argparse +import numpy as np +from scipy.stats import fisher_exact def parse_args(args=None): - Description = "Convert iVar variants tsv file to vcf format." - Epilog = """Example usage: python ivar_variants_to_vcf.py """ + Description = "Convert iVar variants TSV file to VCF format." + Epilog = """Example usage: python ivar_variants_to_vcf.py """ parser = argparse.ArgumentParser(description=Description, epilog=Epilog) - parser.add_argument("FILE_IN", help="Input tsv file.") - parser.add_argument("FILE_OUT", help="Full path to output vcf file.") + parser.add_argument("file_in", help="Input iVar TSV file.") + parser.add_argument("file_out", help="Full path to output VCF file.") parser.add_argument( "-po", "--pass_only", - dest="PASS_ONLY", - help="Only output variants that PASS all filters.", + help="Only output variants that PASS filters.", action="store_true", ) parser.add_argument( "-af", - "--allele_freq_thresh", + "--allele_freq_threshold", type=float, - dest="ALLELE_FREQ_THRESH", default=0, - help="Only output variants where allele frequency greater than this number (default: 0).", + help="Only output variants where allele frequency is greater than this number (default: 0).", + ) + parser.add_argument( + "-is", + "--ignore_strand_bias", + default=False, + help="Does not take strand bias into account, use this option when not using amplicon sequencing.", + action="store_true" + ) + parser.add_argument( + "-ic", + "--ignore_merge_codons", + help="Output variants without taking into account if consecutive positions belong to the same codon.", + action="store_true" ) return parser.parse_args(args) +def check_consecutive(mylist): + ''' + Description: + This function checks if a list of three or two numbers are consecutive and returns how many items are consecutive. 
+ Input: + my_list - A list of integers + Returns: + Number of consecutive items in the list, or False + ''' + my_list = list(map(int, mylist)) + + ## Check if the list contains consecutive numbers + if sorted(my_list) == list(range(min(my_list), max(my_list)+1)): + return len(my_list) + else: + ## If not, and the list is > 1, remove the last item and reevaluate. + if len(my_list) > 1: + my_list.pop() + if sorted(my_list) == list(range(min(my_list), max(my_list)+1)): + return len(my_list) + else: + return False + return False + + +def codon_position(seq1,seq2): + ''' + Description: + Function to compare two codon nucleotide sequences (size 3) and return the position where they differ. + Input: + seq1 - list size 3 [A,T,C,G] + seq2 - list size 3 [A,T,C,G] + Returns: + Returns position where seq1 != seq2 + ''' + if seq1 == "NA": + return False + + ind_diff = [i for i in range(len(seq1)) if seq1[i] != seq2[i]] + if len(ind_diff) > 1: + print("There has been an issue, more than one difference between the seqs.") + return False + else: + return ind_diff[0] + + +def rename_vars(dict_lines,num_collapse): + ''' + Description: + Sets the variant fields according to the number of lines to collapse due to consecutive variants. + Input: + dict_lines - Dict with var lines. + num_collapse - number of lines to collapse [2,3] + Returns: + The collapsed variant fields. + ''' + CHROM = dict_lines["CHROM"][0] + POS = dict_lines["POS"][0] + ID = dict_lines["ID"][0] + # If two consecutive, collapse 2 lines into one. + if int(num_collapse) == 2: + REF = str(dict_lines["REF"][0]) + str(dict_lines["REF"][1]) + ALT = str(dict_lines["ALT"][0]) + str(dict_lines["ALT"][1]) + # If three consecutive, collapse 3 lines into one. + elif int(num_collapse) == 3: + REF = str(dict_lines["REF"][0]) + str(dict_lines["REF"][1]) + str(dict_lines["REF"][2]) + ALT = str(dict_lines["ALT"][0]) + str(dict_lines["ALT"][1]) + str(dict_lines["ALT"][2]) + ## TODO: Check how many differences we find among DPs in the three positions of a codon. + REF_DP = dict_lines["REF_DP"][0] + REF_RV = dict_lines["REF_RV"][0] + ALT_DP = dict_lines["ALT_DP"][0] + ALT_RV = dict_lines["ALT_RV"][0] + QUAL = dict_lines["QUAL"][0] + REF_CODON = REF + ALT_CODON = ALT + FILTER = dict_lines["FILTER"][0] + # INFO DP depends on the decision in the TODO above. SB is left with the first one. + INFO = dict_lines["INFO"][0] + FORMAT = dict_lines["FORMAT"][0] + # SAMPLE depends on the decision in the TODO above. + SAMPLE = dict_lines["SAMPLE"][0] + return CHROM,POS,ID,REF,ALT,QUAL,FILTER,INFO,FORMAT,SAMPLE + + def make_dir(path): + ''' + Description: + Create directory if it doesn't exist. + Input: + path - path where the directory will be created. + Returns: + None + ''' if not len(path) == 0: try: os.makedirs(path) @@ -41,42 +147,91 @@ def make_dir(path): raise -def ivar_variants_to_vcf(FileIn, FileOut, passOnly=False, minAF=0): - filename = os.path.splitext(FileIn)[0] - header = ( - "##fileformat=VCFv4.2\n" - "##source=iVar\n" - '##INFO=\n' - '##FILTER=\n' - '##FILTER= 0.05">\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - '##FORMAT=\n' - ) - header += ( - "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + filename + "\n" - ) +def ivar_variants_to_vcf(file_in, file_out, pass_only=False, min_allele_frequency=0, ignore_strand_bias=False, ignore_merge_codons=False): + ''' + Description: + Main function to convert iVar variants TSV to VCF.
+ Input: + file_in : iVar variants TSV file + file_out : VCF output file + pass_only : Only keep variants that PASS filter [True, False] + min_allele_frequency : Minimum allele frequency to keep a variant [0] + ignore_strand_bias : Do not apply strand-bias filter [True, False] + ignore_merge_codons : Do not take into account whether consecutive positions belong to the same codon. + Returns: + None + ''' + ## Create output directory + filename = os.path.splitext(file_in)[0] + out_dir = os.path.dirname(file_out) + make_dir(out_dir) + + ## Define VCF header + header_source = [ + "##fileformat=VCFv4.2", + "##source=iVar" + ] + header_info = [ + '##INFO=' + ] + header_filter = [ + '##FILTER=', + '##FILTER= 0.05">' + ] + header_format = [ + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + '##FORMAT=', + ] + header_cols = [ + f"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t{filename}" + ] + if not ignore_strand_bias: + header_info += [ + '##INFO=' + ] + header_filter += [ + '##FILTER=' + ] + header = header_source + header_info + header_filter + header_format + header_cols - varList = [] - varCountDict = {"SNP": 0, "INS": 0, "DEL": 0} - OutDir = os.path.dirname(FileOut) - make_dir(OutDir) - fout = open(FileOut, "w") - fout.write(header) - with open(FileIn) as f: - for line in f: + ## Initialise variables + var_list = [] + var_count_dict = {"SNP": 0, "INS": 0, "DEL": 0} + dict_lines = {'CHROM':[], 'POS':[], 'ID':[], 'REF':[], 'ALT':[], 'REF_DP':[], 'REF_RV':[], 'ALT_DP':[], 'ALT_RV':[], 'QUAL':[], 'REF_CODON':[], 'ALT_CODON':[], 'FILTER': [], 'INFO':[], 'FORMAT':[], 'SAMPLE':[]} + write_line = False + fout = open(file_out, "w") + fout.write('\n'.join(header) + '\n') + with open(file_in, 'r') as fin: + for line in fin: if not re.match("REGION", line): line = re.split("\t", line) + + ## Assign initial fields to variables CHROM = line[0] POS = line[1] ID = "." REF = line[2] ALT = line[3] + + ## REF/ALT depths + REF_DP = int(line[4]) + REF_RV = int(line[5]) + REF_FW = REF_DP - REF_RV + ALT_RV = int(line[8]) + ALT_DP = int(line[7]) + ALT_FW = ALT_DP - ALT_RV + + ## Perform a fisher_exact test for strand bias detection + table = np.array([[REF_FW, REF_RV], [ALT_FW, ALT_RV]]) + oddsr, pvalue = fisher_exact(table, alternative='greater') + + ## Determine variant type var_type = "SNP" if ALT[0] == "+": ALT = REF + ALT[1:] @@ -85,76 +240,159 @@ def ivar_variants_to_vcf(FileIn, FileOut, passOnly=False, minAF=0): REF += ALT[1:] ALT = line[2] var_type = "DEL" + QUAL = "."
+ + ## Determine FILTER field + INFO = f"DP={line[11]}" pass_test = line[13] - if pass_test == "TRUE": - FILTER = "PASS" + if ignore_strand_bias: + if pass_test == "TRUE": + FILTER = "PASS" + else: + FILTER = "ft" else: - FILTER = "FAIL" - INFO = "DP=" + line[11] + ## Add SB in the FILTER field if strand-bias p-value is significant + if pvalue < 0.05 and pass_test == "TRUE": + FILTER = "sb" + elif pvalue > 0.05 and pass_test == "TRUE": + FILTER = "PASS" + elif pvalue <= 0.05 and pass_test == "FALSE": + FILTER = "ft;sb" + else: + FILTER = "ft" + INFO += f":SB_PV={str(round(pvalue, 5))}" + FORMAT = "GT:REF_DP:REF_RV:REF_QUAL:ALT_DP:ALT_RV:ALT_QUAL:ALT_FREQ" - SAMPLE = ( - "1:" - + line[4] - + ":" - + line[5] - + ":" - + line[6] - + ":" - + line[7] - + ":" - + line[8] - + ":" - + line[9] - + ":" - + line[10] - ) - oline = ( - CHROM - + "\t" - + POS - + "\t" - + ID - + "\t" - + REF - + "\t" - + ALT - + "\t" - + QUAL - + "\t" - + FILTER - + "\t" - + INFO - + "\t" - + FORMAT - + "\t" - + SAMPLE - + "\n" - ) - writeLine = True - if passOnly and FILTER != "PASS": - writeLine = False - if float(line[10]) < minAF: - writeLine = False - if (CHROM, POS, REF, ALT) in varList: - writeLine = False + SAMPLE = f'1:{":".join(line[4:11])}' + + REF_CODON = line[15] + ALT_CODON = line[17] + param_list = [CHROM, POS, ID, REF, ALT, REF_DP, REF_RV, ALT_DP, ALT_RV, QUAL, REF_CODON, ALT_CODON, FILTER, INFO, FORMAT, SAMPLE] + + if ignore_merge_codons or var_type != "SNP": + write_line = True + oline = (CHROM + "\t" + POS + "\t" + ID + "\t" + REF + "\t" + ALT + "\t" + QUAL + "\t" + FILTER + "\t" + INFO + "\t" + FORMAT + "\t" + SAMPLE + "\n") + + else: + ## dict_lines contains all the informative fields for 3 positions in the VCF. + # dict_lines has a maximum size of three. + + ## Always fill dict_lines until size 2. + if len(dict_lines["POS"]) == 0 or len(dict_lines["POS"]) == 1: + for i,j in enumerate(dict_lines): + dict_lines.setdefault(j, []).append(param_list[i]) + write_line = False + + # If the queue has size 2, we include the third line + elif len(dict_lines["POS"]) == 2: + for i,j in enumerate(dict_lines): + dict_lines.setdefault(j, []).append(param_list[i]) + # Are two positions in the dict consecutive? + if check_consecutive(dict_lines["POS"]) == 2: + ## If the first variant is not in the third position of the codon, both variants are in the same codon. + if codon_position(dict_lines["REF_CODON"][0],dict_lines["ALT_CODON"][0]) != 2: + write_line = True + num_collapse = 2 + CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, SAMPLE = rename_vars(dict_lines, num_collapse) + oline = (CHROM + "\t" + POS + "\t" + ID + "\t" + REF + "\t" + ALT + "\t" + QUAL + "\t" + FILTER + "\t" + INFO + "\t" + FORMAT + "\t" + SAMPLE + "\n") + ## Remove the first two items in dict_lines, which have just been processed. + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + dict_lines[list(dict_lines.keys())[i]].pop(0) + else: + write_line = True + oline = (dict_lines["CHROM"][0] + "\t" + dict_lines["POS"][0] + "\t" + dict_lines["ID"][0] + "\t" + dict_lines["REF"][0] + "\t" + dict_lines["ALT"][0] + "\t" + dict_lines["QUAL"][0] + "\t" + dict_lines["FILTER"][0] + "\t" + dict_lines["INFO"][0] + "\t" + dict_lines["FORMAT"][0] + "\t" + dict_lines["SAMPLE"][0] + "\n") + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + + # Are the three positions in the dict consecutive?
+ elif check_consecutive(dict_lines["POS"]) == 3: + ## Check which codon position the first variant falls in, to process it accordingly. + # If the first position is in the first codon position, all three positions belong to the same codon. + if codon_position(dict_lines["REF_CODON"][0], dict_lines["ALT_CODON"][0]) == 0: + write_line = True + num_collapse = 3 + CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, SAMPLE = rename_vars(dict_lines, num_collapse) + oline = (CHROM + "\t" + POS + "\t" + ID + "\t" + REF + "\t" + ALT + "\t" + QUAL + "\t" + FILTER + "\t" + INFO + "\t" + FORMAT + "\t" + SAMPLE + "\n") + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + dict_lines[list(dict_lines.keys())[i]].pop(0) + # We empty dict_lines. + dict_lines = {'CHROM':[], 'POS':[], 'ID':[], 'REF':[], 'ALT':[], 'REF_DP':[], 'REF_RV':[], 'ALT_DP':[], 'ALT_RV':[], 'QUAL':[], 'REF_CODON':[], 'ALT_CODON':[], 'FILTER':[], 'INFO':[], 'FORMAT':[], 'SAMPLE':[]} + # If the first position is in the second codon position, the first two positions belong to the same codon and the last one is independent. + elif codon_position(dict_lines["REF_CODON"][0], dict_lines["ALT_CODON"][0]) == 1: + write_line = True + num_collapse = 2 + CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, SAMPLE = rename_vars(dict_lines, num_collapse) + oline = (CHROM + "\t" + POS + "\t" + ID + "\t" + REF + "\t" + ALT + "\t" + QUAL + "\t" + FILTER + "\t" + INFO + "\t" + FORMAT + "\t" + SAMPLE + "\n") + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + dict_lines[list(dict_lines.keys())[i]].pop(0) + ## Finally, if the first position is in the last codon position, we write the first position and leave the remaining two to be evaluated in the next iteration. + elif codon_position(dict_lines["REF_CODON"][0], dict_lines["ALT_CODON"][0]) == 2: + write_line = True + oline = (dict_lines["CHROM"][0] + "\t" + dict_lines["POS"][0] + "\t" + dict_lines["ID"][0] + "\t" + dict_lines["REF"][0] + "\t" + dict_lines["ALT"][0] + "\t" + dict_lines["QUAL"][0] + "\t" + dict_lines["FILTER"][0] + "\t" + dict_lines["INFO"][0] + "\t" + dict_lines["FORMAT"][0] + "\t" + dict_lines["SAMPLE"][0] + "\n") + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + + elif check_consecutive(dict_lines["POS"]) == False: + write_line = True + oline = (dict_lines["CHROM"][0] + "\t" + dict_lines["POS"][0] + "\t" + dict_lines["ID"][0] + "\t" + dict_lines["REF"][0] + "\t" + dict_lines["ALT"][0] + "\t" + dict_lines["QUAL"][0] + "\t" + dict_lines["FILTER"][0] + "\t" + dict_lines["INFO"][0] + "\t" + dict_lines["FORMAT"][0] + "\t" + dict_lines["SAMPLE"][0] + "\n") + for i,j in enumerate(dict_lines): + dict_lines[list(dict_lines.keys())[i]].pop(0) + else: + print("Something went terribly wrong!!"
+
+        ## Determine whether to output variant
+        if pass_only and FILTER != "PASS":
+            write_line = False
+        if float(line[10]) < min_allele_frequency:
+            write_line = False
+        if (CHROM, POS, REF, ALT) in var_list:
+            write_line = False
        else:
-            varList.append((CHROM, POS, REF, ALT))
+            var_list.append((CHROM, POS, REF, ALT))
+
+        ## Write to file
+        if write_line:
+            var_count_dict[var_type] += 1
            fout.write(oline)
-    fout.close()

    ## Print variant counts to pass to MultiQC
-    varCountList = [(k, str(v)) for k, v in sorted(varCountDict.items())]
-    print("\t".join(["sample"] + [x[0] for x in varCountList]))
-    print("\t".join([filename] + [x[1] for x in varCountList]))
+    var_count_list = [(k, str(v)) for k, v in sorted(var_count_dict.items())]
+    print("\t".join(["sample"] + [x[0] for x in var_count_list]))
+    print("\t".join([filename] + [x[1] for x in var_count_list]))
+
+    ## Flush the (at most two) positions still held in dict_lines.
+    ## NB: these are written with the FILTER status they already carry.
+    if len(dict_lines["POS"]) == 2:
+        ## Merge the two remaining positions if they are consecutive and share a codon; otherwise write both separately.
+        if check_consecutive(dict_lines["POS"]) == 2 and codon_position(dict_lines["REF_CODON"][0],dict_lines["ALT_CODON"][0]) != 2:
+            num_collapse = 2
+            CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, SAMPLE = rename_vars(dict_lines, num_collapse)
+            oline = (CHROM + "\t" + POS + "\t" + ID + "\t" + REF + "\t" + ALT + "\t" + QUAL + "\t" + FILTER + "\t" + INFO + "\t" + FORMAT + "\t" + SAMPLE + "\n")
+            fout.write(oline)
+        else:
+            oline = (dict_lines["CHROM"][0] + "\t" + dict_lines["POS"][0] + "\t" + dict_lines["ID"][0] + "\t" + dict_lines["REF"][0] + "\t" + dict_lines["ALT"][0] + "\t" + dict_lines["QUAL"][0] + "\t" + dict_lines["FILTER"][0] + "\t" + dict_lines["INFO"][0] + "\t" + dict_lines["FORMAT"][0] + "\t" + dict_lines["SAMPLE"][0] + "\n")
+            oline1 = (dict_lines["CHROM"][1] + "\t" + dict_lines["POS"][1] + "\t" + dict_lines["ID"][1] + "\t" + dict_lines["REF"][1] + "\t" + dict_lines["ALT"][1] + "\t" + dict_lines["QUAL"][1] + "\t" + dict_lines["FILTER"][1] + "\t" + dict_lines["INFO"][1] + "\t" + dict_lines["FORMAT"][1] + "\t" + dict_lines["SAMPLE"][1] + "\n")
+            fout.write(oline)
+            fout.write(oline1)
+    elif len(dict_lines["POS"]) == 1:
+        oline = (dict_lines["CHROM"][0] + "\t" + dict_lines["POS"][0] + "\t" + dict_lines["ID"][0] + "\t" + dict_lines["REF"][0] + "\t" + dict_lines["ALT"][0] + "\t" + dict_lines["QUAL"][0] + "\t" + dict_lines["FILTER"][0] + "\t" + dict_lines["INFO"][0] + "\t" + dict_lines["FORMAT"][0] + "\t" + dict_lines["SAMPLE"][0] + "\n")
+        fout.write(oline)
+    fout.close()

def main(args=None):
    args = parse_args(args)
    ivar_variants_to_vcf(
-        args.FILE_IN, args.FILE_OUT, args.PASS_ONLY, args.ALLELE_FREQ_THRESH
+        args.file_in,
+        args.file_out,
+        args.pass_only,
+        args.allele_freq_threshold,
+        args.ignore_strand_bias,
+        args.ignore_merge_codons,
    )

diff --git a/bin/make_bed_mask.py b/bin/make_bed_mask.py
index efd99057..46e06bed 100755
--- a/bin/make_bed_mask.py
+++ b/bin/make_bed_mask.py
@@ -27,28 +27,11 @@ def find_indels_vcf(vcf_in):
                var_pos = line[1]
                ref = line[3]
                alt = line[4]
-                if len(alt) > len(ref):
-                    indels_pos_len[var_pos] = len(alt)
-                elif len(ref) > len(alt):
+                if len(ref) != len(alt):
                    indels_pos_len[var_pos] = len(ref)
    return indels_pos_len
-def find_dels_vcf(vcf_in):
-    encoding = "utf-8"
-    dels_pos_len = {}
-    with gzip.open(vcf_in, "r") as f:
-        for line in f:
-            if "#" not in str(line, encoding):
-                line = re.split("\t", str(line, encoding))
-                var_pos = line[1]
-                ref = line[3]
-                alt = line[4]
-                if len(ref) > len(alt):
-                    dels_pos_len[var_pos] = len(ref)
-    return dels_pos_len
-
-
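Folding `find_dels_vcf` into `find_indels_vcf` standardises the masked length on `len(REF)`, and the hunk that follows trims the interval end by one. The reasoning: a REF allele of length N anchored at POS occupies reference positions POS through POS + N - 1. A minimal sketch of the overlap test (simplified from `make_bed_mask`, coordinates invented):

```python
# Simplified sketch: should an amplicon interval [init_pos, end_pos] be
# masked because it touches the reference span of an indel's REF allele?
def overlaps_indel(init_pos, end_pos, indel_pos, ref_len):
    indel_end = indel_pos + ref_len - 1  # inclusive end of the REF allele
    return indel_pos <= init_pos <= indel_end or indel_pos <= end_pos <= indel_end

# A deletion whose REF allele is 3 bp, anchored at 11074, spans 11074-11076:
print(overlaps_indel(11076, 11180, indel_pos=11074, ref_len=3))  # True
print(overlaps_indel(11077, 11180, indel_pos=11074, ref_len=3))  # False
```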
def make_bed_mask(bed_in, bed_out, indels_pos_len):
    fout = open(bed_out, "w")
    indels_positions = []
@@ -66,7 +49,7 @@ def make_bed_mask(bed_in, bed_out, indels_pos_len):
        for position in indels_positions:
            indel_init_pos = position
            indel_whole_length = indels_pos_len[position]
-            indel_end_pos = int(indel_init_pos) + int(indel_whole_length)
+            indel_end_pos = int(indel_init_pos) + int(indel_whole_length) - 1
            if int(init_pos) in range(
                int(indel_init_pos), int(indel_end_pos)
            ) or int(end_pos) in range(int(indel_init_pos), int(indel_end_pos)):
@@ -75,12 +58,12 @@ def make_bed_mask(bed_in, bed_out, indels_pos_len):
        else:
            oline = ref_genome + "\t" + init_pos + "\t" + end_pos
        if test:
-            fout.write(oline)
+            fout.write(oline + "\n")

def main(args=None):
    args = parse_args(args)
-    indels_pos_len = find_dels_vcf(args.VCF_IN)
+    indels_pos_len = find_indels_vcf(args.VCF_IN)
    make_bed_mask(args.BED_IN, args.BED_OUT, indels_pos_len)

diff --git a/bin/make_variants_long_table.py b/bin/make_variants_long_table.py
new file mode 100755
index 00000000..750ffe16
--- /dev/null
+++ b/bin/make_variants_long_table.py
@@ -0,0 +1,252 @@
+#!/usr/bin/env python
+
+import os
+import sys
+import glob
+import errno
+import shutil
+import logging
+import argparse
+import pandas as pd
+from matplotlib import table
+
+
+logger = logging.getLogger()
+
+
+pd.set_option('display.max_columns', None)
+pd.set_option('display.max_rows', None)
+
+
+def parser_args(args=None):
+    Description = 'Create long/wide tables containing variant information.'
+    Epilog = """Example usage: python make_variants_long_table.py --bcftools_query_dir ./bcftools_query/ --snpsift_dir ./snpsift/ --pangolin_dir ./pangolin/"""
+    parser = argparse.ArgumentParser(description=Description, epilog=Epilog)
+    parser.add_argument("-bd", "--bcftools_query_dir", type=str, default="./bcftools_query", help="Directory containing output of BCFTools query for each sample (default: './bcftools_query').")
+    parser.add_argument("-sd", "--snpsift_dir", type=str, default="./snpsift", help="Directory containing output of SnpSift for each sample (default: './snpsift').")
+    parser.add_argument("-pd", "--pangolin_dir", type=str, default="./pangolin", help="Directory containing output of Pangolin for each sample (default: './pangolin').")
+    parser.add_argument("-bs", "--bcftools_file_suffix", type=str, default=".bcftools_query.txt", help="Suffix to trim off BCFTools query file name to obtain sample name (default: '.bcftools_query.txt').")
+    parser.add_argument("-ss", "--snpsift_file_suffix", type=str, default=".snpsift.txt", help="Suffix to trim off SnpSift file name to obtain sample name (default: '.snpsift.txt').")
+    parser.add_argument("-ps", "--pangolin_file_suffix", type=str, default=".pangolin.csv", help="Suffix to trim off Pangolin file name to obtain sample name (default: '.pangolin.csv').")
+    parser.add_argument("-of", "--output_file", type=str, default="variants_long_table.csv", help="Full path to output file (default: 'variants_long_table.csv').")
+    parser.add_argument("-vc", "--variant_caller", type=str, default="ivar", help="Tool used to call the variants (default: 'ivar').")
+    return parser.parse_args(args)
+
+
+def make_dir(path):
+    if not len(path) == 0:
+        try:
+            os.makedirs(path)
+        except OSError as exception:
+            if exception.errno != errno.EEXIST:
+                raise
+
+
+def get_file_dict(file_dir, file_suffix):
+    files = glob.glob(os.path.join(file_dir, f'*{file_suffix}'))
+    samples = [os.path.basename(x)[:-len(file_suffix)] for x in files]  # NB: slice off the suffix; rstrip() would strip a character set, not a suffix
+
+    return dict(zip(samples, files))
+
+
+def three_letter_aa_to_one(hgvs_three):
+    aa_dict = {
+        'Ala': 'A', 'Arg': 'R', 'Asn': 'N', 'Asp': 'D', 'Cys': 'C',
+        'Gln': 'Q', 'Glu': 'E', 'Gly': 'G', 'His': 'H', 'Ile': 'I',
+        'Leu': 'L', 'Lys': 'K', 'Met': 'M', 'Phe': 'F', 'Pro': 'P',
+        'Pyl': 'O', 'Ser': 'S', 'Sec': 'U', 'Thr': 'T', 'Trp': 'W',
+        'Tyr': 'Y', 'Val': 'V', 'Asx': 'B', 'Glx': 'Z', 'Xaa': 'X',
+        'Xle': 'J', 'Ter': '*'
+    }
+    hgvs_one = hgvs_three
+    for key in aa_dict:
+        if key in hgvs_one:
+            hgvs_one = hgvs_one.replace(str(key), str(aa_dict[key]))
+
+    return hgvs_one
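A quick, self-contained spot-check of the conversion above, using a trimmed-down mapping (example substitutions invented):

```python
# Same idea as three_letter_aa_to_one, with a reduced mapping for brevity.
aa_dict = {'Asp': 'D', 'Gly': 'G', 'Ter': '*', 'Trp': 'W'}

def to_one_letter(hgvs_three):
    for three, one in aa_dict.items():
        hgvs_three = hgvs_three.replace(three, one)
    return hgvs_three

print(to_one_letter("p.Asp614Gly"))  # p.D614G
print(to_one_letter("p.Ter27Trp"))   # p.*27W
```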
+
+
+## Returns a pandas dataframe in the format:
+    #   CHROM       POS   REF ALT FILTER DP  REF_DP ALT_DP AF
+    # 0 MN908947.3  241   C   T   PASS   642 375    266    0.41
+    # 1 MN908947.3  1875  C   T   PASS   99  63     34     0.34
+def ivar_bcftools_query_to_table(bcftools_query_file):
+    table = pd.read_table(bcftools_query_file, header='infer')
+    table = table.dropna(how='all', axis=1)
+    old_colnames = list(table.columns)
+    new_colnames = [x.split(']')[-1].split(':')[-1] for x in old_colnames]
+    table.rename(columns=dict(zip(old_colnames, new_colnames)), inplace=True)
+
+    if not table.empty:
+        table[["ALT_DP", "DP"]] = table[["ALT_DP", "DP"]].apply(pd.to_numeric)
+        table['AF'] = table['ALT_DP'] / table['DP']
+        table['AF'] = table['AF'].round(2)
+
+    return table
+
+
+## Returns a pandas dataframe in the format:
+    #   CHROM       POS   REF ALT FILTER DP REF_DP ALT_DP AF
+    # 0 MN908947.3  241   C   T   .      24 8      16     0.67
+    # 1 MN908947.3  3037  C   T   .      17 5      12     0.71
+def bcftools_bcftools_query_to_table(bcftools_query_file):
+    table = pd.read_table(bcftools_query_file, header='infer')
+    table = table.dropna(how='all', axis=1)
+    old_colnames = list(table.columns)
+    new_colnames = [x.split(']')[-1].split(':')[-1] for x in old_colnames]
+    table.rename(columns=dict(zip(old_colnames, new_colnames)), inplace=True)
+
+    if not table.empty:
+        table[['REF_DP','ALT_DP']] = table['AD'].str.split(',', expand=True)
+        table[["ALT_DP", "DP"]] = table[["ALT_DP", "DP"]].apply(pd.to_numeric)
+        table['AF'] = table['ALT_DP'] / table['DP']
+        table['AF'] = table['AF'].round(2)
+        table.drop('AD', axis=1, inplace=True)
+
+    return table
+
+
+## Returns a pandas dataframe in the format:
+    #   CHROM       POS   REF ALT FILTER DP REF_DP ALT_DP AF
+    # 0 MN908947.3  241   C   T   PASS   30 1      29     0.97
+    # 1 MN908947.3  1163  A   T   PASS   28 0      28     1.00
+def nanopolish_bcftools_query_to_table(bcftools_query_file):
+    table = pd.read_table(bcftools_query_file, header='infer')
+    table = table.dropna(how='all', axis=1)
+    old_colnames = list(table.columns)
+    new_colnames = [x.split(']')[-1].split(':')[-1] for x in old_colnames]
+    table.rename(columns=dict(zip(old_colnames, new_colnames)), inplace=True)
+
+    ## Split out ref/alt depths from StrandSupport column
+    if not table.empty:
+        table_cp = table.copy()
+        table_cp[['FORW_REF_DP','REV_REF_DP', 'FORW_ALT_DP','REV_ALT_DP']] = table_cp['StrandSupport'].str.split(',', expand=True)
+        table_cp[['FORW_REF_DP','REV_REF_DP', 'FORW_ALT_DP','REV_ALT_DP']] = table_cp[['FORW_REF_DP','REV_REF_DP', 'FORW_ALT_DP','REV_ALT_DP']].apply(pd.to_numeric)
+
+        table['DP'] = table_cp[['FORW_REF_DP','REV_REF_DP', 'FORW_ALT_DP','REV_ALT_DP']].sum(axis=1)
+        table['REF_DP'] = table_cp[['FORW_REF_DP','REV_REF_DP']].sum(axis=1)
+        table['ALT_DP'] = table_cp[['FORW_ALT_DP','REV_ALT_DP']].sum(axis=1)
+        table['AF'] = table['ALT_DP'] / table['DP']
+        table['AF'] = table['AF'].round(2)
+        table.drop('StrandSupport', axis=1, inplace=True)
+
+    return table
+
+
+## Returns a pandas dataframe in the format:
+    #   CHROM       POS   REF ALT FILTER DP REF_DP ALT_DP AF
+    # 0 MN908947.3  241   C   T   PASS   21 0      21     1.00
+    # 1 MN908947.3  3037  C   T   PASS   28 0      25     0.89
+def medaka_bcftools_query_to_table(bcftools_query_file):
+    table = pd.read_table(bcftools_query_file, header='infer')
+    table = table.dropna(how='all', axis=1)
+    old_colnames = list(table.columns)
+    new_colnames = [x.split(']')[-1].split(':')[-1] for x in old_colnames]
+    table.rename(columns=dict(zip(old_colnames, new_colnames)), inplace=True)
+
+    if not table.empty:
+        table[['REF_DP','ALT_DP']] = table['AC'].str.split(',', expand=True)
+        table[["ALT_DP", "DP"]] = table[["ALT_DP", "DP"]].apply(pd.to_numeric)
+        table['AF'] = table['ALT_DP'] / table['DP']
+        table['AF'] = table['AF'].round(2)
+        table.drop('AC', axis=1, inplace=True)
+
+    return table
+
+
+def get_pangolin_lineage(pangolin_file):
+    table = pd.read_csv(pangolin_file, sep=",", header="infer")
+
+    return table['lineage'][0]
+
+
+def snpsift_to_table(snpsift_file):
+    table = pd.read_table(snpsift_file, sep="\t", header='infer')
+    table = table.loc[:, ~table.columns.str.contains('^Unnamed')]
+    old_colnames = list(table.columns)
+    new_colnames = [x.replace('ANN[*].', '') for x in old_colnames]
+    table.rename(columns=dict(zip(old_colnames, new_colnames)), inplace=True)
+    table = table.loc[:, ['CHROM', 'POS', 'REF', 'ALT', 'GENE', 'EFFECT', 'HGVS_C', 'HGVS_P']]
+
+    ## Split by comma and get first value in cols = ['ALT','GENE','EFFECT','HGVS_C','HGVS_P']
+    for i in range(len(table)):
+        for j in range(3, 8):
+            table.iloc[i, j] = str(table.iloc[i, j]).split(",")[0]
+
+    ## Amino acid substitution in one-letter code
+    aa = []
+    for index, item in table["HGVS_P"].items():
+        hgvs_p = three_letter_aa_to_one(str(item))
+        aa.append(hgvs_p)
+    table["HGVS_P_1LETTER"] = pd.Series(aa)
+
+    return table
+
+
+def main(args=None):
+    args = parser_args(args)
+
+    ## Create output directory if it doesn't exist
+    out_dir = os.path.dirname(args.output_file)
+    make_dir(out_dir)
+
+    ## Check correct variant caller has been provided
+    variant_callers = ['ivar', 'bcftools', 'nanopolish', 'medaka']
+    if args.variant_caller not in variant_callers:
+        logger.error(f"Invalid option '--variant_caller {args.variant_caller}'. Valid options: " + ', '.join(variant_callers))
+        sys.exit(1)
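All four readers share the same column clean-up because `bcftools query -H` decorates its header line. Assuming a typical header produced by the queries configured later in this diff (`conf/modules_illumina.config`), with a hypothetical sample name, the renaming works like this:

```python
# `bcftools query -H` prefixes each column with an index and, for per-sample
# FORMAT fields, the sample name. Splitting on ']' drops the index; splitting
# on ':' drops the sample name.
raw_columns = ["# [1]CHROM", "[2]POS", "[6]sample1:DP", "[7]sample1:ALT_DP"]
clean = [c.split("]")[-1].split(":")[-1] for c in raw_columns]
print(clean)  # ['CHROM', 'POS', 'DP', 'ALT_DP']
```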
+
+    ## Find files and create a dictionary {'sample': '/path/to/file'}
+    bcftools_files = get_file_dict(args.bcftools_query_dir, args.bcftools_file_suffix)
+    snpsift_files = get_file_dict(args.snpsift_dir, args.snpsift_file_suffix)
+    pangolin_files = get_file_dict(args.pangolin_dir, args.pangolin_file_suffix)
+
+    ## Check all files are provided for each sample
+    if set(bcftools_files) != set(snpsift_files):
+        logger.error(f"Samples with BCFTools ({len(bcftools_files)}) and SnpSift ({len(snpsift_files)}) files do not match!")
+        sys.exit(1)
+    else:
+        if pangolin_files:
+            if set(bcftools_files) != set(pangolin_files):
+                logger.error(f"Samples with BCFTools ({len(bcftools_files)}) and Pangolin ({len(pangolin_files)}) files do not match!")
+                sys.exit(1)
+
+    ## Create per-sample table and write to file
+    sample_tables = []
+    for sample in sorted(bcftools_files):
+
+        ## Read in BCFTools query file
+        bcftools_table = None
+        if args.variant_caller == 'ivar':
+            bcftools_table = ivar_bcftools_query_to_table(bcftools_files[sample])
+        elif args.variant_caller == 'bcftools':
+            bcftools_table = bcftools_bcftools_query_to_table(bcftools_files[sample])
+        elif args.variant_caller == 'nanopolish':
+            bcftools_table = nanopolish_bcftools_query_to_table(bcftools_files[sample])
+        elif args.variant_caller == 'medaka':
+            bcftools_table = medaka_bcftools_query_to_table(bcftools_files[sample])
+
+        if not bcftools_table.empty:
+
+            ## Read in SnpSift file
+            snpsift_table = snpsift_to_table(snpsift_files[sample])
+
+            merged_table = pd.DataFrame(data=bcftools_table)
+            merged_table.insert(0, 'SAMPLE', sample)
+            merged_table = pd.merge(merged_table, snpsift_table, how='outer')
+            merged_table['CALLER'] = args.variant_caller
+
+            ## Read in Pangolin lineage file
+            if pangolin_files:
+                merged_table['LINEAGE'] = get_pangolin_lineage(pangolin_files[sample])
+
+            sample_tables.append(merged_table)
+
+    ## Concatenate tables across samples
+    if sample_tables:
+        merged_tables = pd.concat(sample_tables)
+        merged_tables.to_csv(args.output_file, index=False, encoding='utf-8-sig')
+
+
+if __name__ == '__main__':
+    sys.exit(main())
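A toy illustration of what `main()` assembles per sample before concatenation — values invented, column names taken from the readers above:

```python
# One row per called variant, annotated with SnpSift fields plus the
# sample's pangolin lineage, then concatenated across samples.
import pandas as pd

bcftools_table = pd.DataFrame({"CHROM": ["MN908947.3"], "POS": [3037], "REF": ["C"], "ALT": ["T"], "AF": [0.97]})
snpsift_table = pd.DataFrame({"CHROM": ["MN908947.3"], "POS": [3037], "REF": ["C"], "ALT": ["T"], "GENE": ["orf1ab"], "HGVS_P": ["p.Phe924Phe"]})

merged = bcftools_table.copy()
merged.insert(0, "SAMPLE", "sample1")
merged = pd.merge(merged, snpsift_table, how="outer")  # joins on CHROM/POS/REF/ALT
merged["CALLER"] = "ivar"
merged["LINEAGE"] = "B.1.1.7"  # value that would come from the sample's pangolin CSV
print(merged)
```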
diff --git a/bin/multiqc_to_custom_csv.py b/bin/multiqc_to_custom_csv.py
index 34e54348..dac8b7ae 100755
--- a/bin/multiqc_to_custom_csv.py
+++ b/bin/multiqc_to_custom_csv.py
@@ -221,48 +221,25 @@ def main(args=None):
            ],
        ),
        (
-            "multiqc_bcftools_stats_bcftools_ivar.yaml",
-            [
-                ("# SNPs (iVar)", ["number_of_SNPs"]),
-                ("# INDELs (iVar)", ["number_of_indels"]),
-            ],
-        ),
-        (
-            "multiqc_snpeff_snpeff_ivar.yaml",
-            [("# Missense variants (iVar)", ["MISSENSE"])],
-        ),
-        (
-            "multiqc_quast_quast_ivar.yaml",
-            [("# Ns per 100kb consensus (iVar)", ["# N's per 100 kbp"])],
-        ),
-        (
-            "multiqc_pangolin_pangolin_ivar.yaml",
-            [("Pangolin lineage (iVar)", ["lineage"])],
-        ),
-        ("multiqc_ivar_nextclade_clade.yaml", [("Nextclade clade (iVar)", ["clade"])]),
-        (
-            "multiqc_bcftools_stats_bcftools_bcftools.yaml",
+            "multiqc_bcftools_stats.yaml",
            [
-                ("# SNPs (BCFTools)", ["number_of_SNPs"]),
-                ("# INDELs (BCFTools)", ["number_of_indels"]),
+                ("# SNPs", ["number_of_SNPs"]),
+                ("# INDELs", ["number_of_indels"]),
            ],
        ),
        (
-            "multiqc_snpeff_snpeff_bcftools.yaml",
+            "multiqc_snpeff.yaml",
            [("# Missense variants", ["MISSENSE"])],
        ),
        (
-            "multiqc_quast_quast_bcftools.yaml",
+            "multiqc_quast_quast_variants.yaml",
            [("# Ns per 100kb consensus", ["# N's per 100 kbp"])],
        ),
        (
-            "multiqc_pangolin_pangolin_bcftools.yaml",
-            [("Pangolin lineage (BCFTools)", ["lineage"])],
-        ),
-        (
-            "multiqc_bcftools_nextclade_clade.yaml",
-            [("Nextclade clade (BCFTools)", ["clade"])],
+            "multiqc_pangolin.yaml",
+            [("Pangolin lineage", ["lineage"])],
        ),
+        ("multiqc_nextclade_clade.yaml", [("Nextclade clade", ["clade"])]),
    ]

    illumina_assembly_files = [

diff --git a/bin/scrape_software_versions.py b/bin/scrape_software_versions.py
deleted file mode 100755
index fa933f47..00000000
--- a/bin/scrape_software_versions.py
+++ /dev/null
@@ -1,36 +0,0 @@
-#!/usr/bin/env python
-from __future__ import print_function
-import os
-
-results = {}
-version_files = [x for x in os.listdir(".") if x.endswith(".version.txt")]
-for version_file in version_files:
-
-    software = version_file.replace(".version.txt", "")
-    if software == "pipeline":
-        software = "nf-core/viralrecon"
-
-    with open(version_file) as fin:
-        version = fin.read().strip()
-    results[software] = version
-
-# Dump to YAML
-print(
-    """
-id: 'software_versions'
-section_name: 'nf-core/viralrecon Software Versions'
-section_href: 'https://github.com/nf-core/viralrecon'
-plot_type: 'html'
-description: 'are collected at run time from the software output.'
-data: |
-    <dl class="dl-horizontal">
-"""
-)
-for k, v in sorted(results.items()):
-    print("    <dt>{}</dt><dd><samp>{}</samp></dd>".format(k, v))
-print("    </dl>
") - -# Write out as tsv file: -with open("software_versions.tsv", "w") as f: - for k, v in sorted(results.items()): - f.write("{}\t{}\n".format(k, v)) diff --git a/conf/base.config b/conf/base.config index 59d5e195..780d527c 100644 --- a/conf/base.config +++ b/conf/base.config @@ -47,4 +47,7 @@ process { errorStrategy = 'retry' maxRetries = 2 } + withName:CUSTOM_DUMPSOFTWAREVERSIONS { + cache = false + } } diff --git a/conf/modules.config b/conf/modules.config index 82d3f090..64efa4c1 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -1,485 +1,45 @@ /* ======================================================================================== - Config file for defining DSL2 per module options + Config file for defining DSL2 per module options and publishing paths ======================================================================================== Available keys to override module options: - args = Additional arguments appended to command in module. - args2 = Second set of arguments appended to command in module (multi-tool modules). - args3 = Third set of arguments appended to command in module (multi-tool modules). - publish_dir = Directory to publish results. - publish_by_meta = Groovy list of keys available in meta map to append as directories to "publish_dir" path - If publish_by_meta = true - Value of ${meta['id']} is appended as a directory to "publish_dir" path - If publish_by_meta = ['id', 'custompath'] - If "id" is in meta map and "custompath" isn't then "${meta['id']}/custompath/" - is appended as a directory to "publish_dir" path - If publish_by_meta = false / null - No directories are appended to "publish_dir" path - publish_files = Groovy map where key = "file_ext" and value = "directory" to publish results for that file extension - The value of "directory" is appended to the standard "publish_dir" path as defined above. - If publish_files = null (unspecified) - All files are published. - If publish_files = false - No files are published. - suffix = File name suffix for output files. + ext.args = Additional arguments appended to command in module. + ext.args2 = Second set of arguments appended to command in module (multi-tool modules). + ext.args3 = Third set of arguments appended to command in module (multi-tool modules). + ext.prefix = File name prefix for output files. 
---------------------------------------------------------------------------------------- */ -params { - modules { - 'sra_ids_to_runinfo' { - publish_dir = 'public_data' - publish_files = ['tsv':'runinfo'] - } - 'sra_runinfo_to_ftp' { - publish_dir = 'public_data' - publish_files = ['tsv':'runinfo'] - } - 'sra_fastq_ftp' { - args = '-C - --max-time 1200' - publish_dir = 'public_data' - publish_files = ['fastq.gz':'', 'md5':'md5'] - } - 'sra_to_samplesheet' { - publish_dir = 'public_data' - publish_files = false - } - 'sra_merge_samplesheet' { - publish_dir = 'public_data' - } - 'nanopore_collapse_primers' { - publish_dir = 'genome' - } - 'nanopore_snpeff_build' { - publish_dir = 'genome' - } - 'nanopore_pycoqc' { - publish_dir = 'pycoqc' - } - 'nanopore_artic_guppyplex' { - args = '--min-length 400 --max-length 700' - publish_files = false - publish_dir = 'guppyplex' - } - 'nanopore_nanoplot' { - publish_by_meta = true - } - 'nanopore_artic_minion' { - args = '--normalise 500' - publish_files = ['.sorted.bam':'', '.sorted.bam.bai':'', 'fail.vcf':'', 'merged.vcf':'', 'primers.vcf':'', 'gz':'', 'tbi':'', '.consensus.fasta':''] - publish_dir = "${params.artic_minion_caller}" - } - 'nanopore_filter_bam' { - args = '-b -F 4' - suffix = '.mapped.sorted' - publish_dir = "${params.artic_minion_caller}" - } - 'nanopore_filter_bam_stats' { - suffix = '.mapped.sorted' - publish_files = ['bai':'', 'stats':'samtools_stats', 'flagstat':'samtools_stats', 'idxstats':'samtools_stats'] - publish_dir = "${params.artic_minion_caller}" - } - 'nanopore_bcftools_stats' { - publish_files = ['txt':''] - publish_dir = "${params.artic_minion_caller}/bcftools_stats" - } - 'nanopore_mosdepth_genome' { - args = '--fast-mode' - publish_files = ['summary.txt':''] - publish_dir = "${params.artic_minion_caller}/mosdepth/genome" - } - 'nanopore_plot_mosdepth_regions_genome' { - args = '--input_suffix .regions.bed.gz' - publish_files = ['tsv':'', 'pdf': ''] - publish_dir = "${params.artic_minion_caller}/mosdepth/genome" - } - 'nanopore_mosdepth_amplicon' { - args = '--fast-mode --use-median --thresholds 0,1,10,50,100,500' - publish_files = ['summary.txt':''] - publish_dir = "${params.artic_minion_caller}/mosdepth/amplicon" - } - 'nanopore_plot_mosdepth_regions_amplicon' { - args = '--input_suffix .regions.bed.gz' - publish_files = ['tsv':'', 'pdf': ''] - publish_dir = "${params.artic_minion_caller}/mosdepth/amplicon" - } - 'nanopore_pangolin' { - publish_dir = "${params.artic_minion_caller}/pangolin" - } - 'nanopore_nextclade' { - publish_files = ['csv':''] - publish_dir = "${params.artic_minion_caller}/nextclade" - } - 'nanopore_asciigenome' { - publish_dir = "${params.artic_minion_caller}/asciigenome" - publish_by_meta = true - } - 'nanopore_quast' { - publish_files = ['quast':''] - publish_dir = "${params.artic_minion_caller}" - } - 'nanopore_snpeff' { - publish_files = ['csv':'', 'txt':'', 'html':''] - publish_dir = "${params.artic_minion_caller}/snpeff" - } - 'nanopore_snpeff_bgzip' { - suffix = '.snpeff' - publish_dir = "${params.artic_minion_caller}/snpeff" - } - 'nanopore_snpeff_tabix' { - args = '-p vcf -f' - suffix = '.snpeff' - publish_dir = "${params.artic_minion_caller}/snpeff" - } - 'nanopore_snpeff_stats' { - suffix = '.snpeff' - publish_files = ['txt':'bcftools_stats'] - publish_dir = "${params.artic_minion_caller}/snpeff" - } - 'nanopore_snpsift' { - publish_dir = "${params.artic_minion_caller}/snpeff" - } - 'nanopore_multiqc' { - args = '' - publish_files = ['_data':"${params.artic_minion_caller}", 
'html':"${params.artic_minion_caller}", 'csv':"${params.artic_minion_caller}"] - } - 'illumina_bedtools_getfasta' { - args = '-s -nameOnly' - publish_dir = 'genome' - } - 'illumina_collapse_primers_illumina' { - publish_dir = 'genome' - } - 'illumina_bowtie2_build' { - args = '--seed 1' - publish_dir = 'genome/index' - } - 'illumina_snpeff_build' { - publish_dir = 'genome/db' - } - 'illumina_blast_makeblastdb' { - args = '-parse_seqids -dbtype nucl' - publish_dir = 'genome/db' - } - 'illumina_kraken2_build' { - args = '' - args2 = '' - args3 = '' - publish_dir = 'genome/db' - } - 'illumina_cat_fastq' { - publish_files = false - publish_dir = 'fastq' - } - 'illumina_fastqc_raw' { - args = '--quiet' - publish_dir = 'fastqc/raw' - } - 'illumina_fastqc_trim' { - args = '--quiet' - publish_dir = 'fastqc/trim' - } - 'illumina_fastp' { - args = '--cut_front --cut_tail --trim_poly_x --cut_mean_quality 30 --qualified_quality_phred 30 --unqualified_percent_limit 10 --length_required 50' - publish_files = ['json':'', 'html':'', 'log': 'log'] - } - 'illumina_kraken2_kraken2' { - args = '--report-zero-counts' - publish_files = ['txt':''] - } - 'illumina_bowtie2_align' { - args = '--local --very-sensitive-local --seed 1' - args2 = '-F4' - publish_files = ['log':'log'] - publish_dir = 'variants/bowtie2' - } - 'illumina_bowtie2_sort_bam' { - suffix = '.sorted' - publish_files = ['bam':'', 'bai':'', 'stats':'samtools_stats', 'flagstat':'samtools_stats', 'idxstats':'samtools_stats'] - publish_dir = 'variants/bowtie2' - } - 'illumina_ivar_trim' { - args = '-m 30 -q 20' - suffix = '.ivar_trim' - publish_files = ['log':'log'] - publish_dir = 'variants/bowtie2' - } - 'illumina_ivar_trim_sort_bam' { - suffix = '.ivar_trim.sorted' - publish_files = ['stats':'samtools_stats', 'flagstat':'samtools_stats', 'idxstats':'samtools_stats'] - publish_dir = 'variants/bowtie2' - } - 'illumina_picard_markduplicates' { - args = 'ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT TMP_DIR=tmp' - suffix = '.markduplicates.sorted' - publish_files = ['bam': '', 'metrics.txt':'picard_metrics'] - publish_dir = 'variants/bowtie2' - } - 'illumina_picard_markduplicates_sort_bam' { - suffix = '.markduplicates.sorted' - publish_files = ['bai':'', 'stats':'samtools_stats', 'flagstat':'samtools_stats', 'idxstats':'samtools_stats'] - publish_dir = 'variants/bowtie2' - } - 'illumina_picard_collectmultiplemetrics' { - args = 'VALIDATION_STRINGENCY=LENIENT TMP_DIR=tmp' - publish_files = ['metrics':'picard_metrics', 'pdf': 'picard_metrics/pdf'] - publish_dir = 'variants/bowtie2' - } - 'illumina_mosdepth_genome' { - args = '--fast-mode' - publish_files = ['summary.txt':''] - publish_dir = 'variants/bowtie2/mosdepth/genome' - } - 'illumina_plot_mosdepth_regions_genome' { - args = '--input_suffix .regions.bed.gz' - publish_files = ['tsv':'', 'pdf': ''] - publish_dir = 'variants/bowtie2/mosdepth/genome' - } - 'illumina_mosdepth_amplicon' { - args = '--fast-mode --use-median --thresholds 0,1,10,50,100,500' - publish_files = ['summary.txt':''] - publish_dir = 'variants/bowtie2/mosdepth/amplicon' - } - 'illumina_plot_mosdepth_regions_amplicon' { - args = '--input_suffix .regions.bed.gz' - publish_files = ['tsv':'', 'pdf': ''] - publish_dir = 'variants/bowtie2/mosdepth/amplicon' - } - 'illumina_ivar_variants' { - args = '-t 0.25 -q 20 -m 10' - args2 = '--count-orphans --no-BAQ --max-depth 0 --min-BQ 0' - publish_dir = 'variants/ivar' - } - 'illumina_ivar_variants_to_vcf' { - publish_files = ['log':'log'] - publish_dir = 'variants/ivar' - } - 
'illumina_ivar_tabix_bgzip' { - publish_dir = 'variants/ivar' - } - 'illumina_ivar_tabix_tabix' { - args = '-p vcf -f' - publish_dir = 'variants/ivar' - } - 'illumina_ivar_bcftools_stats' { - publish_files = ['txt':'bcftools_stats'] - publish_dir = 'variants/ivar' - } - 'illumina_ivar_consensus' { - args = '-t 0.75 -q 20 -m 10 -n N' - args2 = '--count-orphans --no-BAQ --max-depth 0 --min-BQ 0 -aa' - suffix = '.consensus' - publish_dir = 'variants/ivar/consensus' - } - 'illumina_ivar_consensus_plot' { - suffix = '.consensus' - publish_dir = 'variants/ivar/consensus/base_qc' - } - 'illumina_ivar_snpeff' { - publish_files = ['csv':'', 'txt':'', 'html':''] - publish_dir = 'variants/ivar/snpeff' - } - 'illumina_ivar_snpeff_bgzip' { - suffix = '.snpeff' - publish_dir = 'variants/ivar/snpeff' - } - 'illumina_ivar_snpeff_tabix' { - args = '-p vcf -f' - suffix = '.snpeff' - publish_dir = 'variants/ivar/snpeff' - } - 'illumina_ivar_snpeff_stats' { - suffix = '.snpeff' - publish_files = ['txt':'bcftools_stats'] - publish_dir = 'variants/ivar/snpeff' - } - 'illumina_ivar_snpsift' { - publish_dir = 'variants/ivar/snpeff' - } - 'illumina_ivar_quast' { - publish_files = ['quast':''] - publish_dir = 'variants/ivar' - } - 'illumina_ivar_pangolin' { - publish_dir = 'variants/ivar/pangolin' - } - 'illumina_ivar_nextclade' { - publish_files = ['csv':''] - publish_dir = 'variants/ivar/nextclade' - } - 'illumina_ivar_asciigenome' { - publish_dir = 'variants/ivar/asciigenome' - publish_by_meta = true - } - 'illumina_bcftools_mpileup' { - args = '--count-orphans --no-BAQ --max-depth 0 --min-BQ 20 --annotate FORMAT/AD,FORMAT/ADF,FORMAT/ADR,FORMAT/DP,FORMAT/SP,INFO/AD,INFO/ADF,INFO/ADR' - args2 = '--ploidy 1 --keep-alts --keep-masked-ref --multiallelic-caller --variants-only' - args3 = "--include 'INFO/DP>=10'" - publish_files = ['gz':'', 'gz.tbi':'', 'stats.txt':'bcftools_stats'] - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_consensus_genomecov' { - args = "-bga | awk '\$4 < 10'" - suffix = '.coverage' - publish_files = false - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_consensus_merge' { - suffix = '.coverage.merged' - publish_files = false - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_consensus_mask' { - suffix = '.coverage.masked' - publish_files = false - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_consensus_maskfasta' { - suffix = '.masked' - publish_files = false - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_consensus_bcftools' { - suffix = '.consensus' - publish_dir = 'variants/bcftools/consensus' - } - 'illumina_bcftools_consensus_plot' { - suffix = '.consensus' - publish_dir = 'variants/bcftools/consensus/base_qc' - } - 'illumina_bcftools_snpeff' { - publish_files = ['csv':'', 'txt':'', 'html':''] - publish_dir = 'variants/bcftools/snpeff' - } - 'illumina_bcftools_snpeff_bgzip' { - suffix = '.snpeff' - publish_dir = 'variants/bcftools/snpeff' - } - 'illumina_bcftools_snpeff_tabix' { - args = '-p vcf -f' - suffix = '.snpeff' - publish_dir = 'variants/bcftools/snpeff' - } - 'illumina_bcftools_snpeff_stats' { - suffix = '.snpeff' - publish_files = ['txt':'bcftools_stats'] - publish_dir = 'variants/bcftools/snpeff' - } - 'illumina_bcftools_snpsift' { - publish_dir = 'variants/bcftools/snpeff' - } - 'illumina_bcftools_quast' { - publish_files = ['quast':''] - publish_dir = 'variants/bcftools' - } - 'illumina_bcftools_pangolin' { - publish_dir = 'variants/bcftools/pangolin' - } - 'illumina_bcftools_nextclade' { - publish_files = 
['csv':''] - publish_dir = 'variants/bcftools/nextclade' - } - 'illumina_bcftools_asciigenome' { - publish_dir = 'variants/bcftools/asciigenome' - publish_by_meta = true - } - 'illumina_bcftools_isec' { - args = '--nfiles +2 --output-type z' - publish_dir = 'variants/intersect' - } - 'illumina_cutadapt' { - args = '--overlap 5 --minimum-length 30 --error-rate 0.1' - suffix = '.primer_trim' - publish_files = ['log':'log'] - publish_dir = 'assembly/cutadapt' - } - 'illumina_cutadapt_fastqc' { - args = '--quiet' - suffix = 'primer_trim' - publish_dir = 'assembly/cutadapt/fastqc' - } - 'illumina_spades' { - args = '' - publish_files = ['log':'log', 'fa':'', 'gfa':''] - publish_dir = "assembly/spades/${params.spades_mode}" - } - 'illumina_spades_bandage' { - args = '--height 1000' - publish_dir = "assembly/spades/${params.spades_mode}/bandage" - } - 'illumina_spades_blastn' { - args = "-outfmt '6 stitle std slen qlen qcovs'" - publish_dir = "assembly/spades/${params.spades_mode}/blastn" - } - 'illumina_spades_blastn_filter' { - suffix = '.filter.blastn' - publish_dir = "assembly/spades/${params.spades_mode}/blastn" - } - 'illumina_spades_abacas' { - args = '-m -p nucmer' - publish_dir = "assembly/spades/${params.spades_mode}/abacas" - } - 'illumina_spades_plasmidid' { - args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' - publish_files = ['html':'', 'tab':'', 'logs':'', 'images':''] - publish_dir = "assembly/spades/${params.spades_mode}/plasmidid" - } - 'illumina_spades_quast' { - publish_files = ['quast':''] - publish_dir = "assembly/spades/${params.spades_mode}" - } - 'illumina_unicycler' { - publish_files = ['log':'log', 'fa':'', 'gfa':''] - publish_dir = 'assembly/unicycler' - } - 'illumina_unicycler_bandage' { - args = '--height 1000' - publish_dir = 'assembly/unicycler/bandage' - } - 'illumina_unicycler_blastn' { - args = "-outfmt '6 stitle std slen qlen qcovs'" - publish_dir = 'assembly/unicycler/blastn' - } - 'illumina_unicycler_blastn_filter' { - suffix = '.filter.blastn' - publish_dir = 'assembly/unicycler/blastn' - } - 'illumina_unicycler_abacas' { - args = '-m -p nucmer' - publish_dir = 'assembly/unicycler/abacas' - } - 'illumina_unicycler_plasmidid' { - args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' - publish_files = ['html':'', 'tab':'', 'logs':'', 'images':''] - publish_dir = 'assembly/unicycler/plasmidid' - } - 'illumina_unicycler_quast' { - publish_files = ['quast':''] - publish_dir = 'assembly/unicycler' - } - 'illumina_minia' { - args = '-kmer-size 31 -abundance-min 20' - publish_dir = 'assembly/minia' - } - 'illumina_minia_blastn' { - args = "-outfmt '6 stitle std slen qlen qcovs'" - publish_dir = 'assembly/minia/blastn' - } - 'illumina_minia_blastn_filter' { - suffix = '.filter.blastn' - publish_dir = 'assembly/minia/blastn' - } - 'illumina_minia_abacas' { - args = '-m -p nucmer' - publish_dir = 'assembly/minia/abacas' - } - 'illumina_minia_plasmidid' { - args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' - publish_files = ['html':'', 'tab':'', 'logs':'', 'images':''] - publish_dir = 'assembly/minia/plasmidid' - } - 'illumina_minia_quast' { - publish_files = ['quast':''] - publish_dir = 'assembly/minia' - } - 'illumina_multiqc' { - args = '' - publish_files = ['_data':'', 'html':''] - } +// +// General configuration options +// + +process { + publishDir = [ + path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + + withName: 'SAMPLESHEET_CHECK' { + publishDir = [ + path: { "${params.outdir}/pipeline_info" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: 'CUSTOM_DUMPSOFTWAREVERSIONS' { + publishDir = [ + path: { "${params.outdir}/pipeline_info" }, + mode: 'copy', + pattern: '*_versions.yml' + ] } } + +if (params.platform == 'nanopore') { + includeConfig 'modules_nanopore.config' +} else if (params.platform == 'illumina') { + includeConfig 'modules_illumina.config' +} diff --git a/conf/modules_illumina.config b/conf/modules_illumina.config new file mode 100644 index 00000000..d21a5eeb --- /dev/null +++ b/conf/modules_illumina.config @@ -0,0 +1,1073 @@ +/* +======================================================================================== + Config file for defining DSL2 per module options and publishing paths +======================================================================================== + Available keys to override module options: + ext.args = Additional arguments appended to command in module. + ext.args2 = Second set of arguments appended to command in module (multi-tool modules). + ext.args3 = Third set of arguments appended to command in module (multi-tool modules). + ext.prefix = File name prefix for output files. +---------------------------------------------------------------------------------------- +*/ + +def variant_caller = params.variant_caller +if (!variant_caller) { variant_caller = params.protocol == 'amplicon' ? 'ivar' : 'bcftools' } + +def assemblers = params.assemblers ? params.assemblers.split(',').collect{ it.trim().toLowerCase() } : [] + +// +// Pre-processing and general configuration options +// + +process { + withName: '.*:.*:PREPARE_GENOME:GUNZIP_.*' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: '.*:.*:PREPARE_GENOME:UNTAR_.*' { + ext.args2 = '--no-same-owner' + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'CAT_FASTQ' { + publishDir = [ + path: { "${params.outdir}/fastq" }, + enabled: false + ] + } +} + +if (!params.skip_fastqc) { + process { + withName: '.*:.*:FASTQC_FASTP:FASTQC_RAW' { + ext.args = '--quiet' + publishDir = [ + path: { "${params.outdir}/fastqc/raw" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + } +} + +if (!params.skip_fastp) { + process { + withName: 'FASTP' { + ext.args = '--cut_front --cut_tail --trim_poly_x --cut_mean_quality 30 --qualified_quality_phred 30 --unqualified_percent_limit 10 --length_required 50' + publishDir = [ + [ + path: { "${params.outdir}/fastp" }, + mode: 'copy', + pattern: "*.{json,html}" + ], + [ + path: { "${params.outdir}/fastp/log" }, + mode: 'copy', + pattern: "*.log" + ], + [ + path: { "${params.outdir}/fastp" }, + mode: 'copy', + pattern: "*.fail.fastq.gz", + enabled: params.save_trimmed_fail + ] + ] + } + + withName: 'MULTIQC_TSV_FAIL_READS' { + publishDir = [ + path: { "${params.outdir}/multiqc" }, + enabled: false + ] + } + } + + if (!params.skip_fastqc) { + process { + withName: '.*:.*:FASTQC_FASTP:FASTQC_TRIM' { + ext.args = '--quiet' + publishDir = [ + path: { "${params.outdir}/fastqc/trim" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } +} + +if (!params.skip_kraken2) { + process { + withName: 'KRAKEN2_BUILD' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'KRAKEN2_KRAKEN2' { + ext.args = '--report-zero-counts' + publishDir = [ + path: { "${params.outdir}/kraken2" }, + mode: 'copy', + pattern: "*.txt" + ] + } + } +} + +// +// Variant calling configuration options +// + +if (!params.skip_variants) { + process { + withName: 'BOWTIE2_BUILD' { + ext.args = '--seed 1' + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'BOWTIE2_ALIGN' { + ext.args = '--local --very-sensitive-local --seed 1' + ext.args2 = '-F4' + publishDir = [ + [ + path: { "${params.outdir}/variants/bowtie2/log" }, + mode: 'copy', + pattern: "*.log" + ], + [ + path: { "${params.outdir}/variants/bowtie2/unmapped" }, + mode: 'copy', + pattern: "*.fastq.gz", + enabled: params.save_unaligned + ] + ] + } + + withName: '.*:.*:ALIGN_BOWTIE2:.*:SAMTOOLS_SORT' { + ext.prefix = { "${meta.id}.sorted" } + publishDir = [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: "*.bam" + ] + } + + withName: '.*:.*:ALIGN_BOWTIE2:.*:SAMTOOLS_INDEX' { + publishDir = [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: "*.bai" + ] + } + + withName: '.*:.*:ALIGN_BOWTIE2:.*:BAM_STATS_SAMTOOLS:.*' { + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/samtools_stats" }, + mode: 'copy', + pattern: "*.{stats,flagstat,idxstats}" + ] + } + + withName: 'MULTIQC_TSV_FAIL_MAPPED' { + publishDir = [ + path: { "${params.outdir}/multiqc" }, + enabled: false + ] + } + } + + if (params.protocol == 'amplicon' || !params.skip_asciigenome) { + process { + withName: 'CUSTOM_GETCHROMSIZES' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + } + } + + if (!params.skip_ivar_trim && params.protocol == 'amplicon') { + process { + withName: 'IVAR_TRIM' { + ext.args = [ + '-m 30 -q 20', + params.ivar_trim_noprimer ? '' : '-e', + params.ivar_trim_offset ? 
"-x ${params.ivar_trim_offset}" : '' + ].join(' ').trim() + ext.prefix = { "${meta.id}.ivar_trim" } + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/log" }, + mode: 'copy', + pattern: '*.log' + ] + } + + withName: '.*:.*:PRIMER_TRIM_IVAR:.*:SAMTOOLS_SORT' { + ext.prefix = { "${meta.id}.ivar_trim.sorted" } + publishDir = [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: "*.bam", + enabled: params.skip_markduplicates + ] + } + + withName: '.*:.*:PRIMER_TRIM_IVAR:.*:SAMTOOLS_INDEX' { + publishDir = [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: "*.bai", + enabled: params.skip_markduplicates + ] + } + + withName: '.*:.*:PRIMER_TRIM_IVAR:.*:BAM_STATS_SAMTOOLS:.*' { + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/samtools_stats" }, + mode: 'copy', + pattern: "*.{stats,flagstat,idxstats}" + ] + } + } + } + + if (!params.skip_markduplicates) { + process { + withName: 'PICARD_MARKDUPLICATES' { + ext.args = [ + 'ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT TMP_DIR=tmp', + params.filter_duplicates ? 'REMOVE_DUPLICATES=true' : '' + ].join(' ').trim() + ext.prefix = { "${meta.id}.markduplicates.sorted" } + publishDir = [ + [ + path: { "${params.outdir}/variants/bowtie2/picard_metrics" }, + mode: 'copy', + pattern: '*metrics.txt' + ], + [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: '*.bam' + ] + ] + } + + withName: '.*:MARK_DUPLICATES_PICARD:SAMTOOLS_INDEX' { + ext.prefix = { "${meta.id}.markduplicates.sorted" } + publishDir = [ + path: { "${params.outdir}/variants/bowtie2" }, + mode: 'copy', + pattern: '*.bai' + ] + } + + withName: '.*:MARK_DUPLICATES_PICARD:BAM_STATS_SAMTOOLS:.*' { + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/samtools_stats" }, + mode: 'copy', + pattern: '*.{stats,flagstat,idxstats}' + ] + } + } + } + + if (!params.skip_picard_metrics) { + process { + withName: 'PICARD_COLLECTMULTIPLEMETRICS' { + ext.args = 'VALIDATION_STRINGENCY=LENIENT TMP_DIR=tmp' + publishDir = [ + [ + path: { "${params.outdir}/variants/bowtie2/picard_metrics" }, + mode: 'copy', + pattern: '*metrics' + ], + [ + path: { "${params.outdir}/variants/bowtie2/picard_metrics/pdf" }, + mode: 'copy', + pattern: '*.pdf' + ] + ] + } + } + } + + if (!params.skip_mosdepth) { + process { + withName: 'MOSDEPTH_GENOME' { + ext.args = '--fast-mode' + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/mosdepth/genome" }, + mode: 'copy', + pattern: "*.summary.txt" + ] + } + + withName: 'PLOT_MOSDEPTH_REGIONS_GENOME' { + ext.args = '--input_suffix .regions.bed.gz' + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/mosdepth/genome" }, + mode: 'copy', + pattern: "*.{tsv,pdf}" + ] + } + } + + if (params.protocol == 'amplicon') { + process { + withName: 'COLLAPSE_PRIMERS' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename }, + enabled: params.save_reference + ] + } + + withName: 'MOSDEPTH_AMPLICON' { + ext.args = '--fast-mode --use-median --thresholds 0,1,10,50,100,500' + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/mosdepth/amplicon" }, + mode: 'copy', + pattern: "*.summary.txt" + ] + } + + withName: 'PLOT_MOSDEPTH_REGIONS_AMPLICON' { + ext.args = '--input_suffix .regions.bed.gz' + publishDir = [ + path: { "${params.outdir}/variants/bowtie2/mosdepth/amplicon" }, + mode: 'copy', + pattern: "*.{tsv,pdf}" + ] + } + } + } + } + + if (variant_caller == 'ivar') { + process { + withName: 'IVAR_VARIANTS' { + ext.args = '-t 0.25 -q 20 -m 10' + ext.args2 = '--ignore-overlaps --count-orphans --no-BAQ --max-depth 0 --min-BQ 0' + publishDir = [ + path: { "${params.outdir}/variants/ivar" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: 'IVAR_VARIANTS_TO_VCF' { + ext.args = params.protocol == 'amplicon' ? '--ignore_strand_bias' : '' + publishDir = [ + path: { "${params.outdir}/variants/ivar/log" }, + mode: 'copy', + pattern: '*.log' + ] + } + + withName: '.*:.*:VARIANTS_IVAR:.*:TABIX_BGZIP' { + publishDir = [ + path: { "${params.outdir}/variants/ivar" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:VARIANTS_IVAR:.*:.*:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/variants/ivar" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:VARIANTS_IVAR:.*:.*:BCFTOOLS_STATS' { + publishDir = [ + path: { "${params.outdir}/variants/ivar/bcftools_stats" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (variant_caller == 'bcftools') { + process { + withName: 'BCFTOOLS_MPILEUP' { + ext.args = '--ignore-overlaps --count-orphans --no-BAQ --max-depth 0 --min-BQ 20 --annotate FORMAT/AD,FORMAT/ADF,FORMAT/ADR,FORMAT/DP,FORMAT/SP,INFO/AD,INFO/ADF,INFO/ADR' + ext.args2 = '--ploidy 1 --keep-alts --keep-masked-ref --multiallelic-caller --variants-only' + ext.args3 = "--include 'INFO/DP>=10'" + ext.prefix = { "${meta.id}.orig" } + publishDir = [ + path: { "${params.outdir}/variants/bcftools" }, + mode: 'copy', + pattern: '*.mpileup', + enabled: params.save_mpileup + ] + } + + withName: 'BCFTOOLS_NORM' { + ext.args = '--do-not-normalize --output-type z --multiallelics -any' + publishDir = [ + path: { "${params.outdir}/variants/bcftools" }, + mode: 'copy', + pattern: "*.vcf.gz" + ] + } + + withName: '.*:.*:VARIANTS_BCFTOOLS:.*:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/variants/bcftools" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:VARIANTS_BCFTOOLS:.*:BCFTOOLS_STATS' { + publishDir = [ + path: { "${params.outdir}/variants/bcftools/bcftools_stats" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_asciigenome) { + process { + withName: 'ASCIIGENOME' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/asciigenome/${meta.id}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + } + } + + if (!params.skip_snpeff) { + process { + withName: 'SNPEFF_BUILD' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'SNPEFF_ANN' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/snpeff" }, + mode: 'copy', + pattern: "*.{csv,txt,html}" + ] + } + + withName: '.*:.*:.*:.*:SNPEFF_SNPSIFT:.*:TABIX_BGZIP' { + ext.prefix = { "${meta.id}.snpeff" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:.*:SNPEFF_SNPSIFT:.*:.*:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:.*:SNPEFF_SNPSIFT:.*:.*:BCFTOOLS_STATS' { + ext.prefix = { "${meta.id}.snpeff" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/snpeff/bcftools_stats" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:.*:SNPEFF_SNPSIFT:SNPSIFT_EXTRACTFIELDS' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + + if (!params.skip_variants_long_table) { + process { + withName: 'BCFTOOLS_QUERY' { + ext.args = [ + variant_caller == 'ivar' ? "-H -f '%CHROM\\t%POS\\t%REF\\t%ALT\\t%FILTER\\t[%DP\\t]\\t[%REF_DP\\t]\\t[%ALT_DP\\t]\\n'" : '', + variant_caller == 'bcftools' ? "-H -f '%CHROM\\t%POS\\t%REF\\t%ALT\\t%FILTER\\t[%DP\\t]\\t[%AD\\t]\\n'" : '', + ].join(' ').trim() + ext.prefix = { "${meta.id}.bcftools_query" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}" }, + enabled: false + ] + } + + withName: 'MAKE_VARIANTS_LONG_TABLE' { + ext.args = "--variant_caller ${variant_caller}" + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + } + + if (!params.skip_consensus && params.consensus_caller == 'ivar') { + process { + withName: 'IVAR_CONSENSUS' { + ext.args = '-t 0.75 -q 20 -m 10 -n N' + ext.args2 = '--count-orphans --no-BAQ --max-depth 0 --min-BQ 0 -aa' + ext.prefix = { "${meta.id}.consensus" } + publishDir = [ + [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/ivar" }, + mode: 'copy', + pattern: "*.{fa,txt}", + ], + [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/ivar" }, + mode: 'copy', + pattern: "*.mpileup", + enabled: params.save_mpileup + ] + ] + } + } + } + + if (!params.skip_consensus && params.consensus_caller == 'bcftools') { + process { + withName: 'BCFTOOLS_FILTER' { + ext.args = [ + '--output-type z', + variant_caller == 'ivar' ? "--include 'FORMAT/ALT_FREQ >= 0.75'" : '', + variant_caller == 'bcftools' ? "--include 'FORMAT/AD[:1] / FORMAT/DP >= 0.75'" : '', + ].join(' ').trim() + ext.prefix = { "${meta.id}.filtered" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + + withName: '.*:.*:CONSENSUS_BCFTOOLS:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: 'MAKE_BED_MASK' { + ext.args = "-a --ignore-overlaps --count-orphans --no-BAQ --max-depth 0 --min-BQ 0" + ext.args2 = 10 + ext.prefix = { "${meta.id}.coverage.masked" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + mode: 'copy', + pattern: "*.mpileup", + enabled: params.save_mpileup + ] + } + + withName: 'BEDTOOLS_MERGE' { + ext.prefix = { "${meta.id}.coverage.merged" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + enabled: false + ] + } + + withName: 'BEDTOOLS_MASKFASTA' { + ext.prefix = { "${meta.id}.masked" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + enabled: false + ] + } + + withName: 'BCFTOOLS_CONSENSUS' { + ext.prefix = { "${meta.id}" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + enabled: false + ] + } + + withName: 'RENAME_FASTA_HEADER' { + ext.prefix = { "${meta.id}.consensus" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/bcftools" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_consensus) { + if (!params.skip_pangolin) { + process { + withName: 'PANGOLIN' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/${params.consensus_caller}/pangolin" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_nextclade) { + process { + withName: 'NEXTCLADE_RUN' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/${params.consensus_caller}/nextclade" }, + mode: 'copy', + pattern: "*.csv" + ] + } + + withName: 'MULTIQC_TSV_NEXTCLADE' { + publishDir = [ + path: { "${params.outdir}/multiqc" }, + enabled: false + ] + } + } + } + + if (!params.skip_variants_quast) { + process { + withName: '.*:.*:CONSENSUS_.*:.*:QUAST' { + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/${params.consensus_caller}" }, + mode: 'copy', + pattern: "quast" + ] + } + } + } + + if (!params.skip_consensus_plots) { + process { + withName: 'PLOT_BASE_DENSITY' { + ext.prefix = { "${meta.id}.consensus" } + publishDir = [ + path: { "${params.outdir}/variants/${variant_caller}/consensus/${params.consensus_caller}/base_qc" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + } +} + +if (!params.skip_assembly) { + if (!params.skip_blast) { + process { + withName: 'BLAST_MAKEBLASTDB' { + ext.args = '-parse_seqids -dbtype nucl' + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + } + } + + if (params.protocol == 'amplicon' && !params.skip_cutadapt) { + process { + withName: 'BEDTOOLS_GETFASTA' { + ext.args = '-s -nameOnly' + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename }, + enabled: params.save_reference + ] + } + + withName: 'CUTADAPT' { + ext.args = '--overlap 5 --minimum-length 30 --error-rate 0.1' + ext.prefix = { "${meta.id}.primer_trim" } + publishDir = [ + path: { "${params.outdir}/assembly/cutadapt/log" }, + mode: 'copy', + pattern: '*.log' + ] + } + } + + if (!params.skip_fastqc) { + process { + withName: '.*:.*:FASTQC' { + ext.args = '--quiet' + ext.prefix = { "${meta.id}.primer_trim" } + publishDir = [ + path: { "${params.outdir}/assembly/cutadapt/fastqc" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + } + + if ('spades' in assemblers) { + process { + withName: 'SPADES' { + ext.args = params.spades_mode ? "--${params.spades_mode}" : '' + publishDir = [ + [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}" }, + mode: 'copy', + pattern: '*.{fa.gz,gfa.gz}' + ], + [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/log" }, + mode: 'copy', + pattern: '*.log' + ] + ] + } + + withName: '.*:.*:ASSEMBLY_SPADES:GUNZIP_SCAFFOLDS' { + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}" }, + enabled: false + ] + } + + withName: '.*:.*:ASSEMBLY_SPADES:GUNZIP_GFA' { + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}" }, + enabled: false + ] + } + } + + if (!params.skip_bandage) { + process { + withName: '.*:.*:ASSEMBLY_SPADES:BANDAGE_IMAGE' { + ext.args = '--height 1000' + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/bandage" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_blast) { + process { + withName: '.*:.*:ASSEMBLY_SPADES:.*:BLAST_BLASTN' { + ext.args = "-outfmt '6 stitle std slen qlen qcovs'" + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:ASSEMBLY_SPADES:.*:FILTER_BLASTN' { + ext.prefix = { "${meta.id}.filter.blastn" } + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_assembly_quast) { + process { + withName: '.*:.*:ASSEMBLY_SPADES:.*:QUAST' { + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}" }, + mode: 'copy', + pattern: "quast" + ] + } + } + } + + if (!params.skip_abacas) { + process { + withName: '.*:.*:ASSEMBLY_SPADES:.*:ABACAS' { + ext.args = '-m -p nucmer' + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/abacas" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_plasmidid) { + process { + withName: '.*:.*:ASSEMBLY_SPADES:.*:PLASMIDID' { + ext.args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' + publishDir = [ + path: { "${params.outdir}/assembly/spades/${params.spades_mode}/plasmidid" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + } + } + } + + if ('unicycler' in assemblers) { + process { + withName: 'UNICYCLER' { + publishDir = [ + [ + path: { "${params.outdir}/assembly/unicycler" }, + mode: 'copy', + pattern: '*.{fa.gz,gfa.gz}' + ], + [ + path: { "${params.outdir}/assembly/unicycler/log" }, + mode: 'copy', + pattern: '*.log' + ] + ] + } + + withName: '.*:.*:ASSEMBLY_UNICYCLER:GUNZIP_SCAFFOLDS' { + publishDir = [ + path: { "${params.outdir}/assembly/unicycler" }, + enabled: false + ] + } + + withName: '.*:.*:ASSEMBLY_UNICYCLER:GUNZIP_GFA' { + publishDir = [ + path: { "${params.outdir}/assembly/unicycler" }, + enabled: false + ] + } + } + + if (!params.skip_bandage) { + process { + withName: '.*:.*:ASSEMBLY_UNICYCLER:BANDAGE_IMAGE' { + ext.args = '--height 1000' + publishDir = [ + path: { "${params.outdir}/assembly/unicycler/bandage" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_blast) { + process { + withName: '.*:.*:ASSEMBLY_UNICYCLER:.*:BLAST_BLASTN' { + ext.args = "-outfmt '6 stitle std slen qlen qcovs'" + publishDir = [ + path: { "${params.outdir}/assembly/unicycler/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:ASSEMBLY_UNICYCLER:.*:FILTER_BLASTN' { + ext.prefix = { "${meta.id}.filter.blastn" } + publishDir = [ + path: { "${params.outdir}/assembly/unicycler/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_assembly_quast) { + process { + withName: '.*:.*:ASSEMBLY_UNICYCLER:.*:QUAST' { + publishDir = [ + path: { "${params.outdir}/assembly/unicycler" }, + mode: 'copy', + pattern: "quast" + ] + } + } + } + + if (!params.skip_abacas) { + process { + withName: '.*:.*:ASSEMBLY_UNICYCLER:.*:ABACAS' { + ext.args = '-m -p nucmer' + publishDir = [ + path: { "${params.outdir}/assembly/unicycler/abacas" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_plasmidid) { + process { + withName: '.*:.*:ASSEMBLY_UNICYCLER:.*:PLASMIDID' { + ext.args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' + publishDir = [ + path: { "${params.outdir}/assembly/unicycler/plasmidid" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + } + + if ('minia' in assemblers) { + process { + withName: 'MINIA' { + ext.args = '-kmer-size 31 -abundance-min 20' + publishDir = [ + path: { "${params.outdir}/assembly/minia" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + + if (!params.skip_blast) { + process { + withName: '.*:.*:ASSEMBLY_MINIA:.*:BLAST_BLASTN' { + ext.args = "-outfmt '6 stitle std slen qlen qcovs'" + publishDir = [ + path: { "${params.outdir}/assembly/minia/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:ASSEMBLY_MINIA:.*:FILTER_BLASTN' { + ext.prefix = { "${meta.id}.filter.blastn" } + publishDir = [ + path: { "${params.outdir}/assembly/minia/blastn" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + } + } + + if (!params.skip_assembly_quast) { + process { + withName: '.*:.*:ASSEMBLY_MINIA:.*:QUAST' { + publishDir = [ + path: { "${params.outdir}/assembly/minia" }, + mode: 'copy', + pattern: "quast" + ] + } + } + } + + if (!params.skip_abacas) { + process { + withName: '.*:.*:ASSEMBLY_MINIA:.*:ABACAS' { + ext.args = '-m -p nucmer' + publishDir = [ + path: { "${params.outdir}/assembly/minia/abacas" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + + if (!params.skip_plasmidid) { + process { + withName: '.*:.*:ASSEMBLY_MINIA:.*:PLASMIDID' { + ext.args = '--only-reconstruct -C 47 -S 47 -i 60 --no-trim -k 0.80' + publishDir = [ + path: { "${params.outdir}/assembly/minia/plasmidid" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } + } +} + +if (!params.skip_multiqc) { + process { + withName: 'MULTIQC' { + ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' + publishDir = [ + [ + path: { "${params.outdir}/multiqc" }, + mode: 'copy', + pattern: 'multiqc*' + ], + [ + path: { "${params.outdir}/multiqc" }, + mode: 'copy', + pattern: '*variants_metrics_mqc.csv', + enabled: !params.skip_variants + ], + [ + path: { "${params.outdir}/multiqc" }, + mode: 'copy', + pattern: '*assembly_metrics_mqc.csv', + enabled: !params.skip_assembly + ] + ] + } + } +} diff --git a/conf/modules_nanopore.config b/conf/modules_nanopore.config new file mode 100644 index 00000000..60a9e6b4 --- /dev/null +++ b/conf/modules_nanopore.config @@ -0,0 +1,373 @@ +/* +======================================================================================== + Config file for defining DSL2 per module options and publishing paths +======================================================================================== + Available keys to override module options: + ext.args = Additional arguments appended to command in module. + ext.args2 = Second set of arguments appended to command in module (multi-tool modules). + ext.args3 = Third set of arguments appended to command in module (multi-tool modules). + ext.prefix = File name prefix for output files. +---------------------------------------------------------------------------------------- +*/ + +// +// General configuration options +// + +process { + withName: 'GUNZIP_.*' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'CUSTOM_GETCHROMSIZES' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'MULTIQC_TSV_BARCODE_COUNT|MULTIQC_TSV_GUPPYPLEX_COUNT' { + publishDir = [ + path: { "${params.outdir}/multiqc/${params.artic_minion_caller}" }, + enabled: false + ] + } + + withName: 'ARTIC_GUPPYPLEX' { + ext.args = params.primer_set_version == 1200 ? '--min-length 250 --max-length 1500' : '--min-length 400 --max-length 700' + publishDir = [ + path: { "${params.outdir}/guppyplex" }, + enabled: false + ] + } + + withName: 'ARTIC_MINION' { + ext.args = [ + '--normalise 500', + params.artic_minion_caller == 'medaka' ? '--medaka' : '', + params.artic_minion_aligner == 'bwa' ? 
'--bwa' : '--minimap2' + ].join(' ').trim() + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + pattern: "*.{sorted.bam,sorted.bam.bai,fail.vcf,merged.vcf,primers.vcf,gz,tbi,consensus.fasta}" + ] + } + + withName: 'VCFLIB_VCFUNIQ' { + ext.args = '-f' + ext.prefix = { "${meta.id}.pass.unique" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:SAMTOOLS_VIEW' { + ext.args = '-b -F 4' + ext.prefix = { "${meta.id}.mapped.sorted" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:SAMTOOLS_INDEX' { + ext.prefix = { "${meta.id}.mapped.sorted" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:BAM_STATS_SAMTOOLS:.*' { + ext.prefix = { "${meta.id}.mapped.sorted" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/samtools_stats" }, + mode: 'copy', + pattern: "*.{stats,flagstat,idxstats}" + ] + } + + withName: '.*:.*:BCFTOOLS_STATS' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/bcftools_stats" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } +} + +// +// Optional configuration options +// + +if (params.input) { + process { + withName: 'MULTIQC_TSV_NO_.*' { + publishDir = [ + path: { "${params.outdir}/multiqc/${params.artic_minion_caller}" }, + enabled: false + ] + } + } +} + +if (params.sequencing_summary && !params.skip_pycoqc) { + process { + withName: 'PYCOQC' { + publishDir = [ + path: { "${params.outdir}/pycoqc" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } +} + +if (!params.skip_nanoplot) { + process { + withName: 'NANOPLOT' { + publishDir = [ + path: { "${params.outdir}/nanoplot/${meta.id}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } +} + +if (!params.skip_mosdepth) { + process { + withName: 'COLLAPSE_PRIMERS' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename }, + enabled: params.save_reference + ] + } + + withName: 'MOSDEPTH_GENOME' { + ext.args = '--fast-mode' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/mosdepth/genome" }, + mode: 'copy', + pattern: "*.summary.txt" + ] + } + + withName: 'PLOT_MOSDEPTH_REGIONS_GENOME' { + ext.args = '--input_suffix .regions.bed.gz' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/mosdepth/genome" }, + mode: 'copy', + pattern: "*.{tsv,pdf}" + ] + } + + withName: 'MOSDEPTH_AMPLICON' { + ext.args = '--fast-mode --use-median --thresholds 0,1,10,50,100,500' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/mosdepth/amplicon" }, + mode: 'copy', + pattern: "*.summary.txt" + ] + } + + withName: 'PLOT_MOSDEPTH_REGIONS_AMPLICON' { + ext.args = '--input_suffix .regions.bed.gz' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/mosdepth/amplicon" }, + mode: 'copy', + pattern: "*.{tsv,pdf}" + ] + } + } +} + +if (!params.skip_pangolin) { + process { + withName: 'PANGOLIN' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/pangolin" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } +} + +if (!params.skip_nextclade) { + process { + withName: 'UNTAR' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'NEXTCLADE_DATASETGET' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'NEXTCLADE_RUN' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/nextclade" }, + mode: 'copy', + pattern: "*.csv" + ] + } + + withName: 'MULTIQC_TSV_NEXTCLADE' { + publishDir = [ + path: { "${params.outdir}/multiqc/${params.artic_minion_caller}" }, + enabled: false + ] + } + } +} + +if (!params.skip_variants_quast) { + process { + withName: 'QUAST' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + pattern: "quast" + ] + } + } +} + +if (!params.skip_snpeff) { + process { + withName: 'SNPEFF_BUILD' { + publishDir = [ + path: { "${params.outdir}/genome" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, + enabled: params.save_reference + ] + } + + withName: 'SNPEFF_ANN' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/snpeff" }, + mode: 'copy', + pattern: "*.{csv,txt,html}" + ] + } + + withName: '.*:.*:.*:.*:TABIX_BGZIP' { + ext.prefix = { "${meta.id}.snpeff" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:.*:.*:TABIX_TABIX' { + ext.args = '-p vcf -f' + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: '.*:.*:.*:.*:.*:BCFTOOLS_STATS' { + ext.prefix = { "${meta.id}.snpeff" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/snpeff/bcftools_stats" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + + withName: 'SNPSIFT_EXTRACTFIELDS' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/snpeff" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + + if (!params.skip_variants_long_table) { + process { + withName: 'BCFTOOLS_QUERY' { + ext.args = [ + params.artic_minion_caller == 'nanopolish' ? "-H -f '%CHROM\\t%POS\\t%REF\\t%ALT\\t%FILTER\\t%StrandSupport\\n'" : '', + params.artic_minion_caller == 'medaka' ? "-H -f '%CHROM\\t%POS\\t%REF\\t%ALT\\t%FILTER\\t%DP\\t%AC\\n'" : '' + ].join(' ').trim() + ext.prefix = { "${meta.id}.bcftools_query" } + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + enabled: false + ] + } + + withName: 'MAKE_VARIANTS_LONG_TABLE' { + ext.args = "--variant_caller ${params.artic_minion_caller}" + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } + } +} + +if (!params.skip_asciigenome) { + process { + withName: 'ASCIIGENOME' { + publishDir = [ + path: { "${params.outdir}/${params.artic_minion_caller}/asciigenome/${meta.id}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } +} + +if (!params.skip_multiqc) { + process { + withName: 'MULTIQC' { + ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' + publishDir = [ + path: { "${params.outdir}/multiqc/${params.artic_minion_caller}" }, + mode: 'copy', + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + } +} diff --git a/conf/test.config b/conf/test.config index ba5e0938..c3109b6a 100644 --- a/conf/test.config +++ b/conf/test.config @@ -16,8 +16,8 @@ params { // Limit resources so that this can run on GitHub Actions max_cpus = 2 - max_memory = 6.GB - max_time = 6.h + max_memory = '6.GB' + max_time = '6.h' // Input data to test amplicon analysis input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv' @@ -30,10 +30,10 @@ params { genome = 'MN908947.3' kraken2_db = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/genome/kraken2/kraken2_hs22.tar.gz' - // Other pipeline options - callers = 'ivar,bcftools' - assemblers = 'spades,unicycler,minia' - - // Skip this by default to bypass Github Actions disk quota errors - skip_plasmidid = true + // Variant calling options + variant_caller = 'ivar' + + // Assembly options + assemblers = 'spades,unicycler,minia' + skip_plasmidid = true // Skip this by default to bypass Github Actions disk quota errors } diff --git a/conf/test_full.config b/conf/test_full.config index 10041d28..e8a6d2cc 100644 --- a/conf/test_full.config +++ b/conf/test_full.config @@ -24,13 +24,16 @@ params { // Genome references genome = 'MN908947.3' - // Other pipeline options - callers = 'ivar,bcftools' - assemblers = 'spades,unicycler,minia' + // Variant calling options + variant_caller = 'ivar' + + // Assembly options + assemblers = 'spades,unicycler,minia' + skip_plasmidid = true // Skip this by default to bypass Github Actions disk quota errors } process { - withName:PLASMIDID { + withName: 'PLASMIDID' { errorStrategy = 'ignore' } } diff --git a/conf/test_full_sispa.config b/conf/test_full_sispa.config index 49b295fc..2e8b04d2 100644 --- a/conf/test_full_sispa.config +++ b/conf/test_full_sispa.config @@ -22,7 +22,15 @@ params { // Genome 
references genome = 'MN908947.3' - // Other pipeline options - callers = 'ivar,bcftools' + // Variant calling options + variant_caller = 'bcftools' + + // Assembly options assemblers = 'spades,unicycler,minia' } + +process { + withName: 'PLASMIDID' { + errorStrategy = 'ignore' + } +} \ No newline at end of file diff --git a/conf/test_nanopore.config b/conf/test_nanopore.config index d7c6b945..647a5e5f 100644 --- a/conf/test_nanopore.config +++ b/conf/test_nanopore.config @@ -2,8 +2,8 @@ ======================================================================================== Nextflow config file for running minimal tests ======================================================================================== - Defines input files and everything required to run a fast and simple pipeline test. - + Defines input files and everything required to run a fast and simple pipeline test. + Use as follows: nextflow run nf-core/viralrecon -profile test_nanopore, @@ -16,8 +16,8 @@ params { // Limit resources so that this can run on GitHub Actions max_cpus = 2 - max_memory = 6.GB - max_time = 6.h + max_memory = '6.GB' + max_time = '6.h' // Input data to test nanopore analysis platform = 'nanopore' diff --git a/conf/test_sispa.config b/conf/test_sispa.config index bce9406d..9aaff136 100644 --- a/conf/test_sispa.config +++ b/conf/test_sispa.config @@ -16,8 +16,8 @@ params { // Limit resources so that this can run on GitHub Actions max_cpus = 2 - max_memory = 6.GB - max_time = 6.h + max_memory = '6.GB' + max_time = '6.h' // Input data to test SISPA/metagenomics analysis input = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_sispa.csv' @@ -28,10 +28,10 @@ params { genome = 'MN908947.3' kraken2_db = 'https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/genome/kraken2/kraken2_hs22.tar.gz' - // Other pipeline options - callers = 'ivar,bcftools' - assemblers = 'spades,unicycler,minia' - - // Skip this by default to bypass Github Actions disk quota errors - skip_plasmidid = true + // Variant calling options + variant_caller = 'bcftools' + + // Assembly options + assemblers = 'spades,unicycler,minia' + skip_plasmidid = true // Skip this by default to bypass Github Actions disk quota errors } diff --git a/docs/images/nextclade_tag_example.png b/docs/images/nextclade_tag_example.png new file mode 100644 index 00000000..1d29b64e Binary files /dev/null and b/docs/images/nextclade_tag_example.png differ diff --git a/docs/images/nf-core-viralrecon_logo.png b/docs/images/nf-core-viralrecon_logo.png deleted file mode 100644 index 2f41904f..00000000 Binary files a/docs/images/nf-core-viralrecon_logo.png and /dev/null differ diff --git a/docs/images/nf-core-viralrecon_logo_dark.png b/docs/images/nf-core-viralrecon_logo_dark.png new file mode 100644 index 00000000..d66135b9 Binary files /dev/null and b/docs/images/nf-core-viralrecon_logo_dark.png differ diff --git a/docs/images/nf-core-viralrecon_logo_light.png b/docs/images/nf-core-viralrecon_logo_light.png new file mode 100644 index 00000000..dd92a124 Binary files /dev/null and b/docs/images/nf-core-viralrecon_logo_light.png differ diff --git a/docs/output.md b/docs/output.md index cc39171c..8c8e8d8b 100644 --- a/docs/output.md +++ b/docs/output.md @@ -22,6 +22,7 @@ The directories listed below will be created in the results directory after the * [Pangolin](#nanopore-pangolin) - Lineage analysis * [Nextclade](#nanopore-nextclade) - Clade assignment, mutation calling and sequence quality checks * 
[ASCIIGenome](#nanopore-asciigenome) - Individual variant screenshots with annotation tracks
+  * [Variants long table](#nanopore-variants-long-table) - Collate per-sample information for individual variants, functional effect prediction and lineage analysis
* [Workflow reporting](#nanopore-workflow-reporting)
  * [MultiQC](#nanopore-multiqc) - Present QC, visualisation and custom reporting for sequencing, raw reads, alignment and variant calling results
@@ -88,6 +89,8 @@ The [artic guppyplex](https://artic.readthedocs.io/en/latest/commands/) tool fro

* `<VARIANT_CALLER>/`
  * `*.consensus.fasta`: Consensus fasta file generated by artic minion.
+  * `*.pass.unique.vcf.gz`: VCF file containing unique variants passing quality filters.
+  * `*.pass.unique.vcf.gz.tbi`: VCF index file containing unique variants passing quality filters.
  * `*.pass.vcf.gz`: VCF file containing variants passing quality filters.
  * `*.pass.vcf.gz.tbi`: VCF index file containing variants passing quality filters.
  * `*.primers.vcf`: VCF file containing variants found in primer-binding regions.
@@ -255,6 +258,30 @@ As described in the documentation, [ASCIIGenome](https://asciigenome.readthedocs

[Image: ASCIIGenome screenshot]

+### Nanopore: Variants long table + +
+Output files
+
+* `<VARIANT_CALLER>/`
+  * `variants_long_table.csv`: Long format table collating per-sample information for individual variants, functional effect prediction and lineage analysis.
+
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--artic_minion_caller` parameter (Default: 'nanopolish').
+
+
+
+This step creates a long format table collating per-sample information for individual variants ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html)), functional effect prediction ([`SnpSift`](http://snpeff.sourceforge.net/SnpSift.html)) and lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)).
+
+The more pertinent variant information is summarised in this table to make it easier for researchers to assess the impact of variants found amongst the sequenced sample(s). An example of the fields included in the table is shown below:
+
+```bash
+SAMPLE,CHROM,POS,REF,ALT,FILTER,DP,REF_DP,ALT_DP,AF,GENE,EFFECT,HGVS_C,HGVS_P,HGVS_P_1LETTER,CALLER,LINEAGE
+SAMPLE1_PE,MN908947.3,241,C,T,PASS,489,4,483,0.99,orf1ab,upstream_gene_variant,c.-25C>T,.,.,ivar,B.1
+SAMPLE1_PE,MN908947.3,1875,C,T,PASS,92,62,29,0.32,orf1ab,missense_variant,c.1610C>T,p.Ala537Val,p.A537V,ivar,B.1
+SAMPLE1_PE,MN908947.3,3037,C,T,PASS,213,0,213,1.0,orf1ab,synonymous_variant,c.2772C>T,p.Phe924Phe,p.F924F,ivar,B.1
+SAMPLE1_PE,MN908947.3,11719,G,A,PASS,195,9,186,0.95,orf1ab,synonymous_variant,c.11454G>A,p.Gln3818Gln,p.Q3818Q,ivar,B.1
+```
+

## Nanopore: Workflow reporting

### Nanopore: MultiQC

@@ -293,13 +320,14 @@ An example MultiQC report generated from a full-sized dataset can be viewed on t
  * [picard MarkDuplicates](#picard-markduplicates) - Duplicate read marking and removal
  * [picard CollectMultipleMetrics](#picard-collectmultiplemetrics) - Alignment metrics
  * [mosdepth](#mosdepth) - Whole-genome and amplicon coverage metrics
-  * [iVar variants and iVar consensus](#ivar-variants-and-ivar-consensus) *||* [BCFTools and BEDTools](#bcftools-and-bedtools) - Variant calling and consensus sequence generation
+  * [iVar variants](#ivar-variants) *||* [BCFTools call](#bcftools-call) - Variant calling
  * [SnpEff and SnpSift](#snpeff-and-snpsift) - Genetic variant annotation and functional effect prediction
+  * [ASCIIGenome](#asciigenome) - Individual variant screenshots with annotation tracks
+  * [iVar consensus](#ivar-consensus) *||* [BCFTools and BEDTools](#bcftools-and-bedtools) - Consensus sequence generation
  * [QUAST](#quast) - Consensus assessment report
  * [Pangolin](#pangolin) - Lineage analysis
  * [Nextclade](#nextclade) - Clade assignment, mutation calling and sequence quality checks
-  * [ASCIIGenome](#asciigenome) - Individual variant screenshots with annotation tracks
-  * [BCFTools isec](#bcftools-isec) - Intersect variants across all callers
+  * [Variants long table](#variants-long-table) - Collate per-sample information for individual variants, functional effect prediction and lineage analysis
* [De novo assembly](#illumina-de-novo-assembly)
  * [Cutadapt](#cutadapt) - Primer trimming for amplicon data
  * [SPAdes](#spades) *||* [Unicycler](#unicycler) *||* [minia](#minia) - Viral genome assembly
@@ -502,24 +530,15 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method-

[Image: R - Sample per-amplicon coverage plot]

-### iVar variants and iVar consensus +### iVar variants
Output files

* `variants/ivar/`
  * `*.tsv`: Original iVar variants in TSV format.
-  * `*.vcf.gz`: iVar variants in VCF format.
-  * `*.vcf.gz.tbi`: iVar variants in VCF index file.
-* `variants/ivar/consensus/`
-  * `*.consensus.fa`: Consensus Fasta file generated by iVar.
-  * `*.consensus.qual.txt`: File with the average quality of each base in the consensus sequence.
-* `variants/ivar/consensus/base_qc/`
-  * `*.ACTG_density.pdf`: Plot showing density of ACGT bases within the consensus sequence.
-  * `*.base_counts.pdf`: Plot showing frequency and percentages of all bases in consensus sequence.
-  * `*.base_counts.tsv`: File containing frequency and percentages of all bases in consensus sequence.
-  * `*.N_density.pdf`: Plot showing density of N bases within the consensus sequence.
-  * `*.N_run.tsv`: File containing start positions and width of N bases in consensus sequence.
+  * `*.vcf.gz`: iVar variants in VCF format. Converted using the custom `ivar_variants_to_vcf.py` Python script.
+  * `*.vcf.gz.tbi`: iVar variants VCF index file.
* `variants/ivar/log/`
  * `*.variant_counts.log`: Counts for type of variants called by iVar.
* `variants/ivar/bcftools_stats/`
@@ -527,11 +546,13 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method-
-[iVar](https://github.com/andersen-lab/ivar/blob/master/docs/MANUAL.md) is a computational package that contains functions broadly useful for viral amplicon-based sequencing. We use iVar in this pipeline to [trim primer sequences](#ivar-trim) for amplicon input data as well as to call variants and for consensus sequence generation.
+[iVar](https://github.com/andersen-lab/ivar/blob/master/docs/MANUAL.md) is a computational package that contains functions broadly useful for viral amplicon-based sequencing. We use iVar in this pipeline to [trim primer sequences](#ivar-trim) for amplicon input data as well as to call variants.
+
+iVar outputs a TSV format which is not compatible with downstream analysis such as annotation using SnpEff. Moreover, some issues need to be addressed such as [strand-bias filtering](https://github.com/andersen-lab/ivar/issues/5) and [the consecutive reporting of variants belonging to the same codon](https://github.com/andersen-lab/ivar/issues/92). This pipeline uses the custom Python script [ivar_variants_to_vcf.py](https://github.com/nf-core/viralrecon/blob/master/bin/ivar_variants_to_vcf.py) to convert the default iVar output to VCF whilst also addressing both of these issues. A sketch of how the iVar calling arguments can be overridden via a custom config is shown after the plot below.

![MultiQC - iVar variants called plot](images/mqc_ivar_variants_plot.png)
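
+The calling thresholds that iVar uses are passed to it on the command line, so they can be overridden from a custom config supplied with `-c custom.config`. The sketch below is a hypothetical example: the process name `IVAR_VARIANTS` and the argument values shown are assumptions that should be checked against the pipeline's modules config before use.
+
+```nextflow
+// custom.config -- hypothetical override of the iVar calling arguments:
+// -t minimum allele frequency, -q minimum base quality, -m minimum depth.
+process {
+    withName: 'IVAR_VARIANTS' {
+        ext.args = '-t 0.25 -q 20 -m 10'
+    }
+}
+```
+
-### BCFTools and BEDTools
+### BCFTools call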
Output files @@ -539,24 +560,12 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method- * `variants/bcftools/` * `*.vcf.gz`: Variants VCF file. * `*.vcf.gz.tbi`: Variants VCF index file. -* `variants/bcftools/consensus/` - * `*.consensus.fa`: Consensus Fasta file generated by integrating the variants called by BCFTools into the reference genome. -* `variants/bcftools/consensus/base_qc/` - * `*.ACTG_density.pdf`: Plot showing density of ACGT bases within the consensus sequence. - * `*.base_counts.pdf`: Plot showing frequency and percentages of all bases in consensus sequence. - * `*.base_counts.tsv`: File containing frequency and percentages of all bases in consensus sequence. - * `*.N_density.pdf`: Plot showing density of N bases within the consensus sequence. - * `*.N_run.tsv`: File containing start positions and width of N bases in consensus sequence. * `variants/bcftools/bcftools_stats/` * `*.bcftools_stats.txt`: Statistics and counts obtained from VCF file.
-[BCFtools](http://samtools.github.io/bcftools/bcftools.html) can be used to call variants directly from BAM alignment files. The functionality to call variants with BCFTools in this pipeline was inspired by work carried out by [Conor Walker](https://github.com/conorwalker/covid19/blob/3cb26ec399417bedb7e60487415c78a405f517d6/scripts/call_variants.sh).
-
-[BCFtools](http://samtools.github.io/bcftools/bcftools.html) is a set of utilities that manipulate variant calls in [VCF](https://vcftools.github.io/specs.html) and its binary counterpart BCF format. BCFTools is used in the variant calling and *de novo* assembly steps of this pipeline to obtain basic statistics from the VCF output. It can also used be used to generate a consensus sequence by integrating variant calls into the reference genome.
-
-[BEDTools](https://bedtools.readthedocs.io/en/latest/) is a swiss-army knife of tools for a wide-range of genomics analysis tasks. In this pipeline we use `bedtools genomecov` to compute the per-base mapped read coverage in bedGraph format, and `bedtools maskfasta` to mask sequences in a Fasta file based on intervals defined in a feature file. This may be useful for creating your own masked genome file based on custom annotations or for masking all but your target regions when aligning sequence data from a targeted capture experiment.
+[BCFtools](http://samtools.github.io/bcftools/bcftools.html) can be used to call variants directly from BAM alignment files. It is a set of utilities that manipulate variant calls in [VCF](https://vcftools.github.io/specs.html) and its binary counterpart BCF format. BCFTools is used in the variant calling and *de novo* assembly steps of this pipeline to obtain basic statistics from the VCF output.

![MultiQC - BCFTools variant counts](images/mqc_bcftools_stats_plot.png)

@@ -575,7 +584,7 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method-
* `variants/<VARIANT_CALLER>/snpeff/bcftools_stats/`
  * `*.bcftools_stats.txt`: Statistics and counts obtained from VCF file.

-**NB:** The value of `<CALLER>` in the output directory name above is determined by the `--callers` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').

@@ -585,15 +594,75 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method-

![MultiQC - SnpEff annotation counts](images/mqc_snpeff_plot.png)

+### ASCIIGenome
+
+
+Output files
+
+* `variants/<VARIANT_CALLER>/asciigenome/<SAMPLE>/`
+  * `*.pdf`: Individual variant screenshots with annotation tracks in PDF format.
+
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+
+
+
+As described in the documentation, [ASCIIGenome](https://asciigenome.readthedocs.io/en/latest/) is a command-line genome browser that can be run from a terminal window and is solely based on ASCII characters. The closest program to ASCIIGenome is probably [samtools tview](http://www.htslib.org/doc/samtools-tview.html) but ASCIIGenome offers much more flexibility, similar to popular GUI viewers like the [IGV](https://software.broadinstitute.org/software/igv/) browser. We are using the batch processing mode of ASCIIGenome in this pipeline to generate individual screenshots for all of the variant sites reported for each sample in the VCF files. This is incredibly useful to be able to quickly QC the variants called by the pipeline without having to tediously load all of the relevant tracks into a conventional genome browser. Where possible, the BAM read alignments, VCF variant file, primer BED file and GFF annotation track will be represented in the screenshot for contextual purposes. The screenshot below shows a SNP called relative to the MN908947.3 SARS-CoV-2 reference genome that overlaps the ORF7a protein and the nCoV-2019_91_LEFT primer from the ARTIC v3 protocol.
+
+

[Image: ASCIIGenome screenshot]

+ +### iVar consensus + +
+Output files
+
+* `variants/<VARIANT_CALLER>/consensus/ivar/`
+  * `*.consensus.fa`: Consensus Fasta file generated by iVar.
+  * `*.consensus.qual.txt`: File with the average quality of each base in the consensus sequence.
+* `variants/<VARIANT_CALLER>/consensus/ivar/base_qc/`
+  * `*.ACTG_density.pdf`: Plot showing density of ACGT bases within the consensus sequence.
+  * `*.base_counts.pdf`: Plot showing frequency and percentages of all bases in consensus sequence.
+  * `*.base_counts.tsv`: File containing frequency and percentages of all bases in consensus sequence.
+  * `*.N_density.pdf`: Plot showing density of N bases within the consensus sequence.
+  * `*.N_run.tsv`: File containing start positions and width of N bases in consensus sequence.
+
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+
+
+
+As described in the [iVar variants](#ivar-variants) section, iVar can be used in this pipeline both to call variants and to generate the consensus sequence.
+
+### BCFTools and BEDTools
+
+
+Output files
+
+* `variants/<VARIANT_CALLER>/consensus/bcftools/`
+  * `*.consensus.fa`: Consensus fasta file generated by integrating the high allele-frequency variants called by iVar/BCFTools into the reference genome.
+  * `*.filtered.vcf.gz`: VCF file containing high allele-frequency variants (default: `>= 0.75`) that were integrated into the consensus sequence.
+  * `*.filtered.vcf.gz.tbi`: Variants VCF index file for high allele-frequency variants.
+* `variants/<VARIANT_CALLER>/consensus/bcftools/base_qc/`
+  * `*.ACTG_density.pdf`: Plot showing density of ACGT bases within the consensus sequence.
+  * `*.base_counts.pdf`: Plot showing frequency and percentages of all bases in consensus sequence.
+  * `*.base_counts.tsv`: File containing frequency and percentages of all bases in consensus sequence.
+  * `*.N_density.pdf`: Plot showing density of N bases within the consensus sequence.
+  * `*.N_run.tsv`: File containing start positions and width of N bases in consensus sequence.
+
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+
+
+
+[BCFTools](http://samtools.github.io/bcftools/bcftools.html) is used in the variant calling and *de novo* assembly steps of this pipeline to obtain basic statistics from the VCF output. It can also be used to generate a consensus sequence by integrating variant calls into the reference genome. In this pipeline, we use `samtools mpileup` to create a mask from low coverage positions, and `bedtools maskfasta` to mask the genome sequence based on these intervals. Finally, `bcftools consensus` is used to generate the consensus by projecting the high allele frequency variants onto the masked genome reference sequence.
+
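+The consensus route is selected with the `--consensus_caller` parameter ('ivar' or 'bcftools', as documented in the **NB** notes in this section). A minimal sketch of pinning the choice in a custom config rather than on the command line:
+
+```nextflow
+// custom.config -- equivalent to passing --consensus_caller ivar;
+// switches consensus generation from the BCFTools/BEDTools route to iVar.
+params {
+    consensus_caller = 'ivar'
+}
+```
+

### QUAST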
Output files

-* `variants/<CALLER>/quast/`
+* `variants/<VARIANT_CALLER>/consensus/<CONSENSUS_CALLER>/quast/`
  * `report.html`: Results report in HTML format. Also available in various other file formats i.e. `report.pdf`, `report.tex`, `report.tsv` and `report.txt`.

-**NB:** The value of `<CALLER>` in the output directory name above is determined by the `--callers` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<CONSENSUS_CALLER>` in the output directory name above is determined by the `--consensus_caller` parameter (Default: 'bcftools' for both '--protocol amplicon' and '--protocol metagenomic').
@@ -604,10 +673,11 @@ Unless you are using [UMIs](https://emea.illumina.com/science/sequencing-method-
Output files

-* `variants/<CALLER>/pangolin/`
+* `variants/<VARIANT_CALLER>/consensus/<CONSENSUS_CALLER>/pangolin/`
  * `*.pangolin.csv`: Lineage analysis results from Pangolin.

-**NB:** The value of `<CALLER>` in the output directory name above is determined by the `--callers` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<CONSENSUS_CALLER>` in the output directory name above is determined by the `--consensus_caller` parameter (Default: 'bcftools' for both '--protocol amplicon' and '--protocol metagenomic').
@@ -618,47 +688,39 @@ Phylogenetic Assignment of Named Global Outbreak LINeages ([Pangolin](https://gi
Output files

-* `variants/<CALLER>/nextclade/`
+* `variants/<VARIANT_CALLER>/consensus/<CONSENSUS_CALLER>/nextclade/`
  * `*.csv`: Analysis results from Nextclade containing genome clade assignment, mutation calling and sequence quality checks.

-**NB:** The value of `<CALLER>` in the output directory name above is determined by the `--callers` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<CONSENSUS_CALLER>` in the output directory name above is determined by the `--consensus_caller` parameter (Default: 'bcftools' for both '--protocol amplicon' and '--protocol metagenomic').
[Nextclade](https://github.com/nextstrain/nextclade) performs viral genome clade assignment, mutation calling and sequence quality checks for the consensus sequences generated in this pipeline. Similar to Pangolin, it has been used extensively during the COVID-19 pandemic. A [web application](https://clades.nextstrain.org/) also exists that allows users to upload genome sequences via a web browser.
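
As described in the usage docs, the Nextclade dataset used for clade assignment can be pinned for reproducibility. Below is a minimal sketch of a custom config built on the `--nextclade_dataset` / `--nextclade_dataset_tag` parameters; the tag value is illustrative only, so pick a real one from the Nextclade dataset releases page.

```nextflow
// custom.config -- fetch and use a specific, tagged Nextclade dataset
// instead of the pre-packaged default (tag shown here is illustrative).
params {
    nextclade_dataset     = false
    nextclade_dataset_tag = '2022-01-18T12:00:00Z'
}
```

-### ASCIIGenome
+### Variants long table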
Output files

-* `variants/<CALLER>/asciigenome/<SAMPLE>/`
-  * `*.pdf`: Individual variant screenshots with annotation tracks in PDF format.
+* `variants/<VARIANT_CALLER>/`
+  * `variants_long_table.csv`: Long format table collating per-sample information for individual variants, functional effect prediction and lineage analysis.

-**NB:** The value of `<CALLER>` in the output directory name above is determined by the `--callers` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
+**NB:** The value of `<VARIANT_CALLER>` in the output directory name above is determined by the `--variant_caller` parameter (Default: 'ivar' for '--protocol amplicon' and 'bcftools' for '--protocol metagenomic').
-As described in the documentation, [ASCIIGenome](https://asciigenome.readthedocs.io/en/latest/) is a command-line genome browser that can be run from a terminal window and is solely based on ASCII characters. The closest program to ASCIIGenome is probably [samtools tview](http://www.htslib.org/doc/samtools-tview.html) but ASCIIGenome offers much more flexibility, similar to popular GUI viewers like the [IGV](https://software.broadinstitute.org/software/igv/) browser. We are using the batch processing mode of ASCIIGenome in this pipeline to generate individual screenshots for all of the variant sites reported for each sample in the VCF files. This is incredibly useful to be able to quickly QC the variants called by the pipeline without having to tediously load all of the relevant tracks into a conventional genome browser. Where possible, the BAM read alignments, VCF variant file, primer BED file and GFF annotation track will be represented in the screenshot for contextual purposes. The screenshot below shows a SNP called relative to the MN908947.3 SARS-CoV-2 reference genome that overlaps the ORF7a protein and the nCoV-2019_91_LEFT primer from the ARIC v3 protocol. - -

[Image: ASCIIGenome screenshot]

- -### BCFTools isec - -
-Output files
-
-* `variants/intersect/<SAMPLE>/`
-  * `*.vcf.gz`: VCF file containing variants common to both variant callers. There will be one file for each caller - see `README.txt` for details.
-  * `*.vcf.gz.tbi`: Index for VCF file.
-  * `README.txt`: File containing command used and file name mappings.
-  * `sites.txt`: List of variants common to both variant callers in textual format. The last column indicates presence (1) or absence (0) amongst the 2 different callers.
-
-**NB:** This process will only be executed when both variant callers are specified to be run i.e. `--callers ivar,bcftools`.
+This step creates a long format table collating per-sample information for individual variants ([`BCFTools`](http://samtools.github.io/bcftools/bcftools.html)), functional effect prediction ([`SnpSift`](http://snpeff.sourceforge.net/SnpSift.html)) and lineage analysis ([`Pangolin`](https://github.com/cov-lineages/pangolin)).
-
+The more pertinent variant information is summarised in this table to make it easier for researchers to assess the impact of variants found amongst the sequenced sample(s). An example of the fields included in the table are shown below: -[BCFTools isec](http://samtools.github.io/bcftools/bcftools.html#isec) can be used to intersect the variant calls generated by the 2 different callers used in the pipeline. This permits a quick assessment of how consistently a particular variant is being called using different algorithms and to prioritise the investigation of the variants. +```bash +SAMPLE,CHROM,POS,REF,ALT,FILTER,DP,REF_DP,ALT_DP,AF,GENE,EFFECT,HGVS_C,HGVS_P,HGVS_P_1LETTER,CALLER,LINEAGE +SAMPLE1_PE,MN908947.3,241,C,T,PASS,489,4,483,0.99,orf1ab,upstream_gene_variant,c.-25C>T,.,.,ivar,B.1 +SAMPLE1_PE,MN908947.3,1875,C,T,PASS,92,62,29,0.32,orf1ab,missense_variant,c.1610C>T,p.Ala537Val,p.A537V,ivar,B.1 +SAMPLE1_PE,MN908947.3,3037,C,T,PASS,213,0,213,1.0,orf1ab,synonymous_variant,c.2772C>T,p.Phe924Phe,p.F924F,ivar,B.1 +SAMPLE1_PE,MN908947.3,11719,G,A,PASS,195,9,186,0.95,orf1ab,synonymous_variant,c.11454G>A,p.Gln3818Gln,p.Q3818Q,ivar,B.1 +``` ## Illumina: De novo assembly @@ -687,9 +749,9 @@ In the variant calling branch of the pipeline we are using [iVar trim](#ivar-tri Output files * `assembly/spades//` - * `*.scaffolds.fa`: SPAdes scaffold assembly. - * `*.contigs.fa`: SPAdes assembly contigs. - * `*.assembly.gfa`: SPAdes assembly graph in [GFA](https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md) format. + * `*.scaffolds.fa.gz`: SPAdes scaffold assembly. + * `*.contigs.fa.gz`: SPAdes assembly contigs. + * `*.assembly.gfa.gz`: SPAdes assembly graph in [GFA](https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md) format. * `assembly/spades//bandage/` * `*.png`: Bandage visualisation for SPAdes assembly graph in PNG format. * `*.svg`: Bandage visualisation for SPAdes assembly graph in SVG format. @@ -708,8 +770,8 @@ In the variant calling branch of the pipeline we are using [iVar trim](#ivar-tri Output files * `assembly/unicycler/` - * `*.scaffolds.fa`: Unicycler scaffold assembly. - * `*.assembly.gfa`: Unicycler assembly graph in GFA format. + * `*.scaffolds.fa.gz`: Unicycler scaffold assembly. + * `*.assembly.gfa.gz`: Unicycler assembly graph in GFA format. * `assembly/unicycler/bandage/` * `*.png`: Bandage visualisation for Unicycler assembly graph in PNG format. * `*.svg`: Bandage visualisation for Unicycler assembly graph in SVG format. @@ -833,15 +895,13 @@ An example MultiQC report generated from a full-sized dataset can be viewed on t Output files * `genome/` - * Unzipped genome fasta file for viral genome - * Unzipped genome annotation GFF file for viral genome -* `genome/index/` * `bowtie2/`: Bowtie 2 index for viral genome. -* `genome/db/` * `blast_db/`: BLAST database for viral genome. * `kraken2_db/`: Kraken 2 database for host genome. * `snpeff_db/`: SnpEff database for viral genome. * `snpeff.config`: SnpEff config file for viral genome. + * Unzipped genome fasta file for viral genome + * Unzipped genome annotation GFF file for viral genome @@ -854,7 +914,7 @@ A number of genome-specific files are generated by the pipeline because they are * `pipeline_info/` * Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`. - * Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.tsv`. 
+  * Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameters are used when running the pipeline.
  * Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
diff --git a/docs/usage.md b/docs/usage.md
index d3b7a889..f131a364 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -64,7 +64,7 @@ For Nanopore data the pipeline only supports amplicon-based analysis obtained fr

### Nanopolish

-The default variant caller used by artic minion is [Nanopolish](https://github.com/jts/nanopolish) and this requires that you provide `*.fastq`, `*.fast5` and `sequencing_summary.txt` files as input to the pipeline. These files can typically be obtained after demultiplexing and basecalling the sequencing data using [Guppy](https://nanoporetech.com/nanopore-sequencing-data-analysis) (see [ARTIC SOP docs](https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html)). This pipeline requires that the files are organised in the format outlined below:
+The default variant caller used by artic minion is [Nanopolish](https://github.com/jts/nanopolish) and this requires that you provide `*.fastq`, `*.fast5` and `sequencing_summary.txt` files as input to the pipeline. These files can typically be obtained after demultiplexing and basecalling the sequencing data using [Guppy](https://nanoporetech.com/nanopore-sequencing-data-analysis) (see [ARTIC SOP docs](https://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html)). This pipeline requires that the files are organised in the format outlined below; gzip-compressed files are also accepted:

```console
.

@@ -297,56 +297,62 @@ process {

> **NB:** We specify just the process name i.e. `STAR_ALIGN` in the config file and not the full task name string that is printed to screen in the error message or on the terminal whilst the pipeline is running i.e. `RNASEQ:ALIGN_STAR:STAR_ALIGN`. You may get a warning suggesting that the process selector isn't recognised but you can ignore that if the process name has been specified correctly. This is something that needs to be fixed upstream in core Nextflow.

-### Tool-specific options
-
-For the ultimate flexibility, we have implemented and are using Nextflow DSL2 modules in a way where it is possible for both developers and users to change tool-specific command-line arguments (e.g. providing an additional command-line argument to the `STAR_ALIGN` process) as well as publishing options (e.g. saving files produced by the `STAR_ALIGN` process that aren't saved by default by the pipeline). In the majority of instances, as a user you won't have to change the default options set by the pipeline developer(s), however, there may be edge cases where creating a simple custom config file can improve the behaviour of the pipeline if for example it is failing due to a weird error that requires setting a tool-specific parameter to deal with smaller / larger genomes.
+### Updating containers

-The command-line arguments passed to STAR in the `STAR_ALIGN` module are a combination of:
+The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies.
If for some reason you need to use a different version of a particular tool with the pipeline then you just need to identify the `process` name and override the Nextflow `container` definition for that process using the `withName` declaration.

-* Mandatory arguments or those that need to be evaluated within the scope of the module, as supplied in the [`script`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/modules/nf-core/software/star/align/main.nf#L49-L55) section of the module file.
+#### Pangolin

-* An [`options.args`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/modules/nf-core/software/star/align/main.nf#L56) string of non-mandatory parameters that is set to be empty by default in the module but can be overwritten when including the module in the sub-workflow / workflow context via the `addParams` Nextflow option.
+For example, in the [nf-core/viralrecon](https://nf-co.re/viralrecon) pipeline a tool called [Pangolin](https://github.com/cov-lineages/pangolin) has been used during the COVID-19 pandemic to assign lineages to sequenced SARS-CoV-2 samples. Given that the lineage assignments change quite frequently it doesn't make sense to re-release the nf-core/viralrecon pipeline every time a new version of Pangolin has been released. However, you can override the default container used by the pipeline by creating a custom config file and passing it as a command-line argument via `-c custom.config`.

-The nf-core/rnaseq pipeline has a sub-workflow (see [terminology](https://github.com/nf-core/modules#terminology)) specifically to align reads with STAR and to sort, index and generate some basic stats on the resulting BAM files using SAMtools. At the top of this file we import the `STAR_ALIGN` module via the Nextflow [`include`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/subworkflows/nf-core/align_star.nf#L10) keyword and by default the options passed to the module via the `addParams` option are set as an empty Groovy map [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/subworkflows/nf-core/align_star.nf#L5); this in turn means `options.args` will be set to empty by default in the module file too. This is an intentional design choice and allows us to implement well-written sub-workflows composed of a chain of tools that by default run with the bare minimum parameter set for any given tool in order to make it much easier to share across pipelines and to provide the flexibility for users and developers to customise any non-mandatory arguments.
+1. Check the default version used by the pipeline in the module file for [Pangolin](https://github.com/nf-core/viralrecon/blob/a85d5969f9025409e3618d6c280ef15ce417df65/modules/nf-core/software/pangolin/main.nf#L14-L19)
+2. Find the latest version of the Biocontainer available on [Quay.io](https://quay.io/repository/biocontainers/pangolin?tag=latest&tab=tags)
+3. Create the custom config accordingly:

-When including the sub-workflow above in the main pipeline workflow we use the same `include` statement, however, we now have the ability to overwrite options for each of the tools in the sub-workflow including the [`align_options`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/workflows/rnaseq.nf#L225) variable that will be used specifically to overwrite the optional arguments passed to the `STAR_ALIGN` module.
In this case, the options to be provided to `STAR_ALIGN` have been assigned sensible defaults by the developer(s) in the pipeline's [`modules.config`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L70-L74) and can be accessed and customised in the [workflow context](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/workflows/rnaseq.nf#L201-L204) too before eventually passing them to the sub-workflow as a Groovy map called `star_align_options`. These options will then be propagated from `workflow -> sub-workflow -> module`. + * For Docker: -As mentioned at the beginning of this section it may also be necessary for users to overwrite the options passed to modules to be able to customise specific aspects of the way in which a particular tool is executed by the pipeline. Given that all of the default module options are stored in the pipeline's `modules.config` as a [`params` variable](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L24-L25) it is also possible to overwrite any of these options via a custom config file. + ```nextflow + process { + withName: PANGOLIN { + container = 'quay.io/biocontainers/pangolin:3.1.17--pyhdfd78af_1' + } + } + ``` -Say for example we want to append an additional, non-mandatory parameter (i.e. `--outFilterMismatchNmax 16`) to the arguments passed to the `STAR_ALIGN` module. Firstly, we need to copy across the default `args` specified in the [`modules.config`](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/modules.config#L71) and create a custom config file that is a composite of the default `args` as well as the additional options you would like to provide. This is very important because Nextflow will overwrite the default value of `args` that you provide via the custom config. + * For Singularity: -As you will see in the example below, we have: + ```nextflow + process { + withName: PANGOLIN { + container = 'https://depot.galaxyproject.org/singularity/pangolin:3.1.17--pyhdfd78af_1' + } + } + ``` -* appended `--outFilterMismatchNmax 16` to the default `args` used by the module. -* changed the default `publish_dir` value to where the files will eventually be published in the main results directory. -* appended `'bam':''` to the default value of `publish_files` so that the BAM files generated by the process will also be saved in the top-level results directory for the module. Note: `'out':'log'` means any file/directory ending in `out` will now be saved in a separate directory called `my_star_directory/log/`. + * For Conda: -```nextflow -params { - modules { - 'star_align' { - args = "--quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand zcat --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend --outFilterMismatchNmax 16" - publish_dir = "my_star_directory" - publish_files = ['out':'log', 'tab':'log', 'bam':''] + ```nextflow + process { + withName: PANGOLIN { + conda = 'bioconda::pangolin=3.1.17' + } } - } -} -``` + ``` -### Updating containers +#### Nextclade -The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. 
If for some reason you need to use a different version of a particular tool with the pipeline then you just need to identify the `process` name and override the Nextflow `container` definition for that process using the `withName` declaration. For example, in the [nf-core/viralrecon](https://nf-co.re/viralrecon) pipeline a tool called [Pangolin](https://github.com/cov-lineages/pangolin) has been used during the COVID-19 pandemic to assign lineages to SARS-CoV-2 genome sequenced samples. Given that the lineage assignments change quite frequently it doesn't make sense to re-release the nf-core/viralrecon everytime a new version of Pangolin has been released. However, you can override the default container used by the pipeline by creating a custom config file and passing it as a command-line argument via `-c custom.config`. +You can use a similar approach to update the version of Nextclade used by the pipeline: -1. Check the default version used by the pipeline in the module file for [Pangolin](https://github.com/nf-core/viralrecon/blob/a85d5969f9025409e3618d6c280ef15ce417df65/modules/nf-core/software/pangolin/main.nf#L14-L19) -2. Find the latest version of the Biocontainer available on [Quay.io](https://quay.io/repository/biocontainers/pangolin?tag=latest&tab=tags) +1. Check the default version used by the pipeline in the module file for [Nextclade](https://github.com/nf-core/viralrecon/blob/e582db9c70721aae530703ec9a2ab8b219c96a99/modules/nf-core/modules/nextclade/run/main.nf#L5-L8) +2. Find the latest version of the Biocontainer available on [Quay.io](https://quay.io/repository/biocontainers/nextclade?tag=latest&tab=tags) 3. Create the custom config accordingly: * For Docker: ```nextflow process { - withName: PANGOLIN { - container = 'quay.io/biocontainers/pangolin:3.0.5--pyhdfd78af_0' + withName: 'NEXTCLADE_DATASETGET|NEXTCLADE_RUN' { + container = 'quay.io/biocontainers/nextclade:1.10.1--h9ee0642_0' } } ``` @@ -355,8 +361,8 @@ The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementatio ```nextflow process { - withName: PANGOLIN { - container = 'https://depot.galaxyproject.org/singularity/pangolin:3.0.5--pyhdfd78af_0' + withName: 'NEXTCLADE_DATASETGET|NEXTCLADE_RUN' { + container = 'https://depot.galaxyproject.org/singularity/nextclade:1.10.1--h9ee0642_0' } } ``` @@ -365,12 +371,26 @@ The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementatio ```nextflow process { - withName: PANGOLIN { - conda = 'bioconda::pangolin=3.0.5' + withName: 'NEXTCLADE_DATASETGET|NEXTCLADE_RUN' { + conda = 'bioconda::nextclade=1.10.1' } } ``` +##### Nextclade datasets + +A [`nextclade dataset`](https://docs.nextstrain.org/projects/nextclade/en/latest/user/datasets.html#nextclade-datasets) feature was introduced in [Nextclade CLI v1.3.0](https://github.com/nextstrain/nextclade/releases/tag/1.3.0) that fetches input genome files such as reference sequences and trees from a central dataset repository. We have uploaded Nextclade dataset [v2022-01-18](https://github.com/nextstrain/nextclade_data/releases/tag/2022-01-24--21-27-29--UTC) to [nf-core/test-datasets](https://github.com/nf-core/test-datasets/blob/viralrecon/genome/MN908947.3/nextclade_sars-cov-2_MN908947_2022-01-18T12_00_00Z.tar.gz?raw=true), and for reproducibility, this will be used by default if you specify `--genome 'MN908947.3'` when running the pipeline. 
However, there are a number of ways you can use a more recent version of the dataset: + +* Supply your own by setting: `--nextclade_dataset ` +* Let the pipeline create and use the latest version by setting: `--nextclade_dataset false --nextclade_dataset_tag false` +* Let the pipeline create and use a specific, tagged version by setting: `--nextclade_dataset false --nextclade_dataset_tag ` + +The Nextclade dataset releases can be found on their [Github page](https://github.com/nextstrain/nextclade_data/releases). Use the tag specified for each release e.g `2022-01-18T12:00:00Z` in the example below: + +![Nextclade tag example](images/nextclade_tag_example.png) + +If the `--save_reference` parameter is provided then the Nextclade dataset generated by the pipeline will also be saved in the `results/genome/` directory. + > **NB:** If you wish to periodically update individual tool-specific results (e.g. Pangolin) generated by the pipeline then you must ensure to keep the `work/` directory otherwise the `-resume` ability of the pipeline will be compromised and it will restart from scratch. ### nf-core/configs diff --git a/lib/NfcoreSchema.groovy b/lib/NfcoreSchema.groovy index 8d6920dd..40ab65f2 100755 --- a/lib/NfcoreSchema.groovy +++ b/lib/NfcoreSchema.groovy @@ -105,9 +105,13 @@ class NfcoreSchema { // Collect expected parameters from the schema def expectedParams = [] + def enums = [:] for (group in schemaParams) { for (p in group.value['properties']) { expectedParams.push(p.key) + if (group.value['properties'][p.key].containsKey('enum')) { + enums[p.key] = group.value['properties'][p.key]['enum'] + } } } @@ -155,7 +159,7 @@ class NfcoreSchema { println '' log.error 'ERROR: Validation of pipeline parameters failed!' JSONObject exceptionJSON = e.toJSON() - printExceptions(exceptionJSON, params_json, log) + printExceptions(exceptionJSON, params_json, log, enums) println '' has_error = true } @@ -202,7 +206,7 @@ class NfcoreSchema { } def type = '[' + group_params.get(param).type + ']' def description = group_params.get(param).description - def defaultValue = group_params.get(param).default ? " [default: " + group_params.get(param).default.toString() + "]" : '' + def defaultValue = group_params.get(param).default != null ? 
" [default: " + group_params.get(param).default.toString() + "]" : '' def description_default = description + colors.dim + defaultValue + colors.reset // Wrap long description texts // Loosely based on https://dzone.com/articles/groovy-plain-text-word-wrap @@ -260,13 +264,12 @@ class NfcoreSchema { // Get pipeline parameters defined in JSON Schema def Map params_summary = [:] - def blacklist = ['hostnames'] def params_map = paramsLoad(getSchemaPath(workflow, schema_filename=schema_filename)) for (group in params_map.keySet()) { def sub_params = new LinkedHashMap() def group_params = params_map.get(group) // This gets the parameters of that particular group for (param in group_params.keySet()) { - if (params.containsKey(param) && !blacklist.contains(param)) { + if (params.containsKey(param)) { def params_value = params.get(param) def schema_value = group_params.get(param).default def param_type = group_params.get(param).type @@ -330,7 +333,7 @@ class NfcoreSchema { // // Loop over nested exceptions and print the causingException // - private static void printExceptions(ex_json, params_json, log) { + private static void printExceptions(ex_json, params_json, log, enums, limit=5) { def causingExceptions = ex_json['causingExceptions'] if (causingExceptions.length() == 0) { def m = ex_json['message'] =~ /required key \[([^\]]+)\] not found/ @@ -346,11 +349,20 @@ class NfcoreSchema { else { def param = ex_json['pointerToViolation'] - ~/^#\// def param_val = params_json[param].toString() - log.error "* --${param}: ${ex_json['message']} (${param_val})" + if (enums.containsKey(param)) { + def error_msg = "* --${param}: '${param_val}' is not a valid choice (Available choices" + if (enums[param].size() > limit) { + log.error "${error_msg} (${limit} of ${enums[param].size()}): ${enums[param][0..limit-1].join(', ')}, ... )" + } else { + log.error "${error_msg}: ${enums[param].join(', ')})" + } + } else { + log.error "* --${param}: ${ex_json['message']} (${param_val})" + } } } for (ex in causingExceptions) { - printExceptions(ex, params_json, log) + printExceptions(ex, params_json, log, enums) } } diff --git a/lib/NfcoreTemplate.groovy b/lib/NfcoreTemplate.groovy index 8b1168d6..d244008f 100755 --- a/lib/NfcoreTemplate.groovy +++ b/lib/NfcoreTemplate.groovy @@ -19,27 +19,16 @@ class NfcoreTemplate { } // - // Check params.hostnames + // Warn if a -profile or Nextflow config has not been provided to run the pipeline // - public static void hostName(workflow, params, log) { - Map colors = logColours(params.monochrome_logs) - if (params.hostnames) { - try { - def hostname = "hostname".execute().text.trim() - params.hostnames.each { prof, hnames -> - hnames.each { hname -> - if (hostname.contains(hname) && !workflow.profile.contains(prof)) { - log.info "=${colors.yellow}====================================================${colors.reset}=\n" + - "${colors.yellow}WARN: You are running with `-profile $workflow.profile`\n" + - " but your machine hostname is ${colors.white}'$hostname'${colors.reset}.\n" + - " ${colors.yellow_bold}Please use `-profile $prof${colors.reset}`\n" + - "=${colors.yellow}====================================================${colors.reset}=" - } - } - } - } catch (Exception e) { - log.warn "[$workflow.manifest.name] Could not determine 'hostname' - skipping check. Reason: ${e.message}." 
- } + public static void checkConfigProvided(workflow, log) { + if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) { + log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" + + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + + " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + + " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + + " (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + + "Please refer to the quick start section and usage docs for the pipeline.\n " } } @@ -196,7 +185,6 @@ class NfcoreTemplate { log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed successfully, but with errored process(es) ${colors.reset}-" } } else { - hostName(workflow, params, log) log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-" } } diff --git a/lib/Utils.groovy b/lib/Utils.groovy index 18173e98..1b88aec0 100755 --- a/lib/Utils.groovy +++ b/lib/Utils.groovy @@ -37,11 +37,4 @@ class Utils { "===================================================================================" } } - - // - // Join module args with appropriate spacing - // - public static String joinModuleArgs(args_list) { - return ' ' + args_list.join(' ') - } } diff --git a/lib/WorkflowCommons.groovy b/lib/WorkflowCommons.groovy index 9483a91e..5672a1cf 100755 --- a/lib/WorkflowCommons.groovy +++ b/lib/WorkflowCommons.groovy @@ -13,7 +13,7 @@ class WorkflowCommons { " Genome '${params.genome}' not found in any config files provided to the pipeline.\n" + " Currently, the available genome keys are:\n" + " ${params.genomes.keySet().join(", ")}\n" + - "===================================================================================" + "=============================================================================" System.exit(1) } } @@ -73,6 +73,53 @@ class WorkflowCommons { } } + // + // Function to get column entries from a file + // + public static ArrayList getColFromFile(input_file, col=0, uniqify=false, sep='\t') { + def vals = [] + input_file.eachLine { line -> + def val = line.split(sep)[col] + if (uniqify) { + if (!vals.contains(val)) { + vals << val + } + } else { + vals << val + } + } + return vals + } + + // + // Function that returns the number of lines in a file + // + public static Integer getNumLinesInFile(input_file) { + def num_lines = 0 + input_file.eachLine { line -> + num_lines ++ + } + return num_lines + } + + // + // Function to generate an error if contigs in BED file do not match those in reference genome + // + public static void checkContigsInBED(fai_contigs, bed_contigs, log) { + def intersect = bed_contigs.intersect(fai_contigs) + if (intersect.size() != bed_contigs.size()) { + def diff = bed_contigs.minus(intersect).sort() + log.error "=============================================================================\n" + + " Contigs in primer BED file do not match those in the reference genome:\n\n" + + " ${diff.join('\n ')}\n\n" + + " Please check:\n" + + " - Primer BED file supplied with --primer_bed\n" + + " - Genome FASTA file supplied with --fasta\n" + + "=============================================================================" + System.exit(1) + } + } + // // Function to read in all fields into a Groovy Map from Nextclade CSV output 
file // diff --git a/lib/WorkflowIllumina.groovy b/lib/WorkflowIllumina.groovy index f9531376..56b20612 100755 --- a/lib/WorkflowIllumina.groovy +++ b/lib/WorkflowIllumina.groovy @@ -31,10 +31,19 @@ class WorkflowIllumina { } // Variant calling parameter validation - def callers = params.callers ? params.callers.split(',').collect{ it.trim().toLowerCase() } : [] - if ((valid_params['callers'] + callers).unique().size() != valid_params['callers'].size()) { - log.error "Invalid option: ${params.callers}. Valid options for '--callers': ${valid_params['callers'].join(', ')}." - System.exit(1) + if (params.variant_caller) { + if (!valid_params['variant_callers'].contains(params.variant_caller)) { + log.error "Invalid option: ${params.variant_caller}. Valid options for '--variant_caller': ${valid_params['variant_callers'].join(', ')}." + System.exit(1) + } + } + + // Consensus calling parameter validation + if (params.consensus_caller) { + if (!valid_params['consensus_callers'].contains(params.consensus_caller)) { + log.error "Invalid option: ${params.consensus_caller}. Valid options for '--consensus_caller': ${valid_params['consensus_callers'].join(', ')}." + System.exit(1) + } } if (params.protocol == 'amplicon' && !params.skip_variants && !params.primer_bed) { diff --git a/lib/WorkflowMain.groovy b/lib/WorkflowMain.groovy index fa463562..6f9a12db 100755 --- a/lib/WorkflowMain.groovy +++ b/lib/WorkflowMain.groovy @@ -60,6 +60,9 @@ class WorkflowMain { // Print parameter summary log to screen log.info paramsSummaryLog(workflow, params, log) + // Check that a -profile or Nextflow config has been provided to run the pipeline + NfcoreTemplate.checkConfigProvided(workflow, log) + // Check that conda channels are set-up correctly if (params.enable_conda) { Utils.checkCondaChannels(log) @@ -68,9 +71,6 @@ class WorkflowMain { // Check AWS batch settings NfcoreTemplate.awsBatch(workflow, params) - // Check the hostnames against configured profiles - NfcoreTemplate.hostName(workflow, params, log) - // Check sequencing platform def platformList = ['illumina', 'nanopore'] if (!params.platform) { @@ -80,6 +80,14 @@ class WorkflowMain { log.error "Invalid platform option: '${params.platform}'. Valid options: ${platformList.join(', ')}." System.exit(1) } + + // Check Nextclade dataset parameters + if (!params.skip_consensus && !params.skip_nextclade) { + if (!params.nextclade_dataset && !params.nextclade_dataset_name) { + log.error "Nextclade dataset not specified with '--nextclade_dataset' or '--nextclade_dataset_name'. A list of available datasets can be obtained using the Nextclade 'nextclade dataset list' command." 
+ System.exit(1) + } + } } // @@ -134,6 +142,8 @@ class WorkflowMain { } if (genome_map.containsKey(attribute)) { val = genome_map[ attribute ] + } else if (params.genomes[ params.genome ].containsKey(attribute)) { + val = params.genomes[ params.genome ][ attribute ] } } return val diff --git a/main.nf b/main.nf index 9d9eb79d..b0337a10 100644 --- a/main.nf +++ b/main.nf @@ -33,6 +33,11 @@ params.gff = WorkflowMain.getGenomeAttribute(params, 'gff' , log params.bowtie2_index = WorkflowMain.getGenomeAttribute(params, 'bowtie2' , log, primer_set, primer_set_version) params.primer_bed = WorkflowMain.getGenomeAttribute(params, 'primer_bed', log, primer_set, primer_set_version) +params.nextclade_dataset = WorkflowMain.getGenomeAttribute(params, 'nextclade_dataset' , log, primer_set, primer_set_version) +params.nextclade_dataset_name = WorkflowMain.getGenomeAttribute(params, 'nextclade_dataset_name' , log, primer_set, primer_set_version) +params.nextclade_dataset_reference = WorkflowMain.getGenomeAttribute(params, 'nextclade_dataset_reference', log, primer_set, primer_set_version) +params.nextclade_dataset_tag = WorkflowMain.getGenomeAttribute(params, 'nextclade_dataset_tag' , log, primer_set, primer_set_version) + /* ======================================================================================== VALIDATE & PRINT PARAMETER SUMMARY @@ -79,6 +84,7 @@ workflow NFCORE_VIRALRECON { // WORKFLOW: Execute a single named workflow for the pipeline // See: https://github.com/nf-core/rnaseq/issues/619 // + workflow { NFCORE_VIRALRECON () } diff --git a/modules.json b/modules.json index f2405bfc..9295f28c 100644 --- a/modules.json +++ b/modules.json @@ -4,139 +4,157 @@ "repos": { "nf-core/modules": { "abacas": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "artic/guppyplex": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "artic/minion": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "104f896a268df93876c284fdc9e604015bcca243" }, "bandage/image": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "bcftools/consensus": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "bb90e4fb78a977d469aad2a614c673b1981e7806" + }, + "bcftools/filter": { + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "bcftools/mpileup": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "bb90e4fb78a977d469aad2a614c673b1981e7806" }, - "bcftools/stats": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "bcftools/norm": { + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, - "bedtools/genomecov": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "bcftools/query": { + "git_sha": "aa2eca69975dc3b53b0c2fbffcaf70b0112c08d8" + }, + "bcftools/stats": { + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "bedtools/getfasta": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "d473a247d2e0c619b0df877ea19d9a5a98c8e3c8" }, "bedtools/maskfasta": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "bedtools/merge": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "blast/blastn": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" 
}, "blast/makeblastdb": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" }, "bowtie2/align": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "3f9a0285816a2e0ae73a3875fb5ae4b409da5952" }, "bowtie2/build": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e3285528aca2733ff2d544cb5e5fcc34599226f3" }, "cat/fastq": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "826a5603db5cf5b4f1e55cef9cc0b7c37d3c7e70" + }, + "custom/dumpsoftwareversions": { + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" + }, + "custom/getchromsizes": { + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" }, "fastp": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "fastqc": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "gunzip": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" }, "ivar/consensus": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "bb90e4fb78a977d469aad2a614c673b1981e7806" }, "ivar/trim": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "ivar/variants": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "bb90e4fb78a977d469aad2a614c673b1981e7806" }, "kraken2/kraken2": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "3f9a0285816a2e0ae73a3875fb5ae4b409da5952" }, "minia": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "mosdepth": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "nanoplot": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, - "nextclade": { - "git_sha": "29c847424034eb04765d7378fb384ad3094a66a6" + "nextclade/datasetget": { + "git_sha": "aa2eca69975dc3b53b0c2fbffcaf70b0112c08d8" + }, + "nextclade/run": { + "git_sha": "aa2eca69975dc3b53b0c2fbffcaf70b0112c08d8" }, "pangolin": { - "git_sha": "e7e30b6da631ce5288151af4e46488ac6d294ff4" + "git_sha": "aa2eca69975dc3b53b0c2fbffcaf70b0112c08d8" }, "picard/collectmultiplemetrics": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "picard/markduplicates": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "plasmidid": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "d473a247d2e0c619b0df877ea19d9a5a98c8e3c8" }, "pycoqc": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" }, "quast": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "d473a247d2e0c619b0df877ea19d9a5a98c8e3c8" }, "samtools/flagstat": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "samtools/idxstats": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "samtools/index": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "samtools/mpileup": { - "git_sha": 
"e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "samtools/sort": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "samtools/stats": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "e751e5040af57e1b4e06ed4e0f3efe6de25c1683" }, "samtools/view": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "05ba4d901db380c6def3bc242ab18a2d88b25819" }, "spades": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "tabix/bgzip": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "tabix/tabix": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" }, "unicycler": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "9d0cad583b9a71a6509b754fdf589cbfbed08961" }, "untar": { - "git_sha": "e937c7950af70930d1f34bb961403d9d2aa81c7d" + "git_sha": "20d8250d9f39ddb05dfb437603aaf99b5c0b2b41" + }, + "vcflib/vcfuniq": { + "git_sha": "280712419d6ef5e3fecdc6e9eb98f8746fcbe0b7" } } } diff --git a/modules/local/asciigenome.nf b/modules/local/asciigenome.nf index 6f83c915..30ad698a 100644 --- a/modules/local/asciigenome.nf +++ b/modules/local/asciigenome.nf @@ -1,41 +1,29 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process ASCIIGENOME { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::asciigenome=1.16.0 bioconda::bedtools=2.30.0" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/mulled-v2-093691b47d719890dc19ac0c13c4528e9776897f:27211b8c38006480d69eb1be3ef09a7bf0a49d76-0" - } else { - container "quay.io/biocontainers/mulled-v2-093691b47d719890dc19ac0c13c4528e9776897f:27211b8c38006480d69eb1be3ef09a7bf0a49d76-0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-093691b47d719890dc19ac0c13c4528e9776897f:27211b8c38006480d69eb1be3ef09a7bf0a49d76-0' : + 'quay.io/biocontainers/mulled-v2-093691b47d719890dc19ac0c13c4528e9776897f:27211b8c38006480d69eb1be3ef09a7bf0a49d76-0' }" input: tuple val(meta), path(bam), path(vcf) - path fasta - path sizes - path gff - path bed - val window - val track_height + path fasta + path sizes + path gff + path bed + val window + val track_height output: tuple val(meta), path("*pdf"), emit: pdf - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def gff_track = gff ? "$gff" : '' - def bed_track = bed ? "$bed" : '' + def prefix = task.ext.prefix ?: "${meta.id}" + def gff_track = gff ? "$gff" : '' + def bed_track = bed ? "$bed" : '' def paired_end = meta.single_end ? 
'' : '&& readsAsPairs -on' """ zcat $vcf \\ @@ -61,6 +49,10 @@ process ASCIIGENOME { $gff_track \\ > /dev/null - echo \$(ASCIIGenome -ni --version 2>&1) | sed -e "s/ASCIIGenome //g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + asciigenome: \$(echo \$(ASCIIGenome -ni --version 2>&1) | sed -e "s/ASCIIGenome //g") + bedtools: \$(bedtools --version | sed -e "s/bedtools v//g") + END_VERSIONS """ } diff --git a/modules/local/bcftools_isec.nf b/modules/local/bcftools_isec.nf deleted file mode 100644 index f9b6ffb6..00000000 --- a/modules/local/bcftools_isec.nf +++ /dev/null @@ -1,38 +0,0 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - -process BCFTOOLS_ISEC { - tag "$meta.id" - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - - conda (params.enable_conda ? "bioconda::bcftools=1.11" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bcftools:1.11--h7c999a4_0" - } else { - container "quay.io/biocontainers/bcftools:1.11--h7c999a4_0" - } - - input: - tuple val(meta), path('ivar/*'), path('ivar/*'), path('bcftools/*'), path('bcftools/*') - - output: - tuple val(meta), path("${prefix}"), emit: results - path "*.version.txt" , emit: version - - script: - def software = getSoftwareName(task.process) - prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - """ - bcftools isec \\ - $options.args \\ - -p $prefix \\ - */*.vcf.gz - - echo \$(bcftools --version 2>&1) | sed 's/^.*bcftools //; s/ .*\$//' > ${software}.version.txt - """ -} diff --git a/modules/local/collapse_primers.nf b/modules/local/collapse_primers.nf index 2b81410f..78e371bd 100644 --- a/modules/local/collapse_primers.nf +++ b/modules/local/collapse_primers.nf @@ -1,36 +1,32 @@ -// Import generic module functions -include { saveFiles } from './functions' - -params.options = [:] - process COLLAPSE_PRIMERS { tag "$bed" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'primers', meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? "conda-forge::python=3.8.3" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/python:3.8.3" - } else { - container "quay.io/biocontainers/python:3.8.3" - } + conda (params.enable_conda ? "conda-forge::python=3.9.5" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/python:3.9--1' : + 'quay.io/biocontainers/python:3.9--1' }" input: path bed - val left_suffix - val right_suffix + val left_suffix + val right_suffix output: - path '*.bed', emit: bed + path '*.bed' , emit: bed + path "versions.yml", emit: versions - script: + script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ """ collapse_primer_bed.py \\ --left_primer_suffix $left_suffix \\ --right_primer_suffix $right_suffix \\ $bed \\ ${bed.baseName}.collapsed.bed + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$(python --version | sed 's/Python //g') + END_VERSIONS """ } diff --git a/modules/local/cutadapt.nf b/modules/local/cutadapt.nf index 7d75385b..76a824e7 100644 --- a/modules/local/cutadapt.nf +++ b/modules/local/cutadapt.nf @@ -1,48 +1,40 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process CUTADAPT { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::cutadapt=3.2' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/cutadapt:3.2--py38h0213d0e_0' - } else { - container 'quay.io/biocontainers/cutadapt:3.2--py38h0213d0e_0' - } + conda (params.enable_conda ? 'bioconda::cutadapt=3.5' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/cutadapt:3.5--py39h38f01e4_0' : + 'quay.io/biocontainers/cutadapt:3.5--py39h38f01e4_0' }" input: tuple val(meta), path(reads) - path adapters + path adapters output: tuple val(meta), path('*.fastq.gz'), emit: reads tuple val(meta), path('*.log') , emit: log - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def paired = meta.single_end ? "-a file:adapters.sub.fa" : "-a file:adapters.sub.fa -A file:adapters.sub.fa" - def trimmed = meta.single_end ? "-o ${prefix}.fastq.gz" : "-o ${prefix}_1.fastq.gz -p ${prefix}_2.fastq.gz" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def paired = meta.single_end ? "-a file:adapters.sub.fa" : "-a file:adapters.sub.fa -A file:adapters.sub.fa" + def trimmed = meta.single_end ? 
"-o ${prefix}.fastq.gz" : "-o ${prefix}_1.fastq.gz -p ${prefix}_2.fastq.gz" """ sed -r '/^[ACTGactg]+\$/ s/\$/X/g' $adapters > adapters.sub.fa cutadapt \\ --cores $task.cpus \\ - $options.args \\ + $args \\ $paired \\ $trimmed \\ $reads \\ > ${prefix}.cutadapt.log - echo \$(cutadapt --version) > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cutadapt: \$(cutadapt --version) + END_VERSIONS """ } diff --git a/modules/local/filter_blastn.nf b/modules/local/filter_blastn.nf index 004314f0..6acf352f 100644 --- a/modules/local/filter_blastn.nf +++ b/modules/local/filter_blastn.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process FILTER_BLASTN { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'blastn', meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "conda-forge::sed=4.7" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img" - } else { - container "biocontainers/biocontainers:v1.2.0_cv1" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" input: tuple val(meta), path(hits) @@ -24,11 +13,17 @@ process FILTER_BLASTN { output: tuple val(meta), path('*.txt'), emit: txt + path "versions.yml" , emit: versions script: - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" """ awk 'BEGIN{OFS=\"\\t\";FS=\"\\t\"}{print \$0,\$5/\$15,\$5/\$14}' $hits | awk 'BEGIN{OFS=\"\\t\";FS=\"\\t\"} \$15 > 200 && \$17 > 0.7 && \$1 !~ /phage/ {print \$0}' > tmp.out cat $header tmp.out > ${prefix}.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + sed: \$(echo \$(sed --version 2>&1) | sed 's/^.*GNU sed) //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/local/functions.nf b/modules/local/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/local/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/local/get_chrom_sizes.nf b/modules/local/get_chrom_sizes.nf deleted file mode 100644 index e20fb395..00000000 --- a/modules/local/get_chrom_sizes.nf +++ /dev/null @@ -1,34 +0,0 @@ -// Import generic module functions -include { saveFiles } from './functions' - -params.options = [:] - -process GET_CHROM_SIZES { - tag "$fasta" - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'genome', meta:[:], publish_by_meta:[]) } - - conda (params.enable_conda ? 
"bioconda::samtools=1.10" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.10--h9402c20_2" - } else { - container "quay.io/biocontainers/samtools:1.10--h9402c20_2" - } - - input: - path fasta - - output: - path '*.sizes' , emit: sizes - path '*.fai' , emit: fai - path "*.version.txt", emit: version - - script: - def software = 'samtools' - """ - samtools faidx $fasta - cut -f 1,2 ${fasta}.fai > ${fasta}.sizes - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt - """ -} diff --git a/modules/local/get_software_versions.nf b/modules/local/get_software_versions.nf deleted file mode 100644 index 15a486ca..00000000 --- a/modules/local/get_software_versions.nf +++ /dev/null @@ -1,33 +0,0 @@ -// Import generic module functions -include { saveFiles } from './functions' - -params.options = [:] - -process GET_SOFTWARE_VERSIONS { - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'pipeline_info', meta:[:], publish_by_meta:[]) } - - conda (params.enable_conda ? "conda-forge::python=3.8.3" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/python:3.8.3" - } else { - container "quay.io/biocontainers/python:3.8.3" - } - - cache false - - input: - path versions - - output: - path "software_versions.tsv" , emit: tsv - path 'software_versions_mqc.yaml', emit: yaml - - script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ - """ - echo $workflow.manifest.version > pipeline.version.txt - echo $workflow.nextflow.version > nextflow.version.txt - scrape_software_versions.py &> software_versions_mqc.yaml - """ -} diff --git a/modules/local/ivar_variants_to_vcf.nf b/modules/local/ivar_variants_to_vcf.nf index 3ff46e0f..a27b6dda 100644 --- a/modules/local/ivar_variants_to_vcf.nf +++ b/modules/local/ivar_variants_to_vcf.nf @@ -1,21 +1,10 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process IVAR_VARIANTS_TO_VCF { tag "$meta.id" - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "conda-forge::python=3.8.3" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/python:3.8.3" - } else { - container "quay.io/biocontainers/python:3.8.3" - } + conda (params.enable_conda ? "conda-forge::python=3.9.5 conda-forge::matplotlib=3.5.1 conda-forge::pandas=1.3.5 conda-forge::r-sys=3.4 conda-forge::regex=2021.11.10 conda-forge::scipy=1.7.3" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/mulled-v2-77320db00eefbbf8c599692102c3d387a37ef02a:08144a66f00dc7684fad061f1466033c0176e7ad-0' : + 'quay.io/biocontainers/mulled-v2-77320db00eefbbf8c599692102c3d387a37ef02a:08144a66f00dc7684fad061f1466033c0176e7ad-0' }" input: tuple val(meta), path(tsv) @@ -25,16 +14,26 @@ process IVAR_VARIANTS_TO_VCF { tuple val(meta), path("*.vcf"), emit: vcf tuple val(meta), path("*.log"), emit: log tuple val(meta), path("*.tsv"), emit: tsv + path "versions.yml" , emit: versions script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ ivar_variants_to_vcf.py \\ $tsv \\ - ${prefix}.vcf \\ - $options.args \\ + unsorted.txt \\ + $args \\ > ${prefix}.variant_counts.log + ## Order vcf by coordinates + cat unsorted.txt | grep "^#" > ${prefix}.vcf; cat unsorted.txt | grep -v "^#" | sort -k1,1d -k2,2n >> ${prefix}.vcf + cat $header ${prefix}.variant_counts.log > ${prefix}.variant_counts_mqc.tsv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$(python --version | sed 's/Python //g') + END_VERSIONS """ } diff --git a/modules/local/kraken2_build.nf b/modules/local/kraken2_build.nf index ae7bc2f6..22c9c2de 100644 --- a/modules/local/kraken2_build.nf +++ b/modules/local/kraken2_build.nf @@ -1,36 +1,32 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process KRAKEN2_BUILD { + tag "$library" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? 'bioconda::kraken2=2.1.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/kraken2:2.1.1--pl526hc9558a2_0' - } else { - container 'quay.io/biocontainers/kraken2:2.1.1--pl526hc9558a2_0' - } + conda (params.enable_conda ? 'bioconda::kraken2=2.1.2 conda-forge::pigz=2.6' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:87fc08d11968d081f3e8a37131c1f1f6715b6542-0' : + 'quay.io/biocontainers/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:87fc08d11968d081f3e8a37131c1f1f6715b6542-0' }" input: val library output: - path 'kraken2_db' , emit: db - path '*.version.txt', emit: version + path 'kraken2_db' , emit: db + path "versions.yml", emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def args3 = task.ext.args3 ?: '' """ - kraken2-build --db kraken2_db --threads $task.cpus $options.args --download-taxonomy - kraken2-build --db kraken2_db --threads $task.cpus $options.args2 --download-library $library - kraken2-build --db kraken2_db --threads $task.cpus $options.args3 --build + kraken2-build --db kraken2_db --threads $task.cpus $args --download-taxonomy + kraken2-build --db kraken2_db --threads $task.cpus $args2 --download-library $library + kraken2-build --db kraken2_db --threads $task.cpus $args3 --build - echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + kraken2: \$(echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS """ } diff --git a/modules/local/make_bed_mask.nf b/modules/local/make_bed_mask.nf index 08c3a0a9..81546c4b 100644 --- a/modules/local/make_bed_mask.nf +++ b/modules/local/make_bed_mask.nf @@ -1,40 +1,44 @@ -// Import generic module functions -include { initOptions; saveFiles } from './functions' - -params.options = [:] -options = initOptions(params.options) - process MAKE_BED_MASK { tag "$meta.id" - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'bed', meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "conda-forge::python=3.8.3" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/python:3.8.3" - } else { - container "quay.io/biocontainers/python:3.8.3" - } + conda (params.enable_conda ? "conda-forge::python=3.9.5 bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-1a35167f7a491c7086c13835aaa74b39f1f43979:6b5cffa1187cfccf2dc983ed3b5359d49b999eb0-0' : + 'quay.io/biocontainers/mulled-v2-1a35167f7a491c7086c13835aaa74b39f1f43979:6b5cffa1187cfccf2dc983ed3b5359d49b999eb0-0' }" input: - tuple val(meta), path(vcf), path(bed) - path fasta + tuple val(meta), path(bam), path(vcf) + path fasta + val save_mpileup output: - tuple val(meta), path("*.bed") , emit: bed - tuple val(meta), path("*.fasta"), emit: fasta + tuple val(meta), path("*.bed") , emit: bed + tuple val(meta), path("*.mpileup"), optional:true, emit: mpileup + path "versions.yml" , emit: versions script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: 10 + def prefix = task.ext.prefix ?: "${meta.id}" + def mpileup = save_mpileup ? 
"| tee ${prefix}.mpileup" : "" """ + samtools \\ + mpileup \\ + $args \\ + --reference $fasta \\ + $bam \\ + $mpileup \\ + | awk -v OFS='\\t' '{print \$1, \$2-1, \$2, \$4}' | awk '\$4 < $args2' > lowcov_positions.txt + make_bed_mask.py \\ $vcf \\ - $bed \\ + lowcov_positions.txt \\ ${prefix}.bed - ## Rename fasta entry by sample name and not reference genome - FASTA_NAME=\$(head -n1 $fasta | sed 's/>//g') - sed "s/\${FASTA_NAME}/${meta.id}/g" $fasta > ${prefix}.fasta + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + python: \$(python --version | sed 's/Python //g') + END_VERSIONS """ } diff --git a/modules/local/make_variants_long_table.nf b/modules/local/make_variants_long_table.nf new file mode 100644 index 00000000..c4f9996a --- /dev/null +++ b/modules/local/make_variants_long_table.nf @@ -0,0 +1,31 @@ +process MAKE_VARIANTS_LONG_TABLE { + + conda (params.enable_conda ? "conda-forge::python=3.9.5 conda-forge::matplotlib=3.5.1 conda-forge::pandas=1.3.5 conda-forge::r-sys=3.4 conda-forge::regex=2021.11.10 conda-forge::scipy=1.7.3" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-77320db00eefbbf8c599692102c3d387a37ef02a:08144a66f00dc7684fad061f1466033c0176e7ad-0' : + 'quay.io/biocontainers/mulled-v2-77320db00eefbbf8c599692102c3d387a37ef02a:08144a66f00dc7684fad061f1466033c0176e7ad-0' }" + + input: + path ('bcftools_query/*') + path ('snpsift/*') + path ('pangolin/*') + + output: + path "*.csv" , emit: csv + path "versions.yml", emit: versions + + script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ + def args = task.ext.args ?: '' + """ + make_variants_long_table.py \\ + --bcftools_query_dir ./bcftools_query \\ + --snpsift_dir ./snpsift \\ + --pangolin_dir ./pangolin \\ + $args + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$(python --version | sed 's/Python //g') + END_VERSIONS + """ +} diff --git a/modules/local/multiqc_custom_csv_from_map.nf b/modules/local/multiqc_custom_csv_from_map.nf deleted file mode 100644 index 49968fd4..00000000 --- a/modules/local/multiqc_custom_csv_from_map.nf +++ /dev/null @@ -1,27 +0,0 @@ -// Import generic module functions -include { saveFiles; getSoftwareName } from './functions' - -params.options = [:] - -process MULTIQC_CUSTOM_CSV_FROM_MAP { - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } - - memory 100.MB - - input: - val csv_data - val out_prefix - - output: - path "*.csv" - - exec: - // Write to file - def file = task.workDir.resolve("${out_prefix}_mqc.csv") - file.write csv_data[0].keySet().join(",") + '\n' - csv_data.each { data -> - file.append(data.values().join(",") + '\n') - } -} diff --git a/modules/local/multiqc_custom_tsv_from_string.nf b/modules/local/multiqc_custom_tsv_from_string.nf deleted file mode 100644 index c2ece8bc..00000000 --- a/modules/local/multiqc_custom_tsv_from_string.nf +++ /dev/null @@ -1,37 +0,0 @@ -// Import generic module functions -include { saveFiles; getSoftwareName } from './functions' - -params.options = [:] - -process MULTIQC_CUSTOM_TSV_FROM_STRING { - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, 
options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } - - conda (params.enable_conda ? "conda-forge::sed=4.7" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img" - } else { - container "biocontainers/biocontainers:v1.2.0_cv1" - } - - input: - val tsv_data - val col_names - val out_prefix - - output: - path "*.tsv" - - script: - if (tsv_data.size() > 0) { - """ - echo "${col_names}" > ${out_prefix}_mqc.tsv - echo "${tsv_data.join('\n')}" >> ${out_prefix}_mqc.tsv - """ - } else { - """ - touch ${out_prefix}_mqc.tsv - """ - } -} diff --git a/modules/local/multiqc_illumina.nf b/modules/local/multiqc_illumina.nf index f1f15a15..eb3a12b9 100644 --- a/modules/local/multiqc_illumina.nf +++ b/modules/local/multiqc_illumina.nf @@ -1,21 +1,10 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process MULTIQC { label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "bioconda::multiqc=1.11" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0" - } else { - container "quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0' : + 'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0' }" input: path 'multiqc_config.yaml' @@ -33,17 +22,12 @@ process MULTIQC { path ('ivar_trim/*') path ('picard_markduplicates/*') path ('mosdepth/*') - path ('variants_ivar/*') - path ('variants_ivar/*') - path ('variants_ivar/*') - path ('variants_ivar/*') - path ('variants_ivar/*') - path ('variants_ivar/*') - path ('variants_bcftools/*') - path ('variants_bcftools/*') - path ('variants_bcftools/*') - path ('variants_bcftools/*') - path ('variants_bcftools/*') + path ('variants/*') + path ('variants/*') + path ('variants/*') + path ('variants/*') + path ('variants/*') + path ('variants/*') path ('cutadapt/*') path ('assembly_spades/*') path ('assembly_unicycler/*') @@ -55,13 +39,14 @@ process MULTIQC { path "*variants_metrics_mqc.csv", optional:true, emit: csv_variants path "*assembly_metrics_mqc.csv", optional:true, emit: csv_assembly path "*_plots" , optional:true, emit: plots + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def custom_config = params.multiqc_config ? "--config $multiqc_custom_config" : '' + def args = task.ext.args ?: '' + def custom_config = multiqc_custom_config ? "--config $multiqc_custom_config" : '' """ ## Run MultiQC once to parse tool logs - multiqc -f $options.args $custom_config . + multiqc -f $args $custom_config . 
## Parse YAML files dumped by MultiQC to obtain metrics multiqc_to_custom_csv.py --platform illumina @@ -75,10 +60,14 @@ process MULTIQC { rm -f *variants_metrics_mqc.csv fi - rm -f variants_ivar/report.tsv - rm -f variants_bcftools/report.tsv + rm -f variants/report.tsv ## Run MultiQC a second time - multiqc -f $options.args -e general_stats --ignore *nextclade_clade_mqc.tsv $custom_config . + multiqc -f $args -e general_stats --ignore nextclade_clade_mqc.tsv $custom_config . + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" ) + END_VERSIONS """ } diff --git a/modules/local/multiqc_nanopore.nf b/modules/local/multiqc_nanopore.nf index de171181..f1cb728d 100644 --- a/modules/local/multiqc_nanopore.nf +++ b/modules/local/multiqc_nanopore.nf @@ -1,21 +1,10 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process MULTIQC { label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "bioconda::multiqc=1.11" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0" - } else { - container "quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0' : + 'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0' }" input: path 'multiqc_config.yaml' @@ -42,13 +31,14 @@ process MULTIQC { path "*_data" , emit: data path "*.csv" , optional:true, emit: csv path "*_plots" , optional:true, emit: plots + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def custom_config = params.multiqc_config ? "--config $multiqc_custom_config" : '' + def args = task.ext.args ?: '' + def custom_config = multiqc_custom_config ? "--config $multiqc_custom_config" : '' """ ## Run MultiQC once to parse tool logs - multiqc -f $options.args $custom_config . + multiqc -f $args $custom_config . ## Parse YAML files dumped by MultiQC to obtain metrics multiqc_to_custom_csv.py --platform nanopore @@ -57,6 +47,11 @@ process MULTIQC { rm -rf quast ## Run MultiQC a second time - multiqc -f $options.args -e general_stats --ignore *nextclade_clade_mqc.tsv $custom_config . + multiqc -f $args -e general_stats --ignore *nextclade_clade_mqc.tsv $custom_config . 
+ + cat <<-END_VERSIONS > versions.yml + "${task.process}": + multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" ) + END_VERSIONS """ } diff --git a/modules/local/multiqc_tsv_from_list.nf b/modules/local/multiqc_tsv_from_list.nf new file mode 100644 index 00000000..7f7b7c0a --- /dev/null +++ b/modules/local/multiqc_tsv_from_list.nf @@ -0,0 +1,25 @@ +process MULTIQC_TSV_FROM_LIST { + + executor 'local' + memory 100.MB + + input: + val tsv_data // [ ['foo', 1], ['bar', 1] ] + val header // [ 'name', 'number' ] + val out_prefix + + output: + path "*.tsv" + + exec: + // Generate file contents + def contents = "" + if (tsv_data.size() > 0) { + contents += "${header.join('\t')}\n" + contents += tsv_data.join('\n') + } + + // Write to file + def mqc_file = task.workDir.resolve("${out_prefix}_mqc.tsv") + mqc_file.text = contents +} diff --git a/modules/local/plot_base_density.nf b/modules/local/plot_base_density.nf index d5ddfce1..ec499aba 100644 --- a/modules/local/plot_base_density.nf +++ b/modules/local/plot_base_density.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PLOT_BASE_DENSITY { tag "$fasta" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'plots', meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "conda-forge::r-base=4.0.3 conda-forge::r-reshape2=1.4.4 conda-forge::r-optparse=1.6.6 conda-forge::r-ggplot2=3.3.3 conda-forge::r-scales=1.1.1 conda-forge::r-viridis=0.5.1 conda-forge::r-tidyverse=1.3.0 bioconda::bioconductor-biostrings=2.58.0 bioconda::bioconductor-complexheatmap=2.6.2" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0" - } else { - container "quay.io/biocontainers/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0' : + 'quay.io/biocontainers/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0' }" input: tuple val(meta), path(fasta) @@ -24,13 +13,19 @@ process PLOT_BASE_DENSITY { output: tuple val(meta), path('*.pdf'), emit: pdf tuple val(meta), path('*.tsv'), emit: tsv + path "versions.yml" , emit: versions - script: - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ + def prefix = task.ext.prefix ?: "${meta.id}" """ plot_base_density.r \\ --fasta_files $fasta \\ --prefixes $prefix \\ --output_dir ./ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/local/plot_mosdepth_regions.nf b/modules/local/plot_mosdepth_regions.nf index 66dc082d..38f9f21c 100644 --- a/modules/local/plot_mosdepth_regions.nf +++ b/modules/local/plot_mosdepth_regions.nf @@ -1,21 +1,10 @@ -// Import generic module functions -include { initOptions; saveFiles } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PLOT_MOSDEPTH_REGIONS { label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'mosdepth', meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "conda-forge::r-base=4.0.3 conda-forge::r-reshape2=1.4.4 conda-forge::r-optparse=1.6.6 conda-forge::r-ggplot2=3.3.3 conda-forge::r-scales=1.1.1 conda-forge::r-viridis=0.5.1 conda-forge::r-tidyverse=1.3.0 bioconda::bioconductor-biostrings=2.58.0 bioconda::bioconductor-complexheatmap=2.6.2" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0" - } else { - container "quay.io/biocontainers/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0' : + 'quay.io/biocontainers/mulled-v2-ad9dd5f398966bf899ae05f8e7c54d0fb10cdfa7:05678da05b8e5a7a5130e90a9f9a6c585b965afa-0' }" input: path beds @@ -25,14 +14,21 @@ process PLOT_MOSDEPTH_REGIONS { path '*coverage.tsv', emit: coverage_tsv path '*heatmap.pdf' , optional:true, emit: heatmap_pdf path '*heatmap.tsv' , optional:true, emit: heatmap_tsv + path "versions.yml" , emit: versions - script: - def prefix = options.suffix ?: "mosdepth" + script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "mosdepth" """ plot_mosdepth_regions.r \\ --input_files ${beds.join(',')} \\ --output_dir ./ \\ --output_suffix $prefix \\ - $options.args + $args + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/local/rename_fasta_header.nf b/modules/local/rename_fasta_header.nf new file mode 100644 index 00000000..64c45a73 --- /dev/null +++ b/modules/local/rename_fasta_header.nf @@ -0,0 +1,26 @@ +process RENAME_FASTA_HEADER { + tag "$meta.id" + + conda (params.enable_conda ? "conda-forge::sed=4.7" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" + + input: + tuple val(meta), path(fasta) + + output: + tuple val(meta), path("*.fa"), emit: fasta + path "versions.yml" , emit: versions + + script: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + sed "s/>/>${meta.id} /g" $fasta > ${prefix}.fa + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + sed: \$(echo \$(sed --version 2>&1) | sed 's/^.*GNU sed) //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/local/samplesheet_check.nf b/modules/local/samplesheet_check.nf index c28dca82..60c9c512 100644 --- a/modules/local/samplesheet_check.nf +++ b/modules/local/samplesheet_check.nf @@ -1,27 +1,18 @@ -// Import generic module functions -include { saveFiles } from './functions' - -params.options = [:] - process SAMPLESHEET_CHECK { tag "$samplesheet" - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'pipeline_info', meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? "conda-forge::python=3.8.3" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/python:3.8.3" - } else { - container "quay.io/biocontainers/python:3.8.3" - } + conda (params.enable_conda ? "conda-forge::python=3.9.5" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/python:3.9--1' : + 'quay.io/biocontainers/python:3.9--1' }" input: path samplesheet val platform output: - path '*.csv' + path '*.csv' , emit: csv + path "versions.yml", emit: versions script: // This script is bundled with the pipeline, in nf-core/viralrecon/bin/ """ @@ -29,5 +20,10 @@ process SAMPLESHEET_CHECK { $samplesheet \\ samplesheet.valid.csv \\ --platform $platform + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + python: \$(python --version | sed 's/Python //g') + END_VERSIONS """ } diff --git a/modules/local/snpeff_ann.nf b/modules/local/snpeff_ann.nf index 9bd6c15b..04823334 100644 --- a/modules/local/snpeff_ann.nf +++ b/modules/local/snpeff_ann.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SNPEFF_ANN { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::snpeff=5.0' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/snpeff:5.0--0' - } else { - container 'quay.io/biocontainers/snpeff:5.0--0' - } + conda (params.enable_conda ? "bioconda::snpeff=5.0" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/snpeff:5.0--hdfd78af_1' : + 'quay.io/biocontainers/snpeff:5.0--hdfd78af_1' }" input: tuple val(meta), path(vcf) @@ -29,11 +18,12 @@ process SNPEFF_ANN { tuple val(meta), path("*.csv") , emit: csv tuple val(meta), path("*.genes.txt"), emit: txt tuple val(meta), path("*.html") , emit: html - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def avail_mem = 4 if (!task.memory) { log.info '[snpEff] Available memory not known - defaulting to 4GB. Specify process memory requirements to change this.' @@ -46,12 +36,15 @@ process SNPEFF_ANN { ${fasta.baseName} \\ -config $config \\ -dataDir $db \\ - $options.args \\ + $args \\ $vcf \\ -csvStats ${prefix}.snpeff.csv \\ > ${prefix}.snpeff.vcf mv snpEff_summary.html ${prefix}.snpeff.summary.html - echo \$(snpEff -version 2>&1) | sed 's/^.*SnpEff //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ') + END_VERSIONS """ } diff --git a/modules/local/snpeff_build.nf b/modules/local/snpeff_build.nf index 5e61d105..abdd9b3c 100644 --- a/modules/local/snpeff_build.nf +++ b/modules/local/snpeff_build.nf @@ -1,35 +1,24 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SNPEFF_BUILD { tag "$fasta" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? 'bioconda::snpeff=5.0' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/snpeff:5.0--0' - } else { - container 'quay.io/biocontainers/snpeff:5.0--0' - } + conda (params.enable_conda ? "bioconda::snpeff=5.0" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/snpeff:5.0--hdfd78af_1' : + 'quay.io/biocontainers/snpeff:5.0--hdfd78af_1' }" input: path fasta path gff output: - path 'snpeff_db' , emit: db - path '*.config' , emit: config - path '*.version.txt', emit: version + path 'snpeff_db' , emit: db + path '*.config' , emit: config + path "versions.yml", emit: versions script: - def software = getSoftwareName(task.process) - def basename = fasta.baseName + def basename = fasta.baseName + def avail_mem = 4 if (!task.memory) { log.info '[snpEff] Available memory not known - defaulting to 4GB. Specify process memory requirements to change this.' 
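Note on the hunks above: both snpEff modules keep a `def avail_mem = 4` default, but the line that actually consumes it falls outside the visible diff context. A minimal sketch of the usual nf-core pattern, assuming the value is forwarded to the JVM via `-Xmx` (the trailing arguments are elided, not taken from this diff):

```groovy
// Sketch, not part of this diff: the usual nf-core pattern for sizing
// the snpEff JVM heap from the task's declared memory.
def avail_mem = 4
if (!task.memory) {
    log.info '[snpEff] Available memory not known - defaulting to 4GB. Specify process memory requirements to change this.'
} else {
    avail_mem = task.memory.giga
}
"""
snpEff -Xmx${avail_mem}g ... # remaining build/annotation arguments elided
"""
```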
@@ -58,6 +47,9 @@ process SNPEFF_BUILD { -v \\ ${basename} - echo \$(snpEff -version 2>&1) | sed 's/^.*SnpEff //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ') + END_VERSIONS """ } diff --git a/modules/local/snpsift_extractfields.nf b/modules/local/snpsift_extractfields.nf index a02c42a5..88dfb9f7 100644 --- a/modules/local/snpsift_extractfields.nf +++ b/modules/local/snpsift_extractfields.nf @@ -1,33 +1,23 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SNPSIFT_EXTRACTFIELDS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::snpsift=4.3.1t' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/snpsift:4.3.1t--2' - } else { - container 'quay.io/biocontainers/snpsift:4.3.1t--2' - } + conda (params.enable_conda ? "bioconda::snpsift=4.3.1t" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/snpsift:4.3.1t--hdfd78af_3' : + 'quay.io/biocontainers/snpsift:4.3.1t--hdfd78af_3' }" input: tuple val(meta), path(vcf) output: tuple val(meta), path("*.snpsift.txt"), emit: txt - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def avail_mem = 4 if (!task.memory) { log.info '[SnpSift] Available memory not known - defaulting to 4GB. Specify process memory requirements to change this.' @@ -40,7 +30,7 @@ process SNPSIFT_EXTRACTFIELDS { extractFields \\ -s "," \\ -e "." 
\\ - $options.args \\ + $args \\ $vcf \\ CHROM POS REF ALT \\ "ANN[*].GENE" "ANN[*].GENEID" \\ @@ -53,6 +43,9 @@ process SNPSIFT_EXTRACTFIELDS { "EFF[*].FUNCLASS" "EFF[*].CODON" "EFF[*].AA" "EFF[*].AA_LEN" \\ > ${prefix}.snpsift.txt - echo \$(SnpSift -h 2>&1) | sed 's/^.*SnpSift version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + snpsift: \$( echo \$(SnpSift split -h 2>&1) | sed 's/^.*version //' | sed 's/(.*//' | sed 's/t//g' ) + END_VERSIONS """ } diff --git a/modules/nf-core/modules/abacas/functions.nf b/modules/nf-core/modules/abacas/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/abacas/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/abacas/main.nf b/modules/nf-core/modules/abacas/main.nf index 6ec65ea2..49040214 100644 --- a/modules/nf-core/modules/abacas/main.nf +++ b/modules/nf-core/modules/abacas/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process ABACAS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::abacas=1.3.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/abacas:1.3.1--pl526_0" - } else { - container "quay.io/biocontainers/abacas:1.3.1--pl526_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/abacas:1.3.1--pl526_0' : + 'quay.io/biocontainers/abacas:1.3.1--pl526_0' }" input: tuple val(meta), path(scaffold) @@ -24,22 +13,25 @@ process ABACAS { output: tuple val(meta), path('*.abacas*'), emit: results - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ abacas.pl \\ -r $fasta \\ -q $scaffold \\ - $options.args \\ + $args \\ -o ${prefix}.abacas mv nucmer.delta ${prefix}.abacas.nucmer.delta mv nucmer.filtered.delta ${prefix}.abacas.nucmer.filtered.delta mv nucmer.tiling ${prefix}.abacas.nucmer.tiling mv unused_contigs.out ${prefix}.abacas.unused.contigs.out - echo \$(abacas.pl -v 2>&1) | sed 's/^.*ABACAS.//; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + abacas: \$(echo \$(abacas.pl -v 2>&1) | sed 's/^.*ABACAS.//; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/abacas/meta.yml b/modules/nf-core/modules/abacas/meta.yml index d60afee0..039fb0be 100644 --- a/modules/nf-core/modules/abacas/meta.yml +++ b/modules/nf-core/modules/abacas/meta.yml @@ -48,10 +48,10 @@ output: 'test.abacas.MULTIFASTA.fa' ] pattern: "*.{abacas}*" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" diff --git a/modules/nf-core/modules/artic/guppyplex/functions.nf b/modules/nf-core/modules/artic/guppyplex/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/artic/guppyplex/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/artic/guppyplex/main.nf b/modules/nf-core/modules/artic/guppyplex/main.nf index 41178298..780f5111 100644 --- a/modules/nf-core/modules/artic/guppyplex/main.nf +++ b/modules/nf-core/modules/artic/guppyplex/main.nf @@ -1,41 +1,33 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process ARTIC_GUPPYPLEX { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::artic=1.2.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/artic:1.2.1--py_0" - } else { - container "quay.io/biocontainers/artic:1.2.1--py_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/artic:1.2.1--py_0' : + 'quay.io/biocontainers/artic:1.2.1--py_0' }" input: tuple val(meta), path(fastq_dir) output: tuple val(meta), path("*.fastq.gz"), emit: fastq - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ artic \\ guppyplex \\ - $options.args \\ + $args \\ --directory $fastq_dir \\ --output ${prefix}.fastq pigz -p $task.cpus *.fastq - echo \$(artic --version 2>&1) | sed 's/^.*artic //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + artic: \$(artic --version 2>&1 | sed 's/^.*artic //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/artic/guppyplex/meta.yml b/modules/nf-core/modules/artic/guppyplex/meta.yml index 0caaf5d2..5056f908 100644 --- a/modules/nf-core/modules/artic/guppyplex/meta.yml +++ b/modules/nf-core/modules/artic/guppyplex/meta.yml @@ -34,10 +34,10 @@ output: type: file description: Aggregated FastQ files pattern: "*.{fastq.gz}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" diff --git a/modules/nf-core/modules/artic/minion/functions.nf b/modules/nf-core/modules/artic/minion/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/artic/minion/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/artic/minion/main.nf b/modules/nf-core/modules/artic/minion/main.nf index e408551b..c25ec0db 100644 --- a/modules/nf-core/modules/artic/minion/main.nf +++ b/modules/nf-core/modules/artic/minion/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process ARTIC_MINION { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::artic=1.2.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/artic:1.2.1--py_0" - } else { - container "quay.io/biocontainers/artic:1.2.1--py_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/artic:1.2.1--py_0' : + 'quay.io/biocontainers/artic:1.2.1--py_0' }" input: tuple val(meta), path(fastq) @@ -40,24 +29,27 @@ process ARTIC_MINION { tuple val(meta), path("${prefix}.pass.vcf.gz") , emit: vcf tuple val(meta), path("${prefix}.pass.vcf.gz.tbi") , emit: tbi tuple val(meta), path("*.json"), optional:true , emit: json - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" def version = scheme_version.toString().toLowerCase().replaceAll('v','') - def fast5 = params.fast5_dir ? "--fast5-directory $fast5_dir" : "" - def summary = params.sequencing_summary ? "--sequencing-summary $sequencing_summary" : "" + def fast5 = fast5_dir ? "--fast5-directory $fast5_dir" : "" + def summary = sequencing_summary ? "--sequencing-summary $sequencing_summary" : "" def model = "" - if (options.args.tokenize().contains('--medaka')) { + if (args.tokenize().contains('--medaka')) { fast5 = "" summary = "" - model = file(params.artic_minion_medaka_model).exists() ? "--medaka-model ./$medaka_model" : "--medaka-model $params.artic_minion_medaka_model" + model = file(medaka_model).exists() ? "--medaka-model ./$medaka_model" : "--medaka-model $medaka_model" } + def hd5_plugin_path = task.ext.hd5_plugin_path ? 
"export HDF5_PLUGIN_PATH=" + task.ext.hd5_plugin_path : "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin" """ + $hd5_plugin_path + artic \\ minion \\ - $options.args \\ + $args \\ --threads $task.cpus \\ --read-file $fastq \\ --scheme-directory ./primer-schemes \\ @@ -68,6 +60,9 @@ process ARTIC_MINION { $scheme \\ $prefix - echo \$(artic --version 2>&1) | sed 's/^.*artic //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + artic: \$(artic --version 2>&1 | sed 's/^.*artic //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/artic/minion/meta.yml b/modules/nf-core/modules/artic/minion/meta.yml index 1b6a73cf..464e1dc7 100644 --- a/modules/nf-core/modules/artic/minion/meta.yml +++ b/modules/nf-core/modules/artic/minion/meta.yml @@ -103,10 +103,10 @@ output: type: file description: JSON file for MultiQC pattern: "*.json" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" diff --git a/modules/nf-core/modules/bandage/image/functions.nf b/modules/nf-core/modules/bandage/image/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bandage/image/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bandage/image/main.nf b/modules/nf-core/modules/bandage/image/main.nf index 6afdb60d..bc2a9495 100644 --- a/modules/nf-core/modules/bandage/image/main.nf +++ b/modules/nf-core/modules/bandage/image/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BANDAGE_IMAGE { tag "${meta.id}" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? 'bioconda::bandage=0.8.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bandage:0.8.1--hc9558a2_2" - } else { - container "quay.io/biocontainers/bandage:0.8.1--hc9558a2_2" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bandage:0.8.1--hc9558a2_2' : + 'quay.io/biocontainers/bandage:0.8.1--hc9558a2_2' }" input: tuple val(meta), path(gfa) @@ -24,15 +13,18 @@ process BANDAGE_IMAGE { output: tuple val(meta), path('*.png'), emit: png tuple val(meta), path('*.svg'), emit: svg - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ - Bandage image $gfa ${prefix}.png $options.args - Bandage image $gfa ${prefix}.svg $options.args + Bandage image $gfa ${prefix}.png $args + Bandage image $gfa ${prefix}.svg $args - echo \$(Bandage --version 2>&1) | sed 's/^.*Version: //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bandage: \$(echo \$(Bandage --version 2>&1) | sed 's/^.*Version: //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bandage/image/meta.yml b/modules/nf-core/modules/bandage/image/meta.yml index 26c23a07..1c2b9840 100644 --- a/modules/nf-core/modules/bandage/image/meta.yml +++ b/modules/nf-core/modules/bandage/image/meta.yml @@ -11,6 +11,7 @@ tools: Bandage - a Bioinformatics Application for Navigating De novo Assembly Graphs Easily homepage: https://github.com/rrwick/Bandage documentation: https://github.com/rrwick/Bandage + licence: ['GPL-3.0-or-later'] input: - meta: type: map @@ -35,9 +36,9 @@ output: type: file description: Bandage image in SVG format pattern: "*.svg" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@heuermh" diff --git a/modules/nf-core/modules/bcftools/consensus/functions.nf b/modules/nf-core/modules/bcftools/consensus/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bcftools/consensus/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bcftools/consensus/main.nf b/modules/nf-core/modules/bcftools/consensus/main.nf index 67321fc2..040e6534 100644 --- a/modules/nf-core/modules/bcftools/consensus/main.nf +++ b/modules/nf-core/modules/bcftools/consensus/main.nf @@ -1,38 +1,33 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BCFTOOLS_CONSENSUS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::bcftools=1.11' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/bcftools:1.11--h7c999a4_0' - } else { - container 'quay.io/biocontainers/bcftools:1.11--h7c999a4_0' - } + conda (params.enable_conda ? 'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" input: tuple val(meta), path(vcf), path(tbi), path(fasta) output: tuple val(meta), path('*.fa'), emit: fasta - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ - cat $fasta | bcftools consensus $vcf $options.args > ${prefix}.fa - header=\$(head -n 1 ${prefix}.fa | sed 's/>//g') - sed -i 's/\${header}/${meta.id}/g' ${prefix}.fa + cat $fasta \\ + | bcftools \\ + consensus \\ + $vcf \\ + $args \\ + > ${prefix}.fa - echo \$(bcftools --version 2>&1) | sed 's/^.*bcftools //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bcftools/consensus/meta.yml b/modules/nf-core/modules/bcftools/consensus/meta.yml index ef14479d..761115a6 100644 --- a/modules/nf-core/modules/bcftools/consensus/meta.yml +++ b/modules/nf-core/modules/bcftools/consensus/meta.yml @@ -11,6 +11,7 @@ tools: homepage: http://samtools.github.io/bcftools/bcftools.html documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -39,10 +40,10 @@ output: type: file description: FASTA reference consensus file pattern: "*.{fasta,fa}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bcftools/filter/main.nf b/modules/nf-core/modules/bcftools/filter/main.nf new file mode 100644 index 00000000..98b422b1 --- /dev/null +++ b/modules/nf-core/modules/bcftools/filter/main.nf @@ -0,0 +1,31 @@ +process BCFTOOLS_FILTER { + tag "$meta.id" + label 'process_medium' + + conda (params.enable_conda ? 'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" + + input: + tuple val(meta), path(vcf) + + output: + tuple val(meta), path("*.gz"), emit: vcf + path "versions.yml" , emit: versions + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + bcftools filter \\ + --output ${prefix}.vcf.gz \\ + $args \\ + $vcf + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/bcftools/filter/meta.yml b/modules/nf-core/modules/bcftools/filter/meta.yml new file mode 100644 index 00000000..72d28bf0 --- /dev/null +++ b/modules/nf-core/modules/bcftools/filter/meta.yml @@ -0,0 +1,41 @@ +name: bcftools_filter +description: Filters VCF files +keywords: + - variant calling + - filtering + - VCF +tools: + - filter: + description: | + Apply fixed-threshold filters to VCF files. + homepage: http://samtools.github.io/bcftools/bcftools.html + documentation: http://www.htslib.org/doc/bcftools.html + doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] +input: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF input file + pattern: "*.{vcf}" +output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - vcf: + type: file + description: VCF filtered output file + pattern: "*.{vcf}" + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@joseespinosa" + - "@drpatelh" diff --git a/modules/nf-core/modules/bcftools/mpileup/functions.nf b/modules/nf-core/modules/bcftools/mpileup/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bcftools/mpileup/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bcftools/mpileup/main.nf b/modules/nf-core/modules/bcftools/mpileup/main.nf index 287a0c9d..cdd38eec 100644 --- a/modules/nf-core/modules/bcftools/mpileup/main.nf +++ b/modules/nf-core/modules/bcftools/mpileup/main.nf @@ -1,47 +1,50 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BCFTOOLS_MPILEUP { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 
"bioconda::bcftools=1.11" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bcftools:1.11--h7c999a4_0" - } else { - container "quay.io/biocontainers/bcftools:1.11--h7c999a4_0" - } + conda (params.enable_conda ? 'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" input: tuple val(meta), path(bam) - path fasta + path fasta + val save_mpileup output: tuple val(meta), path("*.gz") , emit: vcf tuple val(meta), path("*.tbi") , emit: tbi tuple val(meta), path("*stats.txt"), emit: stats - path "*.version.txt" , emit: version + tuple val(meta), path("*.mpileup") , emit: mpileup, optional: true + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def args3 = task.ext.args3 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def mpileup = save_mpileup ? "| tee ${prefix}.mpileup" : "" """ echo "${meta.id}" > sample_name.list - bcftools mpileup \\ + + bcftools \\ + mpileup \\ --fasta-ref $fasta \\ - $options.args \\ + $args \\ $bam \\ - | bcftools call --output-type v $options.args2 \\ + $mpileup \\ + | bcftools call --output-type v $args2 \\ | bcftools reheader --samples sample_name.list \\ - | bcftools view --output-file ${prefix}.vcf.gz --output-type z $options.args3 + | bcftools view --output-file ${prefix}.vcf.gz --output-type z $args3 + tabix -p vcf -f ${prefix}.vcf.gz + bcftools stats ${prefix}.vcf.gz > ${prefix}.bcftools_stats.txt - echo \$(bcftools --version 2>&1) | sed 's/^.*bcftools //; s/ .*\$//' > ${software}.version.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bcftools/mpileup/meta.yml b/modules/nf-core/modules/bcftools/mpileup/meta.yml index a15aea14..483d0e71 100644 --- a/modules/nf-core/modules/bcftools/mpileup/meta.yml +++ b/modules/nf-core/modules/bcftools/mpileup/meta.yml @@ -11,6 +11,7 @@ tools: homepage: http://samtools.github.io/bcftools/bcftools.html documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -25,6 +26,10 @@ input: type: file description: FASTA reference file pattern: "*.{fasta,fa}" + - save_mpileup: + type: boolean + description: Save mpileup file generated by bcftools mpileup + patter: "*.mpileup" output: - meta: type: map @@ -43,10 +48,10 @@ output: type: file description: Text output file containing stats pattern: "*{stats.txt}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bcftools/norm/main.nf b/modules/nf-core/modules/bcftools/norm/main.nf new file mode 100644 index 00000000..e8bf6324 --- /dev/null +++ b/modules/nf-core/modules/bcftools/norm/main.nf @@ -0,0 +1,34 @@ +process BCFTOOLS_NORM { + tag "$meta.id" + label 'process_medium' + + conda (params.enable_conda ? 
'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" + + input: + tuple val(meta), path(vcf) + path(fasta) + + output: + tuple val(meta), path("*.gz") , emit: vcf + path "versions.yml" , emit: versions + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + bcftools norm \\ + --fasta-ref ${fasta} \\ + --output ${prefix}.vcf.gz \\ + $args \\ + --threads $task.cpus \\ + ${vcf} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/bcftools/norm/meta.yml b/modules/nf-core/modules/bcftools/norm/meta.yml new file mode 100644 index 00000000..27978a53 --- /dev/null +++ b/modules/nf-core/modules/bcftools/norm/meta.yml @@ -0,0 +1,46 @@ +name: bcftools_norm +description: Normalize VCF file +keywords: + - normalize + - norm + - variant calling + - VCF +tools: + - norm: + description: | + Normalize VCF files. + homepage: http://samtools.github.io/bcftools/bcftools.html + documentation: http://www.htslib.org/doc/bcftools.html + doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] +input: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + The vcf file to be normalized + e.g. 'file1.vcf' + - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" +output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF normalized output file + pattern: "*.{vcf.gz}" + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@abhi18av" diff --git a/modules/nf-core/modules/bcftools/query/main.nf b/modules/nf-core/modules/bcftools/query/main.nf new file mode 100644 index 00000000..a165b103 --- /dev/null +++ b/modules/nf-core/modules/bcftools/query/main.nf @@ -0,0 +1,40 @@ +process BCFTOOLS_QUERY { + tag "$meta.id" + label 'process_medium' + + conda (params.enable_conda ? 'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" + + input: + tuple val(meta), path(vcf), path(tbi) + path regions + path targets + path samples + + output: + tuple val(meta), path("*.txt"), emit: txt + path "versions.yml" , emit: versions + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def regions_file = regions ? "--regions-file ${regions}" : "" + def targets_file = targets ? "--targets-file ${targets}" : "" + def samples_file = samples ? 
"--samples-file ${samples}" : "" + """ + bcftools query \\ + --output ${prefix}.txt \\ + $regions_file \\ + $targets_file \\ + $samples_file \\ + $args \\ + $vcf + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/bcftools/query/meta.yml b/modules/nf-core/modules/bcftools/query/meta.yml new file mode 100644 index 00000000..e49f13c8 --- /dev/null +++ b/modules/nf-core/modules/bcftools/query/meta.yml @@ -0,0 +1,61 @@ +name: bcftools_query +description: Extracts fields from VCF or BCF files and outputs them in user-defined format. +keywords: + - query + - variant calling + - bcftools + - VCF +tools: + - query: + description: | + Extracts fields from VCF or BCF files and outputs them in user-defined format. + homepage: http://samtools.github.io/bcftools/bcftools.html + documentation: http://www.htslib.org/doc/bcftools.html + doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] +input: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + The vcf file to be qeuried. + pattern: "*.{vcf.gz, vcf}" + - tbi: + type: file + description: | + The tab index for the VCF file to be inspected. + pattern: "*.tbi" + - regions: + type: file + description: | + Optionally, restrict the operation to regions listed in this file. + - targets: + type: file + description: | + Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files) + - samples: + type: file + description: | + Optional, file of sample names to be included or excluded. + e.g. 'file.tsv' +output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - txt: + type: file + description: BCFTools query output file + pattern: "*.txt" + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@abhi18av" + - "@drpatelh" diff --git a/modules/nf-core/modules/bcftools/stats/functions.nf b/modules/nf-core/modules/bcftools/stats/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bcftools/stats/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bcftools/stats/main.nf b/modules/nf-core/modules/bcftools/stats/main.nf index 84e48c05..54a28bce 100644 --- a/modules/nf-core/modules/bcftools/stats/main.nf +++ b/modules/nf-core/modules/bcftools/stats/main.nf @@ -1,35 +1,27 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BCFTOOLS_STATS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 
"bioconda::bcftools=1.11" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bcftools:1.11--h7c999a4_0" - } else { - container "quay.io/biocontainers/bcftools:1.11--h7c999a4_0" - } + conda (params.enable_conda ? 'bioconda::bcftools=1.14' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bcftools:1.14--h88f3f91_0' : + 'quay.io/biocontainers/bcftools:1.14--h88f3f91_0' }" input: tuple val(meta), path(vcf) output: tuple val(meta), path("*stats.txt"), emit: stats - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ - bcftools stats $options.args $vcf > ${prefix}.bcftools_stats.txt - echo \$(bcftools --version 2>&1) | sed 's/^.*bcftools //; s/ .*\$//' > ${software}.version.txt + bcftools stats $args $vcf > ${prefix}.bcftools_stats.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bcftools/stats/meta.yml b/modules/nf-core/modules/bcftools/stats/meta.yml index 6b70f83a..505bf729 100644 --- a/modules/nf-core/modules/bcftools/stats/meta.yml +++ b/modules/nf-core/modules/bcftools/stats/meta.yml @@ -12,6 +12,7 @@ tools: homepage: http://samtools.github.io/bcftools/bcftools.html documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -32,10 +33,10 @@ output: type: file description: Text output file containing stats pattern: "*_{stats.txt}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bedtools/genomecov/functions.nf b/modules/nf-core/modules/bedtools/genomecov/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bedtools/genomecov/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to 
save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bedtools/genomecov/main.nf b/modules/nf-core/modules/bedtools/genomecov/main.nf deleted file mode 100644 index f9b87464..00000000 --- a/modules/nf-core/modules/bedtools/genomecov/main.nf +++ /dev/null @@ -1,55 +0,0 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - -process BEDTOOLS_GENOMECOV { - tag "$meta.id" - label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - - conda (params.enable_conda ? "bioconda::bedtools=2.30.0" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0" - } else { - container "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0" - } - - input: - tuple val(meta), path(intervals) - path sizes - val extension - - output: - tuple val(meta), path("*.${extension}"), emit: genomecov - path "*.version.txt" , emit: version - - script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - if (intervals.name =~ /\.bam/) { - """ - bedtools \\ - genomecov \\ - -ibam $intervals \\ - $options.args \\ - > ${prefix}.${extension} - - bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt - """ - } else { - """ - bedtools \\ - genomecov \\ - -i $intervals \\ - -g $sizes \\ - $options.args \\ - > ${prefix}.${extension} - - bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt - """ - } -} diff --git a/modules/nf-core/modules/bedtools/genomecov/meta.yml b/modules/nf-core/modules/bedtools/genomecov/meta.yml deleted file mode 100644 index f629665c..00000000 --- a/modules/nf-core/modules/bedtools/genomecov/meta.yml +++ /dev/null @@ -1,46 +0,0 @@ -name: bedtools_genomecov -description: Computes histograms (default), per-base reports (-d) and BEDGRAPH (-bg) summaries of feature coverage (e.g., aligned sequences) for a given genome. -keywords: - - bed - - bam - - genomecov -tools: - - bedtools: - description: | - A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. 
- documentation: https://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - intervals: - type: file - description: BAM/BED/GFF/VCF - pattern: "*.{bam|bed|gff|vcf}" - - sizes: - type: file - description: Tab-delimited table of chromosome names in the first column and chromosome sizes in the second column - - extension: - type: string - description: Extension of the output file (e. g., ".bg", ".bedgraph", ".txt", ".tab", etc.) It is set arbitrarily by the user and corresponds to the file format which depends on arguments. -output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - genomecov: - type: file - description: Computed genome coverage file - pattern: "*.${extension}" - - version: - type: file - description: File containing software version - pattern: "*.{version.txt}" -authors: - - "@Emiller88" - - "@sruthipsuresh" - - "@drpatelh" - - "@sidorov-si" diff --git a/modules/nf-core/modules/bedtools/getfasta/functions.nf b/modules/nf-core/modules/bedtools/getfasta/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bedtools/getfasta/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bedtools/getfasta/main.nf b/modules/nf-core/modules/bedtools/getfasta/main.nf index 374a310b..5a283e94 100644 --- a/modules/nf-core/modules/bedtools/getfasta/main.nf +++ b/modules/nf-core/modules/bedtools/getfasta/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BEDTOOLS_GETFASTA { tag "$bed" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "bioconda::bedtools=2.30.0" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0" - } else { - container "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0' : + 'quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0' }" input: path bed @@ -24,19 +13,22 @@ process BEDTOOLS_GETFASTA { output: path "*.fa" , emit: fasta - path "*.version.txt", emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${bed.baseName}${options.suffix}" : "${bed.baseName}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${bed.baseName}" """ bedtools \\ getfasta \\ - $options.args \\ + $args \\ -fi $fasta \\ -bed $bed \\ -fo ${prefix}.fa - bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bedtools: \$(bedtools --version | sed -e "s/bedtools v//g") + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bedtools/getfasta/meta.yml b/modules/nf-core/modules/bedtools/getfasta/meta.yml index 1ca63bdc..38715c3d 100644 --- a/modules/nf-core/modules/bedtools/getfasta/meta.yml +++ b/modules/nf-core/modules/bedtools/getfasta/meta.yml @@ -9,6 +9,7 @@ tools: description: | A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. 
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html + licence: ['MIT'] input: - bed: type: file @@ -24,10 +25,10 @@ output: type: file description: Output fasta file with extracted sequences pattern: "*.{fa}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bedtools/maskfasta/functions.nf b/modules/nf-core/modules/bedtools/maskfasta/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bedtools/maskfasta/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bedtools/maskfasta/main.nf b/modules/nf-core/modules/bedtools/maskfasta/main.nf index 02110149..7eeb4c7d 100644 --- a/modules/nf-core/modules/bedtools/maskfasta/main.nf +++ b/modules/nf-core/modules/bedtools/maskfasta/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BEDTOOLS_MASKFASTA { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::bedtools=2.30.0" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0" - } else { - container "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0' : + 'quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0' }" input: tuple val(meta), path(bed) @@ -24,18 +13,21 @@ process BEDTOOLS_MASKFASTA { output: tuple val(meta), path("*.fa"), emit: fasta - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ bedtools \\ maskfasta \\ - $options.args \\ + $args \\ -fi $fasta \\ -bed $bed \\ -fo ${prefix}.fa - bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bedtools: \$(bedtools --version | sed -e "s/bedtools v//g") + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bedtools/maskfasta/meta.yml b/modules/nf-core/modules/bedtools/maskfasta/meta.yml index b6e494e6..0b7aa3ed 100644 --- a/modules/nf-core/modules/bedtools/maskfasta/meta.yml +++ b/modules/nf-core/modules/bedtools/maskfasta/meta.yml @@ -9,6 +9,7 @@ tools: description: | A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. 
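A related change visible in every module above: the four-line if/else around `params.singularity_pull_docker_container` collapses into a single `container` ternary keyed on `task.ext`. A sketch, assuming the flag is now supplied through the process `ext` configuration scope rather than `params`:

```groovy
// Hypothetical nextflow.config fragment: make every task prefer the Docker
// image even when running under Singularity (by default the ext flag is unset,
// so the Galaxy Depot Singularity images are used).
process {
    ext.singularity_pull_docker_container = true
}
```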
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html + licence: ['MIT'] input: - meta: type: map @@ -34,10 +35,10 @@ output: type: file description: Output masked fasta file pattern: "*.{fa}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bedtools/merge/functions.nf b/modules/nf-core/modules/bedtools/merge/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bedtools/merge/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bedtools/merge/main.nf b/modules/nf-core/modules/bedtools/merge/main.nf index 4ac7d1a5..5f1da95b 100644 --- a/modules/nf-core/modules/bedtools/merge/main.nf +++ b/modules/nf-core/modules/bedtools/merge/main.nf @@ -1,40 +1,32 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BEDTOOLS_MERGE { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::bedtools=2.30.0" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0" - } else { - container "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bedtools:2.30.0--hc088bd4_0' : + 'quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0' }" input: tuple val(meta), path(bed) output: tuple val(meta), path('*.bed'), emit: bed - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ bedtools \\ merge \\ -i $bed \\ - $options.args \\ + $args \\ > ${prefix}.bed - bedtools --version | sed -e "s/bedtools v//g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bedtools: \$(bedtools --version | sed -e "s/bedtools v//g") + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bedtools/merge/meta.yml b/modules/nf-core/modules/bedtools/merge/meta.yml index f75bea67..40a42b7b 100644 --- a/modules/nf-core/modules/bedtools/merge/meta.yml +++ b/modules/nf-core/modules/bedtools/merge/meta.yml @@ -8,6 +8,7 @@ tools: description: | A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. 
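The `$options.args`/`options.suffix` pair used by the old modules becomes `task.ext.args`/`task.ext.prefix` in the rewrites above, with values injected from configuration rather than a params map. A minimal sketch for the BEDTOOLS_MERGE module just shown (the flag value and prefix are invented for illustration):

```groovy
// Hypothetical conf/modules.config entry feeding task.ext for BEDTOOLS_MERGE:
// ext.args lands in the command line as $args; ext.prefix may be a closure so
// each task names its output from its own meta map.
process {
    withName: 'BEDTOOLS_MERGE' {
        ext.args   = '-d 10'
        ext.prefix = { "${meta.id}.merged" }
    }
}
```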
documentation: https://bedtools.readthedocs.io/en/latest/content/tools/merge.html + licence: ['MIT'] input: - meta: type: map @@ -28,10 +29,10 @@ output: type: file description: Overlapped bed file with combined features pattern: "*.{bed}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@Emiller88" - "@sruthipsuresh" diff --git a/modules/nf-core/modules/blast/blastn/functions.nf b/modules/nf-core/modules/blast/blastn/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/blast/blastn/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/blast/blastn/main.nf b/modules/nf-core/modules/blast/blastn/main.nf index 8d519613..3a0bafe0 100644 --- a/modules/nf-core/modules/blast/blastn/main.nf +++ b/modules/nf-core/modules/blast/blastn/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BLAST_BLASTN { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::blast=2.10.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/blast:2.10.1--pl526he19e7b1_3' - } else { - container 'quay.io/biocontainers/blast:2.10.1--pl526he19e7b1_3' - } + conda (params.enable_conda ? 'bioconda::blast=2.12.0' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/blast:2.12.0--pl5262h3289130_0' : + 'quay.io/biocontainers/blast:2.12.0--pl5262h3289130_0' }" input: tuple val(meta), path(fasta) @@ -24,19 +13,22 @@ process BLAST_BLASTN { output: tuple val(meta), path('*.blastn.txt'), emit: txt - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ DB=`find -L ./ -name "*.ndb" | sed 's/.ndb//'` blastn \\ -num_threads $task.cpus \\ -db \$DB \\ -query $fasta \\ - $options.args \\ + $args \\ -out ${prefix}.blastn.txt - echo \$(blastn -version 2>&1) | sed 's/^.*blastn: //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + blast: \$(blastn -version 2>&1 | sed 's/^.*blastn: //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/blast/blastn/meta.yml b/modules/nf-core/modules/blast/blastn/meta.yml index d04889a8..39acb663 100644 --- a/modules/nf-core/modules/blast/blastn/meta.yml +++ b/modules/nf-core/modules/blast/blastn/meta.yml @@ -12,6 +12,7 @@ tools: homepage: https://blast.ncbi.nlm.nih.gov/Blast.cgi documentation: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=Blastdocs doi: 10.1016/S0022-2836(05)80360-2 + licence: ['US-Government-Work'] input: - meta: type: map @@ -31,10 +32,10 @@ output: type: file description: File containing blastn hits pattern: "*.{blastn.txt}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/blast/makeblastdb/functions.nf b/modules/nf-core/modules/blast/makeblastdb/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/blast/makeblastdb/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/blast/makeblastdb/main.nf b/modules/nf-core/modules/blast/makeblastdb/main.nf index 3e3b74c2..b4c426a4 100644 --- a/modules/nf-core/modules/blast/makeblastdb/main.nf +++ b/modules/nf-core/modules/blast/makeblastdb/main.nf @@ -1,38 +1,30 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BLAST_MAKEBLASTDB { tag "$fasta" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? 'bioconda::blast=2.10.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/blast:2.10.1--pl526he19e7b1_3' - } else { - container 'quay.io/biocontainers/blast:2.10.1--pl526he19e7b1_3' - } + conda (params.enable_conda ? 'bioconda::blast=2.12.0' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/blast:2.12.0--pl5262h3289130_0' : + 'quay.io/biocontainers/blast:2.12.0--pl5262h3289130_0' }" input: path fasta output: path 'blast_db' , emit: db - path '*.version.txt', emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ makeblastdb \\ -in $fasta \\ - $options.args + $args mkdir blast_db mv ${fasta}* blast_db - echo \$(blastn -version 2>&1) | sed 's/^.*blastn: //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + blast: \$(blastn -version 2>&1 | sed 's/^.*blastn: //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/blast/makeblastdb/meta.yml b/modules/nf-core/modules/blast/makeblastdb/meta.yml index 0ea4903f..c9d18cba 100644 --- a/modules/nf-core/modules/blast/makeblastdb/meta.yml +++ b/modules/nf-core/modules/blast/makeblastdb/meta.yml @@ -11,6 +11,7 @@ tools: homepage: https://blast.ncbi.nlm.nih.gov/Blast.cgi documentation: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=Blastdocs doi: 10.1016/S0022-2836(05)80360-2 + licence: ['US-Government-Work'] input: - fasta: type: file @@ -21,10 +22,10 @@ output: type: directory description: Output directory containing blast database files pattern: "*" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/bowtie2/align/functions.nf b/modules/nf-core/modules/bowtie2/align/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bowtie2/align/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of 
software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bowtie2/align/main.nf b/modules/nf-core/modules/bowtie2/align/main.nf index d43d479d..20b08f72 100644 --- a/modules/nf-core/modules/bowtie2/align/main.nf +++ b/modules/nf-core/modules/bowtie2/align/main.nf @@ -1,65 +1,60 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BOWTIE2_ALIGN { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::bowtie2=2.4.2 bioconda::samtools=1.11 conda-forge::pigz=2.3.4' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:577a697be67b5ae9b16f637fd723b8263a3898b3-0" - } else { - container "quay.io/biocontainers/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:577a697be67b5ae9b16f637fd723b8263a3898b3-0" - } + conda (params.enable_conda ? 'bioconda::bowtie2=2.4.4 bioconda::samtools=1.14 conda-forge::pigz=2.6' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:4d235f41348a00533f18e47c9669f1ecb327f629-0' : + 'quay.io/biocontainers/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:4d235f41348a00533f18e47c9669f1ecb327f629-0' }" input: tuple val(meta), path(reads) path index + val save_unaligned output: - tuple val(meta), path('*.bam'), emit: bam - tuple val(meta), path('*.log'), emit: log - path '*.version.txt' , emit: version - tuple val(meta), path('*fastq.gz'), optional:true, emit: fastq + tuple val(meta), path('*.bam') , emit: bam + tuple val(meta), path('*.log') , emit: log + tuple val(meta), path('*fastq.gz'), emit: fastq, optional:true + path "versions.yml" , emit: versions script: - def split_cpus = Math.floor(task.cpus/2) - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" if (meta.single_end) { - def unaligned = params.save_unaligned ? "--un-gz ${prefix}.unmapped.fastq.gz" : '' + def unaligned = save_unaligned ? "--un-gz ${prefix}.unmapped.fastq.gz" : '' """ INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'` bowtie2 \\ -x \$INDEX \\ -U $reads \\ - --threads ${split_cpus} \\ + --threads $task.cpus \\ $unaligned \\ - $options.args \\ + $args \\ 2> ${prefix}.bowtie2.log \\ - | samtools view -@ ${split_cpus} $options.args2 -bhS -o ${prefix}.bam - + | samtools view -@ $task.cpus $args2 -bhS -o ${prefix}.bam - - echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS """ } else { - def unaligned = params.save_unaligned ? "--un-conc-gz ${prefix}.unmapped.fastq.gz" : '' + def unaligned = save_unaligned ? 
"--un-conc-gz ${prefix}.unmapped.fastq.gz" : '' """ INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'` bowtie2 \\ -x \$INDEX \\ -1 ${reads[0]} \\ -2 ${reads[1]} \\ - --threads ${split_cpus} \\ + --threads $task.cpus \\ $unaligned \\ - $options.args \\ + $args \\ 2> ${prefix}.bowtie2.log \\ - | samtools view -@ ${split_cpus} $options.args2 -bhS -o ${prefix}.bam - + | samtools view -@ $task.cpus $args2 -bhS -o ${prefix}.bam - if [ -f ${prefix}.unmapped.fastq.1.gz ]; then mv ${prefix}.unmapped.fastq.1.gz ${prefix}.unmapped_1.fastq.gz @@ -67,7 +62,13 @@ process BOWTIE2_ALIGN { if [ -f ${prefix}.unmapped.fastq.2.gz ]; then mv ${prefix}.unmapped.fastq.2.gz ${prefix}.unmapped_2.fastq.gz fi - echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//' > ${software}.version.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS """ } } diff --git a/modules/nf-core/modules/bowtie2/align/meta.yml b/modules/nf-core/modules/bowtie2/align/meta.yml index 9d9cd004..77c9e397 100644 --- a/modules/nf-core/modules/bowtie2/align/meta.yml +++ b/modules/nf-core/modules/bowtie2/align/meta.yml @@ -13,6 +13,7 @@ tools: homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml documentation: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml doi: 10.1038/nmeth.1923 + licence: ['GPL-3.0-or-later'] input: - meta: type: map @@ -33,10 +34,10 @@ output: type: file description: Output BAM file containing read alignments pattern: "*.{bam}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" - fastq: type: file description: Unaligned FastQ files diff --git a/modules/nf-core/modules/bowtie2/build/functions.nf b/modules/nf-core/modules/bowtie2/build/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/bowtie2/build/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if 
(ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/bowtie2/build/main.nf b/modules/nf-core/modules/bowtie2/build/main.nf index 42ff1d20..da2e9ed5 100644 --- a/modules/nf-core/modules/bowtie2/build/main.nf +++ b/modules/nf-core/modules/bowtie2/build/main.nf @@ -1,35 +1,27 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process BOWTIE2_BUILD { tag "$fasta" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'index', meta:[:], publish_by_meta:[]) } - conda (params.enable_conda ? 'bioconda::bowtie2=2.4.2' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/bowtie2:2.4.2--py38h1c8e9b9_1' - } else { - container 'quay.io/biocontainers/bowtie2:2.4.2--py38h1c8e9b9_1' - } + conda (params.enable_conda ? 'bioconda::bowtie2=2.4.4' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/bowtie2:2.4.4--py39hbb4e92a_0' : + 'quay.io/biocontainers/bowtie2:2.4.4--py39hbb4e92a_0' }" input: path fasta output: path 'bowtie2' , emit: index - path '*.version.txt', emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ mkdir bowtie2 - bowtie2-build $options.args --threads $task.cpus $fasta bowtie2/${fasta.baseName} - echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//' > ${software}.version.txt + bowtie2-build $args --threads $task.cpus $fasta bowtie2/${fasta.baseName} + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/bowtie2/build/meta.yml b/modules/nf-core/modules/bowtie2/build/meta.yml index 0a4cd3de..ecc54e9b 100644 --- a/modules/nf-core/modules/bowtie2/build/meta.yml +++ b/modules/nf-core/modules/bowtie2/build/meta.yml @@ -14,6 +14,7 @@ tools: homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml documentation: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml doi: 10.1038/nmeth.1923 + licence: ['GPL-3.0-or-later'] input: - fasta: type: file @@ -23,10 +24,10 @@ output: type: file description: Bowtie2 genome index files pattern: "*.bt2" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/cat/fastq/functions.nf b/modules/nf-core/modules/cat/fastq/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/cat/fastq/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? 
"${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/cat/fastq/main.nf b/modules/nf-core/modules/cat/fastq/main.nf index 55ccca90..d02598e1 100644 --- a/modules/nf-core/modules/cat/fastq/main.nf +++ b/modules/nf-core/modules/cat/fastq/main.nf @@ -1,36 +1,32 @@ -// Import generic module functions -include { initOptions; saveFiles } from './functions' - -params.options = [:] -options = initOptions(params.options) - process CAT_FASTQ { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'merged_fastq', meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "conda-forge::sed=4.7" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img" - } else { - container "biocontainers/biocontainers:v1.2.0_cv1" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" input: - tuple val(meta), path(reads) + tuple val(meta), path(reads, stageAs: "input*/*") output: tuple val(meta), path("*.merged.fastq.gz"), emit: reads + path "versions.yml" , emit: versions script: - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" def readList = reads.collect{ it.toString() } if (meta.single_end) { if (readList.size > 1) { """ - cat ${readList.sort().join(' ')} > ${prefix}.merged.fastq.gz + cat ${readList.join(' ')} > ${prefix}.merged.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//') + END_VERSIONS """ } } else { @@ -39,8 +35,13 @@ process CAT_FASTQ { def read2 = [] readList.eachWithIndex{ v, ix -> ( ix & 1 ? read2 : read1 ) << v } """ - cat ${read1.sort().join(' ')} > ${prefix}_1.merged.fastq.gz - cat ${read2.sort().join(' ')} > ${prefix}_2.merged.fastq.gz + cat ${read1.join(' ')} > ${prefix}_1.merged.fastq.gz + cat ${read2.join(' ')} > ${prefix}_2.merged.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//') + END_VERSIONS """ } } diff --git a/modules/nf-core/modules/cat/fastq/meta.yml b/modules/nf-core/modules/cat/fastq/meta.yml index e7b8eebe..1992fa34 100644 --- a/modules/nf-core/modules/cat/fastq/meta.yml +++ b/modules/nf-core/modules/cat/fastq/meta.yml @@ -8,6 +8,7 @@ tools: description: | The cat utility reads files sequentially, writing them to the standard output. 
       documentation: https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html
+      licence: ['GPL-3.0-or-later']
 input:
   - meta:
       type: map
@@ -28,6 +29,11 @@ output:
       type: file
       description: Merged fastq file
       pattern: "*.{merged.fastq.gz}"
+  - versions:
+      type: file
+      description: File containing software versions
+      pattern: "versions.yml"
+
 authors:
   - "@joseespinosa"
   - "@drpatelh"
diff --git a/modules/nf-core/modules/custom/dumpsoftwareversions/main.nf b/modules/nf-core/modules/custom/dumpsoftwareversions/main.nf
new file mode 100644
index 00000000..934bb467
--- /dev/null
+++ b/modules/nf-core/modules/custom/dumpsoftwareversions/main.nf
@@ -0,0 +1,21 @@
+process CUSTOM_DUMPSOFTWAREVERSIONS {
+    label 'process_low'
+
+    // Requires `pyyaml` which does not have a dedicated container but is in the MultiQC container
+    conda (params.enable_conda ? "bioconda::multiqc=1.11" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/multiqc:1.11--pyhdfd78af_0' :
+        'quay.io/biocontainers/multiqc:1.11--pyhdfd78af_0' }"
+
+    input:
+    path versions
+
+    output:
+    path "software_versions.yml"    , emit: yml
+    path "software_versions_mqc.yml", emit: mqc_yml
+    path "versions.yml"             , emit: versions
+
+    script:
+    def args = task.ext.args ?: ''
+    template 'dumpsoftwareversions.py'
+}
diff --git a/modules/nf-core/modules/custom/dumpsoftwareversions/meta.yml b/modules/nf-core/modules/custom/dumpsoftwareversions/meta.yml
new file mode 100644
index 00000000..5b5b8a60
--- /dev/null
+++ b/modules/nf-core/modules/custom/dumpsoftwareversions/meta.yml
@@ -0,0 +1,34 @@
+name: custom_dumpsoftwareversions
+description: Custom module used to dump software versions within the nf-core pipeline template
+keywords:
+  - custom
+  - version
+tools:
+  - custom:
+      description: Custom module used to dump software versions within the nf-core pipeline template
+      homepage: https://github.com/nf-core/tools
+      documentation: https://github.com/nf-core/tools
+      licence: ['MIT']
+input:
+  - versions:
+      type: file
+      description: YML file containing software versions
+      pattern: "*.yml"
+
+output:
+  - yml:
+      type: file
+      description: Standard YML file containing software versions
+      pattern: "software_versions.yml"
+  - mqc_yml:
+      type: file
+      description: MultiQC custom content YML file containing software versions
+      pattern: "software_versions_mqc.yml"
+  - versions:
+      type: file
+      description: File containing software versions
+      pattern: "versions.yml"
+
+authors:
+  - "@drpatelh"
+  - "@grst"
diff --git a/modules/nf-core/modules/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py b/modules/nf-core/modules/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py
new file mode 100644
index 00000000..d1390392
--- /dev/null
+++ b/modules/nf-core/modules/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py
@@ -0,0 +1,89 @@
+#!/usr/bin/env python
+
+import yaml
+import platform
+from textwrap import dedent
+
+
+def _make_versions_html(versions):
+    html = [
+        dedent(
+            """\\
+            <style>
+            #nf-core-versions tbody:nth-child(even) {
+                background-color: #f2f2f2;
+            }
+            </style>
+            <table class="table" style="width:100%" id="nf-core-versions">
+                <thead>
+                    <tr>
+                        <th> Process Name </th>
+                        <th> Software </th>
+                        <th> Version  </th>
+                    </tr>
+                </thead>
+            """
+        )
+    ]
+    for process, tmp_versions in sorted(versions.items()):
+        html.append("<tbody>")
+        for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
+            html.append(
+                dedent(
+                    f"""\\
+                    <tr>
+                        <td><samp>{process if (i == 0) else ''}</samp></td>
+                        <td><samp>{tool}</samp></td>
+                        <td><samp>{version}</samp></td>
+                    </tr>
+                    """
+                )
+            )
+        html.append("</tbody>")
+    html.append("</table>")
+    return "\\n".join(html)
+
+
+versions_this_module = {}
+versions_this_module["${task.process}"] = {
+    "python": platform.python_version(),
+    "yaml": yaml.__version__,
+}
+
+with open("$versions") as f:
+    versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module
+
+# aggregate versions by the module name (derived from fully-qualified process name)
+versions_by_module = {}
+for process, process_versions in versions_by_process.items():
+    module = process.split(":")[-1]
+    try:
+        assert versions_by_module[module] == process_versions, (
+            "We assume that software versions are the same between all modules. "
+            "If you see this error-message it means you discovered an edge-case "
+            "and should open an issue in nf-core/tools. "
+        )
+    except KeyError:
+        versions_by_module[module] = process_versions
+
+versions_by_module["Workflow"] = {
+    "Nextflow": "$workflow.nextflow.version",
+    "$workflow.manifest.name": "$workflow.manifest.version",
+}
+
+versions_mqc = {
+    "id": "software_versions",
+    "section_name": "${workflow.manifest.name} Software Versions",
+    "section_href": "https://github.com/${workflow.manifest.name}",
+    "plot_type": "html",
+    "description": "are collected at run time from the software output.",
+    "data": _make_versions_html(versions_by_module),
+}
+
+with open("software_versions.yml", "w") as f:
+    yaml.dump(versions_by_module, f, default_flow_style=False)
+with open("software_versions_mqc.yml", "w") as f:
+    yaml.dump(versions_mqc, f, default_flow_style=False)
+
+with open("versions.yml", "w") as f:
+    yaml.dump(versions_this_module, f, default_flow_style=False)
diff --git a/modules/nf-core/modules/custom/getchromsizes/main.nf b/modules/nf-core/modules/custom/getchromsizes/main.nf
new file mode 100644
index 00000000..270b3f48
--- /dev/null
+++ b/modules/nf-core/modules/custom/getchromsizes/main.nf
@@ -0,0 +1,29 @@
+process CUSTOM_GETCHROMSIZES {
+    tag "$fasta"
+    label 'process_low'
+
+    conda (params.enable_conda ? "bioconda::samtools=1.14" : null)
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' :
+        'quay.io/biocontainers/samtools:1.14--hb421002_0' }"
+
+    input:
+    path fasta
+
+    output:
+    path '*.sizes'     , emit: sizes
+    path '*.fai'       , emit: fai
+    path "versions.yml", emit: versions
+
+    script:
+    def args = task.ext.args ?: ''
+    """
+    samtools faidx $fasta
+    cut -f 1,2 ${fasta}.fai > ${fasta}.sizes
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        custom: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
+    END_VERSIONS
+    """
+}
diff --git a/modules/nf-core/modules/custom/getchromsizes/meta.yml b/modules/nf-core/modules/custom/getchromsizes/meta.yml
new file mode 100644
index 00000000..eb1db4bb
--- /dev/null
+++ b/modules/nf-core/modules/custom/getchromsizes/meta.yml
@@ -0,0 +1,39 @@
+name: custom_getchromsizes
+description: Generates a file of chromosome sizes and a FASTA index file
+keywords:
+  - fasta
+  - chromosome
+  - indexing
+tools:
+  - samtools:
+      description: Tools for dealing with SAM, BAM and CRAM files
+      homepage: http://www.htslib.org/
+      documentation: http://www.htslib.org/doc/samtools.html
+      tool_dev_url: https://github.com/samtools/samtools
+      doi: 10.1093/bioinformatics/btp352
+      licence: ['MIT']
+
+input:
+  - fasta:
+      type: file
+      description: FASTA file
+      pattern: "*.{fasta}"
+
+output:
+  - sizes:
+      type: file
+      description: File containing chromosome lengths
+      pattern: "*.{sizes}"
+  - fai:
+      type: file
+      description: FASTA index file
+      pattern: "*.{fai}"
+  - versions:
+      type: file
+      description: File containing software versions
+      pattern: "versions.yml"
+
+
+authors:
+  - "@tamara-hodgetts"
+  - "@chris-cheshire"
diff --git a/modules/nf-core/modules/fastp/functions.nf b/modules/nf-core/modules/fastp/functions.nf
deleted file mode 100644
index da9da093..00000000
--- a/modules/nf-core/modules/fastp/functions.nf
+++ /dev/null
@@ -1,68 +0,0 @@
-//
-// Utility functions used in nf-core DSL2 module files
-//
-
-//
-// Extract name of software tool from process name using $task.process
-//
-def getSoftwareName(task_process) {
-    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
-}
-
-//
-// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
-//
-def initOptions(Map args) {
-    def Map options = [:]
-    options.args            = args.args ?: ''
-    options.args2           = args.args2 ?: ''
-    options.args3           = args.args3 ?: ''
-    options.publish_by_meta = args.publish_by_meta ?: []
-    options.publish_dir     = args.publish_dir ?: ''
-    options.publish_files   = args.publish_files
-    options.suffix          = args.suffix ?: ''
-    return options
-}
-
-//
-// Tidy up and join elements of a list to return a path string
-//
-def getPathFromList(path_list) {
-    def paths = path_list.findAll { item -> !item?.trim().isEmpty() }  // Remove empty entries
-    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes
-    return paths.join('/')
-}
-
-//
-// Function to save/publish module results
-//
-def saveFiles(Map args) {
-    if (!args.filename.endsWith('.version.txt')) {
-        def ioptions  = initOptions(args.options)
-        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
-        if (ioptions.publish_by_meta) {
-            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
-            for (key in key_list) {
-                if (args.meta && key instanceof String) {
-                    def path = key
-                    if (args.meta.containsKey(key)) {
-                        path = args.meta[key] instanceof Boolean ?
"${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/fastp/main.nf b/modules/nf-core/modules/fastp/main.nf index 6d703615..a406036a 100644 --- a/modules/nf-core/modules/fastp/main.nf +++ b/modules/nf-core/modules/fastp/main.nf @@ -1,40 +1,32 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process FASTP { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::fastp=0.20.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/fastp:0.20.1--h8b12597_0' - } else { - container 'quay.io/biocontainers/fastp:0.20.1--h8b12597_0' - } + conda (params.enable_conda ? 'bioconda::fastp=0.23.2' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/fastp:0.23.2--h79da9fb_0' : + 'quay.io/biocontainers/fastp:0.23.2--h79da9fb_0' }" input: tuple val(meta), path(reads) + val save_trimmed_fail + val save_merged output: - tuple val(meta), path('*.trim.fastq.gz'), emit: reads - tuple val(meta), path('*.json') , emit: json - tuple val(meta), path('*.html') , emit: html - tuple val(meta), path('*.log') , emit: log - path '*.version.txt' , emit: version - tuple val(meta), path('*.fail.fastq.gz'), optional:true, emit: reads_fail + tuple val(meta), path('*.trim.fastq.gz') , emit: reads + tuple val(meta), path('*.json') , emit: json + tuple val(meta), path('*.html') , emit: html + tuple val(meta), path('*.log') , emit: log + path "versions.yml" , emit: versions + tuple val(meta), path('*.fail.fastq.gz') , optional:true, emit: reads_fail + tuple val(meta), path('*.merged.fastq.gz'), optional:true, emit: reads_merged script: + def args = task.ext.args ?: '' // Added soft-links to original fastqs for consistent naming in MultiQC - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" if (meta.single_end) { - def fail_fastq = params.save_trimmed_fail ? "--failed_out ${prefix}.fail.fastq.gz" : '' + def fail_fastq = save_trimmed_fail ? "--failed_out ${prefix}.fail.fastq.gz" : '' """ [ ! 
-f ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz fastp \\ @@ -44,12 +36,16 @@ process FASTP { --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ $fail_fastq \\ - $options.args \\ + $args \\ 2> ${prefix}.fastp.log - echo \$(fastp --version 2>&1) | sed -e "s/fastp //g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g") + END_VERSIONS """ } else { - def fail_fastq = params.save_trimmed_fail ? "--unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : '' + def fail_fastq = save_trimmed_fail ? "--unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : '' + def merge_fastq = save_merged ? "-m --merged_out ${prefix}.merged.fastq.gz" : '' """ [ ! -f ${prefix}_1.fastq.gz ] && ln -s ${reads[0]} ${prefix}_1.fastq.gz [ ! -f ${prefix}_2.fastq.gz ] && ln -s ${reads[1]} ${prefix}_2.fastq.gz @@ -61,12 +57,16 @@ process FASTP { --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ $fail_fastq \\ + $merge_fastq \\ --thread $task.cpus \\ --detect_adapter_for_pe \\ - $options.args \\ + $args \\ 2> ${prefix}.fastp.log - echo \$(fastp --version 2>&1) | sed -e "s/fastp //g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g") + END_VERSIONS """ } } diff --git a/modules/nf-core/modules/fastp/meta.yml b/modules/nf-core/modules/fastp/meta.yml index 1fc3dfb6..a1875faf 100644 --- a/modules/nf-core/modules/fastp/meta.yml +++ b/modules/nf-core/modules/fastp/meta.yml @@ -10,6 +10,7 @@ tools: A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance. documentation: https://github.com/OpenGene/fastp doi: https://doi.org/10.1093/bioinformatics/bty560 + licence: ['MIT'] input: - meta: type: map @@ -30,7 +31,7 @@ output: e.g. 
[ id:'test', single_end:false ]
   - reads:
       type: file
-      description: The trimmed/modified fastq reads
+      description: The trimmed/modified/unmerged fastq reads
       pattern: "*trim.fastq.gz"
   - json:
       type: file
       description: Results in JSON format
       pattern: "*.json"
   - html:
       type: file
       description: Results in HTML format
-      pattern: "*.thml"
+      pattern: "*.html"
   - log:
       type: file
       description: fastp log file
       pattern: "*.log"
-  - version:
+  - versions:
       type: file
-      description: File containing software version
-      pattern: "*.{version.txt}"
+      description: File containing software versions
+      pattern: "versions.yml"
   - reads_fail:
       type: file
       description: Reads that failed preprocessing
       pattern: "*fail.fastq.gz"
+  - reads_merged:
+      type: file
+      description: Reads that were successfully merged
+      pattern: "*.{merged.fastq.gz}"
 authors:
   - "@drpatelh"
   - "@kevinmenden"
diff --git a/modules/nf-core/modules/fastqc/functions.nf b/modules/nf-core/modules/fastqc/functions.nf
deleted file mode 100644
index da9da093..00000000
--- a/modules/nf-core/modules/fastqc/functions.nf
+++ /dev/null
@@ -1,68 +0,0 @@
-//
-// Utility functions used in nf-core DSL2 module files
-//
-
-//
-// Extract name of software tool from process name using $task.process
-//
-def getSoftwareName(task_process) {
-    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
-}
-
-//
-// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
-//
-def initOptions(Map args) {
-    def Map options = [:]
-    options.args            = args.args ?: ''
-    options.args2           = args.args2 ?: ''
-    options.args3           = args.args3 ?: ''
-    options.publish_by_meta = args.publish_by_meta ?: []
-    options.publish_dir     = args.publish_dir ?: ''
-    options.publish_files   = args.publish_files
-    options.suffix          = args.suffix ?: ''
-    return options
-}
-
-//
-// Tidy up and join elements of a list to return a path string
-//
-def getPathFromList(path_list) {
-    def paths = path_list.findAll { item -> !item?.trim().isEmpty() }  // Remove empty entries
-    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes
-    return paths.join('/')
-}
-
-//
-// Function to save/publish module results
-//
-def saveFiles(Map args) {
-    if (!args.filename.endsWith('.version.txt')) {
-        def ioptions  = initOptions(args.options)
-        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
-        if (ioptions.publish_by_meta) {
-            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
-            for (key in key_list) {
-                if (args.meta && key instanceof String) {
-                    def path = key
-                    if (args.meta.containsKey(key)) {
-                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
-                    }
-                    path = path instanceof String ?
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/fastqc/main.nf b/modules/nf-core/modules/fastqc/main.nf index 39c327b2..d250eca0 100644 --- a/modules/nf-core/modules/fastqc/main.nf +++ b/modules/nf-core/modules/fastqc/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process FASTQC { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::fastqc=0.11.9" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0" - } else { - container "quay.io/biocontainers/fastqc:0.11.9--0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' : + 'quay.io/biocontainers/fastqc:0.11.9--0' }" input: tuple val(meta), path(reads) @@ -24,24 +13,32 @@ process FASTQC { output: tuple val(meta), path("*.html"), emit: html tuple val(meta), path("*.zip") , emit: zip - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: + def args = task.ext.args ?: '' // Add soft-links to original FastQs for consistent naming in pipeline - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" if (meta.single_end) { """ [ ! -f ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz - fastqc $options.args --threads $task.cpus ${prefix}.fastq.gz - fastqc --version | sed -e "s/FastQC v//g" > ${software}.version.txt + fastqc $args --threads $task.cpus ${prefix}.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" ) + END_VERSIONS """ } else { """ [ ! -f ${prefix}_1.fastq.gz ] && ln -s ${reads[0]} ${prefix}_1.fastq.gz [ ! -f ${prefix}_2.fastq.gz ] && ln -s ${reads[1]} ${prefix}_2.fastq.gz - fastqc $options.args --threads $task.cpus ${prefix}_1.fastq.gz ${prefix}_2.fastq.gz - fastqc --version | sed -e "s/FastQC v//g" > ${software}.version.txt + fastqc $args --threads $task.cpus ${prefix}_1.fastq.gz ${prefix}_2.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" ) + END_VERSIONS """ } } diff --git a/modules/nf-core/modules/fastqc/meta.yml b/modules/nf-core/modules/fastqc/meta.yml index 8eb9953d..b09553a3 100644 --- a/modules/nf-core/modules/fastqc/meta.yml +++ b/modules/nf-core/modules/fastqc/meta.yml @@ -15,6 +15,7 @@ tools: overrepresented sequences. 
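
The conversion pattern running through this diff replaces the deleted `functions.nf` helpers (`initOptions`, `saveFiles`, `getSoftwareName`) with Nextflow's built-in `task.ext` mechanism. A minimal sketch of how a pipeline might now feed `ext.args` and `ext.prefix` to a converted module from a central config; the selector and values below are illustrative assumptions, not part of this diff:

```nextflow
// conf/modules.config -- hypothetical excerpt
process {
    withName: 'FASTQC' {
        ext.args   = '--quiet'             // appended verbatim to the fastqc command line via task.ext.args
        ext.prefix = { "${meta.id}.raw" }  // overrides the default "${meta.id}" output prefix
        // publishDir also moves here, now that modules no longer declare it themselves
        publishDir = [ path: { "${params.outdir}/fastqc" }, mode: 'copy' ]
    }
}
```
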
homepage: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ documentation: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/ + licence: ['GPL-2.0-only'] input: - meta: type: map @@ -40,10 +41,10 @@ output: type: file description: FastQC report archive pattern: "*_{fastqc.zip}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@grst" diff --git a/modules/nf-core/modules/gunzip/functions.nf b/modules/nf-core/modules/gunzip/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/gunzip/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/gunzip/main.nf b/modules/nf-core/modules/gunzip/main.nf index 29248796..77a4e546 100644 --- a/modules/nf-core/modules/gunzip/main.nf +++ b/modules/nf-core/modules/gunzip/main.nf @@ -1,35 +1,31 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process GUNZIP { tag "$archive" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "conda-forge::sed=4.7" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img" - } else { - container "biocontainers/biocontainers:v1.2.0_cv1" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" input: - path archive + tuple val(meta), path(archive) output: - path "$gunzip", emit: gunzip - path "*.version.txt", emit: version + tuple val(meta), path("$gunzip"), emit: gunzip + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - gunzip = archive.toString() - '.gz' + def args = task.ext.args ?: '' + gunzip = archive.toString() - '.gz' """ - gunzip -f $options.args $archive - echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//' > ${software}.version.txt + gunzip \\ + -f \\ + $args \\ + $archive + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/gunzip/meta.yml b/modules/nf-core/modules/gunzip/meta.yml index 922e74e6..ea1f1546 100644 --- a/modules/nf-core/modules/gunzip/meta.yml +++ b/modules/nf-core/modules/gunzip/meta.yml @@ -8,7 +8,13 @@ tools: description: | gzip is a file format and a software application used for file compression and decompression. documentation: https://www.gnu.org/software/gzip/manual/gzip.html + licence: ['GPL-3.0-or-later'] input: + - meta: + type: map + description: | + Optional groovy Map containing meta information + e.g. 
[ id:'test', single_end:false ] - archive: type: file description: File to be compressed/uncompressed @@ -18,10 +24,11 @@ output: type: file description: Compressed/uncompressed file pattern: "*.*" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" + - "@jfy133" diff --git a/modules/nf-core/modules/ivar/consensus/functions.nf b/modules/nf-core/modules/ivar/consensus/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/ivar/consensus/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/ivar/consensus/main.nf b/modules/nf-core/modules/ivar/consensus/main.nf index 1b1019cf..96d00ce2 100644 --- a/modules/nf-core/modules/ivar/consensus/main.nf +++ b/modules/nf-core/modules/ivar/consensus/main.nf @@ -1,47 +1,43 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process IVAR_CONSENSUS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::ivar=1.3.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0" - } else { - container "quay.io/biocontainers/ivar:1.3.1--h089eab3_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0' : + 'quay.io/biocontainers/ivar:1.3.1--h089eab3_0' }" input: tuple val(meta), path(bam) - path fasta + path fasta + val save_mpileup output: tuple val(meta), path("*.fa") , emit: fasta tuple val(meta), path("*.qual.txt"), emit: qual tuple val(meta), path("*.mpileup") , optional:true, emit: mpileup - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def save_mpileup = params.save_mpileup ? "tee ${prefix}.mpileup |" : "" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def mpileup = save_mpileup ? "| tee ${prefix}.mpileup" : "" """ - samtools mpileup \\ + samtools \\ + mpileup \\ --reference $fasta \\ - $options.args2 \\ - $bam | \\ - $save_mpileup \\ - ivar consensus \\ - $options.args \\ + $args2 \\ + $bam \\ + $mpileup \\ + | ivar \\ + consensus \\ + $args \\ -p $prefix - echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/ivar/consensus/meta.yml b/modules/nf-core/modules/ivar/consensus/meta.yml index 913a7660..aa08ad98 100644 --- a/modules/nf-core/modules/ivar/consensus/meta.yml +++ b/modules/nf-core/modules/ivar/consensus/meta.yml @@ -10,6 +10,7 @@ tools: iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing. 
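
With `save_mpileup` promoted from `params.save_mpileup` inside the module to an explicit `val` input, the decision moves to the call site. A hedged sketch of an invocation; the channel and parameter names are assumptions, not taken from this diff:

```nextflow
include { IVAR_CONSENSUS } from '../modules/nf-core/modules/ivar/consensus/main'

// ch_bam emits [ meta, bam ]; the reference and flag come from pipeline-level params
IVAR_CONSENSUS ( ch_bam, file(params.fasta), params.save_mpileup )
IVAR_CONSENSUS.out.fasta.view()   // [ meta, <prefix>.fa ]
```
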
        homepage: https://github.com/andersen-lab/ivar
        documentation: https://andersen-lab.github.io/ivar/html/manualpage.html
+        licence: ['GPL-3.0-or-later']
 input:
   - meta:
       type: map
@@ -24,6 +25,10 @@ input:
       type: file
       description: The reference sequence used for mapping and generating the BAM file
       pattern: "*.fa"
+  - save_mpileup:
+      type: boolean
+      description: Save mpileup file generated by ivar consensus
+      pattern: "*.mpileup"
 output:
   - meta:
       type: map
@@ -42,10 +47,10 @@ output:
       type: file
       description: mpileup output from samtools mpileup [OPTIONAL]
       pattern: "*.mpileup"
-  - version:
+  - versions:
       type: file
-      description: File containing software version
-      pattern: "*.{version.txt}"
+      description: File containing software versions
+      pattern: "versions.yml"
 authors:
   - "@andersgs"
   - "@drpatelh"
diff --git a/modules/nf-core/modules/ivar/trim/functions.nf b/modules/nf-core/modules/ivar/trim/functions.nf
deleted file mode 100644
index da9da093..00000000
--- a/modules/nf-core/modules/ivar/trim/functions.nf
+++ /dev/null
@@ -1,68 +0,0 @@
-//
-// Utility functions used in nf-core DSL2 module files
-//
-
-//
-// Extract name of software tool from process name using $task.process
-//
-def getSoftwareName(task_process) {
-    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
-}
-
-//
-// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
-//
-def initOptions(Map args) {
-    def Map options = [:]
-    options.args            = args.args ?: ''
-    options.args2           = args.args2 ?: ''
-    options.args3           = args.args3 ?: ''
-    options.publish_by_meta = args.publish_by_meta ?: []
-    options.publish_dir     = args.publish_dir ?: ''
-    options.publish_files   = args.publish_files
-    options.suffix          = args.suffix ?: ''
-    return options
-}
-
-//
-// Tidy up and join elements of a list to return a path string
-//
-def getPathFromList(path_list) {
-    def paths = path_list.findAll { item -> !item?.trim().isEmpty() }  // Remove empty entries
-    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes
-    return paths.join('/')
-}
-
-//
-// Function to save/publish module results
-//
-def saveFiles(Map args) {
-    if (!args.filename.endsWith('.version.txt')) {
-        def ioptions  = initOptions(args.options)
-        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
-        if (ioptions.publish_by_meta) {
-            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
-            for (key in key_list) {
-                if (args.meta && key instanceof String) {
-                    def path = key
-                    if (args.meta.containsKey(key)) {
-                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
-                    }
-                    path = path instanceof String ?
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/ivar/trim/main.nf b/modules/nf-core/modules/ivar/trim/main.nf index afdc99e4..4d0c70a2 100644 --- a/modules/nf-core/modules/ivar/trim/main.nf +++ b/modules/nf-core/modules/ivar/trim/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process IVAR_TRIM { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::ivar=1.3.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0" - } else { - container "quay.io/biocontainers/ivar:1.3.1--h089eab3_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0' : + 'quay.io/biocontainers/ivar:1.3.1--h089eab3_0' }" input: tuple val(meta), path(bam), path(bai) @@ -25,19 +14,22 @@ process IVAR_TRIM { output: tuple val(meta), path("*.bam"), emit: bam tuple val(meta), path('*.log'), emit: log - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ ivar trim \\ - $options.args \\ + $args \\ -i $bam \\ -b $bed \\ -p $prefix \\ > ${prefix}.ivar.log - echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/ivar/trim/meta.yml b/modules/nf-core/modules/ivar/trim/meta.yml index 5791db66..44bc742e 100644 --- a/modules/nf-core/modules/ivar/trim/meta.yml +++ b/modules/nf-core/modules/ivar/trim/meta.yml @@ -10,6 +10,7 @@ tools: iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing. 
homepage: https://github.com/andersen-lab/ivar documentation: https://andersen-lab.github.io/ivar/html/manualpage.html + licence: ['GPL-3.0-or-later'] input: - meta: type: map @@ -42,10 +43,10 @@ output: type: file description: Log file generated by iVar for use with MultiQC pattern: "*.log" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@andersgs" - "@drpatelh" diff --git a/modules/nf-core/modules/ivar/variants/functions.nf b/modules/nf-core/modules/ivar/variants/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/ivar/variants/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/ivar/variants/main.nf b/modules/nf-core/modules/ivar/variants/main.nf index 154f309c..bb6e402b 100644 --- a/modules/nf-core/modules/ivar/variants/main.nf +++ b/modules/nf-core/modules/ivar/variants/main.nf @@ -1,50 +1,46 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process IVAR_VARIANTS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? "bioconda::ivar=1.3.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0" - } else { - container "quay.io/biocontainers/ivar:1.3.1--h089eab3_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/ivar:1.3.1--h089eab3_0' : + 'quay.io/biocontainers/ivar:1.3.1--h089eab3_0' }" input: tuple val(meta), path(bam) path fasta path gff + val save_mpileup output: tuple val(meta), path("*.tsv") , emit: tsv tuple val(meta), path("*.mpileup"), optional:true, emit: mpileup - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def save_mpileup = params.save_mpileup ? "tee ${prefix}.mpileup |" : "" - def features = params.gff ? "-g $gff" : "" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def features = gff ? "-g $gff" : "" + def mpileup = save_mpileup ? "| tee ${prefix}.mpileup" : "" """ - samtools mpileup \\ - $options.args2 \\ + samtools \\ + mpileup \\ + $args2 \\ --reference $fasta \\ - $bam | \\ - $save_mpileup \\ - ivar variants \\ - $options.args \\ + $bam \\ + $mpileup \\ + | ivar \\ + variants \\ + $args \\ $features \\ -r $fasta \\ -p $prefix - echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + ivar: \$(echo \$(ivar version 2>&1) | sed 's/^.*iVar version //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/ivar/variants/meta.yml b/modules/nf-core/modules/ivar/variants/meta.yml index 7a5fbbc0..29cbd958 100644 --- a/modules/nf-core/modules/ivar/variants/meta.yml +++ b/modules/nf-core/modules/ivar/variants/meta.yml @@ -10,6 +10,7 @@ tools: iVar - a computational package that contains functions broadly useful for viral amplicon-based sequencing. 
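
IVAR_VARIANTS likewise resolves the optional GFF from the staged file (`gff ? "-g $gff" : ""`) rather than from `params.gff`. Under the usual Nextflow convention that an empty list satisfies an optional `path` input, a call site might look like this sketch (channel and parameter names assumed):

```nextflow
// [] stages nothing while still satisfying the declared `path gff` input
ch_gff = params.gff ? file(params.gff) : []
IVAR_VARIANTS ( ch_bam, file(params.fasta), ch_gff, params.save_mpileup )
```
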
        homepage: https://github.com/andersen-lab/ivar
        documentation: https://andersen-lab.github.io/ivar/html/manualpage.html
+        licence: ['GPL-3.0-or-later']
 input:
   - meta:
       type: map
@@ -28,6 +29,10 @@ input:
       type: file
       description: A GFF file in the GFF3 format can be supplied to specify coordinates of open reading frames (ORFs). In the absence of a GFF file, amino acid translation will not be done.
       pattern: "*.gff"
+  - save_mpileup:
+      type: boolean
+      description: Save mpileup file generated by ivar variants
+      pattern: "*.mpileup"
 output:
   - meta:
       type: map
@@ -42,10 +47,10 @@ output:
       type: file
       description: mpileup output from samtools mpileup [OPTIONAL]
       pattern: "*.mpileup"
-  - version:
+  - versions:
       type: file
-      description: File containing software version
-      pattern: "*.{version.txt}"
+      description: File containing software versions
+      pattern: "versions.yml"
 authors:
   - "@andersgs"
   - "@drpatelh"
diff --git a/modules/nf-core/modules/kraken2/kraken2/functions.nf b/modules/nf-core/modules/kraken2/kraken2/functions.nf
deleted file mode 100644
index da9da093..00000000
--- a/modules/nf-core/modules/kraken2/kraken2/functions.nf
+++ /dev/null
@@ -1,68 +0,0 @@
-//
-// Utility functions used in nf-core DSL2 module files
-//
-
-//
-// Extract name of software tool from process name using $task.process
-//
-def getSoftwareName(task_process) {
-    return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()
-}
-
-//
-// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules
-//
-def initOptions(Map args) {
-    def Map options = [:]
-    options.args            = args.args ?: ''
-    options.args2           = args.args2 ?: ''
-    options.args3           = args.args3 ?: ''
-    options.publish_by_meta = args.publish_by_meta ?: []
-    options.publish_dir     = args.publish_dir ?: ''
-    options.publish_files   = args.publish_files
-    options.suffix          = args.suffix ?: ''
-    return options
-}
-
-//
-// Tidy up and join elements of a list to return a path string
-//
-def getPathFromList(path_list) {
-    def paths = path_list.findAll { item -> !item?.trim().isEmpty() }  // Remove empty entries
-    paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes
-    return paths.join('/')
-}
-
-//
-// Function to save/publish module results
-//
-def saveFiles(Map args) {
-    if (!args.filename.endsWith('.version.txt')) {
-        def ioptions  = initOptions(args.options)
-        def path_list = [ ioptions.publish_dir ?: args.publish_dir ]
-        if (ioptions.publish_by_meta) {
-            def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta
-            for (key in key_list) {
-                if (args.meta && key instanceof String) {
-                    def path = key
-                    if (args.meta.containsKey(key)) {
-                        path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key]
-                    }
-                    path = path instanceof String ?
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/kraken2/kraken2/main.nf b/modules/nf-core/modules/kraken2/kraken2/main.nf index ea0b72fd..eaabb229 100644 --- a/modules/nf-core/modules/kraken2/kraken2/main.nf +++ b/modules/nf-core/modules/kraken2/kraken2/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process KRAKEN2_KRAKEN2 { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::kraken2=2.1.1 conda-forge::pigz=2.6' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:941789bd7fe00db16531c26de8bf3c5c985242a5-0' - } else { - container 'quay.io/biocontainers/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:941789bd7fe00db16531c26de8bf3c5c985242a5-0' - } + conda (params.enable_conda ? 'bioconda::kraken2=2.1.2 conda-forge::pigz=2.6' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:87fc08d11968d081f3e8a37131c1f1f6715b6542-0' : + 'quay.io/biocontainers/mulled-v2-5799ab18b5fc681e75923b2450abaa969907ec98:87fc08d11968d081f3e8a37131c1f1f6715b6542-0' }" input: tuple val(meta), path(reads) @@ -26,11 +15,11 @@ process KRAKEN2_KRAKEN2 { tuple val(meta), path('*classified*') , emit: classified tuple val(meta), path('*unclassified*'), emit: unclassified tuple val(meta), path('*report.txt') , emit: txt - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" def paired = meta.single_end ? "" : "--paired" def classified = meta.single_end ? "${prefix}.classified.fastq" : "${prefix}.classified#.fastq" def unclassified = meta.single_end ? 
"${prefix}.unclassified.fastq" : "${prefix}.unclassified#.fastq" @@ -43,11 +32,15 @@ process KRAKEN2_KRAKEN2 { --report ${prefix}.kraken2.report.txt \\ --gzip-compressed \\ $paired \\ - $options.args \\ + $args \\ $reads pigz -p $task.cpus *.fastq - echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + kraken2: \$(echo \$(kraken2 --version 2>&1) | sed 's/^.*Kraken version //; s/ .*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS """ } diff --git a/modules/nf-core/modules/kraken2/kraken2/meta.yml b/modules/nf-core/modules/kraken2/kraken2/meta.yml index cb1ec0de..4b894705 100644 --- a/modules/nf-core/modules/kraken2/kraken2/meta.yml +++ b/modules/nf-core/modules/kraken2/kraken2/meta.yml @@ -12,6 +12,7 @@ tools: homepage: https://ccb.jhu.edu/software/kraken2/ documentation: https://github.com/DerrickWood/kraken2/wiki/Manual doi: 10.1186/s13059-019-1891-0 + licence: ['MIT'] input: - meta: type: map @@ -50,10 +51,10 @@ output: Kraken2 report containing stats about classified and not classifed reads. pattern: "*.{report.txt}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/minia/functions.nf b/modules/nf-core/modules/minia/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/minia/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/minia/main.nf b/modules/nf-core/modules/minia/main.nf index 9ae79ede..968cafa5 100644 --- a/modules/nf-core/modules/minia/main.nf +++ b/modules/nf-core/modules/minia/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process MINIA { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::minia=3.2.4" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/minia:3.2.4--he513fc3_0" - } else { - container "quay.io/biocontainers/minia:3.2.4--he513fc3_0" - } + conda (params.enable_conda ? "bioconda::minia=3.2.6" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/minia:3.2.6--h9a82719_0' : + 'quay.io/biocontainers/minia:3.2.6--h9a82719_0' }" input: tuple val(meta), path(reads) @@ -25,19 +14,23 @@ process MINIA { tuple val(meta), path('*.contigs.fa'), emit: contigs tuple val(meta), path('*.unitigs.fa'), emit: unitigs tuple val(meta), path('*.h5') , emit: h5 - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def read_list = reads.join(",") """ - echo "${reads.join("\n")}" > input_files.txt + echo "${read_list}" | sed 's/,/\\n/g' > input_files.txt minia \\ - $options.args \\ + $args \\ -nb-cores $task.cpus \\ -in input_files.txt \\ -out $prefix - echo \$(minia --version 2>&1) | sed 's/^.*Minia version //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + minia: \$(echo \$(minia --version 2>&1 | grep Minia) | sed 's/^.*Minia version //;') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/minia/meta.yml b/modules/nf-core/modules/minia/meta.yml index d3a76be8..397a1d49 100644 --- a/modules/nf-core/modules/minia/meta.yml +++ b/modules/nf-core/modules/minia/meta.yml @@ -9,6 +9,7 @@ tools: a human genome on a desktop computer in a day. The output of Minia is a set of contigs. 
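
Every converted module now emits a `versions.yml` rather than a `*.version.txt`, so a pipeline can aggregate them all through a single channel. A minimal sketch of that aggregation; `CUSTOM_DUMPSOFTWAREVERSIONS` is the conventional nf-core collector module and is not introduced by this diff:

```nextflow
ch_versions = Channel.empty()
ch_versions = ch_versions.mix(FASTQC.out.versions.first())
ch_versions = ch_versions.mix(MINIA.out.versions.first())

// de-duplicate and concatenate into one YAML file for the final report
CUSTOM_DUMPSOFTWAREVERSIONS ( ch_versions.unique().collectFile(name: 'collated_versions.yml') )
```
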
homepage: https://github.com/GATB/minia documentation: https://github.com/GATB/minia + licence: ['AGPL-3.0-or-later'] input: - meta: type: map @@ -37,10 +38,10 @@ output: type: file description: Minia output h5 file pattern: "*{.h5}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@kevinmenden" diff --git a/modules/nf-core/modules/mosdepth/functions.nf b/modules/nf-core/modules/mosdepth/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/mosdepth/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/mosdepth/main.nf b/modules/nf-core/modules/mosdepth/main.nf index 618efd79..d2669b7e 100644 --- a/modules/nf-core/modules/mosdepth/main.nf +++ b/modules/nf-core/modules/mosdepth/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process MOSDEPTH { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::mosdepth=0.3.1' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/mosdepth:0.3.1--ha7ba039_0" - } else { - container "quay.io/biocontainers/mosdepth:0.3.1--ha7ba039_0" - } + conda (params.enable_conda ? 'bioconda::mosdepth=0.3.2' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mosdepth:0.3.2--h01d7912_0' : + 'quay.io/biocontainers/mosdepth:0.3.2--h01d7912_0' }" input: tuple val(meta), path(bam), path(bai) @@ -31,18 +20,21 @@ process MOSDEPTH { tuple val(meta), path('*.per-base.bed.gz.csi'), emit: per_base_csi tuple val(meta), path('*.regions.bed.gz') , emit: regions_bed tuple val(meta), path('*.regions.bed.gz.csi') , emit: regions_csi - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" def interval = window_size ? "--by ${window_size}" : "--by ${bed}" """ mosdepth \\ $interval \\ - $options.args \\ + $args \\ $prefix \\ $bam - echo \$(mosdepth --version 2>&1) | sed 's/^.*mosdepth //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mosdepth: \$(mosdepth --version 2>&1 | sed 's/^.*mosdepth //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/mosdepth/meta.yml b/modules/nf-core/modules/mosdepth/meta.yml index d96e474f..be568aa6 100644 --- a/modules/nf-core/modules/mosdepth/meta.yml +++ b/modules/nf-core/modules/mosdepth/meta.yml @@ -11,6 +11,7 @@ tools: Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. 
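
MOSDEPTH builds its `--by` argument from either a fixed window size or a BED file, so callers supply one and leave the other empty. Assuming the inputs trimmed from this hunk are the BED path and the window size, as in the upstream module, the two modes would be driven as alternatives like so (channel names assumed; a real workflow would include the module under two aliases to use both):

```nextflow
MOSDEPTH ( ch_bam_bai, ch_amplicon_bed, [] )   // per-region depth from a primer-scheme BED
MOSDEPTH ( ch_bam_bai, [], 200 )               // fixed 200 bp windows, no BED required
```
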
documentation: https://github.com/brentp/mosdepth doi: 10.1093/bioinformatics/btx699 + licence: ['MIT'] input: - meta: type: map @@ -67,10 +68,10 @@ output: type: file description: Index file for BED file with per-region coverage pattern: "*.{regions.bed.gz.csi}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/nanoplot/functions.nf b/modules/nf-core/modules/nanoplot/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/nanoplot/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/nanoplot/main.nf b/modules/nf-core/modules/nanoplot/main.nf index af080dc8..c3fb8a37 100644 --- a/modules/nf-core/modules/nanoplot/main.nf +++ b/modules/nf-core/modules/nanoplot/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process NANOPLOT { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::nanoplot=1.36.1" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/nanoplot:1.36.1--pyhdfd78af_0" - } else { - container "quay.io/biocontainers/nanoplot:1.36.1--pyhdfd78af_0" - } + conda (params.enable_conda ? 'bioconda::nanoplot=1.39.0' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/nanoplot:1.39.0--pyhdfd78af_0' : + 'quay.io/biocontainers/nanoplot:1.39.0--pyhdfd78af_0' }" input: tuple val(meta), path(ontfile) @@ -26,17 +15,20 @@ process NANOPLOT { tuple val(meta), path("*.png") , emit: png tuple val(meta), path("*.txt") , emit: txt tuple val(meta), path("*.log") , emit: log - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' def input_file = ("$ontfile".endsWith(".fastq.gz")) ? "--fastq ${ontfile}" : ("$ontfile".endsWith(".txt")) ? "--summary ${ontfile}" : '' """ NanoPlot \\ - $options.args \\ + $args \\ -t $task.cpus \\ $input_file - echo \$(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + nanoplot: \$(echo \$(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/nanoplot/meta.yml b/modules/nf-core/modules/nanoplot/meta.yml index f1d94312..52ebb622 100644 --- a/modules/nf-core/modules/nanoplot/meta.yml +++ b/modules/nf-core/modules/nanoplot/meta.yml @@ -13,6 +13,7 @@ tools: alignment. 
homepage: http://nanoplot.bioinf.be documentation: https://github.com/wdecoster/NanoPlot + licence: ['GPL-3.0-or-later'] input: - meta: type: map @@ -49,10 +50,10 @@ output: type: file description: log file of NanoPlot run pattern: "*{.log}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@yuukiiwa" diff --git a/modules/nf-core/modules/nextclade/datasetget/main.nf b/modules/nf-core/modules/nextclade/datasetget/main.nf new file mode 100644 index 00000000..75bb88f3 --- /dev/null +++ b/modules/nf-core/modules/nextclade/datasetget/main.nf @@ -0,0 +1,39 @@ +process NEXTCLADE_DATASETGET { + tag "$dataset" + label 'process_low' + + conda (params.enable_conda ? "bioconda::nextclade=1.10.2" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/nextclade:1.10.2--h9ee0642_0' : + 'quay.io/biocontainers/nextclade:1.10.2--h9ee0642_0' }" + + input: + val dataset + val reference + val tag + + output: + path "$prefix" , emit: dataset + path "versions.yml", emit: versions + + script: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${dataset}" + def fasta = reference ? "--reference ${reference}" : '' + def version = tag ? "--tag ${tag}" : '' + """ + nextclade \\ + dataset \\ + get \\ + $args \\ + --name $dataset \\ + $fasta \\ + $version \\ + --output-dir $prefix + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + nextclade: \$(nextclade --version 2>&1) + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/nextclade/datasetget/meta.yml b/modules/nf-core/modules/nextclade/datasetget/meta.yml new file mode 100644 index 00000000..1246d918 --- /dev/null +++ b/modules/nf-core/modules/nextclade/datasetget/meta.yml @@ -0,0 +1,42 @@ +name: nextclade_datasetget +description: Get dataset for SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation) +keywords: + - nextclade + - variant + - consensus +tools: + - nextclade: + description: SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks + homepage: https://github.com/nextstrain/nextclade + documentation: https://github.com/nextstrain/nextclade + tool_dev_url: https://github.com/nextstrain/nextclade + doi: "" + licence: ['MIT'] + +input: + - dataset: + type: string + description: Name of dataset to retrieve. A list of available datasets can be obtained using the nextclade dataset list command. + pattern: ".+" + - reference: + type: string + description: Accession id to download dataset based on a particular reference sequence. A list of available datasets can be obtained using the nextclade dataset list command. + pattern: ".+" + - tag: + type: string + description: Version tag of the dataset to download. A list of available datasets can be obtained using the nextclade dataset list command. 
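
The old single NEXTCLADE process gives way to a get-then-run pair, which pins the dataset version instead of relying on whatever ships with the container. A hedged sketch of chaining the new `nextclade/datasetget` module with the `nextclade/run` module defined later in this diff; the dataset name, reference accession and tag are illustrative values only:

```nextflow
NEXTCLADE_DATASETGET ( 'sars-cov-2', 'MN908947', '2022-01-18T12:00:00Z' )
NEXTCLADE_RUN ( ch_consensus, NEXTCLADE_DATASETGET.out.dataset )
NEXTCLADE_RUN.out.csv.view()   // [ meta, <prefix>.csv ]
```
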
+ pattern: ".+" + +output: + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" + - prefix: + type: path + description: A directory containing the dataset files needed for nextclade run + pattern: "prefix" + +authors: + - "@antunderwood" + - "@drpatelh" diff --git a/modules/nf-core/modules/nextclade/functions.nf b/modules/nf-core/modules/nextclade/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/nextclade/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/nextclade/main.nf b/modules/nf-core/modules/nextclade/main.nf deleted file mode 100644 index 8319f6b1..00000000 --- a/modules/nf-core/modules/nextclade/main.nf +++ /dev/null @@ -1,48 +0,0 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - -process NEXTCLADE { - tag "$meta.id" - label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - - conda (params.enable_conda ? 
"bioconda::nextclade_js=0.14.4" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/nextclade_js:0.14.4--h9ee0642_0" - } else { - container "quay.io/biocontainers/nextclade_js:0.14.4--h9ee0642_0" - } - - input: - tuple val(meta), path(fasta) - - output: - tuple val(meta), path("${prefix}.csv") , emit: csv - tuple val(meta), path("${prefix}.json") , emit: json - tuple val(meta), path("${prefix}.tree.json") , emit: json_tree - tuple val(meta), path("${prefix}.tsv") , emit: tsv - tuple val(meta), path("${prefix}.clades.tsv"), optional:true, emit: tsv_clades - path "*.version.txt" , emit: version - - script: - def software = getSoftwareName(task.process) - prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - """ - nextclade \\ - $options.args \\ - --jobs $task.cpus \\ - --input-fasta $fasta \\ - --output-json ${prefix}.json \\ - --output-csv ${prefix}.csv \\ - --output-tsv ${prefix}.tsv \\ - --output-tsv-clades-only ${prefix}.clades.tsv \\ - --output-tree ${prefix}.tree.json - - echo \$(nextclade --version 2>&1) > ${software}.version.txt - """ -} diff --git a/modules/nf-core/modules/nextclade/run/main.nf b/modules/nf-core/modules/nextclade/run/main.nf new file mode 100644 index 00000000..b3d101ce --- /dev/null +++ b/modules/nf-core/modules/nextclade/run/main.nf @@ -0,0 +1,42 @@ +process NEXTCLADE_RUN { + tag "$meta.id" + label 'process_low' + + conda (params.enable_conda ? "bioconda::nextclade=1.10.2" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/nextclade:1.10.2--h9ee0642_0' : + 'quay.io/biocontainers/nextclade:1.10.2--h9ee0642_0' }" + + input: + tuple val(meta), path(fasta) + path dataset + + output: + tuple val(meta), path("${prefix}.csv") , emit: csv + tuple val(meta), path("${prefix}.tsv") , emit: tsv + tuple val(meta), path("${prefix}.json") , emit: json + tuple val(meta), path("${prefix}.tree.json"), emit: json_tree + path "versions.yml" , emit: versions + + script: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + """ + nextclade \\ + run \\ + $args \\ + --jobs $task.cpus \\ + --input-fasta $fasta \\ + --input-dataset $dataset \\ + --output-csv ${prefix}.csv \\ + --output-tsv ${prefix}.tsv \\ + --output-json ${prefix}.json \\ + --output-tree ${prefix}.tree.json \\ + --output-basename ${prefix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + nextclade: \$(nextclade --version 2>&1) + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/nextclade/meta.yml b/modules/nf-core/modules/nextclade/run/meta.yml similarity index 70% rename from modules/nf-core/modules/nextclade/meta.yml rename to modules/nf-core/modules/nextclade/run/meta.yml index d321e08f..40a863e6 100644 --- a/modules/nf-core/modules/nextclade/meta.yml +++ b/modules/nf-core/modules/nextclade/run/meta.yml @@ -1,17 +1,17 @@ -name: nextclade -description: SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (Javascript implementation) +name: nextclade_run +description: SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (C++ implementation) keywords: - nextclade - variant - consensus tools: - nextclade: - description: SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks (Javascript implementation) - homepage: https://clades.nextstrain.org - documentation: 
None + description: SARS-CoV-2 genome clade assignment, mutation calling, and sequence quality checks + homepage: https://github.com/nextstrain/nextclade + documentation: https://github.com/nextstrain/nextclade tool_dev_url: https://github.com/nextstrain/nextclade doi: "" - licence: ["MIT"] + licence: ['MIT'] input: - meta: @@ -19,6 +19,10 @@ input: description: | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] + - dataset: + type: path + description: Path containing the dataset files obtained by running nextclade dataset get + pattern: "*" - fasta: type: file description: FASTA file containing one or more consensus sequences @@ -30,10 +34,10 @@ output: description: | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" - csv: type: file description: CSV file containing nextclade results @@ -50,10 +54,7 @@ output: type: file description: TSV file containing nextclade results pattern: "*.{tsv}" - - tsv_clades: - type: file - description: TSV file containing nextclade results for clades only - pattern: "*.{clades.tsv}" authors: + - "@antunderwood" - "@drpatelh" diff --git a/modules/nf-core/modules/pangolin/functions.nf b/modules/nf-core/modules/pangolin/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/pangolin/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
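Together the two new processes replace the single NEXTCLADE step deleted above; a chaining sketch, where the channel and parameter names are assumptions rather than names fixed by this diff:

NEXTCLADE_DATASETGET ( params.nextclade_dataset, params.nextclade_dataset_reference, params.nextclade_dataset_tag )
NEXTCLADE_RUN ( ch_consensus, NEXTCLADE_DATASETGET.out.dataset )    // ch_consensus: tuple val(meta), path(fasta)
ch_versions = ch_versions.mix(NEXTCLADE_RUN.out.versions.first())  // collect versions.yml once per process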
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/pangolin/main.nf b/modules/nf-core/modules/pangolin/main.nf index d1417990..6c8682e3 100644 --- a/modules/nf-core/modules/pangolin/main.nf +++ b/modules/nf-core/modules/pangolin/main.nf @@ -1,40 +1,32 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PANGOLIN { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? 'bioconda::pangolin=3.1.7' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/pangolin:3.1.7--pyhdfd78af_0' - } else { - container 'quay.io/biocontainers/pangolin:3.1.7--pyhdfd78af_0' - } + conda (params.enable_conda ? 'bioconda::pangolin=3.1.19' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/pangolin:3.1.19--pyhdfd78af_0' : + 'quay.io/biocontainers/pangolin:3.1.19--pyhdfd78af_0' }" input: tuple val(meta), path(fasta) output: tuple val(meta), path('*.csv'), emit: report - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ pangolin \\ $fasta\\ --outfile ${prefix}.pangolin.csv \\ --threads $task.cpus \\ - $options.args + $args - echo \$(pangolin --version) | sed "s/pangolin //g" > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + pangolin: \$(pangolin --version | sed "s/pangolin //g") + END_VERSIONS """ } diff --git a/modules/nf-core/modules/pangolin/meta.yml b/modules/nf-core/modules/pangolin/meta.yml index 2b2eb952..a2c0979a 100644 --- a/modules/nf-core/modules/pangolin/meta.yml +++ b/modules/nf-core/modules/pangolin/meta.yml @@ -10,6 +10,7 @@ tools: Phylogenetic Assignment of Named Global Outbreak LINeages homepage: https://github.com/cov-lineages/pangolin#pangolearn-description manual: https://github.com/cov-lineages/pangolin#pangolearn-description + licence: ['GPL-3.0-or-later'] input: - meta: type: map @@ -24,10 +25,10 @@ output: type: file description: Pangolin lineage report pattern: "*.{csv}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@kevinmenden" - "@drpatelh" diff --git a/modules/nf-core/modules/picard/collectmultiplemetrics/functions.nf b/modules/nf-core/modules/picard/collectmultiplemetrics/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/picard/collectmultiplemetrics/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/picard/collectmultiplemetrics/main.nf b/modules/nf-core/modules/picard/collectmultiplemetrics/main.nf index c0059a40..9511f7a4 100644 --- a/modules/nf-core/modules/picard/collectmultiplemetrics/main.nf +++ b/modules/nf-core/modules/picard/collectmultiplemetrics/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PICARD_COLLECTMULTIPLEMETRICS { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::picard=2.23.9" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/picard:2.23.9--0" - } else { - container "quay.io/biocontainers/picard:2.23.9--0" - } + conda (params.enable_conda ? "bioconda::picard=2.26.10" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/picard:2.26.10--hdfd78af_0' : + 'quay.io/biocontainers/picard:2.26.10--hdfd78af_0' }" input: tuple val(meta), path(bam) @@ -25,11 +14,11 @@ process PICARD_COLLECTMULTIPLEMETRICS { output: tuple val(meta), path("*_metrics"), emit: metrics tuple val(meta), path("*.pdf") , emit: pdf - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" def avail_mem = 3 if (!task.memory) { log.info '[Picard CollectMultipleMetrics] Available memory not known - defaulting to 3GB. Specify process memory requirements to change this.' @@ -40,11 +29,14 @@ process PICARD_COLLECTMULTIPLEMETRICS { picard \\ -Xmx${avail_mem}g \\ CollectMultipleMetrics \\ - $options.args \\ + $args \\ INPUT=$bam \\ OUTPUT=${prefix}.CollectMultipleMetrics \\ REFERENCE_SEQUENCE=$fasta - echo \$(picard CollectMultipleMetrics --version 2>&1) | grep -o 'Version.*' | cut -f2- -d: > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + picard: \$(picard CollectMultipleMetrics --version 2>&1 | grep -o 'Version.*' | cut -f2- -d:) + END_VERSIONS """ } diff --git a/modules/nf-core/modules/picard/collectmultiplemetrics/meta.yml b/modules/nf-core/modules/picard/collectmultiplemetrics/meta.yml index 34006093..613afc62 100644 --- a/modules/nf-core/modules/picard/collectmultiplemetrics/meta.yml +++ b/modules/nf-core/modules/picard/collectmultiplemetrics/meta.yml @@ -14,6 +14,7 @@ tools: data and formats such as SAM/BAM/CRAM and VCF. 
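The avail_mem block above derives picard's JVM heap from the task's declared memory and falls back to 3 GB when none is known. The same logic in isolation:

def avail_mem = 3                // default heap size in GB
if (!task.memory) {
    log.info 'Available memory not known - defaulting to 3GB.'
} else {
    avail_mem = task.memory.giga // e.g. 6.GB -> 6, interpolated as -Xmx6g
}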
homepage: https://broadinstitute.github.io/picard/ documentation: https://broadinstitute.github.io/picard/ + licence: ['MIT'] input: - meta: type: map @@ -41,9 +42,9 @@ output: type: file description: PDF plots of metrics pattern: "*.{pdf}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" diff --git a/modules/nf-core/modules/picard/markduplicates/functions.nf b/modules/nf-core/modules/picard/markduplicates/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/picard/markduplicates/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/picard/markduplicates/main.nf b/modules/nf-core/modules/picard/markduplicates/main.nf index d7647414..7990d7e6 100644 --- a/modules/nf-core/modules/picard/markduplicates/main.nf +++ b/modules/nf-core/modules/picard/markduplicates/main.nf @@ -1,34 +1,24 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PICARD_MARKDUPLICATES { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::picard=2.23.9" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/picard:2.23.9--0" - } else { - container "quay.io/biocontainers/picard:2.23.9--0" - } + conda (params.enable_conda ? "bioconda::picard=2.26.10" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/picard:2.26.10--hdfd78af_0' : + 'quay.io/biocontainers/picard:2.26.10--hdfd78af_0' }" input: tuple val(meta), path(bam) output: tuple val(meta), path("*.bam") , emit: bam + tuple val(meta), path("*.bai") , optional:true, emit: bai tuple val(meta), path("*.metrics.txt"), emit: metrics - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" def avail_mem = 3 if (!task.memory) { log.info '[Picard MarkDuplicates] Available memory not known - defaulting to 3GB. Specify process memory requirements to change this.' 
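The new optional *.bai output only appears when indexing is requested through ext.args. Since this module keeps picard's legacy argument style (I=, O=, M=), the matching option should be the legacy form; the value below is an assumption based on picard's syntax, not something set by this diff:

process {
    withName: 'PICARD_MARKDUPLICATES' {
        ext.args = 'CREATE_INDEX=true'   // emits <prefix>.bai alongside <prefix>.bam
    }
}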
@@ -39,11 +29,14 @@ process PICARD_MARKDUPLICATES { picard \\ -Xmx${avail_mem}g \\ MarkDuplicates \\ - $options.args \\ - INPUT=$bam \\ - OUTPUT=${prefix}.bam \\ - METRICS_FILE=${prefix}.MarkDuplicates.metrics.txt + $args \\ + I=$bam \\ + O=${prefix}.bam \\ + M=${prefix}.MarkDuplicates.metrics.txt - echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d: > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + picard: \$(echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:) + END_VERSIONS """ } diff --git a/modules/nf-core/modules/picard/markduplicates/meta.yml b/modules/nf-core/modules/picard/markduplicates/meta.yml index 6420ce9a..c9a08b36 100644 --- a/modules/nf-core/modules/picard/markduplicates/meta.yml +++ b/modules/nf-core/modules/picard/markduplicates/meta.yml @@ -1,46 +1,52 @@ name: picard_markduplicates description: Locate and tag duplicate reads in a BAM file keywords: - - markduplicates - - pcr - - duplicates - - bam - - sam - - cram + - markduplicates + - pcr + - duplicates + - bam + - sam + - cram tools: - - picard: - description: | - A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) - data and formats such as SAM/BAM/CRAM and VCF. - homepage: https://broadinstitute.github.io/picard/ - documentation: https://broadinstitute.github.io/picard/ + - picard: + description: | + A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) + data and formats such as SAM/BAM/CRAM and VCF. + homepage: https://broadinstitute.github.io/picard/ + documentation: https://broadinstitute.github.io/picard/ + licence: ['MIT'] input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM file + pattern: "*.{bam}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM file with duplicate reads marked/removed - pattern: "*.{bam}" - - metrics: - type: file - description: Duplicate metrics file generated by picard - pattern: "*.{metrics.txt}" - - version: - type: file - description: File containing software version - pattern: "*.{version.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM file with duplicate reads marked/removed + pattern: "*.{bam}" + - bai: + type: file + description: An optional BAM index file. 
If desired, --CREATE_INDEX must be passed as a flag + pattern: "*.{bai}" + - metrics: + type: file + description: Duplicate metrics file generated by picard + pattern: "*.{metrics.txt}" + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - - "@drpatelh" + - "@drpatelh" + - "@projectoriented" diff --git a/modules/nf-core/modules/plasmidid/functions.nf b/modules/nf-core/modules/plasmidid/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/plasmidid/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/plasmidid/main.nf b/modules/nf-core/modules/plasmidid/main.nf index 986b6451..7404a678 100644 --- a/modules/nf-core/modules/plasmidid/main.nf +++ b/modules/nf-core/modules/plasmidid/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PLASMIDID { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? 
'bioconda::plasmidid=1.6.5' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/plasmidid:1.6.5--hdfd78af_0' - } else { - container 'quay.io/biocontainers/plasmidid:1.6.5--hdfd78af_0' - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/plasmidid:1.6.5--hdfd78af_0' : + 'quay.io/biocontainers/plasmidid:1.6.5--hdfd78af_0' }" input: tuple val(meta), path(scaffold) @@ -31,20 +20,23 @@ process PLASMIDID { tuple val(meta), path("${prefix}/database/") , emit: database tuple val(meta), path("${prefix}/fasta_files/") , emit: fasta_files tuple val(meta), path("${prefix}/kmer/") , emit: kmer - path '*.version.txt' , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" """ plasmidID \\ -d $fasta \\ -s $prefix \\ -c $scaffold \\ - $options.args \\ + $args \\ -o . mv NO_GROUP/$prefix ./$prefix - echo \$(plasmidID --version 2>&1) > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + plasmidid: \$(echo \$(plasmidID --version 2>&1)) + END_VERSIONS """ } diff --git a/modules/nf-core/modules/plasmidid/meta.yml b/modules/nf-core/modules/plasmidid/meta.yml index b7b188f8..8cde23c5 100644 --- a/modules/nf-core/modules/plasmidid/meta.yml +++ b/modules/nf-core/modules/plasmidid/meta.yml @@ -66,10 +66,10 @@ output: type: directory description: Directory containing the kmer files produced by plasmidid pattern: "database" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" diff --git a/modules/nf-core/modules/pycoqc/functions.nf b/modules/nf-core/modules/pycoqc/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/pycoqc/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if 
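Container selection in each refactored module collapses the old if/else into a single ternary. Annotated, the PLASMIDID directive reads:

container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
    'https://depot.galaxyproject.org/singularity/plasmidid:1.6.5--hdfd78af_0' :   // Singularity image from the Galaxy depot
    'quay.io/biocontainers/plasmidid:1.6.5--hdfd78af_0' }"                        // Docker image from BioContainers
// task.ext.singularity_pull_docker_container forces the Docker image even under Singularity, which then pulls and converts it.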
(ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/pycoqc/main.nf b/modules/nf-core/modules/pycoqc/main.nf index 3f010247..e966b31c 100644 --- a/modules/nf-core/modules/pycoqc/main.nf +++ b/modules/nf-core/modules/pycoqc/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process PYCOQC { tag "$summary" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "bioconda::pycoqc=2.5.2" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/pycoqc:2.5.2--py_0" - } else { - container "quay.io/biocontainers/pycoqc:2.5.2--py_0" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/pycoqc:2.5.2--py_0' : + 'quay.io/biocontainers/pycoqc:2.5.2--py_0' }" input: path summary @@ -24,17 +13,20 @@ process PYCOQC { output: path "*.html" , emit: html path "*.json" , emit: json - path "*.version.txt", emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ pycoQC \\ - $options.args \\ + $args \\ -f $summary \\ -o pycoqc.html \\ -j pycoqc.json - echo \$(pycoQC --version 2>&1) | sed 's/^.*pycoQC v//; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + pycoqc: \$(pycoQC --version 2>&1 | sed 's/^.*pycoQC v//; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/pycoqc/meta.yml b/modules/nf-core/modules/pycoqc/meta.yml index 059b2f15..33bd6b07 100644 --- a/modules/nf-core/modules/pycoqc/meta.yml +++ b/modules/nf-core/modules/pycoqc/meta.yml @@ -38,10 +38,10 @@ output: type: file description: Results in JSON format pattern: "*.{json}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" diff --git a/modules/nf-core/modules/quast/functions.nf b/modules/nf-core/modules/quast/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/quast/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/quast/main.nf b/modules/nf-core/modules/quast/main.nf index 0b94c410..e88051b5 100644 --- a/modules/nf-core/modules/quast/main.nf +++ b/modules/nf-core/modules/quast/main.nf @@ -1,21 +1,10 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process QUAST { label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? 'bioconda::quast=5.0.2' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container 'https://depot.galaxyproject.org/singularity/quast:5.0.2--py37pl526hb5aa323_2' - } else { - container 'quay.io/biocontainers/quast:5.0.2--py37pl526hb5aa323_2' - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/quast:5.0.2--py37pl526hb5aa323_2' : + 'quay.io/biocontainers/quast:5.0.2--py37pl526hb5aa323_2' }" input: path consensus @@ -27,11 +16,11 @@ process QUAST { output: path "${prefix}" , emit: results path '*.tsv' , emit: tsv - path '*.version.txt', emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - prefix = options.suffix ?: software + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: 'quast' def features = use_gff ? "--features $gff" : '' def reference = use_fasta ? 
"-r $fasta" : '' """ @@ -40,9 +29,14 @@ process QUAST { $reference \\ $features \\ --threads $task.cpus \\ - $options.args \\ + $args \\ ${consensus.join(' ')} + ln -s ${prefix}/report.tsv - echo \$(quast.py --version 2>&1) | sed 's/^.*QUAST v//; s/ .*\$//' > ${software}.version.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + quast: \$(quast.py --version 2>&1 | sed 's/^.*QUAST v//; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/quast/meta.yml b/modules/nf-core/modules/quast/meta.yml index cc79486e..05faa8b8 100644 --- a/modules/nf-core/modules/quast/meta.yml +++ b/modules/nf-core/modules/quast/meta.yml @@ -9,7 +9,8 @@ tools: description: | QUAST calculates quality metrics for genome assemblies homepage: http://bioinf.spbau.ru/quast - doi: + doi: https://doi.org/10.1093/bioinformatics/btt086 + licence: ['GPL-2.0-only'] input: - consensus: type: file @@ -36,10 +37,10 @@ output: pattern: "{prefix}.lineage_report.csv" - report: - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" diff --git a/modules/nf-core/modules/samtools/flagstat/functions.nf b/modules/nf-core/modules/samtools/flagstat/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/flagstat/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/flagstat/main.nf b/modules/nf-core/modules/samtools/flagstat/main.nf index a66ea56d..119adf77 100644 --- a/modules/nf-core/modules/samtools/flagstat/main.nf +++ b/modules/nf-core/modules/samtools/flagstat/main.nf @@ -1,34 +1,31 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_FLAGSTAT { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: tuple val(meta), path(bam), path(bai) output: tuple val(meta), path("*.flagstat"), emit: flagstat - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ - samtools flagstat $bam > ${bam}.flagstat - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools \\ + flagstat \\ + --threads ${task.cpus-1} \\ + $bam \\ + > ${bam}.flagstat + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/flagstat/meta.yml b/modules/nf-core/modules/samtools/flagstat/meta.yml index 8414bf54..9bd9ff89 100644 --- a/modules/nf-core/modules/samtools/flagstat/meta.yml +++ b/modules/nf-core/modules/samtools/flagstat/meta.yml @@ -16,6 +16,7 @@ tools: homepage: http://www.htslib.org/ documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -40,9 +41,9 @@ output: type: file description: File containing samtools flagstat output pattern: "*.{flagstat}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" diff --git a/modules/nf-core/modules/samtools/idxstats/functions.nf b/modules/nf-core/modules/samtools/idxstats/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/idxstats/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module
files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/idxstats/main.nf b/modules/nf-core/modules/samtools/idxstats/main.nf index ff3cd9a6..fc54e676 100644 --- a/modules/nf-core/modules/samtools/idxstats/main.nf +++ b/modules/nf-core/modules/samtools/idxstats/main.nf @@ -1,34 +1,30 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_IDXSTATS { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: tuple val(meta), path(bam), path(bai) output: tuple val(meta), path("*.idxstats"), emit: idxstats - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ - samtools idxstats $bam > ${bam}.idxstats - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools \\ + idxstats \\ + $bam \\ + > ${bam}.idxstats + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/idxstats/meta.yml b/modules/nf-core/modules/samtools/idxstats/meta.yml index 530d0772..ec542f34 100644 --- a/modules/nf-core/modules/samtools/idxstats/meta.yml +++ b/modules/nf-core/modules/samtools/idxstats/meta.yml @@ -17,6 +17,7 @@ tools: homepage: http://www.htslib.org/ documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -41,9 +42,9 @@ output: type: file description: File containing samtools idxstats output pattern: "*.{idxstats}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" diff --git a/modules/nf-core/modules/samtools/index/functions.nf b/modules/nf-core/modules/samtools/index/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/index/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') } - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ?
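Note the `--threads ${task.cpus-1}` introduced for flagstat above: samtools counts additional worker threads on top of the main one, so a task with task.cpus == 4 should ask for 3 extras, and a single-CPU task for 0.

// samtools' -@/--threads means *additional* threads, hence task.cpus-1:
def extra_threads = task.cpus - 1   // task.cpus >= 1, so never negative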
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/index/main.nf b/modules/nf-core/modules/samtools/index/main.nf index 778e9384..c4fa2c63 100644 --- a/modules/nf-core/modules/samtools/index/main.nf +++ b/modules/nf-core/modules/samtools/index/main.nf @@ -1,35 +1,33 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_INDEX { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: - tuple val(meta), path(bam) + tuple val(meta), path(input) output: - tuple val(meta), path("*.bai"), optional:true, emit: bai - tuple val(meta), path("*.csi"), optional:true, emit: csi - path "*.version.txt" , emit: version + tuple val(meta), path("*.bai") , optional:true, emit: bai + tuple val(meta), path("*.csi") , optional:true, emit: csi + tuple val(meta), path("*.crai"), optional:true, emit: crai + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ - samtools index $options.args $bam - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools \\ + index \\ + -@ ${task.cpus-1} \\ + $args \\ + $input + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/index/meta.yml b/modules/nf-core/modules/samtools/index/meta.yml index 5d076e3b..0905b3cd 100644 --- a/modules/nf-core/modules/samtools/index/meta.yml +++ b/modules/nf-core/modules/samtools/index/meta.yml @@ -14,6 +14,7 @@ tools: homepage: http://www.htslib.org/ documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -34,14 +35,19 @@ output: type: file description: BAM/CRAM/SAM index file pattern: "*.{bai,crai,sai}" + - crai: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - csi: type: file description: CSI index file pattern: "*.{csi}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions +
pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" + - "@maxulysse" diff --git a/modules/nf-core/modules/samtools/mpileup/functions.nf b/modules/nf-core/modules/samtools/mpileup/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/mpileup/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/mpileup/main.nf b/modules/nf-core/modules/samtools/mpileup/main.nf index 8f2cebd1..c40f46d1 100644 --- a/modules/nf-core/modules/samtools/mpileup/main.nf +++ b/modules/nf-core/modules/samtools/mpileup/main.nf @@ -1,22 +1,11 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_MPILEUP { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? 
"bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: tuple val(meta), path(bam) @@ -24,17 +13,20 @@ process SAMTOOLS_MPILEUP { output: tuple val(meta), path("*.mpileup"), emit: mpileup - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ samtools mpileup \\ --fasta-ref $fasta \\ --output ${prefix}.mpileup \\ - $options.args \\ + $args \\ $bam - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/mpileup/meta.yml b/modules/nf-core/modules/samtools/mpileup/meta.yml index 7e432a78..fac7a5bc 100644 --- a/modules/nf-core/modules/samtools/mpileup/meta.yml +++ b/modules/nf-core/modules/samtools/mpileup/meta.yml @@ -14,6 +14,7 @@ tools: homepage: http://www.htslib.org/ documentation: hhttp://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -38,10 +39,10 @@ output: type: file description: mpileup file pattern: "*.{mpileup}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@joseespinosa" diff --git a/modules/nf-core/modules/samtools/sort/functions.nf b/modules/nf-core/modules/samtools/sort/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/sort/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? 
ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/sort/main.nf b/modules/nf-core/modules/samtools/sort/main.nf index 240e8e9f..42c7bbf4 100644 --- a/modules/nf-core/modules/samtools/sort/main.nf +++ b/modules/nf-core/modules/samtools/sort/main.nf @@ -1,35 +1,28 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_SORT { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: tuple val(meta), path(bam) output: tuple val(meta), path("*.bam"), emit: bam - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + if ("$bam" == "${prefix}.bam") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" 
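// The guard above exists because `samtools sort -o ${prefix}.bam` would silently
// overwrite the staged input BAM whenever the two names collide. Pipelines avoid
// the clash by giving the process a distinct prefix from `conf/modules.config`,
// for example via a hypothetical `ext.prefix = { "${meta.id}.sorted" }` entry.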
""" - samtools sort $options.args -@ $task.cpus -o ${prefix}.bam -T $prefix $bam - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools sort $args -@ $task.cpus -o ${prefix}.bam -T $prefix $bam + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/sort/meta.yml b/modules/nf-core/modules/samtools/sort/meta.yml index 704e8c1f..3402a068 100644 --- a/modules/nf-core/modules/samtools/sort/meta.yml +++ b/modules/nf-core/modules/samtools/sort/meta.yml @@ -14,6 +14,7 @@ tools: homepage: http://www.htslib.org/ documentation: hhttp://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -34,10 +35,10 @@ output: type: file description: Sorted BAM/CRAM/SAM file pattern: "*.{bam,cram,sam}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/modules/samtools/stats/functions.nf b/modules/nf-core/modules/samtools/stats/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/stats/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/stats/main.nf b/modules/nf-core/modules/samtools/stats/main.nf index 6bb0a4c7..7209070d 100644 --- a/modules/nf-core/modules/samtools/stats/main.nf +++ b/modules/nf-core/modules/samtools/stats/main.nf @@ -1,34 +1,34 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_STATS { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: - tuple val(meta), path(bam), path(bai) + tuple val(meta), path(input), path(input_index) + path fasta output: tuple val(meta), path("*.stats"), emit: stats - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' + def reference = fasta ? "--reference ${fasta}" : "" """ - samtools stats $bam > ${bam}.stats - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools \\ + stats \\ + --threads ${task.cpus-1} \\ + ${reference} \\ + ${input} \\ + > ${input}.stats + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/stats/meta.yml b/modules/nf-core/modules/samtools/stats/meta.yml index b549ff5c..869e62e3 100644 --- a/modules/nf-core/modules/samtools/stats/meta.yml +++ b/modules/nf-core/modules/samtools/stats/meta.yml @@ -15,20 +15,25 @@ tools: homepage: http://www.htslib.org/ documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map description: | Groovy Map containing sample information e.g.
[ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - bai: - type: file - description: Index for BAM/CRAM/SAM file - pattern: "*.{bai,crai,sai}" + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - fasta: + type: optional file + description: Reference file the CRAM was created with + pattern: "*.{fasta,fa}" output: - meta: type: map @@ -39,9 +44,10 @@ output: type: file description: File containing samtools stats output pattern: "*.{stats}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" + - "@FriederikeHanssen" diff --git a/modules/nf-core/modules/samtools/view/functions.nf b/modules/nf-core/modules/samtools/view/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/samtools/view/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/samtools/view/main.nf b/modules/nf-core/modules/samtools/view/main.nf index ec6c747f..cb205d0b 100644 --- a/modules/nf-core/modules/samtools/view/main.nf +++ b/modules/nf-core/modules/samtools/view/main.nf @@ -1,35 +1,41 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SAMTOOLS_VIEW { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::samtools=1.12" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/samtools:1.12--hd5e65b6_0" - } else { - container "quay.io/biocontainers/samtools:1.12--hd5e65b6_0" - } + conda (params.enable_conda ? "bioconda::samtools=1.14" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/samtools:1.14--hb421002_0' : + 'quay.io/biocontainers/samtools:1.14--hb421002_0' }" input: - tuple val(meta), path(bam) + tuple val(meta), path(input) + path fasta output: - tuple val(meta), path("*.bam"), emit: bam - path "*.version.txt" , emit: version + tuple val(meta), path("*.bam") , emit: bam , optional: true + tuple val(meta), path("*.cram"), emit: cram, optional: true + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def reference = fasta ? "--reference ${fasta} -C" : "" + def file_type = input.getExtension() + if ("$input" == "${prefix}.${file_type}") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" """ - samtools view $options.args $bam > ${prefix}.bam - echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//' > ${software}.version.txt + samtools \\ + view \\ + --threads ${task.cpus-1} \\ + ${reference} \\ + $args \\ + $input \\ + $args2 \\ + > ${prefix}.${file_type} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/samtools/view/meta.yml b/modules/nf-core/modules/samtools/view/meta.yml index c35a8b03..8abf34af 100644 --- a/modules/nf-core/modules/samtools/view/meta.yml +++ b/modules/nf-core/modules/samtools/view/meta.yml @@ -14,16 +14,21 @@ tools: homepage: http://www.htslib.org/ documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map description: | Groovy Map containing sample information e.g.
[ id:'test', single_end:false ] - - bam: + - input: type: file description: BAM/CRAM/SAM file pattern: "*.{bam,cram,sam}" + - fasta: + type: optional file + description: Reference file the CRAM was created with + pattern: "*.{fasta,fa}" output: - meta: type: map @@ -32,12 +37,17 @@ output: e.g. [ id:'test', single_end:false ] - bam: type: file - description: filtered/converted BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - version: + description: filtered/converted BAM/SAM file + pattern: "*.{bam,sam}" + - cram: + type: file + description: filtered/converted CRAM file + pattern: "*.cram" + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@joseespinosa" + - "@FriederikeHanssen" diff --git a/modules/nf-core/modules/spades/functions.nf b/modules/nf-core/modules/spades/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/spades/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/spades/main.nf b/modules/nf-core/modules/spades/main.nf index c6208053..ba690d35 100644 --- a/modules/nf-core/modules/spades/main.nf +++ b/modules/nf-core/modules/spades/main.nf @@ -1,67 +1,70 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process SPADES { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::spades=3.15.2" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/spades:3.15.2--h95f258a_1" - } else { - container "quay.io/biocontainers/spades:3.15.2--h95f258a_1" - } + conda (params.enable_conda ? 'bioconda::spades=3.15.3' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/spades:3.15.3--h95f258a_0' : + 'quay.io/biocontainers/spades:3.15.3--h95f258a_0' }" input: - tuple val(meta), path(reads) + tuple val(meta), path(illumina), path(pacbio), path(nanopore) path hmm output: - tuple val(meta), path('*.scaffolds.fa') , optional:true, emit: scaffolds - tuple val(meta), path('*.contigs.fa') , optional:true, emit: contigs - tuple val(meta), path('*.transcripts.fa') , optional:true, emit: transcripts - tuple val(meta), path('*.gene_clusters.fa'), optional:true, emit: gene_clusters - tuple val(meta), path('*.assembly.gfa') , optional:true, emit: gfa - tuple val(meta), path('*.log') , emit: log - path '*.version.txt' , emit: version + tuple val(meta), path('*.scaffolds.fa.gz') , optional:true, emit: scaffolds + tuple val(meta), path('*.contigs.fa.gz') , optional:true, emit: contigs + tuple val(meta), path('*.transcripts.fa.gz') , optional:true, emit: transcripts + tuple val(meta), path('*.gene_clusters.fa.gz'), optional:true, emit: gene_clusters + tuple val(meta), path('*.assembly.gfa.gz') , optional:true, emit: gfa + tuple val(meta), path('*.log') , emit: log + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def input_reads = meta.single_end ? "-s $reads" : "-1 ${reads[0]} -2 ${reads[1]}" - def custom_hmms = params.spades_hmm ? "--custom-hmms $hmm" : "" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def maxmem = task.memory.toGiga() + def illumina_reads = illumina ? ( meta.single_end ? "-s $illumina" : "-1 ${illumina[0]} -2 ${illumina[1]}" ) : "" + def pacbio_reads = pacbio ? "--pacbio $pacbio" : "" + def nanopore_reads = nanopore ? "--nanopore $nanopore" : "" + def custom_hmms = hmm ? 
"--custom-hmms $hmm" : "" """ spades.py \\ - $options.args \\ + $args \\ --threads $task.cpus \\ + --memory $maxmem \\ $custom_hmms \\ - $input_reads \\ + $illumina_reads \\ + $pacbio_reads \\ + $nanopore_reads \\ -o ./ mv spades.log ${prefix}.spades.log if [ -f scaffolds.fasta ]; then mv scaffolds.fasta ${prefix}.scaffolds.fa + gzip -n ${prefix}.scaffolds.fa fi if [ -f contigs.fasta ]; then mv contigs.fasta ${prefix}.contigs.fa + gzip -n ${prefix}.contigs.fa fi if [ -f transcripts.fasta ]; then mv transcripts.fasta ${prefix}.transcripts.fa + gzip -n ${prefix}.transcripts.fa fi if [ -f assembly_graph_with_scaffolds.gfa ]; then mv assembly_graph_with_scaffolds.gfa ${prefix}.assembly.gfa + gzip -n ${prefix}.assembly.gfa fi if [ -f gene_clusters.fasta ]; then mv gene_clusters.fasta ${prefix}.gene_clusters.fa + gzip -n ${prefix}.gene_clusters.fa fi - echo \$(spades.py --version 2>&1) | sed 's/^.*SPAdes genome assembler v//; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + spades: \$(spades.py --version 2>&1 | sed 's/^.*SPAdes genome assembler v//; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/spades/meta.yml b/modules/nf-core/modules/spades/meta.yml index 5a05e5f3..b6878d3d 100644 --- a/modules/nf-core/modules/spades/meta.yml +++ b/modules/nf-core/modules/spades/meta.yml @@ -20,11 +20,20 @@ input: description: | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] - - reads: + - illumina: type: file description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. + List of input FastQ (Illumina or PacBio CCS reads) files + of size 1 and 2 for single-end and paired-end data, + respectively. This input data type is required. + - pacbio: + type: file + description: | + List of input PacBio CLR FastQ files of size 1. + - nanopore: + type: file + description: | + List of input FastQ files of size 1, originating from Oxford Nanopore technology. 
- hmm: type: file description: @@ -39,31 +48,38 @@ output: type: file description: | Fasta file containing scaffolds + pattern: "*.fa.gz" - contigs: type: file description: | Fasta file containing contigs + pattern: "*.fa.gz" - transcripts: type: file description: | Fasta file containing transcripts + pattern: "*.fa.gz" - gene_clusters: type: file description: | Fasta file containing gene_clusters + pattern: "*.fa.gz" - gfa: type: file description: | gfa file containing assembly + pattern: "*.gfa.gz" - log: type: file description: | Spades log file - - version: + pattern: "*.log" + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@JoseEspinosa" - "@drpatelh" + - "@d4straub" diff --git a/modules/nf-core/modules/tabix/bgzip/functions.nf b/modules/nf-core/modules/tabix/bgzip/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/tabix/bgzip/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/tabix/bgzip/main.nf b/modules/nf-core/modules/tabix/bgzip/main.nf index 56a351db..ed9362b2 100644 --- a/modules/nf-core/modules/tabix/bgzip/main.nf +++ b/modules/nf-core/modules/tabix/bgzip/main.nf @@ -1,36 +1,28 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process TABIX_BGZIP { tag "$meta.id" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::tabix=0.2.6" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/tabix:0.2.6--ha92aebf_0" - } else { - container "quay.io/biocontainers/tabix:0.2.6--ha92aebf_0" - } + conda (params.enable_conda ? 'bioconda::tabix=1.11' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/tabix:1.11--hdfd78af_0' : + 'quay.io/biocontainers/tabix:1.11--hdfd78af_0' }" input: tuple val(meta), path(input) output: tuple val(meta), path("*.gz"), emit: gz - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? 
"${meta.id}${options.suffix}" : "${meta.id}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" """ - bgzip -c $options.args $input > ${prefix}.${input.getExtension()}.gz + bgzip -c $args $input > ${prefix}.${input.getExtension()}.gz - echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/(.*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/tabix/bgzip/meta.yml b/modules/nf-core/modules/tabix/bgzip/meta.yml index 686d72e6..f8318c7c 100644 --- a/modules/nf-core/modules/tabix/bgzip/meta.yml +++ b/modules/nf-core/modules/tabix/bgzip/meta.yml @@ -11,6 +11,7 @@ tools: homepage: https://www.htslib.org/doc/tabix.html documentation: http://www.htslib.org/doc/bgzip.html doi: 10.1093/bioinformatics/btp352 + licence: ['MIT'] input: - meta: type: map @@ -30,10 +31,10 @@ output: type: file description: Output compressed file pattern: "*.{gz}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/tabix/tabix/functions.nf b/modules/nf-core/modules/tabix/tabix/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/tabix/tabix/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/tabix/tabix/main.nf b/modules/nf-core/modules/tabix/tabix/main.nf index da23f535..c721a554 100644 --- a/modules/nf-core/modules/tabix/tabix/main.nf +++ b/modules/nf-core/modules/tabix/tabix/main.nf @@ -1,35 +1,27 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process TABIX_TABIX { tag "$meta.id" label 'process_medium' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } - conda (params.enable_conda ? "bioconda::tabix=0.2.6" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/tabix:0.2.6--ha92aebf_0" - } else { - container "quay.io/biocontainers/tabix:0.2.6--ha92aebf_0" - } + conda (params.enable_conda ? 'bioconda::tabix=1.11' : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/tabix:1.11--hdfd78af_0' : + 'quay.io/biocontainers/tabix:1.11--hdfd78af_0' }" input: tuple val(meta), path(tab) output: tuple val(meta), path("*.tbi"), emit: tbi - path "*.version.txt" , emit: version + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' """ - tabix $options.args $tab + tabix $args $tab - echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/(.*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/tabix/tabix/meta.yml b/modules/nf-core/modules/tabix/tabix/meta.yml index f66270db..2e37c4ff 100644 --- a/modules/nf-core/modules/tabix/tabix/meta.yml +++ b/modules/nf-core/modules/tabix/tabix/meta.yml @@ -10,6 +10,7 @@ tools: homepage: https://www.htslib.org/doc/tabix.html documentation: https://www.htslib.org/doc/tabix.1.html doi: 10.1093/bioinformatics/btq671 + licence: ['MIT'] input: - meta: type: map @@ -30,10 +31,10 @@ output: type: file description: tabix index file pattern: "*.{tbi}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/unicycler/functions.nf b/modules/nf-core/modules/unicycler/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/unicycler/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function 
to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/unicycler/main.nf b/modules/nf-core/modules/unicycler/main.nf index 320c0f29..1ccc72a9 100644 --- a/modules/nf-core/modules/unicycler/main.nf +++ b/modules/nf-core/modules/unicycler/main.nf @@ -1,47 +1,43 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process UNICYCLER { tag "$meta.id" label 'process_high' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:meta, publish_by_meta:['id']) } conda (params.enable_conda ? 'bioconda::unicycler=0.4.8' : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://depot.galaxyproject.org/singularity/unicycler:0.4.8--py38h8162308_3" - } else { - container "quay.io/biocontainers/unicycler:0.4.8--py38h8162308_3" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/unicycler:0.4.8--py38h8162308_3' : + 'quay.io/biocontainers/unicycler:0.4.8--py38h8162308_3' }" input: - tuple val(meta), path(reads) + tuple val(meta), path(shortreads), path(longreads) output: - tuple val(meta), path('*.scaffolds.fa'), emit: scaffolds - tuple val(meta), path('*.assembly.gfa'), emit: gfa - tuple val(meta), path('*.log') , emit: log - path '*.version.txt' , emit: version + tuple val(meta), path('*.scaffolds.fa.gz'), emit: scaffolds + tuple val(meta), path('*.assembly.gfa.gz'), emit: gfa + tuple val(meta), path('*.log') , emit: log + path "versions.yml" , emit: versions script: - def software = getSoftwareName(task.process) - def prefix = options.suffix ? "${meta.id}${options.suffix}" : "${meta.id}" - def input_reads = meta.single_end ? "-s $reads" : "-1 ${reads[0]} -2 ${reads[1]}" + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def short_reads = shortreads ? ( meta.single_end ? "-s $shortreads" : "-1 ${shortreads[0]} -2 ${shortreads[1]}" ) : "" + def long_reads = longreads ? "-l $longreads" : "" """ unicycler \\ --threads $task.cpus \\ - $options.args \\ - $input_reads \\ + $args \\ + $short_reads \\ + $long_reads \\ --out ./ mv assembly.fasta ${prefix}.scaffolds.fa + gzip -n ${prefix}.scaffolds.fa mv assembly.gfa ${prefix}.assembly.gfa + gzip -n ${prefix}.assembly.gfa mv unicycler.log ${prefix}.unicycler.log - echo \$(unicycler --version 2>&1) | sed 's/^.*Unicycler v//; s/ .*\$//' > ${software}.version.txt + cat <<-END_VERSIONS > versions.yml + "${task.process}": + unicycler: \$(echo \$(unicycler --version 2>&1) | sed 's/^.*Unicycler v//; s/ .*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/unicycler/meta.yml b/modules/nf-core/modules/unicycler/meta.yml index 286b7f67..b04ac882 100644 --- a/modules/nf-core/modules/unicycler/meta.yml +++ b/modules/nf-core/modules/unicycler/meta.yml @@ -19,37 +19,42 @@ input: description: | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] - - reads: + - shortreads: type: file description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, + List of input Illumina FastQ files of size 1 and 2 for single-end and paired-end data, respectively. + - longreads: + type: file + description: | + List of input FastQ files of size 1, PacBio or Nanopore long reads. output: - meta: type: map description: | Groovy Map containing sample information e.g. 
[ id:'test', single_end:false ] - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" - scaffolds: type: file description: Fasta file containing scaffolds - pattern: "*.{scaffolds.fa}" + pattern: "*.{scaffolds.fa.gz}" - gfa: type: file description: gfa file containing assembly - pattern: "*.{assembly.gfa}" + pattern: "*.{assembly.gfa.gz}" - log: type: file description: unicycler log file pattern: "*.{log}" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@JoseEspinosa" - "@drpatelh" + - "@d4straub" diff --git a/modules/nf-core/modules/untar/functions.nf b/modules/nf-core/modules/untar/functions.nf deleted file mode 100644 index da9da093..00000000 --- a/modules/nf-core/modules/untar/functions.nf +++ /dev/null @@ -1,68 +0,0 @@ -// -// Utility functions used in nf-core DSL2 module files -// - -// -// Extract name of software tool from process name using $task.process -// -def getSoftwareName(task_process) { - return task_process.tokenize(':')[-1].tokenize('_')[0].toLowerCase() -} - -// -// Function to initialise default values and to generate a Groovy Map of available options for nf-core modules -// -def initOptions(Map args) { - def Map options = [:] - options.args = args.args ?: '' - options.args2 = args.args2 ?: '' - options.args3 = args.args3 ?: '' - options.publish_by_meta = args.publish_by_meta ?: [] - options.publish_dir = args.publish_dir ?: '' - options.publish_files = args.publish_files - options.suffix = args.suffix ?: '' - return options -} - -// -// Tidy up and join elements of a list to return a path string -// -def getPathFromList(path_list) { - def paths = path_list.findAll { item -> !item?.trim().isEmpty() } // Remove empty entries - paths = paths.collect { it.trim().replaceAll("^[/]+|[/]+\$", "") } // Trim whitespace and trailing slashes - return paths.join('/') -} - -// -// Function to save/publish module results -// -def saveFiles(Map args) { - if (!args.filename.endsWith('.version.txt')) { - def ioptions = initOptions(args.options) - def path_list = [ ioptions.publish_dir ?: args.publish_dir ] - if (ioptions.publish_by_meta) { - def key_list = ioptions.publish_by_meta instanceof List ? ioptions.publish_by_meta : args.publish_by_meta - for (key in key_list) { - if (args.meta && key instanceof String) { - def path = key - if (args.meta.containsKey(key)) { - path = args.meta[key] instanceof Boolean ? "${key}_${args.meta[key]}".toString() : args.meta[key] - } - path = path instanceof String ? 
path : '' - path_list.add(path) - } - } - } - if (ioptions.publish_files instanceof Map) { - for (ext in ioptions.publish_files) { - if (args.filename.endsWith(ext.key)) { - def ext_list = path_list.collect() - ext_list.add(ext.value) - return "${getPathFromList(ext_list)}/$args.filename" - } - } - } else if (ioptions.publish_files == null) { - return "${getPathFromList(path_list)}/$args.filename" - } - } -} diff --git a/modules/nf-core/modules/untar/main.nf b/modules/nf-core/modules/untar/main.nf index fc6d7ec5..6d1996e7 100644 --- a/modules/nf-core/modules/untar/main.nf +++ b/modules/nf-core/modules/untar/main.nf @@ -1,35 +1,33 @@ -// Import generic module functions -include { initOptions; saveFiles; getSoftwareName } from './functions' - -params.options = [:] -options = initOptions(params.options) - process UNTAR { tag "$archive" label 'process_low' - publishDir "${params.outdir}", - mode: params.publish_dir_mode, - saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), meta:[:], publish_by_meta:[]) } conda (params.enable_conda ? "conda-forge::sed=4.7" : null) - if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) { - container "https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img" - } else { - container "biocontainers/biocontainers:v1.2.0_cv1" - } + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://containers.biocontainers.pro/s3/SingImgsRepo/biocontainers/v1.2.0_cv1/biocontainers_v1.2.0_cv1.img' : + 'biocontainers/biocontainers:v1.2.0_cv1' }" input: path archive output: - path "$untar" , emit: untar - path "*.version.txt", emit: version + path "$untar" , emit: untar + path "versions.yml", emit: versions script: - def software = getSoftwareName(task.process) + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' untar = archive.toString() - '.tar.gz' """ - tar -xzvf $options.args $archive - echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//' > ${software}.version.txt + tar \\ + -xzvf \\ + $args \\ + $archive \\ + $args2 \\ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//') + END_VERSIONS """ } diff --git a/modules/nf-core/modules/untar/meta.yml b/modules/nf-core/modules/untar/meta.yml index af4674f0..51f94995 100644 --- a/modules/nf-core/modules/untar/meta.yml +++ b/modules/nf-core/modules/untar/meta.yml @@ -8,6 +8,7 @@ tools: description: | Extract tar.gz files. documentation: https://www.gnu.org/software/tar/manual/ + licence: ['GPL-3.0-or-later'] input: - archive: type: file @@ -18,10 +19,10 @@ output: type: file description: pattern: "*.*" - - version: + - versions: type: file - description: File containing software version - pattern: "*.{version.txt}" + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/modules/vcflib/vcfuniq/main.nf b/modules/nf-core/modules/vcflib/vcfuniq/main.nf new file mode 100644 index 00000000..37fc51bc --- /dev/null +++ b/modules/nf-core/modules/vcflib/vcfuniq/main.nf @@ -0,0 +1,32 @@ +def VERSION = '1.0.2' // Version information not provided by tool on CLI + +process VCFLIB_VCFUNIQ { + tag "$meta.id" + label 'process_low' + + conda (params.enable_conda ? 
"bioconda::vcflib=1.0.2" : null) + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/vcflib:1.0.2--h3198e80_5': + 'quay.io/biocontainers/vcflib:1.0.2--h3198e80_5' }" + + input: + tuple val(meta), path(vcf), path(tbi) + + output: + tuple val(meta), path("*.gz"), emit: vcf + path "versions.yml" , emit: versions + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + vcfuniq \\ + $vcf \\ + | bgzip -c $args > ${prefix}.vcf.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + vcflib: $VERSION + END_VERSIONS + """ +} diff --git a/modules/nf-core/modules/vcflib/vcfuniq/meta.yml b/modules/nf-core/modules/vcflib/vcfuniq/meta.yml new file mode 100644 index 00000000..3bfc679b --- /dev/null +++ b/modules/nf-core/modules/vcflib/vcfuniq/meta.yml @@ -0,0 +1,46 @@ +name: vcflib_vcfuniq +description: List unique genotypes. Like GNU uniq, but for VCF records. Remove records which have the same position, ref, and alt as the previous record. +keywords: + - vcf + - uniq + - deduplicate +tools: + - vcflib: + description: Command-line tools for manipulating VCF files + homepage: https://github.com/vcflib/vcflib + documentation: https://github.com/vcflib/vcflib#USAGE + doi: "https://doi.org/10.1101/2021.05.21.445151" + licence: ['MIT'] + +input: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" + - tbi: + type: file + description: Index of VCF file + pattern: "*.vcf.gz.tbi" + +output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - versions: + type: file + description: File containing software versions + pattern: "versions.yml" + - vcf: + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" + +authors: + - "@drpatelh" diff --git a/nextflow.config b/nextflow.config index 0c310aab..19e1f2a0 100644 --- a/nextflow.config +++ b/nextflow.config @@ -10,119 +10,116 @@ params { // Input options - input = null - platform = null - protocol = null + input = null + platform = null + protocol = null // Reference genome options - genome = null - primer_set = null - primer_set_version = null - primer_fasta = null - primer_left_suffix = '_LEFT' - primer_right_suffix = '_RIGHT' - save_reference = false + genome = null + primer_set = null + primer_set_version = null + primer_fasta = null + primer_left_suffix = '_LEFT' + primer_right_suffix = '_RIGHT' + save_reference = false // Nanopore options - fastq_dir = null - fast5_dir = null - sequencing_summary = null - min_barcode_reads = 100 - min_guppyplex_reads = 10 - artic_minion_caller = 'nanopolish' - artic_minion_aligner = 'minimap2' - artic_minion_medaka_model = null - skip_pycoqc = false - skip_nanoplot = false + fastq_dir = null + fast5_dir = null + sequencing_summary = null + min_barcode_reads = 100 + min_guppyplex_reads = 10 + artic_minion_caller = 'nanopolish' + artic_minion_aligner = 'minimap2' + artic_minion_medaka_model = null + skip_pycoqc = false + skip_nanoplot = false // Nanopore/Illumina options - asciigenome_read_depth = 50 - asciigenome_window_size = 50 - multiqc_title = null - multiqc_config = null - max_multiqc_email_size = '25.MB' - skip_mosdepth = false - skip_pangolin = false - skip_nextclade = false - skip_asciigenome = false - skip_variants_quast = false - skip_multiqc = false + asciigenome_read_depth = 50 + asciigenome_window_size = 50 + multiqc_title = null + multiqc_config = null + max_multiqc_email_size = '25.MB' + skip_mosdepth = false + skip_pangolin = false + skip_nextclade = false + skip_variants_quast = false + skip_snpeff = false + skip_asciigenome = false + skip_variants_long_table = false + skip_multiqc = false // Illumina QC, read trimming and filtering options - kraken2_db = 's3://nf-core-awsmegatests/viralrecon/input_data/kraken2_human.tar.gz' - kraken2_db_name = 'human' + kraken2_db = 's3://nf-core-awsmegatests/viralrecon/input_data/kraken2_human.tar.gz' + kraken2_db_name = 'human' kraken2_variants_host_filter = false kraken2_assembly_host_filter = true - save_trimmed_fail = false - skip_fastqc = false - skip_kraken2 = false - skip_fastp = false - skip_cutadapt = false + save_trimmed_fail = false + skip_fastqc = false + skip_kraken2 = false + skip_fastp = false + skip_cutadapt = false // Illumina variant calling options - callers = null - min_mapped_reads = 1000 - ivar_trim_noprimer = false - ivar_trim_offset = null - filter_duplicates = false - save_unaligned = false - save_mpileup = false - skip_ivar_trim = false - skip_markduplicates = true - skip_picard_metrics = false - skip_snpeff = false - skip_consensus = false - skip_variants = false + variant_caller = null + consensus_caller = 'bcftools' + min_mapped_reads = 1000 + ivar_trim_noprimer = false + ivar_trim_offset = null + filter_duplicates = false + save_unaligned = false + save_mpileup = false + skip_ivar_trim = false + skip_markduplicates = true + skip_picard_metrics = false + skip_consensus_plots = false + skip_consensus = false + skip_variants = false // Illumina de novo assembly options - assemblers = 'spades' - spades_mode = 'rnaviral' 
- spades_hmm = null - blast_db = null - skip_bandage = false - skip_blast = false - skip_abacas = false - skip_plasmidid = false - skip_assembly_quast = false - skip_assembly = false + assemblers = 'spades' + spades_mode = 'rnaviral' + spades_hmm = null + blast_db = null + skip_bandage = false + skip_blast = false + skip_abacas = false + skip_plasmidid = false + skip_assembly_quast = false + skip_assembly = false // Boilerplate options - outdir = './results' - tracedir = "${params.outdir}/pipeline_info" - publish_dir_mode = 'copy' - email = null - email_on_fail = null - plaintext_email = false - monochrome_logs = false - help = false - enable_conda = false - singularity_pull_docker_container = false - validate_params = true - show_hidden_params = false - schema_ignore_params = 'genomes,modules,igenomes_base' + outdir = './results' + tracedir = "${params.outdir}/pipeline_info" + email = null + email_on_fail = null + plaintext_email = false + monochrome_logs = false + help = false + validate_params = true + show_hidden_params = false + schema_ignore_params = 'genomes,igenomes_base' + enable_conda = false // Config options - custom_config_version = 'master' - custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" - hostnames = [:] - config_profile_name = null - config_profile_description = null - config_profile_contact = null - config_profile_url = null + custom_config_version = 'master' + custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" + config_profile_description = null + config_profile_contact = null + config_profile_url = null + config_profile_name = null // Max resource options // Defaults only, expecting to be overwritten - max_memory = '128.GB' - max_cpus = 16 - max_time = '240.h' + max_memory = '128.GB' + max_cpus = 16 + max_time = '240.h' } // Load base.config by default for all pipelines includeConfig 'conf/base.config' -// Load modules.config for DSL2 module specific options -includeConfig 'conf/modules.config' - // Load nf-core custom profiles from different Institutions try { includeConfig "${params.custom_config_base}/nfcore_custom.config" @@ -199,10 +196,14 @@ profiles { conda { createTimeout = "120 min" } // Export these variables to prevent local Python/R libraries from conflicting with those in the container +// The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container. +// See https://apeltzer.github.io/post/03-julia-lang-nextflow/ for details on that. Once we have a common agreement on where to keep Julia packages, this is adjustable. 
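// These assignments live in Nextflow's top-level `env` scope, so they are exported
// into the environment of every task; PYTHONNOUSERSITE=1, for example, stops Python
// inside the containers from importing user-site packages installed on the host.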
+ env { PYTHONNOUSERSITE = 1 R_PROFILE_USER = "/.Rprofile" R_ENVIRON_USER = "/.Renviron" + JULIA_DEPOT_PATH = "/usr/local/share/julia" } // Capture exit codes from upstream processes when piping @@ -232,10 +233,13 @@ manifest { homePage = 'https://github.com/nf-core/viralrecon' description = 'Assembly and intrahost/low-frequency variant calling for viral samples' mainScript = 'main.nf' - nextflowVersion = '!>=21.04.0' - version = '2.2' + nextflowVersion = '!>=21.10.3' + version = '2.3' } +// Load modules.config for DSL2 module specific options +includeConfig 'conf/modules.config' + // Function to ensure that resource requirements don't go beyond // a maximum limit def check_max(obj, type) { diff --git a/nextflow_schema.json b/nextflow_schema.json index 965b8b21..553f338c 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -24,12 +24,20 @@ "platform": { "type": "string", "fa_icon": "fas fa-hdd", - "description": "NGS platform used to sequence the samples" + "description": "NGS platform used to sequence the samples.", + "enum": [ + "illumina", + "nanopore" + ] }, "protocol": { "type": "string", - "description": "Specifies the type of protocol used for sequencing i.e. 'metagenomic' or 'amplicon'.", - "fa_icon": "fas fa-vials" + "description": "Specifies the type of protocol used for sequencing.", + "fa_icon": "fas fa-vials", + "enum": [ + "metagenomic", + "amplicon" + ] }, "outdir": { "type": "string", @@ -176,14 +184,22 @@ "artic_minion_caller": { "type": "string", "default": "nanopolish", - "description": "Variant caller used when running artic minion. Available options are 'medaka' and 'nanopolish' (default).", - "fa_icon": "fas fa-phone-volume" + "description": "Variant caller used when running artic minion (default: 'nanopolish').", + "fa_icon": "fas fa-phone-volume", + "enum": [ + "nanopolish", + "medaka" + ] }, "artic_minion_aligner": { "type": "string", "default": "minimap2", - "description": "Aligner used when running artic minion. Available options are 'bwa' and 'minimap2' (default).", - "fa_icon": "fas fa-map-signs" + "description": "Aligner used when running artic minion (default: 'minimap2').", + "fa_icon": "fas fa-map-signs", + "enum": [ + "minimap2", + "bwa" + ] }, "artic_scheme": { "type": "string", @@ -215,6 +231,26 @@ "description": "Options common to both the Nanopore and Illumina workflows in the pipeline.", "default": "", "properties": { + "nextclade_dataset": { + "type": "string", + "description": "Full path to Nextclade dataset required for 'nextclade run' command.", + "fa_icon": "fas fa-project-diagram" + }, + "nextclade_dataset_name": { + "type": "string", + "description": "Name of Nextclade dataset to retrieve. A list of available datasets can be obtained using the 'nextclade dataset list' command.", + "fa_icon": "fas fa-project-diagram" + }, + "nextclade_dataset_reference": { + "type": "string", + "description": "Accession id to download dataset based on a particular reference sequence. A list of available datasets can be obtained using the 'nextclade dataset list' command.", + "fa_icon": "fas fa-project-diagram" + }, + "nextclade_dataset_tag": { + "type": "string", + "description": "Version tag of the dataset to download. A list of available datasets can be obtained using the 'nextclade dataset list' command.", + "fa_icon": "fas fa-project-diagram" + }, "asciigenome_read_depth": { "type": "integer", "default": 50, @@ -274,6 +310,11 @@ "fa_icon": "fas fa-fast-forward", "description": "Skip generation of QUAST aggregated report for consensus sequences." 
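// The `nextclade_dataset*` options added above map onto the arguments of the
// `nextclade dataset get` command. A hedged usage sketch (the dataset name,
// reference accession and tag below are illustrative values, not defaults):
// nextflow run nf-core/viralrecon \
//     --platform illumina \
//     --protocol amplicon \
//     --nextclade_dataset_name 'sars-cov-2' \
//     --nextclade_dataset_reference 'MN908947' \
//     --nextclade_dataset_tag '2022-01-18T12:00:00Z' \
//     -profile docker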
}, + "skip_variants_long_table": { + "type": "boolean", + "fa_icon": "fas fa-fast-forward", + "description": "Skip long table generation for reporting variants." + }, "skip_multiqc": { "type": "boolean", "fa_icon": "fas fa-fast-forward", @@ -347,10 +388,24 @@ "description": "Various options for the variant calling branch of the Illumina workflow.", "default": "", "properties": { - "callers": { + "variant_caller": { + "type": "string", + "fa_icon": "fas fa-phone-volume", + "description": "Specify which variant calling algorithm you would like to use. Available options are 'ivar' (default for '--protocol amplicon') and 'bcftools' (default for '--protocol metagenomic').", + "enum": [ + "ivar", + "bcftools" + ] + }, + "consensus_caller": { "type": "string", - "description": "Specify which variant calling algorithms you would like to use. Available options are 'ivar' (default for '--protocol amplicon') and 'bcftools' (default for '--protocol metagenomic').", - "fa_icon": "fas fa-phone-volume" + "default": "bcftools", + "fa_icon": "fas fa-phone-volume", + "description": "Specify which consensus calling algorithm you would like to use. Available options are 'bcftools' and 'ivar' (default: 'bcftools').", + "enum": [ + "ivar", + "bcftools" + ] }, "min_mapped_reads": { "type": "integer", @@ -405,6 +460,11 @@ "fa_icon": "fas fa-fast-forward", "description": "Skip SnpEff and SnpSift annotation of variants." }, + "skip_consensus_plots": { + "type": "boolean", + "fa_icon": "fas fa-fast-forward", + "description": "Skip creation of consensus base density plots." + }, "skip_consensus": { "type": "boolean", "fa_icon": "fas fa-fast-forward", @@ -434,7 +494,18 @@ "type": "string", "default": "rnaviral", "fa_icon": "fab fa-digg", - "description": "Specify the SPAdes mode you would like to run. Supported options are 'rnaviral', 'corona', 'metaviral', 'meta', 'metaplasmid', 'plasmid', 'isolate', 'rna', 'bio'." + "description": "Specify the SPAdes mode you would like to run (default: 'rnaviral').", + "enum": [ + "rnaviral", + "corona", + "metaviral", + "meta", + "metaplasmid", + "plasmid", + "isolate", + "rna", + "bio" + ] }, "spades_hmm": { "type": "string", @@ -494,22 +565,6 @@ "hidden": true, "fa_icon": "fas fa-question-circle" }, - "publish_dir_mode": { - "type": "string", - "default": "copy", - "hidden": true, - "description": "Method used to save pipeline results to output directory.", - "help_text": "The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.", - "fa_icon": "fas fa-copy", - "enum": [ - "symlink", - "rellink", - "link", - "copy", - "copyNoFollow", - "move" - ] - }, "email_on_fail": { "type": "string", "description": "Email address for completion summary, only when pipeline fails.", @@ -546,13 +601,6 @@ "description": "Run this workflow with Conda. You can also use '-profile conda' instead of providing this parameter.", "fa_icon": "fas fa-bacon" }, - "singularity_pull_docker_container": { - "type": "boolean", - "hidden": true, - "description": "Instead of directly downloading Singularity images for use with Singularity, force the workflow to pull and convert Docker containers instead.", - "fa_icon": "fas fa-toolbox", - "help_text": "This may be useful for example if you are unable to directly pull Singularity containers to run the pipeline due to http/https proxy issues." 
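// With the `enum` lists declared above, schema validation constrains these
// options at launch: a hypothetical `--variant_caller lofreq` now fails
// `--validate_params` immediately instead of erroring somewhere mid-run.
// A hedged sketch of a launch using only values this schema permits (the
// samplesheet path is illustrative):
// nextflow run nf-core/viralrecon \
//     --input samplesheet.csv --platform illumina --protocol amplicon \
//     --variant_caller bcftools --consensus_caller ivar \
//     -profile docker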
- }, "validate_params": { "type": "boolean", "description": "Boolean whether to validate parameters against the schema at runtime", @@ -628,12 +676,6 @@ "help_text": "If you're running offline, nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell nextflow where to find them with the `custom_config_base` option. For example:\n\n```bash\n## Download and unzip the config files\ncd /path/to/my/configs\nwget https://github.com/nf-core/configs/archive/master.zip\nunzip master.zip\n\n## Run the pipeline\ncd /path/to/my/data\nnextflow run /path/to/pipeline/ --custom_config_base /path/to/my/configs/configs-master/\n```\n\n> Note that the nf-core/tools helper package has a `download` command to download all required pipeline files + singularity containers + institutional configs in one go for you, to make this process easier.", "fa_icon": "fas fa-users-cog" }, - "hostnames": { - "type": "string", - "description": "Institutional configs hostname.", - "hidden": true, - "fa_icon": "fas fa-users-cog" - }, "config_profile_name": { "type": "string", "description": "Institutional config name.", diff --git a/subworkflows/local/assembly_minia.nf b/subworkflows/local/assembly_minia.nf index 16e755e5..43d34c6e 100644 --- a/subworkflows/local/assembly_minia.nf +++ b/subworkflows/local/assembly_minia.nf @@ -2,15 +2,9 @@ // Assembly and downstream processing for minia scaffolds // -params.minia_options = [:] -params.blastn_options = [:] -params.blastn_filter_options = [:] -params.abacas_options = [:] -params.plasmidid_options = [:] -params.quast_options = [:] +include { MINIA } from '../../modules/nf-core/modules/minia/main' -include { MINIA } from '../../modules/nf-core/modules/minia/main' addParams( options: params.minia_options ) -include { ASSEMBLY_QC } from './assembly_qc' addParams( blastn_options: params.blastn_options, blastn_filter_options: params.blastn_filter_options, abacas_options: params.abacas_options, plasmidid_options: params.plasmidid_options, quast_options: params.quast_options ) +include { ASSEMBLY_QC } from './assembly_qc' workflow ASSEMBLY_MINIA { take: @@ -22,10 +16,15 @@ workflow ASSEMBLY_MINIA { main: + ch_versions = Channel.empty() + // // Assemble reads with minia // - MINIA ( reads ) + MINIA ( + reads + ) + ch_versions = ch_versions.mix(MINIA.out.versions.first()) // // Filter for empty contig files @@ -46,23 +45,20 @@ workflow ASSEMBLY_MINIA { blast_db, blast_header ) + ch_versions = ch_versions.mix(ASSEMBLY_QC.out.versions) emit: contigs = MINIA.out.contigs // channel: [ val(meta), [ contigs ] ] unitigs = MINIA.out.unitigs // channel: [ val(meta), [ unitigs ] ] h5 = MINIA.out.h5 // channel: [ val(meta), [ h5 ] ] - minia_version = MINIA.out.version // path: *.version.txt blast_txt = ASSEMBLY_QC.out.blast_txt // channel: [ val(meta), [ txt ] ] blast_filter_txt = ASSEMBLY_QC.out.blast_filter_txt // channel: [ val(meta), [ txt ] ] - blast_version = ASSEMBLY_QC.out.blast_version // path: *.version.txt quast_results = ASSEMBLY_QC.out.quast_results // channel: [ val(meta), [ results ] ] quast_tsv = ASSEMBLY_QC.out.quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ASSEMBLY_QC.out.quast_version // path: *.version.txt abacas_results = ASSEMBLY_QC.out.abacas_results // channel: [ val(meta), [ results ] ] - abacas_version = ASSEMBLY_QC.out.abacas_version // path: *.version.txt plasmidid_html = ASSEMBLY_QC.out.plasmidid_html // 
channel: [ val(meta), [ html ] ] plasmidid_tab = ASSEMBLY_QC.out.plasmidid_tab // channel: [ val(meta), [ tab ] ] @@ -72,5 +68,6 @@ workflow ASSEMBLY_MINIA { plasmidid_database = ASSEMBLY_QC.out.plasmidid_database // channel: [ val(meta), [ database/ ] ] plasmidid_fasta = ASSEMBLY_QC.out.plasmidid_fasta // channel: [ val(meta), [ fasta_files/ ] ] plasmidid_kmer = ASSEMBLY_QC.out.plasmidid_kmer // channel: [ val(meta), [ kmer/ ] ] - plasmidid_version = ASSEMBLY_QC.out.plasmidid_version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/assembly_qc.nf b/subworkflows/local/assembly_qc.nf index 090a85a0..9c42c652 100644 --- a/subworkflows/local/assembly_qc.nf +++ b/subworkflows/local/assembly_qc.nf @@ -2,17 +2,11 @@ // Downstream analysis for assembly scaffolds // -params.blastn_options = [:] -params.blastn_filter_options = [:] -params.abacas_options = [:] -params.plasmidid_options = [:] -params.quast_options = [:] - -include { FILTER_BLASTN } from '../../modules/local/filter_blastn' addParams( options: params.blastn_filter_options ) -include { ABACAS } from '../../modules/nf-core/modules/abacas/main' addParams( options: params.abacas_options ) -include { BLAST_BLASTN } from '../../modules/nf-core/modules/blast/blastn/main' addParams( options: params.blastn_options ) -include { PLASMIDID } from '../../modules/nf-core/modules/plasmidid/main' addParams( options: params.plasmidid_options ) -include { QUAST } from '../../modules/nf-core/modules/quast/main' addParams( options: params.quast_options ) +include { FILTER_BLASTN } from '../../modules/local/filter_blastn' +include { ABACAS } from '../../modules/nf-core/modules/abacas/main' +include { BLAST_BLASTN } from '../../modules/nf-core/modules/blast/blastn/main' +include { PLASMIDID } from '../../modules/nf-core/modules/plasmidid/main' +include { QUAST } from '../../modules/nf-core/modules/quast/main' workflow ASSEMBLY_QC { take: @@ -24,19 +18,27 @@ workflow ASSEMBLY_QC { main: + ch_versions = Channel.empty() + // // Run blastn on assembly scaffolds // ch_blast_txt = Channel.empty() ch_blast_filter_txt = Channel.empty() - ch_blast_version = Channel.empty() if (!params.skip_blast) { - BLAST_BLASTN ( scaffolds, blast_db ) - ch_blast_txt = BLAST_BLASTN.out.txt - ch_blast_version = BLAST_BLASTN.out.version + BLAST_BLASTN ( + scaffolds, + blast_db + ) + ch_blast_txt = BLAST_BLASTN.out.txt + ch_versions = ch_versions.mix(BLAST_BLASTN.out.versions.first()) - FILTER_BLASTN ( BLAST_BLASTN.out.txt, blast_header ) + FILTER_BLASTN ( + BLAST_BLASTN.out.txt, + blast_header + ) ch_blast_filter_txt = FILTER_BLASTN.out.txt + ch_versions = ch_versions.mix(FILTER_BLASTN.out.versions.first()) } // @@ -44,23 +46,30 @@ workflow ASSEMBLY_QC { // ch_quast_results = Channel.empty() ch_quast_tsv = Channel.empty() - ch_quast_version = Channel.empty() if (!params.skip_assembly_quast) { - QUAST ( scaffolds.collect{ it[1] }, fasta, gff, true, params.gff ) + QUAST ( + scaffolds.collect{ it[1] }, + fasta, + gff, + true, + params.gff + ) ch_quast_results = QUAST.out.results ch_quast_tsv = QUAST.out.tsv - ch_quast_version = QUAST.out.version + ch_versions = ch_versions.mix(QUAST.out.versions) } // // Contiguate assembly with ABACAS // ch_abacas_results = Channel.empty() - ch_abacas_version = Channel.empty() if (!params.skip_abacas) { - ABACAS ( scaffolds, fasta ) + ABACAS ( + scaffolds, + fasta + ) ch_abacas_results = ABACAS.out.results - ch_abacas_version = ABACAS.out.version + ch_versions = 
ch_versions.mix(ABACAS.out.versions.first()) } // @@ -74,9 +83,11 @@ workflow ASSEMBLY_QC { ch_plasmidid_database = Channel.empty() ch_plasmidid_fasta = Channel.empty() ch_plasmidid_kmer = Channel.empty() - ch_plasmidid_version = Channel.empty() if (!params.skip_plasmidid) { - PLASMIDID ( scaffolds, fasta ) + PLASMIDID ( + scaffolds, + fasta + ) ch_plasmidid_html = PLASMIDID.out.html ch_plasmidid_tab = PLASMIDID.out.tab ch_plasmidid_images = PLASMIDID.out.images @@ -85,20 +96,17 @@ workflow ASSEMBLY_QC { ch_plasmidid_database = PLASMIDID.out.database ch_plasmidid_fasta = PLASMIDID.out.fasta_files ch_plasmidid_kmer = PLASMIDID.out.kmer - ch_plasmidid_version = PLASMIDID.out.version + ch_versions = ch_versions.mix(PLASMIDID.out.versions.first()) } emit: blast_txt = ch_blast_txt // channel: [ val(meta), [ txt ] ] blast_filter_txt = ch_blast_filter_txt // channel: [ val(meta), [ txt ] ] - blast_version = ch_blast_version // path: *.version.txt quast_results = ch_quast_results // channel: [ val(meta), [ results ] ] quast_tsv = ch_quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ch_quast_version // path: *.version.txt abacas_results = ch_abacas_results // channel: [ val(meta), [ results ] ] - abacas_version = ch_abacas_version // path: *.version.txt plasmidid_html = ch_plasmidid_html // channel: [ val(meta), [ html ] ] plasmidid_tab = ch_plasmidid_tab // channel: [ val(meta), [ tab ] ] @@ -108,5 +116,6 @@ workflow ASSEMBLY_QC { plasmidid_database = ch_plasmidid_database // channel: [ val(meta), [ database/ ] ] plasmidid_fasta = ch_plasmidid_fasta // channel: [ val(meta), [ fasta_files/ ] ] plasmidid_kmer = ch_plasmidid_kmer // channel: [ val(meta), [ kmer/ ] ] - plasmidid_version = ch_plasmidid_version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/assembly_spades.nf b/subworkflows/local/assembly_spades.nf index 205a67d4..46457c23 100644 --- a/subworkflows/local/assembly_spades.nf +++ b/subworkflows/local/assembly_spades.nf @@ -2,70 +2,88 @@ // Assembly and downstream processing for SPAdes scaffolds // -params.spades_options = [:] -params.bandage_options = [:] -params.blastn_options = [:] -params.blastn_filter_options = [:] -params.abacas_options = [:] -params.plasmidid_options = [:] -params.quast_options = [:] - -include { SPADES } from '../../modules/nf-core/modules/spades/main' addParams( options: params.spades_options ) -include { BANDAGE_IMAGE } from '../../modules/nf-core/modules/bandage/image/main' addParams( options: params.bandage_options ) -include { ASSEMBLY_QC } from './assembly_qc' addParams( blastn_options: params.blastn_options, blastn_filter_options: params.blastn_filter_options, abacas_options: params.abacas_options, plasmidid_options: params.plasmidid_options, quast_options: params.quast_options ) +include { SPADES } from '../../modules/nf-core/modules/spades/main' +include { BANDAGE_IMAGE } from '../../modules/nf-core/modules/bandage/image/main' +include { GUNZIP as GUNZIP_SCAFFOLDS } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_GFA } from '../../modules/nf-core/modules/gunzip/main' + +include { ASSEMBLY_QC } from './assembly_qc' workflow ASSEMBLY_SPADES { take: - reads // channel: [ val(meta), [ reads ] ] - hmm // channel: /path/to/spades.hmm - fasta // channel: /path/to/genome.fasta - gff // channel: /path/to/genome.gff - blast_db // channel: /path/to/blast_db/ - blast_header // channel: /path/to/blast_header.txt + reads // channel: [ val(meta), [ reads ] ] + mode 
// string : spades assembly mode e.g. 'rnaviral' + hmm // channel: /path/to/spades.hmm + fasta // channel: /path/to/genome.fasta + gff // channel: /path/to/genome.gff + blast_db // channel: /path/to/blast_db/ + blast_header // channel: /path/to/blast_header.txt main: + ch_versions = Channel.empty() + // // Filter for paired-end samples if running metaSPAdes / metaviralSPAdes / metaplasmidSPAdes // ch_reads = reads - if (params.spades_options.args.contains('--meta') || params.spades_options.args.contains('--bio')) { + if (mode.contains('meta') || mode.contains('bio')) { reads - .filter { meta, fastq -> !meta.single_end } + .filter { meta, illumina, pacbio, nanopore -> !meta.single_end } .set { ch_reads } } // // Assemble reads with SPAdes // - SPADES ( ch_reads, hmm ) + SPADES ( + ch_reads, + hmm + ) + ch_versions = ch_versions.mix(SPADES.out.versions.first()) + + // + // Unzip scaffolds file + // + GUNZIP_SCAFFOLDS ( + SPADES.out.scaffolds + ) + ch_versions = ch_versions.mix(GUNZIP_SCAFFOLDS.out.versions.first()) + + // + // Unzip gfa file + // + GUNZIP_GFA ( + SPADES.out.gfa + ) // // Filter for empty scaffold files // - SPADES + GUNZIP_SCAFFOLDS .out - .scaffolds + .gunzip .filter { meta, scaffold -> scaffold.size() > 0 } .set { ch_scaffolds } - SPADES + GUNZIP_GFA .out - .gfa + .gunzip .filter { meta, gfa -> gfa.size() > 0 } .set { ch_gfa } // // Generate assembly visualisation with Bandage // - ch_bandage_png = Channel.empty() - ch_bandage_svg = Channel.empty() - ch_bandage_version = Channel.empty() + ch_bandage_png = Channel.empty() + ch_bandage_svg = Channel.empty() if (!params.skip_bandage) { - BANDAGE_IMAGE ( ch_gfa ) - ch_bandage_version = BANDAGE_IMAGE.out.version - ch_bandage_png = BANDAGE_IMAGE.out.png - ch_bandage_svg = BANDAGE_IMAGE.out.svg + BANDAGE_IMAGE ( + ch_gfa + ) + ch_bandage_png = BANDAGE_IMAGE.out.png + ch_bandage_svg = BANDAGE_IMAGE.out.svg + ch_versions = ch_versions.mix(BANDAGE_IMAGE.out.versions.first()) } // @@ -78,6 +96,7 @@ workflow ASSEMBLY_SPADES { blast_db, blast_header ) + ch_versions = ch_versions.mix(ASSEMBLY_QC.out.versions) emit: scaffolds = SPADES.out.scaffolds // channel: [ val(meta), [ scaffolds ] ] @@ -86,22 +105,17 @@ workflow ASSEMBLY_SPADES { gene_clusters = SPADES.out.gene_clusters // channel: [ val(meta), [ gene_clusters ] ] gfa = SPADES.out.gfa // channel: [ val(meta), [ gfa ] ] log_out = SPADES.out.log // channel: [ val(meta), [ log ] ] - spades_version = SPADES.out.version // path: *.version.txt bandage_png = ch_bandage_png // channel: [ val(meta), [ png ] ] bandage_svg = ch_bandage_svg // channel: [ val(meta), [ svg ] ] - bandage_version = ch_bandage_version // path: *.version.txt blast_txt = ASSEMBLY_QC.out.blast_txt // channel: [ val(meta), [ txt ] ] blast_filter_txt = ASSEMBLY_QC.out.blast_filter_txt // channel: [ val(meta), [ txt ] ] - blast_version = ASSEMBLY_QC.out.blast_version // path: *.version.txt quast_results = ASSEMBLY_QC.out.quast_results // channel: [ val(meta), [ results ] ] quast_tsv = ASSEMBLY_QC.out.quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ASSEMBLY_QC.out.quast_version // path: *.version.txt abacas_results = ASSEMBLY_QC.out.abacas_results // channel: [ val(meta), [ results ] ] - abacas_version = ASSEMBLY_QC.out.abacas_version // path: *.version.txt plasmidid_html = ASSEMBLY_QC.out.plasmidid_html // channel: [ val(meta), [ html ] ] plasmidid_tab = ASSEMBLY_QC.out.plasmidid_tab // channel: [ val(meta), [ tab ] ] @@ -111,5 +125,6 @@ workflow ASSEMBLY_SPADES { plasmidid_database = 
ASSEMBLY_QC.out.plasmidid_database // channel: [ val(meta), [ database/ ] ] plasmidid_fasta = ASSEMBLY_QC.out.plasmidid_fasta // channel: [ val(meta), [ fasta_files/ ] ] plasmidid_kmer = ASSEMBLY_QC.out.plasmidid_kmer // channel: [ val(meta), [ kmer/ ] ] - plasmidid_version = ASSEMBLY_QC.out.plasmidid_version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/assembly_unicycler.nf b/subworkflows/local/assembly_unicycler.nf index 3d2d7787..88df5ef5 100644 --- a/subworkflows/local/assembly_unicycler.nf +++ b/subworkflows/local/assembly_unicycler.nf @@ -2,59 +2,75 @@ // Assembly and downstream processing for Unicycler scaffolds // -params.unicycler_options = [:] -params.bandage_options = [:] -params.blastn_options = [:] -params.blastn_filter_options = [:] -params.abacas_options = [:] -params.plasmidid_options = [:] -params.quast_options = [:] - -include { UNICYCLER } from '../../modules/nf-core/modules/unicycler/main' addParams( options: params.unicycler_options ) -include { BANDAGE_IMAGE } from '../../modules/nf-core/modules/bandage/image/main' addParams( options: params.bandage_options ) -include { ASSEMBLY_QC } from './assembly_qc' addParams( blastn_options: params.blastn_options, blastn_filter_options: params.blastn_filter_options, abacas_options: params.abacas_options, plasmidid_options: params.plasmidid_options, quast_options: params.quast_options ) +include { UNICYCLER } from '../../modules/nf-core/modules/unicycler/main' +include { BANDAGE_IMAGE } from '../../modules/nf-core/modules/bandage/image/main' +include { GUNZIP as GUNZIP_SCAFFOLDS } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_GFA } from '../../modules/nf-core/modules/gunzip/main' + +include { ASSEMBLY_QC } from './assembly_qc' workflow ASSEMBLY_UNICYCLER { take: - reads // channel: [ val(meta), [ reads ] ] - fasta // channel: /path/to/genome.fasta - gff // channel: /path/to/genome.gff - blast_db // channel: /path/to/blast_db/ - blast_header // channel: /path/to/blast_header.txt + reads // channel: [ val(meta), [ reads ] ] + fasta // channel: /path/to/genome.fasta + gff // channel: /path/to/genome.gff + blast_db // channel: /path/to/blast_db/ + blast_header // channel: /path/to/blast_header.txt main: + ch_versions = Channel.empty() + // // Assemble reads with Unicycler // - UNICYCLER ( reads ) + UNICYCLER ( + reads + ) + ch_versions = ch_versions.mix(UNICYCLER.out.versions.first()) + + // + // Unzip scaffolds file + // + GUNZIP_SCAFFOLDS ( + UNICYCLER.out.scaffolds + ) + ch_versions = ch_versions.mix(GUNZIP_SCAFFOLDS.out.versions.first()) + + // + // Unzip gfa file + // + GUNZIP_GFA ( + UNICYCLER.out.gfa + ) // // Filter for empty scaffold files // - UNICYCLER + GUNZIP_SCAFFOLDS .out - .scaffolds + .gunzip .filter { meta, scaffold -> scaffold.size() > 0 } .set { ch_scaffolds } - UNICYCLER + GUNZIP_GFA .out - .gfa + .gunzip .filter { meta, gfa -> gfa.size() > 0 } .set { ch_gfa } // // Generate assembly visualisation with Bandage // - ch_bandage_png = Channel.empty() - ch_bandage_svg = Channel.empty() - ch_bandage_version = Channel.empty() + ch_bandage_png = Channel.empty() + ch_bandage_svg = Channel.empty() if (!params.skip_bandage) { - BANDAGE_IMAGE ( ch_gfa ) - ch_bandage_version = BANDAGE_IMAGE.out.version - ch_bandage_png = BANDAGE_IMAGE.out.png - ch_bandage_svg = BANDAGE_IMAGE.out.svg + BANDAGE_IMAGE ( + ch_gfa + ) + ch_bandage_png = BANDAGE_IMAGE.out.png + ch_bandage_svg = BANDAGE_IMAGE.out.svg + ch_versions = 
ch_versions.mix(BANDAGE_IMAGE.out.versions.first()) } // @@ -67,27 +83,23 @@ workflow ASSEMBLY_UNICYCLER { blast_db, blast_header ) + ch_versions = ch_versions.mix(ASSEMBLY_QC.out.versions) emit: scaffolds = UNICYCLER.out.scaffolds // channel: [ val(meta), [ scaffolds ] ] gfa = UNICYCLER.out.gfa // channel: [ val(meta), [ gfa ] ] log_out = UNICYCLER.out.log // channel: [ val(meta), [ log ] ] - unicycler_version = UNICYCLER.out.version // path: *.version.txt bandage_png = ch_bandage_png // channel: [ val(meta), [ png ] ] bandage_svg = ch_bandage_svg // channel: [ val(meta), [ svg ] ] - bandage_version = ch_bandage_version // path: *.version.txt blast_txt = ASSEMBLY_QC.out.blast_txt // channel: [ val(meta), [ txt ] ] blast_filter_txt = ASSEMBLY_QC.out.blast_filter_txt // channel: [ val(meta), [ txt ] ] - blast_version = ASSEMBLY_QC.out.blast_version // path: *.version.txt quast_results = ASSEMBLY_QC.out.quast_results // channel: [ val(meta), [ results ] ] quast_tsv = ASSEMBLY_QC.out.quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ASSEMBLY_QC.out.quast_version // path: *.version.txt abacas_results = ASSEMBLY_QC.out.abacas_results // channel: [ val(meta), [ results ] ] - abacas_version = ASSEMBLY_QC.out.abacas_version // path: *.version.txt plasmidid_html = ASSEMBLY_QC.out.plasmidid_html // channel: [ val(meta), [ html ] ] plasmidid_tab = ASSEMBLY_QC.out.plasmidid_tab // channel: [ val(meta), [ tab ] ] @@ -97,5 +109,6 @@ workflow ASSEMBLY_UNICYCLER { plasmidid_database = ASSEMBLY_QC.out.plasmidid_database // channel: [ val(meta), [ database/ ] ] plasmidid_fasta = ASSEMBLY_QC.out.plasmidid_fasta // channel: [ val(meta), [ fasta_files/ ] ] plasmidid_kmer = ASSEMBLY_QC.out.plasmidid_kmer // channel: [ val(meta), [ kmer/ ] ] - plasmidid_version = ASSEMBLY_QC.out.plasmidid_version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/consensus_bcftools.nf b/subworkflows/local/consensus_bcftools.nf new file mode 100644 index 00000000..3f82151e --- /dev/null +++ b/subworkflows/local/consensus_bcftools.nf @@ -0,0 +1,108 @@ +// +// Consensus calling with BCFTools and downstream processing QC +// + +include { BCFTOOLS_FILTER } from '../../modules/nf-core/modules/bcftools/filter/main' +include { TABIX_TABIX } from '../../modules/nf-core/modules/tabix/tabix/main' +include { BEDTOOLS_MERGE } from '../../modules/nf-core/modules/bedtools/merge/main' +include { BEDTOOLS_MASKFASTA } from '../../modules/nf-core/modules/bedtools/maskfasta/main' +include { BCFTOOLS_CONSENSUS } from '../../modules/nf-core/modules/bcftools/consensus/main' +include { MAKE_BED_MASK } from '../../modules/local/make_bed_mask' +include { RENAME_FASTA_HEADER } from '../../modules/local/rename_fasta_header' +include { CONSENSUS_QC } from './consensus_qc' + +workflow CONSENSUS_BCFTOOLS { + take: + bam // channel: [ val(meta), [ bam ] ] + vcf // channel: [ val(meta), [ vcf ] ] + tbi // channel: [ val(meta), [ tbi ] ] + fasta // channel: /path/to/genome.fasta + gff // channel: /path/to/genome.gff + nextclade_db // channel: /path/to/nextclade_db/ + + main: + + ch_versions = Channel.empty() + + // + // Filter variants by allele frequency, zip and index + // + BCFTOOLS_FILTER ( + vcf + ) + ch_versions = ch_versions.mix(BCFTOOLS_FILTER.out.versions.first()) + + TABIX_TABIX ( + BCFTOOLS_FILTER.out.vcf + ) + ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first()) + + // + // Create BED file with consensus regions to mask + // + MAKE_BED_MASK ( + 
bam.join(BCFTOOLS_FILTER.out.vcf, by: [0]), + fasta, + params.save_mpileup + ) + ch_versions = ch_versions.mix(MAKE_BED_MASK.out.versions.first()) + + // + // Merge intervals with BEDTools + // + BEDTOOLS_MERGE ( + MAKE_BED_MASK.out.bed + ) + ch_versions = ch_versions.mix(BEDTOOLS_MERGE.out.versions.first()) + + // + // Mask regions in consensus with BEDTools + // + BEDTOOLS_MASKFASTA ( + BEDTOOLS_MERGE.out.bed, + fasta + ) + ch_versions = ch_versions.mix(BEDTOOLS_MASKFASTA.out.versions.first()) + + // + // Call consensus sequence with BCFTools + // + BCFTOOLS_CONSENSUS ( + BCFTOOLS_FILTER.out.vcf.join(TABIX_TABIX.out.tbi, by: [0]).join(BEDTOOLS_MASKFASTA.out.fasta, by: [0]) + ) + ch_versions = ch_versions.mix(BCFTOOLS_CONSENSUS.out.versions.first()) + + // + // Rename consensus header adding sample name + // + RENAME_FASTA_HEADER ( + BCFTOOLS_CONSENSUS.out.fasta + ) + ch_versions = ch_versions.mix(RENAME_FASTA_HEADER.out.versions.first()) + + // + // Consensus sequence QC + // + CONSENSUS_QC ( + RENAME_FASTA_HEADER.out.fasta, + fasta, + gff, + nextclade_db + ) + ch_versions = ch_versions.mix(CONSENSUS_QC.out.versions.first()) + + emit: + consensus = RENAME_FASTA_HEADER.out.fasta // channel: [ val(meta), [ fasta ] ] + + quast_results = CONSENSUS_QC.out.quast_results // channel: [ val(meta), [ results ] ] + quast_tsv = CONSENSUS_QC.out.quast_tsv // channel: [ val(meta), [ tsv ] ] + + pangolin_report = CONSENSUS_QC.out.pangolin_report // channel: [ val(meta), [ csv ] ] + + nextclade_report = CONSENSUS_QC.out.nextclade_report // channel: [ val(meta), [ csv ] ] + + bases_tsv = CONSENSUS_QC.out.bases_tsv // channel: [ val(meta), [ tsv ] ] + bases_pdf = CONSENSUS_QC.out.bases_pdf // channel: [ val(meta), [ pdf ] ] + + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/local/consensus_ivar.nf b/subworkflows/local/consensus_ivar.nf new file mode 100644 index 00000000..15fc326e --- /dev/null +++ b/subworkflows/local/consensus_ivar.nf @@ -0,0 +1,55 @@ +// +// Consensus calling with iVar and downstream processing QC +// + +include { IVAR_CONSENSUS } from '../../modules/nf-core/modules/ivar/consensus/main' +include { CONSENSUS_QC } from './consensus_qc' + +workflow CONSENSUS_IVAR { + take: + bam // channel: [ val(meta), [ bam ] ] + fasta // channel: /path/to/genome.fasta + gff // channel: /path/to/genome.gff + nextclade_db // channel: /path/to/nextclade_db/ + + main: + + ch_versions = Channel.empty() + + // + // Call consensus sequence with iVar + // + IVAR_CONSENSUS ( + bam, + fasta, + params.save_mpileup + ) + ch_versions = ch_versions.mix(IVAR_CONSENSUS.out.versions.first()) + + // + // Consensus sequence QC + // + CONSENSUS_QC ( + IVAR_CONSENSUS.out.fasta, + fasta, + gff, + nextclade_db + ) + ch_versions = ch_versions.mix(CONSENSUS_QC.out.versions.first()) + + emit: + consensus = IVAR_CONSENSUS.out.fasta // channel: [ val(meta), [ fasta ] ] + consensus_qual = IVAR_CONSENSUS.out.qual // channel: [ val(meta), [ qual.txt ] ] + + quast_results = CONSENSUS_QC.out.quast_results // channel: [ val(meta), [ results ] ] + quast_tsv = CONSENSUS_QC.out.quast_tsv // channel: [ val(meta), [ tsv ] ] + + pangolin_report = CONSENSUS_QC.out.pangolin_report // channel: [ val(meta), [ csv ] ] + + nextclade_report = CONSENSUS_QC.out.nextclade_report // channel: [ val(meta), [ csv ] ] + + bases_tsv = CONSENSUS_QC.out.bases_tsv // channel: [ val(meta), [ tsv ] ] + bases_pdf = CONSENSUS_QC.out.bases_pdf // channel: [ val(meta), [ pdf ] ] + + versions = ch_versions // channel: [ versions.yml 
] +} diff --git a/subworkflows/local/consensus_qc.nf b/subworkflows/local/consensus_qc.nf new file mode 100644 index 00000000..8b24e60e --- /dev/null +++ b/subworkflows/local/consensus_qc.nf @@ -0,0 +1,90 @@ +// +// Consensus calling QC +// + +include { QUAST } from '../../modules/nf-core/modules/quast/main' +include { PANGOLIN } from '../../modules/nf-core/modules/pangolin/main' +include { NEXTCLADE_RUN } from '../../modules/nf-core/modules/nextclade/run/main' +include { PLOT_BASE_DENSITY } from '../../modules/local/plot_base_density' + +workflow CONSENSUS_QC { + take: + consensus // channel: [ val(meta), [ consensus ] ] + fasta // channel: /path/to/genome.fasta + gff // channel: /path/to/genome.gff + nextclade_db // channel: /path/to/nextclade_db/ + + main: + + ch_versions = Channel.empty() + + // + // Consensus QC report across samples with QUAST + // + ch_quast_results = Channel.empty() + ch_quast_tsv = Channel.empty() + if (!params.skip_variants_quast) { + QUAST ( + consensus.collect{ it[1] }, + fasta, + gff, + true, + params.gff + ) + ch_quast_results = QUAST.out.results + ch_quast_tsv = QUAST.out.tsv + ch_versions = ch_versions.mix(QUAST.out.versions) + } + + // + // Lineage analysis with Pangolin + // + ch_pangolin_report = Channel.empty() + if (!params.skip_pangolin) { + PANGOLIN ( + consensus + ) + ch_pangolin_report = PANGOLIN.out.report + ch_versions = ch_versions.mix(PANGOLIN.out.versions.first()) + } + + // + // Lineage analysis with Nextclade + // + ch_nextclade_report = Channel.empty() + if (!params.skip_nextclade) { + NEXTCLADE_RUN ( + consensus, + nextclade_db + ) + ch_nextclade_report = NEXTCLADE_RUN.out.csv + ch_versions = ch_versions.mix(NEXTCLADE_RUN.out.versions.first()) + } + + // + // Plot consensus base density + // + ch_bases_tsv = Channel.empty() + ch_bases_pdf = Channel.empty() + if (!params.skip_consensus_plots) { + PLOT_BASE_DENSITY ( + consensus + ) + ch_bases_tsv = PLOT_BASE_DENSITY.out.tsv + ch_bases_pdf = PLOT_BASE_DENSITY.out.pdf + ch_versions = ch_versions.mix(PLOT_BASE_DENSITY.out.versions.first()) + } + + emit: + quast_results = ch_quast_results // channel: [ val(meta), [ results ] ] + quast_tsv = ch_quast_tsv // channel: [ val(meta), [ tsv ] ] + + pangolin_report = ch_pangolin_report // channel: [ val(meta), [ csv ] ] + + nextclade_report = ch_nextclade_report // channel: [ val(meta), [ csv ] ] + + bases_tsv = ch_bases_tsv // channel: [ val(meta), [ tsv ] ] + bases_pdf = ch_bases_pdf // channel: [ val(meta), [ pdf ] ] + + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/local/input_check.nf b/subworkflows/local/input_check.nf index 7cfb85d4..4762335f 100644 --- a/subworkflows/local/input_check.nf +++ b/subworkflows/local/input_check.nf @@ -2,9 +2,7 @@ // Check input samplesheet and get read channels // -params.options = [:] - -include { SAMPLESHEET_CHECK } from '../../modules/local/samplesheet_check' addParams( options: params.options ) +include { SAMPLESHEET_CHECK } from '../../modules/local/samplesheet_check' workflow INPUT_CHECK { take: @@ -12,31 +10,38 @@ workflow INPUT_CHECK { platform // string: sequencing platform. 
Accepted values: 'illumina', 'nanopore' main: - SAMPLESHEET_CHECK ( samplesheet, platform ) + + SAMPLESHEET_CHECK ( + samplesheet, + platform + ) if (platform == 'illumina') { SAMPLESHEET_CHECK .out + .csv .splitCsv ( header:true, sep:',' ) .map { create_fastq_channels(it) } .set { sample_info } } else if (platform == 'nanopore') { SAMPLESHEET_CHECK .out + .csv .splitCsv ( header:true, sep:',' ) .map { row -> [ row.barcode, row.sample ] } .set { sample_info } } emit: - sample_info // channel: [ val(meta), [ reads ] ] + sample_info // channel: [ val(meta), [ reads ] ] + versions = SAMPLESHEET_CHECK.out.versions // channel: [ versions.yml ] } // Function to get list of [ meta, [ fastq_1, fastq_2 ] ] def create_fastq_channels(LinkedHashMap row) { def meta = [:] - meta.id = row.sample - meta.single_end = row.single_end.toBoolean() + meta.id = row.sample + meta.single_end = row.single_end.toBoolean() def array = [] if (!file(row.fastq_1).exists()) { diff --git a/subworkflows/local/make_consensus.nf b/subworkflows/local/make_consensus.nf deleted file mode 100644 index e34e05d8..00000000 --- a/subworkflows/local/make_consensus.nf +++ /dev/null @@ -1,44 +0,0 @@ -// -// Run various tools to generate a masked genome consensus sequence -// - -params.genomecov_options = [:] -params.merge_options = [:] -params.mask_options = [:] -params.maskfasta_options = [:] -params.bcftools_options = [:] -params.plot_bases_options = [:] - -include { BEDTOOLS_GENOMECOV } from '../../modules/nf-core/modules/bedtools/genomecov/main' addParams( options: params.genomecov_options ) -include { BEDTOOLS_MERGE } from '../../modules/nf-core/modules/bedtools/merge/main' addParams( options: params.merge_options ) -include { BEDTOOLS_MASKFASTA } from '../../modules/nf-core/modules/bedtools/maskfasta/main' addParams( options: params.maskfasta_options ) -include { BCFTOOLS_CONSENSUS } from '../../modules/nf-core/modules/bcftools/consensus/main' addParams( options: params.bcftools_options ) -include { MAKE_BED_MASK } from '../../modules/local/make_bed_mask' addParams( options: params.mask_options ) -include { PLOT_BASE_DENSITY } from '../../modules/local/plot_base_density' addParams( options: params.plot_bases_options ) - - -workflow MAKE_CONSENSUS { - take: - bam_vcf // channel: [ val(meta), [ bam ], [ vcf ], [ tbi ] ] - fasta - - main: - BEDTOOLS_GENOMECOV ( bam_vcf.map { meta, bam, vcf, tbi -> [ meta, bam ] }, [], 'bed' ) - - BEDTOOLS_MERGE ( BEDTOOLS_GENOMECOV.out.genomecov ) - - MAKE_BED_MASK ( bam_vcf.map { meta, bam, vcf, tbi -> [ meta, vcf ] }.join( BEDTOOLS_MERGE.out.bed, by: [0] ), fasta ) - - BEDTOOLS_MASKFASTA ( MAKE_BED_MASK.out.bed, MAKE_BED_MASK.out.fasta.map{it[1]} ) - - BCFTOOLS_CONSENSUS ( bam_vcf.map { meta, bam, vcf, tbi -> [ meta, vcf, tbi ] }.join( BEDTOOLS_MASKFASTA.out.fasta, by: [0] ) ) - - PLOT_BASE_DENSITY ( BCFTOOLS_CONSENSUS.out.fasta ) - - emit: - fasta = BCFTOOLS_CONSENSUS.out.fasta // channel: [ val(meta), [ fasta ] ] - tsv = PLOT_BASE_DENSITY.out.tsv // channel: [ val(meta), [ tsv ] ] - pdf = PLOT_BASE_DENSITY.out.pdf // channel: [ val(meta), [ pdf ] ] - bedtools_version = BEDTOOLS_MERGE.out.version // path: *.version.txt - bcftools_version = BCFTOOLS_CONSENSUS.out.version // path: *.version.txt -} diff --git a/subworkflows/local/prepare_genome_illumina.nf b/subworkflows/local/prepare_genome_illumina.nf index 5912aec9..1a9a54c5 100644 --- a/subworkflows/local/prepare_genome_illumina.nf +++ b/subworkflows/local/prepare_genome_illumina.nf @@ -2,43 +2,37 @@ // Uncompress and prepare reference genome 
files // -params.genome_options = [:] -params.index_options = [:] -params.db_options = [:] -params.bowtie2_build_options = [:] -params.collapse_primers_options = [:] -params.bedtools_getfasta_options = [:] -params.snpeff_build_options = [:] -params.makeblastdb_options = [:] -params.kraken2_build_options = [:] - -include { - GUNZIP as GUNZIP_FASTA - GUNZIP as GUNZIP_GFF - GUNZIP as GUNZIP_PRIMER_BED - GUNZIP as GUNZIP_PRIMER_FASTA } from '../../modules/nf-core/modules/gunzip/main' addParams( options: params.genome_options ) -include { UNTAR as UNTAR_BOWTIE2_INDEX } from '../../modules/nf-core/modules/untar/main' addParams( options: params.index_options ) -include { UNTAR as UNTAR_KRAKEN2_DB } from '../../modules/nf-core/modules/untar/main' addParams( options: params.db_options ) -include { UNTAR as UNTAR_BLAST_DB } from '../../modules/nf-core/modules/untar/main' addParams( options: params.db_options ) -include { BOWTIE2_BUILD } from '../../modules/nf-core/modules/bowtie2/build/main' addParams( options: params.bowtie2_build_options ) -include { BLAST_MAKEBLASTDB } from '../../modules/nf-core/modules/blast/makeblastdb/main' addParams( options: params.makeblastdb_options ) -include { BEDTOOLS_GETFASTA } from '../../modules/nf-core/modules/bedtools/getfasta/main' addParams( options: params.bedtools_getfasta_options ) -include { GET_CHROM_SIZES } from '../../modules/local/get_chrom_sizes' addParams( options: params.genome_options ) -include { COLLAPSE_PRIMERS } from '../../modules/local/collapse_primers' addParams( options: params.collapse_primers_options ) -include { KRAKEN2_BUILD } from '../../modules/local/kraken2_build' addParams( options: params.kraken2_build_options ) -include { SNPEFF_BUILD } from '../../modules/local/snpeff_build' addParams( options: params.snpeff_build_options ) +include { GUNZIP as GUNZIP_FASTA } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_GFF } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_PRIMER_BED } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_PRIMER_FASTA } from '../../modules/nf-core/modules/gunzip/main' +include { UNTAR as UNTAR_BOWTIE2_INDEX } from '../../modules/nf-core/modules/untar/main' +include { UNTAR as UNTAR_NEXTCLADE_DB } from '../../modules/nf-core/modules/untar/main' +include { UNTAR as UNTAR_KRAKEN2_DB } from '../../modules/nf-core/modules/untar/main' +include { UNTAR as UNTAR_BLAST_DB } from '../../modules/nf-core/modules/untar/main' +include { BOWTIE2_BUILD } from '../../modules/nf-core/modules/bowtie2/build/main' +include { BLAST_MAKEBLASTDB } from '../../modules/nf-core/modules/blast/makeblastdb/main' +include { BEDTOOLS_GETFASTA } from '../../modules/nf-core/modules/bedtools/getfasta/main' +include { CUSTOM_GETCHROMSIZES } from '../../modules/nf-core/modules/custom/getchromsizes/main' +include { NEXTCLADE_DATASETGET } from '../../modules/nf-core/modules/nextclade/datasetget/main' +include { COLLAPSE_PRIMERS } from '../../modules/local/collapse_primers' +include { KRAKEN2_BUILD } from '../../modules/local/kraken2_build' +include { SNPEFF_BUILD } from '../../modules/local/snpeff_build' workflow PREPARE_GENOME { - take: - dummy_file - main: + ch_versions = Channel.empty() + // // Uncompress genome fasta file if required // if (params.fasta.endsWith('.gz')) { - ch_fasta = GUNZIP_FASTA ( params.fasta ).gunzip + GUNZIP_FASTA ( + [ [:], params.fasta ] + ) + ch_fasta = GUNZIP_FASTA.out.gunzip.map { it[1] } + ch_versions = 
ch_versions.mix(GUNZIP_FASTA.out.versions) } else { ch_fasta = file(params.fasta) } @@ -48,20 +42,30 @@ workflow PREPARE_GENOME { // if (params.gff) { if (params.gff.endsWith('.gz')) { - ch_gff = GUNZIP_GFF ( params.gff ).gunzip + GUNZIP_GFF ( + [ [:], params.gff ] + ) + ch_gff = GUNZIP_GFF.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_GFF.out.versions) } else { ch_gff = file(params.gff) } } else { - ch_gff = dummy_file + ch_gff = [] } // // Create chromosome sizes file // + ch_fai = Channel.empty() ch_chrom_sizes = Channel.empty() - if (!params.skip_asciigenome) { - ch_chrom_sizes = GET_CHROM_SIZES ( ch_fasta ).sizes + if (params.protocol == 'amplicon' || !params.skip_asciigenome) { + CUSTOM_GETCHROMSIZES ( + ch_fasta + ) + ch_fai = CUSTOM_GETCHROMSIZES.out.fai + ch_chrom_sizes = CUSTOM_GETCHROMSIZES.out.sizes + ch_versions = ch_versions.mix(CUSTOM_GETCHROMSIZES.out.versions) } // @@ -71,12 +75,20 @@ workflow PREPARE_GENOME { if (!params.skip_kraken2) { if (params.kraken2_db) { if (params.kraken2_db.endsWith('.tar.gz')) { - ch_kraken2_db = UNTAR_KRAKEN2_DB ( params.kraken2_db ).untar + UNTAR_KRAKEN2_DB ( + params.kraken2_db + ) + ch_kraken2_db = UNTAR_KRAKEN2_DB.out.untar + ch_versions = ch_versions.mix(UNTAR_KRAKEN2_DB.out.versions) } else { ch_kraken2_db = file(params.kraken2_db) } } else { - ch_kraken2_db = KRAKEN2_BUILD ( params.kraken2_db_name ).db + KRAKEN2_BUILD ( + params.kraken2_db_name + ) + ch_kraken2_db = KRAKEN2_BUILD.out.db + ch_versions = ch_versions.mix(KRAKEN2_BUILD.out.versions) } } @@ -89,25 +101,44 @@ workflow PREPARE_GENOME { if (params.protocol == 'amplicon') { if (params.primer_bed) { if (params.primer_bed.endsWith('.gz')) { - ch_primer_bed = GUNZIP_PRIMER_BED ( params.primer_bed ).gunzip + GUNZIP_PRIMER_BED ( + [ [:], params.primer_bed ] + ) + ch_primer_bed = GUNZIP_PRIMER_BED.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_PRIMER_BED.out.versions) } else { ch_primer_bed = file(params.primer_bed) } } if (!params.skip_variants && !params.skip_mosdepth) { - ch_primer_collapsed_bed = COLLAPSE_PRIMERS ( ch_primer_bed, params.primer_left_suffix, params.primer_right_suffix ) + COLLAPSE_PRIMERS ( + ch_primer_bed, + params.primer_left_suffix, + params.primer_right_suffix + ) + ch_primer_collapsed_bed = COLLAPSE_PRIMERS.out.bed + ch_versions = ch_versions.mix(COLLAPSE_PRIMERS.out.versions) } if (!params.skip_assembly && !params.skip_cutadapt) { if (params.primer_fasta) { if (params.primer_fasta.endsWith('.gz')) { - ch_primer_fasta = GUNZIP_PRIMER_FASTA ( params.primer_fasta ).gunzip + GUNZIP_PRIMER_FASTA ( + [ [:], params.primer_fasta ] + ) + ch_primer_fasta = GUNZIP_PRIMER_FASTA.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_PRIMER_FASTA.out.versions) } else { ch_primer_fasta = file(params.primer_fasta) } } else { - ch_primer_fasta = BEDTOOLS_GETFASTA ( ch_primer_bed, ch_fasta ).fasta + BEDTOOLS_GETFASTA ( + ch_primer_bed, + ch_fasta + ) + ch_primer_fasta = BEDTOOLS_GETFASTA.out.fasta + ch_versions = ch_versions.mix(BEDTOOLS_GETFASTA.out.versions) } } } @@ -119,12 +150,46 @@ workflow PREPARE_GENOME { if (!params.skip_variants) { if (params.bowtie2_index) { if (params.bowtie2_index.endsWith('.tar.gz')) { - ch_bowtie2_index = UNTAR_BOWTIE2_INDEX ( params.bowtie2_index ).untar + UNTAR_BOWTIE2_INDEX ( + params.bowtie2_index + ) + ch_bowtie2_index = UNTAR_BOWTIE2_INDEX.out.untar + ch_versions = ch_versions.mix(UNTAR_BOWTIE2_INDEX.out.versions) } else { ch_bowtie2_index = file(params.bowtie2_index) } } else { - ch_bowtie2_index 
= BOWTIE2_BUILD ( ch_fasta ).index + BOWTIE2_BUILD ( + ch_fasta + ) + ch_bowtie2_index = BOWTIE2_BUILD.out.index + ch_versions = ch_versions.mix(BOWTIE2_BUILD.out.versions) + } + } + + // + // Prepare Nextclade dataset + // + ch_nextclade_db = Channel.empty() + if (!params.skip_consensus && !params.skip_nextclade) { + if (params.nextclade_dataset) { + if (params.nextclade_dataset.endsWith('.tar.gz')) { + UNTAR_NEXTCLADE_DB ( + params.nextclade_dataset + ) + ch_nextclade_db = UNTAR_NEXTCLADE_DB.out.untar + ch_versions = ch_versions.mix(UNTAR_NEXTCLADE_DB.out.versions) + } else { + ch_nextclade_db = file(params.nextclade_dataset) + } + } else if (params.nextclade_dataset_name) { + NEXTCLADE_DATASETGET ( + params.nextclade_dataset_name, + params.nextclade_dataset_reference, + params.nextclade_dataset_tag + ) + ch_nextclade_db = NEXTCLADE_DATASETGET.out.dataset + ch_versions = ch_versions.mix(NEXTCLADE_DATASETGET.out.versions) } } @@ -136,12 +201,20 @@ workflow PREPARE_GENOME { if (!params.skip_blast) { if (params.blast_db) { if (params.blast_db.endsWith('.tar.gz')) { - ch_blast_db = UNTAR_BLAST_DB ( params.blast_db ).untar + UNTAR_BLAST_DB ( + params.blast_db + ) + ch_blast_db = UNTAR_BLAST_DB.out.untar + ch_versions = ch_versions.mix(UNTAR_BLAST_DB.out.versions) } else { ch_blast_db = file(params.blast_db) } } else { - ch_blast_db = BLAST_MAKEBLASTDB ( ch_fasta ).db + BLAST_MAKEBLASTDB ( + ch_fasta + ) + ch_blast_db = BLAST_MAKEBLASTDB.out.db + ch_versions = ch_versions.mix(BLAST_MAKEBLASTDB.out.versions) } } } @@ -152,21 +225,29 @@ workflow PREPARE_GENOME { ch_snpeff_db = Channel.empty() ch_snpeff_config = Channel.empty() if (!params.skip_variants && params.gff && !params.skip_snpeff) { - SNPEFF_BUILD ( ch_fasta, ch_gff ) + SNPEFF_BUILD ( + ch_fasta, + ch_gff + ) ch_snpeff_db = SNPEFF_BUILD.out.db ch_snpeff_config = SNPEFF_BUILD.out.config + ch_versions = ch_versions.mix(SNPEFF_BUILD.out.versions) } emit: - fasta = ch_fasta // path: genome.fasta - gff = ch_gff // path: genome.gff - chrom_sizes = ch_chrom_sizes // path: genome.sizes - bowtie2_index = ch_bowtie2_index // path: bowtie2/index/ - primer_bed = ch_primer_bed // path: primer.bed - primer_collapsed_bed = ch_primer_collapsed_bed // path: primer.collapsed.bed - primer_fasta = ch_primer_fasta // path: primer.fasta - blast_db = ch_blast_db // path: blast_db/ - kraken2_db = ch_kraken2_db // path: kraken2_db/ - snpeff_db = ch_snpeff_db // path: snpeff_db - snpeff_config = ch_snpeff_config // path: snpeff.config + fasta = ch_fasta // path: genome.fasta + gff = ch_gff // path: genome.gff + fai = ch_fai // path: genome.fai + chrom_sizes = ch_chrom_sizes // path: genome.sizes + bowtie2_index = ch_bowtie2_index // path: bowtie2/index/ + primer_bed = ch_primer_bed // path: primer.bed + primer_collapsed_bed = ch_primer_collapsed_bed // path: primer.collapsed.bed + primer_fasta = ch_primer_fasta // path: primer.fasta + nextclade_db = ch_nextclade_db // path: nextclade_db + blast_db = ch_blast_db // path: blast_db/ + kraken2_db = ch_kraken2_db // path: kraken2_db/ + snpeff_db = ch_snpeff_db // path: snpeff_db + snpeff_config = ch_snpeff_config // path: snpeff.config + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/prepare_genome_nanopore.nf b/subworkflows/local/prepare_genome_nanopore.nf index a23ca4b8..6d2449d7 100644 --- a/subworkflows/local/prepare_genome_nanopore.nf +++ b/subworkflows/local/prepare_genome_nanopore.nf @@ -2,29 +2,29 @@ // Uncompress and prepare reference genome files // 
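// Both PREPARE_GENOME refactors rely on the updated nf-core GUNZIP module,
// which now works on [ meta, file ] tuples; hence the `[ [:], ... ]` wrapping
// on input and the `.map { it[1] }` unwrapping on output seen above and below.
// A minimal sketch of the pattern with an explicit closure (equivalent to `it[1]`):
GUNZIP_FASTA ( [ [:], params.fasta ] )      // attach an empty meta map
ch_fasta = GUNZIP_FASTA.out.gunzip
    .map { meta, fasta -> fasta }           // drop the meta map again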
-params.genome_options = [:] -params.collapse_primers_options = [:] -params.snpeff_build_options = [:] - -include { - GUNZIP as GUNZIP_FASTA - GUNZIP as GUNZIP_GFF - GUNZIP as GUNZIP_PRIMER_BED } from '../../modules/nf-core/modules/gunzip/main' addParams( options: params.genome_options ) -include { GET_CHROM_SIZES } from '../../modules/local/get_chrom_sizes' addParams( options: params.genome_options ) -include { COLLAPSE_PRIMERS } from '../../modules/local/collapse_primers' addParams( options: params.collapse_primers_options ) -include { SNPEFF_BUILD } from '../../modules/local/snpeff_build' addParams( options: params.snpeff_build_options ) +include { GUNZIP as GUNZIP_FASTA } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_GFF } from '../../modules/nf-core/modules/gunzip/main' +include { GUNZIP as GUNZIP_PRIMER_BED } from '../../modules/nf-core/modules/gunzip/main' +include { UNTAR } from '../../modules/nf-core/modules/untar/main' +include { CUSTOM_GETCHROMSIZES } from '../../modules/nf-core/modules/custom/getchromsizes/main' +include { NEXTCLADE_DATASETGET } from '../../modules/nf-core/modules/nextclade/datasetget/main' +include { COLLAPSE_PRIMERS } from '../../modules/local/collapse_primers' +include { SNPEFF_BUILD } from '../../modules/local/snpeff_build' workflow PREPARE_GENOME { - take: - dummy_file - main: + ch_versions = Channel.empty() + // // Uncompress genome fasta file if required // if (params.fasta.endsWith('.gz')) { - ch_fasta = GUNZIP_FASTA ( params.fasta ).gunzip + GUNZIP_FASTA ( + [ [:], params.fasta ] + ) + ch_fasta = GUNZIP_FASTA.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_FASTA.out.versions) } else { ch_fasta = file(params.fasta) } @@ -34,21 +34,27 @@ workflow PREPARE_GENOME { // if (params.gff) { if (params.gff.endsWith('.gz')) { - ch_gff = GUNZIP_GFF ( params.gff ).gunzip + GUNZIP_GFF ( + [ [:], params.gff ] + ) + ch_gff = GUNZIP_GFF.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_GFF.out.versions) } else { ch_gff = file(params.gff) } } else { - ch_gff = dummy_file + ch_gff = [] } // // Create chromosome sizes file // - ch_chrom_sizes = Channel.empty() - if (!params.skip_asciigenome) { - ch_chrom_sizes = GET_CHROM_SIZES ( ch_fasta ).sizes - } + CUSTOM_GETCHROMSIZES ( + ch_fasta + ) + ch_fai = CUSTOM_GETCHROMSIZES.out.fai + ch_chrom_sizes = CUSTOM_GETCHROMSIZES.out.sizes + ch_versions = ch_versions.mix(CUSTOM_GETCHROMSIZES.out.versions) // // Uncompress primer BED file @@ -56,7 +62,11 @@ workflow PREPARE_GENOME { ch_primer_bed = Channel.empty() if (params.primer_bed) { if (params.primer_bed.endsWith('.gz')) { - ch_primer_bed = GUNZIP_PRIMER_BED ( params.primer_bed ).gunzip + GUNZIP_PRIMER_BED ( + [ [:], params.primer_bed ] + ) + ch_primer_bed = GUNZIP_PRIMER_BED.out.gunzip.map { it[1] } + ch_versions = ch_versions.mix(GUNZIP_PRIMER_BED.out.versions) } else { ch_primer_bed = file(params.primer_bed) } @@ -67,7 +77,39 @@ workflow PREPARE_GENOME { // ch_primer_collapsed_bed = Channel.empty() if (!params.skip_mosdepth) { - ch_primer_collapsed_bed = COLLAPSE_PRIMERS ( ch_primer_bed, params.primer_left_suffix, params.primer_right_suffix ) + COLLAPSE_PRIMERS ( + ch_primer_bed, + params.primer_left_suffix, + params.primer_right_suffix + ) + ch_primer_collapsed_bed = COLLAPSE_PRIMERS.out.bed + ch_versions = ch_versions.mix(COLLAPSE_PRIMERS.out.versions) + } + + // + // Prepare Nextclade dataset + // + ch_nextclade_db = Channel.empty() + if (!params.skip_consensus && !params.skip_nextclade) { + if 
(params.nextclade_dataset) { + if (params.nextclade_dataset.endsWith('.tar.gz')) { + UNTAR ( + params.nextclade_dataset + ) + ch_nextclade_db = UNTAR.out.untar + ch_versions = ch_versions.mix(UNTAR.out.versions) + } else { + ch_nextclade_db = file(params.nextclade_dataset) + } + } else if (params.nextclade_dataset_name) { + NEXTCLADE_DATASETGET ( + params.nextclade_dataset_name, + params.nextclade_dataset_reference, + params.nextclade_dataset_tag + ) + ch_nextclade_db = NEXTCLADE_DATASETGET.out.dataset + ch_versions = ch_versions.mix(NEXTCLADE_DATASETGET.out.versions) + } } // @@ -76,17 +118,25 @@ workflow PREPARE_GENOME { ch_snpeff_db = Channel.empty() ch_snpeff_config = Channel.empty() if (params.gff && !params.skip_snpeff) { - SNPEFF_BUILD ( ch_fasta, ch_gff ) + SNPEFF_BUILD ( + ch_fasta, + ch_gff + ) ch_snpeff_db = SNPEFF_BUILD.out.db ch_snpeff_config = SNPEFF_BUILD.out.config + ch_versions = ch_versions.mix(SNPEFF_BUILD.out.versions) } emit: - fasta = ch_fasta // path: genome.fasta - gff = ch_gff // path: genome.gff - chrom_sizes = ch_chrom_sizes // path: genome.sizes - primer_bed = ch_primer_bed // path: primer.bed - primer_collapsed_bed = ch_primer_collapsed_bed // path: primer.collapsed.bed - snpeff_db = ch_snpeff_db // path: snpeff_db - snpeff_config = ch_snpeff_config // path: snpeff.config + fasta = ch_fasta // path: genome.fasta + gff = ch_gff // path: genome.gff + fai = ch_fai // path: genome.fai + chrom_sizes = ch_chrom_sizes // path: genome.sizes + primer_bed = ch_primer_bed // path: primer.bed + primer_collapsed_bed = ch_primer_collapsed_bed // path: primer.collapsed.bed + nextclade_db = ch_nextclade_db // path: nextclade_db + snpeff_db = ch_snpeff_db // path: snpeff_db + snpeff_config = ch_snpeff_config // path: snpeff.config + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/primer_trim_ivar.nf b/subworkflows/local/primer_trim_ivar.nf deleted file mode 100644 index 4037b5b5..00000000 --- a/subworkflows/local/primer_trim_ivar.nf +++ /dev/null @@ -1,40 +0,0 @@ -// -// iVar trim, sort, index BAM file and run samtools stats, flagstat and idxstats -// - -params.ivar_trim_options = [:] -params.samtools_options = [:] - -include { IVAR_TRIM } from '../../modules/nf-core/modules/ivar/trim/main' addParams( options: params.ivar_trim_options ) -include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' addParams( options: params.samtools_options ) -include { BAM_SORT_SAMTOOLS } from '../nf-core/bam_sort_samtools' addParams( options: params.samtools_options ) - -workflow PRIMER_TRIM_IVAR { - take: - bam // channel: [ val(meta), [ bam ], [bai] ] - bed // path : bed - - main: - - // - // iVar trim primers - // - IVAR_TRIM ( bam, bed ) - - // - // Sort, index BAM file and run samtools stats, flagstat and idxstats - // - BAM_SORT_SAMTOOLS ( IVAR_TRIM.out.bam ) - - emit: - bam_orig = IVAR_TRIM.out.bam // channel: [ val(meta), bam ] - log_out = IVAR_TRIM.out.log // channel: [ val(meta), log ] - ivar_version = IVAR_TRIM.out.version // path: *.version.txt - - bam = BAM_SORT_SAMTOOLS.out.bam // channel: [ val(meta), [ bam ] ] - bai = BAM_SORT_SAMTOOLS.out.bai // channel: [ val(meta), [ bai ] ] - stats = BAM_SORT_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] - flagstat = BAM_SORT_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] - idxstats = BAM_SORT_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] - samtools_version = BAM_SORT_SAMTOOLS.out.version // path: *.version.txt -} diff --git 
a/subworkflows/local/snpeff_snpsift.nf b/subworkflows/local/snpeff_snpsift.nf index cee772e3..ad84feb3 100644 --- a/subworkflows/local/snpeff_snpsift.nf +++ b/subworkflows/local/snpeff_snpsift.nf @@ -2,15 +2,10 @@ // Run snpEff, bgzip, tabix, stats and SnpSift commands // -params.snpeff_options = [:] -params.bgzip_options = [:] -params.tabix_options = [:] -params.stats_options = [:] -params.snpsift_options = [:] +include { SNPEFF_ANN } from '../../modules/local/snpeff_ann' +include { SNPSIFT_EXTRACTFIELDS } from '../../modules/local/snpsift_extractfields' -include { SNPEFF_ANN } from '../../modules/local/snpeff_ann' addParams( options: params.snpeff_options ) -include { SNPSIFT_EXTRACTFIELDS } from '../../modules/local/snpsift_extractfields' addParams( options: params.snpsift_options ) -include { VCF_BGZIP_TABIX_STATS } from '../nf-core/vcf_bgzip_tabix_stats' addParams( bgzip_options: params.bgzip_options, tabix_options: params.tabix_options, stats_options: params.stats_options ) +include { VCF_BGZIP_TABIX_STATS } from '../nf-core/vcf_bgzip_tabix_stats' workflow SNPEFF_SNPSIFT { take: @@ -21,24 +16,36 @@ workflow SNPEFF_SNPSIFT { main: - SNPEFF_ANN ( vcf, db, config, fasta ) + ch_versions = Channel.empty() - VCF_BGZIP_TABIX_STATS ( SNPEFF_ANN.out.vcf ) + SNPEFF_ANN ( + vcf, + db, + config, + fasta + ) + ch_versions = ch_versions.mix(SNPEFF_ANN.out.versions.first()) - SNPSIFT_EXTRACTFIELDS ( VCF_BGZIP_TABIX_STATS.out.vcf ) + VCF_BGZIP_TABIX_STATS ( + SNPEFF_ANN.out.vcf + ) + ch_versions = ch_versions.mix(VCF_BGZIP_TABIX_STATS.out.versions) + + SNPSIFT_EXTRACTFIELDS ( + VCF_BGZIP_TABIX_STATS.out.vcf + ) + ch_versions = ch_versions.mix(SNPSIFT_EXTRACTFIELDS.out.versions.first()) emit: - csv = SNPEFF_ANN.out.csv // channel: [ val(meta), [ csv ] ] - txt = SNPEFF_ANN.out.txt // channel: [ val(meta), [ txt ] ] - html = SNPEFF_ANN.out.html // channel: [ val(meta), [ html ] ] - snpeff_version = SNPEFF_ANN.out.version // path: *.version.txt - - vcf = VCF_BGZIP_TABIX_STATS.out.vcf // channel: [ val(meta), [ vcf.gz ] ] - tbi = VCF_BGZIP_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] - stats = VCF_BGZIP_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] - tabix_version = VCF_BGZIP_TABIX_STATS.out.tabix_version // path: *.version.txt - bcftools_version = VCF_BGZIP_TABIX_STATS.out.bcftools_version // path: *.version.txt - - snpsift_txt = SNPSIFT_EXTRACTFIELDS.out.txt // channel: [ val(meta), [ txt ] ] - snpsift_version = SNPSIFT_EXTRACTFIELDS.out.version // path: *.version.txt + csv = SNPEFF_ANN.out.csv // channel: [ val(meta), [ csv ] ] + txt = SNPEFF_ANN.out.txt // channel: [ val(meta), [ txt ] ] + html = SNPEFF_ANN.out.html // channel: [ val(meta), [ html ] ] + + vcf = VCF_BGZIP_TABIX_STATS.out.vcf // channel: [ val(meta), [ vcf.gz ] ] + tbi = VCF_BGZIP_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] + stats = VCF_BGZIP_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] + + snpsift_txt = SNPSIFT_EXTRACTFIELDS.out.txt // channel: [ val(meta), [ txt ] ] + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/variants_bcftools.nf b/subworkflows/local/variants_bcftools.nf index 626659a9..662c860a 100644 --- a/subworkflows/local/variants_bcftools.nf +++ b/subworkflows/local/variants_bcftools.nf @@ -1,31 +1,11 @@ // -// Variant calling and downstream processing for BCFTools +// Variant calling with BCFTools, downstream processing and QC // -params.bcftools_mpileup_options = [:] -params.quast_options = [:] -params.consensus_genomecov_options = [:] 
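// The `addParams( options: ... )` wiring being removed throughout these
// subworkflows is superseded by centralised module configuration in
// conf/modules.config. A minimal sketch of that selector pattern, assuming
// the standard DSL2 `ext.args` mechanism (the flag shown is hypothetical,
// not the pipeline's actual setting):
process {
    withName: 'SNPEFF_ANN' {
        ext.args = '-no-intergenic'   // hypothetical extra snpEff arguments
    }
}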
-params.consensus_merge_options = [:] -params.consensus_mask_options = [:] -params.consensus_maskfasta_options = [:] -params.consensus_bcftools_options = [:] -params.consensus_plot_options = [:] -params.snpeff_options = [:] -params.snpsift_options = [:] -params.snpeff_bgzip_options = [:] -params.snpeff_tabix_options = [:] -params.snpeff_stats_options = [:] -params.pangolin_options = [:] -params.nextclade_options = [:] -params.asciigenome_options = [:] - -include { BCFTOOLS_MPILEUP } from '../../modules/nf-core/modules/bcftools/mpileup/main' addParams( options: params.bcftools_mpileup_options ) -include { QUAST } from '../../modules/nf-core/modules/quast/main' addParams( options: params.quast_options ) -include { PANGOLIN } from '../../modules/nf-core/modules/pangolin/main' addParams( options: params.pangolin_options ) -include { NEXTCLADE } from '../../modules/nf-core/modules/nextclade/main' addParams( options: params.nextclade_options ) -include { ASCIIGENOME } from '../../modules/local/asciigenome' addParams( options: params.asciigenome_options ) -include { MAKE_CONSENSUS } from './make_consensus' addParams( genomecov_options: params.consensus_genomecov_options, merge_options: params.consensus_merge_options, mask_options: params.consensus_mask_options, maskfasta_options: params.consensus_maskfasta_options, bcftools_options: params.consensus_bcftools_options, plot_bases_options: params.consensus_plot_options ) -include { SNPEFF_SNPSIFT } from './snpeff_snpsift' addParams( snpeff_options: params.snpeff_options, snpsift_options: params.snpsift_options, bgzip_options: params.snpeff_bgzip_options, tabix_options: params.snpeff_tabix_options, stats_options: params.snpeff_stats_options ) +include { BCFTOOLS_MPILEUP } from '../../modules/nf-core/modules/bcftools/mpileup/main' +include { BCFTOOLS_NORM } from '../../modules/nf-core/modules/bcftools/norm/main' +include { VCF_TABIX_STATS } from '../nf-core/vcf_tabix_stats' +include { VARIANTS_QC } from './variants_qc' workflow VARIANTS_BCFTOOLS { take: @@ -39,137 +19,87 @@ workflow VARIANTS_BCFTOOLS { main: + ch_versions = Channel.empty() + // // Call variants // - BCFTOOLS_MPILEUP ( bam, fasta ) + BCFTOOLS_MPILEUP ( + bam, + fasta, + params.save_mpileup + ) + ch_versions = ch_versions.mix(BCFTOOLS_MPILEUP.out.versions.first()) + + // Filter out samples with 0 variants + BCFTOOLS_MPILEUP + .out + .vcf + .join(BCFTOOLS_MPILEUP.out.tbi) + .join(BCFTOOLS_MPILEUP.out.stats) + .filter { meta, vcf, tbi, stats -> WorkflowCommons.getNumVariantsFromBCFToolsStats(stats) > 0 } + .set { ch_vcf_tbi_stats } + + ch_vcf_tbi_stats + .map { meta, vcf, tbi, stats -> [ meta, vcf ] } + .set { ch_vcf } + + ch_vcf_tbi_stats + .map { meta, vcf, tbi, stats -> [ meta, tbi ] } + .set { ch_tbi } + + ch_vcf_tbi_stats + .map { meta, vcf, tbi, stats -> [ meta, stats ] } + .set { ch_stats } // - // Create genome consensus using variants in VCF, run QUAST and pangolin + // Split multi-allelic positions // - ch_consensus = Channel.empty() - ch_bases_tsv = Channel.empty() - ch_bases_pdf = Channel.empty() - ch_bedtools_version = Channel.empty() - ch_quast_results = Channel.empty() - ch_quast_tsv = Channel.empty() - ch_quast_version = Channel.empty() - ch_pangolin_report = Channel.empty() - ch_pangolin_version = Channel.empty() - ch_nextclade_report = Channel.empty() - ch_nextclade_version = Channel.empty() - if (!params.skip_consensus) { - MAKE_CONSENSUS ( bam.join(BCFTOOLS_MPILEUP.out.vcf, by: [0]).join(BCFTOOLS_MPILEUP.out.tbi, by: [0]), fasta ) - ch_consensus = 
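The zero-variant guard above joins the three per-sample mpileup outputs on their shared meta map, filters once, then projects each file type back into its own channel. Condensed, with getNumVariants standing in for the WorkflowCommons.getNumVariantsFromBCFToolsStats helper, whose implementation lives in lib/ and is not part of this hunk:

```nextflow
BCFTOOLS_MPILEUP.out.vcf
    .join(BCFTOOLS_MPILEUP.out.tbi)     // join key defaults to the first element, the meta map
    .join(BCFTOOLS_MPILEUP.out.stats)
    .filter { meta, vcf, tbi, stats -> getNumVariants(stats) > 0 }
    .set { ch_vcf_tbi_stats }

// One map per downstream take: keeps the subworkflow interfaces unchanged
ch_vcf_tbi_stats
    .map { meta, vcf, tbi, stats -> [ meta, vcf ] }
    .set { ch_vcf }
```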
MAKE_CONSENSUS.out.fasta - ch_bases_tsv = MAKE_CONSENSUS.out.tsv - ch_bases_pdf = MAKE_CONSENSUS.out.pdf - ch_bedtools_version = MAKE_CONSENSUS.out.bedtools_version - - if (!params.skip_variants_quast) { - QUAST ( ch_consensus.collect{ it[1] }, fasta, gff, true, params.gff ) - ch_quast_results = QUAST.out.results - ch_quast_tsv = QUAST.out.tsv - ch_quast_version = QUAST.out.version - } - - if (!params.skip_pangolin) { - PANGOLIN ( ch_consensus ) - ch_pangolin_report = PANGOLIN.out.report - ch_pangolin_version = PANGOLIN.out.version - } - - if (!params.skip_nextclade) { - NEXTCLADE ( ch_consensus ) - ch_nextclade_report = NEXTCLADE.out.csv - ch_nextclade_version = NEXTCLADE.out.version - } - } + BCFTOOLS_NORM ( + ch_vcf, + fasta + ) + ch_versions = ch_versions.mix(BCFTOOLS_NORM.out.versions.first()) - // - // Annotate variants - // - ch_snpeff_vcf = Channel.empty() - ch_snpeff_tbi = Channel.empty() - ch_snpeff_stats = Channel.empty() - ch_snpeff_csv = Channel.empty() - ch_snpeff_txt = Channel.empty() - ch_snpeff_html = Channel.empty() - ch_snpsift_txt = Channel.empty() - ch_snpeff_version = Channel.empty() - ch_snpsift_version = Channel.empty() - if (params.gff && !params.skip_snpeff) { - SNPEFF_SNPSIFT ( BCFTOOLS_MPILEUP.out.vcf, snpeff_db, snpeff_config, fasta ) - ch_snpeff_vcf = SNPEFF_SNPSIFT.out.vcf - ch_snpeff_tbi = SNPEFF_SNPSIFT.out.tbi - ch_snpeff_stats = SNPEFF_SNPSIFT.out.stats - ch_snpeff_csv = SNPEFF_SNPSIFT.out.csv - ch_snpeff_txt = SNPEFF_SNPSIFT.out.txt - ch_snpeff_html = SNPEFF_SNPSIFT.out.html - ch_snpsift_txt = SNPEFF_SNPSIFT.out.snpsift_txt - ch_snpeff_version = SNPEFF_SNPSIFT.out.snpeff_version - ch_snpsift_version = SNPEFF_SNPSIFT.out.snpsift_version - } + VCF_TABIX_STATS ( + BCFTOOLS_NORM.out.vcf + ) + ch_versions = ch_versions.mix(VCF_TABIX_STATS.out.versions) // - // Variant screenshots with ASCIIGenome + // Run downstream tools for variants QC // - ch_asciigenome_pdf = Channel.empty() - ch_asciigenome_version = Channel.empty() - if (!params.skip_asciigenome) { - bam - .join(BCFTOOLS_MPILEUP.out.vcf, by: [0]) - .join(BCFTOOLS_MPILEUP.out.stats, by: [0]) - .map { meta, bam, vcf, stats -> - if (WorkflowCommons.getNumVariantsFromBCFToolsStats(stats) > 0) { - return [ meta, bam, vcf ] - } - } - .set { ch_asciigenome } - - ASCIIGENOME ( - ch_asciigenome, - fasta, - sizes, - gff, - bed, - params.asciigenome_window_size, - params.asciigenome_read_depth - ) - ch_asciigenome_pdf = ASCIIGENOME.out.pdf - ch_asciigenome_version = ASCIIGENOME.out.version - } + VARIANTS_QC ( + bam, + BCFTOOLS_NORM.out.vcf, + VCF_TABIX_STATS.out.stats, + fasta, + sizes, + gff, + bed, + snpeff_db, + snpeff_config + ) + ch_versions = ch_versions.mix(VARIANTS_QC.out.versions) emit: - vcf = BCFTOOLS_MPILEUP.out.vcf // channel: [ val(meta), [ vcf ] ] - tbi = BCFTOOLS_MPILEUP.out.tbi // channel: [ val(meta), [ tbi ] ] - stats = BCFTOOLS_MPILEUP.out.stats // channel: [ val(meta), [ txt ] ] - bcftools_version = BCFTOOLS_MPILEUP.out.version // path: *.version.txt - - consensus = ch_consensus // channel: [ val(meta), [ fasta ] ] - bases_tsv = ch_bases_tsv // channel: [ val(meta), [ tsv ] ] - bases_pdf = ch_bases_pdf // channel: [ val(meta), [ pdf ] ] - bedtools_version = ch_bedtools_version // path: *.version.txt - - quast_results = ch_quast_results // channel: [ val(meta), [ results ] ] - quast_tsv = ch_quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ch_quast_version // path: *.version.txt - - snpeff_vcf = ch_snpeff_vcf // channel: [ val(meta), [ vcf.gz ] ] - snpeff_tbi = ch_snpeff_tbi // 
channel: [ val(meta), [ tbi ] ] - snpeff_stats = ch_snpeff_stats // channel: [ val(meta), [ txt ] ] - snpeff_csv = ch_snpeff_csv // channel: [ val(meta), [ csv ] ] - snpeff_txt = ch_snpeff_txt // channel: [ val(meta), [ txt ] ] - snpeff_html = ch_snpeff_html // channel: [ val(meta), [ html ] ] - snpsift_txt = ch_snpsift_txt // channel: [ val(meta), [ txt ] ] - snpeff_version = ch_snpeff_version // path: *.version.txt - snpsift_version = ch_snpsift_version // path: *.version.txt - - pangolin_report = ch_pangolin_report // channel: [ val(meta), [ csv ] ] - pangolin_version = ch_pangolin_version // path: *.version.txt - - nextclade_report = ch_nextclade_report // channel: [ val(meta), [ csv ] ] - nextclade_version = ch_nextclade_version // path: *.version.txt - - asciigenome_pdf = ch_asciigenome_pdf // channel: [ val(meta), [ pdf ] ] - asciigenome_version = ch_asciigenome_version // path: *.version.txt + vcf_orig = ch_vcf // channel: [ val(meta), [ vcf ] ] + tbi_orig = ch_tbi // channel: [ val(meta), [ tbi ] ] + stats_orig = ch_stats // channel: [ val(meta), [ txt ] ] + + vcf = BCFTOOLS_NORM.out.vcf // channel: [ val(meta), [ vcf ] ] + tbi = VCF_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] + stats = VCF_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] + + snpeff_vcf = VARIANTS_QC.out.snpeff_vcf // channel: [ val(meta), [ vcf.gz ] ] + snpeff_tbi = VARIANTS_QC.out.snpeff_tbi // channel: [ val(meta), [ tbi ] ] + snpeff_stats = VARIANTS_QC.out.snpeff_stats // channel: [ val(meta), [ txt ] ] + snpeff_csv = VARIANTS_QC.out.snpeff_csv // channel: [ val(meta), [ csv ] ] + snpeff_txt = VARIANTS_QC.out.snpeff_txt // channel: [ val(meta), [ txt ] ] + snpeff_html = VARIANTS_QC.out.snpeff_html // channel: [ val(meta), [ html ] ] + snpsift_txt = VARIANTS_QC.out.snpsift_txt // channel: [ val(meta), [ txt ] ] + + asciigenome_pdf = VARIANTS_QC.out.asciigenome_pdf // channel: [ val(meta), [ pdf ] ] + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/variants_ivar.nf b/subworkflows/local/variants_ivar.nf index 2d2748ef..27e92ff8 100644 --- a/subworkflows/local/variants_ivar.nf +++ b/subworkflows/local/variants_ivar.nf @@ -1,34 +1,11 @@ // -// Variant calling and downstream processing for IVar +// Variant calling with IVar, downstream processing and QC // -params.ivar_variants_options = [:] -params.ivar_variants_to_vcf_options = [:] -params.tabix_bgzip_options = [:] -params.tabix_tabix_options = [:] -params.bcftools_stats_options = [:] -params.ivar_consensus_options = [:] -params.consensus_plot_options = [:] -params.quast_options = [:] -params.snpeff_options = [:] -params.snpsift_options = [:] -params.snpeff_bgzip_options = [:] -params.snpeff_tabix_options = [:] -params.snpeff_stats_options = [:] -params.pangolin_options = [:] -params.nextclade_options = [:] -params.asciigenome_options = [:] - -include { IVAR_VARIANTS_TO_VCF } from '../../modules/local/ivar_variants_to_vcf' addParams( options: params.ivar_variants_to_vcf_options ) -include { PLOT_BASE_DENSITY } from '../../modules/local/plot_base_density' addParams( options: params.consensus_plot_options ) -include { IVAR_VARIANTS } from '../../modules/nf-core/modules/ivar/variants/main' addParams( options: params.ivar_variants_options ) -include { IVAR_CONSENSUS } from '../../modules/nf-core/modules/ivar/consensus/main' addParams( options: params.ivar_consensus_options ) -include { QUAST } from '../../modules/nf-core/modules/quast/main' addParams( options: params.quast_options ) -include { PANGOLIN } from 
'../../modules/nf-core/modules/pangolin/main' addParams( options: params.pangolin_options ) -include { NEXTCLADE } from '../../modules/nf-core/modules/nextclade/main' addParams( options: params.nextclade_options ) -include { ASCIIGENOME } from '../../modules/local/asciigenome' addParams( options: params.asciigenome_options ) -include { VCF_BGZIP_TABIX_STATS } from '../nf-core/vcf_bgzip_tabix_stats' addParams( bgzip_options: params.tabix_bgzip_options, tabix_options: params.tabix_tabix_options, stats_options: params.bcftools_stats_options ) -include { SNPEFF_SNPSIFT } from './snpeff_snpsift' addParams( snpeff_options: params.snpeff_options, snpsift_options: params.snpsift_options, bgzip_options: params.snpeff_bgzip_options, tabix_options: params.snpeff_tabix_options, stats_options: params.snpeff_stats_options ) +include { IVAR_VARIANTS } from '../../modules/nf-core/modules/ivar/variants/main' +include { IVAR_VARIANTS_TO_VCF } from '../../modules/local/ivar_variants_to_vcf' +include { VCF_BGZIP_TABIX_STATS } from '../nf-core/vcf_bgzip_tabix_stats' +include { VARIANTS_QC } from './variants_qc' workflow VARIANTS_IVAR { take: @@ -43,154 +20,76 @@ workflow VARIANTS_IVAR { main: + ch_versions = Channel.empty() + // // Call variants // - IVAR_VARIANTS ( bam, fasta, gff ) + IVAR_VARIANTS ( + bam, + fasta, + gff, + params.save_mpileup + ) + ch_versions = ch_versions.mix(IVAR_VARIANTS.out.versions.first()) + + // Filter out samples with 0 variants + IVAR_VARIANTS + .out + .tsv + .filter { meta, tsv -> WorkflowCommons.getNumLinesInFile(tsv) > 1 } + .set { ch_ivar_tsv } // // Convert original iVar output to VCF, zip and index // - IVAR_VARIANTS_TO_VCF ( IVAR_VARIANTS.out.tsv, ivar_multiqc_header ) - - VCF_BGZIP_TABIX_STATS ( IVAR_VARIANTS_TO_VCF.out.vcf ) - - // - // Create genome consensus - // - ch_consensus = Channel.empty() - ch_consensus_qual = Channel.empty() - ch_bases_tsv = Channel.empty() - ch_bases_pdf = Channel.empty() - ch_quast_results = Channel.empty() - ch_quast_tsv = Channel.empty() - ch_quast_version = Channel.empty() - ch_pangolin_report = Channel.empty() - ch_pangolin_version = Channel.empty() - ch_nextclade_report = Channel.empty() - ch_nextclade_version = Channel.empty() - if (!params.skip_consensus) { - IVAR_CONSENSUS ( bam, fasta ) - ch_consensus = IVAR_CONSENSUS.out.fasta - ch_consensus_qual = IVAR_CONSENSUS.out.qual - - PLOT_BASE_DENSITY ( ch_consensus ) - ch_bases_tsv = PLOT_BASE_DENSITY.out.tsv - ch_bases_pdf = PLOT_BASE_DENSITY.out.pdf - - if (!params.skip_variants_quast) { - QUAST ( ch_consensus.collect{ it[1] }, fasta, gff, true, params.gff ) - ch_quast_results = QUAST.out.results - ch_quast_tsv = QUAST.out.tsv - ch_quast_version = QUAST.out.version - } + IVAR_VARIANTS_TO_VCF ( + ch_ivar_tsv, + ivar_multiqc_header + ) + ch_versions = ch_versions.mix(IVAR_VARIANTS_TO_VCF.out.versions.first()) - if (!params.skip_pangolin) { - PANGOLIN ( ch_consensus ) - ch_pangolin_report = PANGOLIN.out.report - ch_pangolin_version = PANGOLIN.out.version - } - - if (!params.skip_nextclade) { - NEXTCLADE ( ch_consensus ) - ch_nextclade_report = NEXTCLADE.out.csv - ch_nextclade_version = NEXTCLADE.out.version - } - } - - // - // Annotate variants - // - ch_snpeff_vcf = Channel.empty() - ch_snpeff_tbi = Channel.empty() - ch_snpeff_stats = Channel.empty() - ch_snpeff_csv = Channel.empty() - ch_snpeff_txt = Channel.empty() - ch_snpeff_html = Channel.empty() - ch_snpsift_txt = Channel.empty() - ch_snpeff_version = Channel.empty() - ch_snpsift_version = Channel.empty() - if (params.gff && 
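The filter above drops samples whose iVar TSV holds nothing beyond the column-header line. getNumLinesInFile is defined in lib/WorkflowCommons.groovy and is not shown in this changeset; a plausible equivalent, given only to make the predicate concrete:

```nextflow
// Assumed behaviour of the helper: count the lines of a small text file
def getNumLinesInFile(path) {
    def n = 0
    path.withReader { r -> while (r.readLine() != null) { n++ } }
    return n
}

// > 1 because iVar writes a header line even when it calls no variants
IVAR_VARIANTS.out.tsv
    .filter { meta, tsv -> getNumLinesInFile(tsv) > 1 }
    .set { ch_ivar_tsv }
```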
!params.skip_snpeff) { - SNPEFF_SNPSIFT ( VCF_BGZIP_TABIX_STATS.out.vcf, snpeff_db, snpeff_config, fasta ) - ch_snpeff_vcf = SNPEFF_SNPSIFT.out.vcf - ch_snpeff_tbi = SNPEFF_SNPSIFT.out.tbi - ch_snpeff_stats = SNPEFF_SNPSIFT.out.stats - ch_snpeff_csv = SNPEFF_SNPSIFT.out.csv - ch_snpeff_txt = SNPEFF_SNPSIFT.out.txt - ch_snpeff_html = SNPEFF_SNPSIFT.out.html - ch_snpsift_txt = SNPEFF_SNPSIFT.out.snpsift_txt - ch_snpeff_version = SNPEFF_SNPSIFT.out.snpeff_version - ch_snpsift_version = SNPEFF_SNPSIFT.out.snpsift_version - } + VCF_BGZIP_TABIX_STATS ( + IVAR_VARIANTS_TO_VCF.out.vcf + ) + ch_versions = ch_versions.mix(VCF_BGZIP_TABIX_STATS.out.versions) // - // Variant screenshots with ASCIIGenome + // Run downstream tools for variants QC // - ch_asciigenome_pdf = Channel.empty() - ch_asciigenome_version = Channel.empty() - if (!params.skip_asciigenome) { - bam - .join(VCF_BGZIP_TABIX_STATS.out.vcf, by: [0]) - .join(VCF_BGZIP_TABIX_STATS.out.stats, by: [0]) - .map { meta, bam, vcf, stats -> - if (WorkflowCommons.getNumVariantsFromBCFToolsStats(stats) > 0) { - return [ meta, bam, vcf ] - } - } - .set { ch_asciigenome } - - ASCIIGENOME ( - ch_asciigenome, - fasta, - sizes, - gff, - bed, - params.asciigenome_window_size, - params.asciigenome_read_depth - ) - ch_asciigenome_pdf = ASCIIGENOME.out.pdf - ch_asciigenome_version = ASCIIGENOME.out.version - } + VARIANTS_QC ( + bam, + VCF_BGZIP_TABIX_STATS.out.vcf, + VCF_BGZIP_TABIX_STATS.out.stats, + fasta, + sizes, + gff, + bed, + snpeff_db, + snpeff_config + ) + ch_versions = ch_versions.mix(VARIANTS_QC.out.versions) emit: - tsv = IVAR_VARIANTS.out.tsv // channel: [ val(meta), [ tsv ] ] - ivar_version = IVAR_VARIANTS.out.version // path: *.version.txt - - vcf_orig = IVAR_VARIANTS_TO_VCF.out.vcf // channel: [ val(meta), [ vcf ] ] - log_out = IVAR_VARIANTS_TO_VCF.out.log // channel: [ val(meta), [ log ] ] - multiqc_tsv = IVAR_VARIANTS_TO_VCF.out.tsv // channel: [ val(meta), [ tsv ] ] - - vcf = VCF_BGZIP_TABIX_STATS.out.vcf // channel: [ val(meta), [ vcf ] ] - tbi = VCF_BGZIP_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] - stats = VCF_BGZIP_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] - tabix_version = VCF_BGZIP_TABIX_STATS.out.tabix_version // path: *.version.txt - bcftools_version = VCF_BGZIP_TABIX_STATS.out.bcftools_version // path: *.version.txt - - consensus = ch_consensus // channel: [ val(meta), [ fasta ] ] - consensus_qual = ch_consensus_qual // channel: [ val(meta), [ fasta ] ] - bases_tsv = ch_bases_tsv // channel: [ val(meta), [ tsv ] ] - bases_pdf = ch_bases_pdf // channel: [ val(meta), [ pdf ] ] + tsv = ch_ivar_tsv // channel: [ val(meta), [ tsv ] ] - quast_results = ch_quast_results // channel: [ val(meta), [ results ] ] - quast_tsv = ch_quast_tsv // channel: [ val(meta), [ tsv ] ] - quast_version = ch_quast_version // path: *.version.txt + vcf_orig = IVAR_VARIANTS_TO_VCF.out.vcf // channel: [ val(meta), [ vcf ] ] + log_out = IVAR_VARIANTS_TO_VCF.out.log // channel: [ val(meta), [ log ] ] + multiqc_tsv = IVAR_VARIANTS_TO_VCF.out.tsv // channel: [ val(meta), [ tsv ] ] - snpeff_vcf = ch_snpeff_vcf // channel: [ val(meta), [ vcf.gz ] ] - snpeff_tbi = ch_snpeff_tbi // channel: [ val(meta), [ tbi ] ] - snpeff_stats = ch_snpeff_stats // channel: [ val(meta), [ txt ] ] - snpeff_csv = ch_snpeff_csv // channel: [ val(meta), [ csv ] ] - snpeff_txt = ch_snpeff_txt // channel: [ val(meta), [ txt ] ] - snpeff_html = ch_snpeff_html // channel: [ val(meta), [ html ] ] - snpsift_txt = ch_snpsift_txt // channel: [ val(meta), [ txt ] ] - 
snpeff_version = ch_snpeff_version // path: *.version.txt - snpsift_version = ch_snpsift_version // path: *.version.txt + vcf = VCF_BGZIP_TABIX_STATS.out.vcf // channel: [ val(meta), [ vcf ] ] + tbi = VCF_BGZIP_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] + stats = VCF_BGZIP_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] - pangolin_report = ch_pangolin_report // channel: [ val(meta), [ csv ] ] - pangolin_version = ch_pangolin_version // path: *.version.txt + snpeff_vcf = VARIANTS_QC.out.snpeff_vcf // channel: [ val(meta), [ vcf.gz ] ] + snpeff_tbi = VARIANTS_QC.out.snpeff_tbi // channel: [ val(meta), [ tbi ] ] + snpeff_stats = VARIANTS_QC.out.snpeff_stats // channel: [ val(meta), [ txt ] ] + snpeff_csv = VARIANTS_QC.out.snpeff_csv // channel: [ val(meta), [ csv ] ] + snpeff_txt = VARIANTS_QC.out.snpeff_txt // channel: [ val(meta), [ txt ] ] + snpeff_html = VARIANTS_QC.out.snpeff_html // channel: [ val(meta), [ html ] ] + snpsift_txt = VARIANTS_QC.out.snpsift_txt // channel: [ val(meta), [ txt ] ] - nextclade_report = ch_nextclade_report // channel: [ val(meta), [ csv ] ] - nextclade_version = ch_nextclade_version // path: *.version.txt + asciigenome_pdf = VARIANTS_QC.out.asciigenome_pdf // channel: [ val(meta), [ pdf ] ] - asciigenome_pdf = ch_asciigenome_pdf // channel: [ val(meta), [ pdf ] ] - asciigenome_version = ch_asciigenome_version // path: *.version.txt + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/variants_long_table.nf b/subworkflows/local/variants_long_table.nf new file mode 100644 index 00000000..938f3f3b --- /dev/null +++ b/subworkflows/local/variants_long_table.nf @@ -0,0 +1,39 @@ +// +// Create a long table with variant information including AA changes and lineage info +// + +include { BCFTOOLS_QUERY } from '../../modules/nf-core/modules/bcftools/query/main' +include { MAKE_VARIANTS_LONG_TABLE } from '../../modules/local/make_variants_long_table' + +workflow VARIANTS_LONG_TABLE { + take: + vcf // channel: [ val(meta), [ vcf ] ] + tbi // channel: [ val(meta), [ tbi ] ] + snpsift // channel: [ val(meta), [ txt ] ] + pangolin // channel: [ val(meta), [ csv ] ] + + main: + + ch_versions = Channel.empty() + + BCFTOOLS_QUERY ( + vcf.join(tbi, by: [0]), + [], + [], + [] + ) + ch_versions = ch_versions.mix(BCFTOOLS_QUERY.out.versions.first()) + + MAKE_VARIANTS_LONG_TABLE ( + BCFTOOLS_QUERY.out.txt.collect{it[1]}, + snpsift.collect{it[1]}.ifEmpty([]), + pangolin.collect{it[1]}.ifEmpty([]) + ) + ch_versions = ch_versions.mix(MAKE_VARIANTS_LONG_TABLE.out.versions) + + emit: + query_table = BCFTOOLS_QUERY.out.txt // channel: [ val(meta), [ txt ] ] + long_table = MAKE_VARIANTS_LONG_TABLE.out.csv // channel: [ val(meta), [ csv ] ] + + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/local/variants_qc.nf b/subworkflows/local/variants_qc.nf new file mode 100644 index 00000000..cded165c --- /dev/null +++ b/subworkflows/local/variants_qc.nf @@ -0,0 +1,91 @@ +// +// Variant calling QC +// + +include { ASCIIGENOME } from '../../modules/local/asciigenome' +include { SNPEFF_SNPSIFT } from './snpeff_snpsift' + +workflow VARIANTS_QC { + take: + bam // channel: [ val(meta), [ bam ] ] + vcf // channel: [ val(meta), [ vcf ] ] + stats // channel: [ val(meta), [ bcftools_stats ] ] + fasta // channel: /path/to/genome.fasta + sizes // channel: /path/to/genome.sizes + gff // channel: /path/to/genome.gff + bed // channel: /path/to/primers.bed + snpeff_db // channel: /path/to/snpeff_db/ + snpeff_config // channel: 
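In variants_long_table.nf above, the SnpSift and Pangolin inputs may legitimately be empty because both tools can be skipped, so each per-sample channel is flattened with collect and given an empty-list fallback. The idiom in isolation:

```nextflow
ch_snpsift_txt
    .collect { it[1] }   // drop the meta map, keep only the files
    .ifEmpty([])         // still satisfy the process input when the step was skipped
    .set { ch_snpsift_files }
```

Without the ifEmpty, a skipped upstream step would leave the channel empty and MAKE_VARIANTS_LONG_TABLE would never fire.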
/path/to/snpeff.config + + main: + + ch_versions = Channel.empty() + + // + // Annotate variants + // + ch_snpeff_vcf = Channel.empty() + ch_snpeff_tbi = Channel.empty() + ch_snpeff_stats = Channel.empty() + ch_snpeff_csv = Channel.empty() + ch_snpeff_txt = Channel.empty() + ch_snpeff_html = Channel.empty() + ch_snpsift_txt = Channel.empty() + if (params.gff && !params.skip_snpeff) { + SNPEFF_SNPSIFT ( + vcf, + snpeff_db, + snpeff_config, + fasta + ) + ch_snpeff_vcf = SNPEFF_SNPSIFT.out.vcf + ch_snpeff_tbi = SNPEFF_SNPSIFT.out.tbi + ch_snpeff_stats = SNPEFF_SNPSIFT.out.stats + ch_snpeff_csv = SNPEFF_SNPSIFT.out.csv + ch_snpeff_txt = SNPEFF_SNPSIFT.out.txt + ch_snpeff_html = SNPEFF_SNPSIFT.out.html + ch_snpsift_txt = SNPEFF_SNPSIFT.out.snpsift_txt + ch_versions = ch_versions.mix(SNPEFF_SNPSIFT.out.versions) + } + + // + // Variant screenshots with ASCIIGenome + // + ch_asciigenome_pdf = Channel.empty() + if (!params.skip_asciigenome) { + bam + .join(vcf, by: [0]) + .join(stats, by: [0]) + .map { meta, bam, vcf, stats -> + if (WorkflowCommons.getNumVariantsFromBCFToolsStats(stats) > 0) { + return [ meta, bam, vcf ] + } + } + .set { ch_asciigenome } + + ASCIIGENOME ( + ch_asciigenome, + fasta, + sizes, + gff, + bed, + params.asciigenome_window_size, + params.asciigenome_read_depth + ) + ch_asciigenome_pdf = ASCIIGENOME.out.pdf + ch_versions = ch_versions.mix(ASCIIGENOME.out.versions.first()) + } + + emit: + snpeff_vcf = ch_snpeff_vcf // channel: [ val(meta), [ vcf.gz ] ] + snpeff_tbi = ch_snpeff_tbi // channel: [ val(meta), [ tbi ] ] + snpeff_stats = ch_snpeff_stats // channel: [ val(meta), [ txt ] ] + snpeff_csv = ch_snpeff_csv // channel: [ val(meta), [ csv ] ] + snpeff_txt = ch_snpeff_txt // channel: [ val(meta), [ txt ] ] + snpeff_html = ch_snpeff_html // channel: [ val(meta), [ html ] ] + snpsift_txt = ch_snpsift_txt // channel: [ val(meta), [ txt ] ] + + asciigenome_pdf = ch_asciigenome_pdf // channel: [ val(meta), [ pdf ] ] + + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/nf-core/align_bowtie2.nf b/subworkflows/nf-core/align_bowtie2.nf index 9cf615fd..1ec4764b 100644 --- a/subworkflows/nf-core/align_bowtie2.nf +++ b/subworkflows/nf-core/align_bowtie2.nf @@ -2,39 +2,48 @@ // Alignment with Bowtie2 // -params.align_options = [:] -params.samtools_options = [:] - -include { BOWTIE2_ALIGN } from '../../modules/nf-core/modules/bowtie2/align/main' addParams( options: params.align_options ) -include { BAM_SORT_SAMTOOLS } from '../nf-core/bam_sort_samtools' addParams( options: params.samtools_options ) +include { BOWTIE2_ALIGN } from '../../modules/nf-core/modules/bowtie2/align/main' +include { BAM_SORT_SAMTOOLS } from './bam_sort_samtools' workflow ALIGN_BOWTIE2 { take: - reads // channel: [ val(meta), [ reads ] ] - index // channel: /path/to/bowtie2/index/ + reads // channel: [ val(meta), [ reads ] ] + index // channel: /path/to/bowtie2/index/ + save_unaligned // value: boolean main: + ch_versions = Channel.empty() + // // Map reads with Bowtie2 // - BOWTIE2_ALIGN ( reads, index ) + BOWTIE2_ALIGN ( + reads, + index, + save_unaligned + ) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN.out.versions.first()) // // Sort, index BAM file and run samtools stats, flagstat and idxstats // - BAM_SORT_SAMTOOLS ( BOWTIE2_ALIGN.out.bam ) + BAM_SORT_SAMTOOLS ( + BOWTIE2_ALIGN.out.bam + ) + ch_versions = ch_versions.mix(BAM_SORT_SAMTOOLS.out.versions) emit: - bam_orig = BOWTIE2_ALIGN.out.bam // channel: [ val(meta), bam ] - log_out = BOWTIE2_ALIGN.out.log // channel: [ 
val(meta), log ] - fastq = BOWTIE2_ALIGN.out.fastq // channel: [ val(meta), fastq ] - bowtie2_version = BOWTIE2_ALIGN.out.version // path: *.version.txt - - bam = BAM_SORT_SAMTOOLS.out.bam // channel: [ val(meta), [ bam ] ] - bai = BAM_SORT_SAMTOOLS.out.bai // channel: [ val(meta), [ bai ] ] - stats = BAM_SORT_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] - flagstat = BAM_SORT_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] - idxstats = BAM_SORT_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] - samtools_version = BAM_SORT_SAMTOOLS.out.version // path: *.version.txt + bam_orig = BOWTIE2_ALIGN.out.bam // channel: [ val(meta), bam ] + log_out = BOWTIE2_ALIGN.out.log // channel: [ val(meta), log ] + fastq = BOWTIE2_ALIGN.out.fastq // channel: [ val(meta), fastq ] + + bam = BAM_SORT_SAMTOOLS.out.bam // channel: [ val(meta), [ bam ] ] + bai = BAM_SORT_SAMTOOLS.out.bai // channel: [ val(meta), [ bai ] ] + csi = BAM_SORT_SAMTOOLS.out.csi // channel: [ val(meta), [ csi ] ] + stats = BAM_SORT_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] + flagstat = BAM_SORT_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] + idxstats = BAM_SORT_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/bam_sort_samtools.nf b/subworkflows/nf-core/bam_sort_samtools.nf index 89ce5661..d1e6c74c 100644 --- a/subworkflows/nf-core/bam_sort_samtools.nf +++ b/subworkflows/nf-core/bam_sort_samtools.nf @@ -2,26 +2,56 @@ // Sort, index BAM file and run samtools stats, flagstat and idxstats // -params.options = [:] - -include { SAMTOOLS_SORT } from '../../modules/nf-core/modules/samtools/sort/main' addParams( options: params.options ) -include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' addParams( options: params.options ) -include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools' addParams( options: params.options ) +include { SAMTOOLS_SORT } from '../../modules/nf-core/modules/samtools/sort/main' +include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' +include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools' workflow BAM_SORT_SAMTOOLS { take: - bam // channel: [ val(meta), [ bam ] ] + ch_bam // channel: [ val(meta), [ bam ] ] main: - SAMTOOLS_SORT ( bam ) - SAMTOOLS_INDEX ( SAMTOOLS_SORT.out.bam ) - BAM_STATS_SAMTOOLS ( SAMTOOLS_SORT.out.bam.join(SAMTOOLS_INDEX.out.bai, by: [0]) ) + + ch_versions = Channel.empty() + + SAMTOOLS_SORT ( + ch_bam + ) + ch_versions = ch_versions.mix(SAMTOOLS_SORT.out.versions.first()) + + SAMTOOLS_INDEX ( + SAMTOOLS_SORT.out.bam + ) + ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first()) + + SAMTOOLS_SORT + .out + .bam + .join(SAMTOOLS_INDEX.out.bai, by: [0], remainder: true) + .join(SAMTOOLS_INDEX.out.csi, by: [0], remainder: true) + .map { + meta, bam, bai, csi -> + if (bai) { + [ meta, bam, bai ] + } else { + [ meta, bam, csi ] + } + } + .set { ch_bam_bai } + + BAM_STATS_SAMTOOLS ( + ch_bam_bai + ) + ch_versions = ch_versions.mix(BAM_STATS_SAMTOOLS.out.versions) emit: bam = SAMTOOLS_SORT.out.bam // channel: [ val(meta), [ bam ] ] bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ] + csi = SAMTOOLS_INDEX.out.csi // channel: [ val(meta), [ csi ] ] + stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] idxstats = BAM_STATS_SAMTOOLS.out.idxstats // 
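The remainder: true joins in bam_sort_samtools.nf above are what let the stats step accept either a .bai or a .csi index: a sample missing from one side of a join arrives with null in that slot, and the map keeps whichever index exists. Reduced to its skeleton:

```nextflow
ch_bam
    .join(ch_bai, by: [0], remainder: true)   // bai is null when only a csi exists
    .join(ch_csi, by: [0], remainder: true)   // csi is null when only a bai exists
    .map { meta, bam, bai, csi -> bai ? [ meta, bam, bai ] : [ meta, bam, csi ] }
    .set { ch_bam_index }
```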
channel: [ val(meta), [ idxstats ] ] - version = SAMTOOLS_SORT.out.version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/bam_stats_samtools.nf b/subworkflows/nf-core/bam_stats_samtools.nf index 00476afe..68d632c3 100644 --- a/subworkflows/nf-core/bam_stats_samtools.nf +++ b/subworkflows/nf-core/bam_stats_samtools.nf @@ -2,24 +2,37 @@ // Run SAMtools stats, flagstat and idxstats // -params.options = [:] - -include { SAMTOOLS_STATS } from '../../modules/nf-core/modules/samtools/stats/main' addParams( options: params.options ) -include { SAMTOOLS_IDXSTATS } from '../../modules/nf-core/modules/samtools/idxstats/main' addParams( options: params.options ) -include { SAMTOOLS_FLAGSTAT } from '../../modules/nf-core/modules/samtools/flagstat/main' addParams( options: params.options ) +include { SAMTOOLS_STATS } from '../../modules/nf-core/modules/samtools/stats/main' +include { SAMTOOLS_IDXSTATS } from '../../modules/nf-core/modules/samtools/idxstats/main' +include { SAMTOOLS_FLAGSTAT } from '../../modules/nf-core/modules/samtools/flagstat/main' workflow BAM_STATS_SAMTOOLS { take: - bam_bai // channel: [ val(meta), [ bam ], [bai] ] + ch_bam_bai // channel: [ val(meta), [ bam ], [bai/csi] ] main: - SAMTOOLS_STATS ( bam_bai ) - SAMTOOLS_FLAGSTAT ( bam_bai ) - SAMTOOLS_IDXSTATS ( bam_bai ) + ch_versions = Channel.empty() + + SAMTOOLS_STATS ( + ch_bam_bai, + [] + ) + ch_versions = ch_versions.mix(SAMTOOLS_STATS.out.versions.first()) + + SAMTOOLS_FLAGSTAT ( + ch_bam_bai + ) + ch_versions = ch_versions.mix(SAMTOOLS_FLAGSTAT.out.versions.first()) + + SAMTOOLS_IDXSTATS ( + ch_bam_bai + ) + ch_versions = ch_versions.mix(SAMTOOLS_IDXSTATS.out.versions.first()) emit: stats = SAMTOOLS_STATS.out.stats // channel: [ val(meta), [ stats ] ] flagstat = SAMTOOLS_FLAGSTAT.out.flagstat // channel: [ val(meta), [ flagstat ] ] idxstats = SAMTOOLS_IDXSTATS.out.idxstats // channel: [ val(meta), [ idxstats ] ] - version = SAMTOOLS_STATS.out.version // path: *.version.txt + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/fastqc_fastp.nf b/subworkflows/nf-core/fastqc_fastp.nf index 41193c1a..45be0c14 100644 --- a/subworkflows/nf-core/fastqc_fastp.nf +++ b/subworkflows/nf-core/fastqc_fastp.nf @@ -2,47 +2,60 @@ // Read QC and trimming // -params.fastqc_raw_options = [:] -params.fastqc_trim_options = [:] -params.fastp_options = [:] - -include { FASTQC as FASTQC_RAW } from '../../modules/nf-core/modules/fastqc/main' addParams( options: params.fastqc_raw_options ) -include { FASTQC as FASTQC_TRIM } from '../../modules/nf-core/modules/fastqc/main' addParams( options: params.fastqc_trim_options ) -include { FASTP } from '../../modules/nf-core/modules/fastp/main' addParams( options: params.fastp_options ) +include { FASTQC as FASTQC_RAW } from '../../modules/nf-core/modules/fastqc/main' +include { FASTQC as FASTQC_TRIM } from '../../modules/nf-core/modules/fastqc/main' +include { FASTP } from '../../modules/nf-core/modules/fastp/main' workflow FASTQC_FASTP { take: - reads // channel: [ val(meta), [ reads ] ] + reads // channel: [ val(meta), [ reads ] ] + save_trimmed_fail // value: boolean + save_merged // value: boolean main: + + ch_versions = Channel.empty() + fastqc_raw_html = Channel.empty() fastqc_raw_zip = Channel.empty() - fastqc_version = Channel.empty() if (!params.skip_fastqc) { - FASTQC_RAW ( reads ).html.set { fastqc_raw_html } - fastqc_raw_zip = FASTQC_RAW.out.zip - fastqc_version = FASTQC_RAW.out.version + 
FASTQC_RAW ( + reads + ) + fastqc_raw_html = FASTQC_RAW.out.html + fastqc_raw_zip = FASTQC_RAW.out.zip + ch_versions = ch_versions.mix(FASTQC_RAW.out.versions.first()) } - trim_reads = reads - trim_json = Channel.empty() - trim_html = Channel.empty() - trim_log = Channel.empty() - trim_reads_fail = Channel.empty() - fastp_version = Channel.empty() - fastqc_trim_html = Channel.empty() - fastqc_trim_zip = Channel.empty() + trim_reads = reads + trim_json = Channel.empty() + trim_html = Channel.empty() + trim_log = Channel.empty() + trim_reads_fail = Channel.empty() + trim_reads_merged = Channel.empty() + fastqc_trim_html = Channel.empty() + fastqc_trim_zip = Channel.empty() if (!params.skip_fastp) { - FASTP ( reads ).reads.set { trim_reads } - trim_json = FASTP.out.json - trim_html = FASTP.out.html - trim_log = FASTP.out.log - trim_reads_fail = FASTP.out.reads_fail - fastp_version = FASTP.out.version + FASTP ( + reads, + save_trimmed_fail, + save_merged + ) + trim_reads = FASTP.out.reads + trim_json = FASTP.out.json + trim_html = FASTP.out.html + trim_log = FASTP.out.log + trim_reads_fail = FASTP.out.reads_fail + trim_reads_merged = FASTP.out.reads_merged + ch_versions = ch_versions.mix(FASTP.out.versions.first()) if (!params.skip_fastqc) { - FASTQC_TRIM ( trim_reads ).html.set { fastqc_trim_html } - fastqc_trim_zip = FASTQC_TRIM.out.zip + FASTQC_TRIM ( + trim_reads + ) + fastqc_trim_html = FASTQC_TRIM.out.html + fastqc_trim_zip = FASTQC_TRIM.out.zip + ch_versions = ch_versions.mix(FASTQC_TRIM.out.versions.first()) } } @@ -52,11 +65,12 @@ workflow FASTQC_FASTP { trim_html // channel: [ val(meta), [ html ] ] trim_log // channel: [ val(meta), [ log ] ] trim_reads_fail // channel: [ val(meta), [ fastq.gz ] ] - fastp_version // path: *.version.txt + trim_reads_merged // channel: [ val(meta), [ fastq.gz ] ] fastqc_raw_html // channel: [ val(meta), [ html ] ] fastqc_raw_zip // channel: [ val(meta), [ zip ] ] fastqc_trim_html // channel: [ val(meta), [ html ] ] fastqc_trim_zip // channel: [ val(meta), [ zip ] ] - fastqc_version // path: *.version.txt + + versions = ch_versions.ifEmpty(null) // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/filter_bam_samtools.nf b/subworkflows/nf-core/filter_bam_samtools.nf index 583aa29f..cfa8b568 100644 --- a/subworkflows/nf-core/filter_bam_samtools.nf +++ b/subworkflows/nf-core/filter_bam_samtools.nf @@ -2,12 +2,9 @@ // Filter co-ordinate sorted BAM, index and run samtools stats, flagstat and idxstats // -params.samtools_view_options = [:] -params.samtools_index_options = [:] - -include { SAMTOOLS_VIEW } from '../../modules/nf-core/modules/samtools/view/main' addParams( options: params.samtools_view_options ) -include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' addParams( options: params.samtools_index_options ) -include { BAM_STATS_SAMTOOLS } from '../nf-core/bam_stats_samtools' addParams( options: params.samtools_index_options ) +include { SAMTOOLS_VIEW } from '../../modules/nf-core/modules/samtools/view/main' +include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' +include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools' workflow FILTER_BAM_SAMTOOLS { take: @@ -15,22 +12,36 @@ workflow FILTER_BAM_SAMTOOLS { main: + ch_versions = Channel.empty() + // // Filter BAM using Samtools view // - SAMTOOLS_VIEW ( bam ) + SAMTOOLS_VIEW ( + bam, + [] + ) + ch_versions = ch_versions.mix(SAMTOOLS_VIEW.out.versions.first()) // // Index BAM file and run samtools stats, flagstat and idxstats // - 
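With addParams gone, behaviour toggles such as save_trimmed_fail and save_merged now travel into FASTP as ordinary value inputs. A sketch of what the receiving side looks like, where EXAMPLE_TRIM and example_trimmer are hypothetical and shown only to illustrate the val input:

```nextflow
process EXAMPLE_TRIM {
    input:
    tuple val(meta), path(reads)
    val   save_trimmed_fail   // plain boolean supplied by the calling workflow

    output:
    tuple val(meta), path("*.trim.fastq.gz"), emit: reads

    script:
    def fail_arg = save_trimmed_fail ? "--save-fail ${meta.id}.fail.fastq.gz" : ''
    """
    example_trimmer $fail_arg -o ${meta.id}.trim.fastq.gz $reads
    """
}
```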
SAMTOOLS_INDEX ( SAMTOOLS_VIEW.out.bam ) - BAM_STATS_SAMTOOLS ( SAMTOOLS_VIEW.out.bam.join(SAMTOOLS_INDEX.out.bai, by: [0]) ) + SAMTOOLS_INDEX ( + SAMTOOLS_VIEW.out.bam + ) + ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first()) + + BAM_STATS_SAMTOOLS ( + SAMTOOLS_VIEW.out.bam.join(SAMTOOLS_INDEX.out.bai, by: [0]) + ) + ch_versions = ch_versions.mix(BAM_STATS_SAMTOOLS.out.versions) emit: - bam = SAMTOOLS_VIEW.out.bam // channel: [ val(meta), [ bam ] ] - bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ] - stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] - flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] - idxstats = BAM_STATS_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] - samtools_version = SAMTOOLS_INDEX.out.version // path: *.version.txt + bam = SAMTOOLS_VIEW.out.bam // channel: [ val(meta), [ bam ] ] + bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ] + stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] + flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] + idxstats = BAM_STATS_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/mark_duplicates_picard.nf b/subworkflows/nf-core/mark_duplicates_picard.nf index e5bc9501..08bb41ba 100644 --- a/subworkflows/nf-core/mark_duplicates_picard.nf +++ b/subworkflows/nf-core/mark_duplicates_picard.nf @@ -1,13 +1,10 @@ // -// Picard MarkDuplicates, sort, index BAM file and run samtools stats, flagstat and idxstats +// Picard MarkDuplicates, index BAM file and run samtools stats, flagstat and idxstats // -params.markduplicates_options = [:] -params.samtools_options = [:] - -include { PICARD_MARKDUPLICATES } from '../../modules/nf-core/modules/picard/markduplicates/main' addParams( options: params.markduplicates_options ) -include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' addParams( options: params.samtools_options ) -include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools' addParams( options: params.samtools_options ) +include { PICARD_MARKDUPLICATES } from '../../modules/nf-core/modules/picard/markduplicates/main' +include { SAMTOOLS_INDEX } from '../../modules/nf-core/modules/samtools/index/main' +include { BAM_STATS_SAMTOOLS } from './bam_stats_samtools' workflow MARK_DUPLICATES_PICARD { take: @@ -15,25 +12,53 @@ workflow MARK_DUPLICATES_PICARD { main: + ch_versions = Channel.empty() + // // Picard MarkDuplicates // - PICARD_MARKDUPLICATES ( bam ) + PICARD_MARKDUPLICATES ( + bam + ) + ch_versions = ch_versions.mix(PICARD_MARKDUPLICATES.out.versions.first()) // // Index BAM file and run samtools stats, flagstat and idxstats // - SAMTOOLS_INDEX ( PICARD_MARKDUPLICATES.out.bam ) - BAM_STATS_SAMTOOLS ( PICARD_MARKDUPLICATES.out.bam.join(SAMTOOLS_INDEX.out.bai, by: [0]) ) + SAMTOOLS_INDEX ( + PICARD_MARKDUPLICATES.out.bam + ) + ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first()) + + PICARD_MARKDUPLICATES + .out + .bam + .join(SAMTOOLS_INDEX.out.bai, by: [0], remainder: true) + .join(SAMTOOLS_INDEX.out.csi, by: [0], remainder: true) + .map { + meta, bam, bai, csi -> + if (bai) { + [ meta, bam, bai ] + } else { + [ meta, bam, csi ] + } + } + .set { ch_bam_bai } + + BAM_STATS_SAMTOOLS ( + ch_bam_bai + ) + ch_versions = ch_versions.mix(BAM_STATS_SAMTOOLS.out.versions) emit: - bam = PICARD_MARKDUPLICATES.out.bam // channel: [ val(meta), [ 
bam ] ] - metrics = PICARD_MARKDUPLICATES.out.metrics // channel: [ val(meta), [ metrics ] ] - picard_version = PICARD_MARKDUPLICATES.out.version // path: *.version.txt - - bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ] - stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] - flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] - idxstats = BAM_STATS_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] - samtools_version = SAMTOOLS_INDEX.out.version // path: *.version.txt + bam = PICARD_MARKDUPLICATES.out.bam // channel: [ val(meta), [ bam ] ] + metrics = PICARD_MARKDUPLICATES.out.metrics // channel: [ val(meta), [ metrics ] ] + + bai = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), [ bai ] ] + csi = SAMTOOLS_INDEX.out.csi // channel: [ val(meta), [ csi ] ] + stats = BAM_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] + flagstat = BAM_STATS_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] + idxstats = BAM_STATS_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] + + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/primer_trim_ivar.nf b/subworkflows/nf-core/primer_trim_ivar.nf new file mode 100644 index 00000000..e3046bb0 --- /dev/null +++ b/subworkflows/nf-core/primer_trim_ivar.nf @@ -0,0 +1,45 @@ +// +// iVar trim, sort, index BAM file and run samtools stats, flagstat and idxstats +// + +include { IVAR_TRIM } from '../../modules/nf-core/modules/ivar/trim/main' +include { BAM_SORT_SAMTOOLS } from './bam_sort_samtools' + +workflow PRIMER_TRIM_IVAR { + take: + bam // channel: [ val(meta), [ bam ], [bai] ] + bed // path : bed + + main: + + ch_versions = Channel.empty() + + // + // iVar trim primers + // + IVAR_TRIM ( + bam, + bed + ) + ch_versions = ch_versions.mix(IVAR_TRIM.out.versions.first()) + + // + // Sort, index BAM file and run samtools stats, flagstat and idxstats + // + BAM_SORT_SAMTOOLS ( + IVAR_TRIM.out.bam + ) + ch_versions = ch_versions.mix(BAM_SORT_SAMTOOLS.out.versions) + + emit: + bam_orig = IVAR_TRIM.out.bam // channel: [ val(meta), bam ] + log_out = IVAR_TRIM.out.log // channel: [ val(meta), log ] + + bam = BAM_SORT_SAMTOOLS.out.bam // channel: [ val(meta), [ bam ] ] + bai = BAM_SORT_SAMTOOLS.out.bai // channel: [ val(meta), [ bai ] ] + stats = BAM_SORT_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] + flagstat = BAM_SORT_SAMTOOLS.out.flagstat // channel: [ val(meta), [ flagstat ] ] + idxstats = BAM_SORT_SAMTOOLS.out.idxstats // channel: [ val(meta), [ idxstats ] ] + + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/nf-core/vcf_bgzip_tabix_stats.nf b/subworkflows/nf-core/vcf_bgzip_tabix_stats.nf index 73437873..6df4bad7 100644 --- a/subworkflows/nf-core/vcf_bgzip_tabix_stats.nf +++ b/subworkflows/nf-core/vcf_bgzip_tabix_stats.nf @@ -2,26 +2,31 @@ // Run BCFTools bgzip, tabix and stats commands // -params.bgzip_options = [:] -params.tabix_options = [:] -params.stats_options = [:] - -include { TABIX_BGZIP } from '../../modules/nf-core/modules/tabix/bgzip/main' addParams( options: params.bgzip_options ) -include { VCF_TABIX_STATS } from './vcf_tabix_stats' addParams( tabix_options: params.tabix_options, stats_options: params.stats_options ) +include { TABIX_BGZIP } from '../../modules/nf-core/modules/tabix/bgzip/main' +include { VCF_TABIX_STATS } from './vcf_tabix_stats' workflow VCF_BGZIP_TABIX_STATS { take: vcf // channel: [ val(meta), [ vcf ] ] main: - TABIX_BGZIP ( vcf ) - 
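primer_trim_ivar.nf above composes one nf-core module with the bam_sort_samtools subworkflow and exposes both layers' outputs. A hypothetical caller, with channel contents mirroring its take: block:

```nextflow
workflow TEST_PRIMER_TRIM {
    ch_bam = Channel.of(
        [ [ id:'sample1', single_end:false ], file('sample1.bam'), file('sample1.bam.bai') ]
    )
    PRIMER_TRIM_IVAR ( ch_bam, file('primers.bed') )
    PRIMER_TRIM_IVAR.out.flagstat.view()   // stats for the sorted, primer-clipped BAMs
}
```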
VCF_TABIX_STATS ( TABIX_BGZIP.out.gz ) + + ch_versions = Channel.empty() + + TABIX_BGZIP ( + vcf + ) + ch_versions = ch_versions.mix(TABIX_BGZIP.out.versions.first()) + + VCF_TABIX_STATS ( + TABIX_BGZIP.out.gz + ) + ch_versions = ch_versions.mix(VCF_TABIX_STATS.out.versions) emit: - vcf = TABIX_BGZIP.out.gz // channel: [ val(meta), [ vcf.gz ] ] - tabix_version = TABIX_BGZIP.out.version // path: *.version.txt + vcf = TABIX_BGZIP.out.gz // channel: [ val(meta), [ vcf.gz ] ] + tbi = VCF_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] + stats = VCF_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] - tbi = VCF_TABIX_STATS.out.tbi // channel: [ val(meta), [ tbi ] ] - stats = VCF_TABIX_STATS.out.stats // channel: [ val(meta), [ txt ] ] - bcftools_version = VCF_TABIX_STATS.out.bcftools_version // path: *.version.txt + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/vcf_tabix_stats.nf b/subworkflows/nf-core/vcf_tabix_stats.nf index a49d824f..623ff347 100644 --- a/subworkflows/nf-core/vcf_tabix_stats.nf +++ b/subworkflows/nf-core/vcf_tabix_stats.nf @@ -2,24 +2,31 @@ // Run BCFTools tabix and stats commands // -params.tabix_options = [:] -params.stats_options = [:] - -include { TABIX_TABIX } from '../../modules/nf-core/modules/tabix/tabix/main' addParams( options: params.tabix_options ) -include { BCFTOOLS_STATS } from '../../modules/nf-core/modules/bcftools/stats/main' addParams( options: params.stats_options ) +include { TABIX_TABIX } from '../../modules/nf-core/modules/tabix/tabix/main' +include { BCFTOOLS_STATS } from '../../modules/nf-core/modules/bcftools/stats/main' workflow VCF_TABIX_STATS { take: vcf // channel: [ val(meta), [ vcf ] ] main: - TABIX_TABIX ( vcf ) - BCFTOOLS_STATS ( vcf ) + + ch_versions = Channel.empty() + + TABIX_TABIX ( + vcf + ) + ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first()) + + BCFTOOLS_STATS ( + vcf + ) + ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first()) emit: - tbi = TABIX_TABIX.out.tbi // channel: [ val(meta), [ tbi ] ] - tabix_version = TABIX_TABIX.out.version // path: *.version.txt + tbi = TABIX_TABIX.out.tbi // channel: [ val(meta), [ tbi ] ] + stats = BCFTOOLS_STATS.out.stats // channel: [ val(meta), [ txt ] ] + + versions = ch_versions // channel: [ versions.yml ] - stats = BCFTOOLS_STATS.out.stats // channel: [ val(meta), [ txt ] ] - bcftools_version = BCFTOOLS_STATS.out.version // path: *.version.txt } diff --git a/workflows/illumina.nf b/workflows/illumina.nf index f8cba65a..b8bc5484 100644 --- a/workflows/illumina.nf +++ b/workflows/illumina.nf @@ -5,10 +5,11 @@ */ def valid_params = [ - protocols : ['metagenomic', 'amplicon'], - callers : ['ivar', 'bcftools'], - assemblers : ['spades', 'unicycler', 'minia'], - spades_modes: ['rnaviral', 'corona', 'metaviral', 'meta', 'metaplasmid', 'plasmid', 'isolate', 'rna', 'bio'] + protocols : ['metagenomic', 'amplicon'], + variant_callers : ['ivar', 'bcftools'], + consensus_callers : ['ivar', 'bcftools'], + assemblers : ['spades', 'unicycler', 'minia'], + spades_modes : ['rnaviral', 'corona', 'metaviral', 'meta', 'metaplasmid', 'plasmid', 'isolate', 'rna', 'bio'] ] def summary_params = NfcoreSchema.paramsSummaryMap(workflow, params) @@ -24,15 +25,13 @@ def checkPathParamList = [ ] for (param in checkPathParamList) { if (param) { file(param, checkIfExists: true) } } -// Stage dummy file to be used as an optional input where required -ch_dummy_file = file("$projectDir/assets/dummy_file.txt", checkIfExists: true) - if (params.input) { 
ch_input = file(params.input) } else { exit 1, 'Input samplesheet file not specified!' } -if (params.spades_hmm) { ch_spades_hmm = file(params.spades_hmm) } else { ch_spades_hmm = ch_dummy_file } +if (params.spades_hmm) { ch_spades_hmm = file(params.spades_hmm) } else { ch_spades_hmm = [] } def assemblers = params.assemblers ? params.assemblers.split(',').collect{ it.trim().toLowerCase() } : [] -def callers = params.callers ? params.callers.split(',').collect{ it.trim().toLowerCase() } : [] -if (!callers) { callers = params.protocol == 'amplicon' ? ['ivar'] : ['bcftools'] } + +def variant_caller = params.variant_caller +if (!variant_caller) { variant_caller = params.protocol == 'amplicon' ? 'ivar' : 'bcftools' } /* ======================================================================================== @@ -41,7 +40,7 @@ if (!callers) { callers = params.protocol == 'amplicon' ? ['ivar'] : ['bcftools */ ch_multiqc_config = file("$projectDir/assets/multiqc_config_illumina.yaml", checkIfExists: true) -ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath(params.multiqc_config) : Channel.empty() +ch_multiqc_custom_config = params.multiqc_config ? file(params.multiqc_config) : [] // Header files ch_blast_outfmt6_header = file("$projectDir/assets/headers/blast_outfmt6_header.txt", checkIfExists: true) @@ -53,72 +52,30 @@ ch_ivar_variants_header_mqc = file("$projectDir/assets/headers/ivar_variants_hea ======================================================================================== */ -// Don't overwrite global params.modules, create a copy instead and use that within the main script. -def modules = params.modules.clone() - -def multiqc_options = modules['illumina_multiqc'] -multiqc_options.args += params.multiqc_title ? Utils.joinModuleArgs(["--title \"$params.multiqc_title\""]) : '' - -if (!params.skip_assembly) { - multiqc_options.publish_files.put('assembly_metrics_mqc.csv','') -} -if (!params.skip_variants) { - multiqc_options.publish_files.put('variants_metrics_mqc.csv','') -} - -include { BCFTOOLS_ISEC } from '../modules/local/bcftools_isec' addParams( options: modules['illumina_bcftools_isec'] ) -include { CUTADAPT } from '../modules/local/cutadapt' addParams( options: modules['illumina_cutadapt'] ) -include { GET_SOFTWARE_VERSIONS } from '../modules/local/get_software_versions' addParams( options: [publish_files: ['tsv':'']] ) -include { MULTIQC } from '../modules/local/multiqc_illumina' addParams( options: multiqc_options ) -include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_GENOME } from '../modules/local/plot_mosdepth_regions' addParams( options: modules['illumina_plot_mosdepth_regions_genome'] ) -include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_AMPLICON } from '../modules/local/plot_mosdepth_regions' addParams( options: modules['illumina_plot_mosdepth_regions_amplicon'] ) -include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_FAIL_READS } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] ) -include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_FAIL_MAPPED } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] ) -include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_IVAR_NEXTCLADE } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] ) -include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_BCFTOOLS_NEXTCLADE } from '../modules/local/multiqc_custom_tsv_from_string' addParams( 
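The dummy-file workaround is retired here: an empty list is a valid stand-in for an unused optional path input, which is why ch_spades_hmm = [] replaces the staged assets/dummy_file.txt (the same idiom appears above as SAMTOOLS_VIEW ( bam, [] )). On the process side the guard looks like this, with EXAMPLE_ASSEMBLY illustrative rather than the real SPAdes module:

```nextflow
process EXAMPLE_ASSEMBLY {
    input:
    tuple val(meta), path(reads)
    path  hmm                    // receives [] when --spades_hmm is not set

    script:
    def hmm_arg = hmm ? "--custom-hmms $hmm" : ''
    """
    echo "spades.py $hmm_arg -o ${meta.id} $reads"
    """
}
```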
options: [publish_files: false] ) +// +// MODULE: Loaded from modules/local/ +// +include { CUTADAPT } from '../modules/local/cutadapt' +include { MULTIQC } from '../modules/local/multiqc_illumina' +include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_GENOME } from '../modules/local/plot_mosdepth_regions' +include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_AMPLICON } from '../modules/local/plot_mosdepth_regions' +include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_FAIL_READS } from '../modules/local/multiqc_tsv_from_list' +include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_FAIL_MAPPED } from '../modules/local/multiqc_tsv_from_list' +include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_NEXTCLADE } from '../modules/local/multiqc_tsv_from_list' // // SUBWORKFLOW: Consisting of a mix of local and nf-core/modules // -def publish_genome_options = params.save_reference ? [publish_dir: 'genome'] : [publish_files: false] -def publish_index_options = params.save_reference ? [publish_dir: 'genome/index'] : [publish_files: false] -def publish_db_options = params.save_reference ? [publish_dir: 'genome/db'] : [publish_files: false] -def bedtools_getfasta_options = modules['illumina_bedtools_getfasta'] -def bowtie2_build_options = modules['illumina_bowtie2_build'] -def snpeff_build_options = modules['illumina_snpeff_build'] -def makeblastdb_options = modules['illumina_blast_makeblastdb'] -def kraken2_build_options = modules['illumina_kraken2_build'] -def collapse_primers_options = modules['illumina_collapse_primers_illumina'] -if (!params.save_reference) { - bedtools_getfasta_options['publish_files'] = false - bowtie2_build_options['publish_files'] = false - snpeff_build_options['publish_files'] = false - makeblastdb_options['publish_files'] = false - kraken2_build_options['publish_files'] = false - collapse_primers_options['publish_files'] = false -} - -def ivar_trim_options = modules['illumina_ivar_trim'] -ivar_trim_options.args += params.ivar_trim_noprimer ? '' : Utils.joinModuleArgs(['-e']) -ivar_trim_options.args += params.ivar_trim_offset ? Utils.joinModuleArgs(["-x ${params.ivar_trim_offset}"]) : '' - -def ivar_trim_sort_bam_options = modules['illumina_ivar_trim_sort_bam'] -if (params.skip_markduplicates) { - ivar_trim_sort_bam_options.publish_files.put('bam','') - ivar_trim_sort_bam_options.publish_files.put('bai','') -} - -def spades_options = modules['illumina_spades'] -spades_options.args += params.spades_mode ? 
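MULTIQC_TSV_FROM_LIST is included three times under different aliases; each alias is addressable as its own process, which is how per-copy publishing and arguments are configured now that the addParams options maps are gone. The mechanism in two lines:

```nextflow
include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_FAIL_READS  } from '../modules/local/multiqc_tsv_from_list'
include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_FAIL_MAPPED } from '../modules/local/multiqc_tsv_from_list'
```

Each alias can then be targeted with a withName: 'MULTIQC_TSV_FAIL_READS' selector in the pipeline configuration (in current nf-core pipelines that block conventionally lives in conf/modules.config).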
Utils.joinModuleArgs(["--${params.spades_mode}"]) : '' - -include { INPUT_CHECK } from '../subworkflows/local/input_check' addParams( options: [:] ) -include { PREPARE_GENOME } from '../subworkflows/local/prepare_genome_illumina' addParams( genome_options: publish_genome_options, index_options: publish_index_options, db_options: publish_db_options, bowtie2_build_options: bowtie2_build_options, bedtools_getfasta_options: bedtools_getfasta_options, collapse_primers_options: collapse_primers_options, snpeff_build_options: snpeff_build_options, makeblastdb_options: makeblastdb_options, kraken2_build_options: kraken2_build_options ) -include { PRIMER_TRIM_IVAR } from '../subworkflows/local/primer_trim_ivar' addParams( ivar_trim_options: ivar_trim_options, samtools_options: ivar_trim_sort_bam_options ) -include { VARIANTS_IVAR } from '../subworkflows/local/variants_ivar' addParams( ivar_variants_options: modules['illumina_ivar_variants'], ivar_variants_to_vcf_options: modules['illumina_ivar_variants_to_vcf'], tabix_bgzip_options: modules['illumina_ivar_tabix_bgzip'], tabix_tabix_options: modules['illumina_ivar_tabix_tabix'], bcftools_stats_options: modules['illumina_ivar_bcftools_stats'], ivar_consensus_options: modules['illumina_ivar_consensus'], consensus_plot_options: modules['illumina_ivar_consensus_plot'], quast_options: modules['illumina_ivar_quast'], snpeff_options: modules['illumina_ivar_snpeff'], snpsift_options: modules['illumina_ivar_snpsift'], snpeff_bgzip_options: modules['illumina_ivar_snpeff_bgzip'], snpeff_tabix_options: modules['illumina_ivar_snpeff_tabix'], snpeff_stats_options: modules['illumina_ivar_snpeff_stats'], pangolin_options: modules['illumina_ivar_pangolin'], nextclade_options: modules['illumina_ivar_nextclade'], asciigenome_options: modules['illumina_ivar_asciigenome'] ) -include { VARIANTS_BCFTOOLS } from '../subworkflows/local/variants_bcftools' addParams( bcftools_mpileup_options: modules['illumina_bcftools_mpileup'], quast_options: modules['illumina_bcftools_quast'], consensus_genomecov_options: modules['illumina_bcftools_consensus_genomecov'], consensus_merge_options: modules['illumina_bcftools_consensus_merge'], consensus_mask_options: modules['illumina_bcftools_consensus_mask'], consensus_maskfasta_options: modules['illumina_bcftools_consensus_maskfasta'], consensus_bcftools_options: modules['illumina_bcftools_consensus_bcftools'], consensus_plot_options: modules['illumina_bcftools_consensus_plot'], snpeff_options: modules['illumina_bcftools_snpeff'], snpsift_options: modules['illumina_bcftools_snpsift'], snpeff_bgzip_options: modules['illumina_bcftools_snpeff_bgzip'], snpeff_tabix_options: modules['illumina_bcftools_snpeff_tabix'], snpeff_stats_options: modules['illumina_bcftools_snpeff_stats'], pangolin_options: modules['illumina_bcftools_pangolin'], nextclade_options: modules['illumina_bcftools_nextclade'], asciigenome_options: modules['illumina_bcftools_asciigenome'] ) -include { ASSEMBLY_SPADES } from '../subworkflows/local/assembly_spades' addParams( spades_options: spades_options, bandage_options: modules['illumina_spades_bandage'], blastn_options: modules['illumina_spades_blastn'], blastn_filter_options: modules['illumina_spades_blastn_filter'], abacas_options: modules['illumina_spades_abacas'], plasmidid_options: modules['illumina_spades_plasmidid'], quast_options: modules['illumina_spades_quast'] ) -include { ASSEMBLY_UNICYCLER } from '../subworkflows/local/assembly_unicycler' addParams( unicycler_options: modules['illumina_unicycler'], 
bandage_options: modules['illumina_unicycler_bandage'], blastn_options: modules['illumina_unicycler_blastn'], blastn_filter_options: modules['illumina_unicycler_blastn_filter'], abacas_options: modules['illumina_unicycler_abacas'], plasmidid_options: modules['illumina_unicycler_plasmidid'], quast_options: modules['illumina_unicycler_quast'] ) -include { ASSEMBLY_MINIA } from '../subworkflows/local/assembly_minia' addParams( minia_options: modules['illumina_minia'], blastn_options: modules['illumina_minia_blastn'], blastn_filter_options: modules['illumina_minia_blastn_filter'], abacas_options: modules['illumina_minia_abacas'], plasmidid_options: modules['illumina_minia_plasmidid'], quast_options: modules['illumina_minia_quast'] ) +include { INPUT_CHECK } from '../subworkflows/local/input_check' +include { PREPARE_GENOME } from '../subworkflows/local/prepare_genome_illumina' +include { VARIANTS_IVAR } from '../subworkflows/local/variants_ivar' +include { VARIANTS_BCFTOOLS } from '../subworkflows/local/variants_bcftools' +include { CONSENSUS_IVAR } from '../subworkflows/local/consensus_ivar' +include { CONSENSUS_BCFTOOLS } from '../subworkflows/local/consensus_bcftools' +include { VARIANTS_LONG_TABLE } from '../subworkflows/local/variants_long_table' +include { ASSEMBLY_SPADES } from '../subworkflows/local/assembly_spades' +include { ASSEMBLY_UNICYCLER } from '../subworkflows/local/assembly_unicycler' +include { ASSEMBLY_MINIA } from '../subworkflows/local/assembly_minia' /* ======================================================================================== @@ -129,28 +86,21 @@ include { ASSEMBLY_MINIA } from '../subworkflows/local/assembly_minia' // // MODULE: Installed directly from nf-core/modules // -include { CAT_FASTQ } from '../modules/nf-core/modules/cat/fastq/main' addParams( options: modules['illumina_cat_fastq'] ) -include { FASTQC } from '../modules/nf-core/modules/fastqc/main' addParams( options: modules['illumina_cutadapt_fastqc'] ) -include { KRAKEN2_KRAKEN2 } from '../modules/nf-core/modules/kraken2/kraken2/main' addParams( options: modules['illumina_kraken2_kraken2'] ) -include { PICARD_COLLECTMULTIPLEMETRICS } from '../modules/nf-core/modules/picard/collectmultiplemetrics/main' addParams( options: modules['illumina_picard_collectmultiplemetrics'] ) -include { MOSDEPTH as MOSDEPTH_GENOME } from '../modules/nf-core/modules/mosdepth/main' addParams( options: modules['illumina_mosdepth_genome'] ) -include { MOSDEPTH as MOSDEPTH_AMPLICON } from '../modules/nf-core/modules/mosdepth/main' addParams( options: modules['illumina_mosdepth_amplicon'] ) +include { CAT_FASTQ } from '../modules/nf-core/modules/cat/fastq/main' +include { FASTQC } from '../modules/nf-core/modules/fastqc/main' +include { KRAKEN2_KRAKEN2 } from '../modules/nf-core/modules/kraken2/kraken2/main' +include { PICARD_COLLECTMULTIPLEMETRICS } from '../modules/nf-core/modules/picard/collectmultiplemetrics/main' +include { CUSTOM_DUMPSOFTWAREVERSIONS } from '../modules/nf-core/modules/custom/dumpsoftwareversions/main' +include { MOSDEPTH as MOSDEPTH_GENOME } from '../modules/nf-core/modules/mosdepth/main' +include { MOSDEPTH as MOSDEPTH_AMPLICON } from '../modules/nf-core/modules/mosdepth/main' // // SUBWORKFLOW: Consisting entirely of nf-core/modules // -def fastp_options = modules['illumina_fastp'] -if (params.save_trimmed_fail) { fastp_options.publish_files.put('fail.fastq.gz','') } - -def bowtie2_align_options = modules['illumina_bowtie2_align'] -if (params.save_unaligned) { 
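CUSTOM_DUMPSOFTWAREVERSIONS, included above, is the sink for everything mixed into ch_versions. The call itself sits outside this hunk; in the nf-core template it is conventionally wired at the end of the main workflow roughly as follows:

```nextflow
CUSTOM_DUMPSOFTWAREVERSIONS (
    ch_versions.unique().collectFile(name: 'collated_versions.yml')
)
```

collectFile concatenates every per-tool versions.yml into one file, so the module emits a single software-versions report for MultiQC.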
bowtie2_align_options.publish_files.put('fastq.gz','unmapped') } - -def markduplicates_options = modules['illumina_picard_markduplicates'] -markduplicates_options.args += params.filter_duplicates ? Utils.joinModuleArgs(['REMOVE_DUPLICATES=true']) : '' - -include { FASTQC_FASTP } from '../subworkflows/nf-core/fastqc_fastp' addParams( fastqc_raw_options: modules['illumina_fastqc_raw'], fastqc_trim_options: modules['illumina_fastqc_trim'], fastp_options: fastp_options ) -include { ALIGN_BOWTIE2 } from '../subworkflows/nf-core/align_bowtie2' addParams( align_options: bowtie2_align_options, samtools_options: modules['illumina_bowtie2_sort_bam'] ) -include { MARK_DUPLICATES_PICARD } from '../subworkflows/nf-core/mark_duplicates_picard' addParams( markduplicates_options: markduplicates_options, samtools_options: modules['illumina_picard_markduplicates_sort_bam'] ) +include { FASTQC_FASTP } from '../subworkflows/nf-core/fastqc_fastp' +include { ALIGN_BOWTIE2 } from '../subworkflows/nf-core/align_bowtie2' +include { PRIMER_TRIM_IVAR } from '../subworkflows/nf-core/primer_trim_ivar' +include { MARK_DUPLICATES_PICARD } from '../subworkflows/nf-core/mark_duplicates_picard' /* ======================================================================================== @@ -165,14 +115,13 @@ def fail_mapped_reads = [:] workflow ILLUMINA { - ch_software_versions = Channel.empty() + ch_versions = Channel.empty() // // SUBWORKFLOW: Uncompress and prepare reference genome files // - PREPARE_GENOME ( - ch_dummy_file - ) + PREPARE_GENOME () + ch_versions = ch_versions.mix(PREPARE_GENOME.out.versions) // Check genome fasta only contains a single contig PREPARE_GENOME @@ -187,7 +136,22 @@ workflow ILLUMINA { .primer_bed .map { WorkflowCommons.checkPrimerSuffixes(it, params.primer_left_suffix, params.primer_right_suffix, log) } - // Check if the primer BED file supplied to the pipeline is from the SWIFT/SNAP protocol + // Check whether the contigs in the primer BED file are present in the reference genome + PREPARE_GENOME + .out + .primer_bed + .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] } + .set { ch_bed_contigs } + + PREPARE_GENOME + .out + .fai + .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] } + .concat(ch_bed_contigs) + .collect() + .map { fai, bed -> WorkflowCommons.checkContigsInBED(fai, bed, log) } + + // Check whether the primer BED file supplied to the pipeline is from the SWIFT/SNAP protocol if (!params.ivar_trim_offset) { PREPARE_GENOME .out @@ -203,6 +167,7 @@ workflow ILLUMINA { ch_input, params.platform ) + .sample_info .map { meta, fastq -> meta.id = meta.id.split('_')[0..-2].join('_') @@ -217,6 +182,7 @@ workflow ILLUMINA { return [ meta, fastq.flatten() ] } .set { ch_fastq } + ch_versions = ch_versions.mix(INPUT_CHECK.out.versions) // // MODULE: Concatenate FastQ files from same sample if required @@ -224,18 +190,21 @@ workflow ILLUMINA { CAT_FASTQ ( ch_fastq.multiple ) + .reads .mix(ch_fastq.single) .set { ch_cat_fastq } + ch_versions = ch_versions.mix(CAT_FASTQ.out.versions.first().ifEmpty(null)) // // SUBWORKFLOW: Read QC and trim adapters // FASTQC_FASTP ( - ch_cat_fastq + ch_cat_fastq, + params.save_trimmed_fail, + false ) - ch_variants_fastq = FASTQC_FASTP.out.reads - ch_software_versions = ch_software_versions.mix(FASTQC_FASTP.out.fastqc_version.first().ifEmpty(null)) - ch_software_versions = ch_software_versions.mix(FASTQC_FASTP.out.fastp_version.first().ifEmpty(null)) + ch_variants_fastq = FASTQC_FASTP.out.reads + ch_versions = 
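The wholesale removal of `addParams( options: ... )` above follows the nf-core DSL2 template update: per-module options move out of the workflow script and into `conf/modules.config`, where they are picked up via `ext.args`, so the script only needs plain `include` statements. A minimal sketch of that convention, with an illustrative selector and arguments rather than the pipeline's actual values:

```nextflow
// conf/modules.config -- sketch only; the selector name and args are assumed
process {
    withName: 'IVAR_TRIM' {
        ext.args = [
            '-m 30 -q 20',
            params.ivar_trim_noprimer ? '' : '-e'
        ].join(' ').trim()
        publishDir = [
            path: { "${params.outdir}/variants/bowtie2" },
            mode: 'copy',
            pattern: '*.log'
        ]
    }
}
```

Because options are resolved per process name at configuration time, the `modules = params.modules.clone()` bookkeeping and the `Utils.joinModuleArgs()` helper seen in the removed lines become unnecessary.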
 
 /*
 ========================================================================================
@@ -165,14 +115,13 @@ def fail_mapped_reads = [:]
 
 workflow ILLUMINA {
 
-    ch_software_versions = Channel.empty()
+    ch_versions = Channel.empty()
 
     //
     // SUBWORKFLOW: Uncompress and prepare reference genome files
     //
-    PREPARE_GENOME (
-        ch_dummy_file
-    )
+    PREPARE_GENOME ()
+    ch_versions = ch_versions.mix(PREPARE_GENOME.out.versions)
 
     // Check genome fasta only contains a single contig
     PREPARE_GENOME
@@ -187,7 +136,22 @@ workflow ILLUMINA {
         .primer_bed
         .map { WorkflowCommons.checkPrimerSuffixes(it, params.primer_left_suffix, params.primer_right_suffix, log) }
 
-    // Check if the primer BED file supplied to the pipeline is from the SWIFT/SNAP protocol
+    // Check whether the contigs in the primer BED file are present in the reference genome
+    PREPARE_GENOME
+        .out
+        .primer_bed
+        .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] }
+        .set { ch_bed_contigs }
+
+    PREPARE_GENOME
+        .out
+        .fai
+        .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] }
+        .concat(ch_bed_contigs)
+        .collect()
+        .map { fai, bed -> WorkflowCommons.checkContigsInBED(fai, bed, log) }
+
+    // Check whether the primer BED file supplied to the pipeline is from the SWIFT/SNAP protocol
     if (!params.ivar_trim_offset) {
         PREPARE_GENOME
             .out
@@ -203,6 +167,7 @@ workflow ILLUMINA {
         ch_input,
         params.platform
     )
+    .sample_info
    .map { meta, fastq ->
        meta.id = meta.id.split('_')[0..-2].join('_')
@@ -217,6 +182,7 @@ workflow ILLUMINA {
        return [ meta, fastq.flatten() ]
    }
    .set { ch_fastq }
+    ch_versions = ch_versions.mix(INPUT_CHECK.out.versions)
 
     //
     // MODULE: Concatenate FastQ files from same sample if required
     //
@@ -224,18 +190,21 @@
     CAT_FASTQ (
         ch_fastq.multiple
     )
+    .reads
    .mix(ch_fastq.single)
    .set { ch_cat_fastq }
+    ch_versions = ch_versions.mix(CAT_FASTQ.out.versions.first().ifEmpty(null))
 
     //
     // SUBWORKFLOW: Read QC and trim adapters
     //
     FASTQC_FASTP (
-        ch_cat_fastq
+        ch_cat_fastq,
+        params.save_trimmed_fail,
+        false
     )
-    ch_variants_fastq = FASTQC_FASTP.out.reads
-    ch_software_versions = ch_software_versions.mix(FASTQC_FASTP.out.fastqc_version.first().ifEmpty(null))
-    ch_software_versions = ch_software_versions.mix(FASTQC_FASTP.out.fastp_version.first().ifEmpty(null))
+    ch_variants_fastq = FASTQC_FASTP.out.reads
+    ch_versions = ch_versions.mix(FASTQC_FASTP.out.versions)
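The contig consistency check added in the hunk above leans on two small helpers in `lib/WorkflowCommons.groovy` whose implementation is not part of this diff. A hypothetical sketch of what they might look like:

```groovy
// lib/WorkflowCommons.groovy -- hypothetical sketch, not the shipped code

// Extract one column from a tabular file, e.g. contig names from a BED or FAI
public static ArrayList getColFromFile(input_file, col = 0, uniqify = false, sep = '\t') {
    def vals = []
    input_file.eachLine { line ->
        def val = line.split(sep)[col]
        if (!uniqify || !vals.contains(val)) {
            vals << val
        }
    }
    return vals
}

// Fail fast if the BED file references contigs absent from the genome index
public static void checkContigsInBED(fai_contigs, bed_contigs, log) {
    def missing = bed_contigs - fai_contigs
    if (missing) {
        log.error "Contig(s) in the primer BED but not in the reference FASTA: ${missing.join(', ')}"
        System.exit(1)
    }
}
```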
 
     //
     // Filter empty FastQ files after adapter trimming
@@ -266,9 +235,9 @@ workflow ILLUMINA {
         }
         .set { ch_pass_fail_reads }
 
-        MULTIQC_CUSTOM_TSV_FAIL_READS (
+        MULTIQC_TSV_FAIL_READS (
             ch_pass_fail_reads.collect(),
-            'Sample\tReads before trimming',
+            ['Sample', 'Reads before trimming'],
             'fail_mapped_reads'
         )
         .set { ch_fail_reads_multiqc }
@@ -284,8 +253,8 @@ workflow ILLUMINA {
             ch_variants_fastq,
             PREPARE_GENOME.out.kraken2_db
         )
-        ch_kraken2_multiqc = KRAKEN2_KRAKEN2.out.txt
-        ch_software_versions = ch_software_versions.mix(KRAKEN2_KRAKEN2.out.version.first().ifEmpty(null))
+        ch_kraken2_multiqc = KRAKEN2_KRAKEN2.out.txt
+        ch_versions = ch_versions.mix(KRAKEN2_KRAKEN2.out.versions.first().ifEmpty(null))
 
         if (params.kraken2_variants_host_filter) {
             ch_variants_fastq = KRAKEN2_KRAKEN2.out.unclassified
@@ -306,14 +275,14 @@ workflow ILLUMINA {
     if (!params.skip_variants) {
         ALIGN_BOWTIE2 (
             ch_variants_fastq,
-            PREPARE_GENOME.out.bowtie2_index
+            PREPARE_GENOME.out.bowtie2_index,
+            params.save_unaligned
         )
         ch_bam = ALIGN_BOWTIE2.out.bam
         ch_bai = ALIGN_BOWTIE2.out.bai
         ch_bowtie2_multiqc = ALIGN_BOWTIE2.out.log_out
         ch_bowtie2_flagstat_multiqc = ALIGN_BOWTIE2.out.flagstat
-        ch_software_versions = ch_software_versions.mix(ALIGN_BOWTIE2.out.bowtie2_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ALIGN_BOWTIE2.out.samtools_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(ALIGN_BOWTIE2.out.versions)
     }
 
     //
@@ -346,9 +315,9 @@ workflow ILLUMINA {
         }
         .set { ch_pass_fail_mapped }
 
-        MULTIQC_CUSTOM_TSV_FAIL_MAPPED (
+        MULTIQC_TSV_FAIL_MAPPED (
             ch_pass_fail_mapped.fail.collect(),
-            'Sample\tMapped reads',
+            ['Sample', 'Mapped reads'],
             'fail_mapped_samples'
         )
         .set { ch_fail_mapping_multiqc }
@@ -366,7 +335,7 @@ workflow ILLUMINA {
         ch_bam = PRIMER_TRIM_IVAR.out.bam
         ch_bai = PRIMER_TRIM_IVAR.out.bai
         ch_ivar_trim_flagstat_multiqc = PRIMER_TRIM_IVAR.out.flagstat
-        ch_software_versions = ch_software_versions.mix(PRIMER_TRIM_IVAR.out.ivar_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(PRIMER_TRIM_IVAR.out.versions)
     }
 
     //
@@ -380,7 +349,7 @@ workflow ILLUMINA {
         ch_bam = MARK_DUPLICATES_PICARD.out.bam
         ch_bai = MARK_DUPLICATES_PICARD.out.bai
         ch_markduplicates_flagstat_multiqc = MARK_DUPLICATES_PICARD.out.flagstat
-        ch_software_versions = ch_software_versions.mix(MARK_DUPLICATES_PICARD.out.picard_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(MARK_DUPLICATES_PICARD.out.versions)
     }
 
     //
@@ -391,7 +360,7 @@ workflow ILLUMINA {
         ch_bam,
         PREPARE_GENOME.out.fasta
     )
-        ch_software_versions = ch_software_versions.mix(PICARD_COLLECTMULTIPLEMETRICS.out.version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(PICARD_COLLECTMULTIPLEMETRICS.out.versions.first().ifEmpty(null))
     }
 
     //
@@ -400,18 +369,18 @@ workflow ILLUMINA {
     ch_mosdepth_multiqc = Channel.empty()
     ch_amplicon_heatmap_multiqc = Channel.empty()
     if (!params.skip_variants && !params.skip_mosdepth) {
-
         MOSDEPTH_GENOME (
             ch_bam.join(ch_bai, by: [0]),
-            ch_dummy_file,
+            [],
             200
         )
-        ch_mosdepth_multiqc = MOSDEPTH_GENOME.out.global_txt
-        ch_software_versions = ch_software_versions.mix(MOSDEPTH_GENOME.out.version.first().ifEmpty(null))
+        ch_mosdepth_multiqc = MOSDEPTH_GENOME.out.global_txt
+        ch_versions = ch_versions.mix(MOSDEPTH_GENOME.out.versions.first().ifEmpty(null))
 
         PLOT_MOSDEPTH_REGIONS_GENOME (
             MOSDEPTH_GENOME.out.regions_bed.collect { it[1] }
         )
+        ch_versions = ch_versions.mix(PLOT_MOSDEPTH_REGIONS_GENOME.out.versions)
 
         if (params.protocol == 'amplicon') {
             MOSDEPTH_AMPLICON (
@@ -419,136 +388,136 @@
                 PREPARE_GENOME.out.primer_collapsed_bed,
                 0
             )
+            ch_versions = ch_versions.mix(MOSDEPTH_AMPLICON.out.versions.first().ifEmpty(null))
 
             PLOT_MOSDEPTH_REGIONS_AMPLICON (
                 MOSDEPTH_AMPLICON.out.regions_bed.collect { it[1] }
            )
            ch_amplicon_heatmap_multiqc = PLOT_MOSDEPTH_REGIONS_AMPLICON.out.heatmap_tsv
+            ch_versions = ch_versions.mix(PLOT_MOSDEPTH_REGIONS_AMPLICON.out.versions)
         }
     }
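Replacing `ch_dummy_file` with a bare `[]` relies on Nextflow accepting an empty list as a valid, empty staging value for a `path` input, which removes the need to ship a placeholder file inside the pipeline. Sketched from the module side (signature abridged and assumed; the real nf-core mosdepth module differs):

```nextflow
// Sketch of an optional path input; [] at the call site stages nothing
process MOSDEPTH {
    input:
    tuple val(meta), path(bam), path(bai)
    path bed            // [] when no BED file is provided
    val  window_size

    script:
    def interval = bed ? "--by ${bed}" : "--by ${window_size}"
    """
    mosdepth ${interval} ${meta.id} ${bam}
    """
}
```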
 
     //
     // SUBWORKFLOW: Call variants with IVar
     //
-    ch_ivar_vcf = Channel.empty()
-    ch_ivar_tbi = Channel.empty()
+    ch_vcf = Channel.empty()
+    ch_tbi = Channel.empty()
     ch_ivar_counts_multiqc = Channel.empty()
-    ch_ivar_stats_multiqc = Channel.empty()
-    ch_ivar_snpeff_multiqc = Channel.empty()
-    ch_ivar_quast_multiqc = Channel.empty()
-    ch_ivar_pangolin_multiqc = Channel.empty()
-    ch_ivar_nextclade_multiqc = Channel.empty()
-    if (!params.skip_variants && 'ivar' in callers) {
+    ch_bcftools_stats_multiqc = Channel.empty()
+    ch_snpsift_txt = Channel.empty()
+    ch_snpeff_multiqc = Channel.empty()
+    if (!params.skip_variants && variant_caller == 'ivar') {
         VARIANTS_IVAR (
             ch_bam,
             PREPARE_GENOME.out.fasta,
             PREPARE_GENOME.out.chrom_sizes,
-            params.gff ? PREPARE_GENOME.out.gff : [],
+            PREPARE_GENOME.out.gff,
             (params.protocol == 'amplicon' && params.primer_bed) ? PREPARE_GENOME.out.primer_bed : [],
             PREPARE_GENOME.out.snpeff_db,
             PREPARE_GENOME.out.snpeff_config,
             ch_ivar_variants_header_mqc
         )
-        ch_ivar_vcf = VARIANTS_IVAR.out.vcf
-        ch_ivar_tbi = VARIANTS_IVAR.out.tbi
-        ch_ivar_counts_multiqc = VARIANTS_IVAR.out.multiqc_tsv
-        ch_ivar_stats_multiqc = VARIANTS_IVAR.out.stats
-        ch_ivar_snpeff_multiqc = VARIANTS_IVAR.out.snpeff_csv
-        ch_ivar_quast_multiqc = VARIANTS_IVAR.out.quast_tsv
-        ch_ivar_pangolin_multiqc = VARIANTS_IVAR.out.pangolin_report
-        ch_ivar_nextclade_report = VARIANTS_IVAR.out.nextclade_report
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.ivar_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.tabix_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.bcftools_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.quast_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.snpeff_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.snpsift_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.pangolin_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.nextclade_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_IVAR.out.asciigenome_version.first().ifEmpty(null))
-
-        //
-        // MODULE: Get Nextclade clade information for MultiQC report
-        //
-        ch_ivar_nextclade_report
-            .map { meta, csv ->
-                def clade = WorkflowCommons.getNextcladeFieldMapFromCsv(csv)['clade']
-                return [ "$meta.id\t$clade" ]
-            }
-            .set { ch_ivar_nextclade_multiqc }
-
-        MULTIQC_CUSTOM_TSV_IVAR_NEXTCLADE (
-            ch_ivar_nextclade_multiqc.collect(),
-            'Sample\tclade',
-            'ivar_nextclade_clade'
-        )
-        .set { ch_ivar_nextclade_multiqc }
+        ch_vcf = VARIANTS_IVAR.out.vcf
+        ch_tbi = VARIANTS_IVAR.out.tbi
+        ch_ivar_counts_multiqc = VARIANTS_IVAR.out.multiqc_tsv
+        ch_bcftools_stats_multiqc = VARIANTS_IVAR.out.stats
+        ch_snpeff_multiqc = VARIANTS_IVAR.out.snpeff_csv
+        ch_snpsift_txt = VARIANTS_IVAR.out.snpsift_txt
+        ch_versions = ch_versions.mix(VARIANTS_IVAR.out.versions)
     }
 
     //
     // SUBWORKFLOW: Call variants with BCFTools
     //
-    ch_bcftools_vcf = Channel.empty()
-    ch_bcftools_tbi = Channel.empty()
-    ch_bcftools_stats_multiqc = Channel.empty()
-    ch_bcftools_snpeff_multiqc = Channel.empty()
-    ch_bcftools_quast_multiqc = Channel.empty()
-    ch_bcftools_pangolin_multiqc = Channel.empty()
-    ch_bcftools_nextclade_multiqc = Channel.empty()
-    if (!params.skip_variants && 'bcftools' in callers) {
+    if (!params.skip_variants && variant_caller == 'bcftools') {
         VARIANTS_BCFTOOLS (
             ch_bam,
             PREPARE_GENOME.out.fasta,
             PREPARE_GENOME.out.chrom_sizes,
-            params.gff ? PREPARE_GENOME.out.gff : [],
+            PREPARE_GENOME.out.gff,
             (params.protocol == 'amplicon' && params.primer_bed) ? PREPARE_GENOME.out.primer_bed : [],
             PREPARE_GENOME.out.snpeff_db,
             PREPARE_GENOME.out.snpeff_config
         )
-        ch_bcftools_vcf = VARIANTS_BCFTOOLS.out.vcf
-        ch_bcftools_tbi = VARIANTS_BCFTOOLS.out.tbi
-        ch_bcftools_stats_multiqc = VARIANTS_BCFTOOLS.out.stats
-        ch_bcftools_snpeff_multiqc = VARIANTS_BCFTOOLS.out.snpeff_csv
-        ch_bcftools_quast_multiqc = VARIANTS_BCFTOOLS.out.quast_tsv
-        ch_bcftools_pangolin_multiqc = VARIANTS_BCFTOOLS.out.pangolin_report
-        ch_bcftools_nextclade_report = VARIANTS_BCFTOOLS.out.nextclade_report
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.bcftools_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.bedtools_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.quast_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.snpeff_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.snpsift_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.pangolin_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.nextclade_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(VARIANTS_BCFTOOLS.out.asciigenome_version.first().ifEmpty(null))
-
-        //
-        // MODULE: Get Nextclade clade information for MultiQC report
-        //
-        ch_bcftools_nextclade_report
+        ch_vcf = VARIANTS_BCFTOOLS.out.vcf
+        ch_tbi = VARIANTS_BCFTOOLS.out.tbi
+        ch_bcftools_stats_multiqc = VARIANTS_BCFTOOLS.out.stats
+        ch_snpeff_multiqc = VARIANTS_BCFTOOLS.out.snpeff_csv
+        ch_snpsift_txt = VARIANTS_BCFTOOLS.out.snpsift_txt
+        ch_versions = ch_versions.mix(VARIANTS_BCFTOOLS.out.versions)
+    }
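Both branches now key off a single `variant_caller` value instead of iterating over a comma-separated `callers` list, so exactly one caller's output feeds the shared `ch_vcf`/`ch_tbi` channels. The defaulting logic sits earlier in the script, outside this diff; it presumably looks something like:

```nextflow
// Assumed shape of the caller selection; the real defaulting may differ
def variant_caller = params.variant_caller
if (!variant_caller) {
    variant_caller = params.protocol == 'amplicon' ? 'ivar' : 'bcftools'
}
```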
+
+    //
+    // SUBWORKFLOW: Call consensus with iVar and downstream QC
+    //
+    ch_quast_multiqc = Channel.empty()
+    ch_pangolin_multiqc = Channel.empty()
+    ch_nextclade_report = Channel.empty()
+    if (!params.skip_consensus && params.consensus_caller == 'ivar') {
+        CONSENSUS_IVAR (
+            ch_bam,
+            PREPARE_GENOME.out.fasta,
+            PREPARE_GENOME.out.gff,
+            PREPARE_GENOME.out.nextclade_db
+        )
+
+        ch_quast_multiqc = CONSENSUS_IVAR.out.quast_tsv
+        ch_pangolin_multiqc = CONSENSUS_IVAR.out.pangolin_report
+        ch_nextclade_report = CONSENSUS_IVAR.out.nextclade_report
+        ch_versions = ch_versions.mix(CONSENSUS_IVAR.out.versions)
+    }
+
+    //
+    // SUBWORKFLOW: Call consensus with BCFTools
+    //
+    if (!params.skip_consensus && params.consensus_caller == 'bcftools' && variant_caller) {
+        CONSENSUS_BCFTOOLS (
+            ch_bam,
+            ch_vcf,
+            ch_tbi,
+            PREPARE_GENOME.out.fasta,
+            PREPARE_GENOME.out.gff,
+            PREPARE_GENOME.out.nextclade_db
+        )
+
+        ch_quast_multiqc = CONSENSUS_BCFTOOLS.out.quast_tsv
+        ch_pangolin_multiqc = CONSENSUS_BCFTOOLS.out.pangolin_report
+        ch_nextclade_report = CONSENSUS_BCFTOOLS.out.nextclade_report
+        ch_versions = ch_versions.mix(CONSENSUS_BCFTOOLS.out.versions)
+    }
+
+    //
+    // MODULE: Get Nextclade clade information for MultiQC report
+    //
+    ch_nextclade_multiqc = Channel.empty()
+    if (!params.skip_nextclade) {
+        ch_nextclade_report
         .map { meta, csv ->
             def clade = WorkflowCommons.getNextcladeFieldMapFromCsv(csv)['clade']
             return [ "$meta.id\t$clade" ]
         }
-        .set { ch_bcftools_nextclade_multiqc }
+        .set { ch_nextclade_multiqc }
 
-        MULTIQC_CUSTOM_TSV_BCFTOOLS_NEXTCLADE (
-            ch_bcftools_nextclade_multiqc.collect(),
-            'Sample\tclade',
-            'bcftools_nextclade_clade'
+        MULTIQC_TSV_NEXTCLADE (
+            ch_nextclade_multiqc.collect(),
+            ['Sample', 'clade'],
+            'nextclade_clade'
         )
-        .set { ch_bcftools_nextclade_multiqc }
+        .set { ch_nextclade_multiqc }
     }
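The clade extraction above assumes `WorkflowCommons.getNextcladeFieldMapFromCsv` returns the first data row keyed by the CSV header. A hypothetical sketch (Nextclade writes semicolon-separated CSV, which the real helper has to account for):

```groovy
// lib/WorkflowCommons.groovy -- hypothetical sketch, not the shipped code
public static Map getNextcladeFieldMapFromCsv(nextclade_csv) {
    def lines = nextclade_csv.readLines()
    if (lines.size() < 2) {
        return [:]
    }
    def header = lines[0].split(';', -1)
    def values = lines[1].split(';', -1)
    // Pair header fields with the first data row: [clade: '21K (Omicron)', ...]
    return [header, values].transpose().collectEntries()
}
```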
 
     //
-    // MODULE: Intersect variants across callers
+    // SUBWORKFLOW: Create variants long table report
     //
-    if (!params.skip_variants && callers.size() > 1) {
-        BCFTOOLS_ISEC (
-            ch_ivar_vcf
-                .join(ch_ivar_tbi, by: [0])
-                .join(ch_bcftools_vcf, by: [0])
-                .join(ch_bcftools_tbi, by: [0])
+    if (!params.skip_variants && !params.skip_variants_long_table && params.gff && !params.skip_snpeff) {
+        VARIANTS_LONG_TABLE (
+            ch_vcf,
+            ch_tbi,
+            ch_snpsift_txt,
+            ch_pangolin_multiqc
        )
+        ch_versions = ch_versions.mix(VARIANTS_LONG_TABLE.out.versions)
     }
 
     //
@@ -560,14 +529,15 @@ workflow ILLUMINA {
            ch_assembly_fastq,
            PREPARE_GENOME.out.primer_fasta
        )
-        ch_assembly_fastq = CUTADAPT.out.reads
-        ch_cutadapt_multiqc = CUTADAPT.out.log
-        ch_software_versions = ch_software_versions.mix(CUTADAPT.out.version.first().ifEmpty(null))
+        ch_assembly_fastq = CUTADAPT.out.reads
+        ch_cutadapt_multiqc = CUTADAPT.out.log
+        ch_versions = ch_versions.mix(CUTADAPT.out.versions.first().ifEmpty(null))
 
        if (!params.skip_fastqc) {
            FASTQC (
                CUTADAPT.out.reads
            )
+            ch_versions = ch_versions.mix(FASTQC.out.versions.first().ifEmpty(null))
        }
     }
 
@@ -577,7 +547,8 @@ workflow ILLUMINA {
     ch_spades_quast_multiqc = Channel.empty()
     if (!params.skip_assembly && 'spades' in assemblers) {
         ASSEMBLY_SPADES (
-            ch_assembly_fastq,
+            ch_assembly_fastq.map { meta, fastq -> [ meta, fastq, [], [] ] },
+            params.spades_mode,
             ch_spades_hmm,
             PREPARE_GENOME.out.fasta,
             PREPARE_GENOME.out.gff,
@@ -585,12 +556,7 @@ workflow ILLUMINA {
             ch_blast_outfmt6_header
        )
        ch_spades_quast_multiqc = ASSEMBLY_SPADES.out.quast_tsv
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.spades_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.bandage_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.blast_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.quast_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.abacas_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_SPADES.out.plasmidid_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(ASSEMBLY_SPADES.out.versions)
     }
 
     //
@@ -599,19 +565,14 @@ workflow ILLUMINA {
     ch_unicycler_quast_multiqc = Channel.empty()
     if (!params.skip_assembly && 'unicycler' in assemblers) {
         ASSEMBLY_UNICYCLER (
-            ch_assembly_fastq,
+            ch_assembly_fastq.map { meta, fastq -> [ meta, fastq, [] ] },
             PREPARE_GENOME.out.fasta,
             PREPARE_GENOME.out.gff,
             PREPARE_GENOME.out.blast_db,
             ch_blast_outfmt6_header
        )
        ch_unicycler_quast_multiqc = ASSEMBLY_UNICYCLER.out.quast_tsv
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.unicycler_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.bandage_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.blast_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.quast_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.abacas_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_UNICYCLER.out.plasmidid_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(ASSEMBLY_UNICYCLER.out.versions)
     }
 
     //
@@ -627,26 +588,14 @@ workflow ILLUMINA {
             ch_blast_outfmt6_header
        )
        ch_minia_quast_multiqc = ASSEMBLY_MINIA.out.quast_tsv
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_MINIA.out.minia_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_MINIA.out.blast_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_MINIA.out.quast_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_MINIA.out.abacas_version.first().ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(ASSEMBLY_MINIA.out.plasmidid_version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(ASSEMBLY_MINIA.out.versions)
     }
 
     //
     // MODULE: Pipeline reporting
     //
-    ch_software_versions
-        .map { it -> if (it) [ it.baseName, it ] }
-        .groupTuple()
-        .map { it[1][0] }
-        .flatten()
-        .collect()
-        .set { ch_software_versions }
-
-    GET_SOFTWARE_VERSIONS (
-        ch_software_versions
+    CUSTOM_DUMPSOFTWAREVERSIONS (
+        ch_versions.unique().collectFile(name: 'collated_versions.yml')
     )
 
@@ -658,8 +607,8 @@ workflow ILLUMINA {
 
     MULTIQC (
         ch_multiqc_config,
-        ch_multiqc_custom_config.collect().ifEmpty([]),
-        GET_SOFTWARE_VERSIONS.out.yaml.collect(),
+        ch_multiqc_custom_config,
+        CUSTOM_DUMPSOFTWAREVERSIONS.out.mqc_yml.collect(),
         ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'),
         ch_fail_reads_multiqc.ifEmpty([]),
         ch_fail_mapping_multiqc.ifEmpty([]),
@@ -673,16 +622,11 @@ workflow ILLUMINA {
         ch_markduplicates_flagstat_multiqc.collect{it[1]}.ifEmpty([]),
         ch_mosdepth_multiqc.collect{it[1]}.ifEmpty([]),
         ch_ivar_counts_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_ivar_stats_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_ivar_snpeff_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_ivar_quast_multiqc.collect().ifEmpty([]),
-        ch_ivar_pangolin_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_ivar_nextclade_multiqc.collect().ifEmpty([]),
         ch_bcftools_stats_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_bcftools_snpeff_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_bcftools_quast_multiqc.collect().ifEmpty([]),
-        ch_bcftools_pangolin_multiqc.collect{it[1]}.ifEmpty([]),
-        ch_bcftools_nextclade_multiqc.collect().ifEmpty([]),
+        ch_snpeff_multiqc.collect{it[1]}.ifEmpty([]),
+        ch_quast_multiqc.collect().ifEmpty([]),
+        ch_pangolin_multiqc.collect{it[1]}.ifEmpty([]),
+        ch_nextclade_multiqc.collect().ifEmpty([]),
         ch_cutadapt_multiqc.collect{it[1]}.ifEmpty([]),
         ch_spades_quast_multiqc.collect().ifEmpty([]),
         ch_unicycler_quast_multiqc.collect().ifEmpty([]),
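That ends the illumina.nf changes, and the version-reporting refactor that runs through all of them deserves a note: instead of one named `*_version` output per tool, every module now emits a `versions.yml` that is mixed into a single `ch_versions` channel and collated by `CUSTOM_DUMPSOFTWAREVERSIONS`. The emitting side of that contract, sketched for a hypothetical module (tool name and command are placeholders):

```nextflow
// Sketch of the per-module contract consumed by CUSTOM_DUMPSOFTWAREVERSIONS
process EXAMPLE_TOOL {
    input:
    tuple val(meta), path(reads)

    output:
    tuple val(meta), path('*.txt'), emit: txt
    path 'versions.yml'           , emit: versions

    script:
    """
    example_tool ${reads} > ${meta.id}.txt

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        example_tool: \$(example_tool --version | sed 's/^v//')
    END_VERSIONS
    """
}
```

`ch_versions.unique().collectFile(name: 'collated_versions.yml')` then concatenates the per-task YAML fragments into the single file the module parses for the MultiQC software-versions table.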
diff --git a/workflows/nanopore.nf b/workflows/nanopore.nf
index 5ebcfe19..84478f14 100644
--- a/workflows/nanopore.nf
+++ b/workflows/nanopore.nf
@@ -5,8 +5,8 @@
 */
 
 def valid_params = [
-    artic_minion_caller  : ['nanopolish', 'medaka'],
-    artic_minion_aligner : ['minimap2', 'bwa']
+    artic_minion_caller  : ['nanopolish', 'medaka'],
+    artic_minion_aligner : ['minimap2', 'bwa']
 ]
 
 def summary_params = NfcoreSchema.paramsSummaryMap(workflow, params)
@@ -20,16 +20,9 @@ def checkPathParamList = [
 ]
 for (param in checkPathParamList) { if (param) { file(param, checkIfExists: true) } }
 
-// Stage dummy file to be used as an optional input where required
-ch_dummy_file = file("$projectDir/assets/dummy_file.txt", checkIfExists: true)
-
-// MultiQC config files
-ch_multiqc_config = file("$projectDir/assets/multiqc_config_nanopore.yaml", checkIfExists: true)
-ch_multiqc_custom_config = params.multiqc_config ? Channel.fromPath(params.multiqc_config) : Channel.empty()
-
 if (params.input) { ch_input = file(params.input) }
-if (params.fast5_dir) { ch_fast5_dir = file(params.fast5_dir) } else { ch_fast5_dir = ch_dummy_file }
-if (params.sequencing_summary) { ch_sequencing_summary = file(params.sequencing_summary) } else { ch_sequencing_summary = ch_multiqc_config }
+if (params.fast5_dir) { ch_fast5_dir = file(params.fast5_dir) } else { ch_fast5_dir = [] }
+if (params.sequencing_summary) { ch_sequencing_summary = file(params.sequencing_summary) } else { ch_sequencing_summary = [] }
 
 // Need to stage medaka model properly depending on whether it is a string or a file
 ch_medaka_model = Channel.empty()
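The comment above refers to logic elided from this hunk: `--artic_minion_medaka_model` may be either a model name (a plain string that medaka resolves itself) or a `.tar.gz` model file that has to be staged into the task. A sketch of how that distinction is plausibly handled (parameter names taken from the surrounding code, the branching itself assumed):

```nextflow
// Assumed staging logic for the medaka model
if (params.artic_minion_caller == 'medaka') {
    if (params.artic_minion_medaka_model.endsWith('.tar.gz')) {
        // A model file: stage it so it is available inside the task directory
        ch_medaka_model = Channel.fromPath(params.artic_minion_medaka_model, checkIfExists: true)
    }
    // Otherwise ch_medaka_model stays empty and the name is passed as a string value
}
```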
 
@@ -41,42 +34,39 @@ if (params.artic_minion_caller == 'medaka') {
 
 /*
 ========================================================================================
-    IMPORT LOCAL MODULES/SUBWORKFLOWS
+    CONFIG FILES
 ========================================================================================
 */
 
-// Don't overwrite global params.modules, create a copy instead and use that within the main script.
-def modules = params.modules.clone()
-
-def multiqc_options = modules['nanopore_multiqc']
-multiqc_options.args += params.multiqc_title ? Utils.joinModuleArgs(["--title \"$params.multiqc_title\""]) : ''
+ch_multiqc_config = file("$projectDir/assets/multiqc_config_nanopore.yaml", checkIfExists: true)
+ch_multiqc_custom_config = params.multiqc_config ? file(params.multiqc_config) : []
 
-include { ASCIIGENOME } from '../modules/local/asciigenome' addParams( options: modules['nanopore_asciigenome'] )
-include { GET_SOFTWARE_VERSIONS } from '../modules/local/get_software_versions' addParams( options: [publish_files: ['tsv':'']] )
-include { MULTIQC } from '../modules/local/multiqc_nanopore' addParams( options: multiqc_options )
+/*
+========================================================================================
+    IMPORT LOCAL MODULES/SUBWORKFLOWS
+========================================================================================
+*/
 
-include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_NO_SAMPLE_NAME } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] )
-include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_NO_BARCODES } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] )
-include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_BARCODE_COUNT } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] )
-include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_GUPPYPLEX_COUNT } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] )
-include { MULTIQC_CUSTOM_TSV_FROM_STRING as MULTIQC_CUSTOM_TSV_NEXTCLADE } from '../modules/local/multiqc_custom_tsv_from_string' addParams( options: [publish_files: false] )
-include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_GENOME } from '../modules/local/plot_mosdepth_regions' addParams( options: modules['nanopore_plot_mosdepth_regions_genome'] )
-include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_AMPLICON } from '../modules/local/plot_mosdepth_regions' addParams( options: modules['nanopore_plot_mosdepth_regions_amplicon'] )
+//
+// MODULE: Loaded from modules/local/
+//
+include { ASCIIGENOME } from '../modules/local/asciigenome'
+include { MULTIQC } from '../modules/local/multiqc_nanopore'
+include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_GENOME } from '../modules/local/plot_mosdepth_regions'
+include { PLOT_MOSDEPTH_REGIONS as PLOT_MOSDEPTH_REGIONS_AMPLICON } from '../modules/local/plot_mosdepth_regions'
+include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_NO_SAMPLE_NAME } from '../modules/local/multiqc_tsv_from_list'
+include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_NO_BARCODES } from '../modules/local/multiqc_tsv_from_list'
+include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_BARCODE_COUNT } from '../modules/local/multiqc_tsv_from_list'
+include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_GUPPYPLEX_COUNT } from '../modules/local/multiqc_tsv_from_list'
+include { MULTIQC_TSV_FROM_LIST as MULTIQC_TSV_NEXTCLADE } from '../modules/local/multiqc_tsv_from_list'
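The renamed `MULTIQC_TSV_FROM_LIST` module explains the call-site change visible throughout this diff, where pre-formatted headers such as `'Sample\tReads before trimming'` become lists like `['Sample', 'Reads before trimming']`: the module now joins the header itself. A sketch of such a local module (the shipped implementation may differ):

```nextflow
// Sketch of a TSV-writing local module executed natively on the head node
process MULTIQC_TSV_FROM_LIST {
    executor 'local'

    input:
    val tsv_data   // e.g. [ ['SAMPLE_1', 1666] ]
    val header     // e.g. [ 'Sample', 'Read count' ]
    val out_prefix

    output:
    path "*.tsv"

    exec:
    def contents = ''
    if (tsv_data.size() > 0) {
        contents += "${header.join('\t')}\n"
        contents += tsv_data.join('\n')
    }
    task.workDir.resolve("${out_prefix}_mqc.tsv").text = contents
}
```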
 
 //
 // SUBWORKFLOW: Consisting of a mix of local and nf-core/modules
 //
-def publish_genome_options = params.save_reference ? [publish_dir: 'genome'] : [publish_files: false]
-def collapse_primers_options = modules['nanopore_collapse_primers']
-def snpeff_build_options = modules['nanopore_snpeff_build']
-if (!params.save_reference) {
-    collapse_primers_options['publish_files'] = false
-    snpeff_build_options['publish_files'] = false
-}
-
-include { INPUT_CHECK } from '../subworkflows/local/input_check' addParams( options: [:] )
-include { PREPARE_GENOME } from '../subworkflows/local/prepare_genome_nanopore' addParams( genome_options: publish_genome_options, collapse_primers_options: collapse_primers_options, snpeff_build_options: snpeff_build_options )
-include { SNPEFF_SNPSIFT } from '../subworkflows/local/snpeff_snpsift' addParams( snpeff_options: modules['nanopore_snpeff'], snpsift_options: modules['nanopore_snpsift'], bgzip_options: modules['nanopore_snpeff_bgzip'], tabix_options: modules['nanopore_snpeff_tabix'], stats_options: modules['nanopore_snpeff_stats'] )
+include { INPUT_CHECK } from '../subworkflows/local/input_check'
+include { PREPARE_GENOME } from '../subworkflows/local/prepare_genome_nanopore'
+include { SNPEFF_SNPSIFT } from '../subworkflows/local/snpeff_snpsift'
+include { VARIANTS_LONG_TABLE } from '../subworkflows/local/variants_long_table'
 
 /*
 ========================================================================================
@@ -87,40 +77,24 @@ include { SNPEFF_SNPSIFT } from '../subworkflows/local/snpeff_snpsift'
 //
 // MODULE: Installed directly from nf-core/modules
 //
-
-def artic_minion_options = modules['nanopore_artic_minion']
-artic_minion_options.args += params.artic_minion_caller == 'medaka' ? Utils.joinModuleArgs(['--medaka']) : ''
-artic_minion_options.args += params.artic_minion_aligner == 'bwa' ? Utils.joinModuleArgs(['--bwa']) : Utils.joinModuleArgs(['--minimap2'])
-
-def artic_guppyplex_options = modules['nanopore_artic_guppyplex']
-if (params.primer_set_version == 1200) {
-    def args_split = artic_guppyplex_options.args.tokenize()
-    def min_idx = args_split.indexOf('--min-length')
-    def max_idx = args_split.indexOf('--max-length')
-    if (min_idx != -1) {
-        args_split[min_idx+1] = '250'
-    }
-    if (max_idx != -1) {
-        args_split[max_idx+1] = '1500'
-    }
-    artic_guppyplex_options.args = args_split.join(' ')
-}
-
-include { PYCOQC } from '../modules/nf-core/modules/pycoqc/main' addParams( options: modules['nanopore_pycoqc'] )
-include { NANOPLOT } from '../modules/nf-core/modules/nanoplot/main' addParams( options: modules['nanopore_nanoplot'] )
-include { ARTIC_GUPPYPLEX } from '../modules/nf-core/modules/artic/guppyplex/main' addParams( options: artic_guppyplex_options )
-include { ARTIC_MINION } from '../modules/nf-core/modules/artic/minion/main' addParams( options: artic_minion_options )
-include { BCFTOOLS_STATS } from '../modules/nf-core/modules/bcftools/stats/main' addParams( options: modules['nanopore_bcftools_stats'] )
-include { QUAST } from '../modules/nf-core/modules/quast/main' addParams( options: modules['nanopore_quast'] )
-include { PANGOLIN } from '../modules/nf-core/modules/pangolin/main' addParams( options: modules['nanopore_pangolin'] )
-include { NEXTCLADE } from '../modules/nf-core/modules/nextclade/main' addParams( options: modules['nanopore_nextclade'] )
-include { MOSDEPTH as MOSDEPTH_GENOME } from '../modules/nf-core/modules/mosdepth/main' addParams( options: modules['nanopore_mosdepth_genome'] )
-include { MOSDEPTH as MOSDEPTH_AMPLICON } from '../modules/nf-core/modules/mosdepth/main' addParams( options: modules['nanopore_mosdepth_amplicon'] )
+include { PYCOQC } from '../modules/nf-core/modules/pycoqc/main'
+include { NANOPLOT } from '../modules/nf-core/modules/nanoplot/main'
+include { ARTIC_GUPPYPLEX } from '../modules/nf-core/modules/artic/guppyplex/main'
+include { ARTIC_MINION } from '../modules/nf-core/modules/artic/minion/main'
+include { VCFLIB_VCFUNIQ } from '../modules/nf-core/modules/vcflib/vcfuniq/main'
+include { TABIX_TABIX } from '../modules/nf-core/modules/tabix/tabix/main'
+include { BCFTOOLS_STATS } from '../modules/nf-core/modules/bcftools/stats/main'
+include { QUAST } from '../modules/nf-core/modules/quast/main'
+include { PANGOLIN } from '../modules/nf-core/modules/pangolin/main'
+include { NEXTCLADE_RUN } from '../modules/nf-core/modules/nextclade/run/main'
+include { CUSTOM_DUMPSOFTWAREVERSIONS } from '../modules/nf-core/modules/custom/dumpsoftwareversions/main'
+include { MOSDEPTH as MOSDEPTH_GENOME } from '../modules/nf-core/modules/mosdepth/main'
+include { MOSDEPTH as MOSDEPTH_AMPLICON } from '../modules/nf-core/modules/mosdepth/main'
 
 //
 // SUBWORKFLOW: Consisting entirely of nf-core/modules
 //
-include { FILTER_BAM_SAMTOOLS } from '../subworkflows/nf-core/filter_bam_samtools' addParams( samtools_view_options: modules['nanopore_filter_bam'], samtools_index_options: modules['nanopore_filter_bam_stats'] )
+include { FILTER_BAM_SAMTOOLS } from '../subworkflows/nf-core/filter_bam_samtools'
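The deleted `artic_guppyplex_options` token surgery, which rewrote `--min-length`/`--max-length` when `primer_set_version == 1200`, has no replacement inside this file; under the new template it would be expressed declaratively in `conf/modules.config`, roughly as follows (the 250/1500 values come from the removed code, the default range is assumed):

```nextflow
// conf/modules.config -- assumed replacement for the removed args rewriting
process {
    withName: 'ARTIC_GUPPYPLEX' {
        ext.args = params.primer_set_version == 1200
            ? '--min-length 250 --max-length 1500'
            : '--min-length 400 --max-length 700'   // default range assumed
    }
}
```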
 
 /*
 ========================================================================================
@@ -135,7 +109,7 @@ def fail_barcode_reads = [:]
 
 workflow NANOPORE {
 
-    ch_software_versions = Channel.empty()
+    ch_versions = Channel.empty()
 
     //
     // MODULE: PycoQC on sequencing summary file
@@ -145,16 +119,15 @@ workflow NANOPORE {
         PYCOQC (
             ch_sequencing_summary
         )
-        ch_pycoqc_multiqc = PYCOQC.out.json
-        ch_software_versions = ch_software_versions.mix(PYCOQC.out.version.ifEmpty(null))
+        ch_pycoqc_multiqc = PYCOQC.out.json
+        ch_versions = ch_versions.mix(PYCOQC.out.versions)
     }
 
     //
     // SUBWORKFLOW: Uncompress and prepare reference genome files
     //
-    PREPARE_GENOME (
-        ch_dummy_file
-    )
+    PREPARE_GENOME ()
+    ch_versions = ch_versions.mix(PREPARE_GENOME.out.versions)
 
     // Check primer BED file only contains suffixes provided --primer_left_suffix / --primer_right_suffix
     PREPARE_GENOME
@@ -162,6 +135,21 @@
         .primer_bed
         .map { WorkflowCommons.checkPrimerSuffixes(it, params.primer_left_suffix, params.primer_right_suffix, log) }
 
+    // Check whether the contigs in the primer BED file are present in the reference genome
+    PREPARE_GENOME
+        .out
+        .primer_bed
+        .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] }
+        .set { ch_bed_contigs }
+
+    PREPARE_GENOME
+        .out
+        .fai
+        .map { [ WorkflowCommons.getColFromFile(it, col=0, uniqify=true, sep='\t') ] }
+        .concat(ch_bed_contigs)
+        .collect()
+        .map { fai, bed -> WorkflowCommons.checkContigsInBED(fai, bed, log) }
+
     barcode_dirs = file("${params.fastq_dir}/barcode*", type: 'dir' , maxdepth: 1)
     single_barcode_dir = file("${params.fastq_dir}/*.fastq" , type: 'file', maxdepth: 1)
     ch_custom_no_sample_name_multiqc = Channel.empty()
@@ -173,7 +161,7 @@
             .map { dir ->
                 def count = 0
                 for (x in dir.listFiles()) {
-                    if (x.isFile() && x.toString().endsWith('.fastq')) {
+                    if (x.isFile() && x.toString().contains('.fastq')) {
                         count += x.countFastq()
                     }
                 }
@@ -189,8 +177,10 @@
             ch_input,
             params.platform
         )
+        .sample_info
         .join(ch_fastq_dirs, remainder: true)
         .set { ch_fastq_dirs }
+        ch_versions = ch_versions.mix(INPUT_CHECK.out.versions)
 
         //
         // MODULE: Create custom content file for MultiQC to report barcodes were allocated reads >= params.min_barcode_reads but no sample name in samplesheet
@@ -201,12 +191,12 @@
             .map { it -> [ "${it[0]}\t${it[-1]}" ] }
             .set { ch_barcodes_no_sample }
 
-        MULTIQC_CUSTOM_TSV_NO_SAMPLE_NAME (
+        MULTIQC_TSV_NO_SAMPLE_NAME (
             ch_barcodes_no_sample.collect(),
-            'Barcode\tRead count',
+            ['Barcode', 'Read count'],
             'fail_barcodes_no_sample'
         )
-        ch_custom_no_sample_name_multiqc = MULTIQC_CUSTOM_TSV_NO_SAMPLE_NAME.out
+        .set { ch_custom_no_sample_name_multiqc }
 
         //
         // MODULE: Create custom content file for MultiQC to report samples that were in samplesheet but have no barcodes
@@ -216,12 +206,12 @@
             .map { it -> [ "${it[1]}\t${it[0]}" ] }
             .set { ch_samples_no_barcode }
 
-        MULTIQC_CUSTOM_TSV_NO_BARCODES (
+        MULTIQC_TSV_NO_BARCODES (
            ch_samples_no_barcode.collect(),
-            'Sample\tMissing barcode',
+            ['Sample', 'Missing barcode'],
            'fail_no_barcode_samples'
        )
-        ch_custom_no_barcodes_multiqc = MULTIQC_CUSTOM_TSV_NO_BARCODES.out
+        .set { ch_custom_no_barcodes_multiqc }
 
         ch_fastq_dirs
             .filter { (it[1] != null) }
@@ -257,9 +247,9 @@
         }
         .set { ch_pass_fail_barcode_count }
 
-        MULTIQC_CUSTOM_TSV_BARCODE_COUNT (
+        MULTIQC_TSV_BARCODE_COUNT (
            ch_pass_fail_barcode_count.fail.collect(),
-            'Sample\tBarcode count',
+            ['Sample', 'Barcode count'],
            'fail_barcode_count_samples'
        )
 
@@ -275,7 +265,7 @@
     ARTIC_GUPPYPLEX (
         ch_fastq_dirs
     )
-    ch_software_versions = ch_software_versions.mix(ARTIC_GUPPYPLEX.out.version.first().ifEmpty(null))
+    ch_versions = ch_versions.mix(ARTIC_GUPPYPLEX.out.versions.first().ifEmpty(null))
 
     //
     // MODULE: Create custom content file for MultiQC to report samples with reads < params.min_guppyplex_reads
@@ -292,9 +282,9 @@
         }
         .set { ch_pass_fail_guppyplex_count }
 
-    MULTIQC_CUSTOM_TSV_GUPPYPLEX_COUNT (
+    MULTIQC_TSV_GUPPYPLEX_COUNT (
        ch_pass_fail_guppyplex_count.fail.collect(),
-        'Sample\tRead count',
+        ['Sample', 'Read count'],
        'fail_guppyplex_count_samples'
    )
 
@@ -305,7 +295,7 @@
         NANOPLOT (
             ARTIC_GUPPYPLEX.out.fastq
         )
-        ch_software_versions = ch_software_versions.mix(NANOPLOT.out.version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(NANOPLOT.out.versions.first().ifEmpty(null))
     }
 
     //
@@ -321,22 +311,39 @@
         params.artic_scheme,
         params.primer_set_version
     )
+    ch_versions = ch_versions.mix(ARTIC_MINION.out.versions.first().ifEmpty(null))
 
     //
-    // SUBWORKFLOW: Filter unmapped reads from BAM
+    // MODULE: Remove duplicate variants
     //
-    FILTER_BAM_SAMTOOLS (
-        ARTIC_MINION.out.bam
+    VCFLIB_VCFUNIQ (
+        ARTIC_MINION.out.vcf.join(ARTIC_MINION.out.tbi, by: [0]),
     )
-    ch_software_versions = ch_software_versions.mix(FILTER_BAM_SAMTOOLS.out.samtools_version.first().ifEmpty(null))
+    ch_versions = ch_versions.mix(VCFLIB_VCFUNIQ.out.versions.first().ifEmpty(null))
+
+    //
+    // MODULE: Index VCF file
+    //
+    TABIX_TABIX (
+        VCFLIB_VCFUNIQ.out.vcf
+    )
+    ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first().ifEmpty(null))
 
     //
     // MODULE: VCF stats with bcftools stats
     //
     BCFTOOLS_STATS (
-        ARTIC_MINION.out.vcf
+        VCFLIB_VCFUNIQ.out.vcf
+    )
+    ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first().ifEmpty(null))
+
+    //
+    // SUBWORKFLOW: Filter unmapped reads from BAM
+    //
+    FILTER_BAM_SAMTOOLS (
+        ARTIC_MINION.out.bam
     )
-    ch_software_versions = ch_software_versions.mix(BCFTOOLS_STATS.out.version.ifEmpty(null))
+    ch_versions = ch_versions.mix(FILTER_BAM_SAMTOOLS.out.versions)
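`VCFLIB_VCFUNIQ` above takes the VCF and its index together in one tuple; the `join(..., by: [0])` pairs them on the shared `meta` map. In isolation:

```nextflow
// Minimal illustration of joining two meta-keyed channels on element 0
ch_vcf = Channel.of( [ [id:'SAMPLE_1'], file('SAMPLE_1.vcf.gz') ] )
ch_tbi = Channel.of( [ [id:'SAMPLE_1'], file('SAMPLE_1.vcf.gz.tbi') ] )

ch_vcf
    .join(ch_tbi, by: [0])
    .view()   // [ [id:'SAMPLE_1'], SAMPLE_1.vcf.gz, SAMPLE_1.vcf.gz.tbi ]
```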
 
     //
     // MODULE: Genome-wide and amplicon-specific coverage QC plots
@@ -347,26 +354,29 @@
         MOSDEPTH_GENOME (
             ARTIC_MINION.out.bam_primertrimmed.join(ARTIC_MINION.out.bai_primertrimmed, by: [0]),
-            ch_dummy_file,
+            [],
             200
         )
         ch_mosdepth_multiqc = MOSDEPTH_GENOME.out.global_txt
-        ch_software_versions = ch_software_versions.mix(MOSDEPTH_GENOME.out.version.first().ifEmpty(null))
+        ch_versions = ch_versions.mix(MOSDEPTH_GENOME.out.versions.first().ifEmpty(null))
 
         PLOT_MOSDEPTH_REGIONS_GENOME (
             MOSDEPTH_GENOME.out.regions_bed.collect { it[1] }
         )
+        ch_versions = ch_versions.mix(PLOT_MOSDEPTH_REGIONS_GENOME.out.versions)
 
         MOSDEPTH_AMPLICON (
             ARTIC_MINION.out.bam_primertrimmed.join(ARTIC_MINION.out.bai_primertrimmed, by: [0]),
             PREPARE_GENOME.out.primer_collapsed_bed,
             0
         )
+        ch_versions = ch_versions.mix(MOSDEPTH_AMPLICON.out.versions.first().ifEmpty(null))
 
         PLOT_MOSDEPTH_REGIONS_AMPLICON (
             MOSDEPTH_AMPLICON.out.regions_bed.collect { it[1] }
         )
         ch_amplicon_heatmap_multiqc = PLOT_MOSDEPTH_REGIONS_AMPLICON.out.heatmap_tsv
+        ch_versions = ch_versions.mix(PLOT_MOSDEPTH_REGIONS_AMPLICON.out.versions)
     }
 
     //
@@ -377,8 +387,8 @@
         PANGOLIN (
             ARTIC_MINION.out.fasta
         )
-        ch_pangolin_multiqc = PANGOLIN.out.report
-        ch_software_versions = ch_software_versions.mix(PANGOLIN.out.version.ifEmpty(null))
+        ch_pangolin_multiqc = PANGOLIN.out.report
+        ch_versions = ch_versions.mix(PANGOLIN.out.versions.first().ifEmpty(null))
     }
 
     //
@@ -386,15 +396,16 @@
     ch_nextclade_multiqc = Channel.empty()
     if (!params.skip_nextclade) {
-        NEXTCLADE (
-            ARTIC_MINION.out.fasta
+        NEXTCLADE_RUN (
+            ARTIC_MINION.out.fasta,
+            PREPARE_GENOME.out.nextclade_db
         )
-        ch_software_versions = ch_software_versions.mix(NEXTCLADE.out.version.ifEmpty(null))
+        ch_versions = ch_versions.mix(NEXTCLADE_RUN.out.versions.first().ifEmpty(null))
 
         //
         // MODULE: Get Nextclade clade information for MultiQC report
         //
-        NEXTCLADE
+        NEXTCLADE_RUN
             .out
             .csv
             .map { meta, csv ->
@@ -403,9 +414,9 @@
             }
             .set { ch_nextclade_multiqc }
 
-        MULTIQC_CUSTOM_TSV_NEXTCLADE (
+        MULTIQC_TSV_NEXTCLADE (
             ch_nextclade_multiqc.collect(),
-            'Sample\tclade',
+            ['Sample', 'clade'],
             'nextclade_clade'
         )
         .set { ch_nextclade_multiqc }
@@ -424,23 +435,24 @@
             params.gff
         )
         ch_quast_multiqc = QUAST.out.tsv
-        ch_software_versions = ch_software_versions.mix(QUAST.out.version.ifEmpty(null))
+        ch_versions = ch_versions.mix(QUAST.out.versions)
     }
 
     //
     // SUBWORKFLOW: Annotate variants with snpEff
     //
     ch_snpeff_multiqc = Channel.empty()
+    ch_snpsift_txt = Channel.empty()
     if (params.gff && !params.skip_snpeff) {
         SNPEFF_SNPSIFT (
-            ARTIC_MINION.out.vcf,
+            VCFLIB_VCFUNIQ.out.vcf,
             PREPARE_GENOME.out.snpeff_db,
             PREPARE_GENOME.out.snpeff_config,
             PREPARE_GENOME.out.fasta
         )
         ch_snpeff_multiqc = SNPEFF_SNPSIFT.out.csv
-        ch_software_versions = ch_software_versions.mix(SNPEFF_SNPSIFT.out.snpeff_version.ifEmpty(null))
-        ch_software_versions = ch_software_versions.mix(SNPEFF_SNPSIFT.out.snpsift_version.ifEmpty(null))
+        ch_snpsift_txt = SNPEFF_SNPSIFT.out.snpsift_txt
+        ch_versions = ch_versions.mix(SNPEFF_SNPSIFT.out.versions)
     }
 
     //
@@ -450,7 +462,7 @@
         ARTIC_MINION
             .out
             .bam_primertrimmed
-            .join(ARTIC_MINION.out.vcf, by: [0])
+            .join(VCFLIB_VCFUNIQ.out.vcf, by: [0])
             .join(BCFTOOLS_STATS.out.stats, by: [0])
             .map { meta, bam, vcf, stats ->
                 if (WorkflowCommons.getNumVariantsFromBCFToolsStats(stats) > 0) {
@@ -463,27 +475,32 @@
             ch_asciigenome,
             PREPARE_GENOME.out.fasta,
             PREPARE_GENOME.out.chrom_sizes,
-            params.gff ? PREPARE_GENOME.out.gff : [],
+            PREPARE_GENOME.out.gff,
             PREPARE_GENOME.out.primer_bed,
             params.asciigenome_window_size,
             params.asciigenome_read_depth
         )
-        ch_software_versions = ch_software_versions.mix(ASCIIGENOME.out.version.ifEmpty(null))
+        ch_versions = ch_versions.mix(ASCIIGENOME.out.versions.first().ifEmpty(null))
     }
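The ASCIIGenome gate above only renders screenshots for samples whose `bcftools stats` output reports at least one record. `WorkflowCommons.getNumVariantsFromBCFToolsStats` is not shown in the diff; a hypothetical sketch of the parsing:

```groovy
// lib/WorkflowCommons.groovy -- hypothetical sketch, not the shipped code
// bcftools stats emits a summary line like: "SN  0  number of records:  42"
public static Integer getNumVariantsFromBCFToolsStats(bcftools_stats) {
    def num_vars = 0
    bcftools_stats.eachLine { line ->
        def matcher = line =~ /number of records:\s*(\d+)/
        if (matcher.find()) {
            num_vars = matcher.group(1).toInteger()
        }
    }
    return num_vars
}
```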
 
     //
-    // MODULE: Pipeline reporting
+    // SUBWORKFLOW: Create variants long table report
     //
-    ch_software_versions
-        .map { it -> if (it) [ it.baseName, it ] }
-        .groupTuple()
-        .map { it[1][0] }
-        .flatten()
-        .collect()
-        .set { ch_software_versions }
+    if (!params.skip_variants_long_table && params.gff && !params.skip_snpeff) {
+        VARIANTS_LONG_TABLE (
+            VCFLIB_VCFUNIQ.out.vcf,
+            TABIX_TABIX.out.tbi,
+            ch_snpsift_txt,
+            ch_pangolin_multiqc
+        )
+        ch_versions = ch_versions.mix(VARIANTS_LONG_TABLE.out.versions)
+    }
 
-    GET_SOFTWARE_VERSIONS (
-        ch_software_versions
+    //
+    // MODULE: Pipeline reporting
+    //
+    CUSTOM_DUMPSOFTWAREVERSIONS (
+        ch_versions.unique().collectFile(name: 'collated_versions.yml')
     )
 
     //
@@ -495,13 +512,13 @@
 
     MULTIQC (
         ch_multiqc_config,
-        ch_multiqc_custom_config.collect().ifEmpty([]),
-        GET_SOFTWARE_VERSIONS.out.yaml.collect(),
+        ch_multiqc_custom_config,
+        CUSTOM_DUMPSOFTWAREVERSIONS.out.mqc_yml.collect(),
         ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'),
         ch_custom_no_sample_name_multiqc.ifEmpty([]),
         ch_custom_no_barcodes_multiqc.ifEmpty([]),
-        MULTIQC_CUSTOM_TSV_BARCODE_COUNT.out.ifEmpty([]),
-        MULTIQC_CUSTOM_TSV_GUPPYPLEX_COUNT.out.ifEmpty([]),
+        MULTIQC_TSV_BARCODE_COUNT.out.ifEmpty([]),
+        MULTIQC_TSV_GUPPYPLEX_COUNT.out.ifEmpty([]),
         ch_amplicon_heatmap_multiqc.ifEmpty([]),
         ch_pycoqc_multiqc.collect().ifEmpty([]),
         ARTIC_MINION.out.json.collect{it[1]}.ifEmpty([]),