New module : Pilon #3331
Merged
Changes from 4 commits
Commits (7):
cc11318 Added Pilon module (scorreard)
2102fd2 Update modules/nf-core/pilon/main.nf (scorreard)
b1a38d9 Update tests/modules/nf-core/pilon/main.nf (scorreard)
5a5d5cf Update main.nf (scorreard)
e10585b Update modules/nf-core/pilon/meta.yml (scorreard)
0e433c7 Update modules/nf-core/pilon/meta.yml (scorreard)
8c2ca89 Update modules/nf-core/pilon/meta.yml (scorreard)
@@ -0,0 +1,44 @@
process PILON {
    tag "$meta.id"
    label 'process_medium'

    conda "bioconda::pilon=1.24"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/pilon:1.24--hdfd78af_0':
        'quay.io/biocontainers/pilon:1.24--hdfd78af_0' }"

    input:
    tuple val(meta), path(fasta)
    tuple val(meta_bam), path(bam), path(bai)
    val pilon_mode

    output:
    tuple val(meta), path("*.fasta") , emit: improved_assembly
    tuple val(meta), path("*.vcf")   , emit: vcf              , optional : true
    tuple val(meta), path("*.change"), emit: change_record    , optional : true
    tuple val(meta), path("*.bed")   , emit: tracks_bed       , optional : true
    tuple val(meta), path("*.wig")   , emit: tracks_wig       , optional : true
    path "versions.yml"              , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def valid_mode = ["frags", "jumps", "unpaired", "bam"]
    if ( !valid_mode.contains(pilon_mode) ) { error "Unrecognised mode to run Pilon. Options: ${valid_mode.join(', ')}" }
    """
    pilon \\
        --genome $fasta \\
        --output ${prefix} \\
        --threads $task.cpus \\
        $args \\
        --$pilon_mode $bam

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        pilon: \$(echo \$(pilon --version) | sed 's/^.*version //; s/ .*\$//' )
    END_VERSIONS
    """
}
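
The optional outputs declared above (VCF, change record, BED/WIG tracks) only appear if the matching Pilon flags reach the command line through `task.ext.args`. A minimal sketch of how a pipeline config might enable them, assuming the usual nf-core `withName` selector convention. The config file it would live in is an assumption; `--vcf`, `--changes` and `--tracks` are Pilon's own output flags.

```nextflow
// Sketch only: place in a pipeline config file (location is an assumption).
process {
    withName: 'PILON' {
        // Ask Pilon to write the VCF, the .changes record and the BED/WIG tracks
        ext.args = '--vcf --changes --tracks'
    }
}
```

Without these flags only the polished FASTA and `versions.yml` are expected, which is why the other channels are marked `optional`.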
@@ -0,0 +1,76 @@
name: "pilon"
description: Automatically improve draft assemblies and find variation among strains, including large event detection
keywords:
  - polishing
  - assembly
  - variant calling
tools:
  - "pilon":
      description: "Pilon is an automated genome assembly improvement and variant detection tool."
      homepage: "https://github.com/broadinstitute/pilon/wiki"
      documentation: "https://github.com/broadinstitute/pilon/wiki/Requirements-&-Usage"
      tool_dev_url: "https://github.com/broadinstitute/pilon"
      doi: "https://doi.org/10.1371/journal.pone.0112963"
      licence: ["GPL v2"]

input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - fasta:
      type: file
      description: FASTA file
      pattern: "*.{fasta}"
  - meta_bam:
      type: map
      description: |
        Groovy Map containing sample information for the BAM file
        e.g. [ id:'test', single_end:false ]
  - bam:
      type: file
      description: BAM file
      pattern: "*.{bam}"
  - bai:
      type: file
      description: BAI file
      pattern: "*.{bai}"
  - pilon_mode:
      type: value
      description: Type of BAM file used. Use frags for paired-end sequencing of DNA fragments (such as Illumina paired-end reads with fragment size <1000 bp), jumps for paired sequencing data with a larger insert size (such as Illumina mate-pair libraries, typically with insert size >1000 bp), unpaired for unpaired sequencing reads, or bam to classify the BAM automatically as one of the three types above (Pilon version 1.17 and higher).
      pattern: ["frags", "jumps", "unpaired", "bam"]

output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
  - improved_assembly:
      type: file
      description: FASTA file containing the improved assembly
      pattern: "*.{fasta}"
  - change_record:
      type: file
      description: File containing a space-delimited record of every change made in the assembly as instructed by the --fix option
      pattern: "*.{change}"
  - vcf:
      type: file
      description: Pilon variant output
      pattern: "*.{vcf}"
  - tracks_bed:
      type: file
      description: BED track files that may be viewed in genome browsers such as IGV, GenomeView, and other applications that support this format
      pattern: "*.{bed}"
  - tracks_wig:
      type: file
      description: WIG track files that may be viewed in genome browsers such as IGV, GenomeView, and other applications that support this format
      pattern: "*.{wig}"

authors:
  - "@scorreard"
@@ -0,0 +1,20 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { PILON } from '../../../../modules/nf-core/pilon/main.nf'

workflow test_pilon {

    input = [
        [ id:'test', single_end:false ], // meta map
        file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)
    ]

    bam_tuple_ch = Channel.of([ [ id:'test', single_end:false ], // meta map
        file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true),
        file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true),
    ])

    PILON ( input, bam_tuple_ch, "bam" )
}
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
@@ -0,0 +1,8 @@
- name: pilon test_pilon
  command: nextflow run ./tests/modules/nf-core/pilon -entry test_pilon -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/pilon/nextflow.config
  tags:
    - pilon
  files:
    - path: output/pilon/test.fasta
      md5sum: 2e881994820a5a641da9ea594ab4958f
    - path: output/pilon/versions.yml
Because fasta, bam and bai all have to come from the same sample, would it rather be:
I am not sure; that's just a question because I do not know better.
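
The code suggestion attached to this comment is not preserved here. Judging from the replies that follow, it presumably proposed a single input tuple. The snippet below is a hypothetical reconstruction for orientation only, not the reviewer's actual text:

```nextflow
// Hypothetical reconstruction: assembly, BAM and BAI share one meta map
input:
tuple val(meta), path(fasta), path(bam), path(bai)
val pilon_mode
```

With this shape the module receives one channel whose items are already matched per sample, instead of two channels that the caller must keep in the same order.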
That is an interesting suggestion. Both would work.
I prefer to have them split, and it works if meta and meta_bam are the same.
In the case of purge_dups, they concatenated all the inputs and I had to split them because sometimes my meta reflects the input sequencing type (SR, PacBio, ONT), not only the sample name.
Hm, OK. I would prefer not to split them, because I think the typical application is that some channel magic before the module joins the required inputs (genome = fasta, alignment = bam & bai) and ensures the files match by matching meta.
That seems to me to be the exception rather than the usual module output. But I might be wrong.
Unfortunately the nf-core modules guidelines do not specify this clearly; see https://nf-co.re/developers/modules#inputoutput-options, i.e.
"Directly associated auxiliary files to an input file MAY be defined within the same input channel alongside the main input channel (e.g. BAM and BAI)."
So it seems I shouldn't insist on it!
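
The "channel magic" mentioned in this thread could be sketched as follows for the split-input signature that was merged. This is an illustration under stated assumptions, not code from the PR: the channel names, sample id, file paths and include path are hypothetical placeholders, while `join` and `multiMap` are standard Nextflow channel operators.

```nextflow
nextflow.enable.dsl = 2

// Hypothetical include path; adjust to where the module is installed
include { PILON } from './modules/nf-core/pilon/main'

workflow {
    // Placeholder inputs: one assembly and one sorted, indexed BAM per sample
    fasta_ch = Channel.of( [ [ id:'sample1' ], file('sample1.fasta') ] )
    bam_ch   = Channel.of( [ [ id:'sample1' ], file('sample1.bam'), file('sample1.bam.bai') ] )

    // join matches items on their first element (the meta map) by default,
    // giving [ meta, fasta, bam, bai ]; multiMap then splits each joined item
    // into two channels that stay in the same order.
    fasta_ch
        .join( bam_ch )
        .multiMap { meta, fasta, bam, bai ->
            assembly:  [ meta, fasta ]
            alignment: [ meta, bam, bai ]
        }
        .set { pilon_in }

    PILON ( pilon_in.assembly, pilon_in.alignment, 'bam' )
}
```

This only pairs files whose meta maps are identical. If meta also encodes something like the sequencing type, as described earlier in the thread, the join key would first have to be reduced to the shared part (for example `meta.id`) with a `map` step.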