update how PipeVal is being used to validate input and output files #240
@@ -14,12 +14,13 @@ params {
     // tools and their versions
     bwa_version = "BWA-MEM2-2.2.1"
     hisat2_version = "HISAT2-2.2.1"
+    pipeval_version = "3.0.0"
 
     docker_image_bwa_and_samtools = "blcdsdockerregistry/bwa-mem2_samtools-1.12:2.2.1"
     docker_image_hisat2_and_samtools = "blcdsdockerregistry/hisat2_samtools-1.12:2.2.1"
     docker_image_picardtools = "blcdsdockerregistry/picard:2.26.10"
     docker_image_sha512sum = "blcdsdockerregistry/align-dna:sha512sum-1.0"
-    docker_image_validate_params = "blcdsdockerregistry/validate:2.1.5"
+    docker_image_validate_params = "blcdsdockerregistry/pipeval:3.0.0"
Review comment (suggested change): To avoid hard-coding and duplication.
     docker_image_gatk = "broadinstitute/gatk:4.2.4.1"
     docker_image_samtools = "blcdsdockerregistry/samtools:1.15.1"
 

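The review comment on docker_image_validate_params asks to avoid writing the PipeVal version twice. A minimal sketch of how that might look follows. The interpolation is an assumption about the intent of the comment, since the reviewer's actual suggested change is not shown on this page; only the parameter names and the registry path come from the diff.

```groovy
params {
    pipeval_version = "3.0.0"

    // Assumed form (not the reviewer's literal suggestion): build the image
    // tag from pipeval_version so the version number is written only once.
    docker_image_validate_params = "blcdsdockerregistry/pipeval:${params.pipeval_version}"
}
```

With this form, bumping pipeval_version would change both the module option and the image tag together.
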
@@ -3,7 +3,15 @@
 // here it actually saves cost, time, and memory to directly pipe the output into
 // samtools due to the large size of the uncompressed SAM files.
 include { run_sort_SAMtools ; run_merge_SAMtools } from './samtools.nf'
-include { run_validate_PipeVal; run_validate_PipeVal as validate_output_file } from './validation.nf'
+
+// need to rename run_validate_PipeVal since using twice. Use run_validate_PipeVal for input validation; validate_output_file for output validation
Review comment (suggested change): The first part of the comment is unnecessary for the actual pipeline, I think.
+include { run_validate_PipeVal; run_validate_PipeVal as validate_output_file } from '../external/nextflow-modules/modules/PipeVal/validate/main.nf' addParams(
+    options: [
+        docker_image_version: params.pipeval_version,
+        process_label: 'process_low',
+        main_process: "BWA-MEM2-${params.bwa_version}"
+        ]
+    )
 include { run_MarkDuplicate_Picard } from './mark_duplicate_picardtools.nf'
 include { run_MarkDuplicatesSpark_GATK } from './mark_duplicates_spark.nf'
 include { generate_sha512sum } from './check_512sum.nf'
@@ -80,16 +88,20 @@ workflow align_DNA_BWA_MEM2_workflow {
     ich_reference_fasta
     ich_reference_index_files
     main:
-    run_validate_PipeVal(ich_samples_validate.mix(
+
+    run_validate_PipeVal_inputs = ich_samples_validate.mix( //
Review comment (suggested change): Empty comment.
Review comment: Also, generally for input channels, we want to use
         ich_reference_fasta,
         ich_reference_index_files
-        ),
-        aligner_log_dir
         )
+        .map { it -> ["file-input", it] }
+
+    run_validate_PipeVal(
+        run_validate_PipeVal_inputs
+        )
 
     // change validation file name depending on whether inputs or outputs are being validated
     //val_filename = ${task.process.split(':')[1].replace('_', '-')} == run-validate ? "input_validation.txt" : "output_validation.txt"
-    run_validate_PipeVal.out.val_file.collectFile(
+    run_validate_PipeVal.out.validation_result.collectFile(
         name: 'input_validation.txt',
         storeDir: "${aligner_validation_dir}"
         )
@@ -124,14 +136,16 @@ workflow align_DNA_BWA_MEM2_workflow {
         }
     }
     generate_sha512sum(och_bam_index.mix(och_bam), aligner_output_dir)
+
+    validate_output_file_inputs = och_bam.mix(
+        och_bam_index,
+        Channel.from(params.work_dir, params.output_dir)
+        )
+        .map { it -> ["file-input", it] }
     validate_output_file(
-        och_bam.mix(
-            och_bam_index,
-            Channel.from(params.work_dir, params.output_dir)
-            ),
-        aligner_log_dir
+        validate_output_file_inputs
         )
-    validate_output_file.out.val_file.collectFile(
+    validate_output_file.out.validation_result.collectFile(
         name: 'output_validation.txt',
         storeDir: "${aligner_validation_dir}"
         )

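Both the input and output validation calls in this diff pair each path with the string "file-input" before handing the channel to PipeVal. A minimal standalone sketch of that mapping step, assuming Nextflow DSL2: the paths are hypothetical placeholders, and no PipeVal process is invoked, so this only shows the shape of the items the process would receive.

```groovy
nextflow.enable.dsl = 2

workflow {
    // Hypothetical placeholder paths; the real pipeline mixes its sample,
    // reference, and output channels here instead.
    example_paths = Channel.of('/data/sample.bam', '/data/sample.bam.bai')

    // Pair every path with the "file-input" tag, as the diff does.
    tagged_paths = example_paths.map { it -> ['file-input', it] }

    // Print each [tag, path] pair; no validation process is called here.
    tagged_paths.view()
}
```

Building the tagged channel in a named variable first, as the PR does, keeps the process call itself down to a single argument.
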
@@ -4,7 +4,7 @@
 // samtools due to the large size of the uncompressed SAM files.
 
 include { run_sort_SAMtools ; run_merge_SAMtools} from './samtools.nf'
-include { run_validate_PipeVal; run_validate_PipeVal as validate_output_file } from './validation.nf'
+include { run_validate_PipeVal; run_validate_PipeVal as validate_output_file } from '../external/nextflow-modules/modules/PipeVal/validate/main.nf'
Review comment: Does the workflow for HISAT2 not require updates like the ones with BWA-MEM2?
Review comment: The HISAT2 workflow needs to get the same updates. However, I cannot get PipeVal to work in the BWA-MEM2 workflow (see previous comments), so I am waiting to update HISAT2. I don't understand what PipeVal is doing on the current
BWA-MEM2:
HISAT2:
Nearly all files are
Review comment: Most of the files are
Align-DNA also tries to validate the
We may need to wait for pipeval to get updated before we can proceed with this PR.
 include { run_MarkDuplicate_Picard } from './mark_duplicate_picardtools.nf'
 include { run_MarkDuplicatesSpark_GATK } from './mark_duplicates_spark.nf'
 include { generate_sha512sum } from './check_512sum.nf'

This file was deleted.
Review comment: We might want to specify that we're using v3.0.0; the module might get updated to newer versions, but the version used by the pipeline is still dictated by the setting in default.config.