Dsl2 new test configs #1121

Open: wants to merge 8 commits into base: dev
64 changes: 64 additions & 0 deletions conf/test_default.config
@@ -0,0 +1,64 @@
/*
Review comment (Member): This should be renamed to just test; we need at least one test profile with exactly that name.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/eager -profile test_default,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

// TO DO: Change name to test.config once migration is complete.
params {
config_profile_name = 'Test profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = params.pipelines_testdata_base_path + 'eager/testdata/Mammoth/samplesheet_v3.tsv'

// Genome references
fasta = params.pipelines_testdata_base_path + 'eager/reference/Mammoth/Mammoth_MT_Krause.fasta'

// Preprocessing
preprocessing_tool = 'adapterremoval'

// Sharding FASTQ
run_fastq_sharding = true
fastq_shard_size = 5000

// Mapping
mapping_tool = 'bwaaln'
skip_qualimap = false

// BAM filtering
run_bamfiltering = true
bamfiltering_minreadlength = 30
bamfiltering_mappingquality = 37
deduplication_tool = 'markduplicates'

// PreSeq
mapstats_preseq_mode = 'c_curve'

// Damage calculation
damagecalculation_tool = 'damageprofiler'

// Genotyping
genotyping_tool = 'ug'
Review comment (Member): Do we not yet have bcftools functionality in there, or does it run by default?

// Metagenomic screening
run_metagenomics = true
metagenomics_profiling_tool = 'metaphlan'
metagenomics_profiling_database = params.pipelines_testdata_base_path + 'eager/databases/metaphlan/metaphlan4_database.tar.gz'
metagenomics_run_postprocessing = true
}
62 changes: 62 additions & 0 deletions conf/test_humanpopgen.config
@@ -0,0 +1,62 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/eager -profile test_humanpopgen,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]

// Ignore exit status 134 from ANGSD_CONTAMINATION so the test does not fail for lack of X-chromosome reads, without having to enlarge the test dataset
withName: ANGSD_CONTAMINATION {
errorStrategy = { task.exitStatus in [134] ? 'ignore' : 'finish' }
}
}

params {
config_profile_name = 'Test human popgen profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = params.pipelines_testdata_base_path + 'eager/testdata/Human/human_design_bam_eager3.tsv'

// Genome references
fasta = params.pipelines_testdata_base_path + 'eager/reference/Human/hs37d5_chr21-MT.fa.gz'

// Mapping
mapping_tool = 'bowtie2'
convert_inputbam = true
Review comment (Member): I think we don't want this to convert back to FASTQ; the test should start straight from BAM input going into filtering (i.e., no remapping). Or did you already discuss this with Thiseas?

That said, we would otherwise have no test of converting back to FASTQ, so maybe it's OK.

Review comment (Member): Correction: you have already set convert_inputbam in test_modern, so it can be removed here.

Suggested change: delete the line
convert_inputbam = true

// BAM filtering
run_bamfiltering = true
bamfiltering_minreadlength = 30
bamfiltering_mappingquality = 37

// Damage
damagecalculation_tool = 'mapdamage'
run_trim_bam = true

// Contamination
run_mtnucratio = true
run_contamination_estimation_angsd = true

// Genotyping
genotyping_tool = 'pileupcaller'
genotyping_pileupcaller_bedfile = params.pipelines_testdata_base_path + 'eager/reference/Human/1240K.pos.list_hs37d5.0based.bed.gz'
genotyping_pileupcaller_snpfile = params.pipelines_testdata_base_path + 'eager/reference/Human/1240K_covered_in_JK2067_downsampled_s0.1.numeric_chromosomes.snp'

// Sex determination
run_sexdeterrmine = true
sexdeterrmine_bedfile = params.pipelines_testdata_base_path + 'eager/reference/Human/1240K.pos.list_hs37d5.0based.bed.gz'

}
58 changes: 58 additions & 0 deletions conf/test_microbial.config
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/eager -profile test_microbial,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

params {
config_profile_name = 'Test microbial profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = params.pipelines_testdata_base_path + 'eager/testdata/Mammoth/samplesheet_PE_only_v3.tsv'

// Genome references
fasta_sheet = params.pipelines_testdata_base_path + 'eager/reference/reference_sheet_multiref.csv'

Review comment (Member): I think we are missing the bedtools coverage stats from this one, so they will need to be turned on and the BED input file provided.

// Preprocessing
sequencing_qc_tool = 'falco'
preprocessing_excludeunmerged = true

// Mapping
mapping_tool = 'circularmapper'

// BAM filtering
deduplication_tool = "dedup"
run_bamfiltering = true
bamfiltering_minreadlength = 30
bamfiltering_mappingquality = 37

// Metagenomics
run_metagenomics = true
metagenomics_profiling_tool = 'krakenuniq'
metagenomics_profiling_database = params.pipelines_testdata_base_path + 'eager/databases/krakenuniq/testdb-krakenuniq.tar.gz'
run_host_removal = true

// Manipulate Damage
run_mapdamage_rescaling = true

// Genotyping
genotyping_tool = 'freebayes'


}
50 changes: 50 additions & 0 deletions conf/test_modern.config
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/eager -profile test_modern,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

params {
config_profile_name = 'Test modern profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = params.pipelines_testdata_base_path + 'eager/testdata/Human/human_design_bam_eager3.tsv'

// Genome references
fasta = params.pipelines_testdata_base_path + 'eager/reference/Human/hs37d5_chr21-MT.fa.gz'


// Preprocessing
sequencing_qc_tool = 'falco'
preprocessing_tool = 'fastp'
convert_inputbam = true

// Mapping
mapping_tool = 'bwamem'

// Metagenomics
run_metagenomics = true
metagenomics_complexity_tool = 'prinseq'
metagenomics_profiling_tool = 'kraken2'
metagenomics_profiling_database = params.pipelines_testdata_base_path + 'eager/databases/kraken/eager_test.tar.gz'
metagenomics_run_postprocessing = true

// Genotyping
genotyping_tool = 'hc'

}
51 changes: 51 additions & 0 deletions conf/test_shortdna.config
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/eager -profile test_shortdna,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

params {
config_profile_name = 'Test very short DNA profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = params.pipelines_testdata_base_path + 'eager/testdata/Mammoth/samplesheet_v3.tsv'

// Genome references
fasta_sheet = params.pipelines_testdata_base_path + 'eager/reference/reference_sheet_multiref.csv'

// Mapping
// TO DO: Change when mapAD is there.
// mapping_tool = 'mapad'

// Metagenomics
run_metagenomics = true
metagenomics_complexity_tool = 'bbduk'
metagenomics_profiling_tool = 'malt'
metagenomics_profiling_database = params.pipelines_testdata_base_path + 'eager/databases/malt/eager_test.tar.gz'
metagenomics_run_postprocessing = true
metagenomics_maltextract_taxonlist = params.pipelines_testdata_base_path + 'eager/testdata/Mammoth/maltextract/MaltExtract_list.txt'
metagenomics_maltextract_ncbidir = 'https://github.com/rhuebler/HOPS/raw/external/Resources/'

// Manipulate Damage
run_pmd_filtering = true

// Genotyping
genotyping_tool = 'angsd'


}
23 changes: 14 additions & 9 deletions nextflow.config
@@ -384,15 +384,20 @@ profiles {
test_full {
includeConfig 'conf/test_full.config'
}
- test { includeConfig 'conf/test.config' }
- test_full { includeConfig 'conf/test_full.config' }
- test_nothing { includeConfig 'conf/test_nothing.config' }
- test_humanbam { includeConfig 'conf/test_humanbam.config' }
- test_multiref { includeConfig 'conf/test_multiref.config' }
- test_kraken2 { includeConfig 'conf/test_kraken2.config' }
- test_malt { includeConfig 'conf/test_malt.config' }
- test_krakenuniq { includeConfig 'conf/test_krakenuniq.config'}
- test_metaphlan { includeConfig 'conf/test_metaphlan.config' }
+ test { includeConfig 'conf/test.config' }
+ test_full { includeConfig 'conf/test_full.config' }
+ test_nothing { includeConfig 'conf/test_nothing.config' }
+ test_humanbam { includeConfig 'conf/test_humanbam.config' }
+ test_multiref { includeConfig 'conf/test_multiref.config' }
+ test_kraken2 { includeConfig 'conf/test_kraken2.config' }
+ test_malt { includeConfig 'conf/test_malt.config' }
+ test_krakenuniq { includeConfig 'conf/test_krakenuniq.config' }
+ test_metaphlan { includeConfig 'conf/test_metaphlan.config' }
+ test_default { includeConfig 'conf/test_default.config' }
Review comment (Member): See the comment above: test_default should replace test and thus be called just test.

You should also remove the old profiles, except for test_full.

That leaves test, test_full, and all the new ones you've made.

Oh, I just remembered: please rename test_nothing to test_minimal (I've sadly been overruled by other pipelines, but we should stay consistent across the community).

+ test_modern { includeConfig 'conf/test_modern.config' }
+ test_microbial { includeConfig 'conf/test_microbial.config' }
+ test_shortdna { includeConfig 'conf/test_shortdna.config' }
+ test_humanpopgen { includeConfig 'conf/test_humanpopgen.config' }
}

// Load nf-core custom profiles from different Institutions
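
The profile cleanup requested in the review comment on this hunk could look roughly like the sketch below. It assumes conf/test_default.config is renamed to conf/test.config, and that the renamed test_minimal profile points at a conf/test_minimal.config file (a hypothetical file name, presumably the renamed conf/test_nothing.config). Neither rename is part of this PR yet, and all non-test profiles are omitted.

```groovy
profiles {
    // Sketch only: test profiles left after applying the review feedback.
    test             { includeConfig 'conf/test.config'             } // assumed rename of conf/test_default.config
    test_full        { includeConfig 'conf/test_full.config'        }
    test_minimal     { includeConfig 'conf/test_minimal.config'     } // hypothetical rename of conf/test_nothing.config
    test_modern      { includeConfig 'conf/test_modern.config'      }
    test_microbial   { includeConfig 'conf/test_microbial.config'   }
    test_shortdna    { includeConfig 'conf/test_shortdna.config'    }
    test_humanpopgen { includeConfig 'conf/test_humanpopgen.config' }
}
```

With that layout, the usage line in the header of the renamed config would read -profile test,<docker/singularity> instead of -profile test_default,<docker/singularity>.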
2 changes: 1 addition & 1 deletion subworkflows/local/utils_nfcore_eager_pipeline/main.nf
@@ -215,7 +215,7 @@ def validateInputParameters() {
if ( !params.fasta && !params.fasta_sheet ) { exit 1, "[nf-core/eager] ERROR: Neither FASTA file --fasta nor reference sheet --fasta_sheet have been provided."}
if ( params.fasta && params.fasta_sheet ) { exit 1, "[nf-core/eager] ERROR: A FASTA file --fasta and a reference sheet --fasta_sheet have been provided. These parameters are mutually exclusive."}
if ( params.preprocessing_adapterlist && params.preprocessing_skipadaptertrim ) { log.warn("[nf-core/eager] WARNING: --preprocessing_skipadaptertrim will override --preprocessing_adapterlist. Adapter trimming will be skipped!") }
- if ( params.deduplication_tool == 'dedup' && ! params.preprocessing_excludeunmerged ) { exit 1, "[nf-core/eager] ERROR: Dedup can only be used on collapsed (i.e. merged) PE reads. For all other cases, please set --deduplication_tool to 'markduplicates'."}
+ if ( params.deduplication_tool == 'dedup' && ! params.preprocessing_excludeunmerged ) { exit 1, "[nf-core/eager] ERROR: Dedup can only be used on collapsed (i.e. merged) PE reads without singletons. If you want to use Dedup, please provide --preprocessing_excludeunmerged. For all other cases, please set --deduplication_tool to 'markduplicates'."}
if ( params.bamfiltering_retainunmappedgenomicbam && params.bamfiltering_mappingquality > 0 ) { exit 1, ("[nf-core/eager] ERROR: You cannot both retain unmapped reads and perform quality filtering, as unmapped reads have a mapping quality of 0. Pick one or the other functionality.") }
if ( params.genotyping_source == 'trimmed' && ! params.run_trim_bam ) { exit 1, ("[nf-core/eager] ERROR: --genotyping_source cannot be 'trimmed' unless BAM trimming is turned on with `--run_trim_bam`.") }
if ( params.genotyping_source == 'pmd' && ! params.run_pmd_filtering ) { exit 1, ("[nf-core/eager] ERROR: --genotyping_source cannot be 'pmd' unless PMD-filtering is ran.") }
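
As a minimal sketch, a parameter combination that should pass the updated DeDup check looks like the fragment below. Both parameter names are taken from the check itself, and the pairing mirrors conf/test_microbial.config in this PR. The fragment does not verify that the input data is actually paired-end.

```groovy
params {
    // DeDup only works on collapsed (merged) paired-end reads,
    // so unmerged reads are excluded during preprocessing.
    preprocessing_excludeunmerged = true
    deduplication_tool            = 'dedup'
}
```

Keeping deduplication_tool = 'dedup' while leaving preprocessing_excludeunmerged false would be expected to hit the exit 1 branch and print the new error message.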