Commit 10386f2

--inputgffs (non-mandatory) works
1 parent a162fb2 commit 10386f2

4 files changed, +12 -8 lines changed

main.nf (+12 -8)
@@ -6,20 +6,19 @@
  * The GTDB genomes are expected to be downloaded and annotated.
  *
  * The workflow starts from a set of annotated genomes in the format of faa.gz files (--inputfaas)
- * and gff.gz files (--inputgffs) plus a set of hmm profiles (--hmms). The protein sequences will be
- * searched with HMMER using the hmm files and subsequently classified into which profile it fits
- * best into. The latter uses a table describing the hierarchy of hmm profiles
+ * and, optionally, gff.gz files (--inputgffs) plus a set of hmm profiles (--hmms). The protein
+ * sequences will be searched with HMMER using the hmm files and subsequently classified into which
+ * profile it fits best into. The latter uses a table describing the hierarchy of hmm profiles
  * (--profiles_hierarchy; see --help).
  *
  * Requirements:
  * directory with faa.gz files
- * directory with .gff.gz files
  * directory with all hmm profiles to be run
  * file describing the hmm profile hierarchy
  *
  * Processing steps:
  * Concatenate all faa.gz files into a single one
- * Concatenate all gff.gz files into a single one
+ * Optionally, concatenate all gff.gz files into a single one
  * Perform an hmmsearch of all hmm profiles on all the proteomes
  * Download the metadata files for archaeal and bacterial genomes from gtdb latest version
  * repository and concatenate them into a single metadata file
@@ -50,11 +49,10 @@ def helpMessage() {
 
  The typical command for running the pipeline is as follows:
 
- nextflow run main.nf --inputfaas path/to/genomes.faa.gzs --inputgffs path/to/genomes.gff.gzs --outputdir path/to/results --hmm_mincov value --dbsource GTDB:GTDB:release
+ nextflow run main.nf --inputfaas path/to/genomes.faa.gzs [--inputgffs path/to/genomes.gff.gzs] --outputdir path/to/results --hmm_mincov value --dbsource GTDB:GTDB:release
 
  Mandatory arguments:
  --inputfaas path/to/genomes.faa.gzs Path of directory containing annotated genomes in the format faa.gz
- --inputgffs path/to/genomes.gff.gzs Path of directory containing annotated genomes in the format gff.gz
  --gtdb_bac_metadata path/to/file Path of tsv file including the metadata for bacterial genomes
  --gtdb_arc_metadata path/to/file Path of tsv file including the metadata for archaeal genomes
  --hmms path/to/hmm_directory Path of directory with HMM profile files
@@ -66,6 +64,7 @@ def helpMessage() {
  --featherprefix prefix Prefix for generated feather files (default "pfitmap-gtdb").
 
  Non Mandatory parameters:
+ --inputgffs path/to/genomes.gff.gzs Path of directory containing annotated genomes in the format gff.gz
  --max_cpus Maximum number of CPU cores to be used (default = 2)
  --max_time Maximum time per process (default = 10 days)
 
@@ -98,7 +97,12 @@ if( !params.gtdb_bac_metadata ) {
 
  // Create channels to start processing
  genome_faas = Channel.fromPath(params.inputfaas, checkIfExists : true)
- genome_gffs = Channel.fromPath(params.inputgffs, checkIfExists : true)
+ if ( params.inputgffs ) {
+     genome_gffs = Channel.fromPath(params.inputgffs, checkIfExists : true)
+ }
+ else {
+     genome_gffs = Channel.empty()
+ }
  hmm_files = Channel.fromPath("$params.hmms/*.hmm")
  profiles_hierarchy = Channel.fromPath(params.profiles_hierarchy, checkIfExists : true)
  dbsource = Channel.value(params.dbsource)
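
The optional-input pattern in the last hunk can also be written as a single expression. The following is a minimal sketch and not part of the commit: it assumes Nextflow DSL2 (hence the `workflow` block), and the `params.inputgffs = null` default is a hypothetical addition that keeps the parameter defined when the flag is not passed on the command line.

```groovy
// Sketch only: assumes Nextflow DSL2. The default below is a hypothetical
// addition for illustration and does not appear in the commit.
params.inputgffs = null

workflow {
    // Use the gff.gz files when --inputgffs is given, otherwise an empty channel
    genome_gffs = params.inputgffs ?
        Channel.fromPath(params.inputgffs, checkIfExists: true) :
        Channel.empty()

    // Prints one line per gff.gz path; prints nothing when --inputgffs is omitted
    genome_gffs.view { "gff input: ${it}" }
}
```

With an empty channel, any downstream step that consumes `genome_gffs` simply receives no items, which is what makes the gff concatenation step optional.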
Binary file not shown.
Binary file not shown.
Binary file not shown.
