GUNZIP does not work for relative paths specified with --fasta and --gtf #1311

Closed
fbnrst opened this issue May 31, 2024 · 1 comment
Labels
bug Something isn't working Ready for review

fbnrst commented May 31, 2024

Description of the bug

Unzipping of the FASTA and GTF files fails when I provide a relative path to those files. I can work around this either by unzipping the file first or by providing an absolute path. Below is a minimal example for the GTF file only; I get the same kind of error for the FASTA file.
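For reference, the two workarounds mentioned above might look roughly like the following. This is only a sketch: `abs_path` is a hypothetical helper introduced here for illustration, the file name is the one from the example below, and the commands are printed rather than executed so the snippet can be tried without Nextflow installed.

```shell
#!/usr/bin/env bash
# Sketch of the two workarounds; prints the commands instead of running them.
gtf="Mus_musculus.GRCm39.111.gtf.gz"

# Hypothetical helper: turn a possibly relative path into an absolute one
# by prefixing the current working directory (the file need not exist yet).
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$PWD" "$1" ;;
  esac
}

gtf_abs="$(abs_path "$gtf")"

# Workaround 1: pass an absolute path to --gtf.
printf 'nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output --gtf %s\n' "$gtf_abs"

# Workaround 2: decompress first, then pass the uncompressed file.
gtf_plain="${gtf%.gz}"
printf 'gunzip -c %s > %s\n' "$gtf" "$gtf_plain"
printf 'nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output --gtf %s\n' "$gtf_plain"
```

Copying the printed lines into the terminal (or replacing `printf` with the command itself) applies the workaround; the same pattern should carry over to `--fasta`.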

Command used and terminal output

$ wget https://ftp.ensembl.org/pub/release-111/gtf/mus_musculus/Mus_musculus.GRCm39.111.gtf.gz
$ nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output --gtf Mus_musculus.GRCm39.111.gtf.gz

 N E X T F L O W   ~  version 24.04.2

Launching `https://github.com/nf-core/rnaseq` [tiny_engelbart] DSL2 - revision: b89fac3265 [3.14.0]



------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/rnaseq v3.14.0-gb89fac3
------------------------------------------------------
Core Nextflow options
  revision                  : 3.14.0
  runName                   : tiny_engelbart
  launchDir                 : /lustre/projects/Fabian_Rost/temp/rnaseq
  workDir                   : /lustre/projects/Fabian_Rost/temp/rnaseq/work
  projectDir                : /home/rost/.nextflow/assets/nf-core/rnaseq
  userName                  : rost
  profile                   : test
  configFiles               :

Input/output options
  input                     : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/samplesheet/v3.10/samplesheet_test.csv
  outdir                    : output

Reference genome options
  fasta                     : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/genome.fasta
  gtf                       : Mus_musculus.GRCm39.111.gtf.gz
  gff                       : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/genes.gff.gz
  transcript_fasta          : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/transcriptome.fasta
  additional_fasta          : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/gfp.fa.gz
  hisat2_index              : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/hisat2.tar.gz
  rsem_index                : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/rsem.tar.gz
  salmon_index              : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/salmon.tar.gz

Read filtering options
  bbsplit_fasta_list        : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/bbsplit_fasta_list.txt

UMI options
  umitools_bc_pattern       : NNNN

Alignment options
  pseudo_aligner            : salmon
  min_mapped_reads          : 5

Process skipping options
  skip_bbsplit              : false

Institutional config options
  config_profile_name       : Test profile
  config_profile_description: Minimal test dataset to check pipeline function

Max job request options
  max_cpus                  : 2
  max_memory                : 6.GB
  max_time                  : 6.h

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/rnaseq for your analysis please cite:

* The pipeline
  https://doi.org/10.5281/zenodo.1400710

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/rnaseq/blob/master/CITATIONS.md

WARN: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Both '--gtf' and '--gff' parameters have been provided.
  Using GTF file as priority.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
WARN: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  '--transcript_fasta' parameter has been provided.
  Make sure transcript names in this file match those in the GFF/GTF file.

  Please see:
  https://github.com/nf-core/rnaseq/issues/753
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF                                               -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF_FILTER                                               -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA                                  [  0%] 0 of 1
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED                                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CUSTOM_GETCHROMSIZES                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:BBMAP_BBSPLIT                                            -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:STAR_GENOMEGENERATE                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:UNTAR_SALMON_INDEX                                       [  0%] 0 of 1
[-        ] process > NFCORE_RNASEQ:RNASEQ:CAT_FASTQ                                                               [  0%] 0 of 2
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC                                 [  0%] 0 of 2
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE                             [  0%] 0 of 2
[-        ] process > NFCORE_RNASEQ:RNASEQ:BBMAP_BBSPLIT                                                           -
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:FQ_SUBSAMPLE                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT                        -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX                       -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS    -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT                                       -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:TX2GENE                                            -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:TXIMPORT                                           -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE                                            -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE_LENGTH_SCALED                              -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE_SCALED                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_TRANSCRIPT                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:DESEQ2_QC_STAR_SALMON                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:SAMTOOLS_INDEX                                -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS             -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT          -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS          -
[-        ] process > NFCORE_RNASEQ:RNASEQ:STRINGTIE_STRINGTIE                                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:SUBREAD_FEATURECOUNTS                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE                                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDTOOLS_GENOMECOV                                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDCLIP                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDGRAPHTOBIGWIG         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDCLIP                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDGRAPHTOBIGWIG         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUALIMAP_RNASEQ                                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:DUPRADAR                                                                -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_BAMSTAT                                                 -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INNERDISTANCE                                           -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INFEREXPERIMENT                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDISTRIBUTION                                        -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDUPLICATION                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:SALMON_QUANT                                  -
Plus 9 more processes waiting for tasks…
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/rnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF'

Caused by:
  Not a valid path value: 'Mus_musculus.GRCm39.111.gtf.gz'



Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details

Relevant files

nextflow.log

System information

  • Nextflow 24.04.2
  • HPC
  • local executor
  • CentOS Linux release 7.4.1708
  • nf-core/rnaseq v3.14.0

pinin4fjords (Member) commented

Thanks for the report. This is due to the file incorrectly being passed as a string; it is fixed by the PR above.
