Improve qualimap execution time #1356

Open
PramodRaoB opened this issue Aug 20, 2024 · 2 comments

@PramodRaoB

Description of feature

The qualimap tool sorts the input BAM file by read name, and it does so single-threaded. On a dataset that I'm using, the total execution time of the tool is around 85 min. However, if we perform the name sort with samtools first and then run qualimap on the result (with the additional flag --sorted), qualimap's execution time drops to around 33 min. Adding the 7 min the sort took (with 16 threads), the total for the workflow comes to about 40 min, less than half the original time.
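
To make the proposal concrete, here is a rough sketch of the two steps and the timing comparison. It only builds and prints the command lines rather than running them. The helper names, file names, and output directory are hypothetical, and `--paired` is an assumption on my part (the `--sorted` flag only matters for paired-end data); the actual pipeline change would live in the Nextflow modules instead.

```python
import shlex


def name_sort_cmd(bam, out_bam, threads=16):
    """Build a `samtools sort -n` command (name sort, multi-threaded)."""
    return ["samtools", "sort", "-n", "-@", str(threads), "-o", out_bam, bam]


def qualimap_cmd(sorted_bam, gtf, outdir):
    """Build a `qualimap rnaseq` command for an already name-sorted BAM.

    --sorted tells qualimap to skip its own single-threaded name sort.
    --paired is assumed here; drop it for single-end data.
    """
    return [
        "qualimap", "rnaseq",
        "-bam", sorted_bam,
        "-gtf", gtf,
        "-outdir", outdir,
        "--paired",
        "--sorted",
    ]


# Timings from my dataset, in minutes.
baseline = 85               # qualimap doing the name sort itself
presorted_total = 7 + 33    # samtools sort (16 threads) + qualimap --sorted

# Placeholder file names, for illustration only.
print(shlex.join(name_sort_cmd("sample.bam", "sample.namesorted.bam")))
print(shlex.join(qualimap_cmd("sample.namesorted.bam", "genes.gtf", "qualimap_out")))
print(f"{presorted_total} min vs {baseline} min "
      f"({presorted_total / baseline:.0%} of the original time)")
```

Note that these numbers come from a single dataset, so the ratio may differ for other inputs.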

If this change is fine, I could go ahead and implement it. Thanks!

@MatthiasZepper
Member

MatthiasZepper commented Aug 20, 2024

I think that change is perfectly fine!

There is even a name sort already in place for the UMI deduplication route, at least for the transcriptome alignments:

    if (params.with_umi) {
        process {
            withName: 'NFCORE_RNASEQ:RNASEQ:SAMTOOLS_SORT' {
                ext.args   = '-n'
                ext.prefix = { "${meta.id}.umi_dedup.transcriptome" }
                publishDir = [
                    path: { params.save_align_intermeds || params.save_umi_intermeds ? "${params.outdir}/${params.aligner}" : params.outdir },
                    mode: params.publish_dir_mode,
                    pattern: '*.bam',
                    saveAs: { params.save_align_intermeds || params.save_umi_intermeds ? it : null }
                ]
            }
            // Name sort BAM before passing to Salmon
            SAMTOOLS_SORT (
                BAM_DEDUP_STATS_SAMTOOLS_UMITOOLS_TRANSCRIPTOME.out.bam,
                ch_fasta.map { [ [:], it ] }
            )

@MatthiasZepper MatthiasZepper added this to the 3.16.0 milestone Aug 20, 2024
@PramodRaoB
Author

Thanks @MatthiasZepper. I'll work on it.
