for minimap2 remove secondary alignments by default #80

Open
colindaven opened this issue Sep 29, 2022 · 5 comments

@colindaven
Contributor

Tests on simulated data show that removing secondary alignments by default brings the read counts much closer to the composition of the original fastq.

This seems to be specific to minimap2, especially with long reads.

samtools view -F 256 -bo filt.bam orig.bam
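To illustrate why the `-F 256` filter brings counts back in line with the input reads, here is a minimal Python sketch. The read names and flags in `records` are made up for illustration; only the flag bit values (0x100 for secondary, 0x800 for supplementary) come from the SAM format.

```python
# SAM flag bits (from the SAM format specification)
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment

def keep_record(flag, exclude_mask):
    """Mimic `samtools view -F exclude_mask`: keep a record only if
    none of the excluded bits are set in its flag."""
    return (flag & exclude_mask) == 0

# Hypothetical (read name, flag) pairs: 3 reads, 6 alignment records.
records = [
    ("read1", 0),    # primary, forward strand
    ("read1", 256),  # secondary
    ("read1", 272),  # secondary, reverse strand (256 + 16)
    ("read2", 16),   # primary, reverse strand
    ("read3", 0),    # primary
    ("read3", 256),  # secondary
]

kept = [r for r in records if keep_record(r[1], SECONDARY)]
n_reads = len({name for name, _ in records})

print(f"records before filtering: {len(records)}")
print(f"records after -F 256:     {len(kept)}")
print(f"distinct reads:           {n_reads}")
```

In this toy case the unfiltered record count is double the number of reads, while the filtered count matches it.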

@colindaven
Contributor Author

colindaven commented Oct 19, 2022

  • Implemented; tests still to do

@colindaven
Contributor Author

colindaven commented Oct 20, 2022

@irosenboom FYI, in case you test this version and find the new .nosec.bam part in the filenames.

Basically, this is why aligning long reads with minimap2 (or short reads, though we mostly use bwa mem for those) led to highly inflated numbers of reported aligned reads, e.g. in the mock communities.

  • This is an optional but recommended flag if using minimap2
  • Do you think I should just set it to run if --longread is set? Or if minimap2long or minimap2short is set? Otherwise it is not really necessary for bwa mem.

@irosenboom
Collaborator

Hi @colindaven, thanks for this interesting update. I would set it to run if minimap2long or minimap2short is set, just in case someone wants to use minimap2 instead of bwa mem for short reads.
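The suggested rule (filter whenever a minimap2 mode is selected, regardless of read length) could be sketched as follows. This is only an illustration in Python, not the pipeline's actual Nextflow code; the function name, the `enabled` switch and the `"bwa"` value are assumptions, while `minimap2long` and `minimap2short` are the mode names mentioned above.

```python
# Aligner modes named in this thread for which the filter should run.
MINIMAP2_MODES = {"minimap2long", "minimap2short"}

def should_remove_secondary(aligner, enabled=True):
    """Return True if secondary alignments should be filtered out.

    `aligner` is the selected aligner mode; `enabled` stands in for a
    hypothetical user-configurable on/off switch.
    """
    return enabled and aligner in MINIMAP2_MODES

for mode in ("minimap2long", "minimap2short", "bwa"):  # "bwa" is a placeholder value
    print(mode, should_remove_secondary(mode))
```

Keying the rule on the aligner rather than on --longread covers the case of short reads aligned with minimap2.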

@colindaven
Contributor Author

colindaven commented Nov 10, 2022

I also added a section to the pipeline that removes supplementary alignments. I changed the .nosec.bam suffix to .ns.bam; it occurs once for each filter, so .ns.ns

In my experience, these filters seem to be necessary only for long reads aligned with minimap2long.

Also, the setting is configurable in nextflow.config, but I would always recommend it for quantitative usage such as in metagenomics.
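For reference, secondary and supplementary alignments can also be excluded in a single `samtools view` pass by combining the two flag bits (256 + 2048 = 2304). The Python sketch below only builds the command line; the helper name and its parameters are hypothetical, and the pipeline itself may well run the two filters as separate steps.

```python
SECONDARY = 0x100      # 256
SUPPLEMENTARY = 0x800  # 2048

def build_filter_cmd(in_bam, out_bam, drop_secondary=True, drop_supplementary=True):
    """Build a `samtools view` argument list that excludes the chosen
    alignment types via -F (exclude records with any of these bits set)."""
    mask = 0
    if drop_secondary:
        mask |= SECONDARY
    if drop_supplementary:
        mask |= SUPPLEMENTARY
    return ["samtools", "view", "-F", str(mask), "-b", "-o", out_bam, in_bam]

cmd = build_filter_cmd("orig.bam", "filt.bam")
print(" ".join(cmd))
# The list could be passed to subprocess.run(cmd, check=True) if samtools is installed.
```

With both switches off the mask is 0 and the command simply copies all records.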

@colindaven
Contributor Author

It seems bwa mem still produces some supplementary alignments despite this switch (a significant number with less filtering).

samtools flagstat SRR13594152_200k_R1.fastp.ns.fix.s.dup.mm.mq30.calmd.bam
23831 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
57 + 0 supplementary
0 + 0 duplicates

It seems minimap2 does too ...

samtools flagstat tmp_sample1_R1.trm.ns.fix.s.bam
41498 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
304 + 0 supplementary
0 + 0 duplicates
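A check like the ones above could be automated by parsing the flagstat text. The following Python sketch is one possible way to do it; the function name and regular expression are my own, and the sample text is the bwa mem output quoted above.

```python
import re

# Matches lines such as "57 + 0 supplementary" or
# "23831 + 0 in total (QC-passed reads + QC-failed reads)".
_LINE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \(.*\))?$")

def parse_flagstat(text):
    """Return {category: (qc_passed, qc_failed)} from flagstat output."""
    counts = {}
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            counts[m.group(3)] = (int(m.group(1)), int(m.group(2)))
    return counts

sample = """\
23831 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
57 + 0 supplementary
0 + 0 duplicates
"""

stats = parse_flagstat(sample)
for category in ("secondary", "supplementary"):
    passed, _failed = stats[category]
    status = "still present" if passed else "none left"
    print(f"{category}: {passed} ({status})")
```

Lines that do not follow the `N + M category` pattern, such as the command line itself, are skipped.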
