Invalid hisat alignments with new genome #165

Open
aakarsh-anand opened this issue Jan 19, 2022 · 0 comments

Describe the bug

I'm testing a new genome file with our pipelines, and I've generated alignments against it using both bwa and hisat2.
Picard CollectWgsMetrics ran to completion on an F72 node for all bwa and hisat2 alignments against the old genome, and for the bwa alignments against the new genome. However, the runs fail for every hisat2 alignment against the new genome, always with the same type of error, which I can't interpret:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 27745
    at picard.analysis.AbstractWgsMetricsCollector.isReferenceBaseN(AbstractWgsMetricsCollector.java:218)
    at picard.analysis.WgsMetricsProcessorImpl.processFile(WgsMetricsProcessorImpl.java:92)
    at picard.analysis.CollectWgsMetrics.doWork(CollectWgsMetrics.java:236)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)

New genome and hisat2 index, generated with hisat2-build:
/hot/users/aanand/GRCh38_masked_hisat

config:
/hot/users/aanand/config/hisat2/a-full-P2.config

BAM file (the other hisat2 alignments in this directory also fail with CollectWgsMetrics):
/hot/users/aanand/hisat2_test_results/pipeline-alignDNA.inputs.a-full.P2/align-DNA-20211110-035825/HISAT2-2.2.1/pipeline-alignDNA.inputs.a-full.P2.bam

To Reproduce
Steps to reproduce the behavior:
Submit the following command via sbatch on an F72 node:

java -jar /hot/users/aanand/picard.jar CollectWgsMetrics I=/hot/users/aanand/hisat2_test_results/coverages/pipeline-alignDNA.inputs.a-full-fixed.P2.bam O=a-full-P2_collect_wgs_metrics_masked.txt R=/hot/users/aanand/GRCh38_masked_bwa/GCA_000001405.15_GRCh38_no_alt_analysis_set_maskedGRC_exclusions.fasta
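
One check that may help narrow this down, offered as an assumption rather than a confirmed diagnosis: an ArrayIndexOutOfBoundsException inside isReferenceBaseN would be consistent with the BAM header describing a contig that is longer than, or absent from, the FASTA passed as R=. The command above passes the FASTA under GRCh38_masked_bwa, while the hisat2 index lives under GRCh38_masked_hisat, so it seems worth confirming that both were built from the same sequences. The Python sketch below compares contig names and lengths from a SAM header (as printed by samtools view -H) against a FASTA index (the .fai file written by samtools faidx). The function names and the sample values are hypothetical.

```python
def parse_sam_header_sq(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields after the record type look like "SN:chr1" and "LN:1000".
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai(fai_text):
    """Return {contig_name: length} from the first two columns of a .fai file."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, ref_contigs):
    """List human-readable differences between BAM and reference contigs."""
    problems = []
    for name, bam_len in bam_contigs.items():
        if name not in ref_contigs:
            problems.append(f"{name}: in BAM header but missing from reference")
        elif ref_contigs[name] != bam_len:
            problems.append(
                f"{name}: BAM length {bam_len} != reference length {ref_contigs[name]}"
            )
    for name in ref_contigs:
        if name not in bam_contigs:
            problems.append(f"{name}: in reference but missing from BAM header")
    return problems


# Hypothetical inputs for illustration only; real input would come from
# `samtools view -H <bam>` and from the reference's .fai file.
example_header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:500\n"
example_fai = "chr1\t1000\t6\t60\t61\nchr2\t400\t1030\t60\t61\n"

for problem in compare_contigs(parse_sam_header_sq(example_header),
                               parse_fai(example_fai)):
    print(problem)
```

An empty result would only rule out a name or length mismatch; it would not show that the sequence content is identical.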

Expected behavior
collectWGSmetrics completes and produces an output file with metrics.

Additional context
I tried running picard ValidateSamFile on the new a-full-P2 bam and got this result:

Error Type      Count

ERROR:MISMATCH_FLAG_MATE_NEG_STRAND     30456516

Picard FixMateInformation does not appear to resolve this error, but I'm also trying samtools now.
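
For reference, MISMATCH_FLAG_MATE_NEG_STRAND reports that a read's "mate reverse strand" flag bit (0x20) disagrees with the "read reverse strand" bit (0x10) set on its mate. The following Python sketch shows that consistency rule on a pair of SAM FLAG values. The function name is hypothetical, and this is an illustration of the rule rather than Picard's own implementation.

```python
READ_REVERSE = 0x10  # SAM FLAG bit: this read aligns to the reverse strand
MATE_REVERSE = 0x20  # SAM FLAG bit: this read's mate aligns to the reverse strand


def mate_strand_flags_consistent(flag_a, flag_b):
    """Check that each mate's 0x20 bit agrees with the other mate's 0x10 bit."""
    a_is_reverse = bool(flag_a & READ_REVERSE)
    b_is_reverse = bool(flag_b & READ_REVERSE)
    a_says_mate_reverse = bool(flag_a & MATE_REVERSE)
    b_says_mate_reverse = bool(flag_b & MATE_REVERSE)
    return a_says_mate_reverse == b_is_reverse and b_says_mate_reverse == a_is_reverse


# 99 = paired, proper pair, mate reverse, first in pair
# 147 = paired, proper pair, read reverse, second in pair
print(mate_strand_flags_consistent(99, 147))  # consistent pair
# 131 = paired, proper pair, second in pair (forward strand), so 99's claim is wrong
print(mate_strand_flags_consistent(99, 131))  # inconsistent pair
```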
