Description
Hello VStrains team, thank you for developing such a great tool. While using it, I ran into the following problem.
I attempted to assemble the complete HIV genome from this sample: SRR29407826. I used corona-spades and it worked fine.
However, VStrains crashes when assembling it.
First, there was an error related to rev_dict in VStrains_PE_Inference.py. The dictionary did not include lowercase nucleotides and therefore raised a KeyError. I fixed it by replacing:
rev_dict = {"A": "T", "T": "A", "C": "G", "G": "C"}
with this:
rev_dict = { "A": "T", "T": "A", "C": "G", "G": "C", "a": "t", "t": "a", "c": "g", "g": "c" }
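For reference, the same idea can be written without listing every key by hand. This is only a minimal sketch, not the actual VStrains code: the function name `reverse_complement` is hypothetical, and including `N`/`n` in the table is my own assumption about which characters may appear in the reads.

```python
# Translation table covering upper- and lowercase bases.
# N/n are included as an assumption; any other character is left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of seq, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ACgt"))
```

Unlike a plain dictionary lookup, `str.translate` does not raise KeyError on unexpected characters, so an unusual symbol in a read passes through unchanged instead of crashing the run.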
But then a new issue occurred after these messages in the CLI log:
----------------------Paired-End Information Alignment----------------------
Start aligning reads to gfa nodes
Number of processed reads: 0
It freezes forever and does not proceed any further.
Details worth mentioning
In the same log there is a suspicious message:
INFO - graph kmer size: 0
Also, VStrains can't read the assembly_graph_after_simplification.gfa file (which is the output of spades) unless I manually change the version in its header from 1.2 to 1.0.
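The manual header change I made is roughly equivalent to the sketch below. It assumes the header is an `H` line carrying a `VN:Z:1.2` tag, as in the GFA format; the function name `downgrade_gfa_header` is hypothetical. Note that it only rewrites the version tag and does not convert any records that are specific to newer GFA versions, so it may hide a real incompatibility.

```python
def downgrade_gfa_header(text: str, old: str = "1.2", new: str = "1.0") -> str:
    """Rewrite the VN:Z version tag on GFA header (H) lines only."""
    out = []
    for line in text.splitlines(keepends=True):
        # Header lines start with "H" followed by a tab; other records are untouched.
        if line.startswith("H\t"):
            line = line.replace(f"VN:Z:{old}", f"VN:Z:{new}")
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    example = "H\tVN:Z:1.2\nS\t1\tACGT\n"
    print(downgrade_gfa_header(example), end="")
```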
Steps to reproduce
- Assembly with spades:
spades.py --corona -1 SRR29407826_1.fastq -2 SRR29407826_2.fastq -o spades_G_SRR29407826
- Start VStrains:
vstrains -a spades -g spades_G_SRR29407826/assembly_graph_after_simplification.gfa \
-p spades_G_SRR29407826/contigs.paths \
-o vstrains_G -fwd SRR29407826_1.fastq -rve SRR29407826_2.fastq
Files with reads:
SRR29407826.zip
VStrains log:
vstrains.log
Spades log:
spades.log