Dear @hannespetur
Thank you and colleagues for the very nice svimmer and graphtyper software.
I would like to use svimmer and graphtyper for forced genotyping, in many WGS samples, of the UNION of SVs discovered by Manta (many WGS samples) and SVIM-ASM (a few assemblies).
SVIM-ASM github
https://github.com/eldariont/svim-asm
The versions that I am using are svimmer/20211209 and graphtyper/2.7.3.
When I try to get the (merged) UNION of SVs via svimmer, I get this error:
Traceback (most recent call last):
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/svimmer", line 82, in append_svs_from_vcf
    svs.append(SV(record, check_type=not args.ignore_types, join_mode=args.join_mode, output_ids=args.ids))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/sv.py", line 75, in __init__
    assert False
AssertionError
This is caused by svimmer not recognizing the DUP:TANDEM and DUP:INT types that SVIM-ASM outputs (see svimmer/sv.py, line 41 in f2d78b2).
I can use the svimmer argument --ignore-types to get svimmer to work.
But then graphtyper complains about "Unknown SV type", and I guess it also drops the SVs of unknown type?
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
Would it be possible to add a mapping for DUP:TANDEM and DUP:INT in the main branch of the svimmer code (svimmer/sv.py, line 41 in f2d78b2)?
Then the combination of SVIM-ASM and svimmer/graphtyper would work for me and for others with the same use case and combination of tools.
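In the meantime, one possible stopgap would be to collapse the SVIM-ASM subtypes to plain DUP in the input VCF before running svimmer, since DUP is a type that both tools already emit. Below is a minimal sketch of that idea, not a tested fix: the mapping table and the function name `normalize_svtype` are hypothetical, only the SVTYPE entry of the INFO column is rewritten (the ALT column is left untouched), and whether collapsing the subtypes is appropriate for genotyping is exactly the open question of this issue.

```python
# Hypothetical mapping (an assumption, not part of svimmer or graphtyper):
# collapse the SVIM-ASM duplication subtypes to the plain DUP type.
SUBTYPE_TO_TYPE = {"DUP:TANDEM": "DUP", "DUP:INT": "DUP"}


def normalize_svtype(vcf_line: str) -> str:
    """Return one VCF line with a known SVTYPE subtype replaced in INFO.

    Header lines and records without an INFO column are returned unchanged.
    """
    if vcf_line.startswith("#"):
        return vcf_line

    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:  # INFO is the 8th tab-separated VCF column
        return vcf_line

    info_entries = fields[7].split(";")
    for i, entry in enumerate(info_entries):
        key, _, value = entry.partition("=")
        if key == "SVTYPE" and value in SUBTYPE_TO_TYPE:
            info_entries[i] = "SVTYPE=" + SUBTYPE_TO_TYPE[value]
    fields[7] = ";".join(info_entries)

    # Preserve the original line ending.
    newline = "\n" if vcf_line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

Applied line by line to an uncompressed SVIM-ASM VCF, this would leave every record intact except for the SVTYPE value, so svimmer could then be run without --ignore-types.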
I also don't understand why SVs of type DUP, CNV and INV are mapped to type INS here (svimmer/sv.py, line 45 in f2d78b2):
    elif info_dict["SVTYPE"] == "ALU" or info_dict["SVTYPE"] == "LINE1" or info_dict["SVTYPE"] == "SVA" or \
That does not make sense to me. INS is a novel sequence, whereas DUP, CNV and INV are sequences already found on the reference genome, so presumably they also need to be genotyped differently in graphtyper?
What I also find strange is that both svimmer and graphtyper do output SVs of type DUP.
I can't square that with the mapping of DUP, CNV and INV to INS. Or is the SV type perhaps recalculated somewhere else in svimmer/graphtyper?
Thank you for your thoughts and help on this.