Description
Hi,
The VCF file generated by Longshot causes an error when I phase it with WhatsHap.
My command: whatshap phase --ignore-read-groups --reference Asm.pa_ctg.fasta Asm.pa_ctg_bwa_sorted_longshot.vcf.gz Asm.pa_ctg_bwa_HiFi_sorted.bam -o Asm.pa_ctg_bwa_sorted_longshot_phased.vcf.gz
Error:
This is WhatsHap 1.1.dev91+gc714ed5 running under Python 3.7.4
Working on 1 samples from 1 family
======== Working on chromosome 'ptg000003l'
---- Processing individual SAMPLE
Using maximum coverage per sample of 15X
Number of variants skipped due to missing genotypes: 0
Number of remaining heterozygous variants: 272
Reading alignments and detecting alleles ...
Found 485 reads covering 272 variants
Kept 98 reads that cover at least two variants each
Reducing coverage to at most 15X by selecting most informative reads ...
Selected 68 reads covering 259 variants
Best-case phasing would result in 5 non-singleton phased blocks (5 in total)
... after read selection: 5 non-singleton phased blocks (5 in total)
Variants covered by at least one phase-informative read in at least one individual after read selection: 259
Phasing 1 sample by solving the MEC problem ...
MEC cost: 3840
No. of phased blocks: 5
Largest block contains 247 variants (95.4% of accessible variants) between position 13307 and 23353
======== Writing VCF
[E::vcf_parse_format] Invalid character '.' in 'GQ' FORMAT field at ptg000003l:13085
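The error text suggests that the VCF writer is reading GQ as an integer while the records carry decimal values (for example 19.59). That reading is an assumption based on the message alone, not something confirmed in the log. A minimal Python sketch for listing the records whose GQ is not a plain integer, assuming a tab-separated, single-sample VCF; the function name `find_non_integer_gq` is hypothetical:

```python
def find_non_integer_gq(vcf_lines):
    """Yield (chrom, pos, gq) for records whose GQ value is not a plain integer.

    Assumes tab-separated VCF lines with one sample column (column 10).
    """
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns to inspect
        keys = fields[8].split(":")
        values = fields[9].split(":")
        if "GQ" not in keys:
            continue
        idx = keys.index("GQ")
        if idx >= len(values):
            continue  # trailing fields were dropped for this sample
        gq = values[idx]
        # "." means missing; anything else must be all digits to count as an integer.
        if gq != "." and not gq.lstrip("-").isdigit():
            yield fields[0], int(fields[1]), gq
```

For a compressed file, the lines could be supplied with `gzip.open(path, "rt")` from the standard library; the first reported position can then be compared with the one named in the error message.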
My VCF file:
##fileformat=VCFv4.2
##source=Longshot v0.4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth of reads passing MAPQ filter">
##INFO=<ID=AC,Number=R,Type=Integer,Description="Number of Observations of Each Allele">
##INFO=<ID=AM,Number=1,Type=Integer,Description="Number of Ambiguous Allele Observations">
##INFO=<ID=MC,Number=1,Type=Integer,Description="Minimum Error Correction (MEC) for this single variant">
##INFO=<ID=MF,Number=1,Type=Float,Description="Minimum Error Correction (MEC) Fraction for this variant.">
##INFO=<ID=MB,Number=1,Type=Float,Description="Minimum Error Correction (MEC) Fraction for this variant's haplotype block.">
##INFO=<ID=AQ,Number=1,Type=Float,Description="Mean Allele Quality value (PHRED-scaled).">
##INFO=<ID=GM,Number=1,Type=Integer,Description="Phased genotype matches unphased genotype (boolean).">
##INFO=<ID=DA,Number=1,Type=Integer,Description="Total Depth of reads at any MAPQ (but passing samtools filter 0xF00).">
##INFO=<ID=MQ10,Number=1,Type=Float,Description="Fraction of reads (passing 0xF00) with MAPQ>=10.">
##INFO=<ID=MQ20,Number=1,Type=Float,Description="Fraction of reads (passing 0xF00) with MAPQ>=20.">
##INFO=<ID=MQ30,Number=1,Type=Float,Description="Fraction of reads (passing 0xF00) with MAPQ>=30.">
##INFO=<ID=MQ40,Number=1,Type=Float,Description="Fraction of reads (passing 0xF00) with MAPQ>=40.">
##INFO=<ID=MQ50,Number=1,Type=Float,Description="Fraction of reads (passing 0xF00) with MAPQ>=50.">
##INFO=<ID=PH,Number=G,Type=Integer,Description="PHRED-scaled Probabilities of Phased Genotypes">
##INFO=<ID=SC,Number=1,Type=String,Description="Reference Sequence in 21-bp window around variant.">
##FILTER=<ID=dn,Description="In a dense cluster of variants">
##FILTER=<ID=dp,Description="Exceeds maximum depth">
##FILTER=<ID=sb,Description="Allelic strand bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase Set">
##FORMAT=<ID=UG,Number=1,Type=String,Description="Unphased Genotype (pre-haplotype-assembly)">
##FORMAT=<ID=UQ,Number=1,Type=Float,Description="Unphased Genotype Quality (pre-haplotype-assembly)">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
ptg000002l 31726 . G T 19.59 PASS DP=12;AC=9,3;AM=0;MC=0;MF=0.000;MB=0.000;AQ=30.13;GM=1;DA=29;MQ10=0.72;MQ20=0.69;MQ30=0.41;MQ40=0.41;MQ50=0.38;PH=19.59,3.06,3.06,238.12;SC=GGCCTATTTTGTACCACACGG; GT:GQ:PS:UG:UQ 0/1:19.59:.:0/1:19.59
ptg000002l 37154 . G T 34.57 PASS DP=9;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.000;AQ=30.13;GM=1;DA=44;MQ10=0.39;MQ20=0.23;MQ30=0.20;MQ40=0.11;MQ50=0.09;PH=34.57,3.01,3.01,102.47;SC=GGCCTATTTTGTACCACACGG; GT:GQ:PS:UG:UQ 0/1:34.57:.:0/1:34.57
ptg000002l 53824 . G C 8.31 PASS DP=6;AC=2,3;AM=1;MC=1;MF=0.200;MB=0.200;AQ=10.96;GM=1;DA=30;MQ10=0.27;MQ20=0.23;MQ30=0.20;MQ40=0.13;MQ50=0.10;PH=8.31,0.83,37.16,15.98;SC=GGCGGACGAAGTTGGCCAACC; GT:GQ:PS:UG:UQ 0|1:7.62:53824:0/1:39.87
ptg000002l 53832 . A G 9.74 PASS DP=6;AC=2,3;AM=1;MC=1;MF=0.200;MB=0.200;AQ=10.63;GM=1;DA=33;MQ10=0.24;MQ20=0.21;MQ30=0.18;MQ40=0.12;MQ50=0.09;PH=9.74,2.17,38.50,5.43;SC=AAGTTGGCCAACCCCACCCCC; GT:GQ:PS:UG:UQ 0|1:4.06:53824:0/1:34.97
ptg000002l 54482 . T G 31.57 PASS DP=9;AC=5,3;AM=0;MC=0;MF=0.000;MB=0.000;AQ=30.13;GM=1;DA=36;MQ10=0.47;MQ20=0.42;MQ30=0.25;MQ40=0.25;MQ50=0.19;PH=31.57,3.01,3.01,129.59;SC=GGCTGTGCCCTCGGCCTATTT; GT:GQ:PS:UG:UQ 0/1:31.57:.:0/1:31.57
ptg000002l 56222 . T G 11.01 PASS DP=38;AC=30,5;AM=0;MC=0;MF=0.000;MB=0.000;AQ=30.13;GM=1;DA=64;MQ10=0.72;MQ20=0.69;MQ30=0.59;MQ40=0.58;MQ50=0.55;PH=11.01,3.37,3.37,500.00;SC=AGCCCACCCGTTTTTGTGGCC; GT:GQ:PS:UG:UQ 0/1:11.01:.:0/1:11.01
ptg000002l 58436 . T G 8.22 PASS DP=19;AC=13,3;AM=1;MC=0;MF=0.000;MB=0.000;AQ=15.18;GM=1;DA=31;MQ10=0.87;MQ20=0.77;MQ30=0.61;MQ40=0.48;MQ50=0.19;PH=8.22,3.72,3.72,347.26;SC=CGCCCCGGCCTTCGGCCGGTT; GT:GQ:PS:UG:UQ 0/1:8.22:.:0/1:8.22
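One possible workaround, sketched here under the assumption that the decimal GQ values are what the writer rejects, is to round GQ to whole numbers and declare it as Integer in the header before phasing. This is not a confirmed fix, it discards the fractional part of the genotype quality, and the function name `round_gq_line` is hypothetical:

```python
def round_gq_line(line):
    """Return a VCF line with GQ declared as Integer and its value rounded.

    Assumes tab-separated lines with one sample column (column 10).
    Lines that do not carry GQ are returned unchanged.
    """
    # Header: switch the declared GQ type from Float to Integer.
    if line.startswith("##FORMAT=<ID=GQ,"):
        return line.replace("Type=Float", "Type=Integer")
    if line.startswith("#") or not line.strip():
        return line

    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line
    keys = fields[8].split(":")
    values = fields[9].split(":")
    if "GQ" not in keys:
        return line
    idx = keys.index("GQ")
    if idx >= len(values) or values[idx] == ".":
        return line  # missing value, nothing to round

    # Round the decimal quality to the nearest whole number.
    values[idx] = str(round(float(values[idx])))
    fields[9] = ":".join(values)
    return "\t".join(fields) + newline
```

Applied line by line to the decompressed VCF, this yields a file that would need to be compressed with `bgzip` and indexed with `tabix` again before being passed to `whatshap phase`. The custom UQ field is left as a Float, since only GQ is named in the error.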
Best