Description
I am using extractHAIRS to get reads covering heterozygous SNPs. However, SNPs seem to be ignored when the two ends of a read pair overlap, and the reported base quality score is not correct. In the attached example, the first read pair is not reported even though it covers the SNP, and the second read is reported, but with the quality score < instead of D.
Commands
extractHAIRS --bam test.bam --VCF test.vcf --singlereads 1
BAM Files
7001113:798:HGCT5BCXY:1:2107:10688:87752 163 chr22 16056034 27 100M = 16056083 148 CACTCAGCCAGTTCACCCCACCCACATTCCACAGGCTGCTTTAGGCTTTAGGACAGTGGCAAACATGGCCTCTGCCATCCCGGTCTGTGAGCGCCCCTTC DDDDDIHIIIIIIIIIIIIIIIIIIIIIEHHIIIIHIIIIIIIIHHIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGHIIIH MD:Z:100 PG:Z:MarkDuplicates RG:Z:JY333 NM:i:0 AS:i:100 XS:i:100
7001113:798:HGCT5BCXY:1:2107:10688:87752 83 chr22 16056083 27 99M = 16056034 -148 AGGACAGTGGCAAACATGGCCTCTGCCATCCCGGTCTGTGAGCGCCCCTTCTTACACCAAGGTCAGTTGCTAACCAATGAGCTGCTGGGGGCCTCCTTC IIIIIIIIIIIIIIIIHIIIGIIIIHIIIHIIIIIIHIIIIGIIIIIIIGIIIHE1IIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIGIHEFD<0DD< MD:Z:99 PG:Z:MarkDuplicates RG:Z:JY333 NM:i:0 AS:i:99 XS:i:94
7001113:798:HGCT5BCXY:2:1213:5687:10232 163 chr22 16056108 27 98M = 16056333 325 CCATCCCGGTCTGTGAGCGCCCCTTCTTACACCAAGGTCAGTTGCTAACCAATGAGCTGCTGGGGGCCTCCTTCTCCCACTCCCACTGCACTGTGTCC 0<D@DEEHHCDCG<CGFHD<HHHHIIIHEHHIEH@DCHEEHHIIIIIIIE?HHIIFHIEHFH?GHDHEGHHIIHHIIIEHHHHG?HGHHHIIIEHHHH MD:Z:98 PG:Z:MarkDuplicates RG:Z:JY333 NM:i:0 AS:i:98 XS:i:93
VCF File
chr22 16056126 . G A 198.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.519;DP=23;Dels=0.00;FS=1.848;HaplotypeScore=0.6651;MLEAC=1;MLEAF=0.500;MQ=32.97;MQ0=3;MQRankSum=-0.228;QD=8.64;ReadPosRankSum=-0.076;SOR=0.605;VQSLOD=1.97;culprit=FS GT:AD:DP:GQ:PL 0/1:13,10:23:99:227,0,219
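The expected quality character can be checked by hand from the records above. The following is a minimal Python sketch, not part of extractHAIRS: the helper `base_at` and the constant names are hypothetical, and it assumes the read's CIGAR is a single full-length match (98M here), so the read offset is simply the SNP position minus POS. It uses the third BAM record and the SNP at chr22:16056126.

```python
# Third BAM record (second read pair, first mate), copied from the BAM section.
POS = 16056108  # 1-based leftmost aligned position (SAM POS field)
SEQ = ("CCATCCCGGTCTGTGAGCGCCCCTTCTTACACCAAGGTCAGTTGCTAACC"
       "AATGAGCTGCTGGGGGCCTCCTTCTCCCACTCCCACTGCACTGTGTCC")
QUAL = ("0<D@DEEHHCDCG<CGFHD<HHHHIIIHEHHIEH@DCHEEHHIIIIIIIE"
        "?HHIIFHIEHFH?GHDHEGHHIIHHIIIEHHHHG?HGHHHIIIEHHHH")
SNP_POS = 16056126  # 1-based position from the VCF record


def base_at(pos, seq, qual, target):
    """Return (base, quality char) at 1-based reference position `target`.

    Assumes the alignment is one ungapped match block (e.g. 98M), so the
    0-based read offset is target - pos. Returns None if not covered.
    """
    offset = target - pos
    if not 0 <= offset < len(seq):
        return None
    return seq[offset], qual[offset]


base, q = base_at(POS, SEQ, QUAL, SNP_POS)
print(base, q, ord(q) - 33)  # base, quality character, Phred+33 score
```

With this record the SNP falls at read offset 18, where the base is G and the quality character is D; the `<` that extractHAIRS reports is the quality character at the neighbouring offset 19.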