Skip to content
This repository has been archived by the owner on Jan 20, 2023. It is now read-only.

Error: Something is not right with your vcf file #4

Open
ksw9 opened this issue Apr 22, 2020 · 3 comments
Open

Error: Something is not right with your vcf file #4

ksw9 opened this issue Apr 22, 2020 · 3 comments

Comments

@ksw9
Copy link

ksw9 commented Apr 22, 2020

Hi,
Thanks for a great program!
I'm getting an error 'Something is not right with your vcf file' and I'm wondering if there are VCF file specifications that need to be met to use this tool? I'm not sure if, for instance, this requires haploid genotype calls?
Uploading cluster10.vcf.gz…

./run_b2cs -x ${xml} beast2 $ref  test2.vcf
2020-04-22 15:57 B2CONSTSITES INFO  Imported sequence NC_000962.3.
2020-04-22 15:57 B2CONSTSITES INFO  Sequence has length 4411532

2020-04-22 15:57 B2CONSTSITES CRITICAL Something is not right with your vcf file
Traceback (most recent call last):
  File "/home/kwalter/.local/bin/run_b2cs", line 8, in <module>
    sys.exit(run_b2constsites())
  File "/home/kwalter/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/kwalter/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/kwalter/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/kwalter/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/kwalter/.local/lib/python3.8/site-packages/b2constsites/run_b2constsites.py", line 53, in run_b2constsites
    cs.load_vcf()
  File "/home/kwalter/.local/lib/python3.8/site-packages/b2constsites/b2constsites.py", line 49, in load_vcf
    self.var_pos.update(self._parse_pos(rec))
  File "/home/kwalter/.local/lib/python3.8/site-packages/b2constsites/b2constsites.py", line 34, in _parse_pos
    if rec.INFO['TYPE'][0] in ACCEPTABLE_VAR_TYPES:
KeyError: 'TYPE'

Following a previously published pipeline, I used VarScan to call variants and have a VCF with diploid calls. I'm attaching the VCF below.

Any suggestions would be great!
Thank you and best,

@andersgs
Copy link
Owner

Hi @ksw9. I am busy with COVID19 response at the moment. I’ll have a look over the coming days. Sorry for the delay.

@ksw9
Copy link
Author

ksw9 commented Dec 9, 2020

Hi,
I hope you are doing well. I was wondering if you could help me troubleshoot the issue we encountered and if your script requires a specifically formatted VCF?

Thank you in advance!
All the best,

@GaryNapier
Copy link

Hi, I just had the same error and thought I'd share a solution.

b2constsites.py has this function, which is checking for nucleotide types (I think on the home page it says it can only deal with SNPs and MNP):

    def _parse_pos(self, rec):
        '''
        Find which positions to mask
        '''
        ACCEPTABLE_VAR_TYPES = {'snp', 'mnp'}
        try:
            if rec.INFO['TYPE'][0] in ACCEPTABLE_VAR_TYPES:
                start_pos = rec.POS
                if rec.INFO['TYPE'][0] == 'mnp':
                    end_pos = start_pos + len(rec.ALT[0])
                else:
                    end_pos = start_pos + 1
        except KeyError:
            logging.critical("Something is not right with your vcf file")
            raise
        return list(range(start_pos, end_pos))

This function is looking for the 'TYPE' field in the INFO column, which does not exist in my VCF (I'm using VCF v 4.2, if that helps), so that's why I got the same KeyError: 'TYPE'

Since I know my VCF is only SNPs because of the filtering I've done, I don't need it to check the type, so I just commented most of the function and left the start and end positions:

def _parse_pos(self, rec):
        '''
        Find which positions to mask
        '''
        ACCEPTABLE_VAR_TYPES = {'snp', 'mnp'}
        # try:
            # if rec.INFO['TYPE'][0] in ACCEPTABLE_VAR_TYPES:
                # start_pos = rec.POS
                # if rec.INFO['TYPE'][0] == 'mnp':
                    # end_pos = start_pos + len(rec.ALT[0])
                # else:
                    # end_pos = start_pos + 1
        # except KeyError:
        #     logging.critical("Something is not right with your vcf file")
        #     raise
        start_pos = rec.POS
        end_pos = start_pos + 1
        return list(range(start_pos, end_pos))

I got no errors and the XML opens fine in Beauti.

Gary

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants