Filter a gene list including intergenic regions #2319
Comments
The problem is somewhat confusing as stated: you say you want to filter for intergenic regions, but the example you gave seems unrelated. Instead, it seems the variant is in two overlapping genes (here FAM138A and OR4F5), and the problem is that matching by gene name does not work for these records.
Perhaps I picked a bad example: that variant was tagged as intergenic by annovar, but I did not look at it in a genome browser. Looking at another, clearly intergenic variant, here is the annovar VCF output. My question is: how would I filter for this variant if I am looking for variants flagged as intergenic, and specifically for variants that might affect ARHGEF16? I have a very large list of genes, and it would be difficult to correctly list every possible combination of neighbouring gene names if I want to find the intergenic variants near each one.
The question seems focused on the variant being intergenic. I am sorry but I still don't understand what is not working for you exactly. Can you provide a small test case, a VCF with full header, the gene list you are using, the command which is not working for you, and the output you expect? I see the VCF has the |
Hi,
I see how to filter a gene list for most SNVs/indels in issue #1964 (Filter a gene list).
However, I want to look at intergenic variants as well. Annovar includes other info in the Gene.refGene field like
Gene.refGene=FAM138A\x3bOR4F5
If my gene.txt file only contains FAM138A, the intergenic variants are not included.
I'm using bcftools v1.21. My command is in the format bcftools view -i '[email protected] ' file.vcf
Including wildcards in the command or in the genes.txt file didn't work.
Do you have any suggestions?
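One possible workaround outside bcftools, offered as a sketch rather than a confirmed fix: split the `Gene.refGene` value on the escaped semicolon (`\x3b`) and test each name against the gene list, so a record such as `Gene.refGene=FAM138A\x3bOR4F5` matches when only FAM138A is listed. The helper names below (`parse_info`, `genes_in_record`, `matches`) are hypothetical, and the handling of `NONE` and of comma-separated names are assumptions about the annovar output that should be checked against your own VCF.

```python
import re

# Assumption: annovar writes ';' inside INFO values as the literal
# four characters \x3b, and may separate names with ',' (or \x2c).
_SEP = re.compile(r"\\x3b|\\x2c|,")


def parse_info(info):
    """Turn a VCF INFO string into a dict (flags map to True)."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def genes_in_record(info, tag="Gene.refGene"):
    """Return the set of gene names listed under `tag`."""
    value = parse_info(info).get(tag, "")
    if value is True:  # tag present as a bare flag, no names
        return set()
    # Assumption: 'NONE' marks a missing neighbour and is not a gene.
    return {g for g in _SEP.split(value) if g and g != "NONE"}


def matches(info, wanted, func=None):
    """True if any listed gene is in `wanted`.

    If `func` is given (e.g. "intergenic"), Func.refGene must equal it.
    """
    if func is not None and parse_info(info).get("Func.refGene") != func:
        return False
    return bool(genes_in_record(info) & wanted)


info = r"Func.refGene=intergenic;Gene.refGene=FAM138A\x3bOR4F5"
print(matches(info, {"FAM138A"}, func="intergenic"))
```

To use this on a whole file, one would read genes.txt into a set, loop over the non-header lines of the VCF, and pass the eighth tab-separated column (INFO) to `matches`.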