Alternative genetic codes in bcftools csq for mitochondrial variant annotation #2368

Open
etwatson opened this issue Feb 24, 2025 · 2 comments


Hello,

I’ve been using snpEff for variant effect prediction and find it very useful. However, I have noticed instances where multi-nucleotide polymorphisms nullify effect predictions that were based on single nucleotides, so I have become very interested in using bcftools csq to perform this analysis in a haplotype-aware manner.
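To make the multi-nucleotide problem concrete, here is a minimal Python sketch, not taken from snpEff or bcftools. It uses a made-up reference codon and two hypothetical SNPs in the same codon, translated with the standard code. The names `STANDARD` and `apply_snps` are illustrative only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}


def apply_snps(codon, snps):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


# Hypothetical example: reference codon CAA (Gln) with two SNPs on the same haplotype.
ref = "CAA"
snp_a = (0, "T")  # C>T at codon position 1
snp_b = (1, "C")  # A>C at codon position 2

# Each SNP evaluated on its own, as a per-site predictor would do.
separate = [STANDARD[apply_snps(ref, [snp])] for snp in (snp_a, snp_b)]
# Both SNPs evaluated together, as a haplotype-aware predictor would do.
joint = STANDARD[apply_snps(ref, [snp_a, snp_b])]

print("reference:", STANDARD[ref])
print("separate :", separate)
print("joint    :", joint)
```

Evaluated separately, the first SNP looks like a stop gain (TAA) and the second like a Gln-to-Pro change (CCA). Taken together, the codon becomes TCA (Ser), so neither single-site prediction holds.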

I frequently analyze mitochondrial variants, which often require the use of alternative genetic codes (fungal mitochondria, which contain mobile introns, are a particularly fun case!). From what I can tell, bcftools csq does not currently support specifying different genetic codes for annotation.

Would it be possible to add an option to allow users to specify an alternative genetic code when running bcftools csq? This would be particularly helpful for those working with mitochondrial genomes, where the standard nuclear genetic code does not apply.

If there is already a workaround for this, I would appreciate any guidance. Otherwise, I’d love to hear if this could be considered as a feature request.

Thanks for your time, and I appreciate all the work that goes into maintaining bcftools!

@etwatson etwatson changed the title Alternative Genetic Codes in bcftools csq for Mitochondrial Variant Annotation Alternative genetic codes in bcftools csq for mitochondrial variant annotation Feb 24, 2025
pd3 (Member) commented Mar 5, 2025

It is not supported, but we can think about it if you can provide detailed descriptions with specific examples.

etwatson (Author) commented Mar 5, 2025

That is excellent. There are 33 well-established alternative genetic codes as described by Andrzej (Anjay) Elzanowski and Jim Ostell at the National Center for Biotechnology Information (NCBI). Anyone working on these organisms, especially on mitochondrial or chloroplast genetics, would be very appreciative.

As I mentioned, in snpEff you can set the mitochondrial codon table in the config file, and even make changes to it, since there are some lineage-specific codon tables. For example, AUA and AUU are initiation codons in human mitochondria, while AUC also acts as an initiation codon in mouse mitochondria. However, as you know, snpEff is limited to single-nucleotide effect predictions, so it would be excellent if we could use bcftools csq for predicting variant effects.

As examples, here are a few of the major differences from the standard code:

| Codon | Alternate | Standard | Taxonomy | Translation Table |
|-------|-----------|----------|----------|-------------------|
| AGA | Ter * | Arg R | Human mito | 2 |
| AGG | Ter * | Arg R | Human mito | 2 |
| AUA | Met M | Ile I | Human mito | 2 |
| UGA | Trp W | Ter * | Human mito | 2 |
| AUA | Met M | Ile I | Yeast mito | 3 |
| CUU | Thr T | Leu L | Yeast mito | 3 |
| CUC | Thr T | Leu L | Yeast mito | 3 |
| CUA | Thr T | Leu L | Yeast mito | 3 |
| CUG | Thr T | Leu L | Yeast mito | 3 |
| UGA | Trp W | Ter * | Yeast mito | 3 |
| UGA | Trp W | Ter * | Mold/Fungi/Protozoan mito | 4 |
| AGA | Ser S | Arg R | Invertebrate mito | 5 |
| AGG | Ser S | Arg R | Invertebrate mito | 5 |
| AUA | Met M | Ile I | Invertebrate mito | 5 |
| UGA | Trp W | Ter * | Invertebrate mito | 5 |
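
One way to picture the differences listed above is as per-table overrides applied on top of the standard code. The following Python sketch does that for tables 2 to 5 using only the codons from the table. It is an illustration, not bcftools code. The names `OVERRIDES`, `codon_table` and `translate` are hypothetical, and alternative initiation codons are not modelled.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}

# Differences from the standard code, copied from the table above (U written as T).
OVERRIDES = {
    1: {},
    2: {"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"},
    3: {"ATA": "M", "CTT": "T", "CTC": "T", "CTA": "T", "CTG": "T", "TGA": "W"},
    4: {"TGA": "W"},
    5: {"AGA": "S", "AGG": "S", "ATA": "M", "TGA": "W"},
}


def codon_table(table_id):
    """Return a codon -> amino acid dict for the given translation table number."""
    table = dict(STANDARD)
    table.update(OVERRIDES[table_id])
    return table


def translate(seq, table_id=1):
    """Translate an RNA or DNA coding sequence; trailing partial codons are ignored."""
    seq = seq.upper().replace("U", "T")
    table = codon_table(table_id)
    usable = len(seq) - len(seq) % 3
    return "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))


# The same three codons read under different tables.
for table_id in (1, 2, 5):
    print(table_id, translate("AUGAGAUGA", table_id))
```

With this input, AGA reads as Arg under table 1, as a stop under table 2, and as Ser under table 5, while UGA switches from a stop to Trp. This is why a consequence caller needs to know which table applies before it can label a variant as stop-gained or missense.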
