Hello!
I am using MetaEuk through BUSCO on my genome to do some gene prediction.
I expected the vast majority of the single-copy, full-length proteins it detects to start with a methionine. However, this is not the case.
I went through the predicted BUSCOs and several of them start with a different amino acid.
Is there a step in MetaEuk that checks the starting amino acid?
I am using the glires database from ODB on a yet unannotated genome, with MetaEuk version 5.34c21f2.
I installed BUSCO (and therefore MetaEuk as well) using their conda installation (BUSCO v5.3) on an Ubuntu operating system.
Best,
Timothee
Thank you for the comment. I am marking this as a future feature to develop. Right now it is not possible to require that proteins start with a methionine. There can be several reasons why some of your proteins do not start with M: (1) some proteins simply don't; (2) your contigs may be very fragmented, so you get a lot of partial proteins; (3) your organism may not be very similar to the ones in the target database, in which case homology detection is harder and some parts of the proteins (potentially the start) match poorly.
If this concerns you, I would look at a couple of things: (1) What fraction of the proteins do not start with M? Does it make sense for the taxonomic group you're investigating? How does this correlate with their E-values (do the proteins missing an M have worse E-values)? (2) Can you manually check a couple of examples? Does it look like there is an M upstream that was not detected?
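For check (1), a minimal sketch of how one might count the predictions that do not start with M, assuming the predicted proteins are available as a plain protein FASTA file. The function names `read_fasta` and `non_m_start` are made up for this example and are not part of MetaEuk or BUSCO.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def non_m_start(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (headers of proteins not starting with M, total proteins)."""
    missing, total = [], 0
    for header, seq in read_fasta(lines):
        total += 1
        if not seq.upper().startswith("M"):
            missing.append(header)
    return missing, total


# Small in-memory example; in practice, pass an open file handle instead.
example = [">p1", "MKTAYIAK", ">p2", "AKTLLV", "GGS", ">p3", "mqsv"]
missing, total = non_m_start(example)
print(f"{len(missing)}/{total} proteins do not start with M: {missing}")
```

To run it on real output, something like `with open(path) as fh: missing, total = non_m_start(fh)` should work, where `path` points to your protein FASTA. The returned headers can then be cross-referenced with the E-values reported for the same predictions.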
It's been two years, so I'm not sure if this is still relevant for you.
I have added an option to scan for an ATG before the first exon. It is not yet part of an official MetaEuk release, but the option is available from commit 528cddc onwards.
See details here.