
aaa2vec

Developed by Yousuf A. Khan, Andrew Shu, and Matthew DeButts

This is the repository for aaa2vec, a machine learning algorithm that predicts whether or not a protein belongs to the AAA protein family. The algorithm relies primarily on the word2vec algorithm developed by Google, which has been adapted here for the purpose of protein family prediction.
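word2vec learns from sentences made of words, so a protein sequence has to be split into tokens before it can be used. The snippet below is a minimal sketch of one common way to do that, using overlapping k-mers. The function name `kmer_tokens` and the choice of `k=3` are assumptions made for illustration; they are not taken from the notebook, which may tokenize sequences differently.

```python
def kmer_tokens(sequence, k=3):
    """Split a protein sequence into overlapping k-mers ("words").

    Hypothetical helper: the name and the default k are illustrative only.
    """
    sequence = sequence.strip().upper()
    # Slide a window of width k along the sequence, one residue at a time.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# A short made-up sequence, used only to show the shape of the output.
tokens = kmer_tokens("MKTAYIAK")
print(tokens)
```

Each list of tokens can then be treated as one "sentence" when training a word2vec model.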

A Python 3 notebook containing the code is provided, along with a FASTA file containing all the positive training examples, i.e. proteins in the AAA+ superfamily. A file with a link to the UniProt database FASTA file is also included (the FASTA file itself is not included due to space limitations).
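For readers who want to inspect the FASTA files outside the notebook, here is a minimal sketch of a FASTA reader that uses only the Python standard library. The function name `read_fasta` and the example records are hypothetical and are not part of this repository.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle.

    Hypothetical helper for illustration; not the notebook's own loader.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records standing in for a real FASTA file.
example = io.StringIO(">protein_1 hypothetical\nMKTAY\nIAK\n>protein_2\nGSHM\n")
records = list(read_fasta(example))
print(records)
```

To read a file on disk, pass an open file object instead of the `io.StringIO` example, e.g. `with open(path) as handle: ...`, where `path` points to the FASTA file.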

Please contact [email protected] if there are any questions.