k-means-to-BLAST-

Uses k-means clustering to identify genomic sequences within ambiguously attributed FASTA-format data. The sequences in each resulting cluster are used to build NCBI BLAST requests, which return sequence identities.
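As a rough illustration of the clustering step described above, here is a minimal standard-library sketch. It is not the code in this repository. It assumes that sequences are represented by dinucleotide frequency profiles and that centroids are seeded deterministically by farthest-point selection; the function names (`parse_fasta`, `kmer_profile`, `kmeans`, `cluster_fasta`) and the example records are hypothetical. Submitting the per-cluster FASTA text to NCBI BLAST is left out because it requires network access.

```python
import itertools

# Hypothetical toy input: two AT-rich and two GC-rich records.
EXAMPLE_FASTA = """>at1
ATATATATTAATATAT
>at2
TTATAATATATTAATA
>gc1
GCGCGGCCGCGCGGCC
>gc2
CCGGCGCGCCGGGCGC
"""

def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, keeping record order."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records

def kmer_profile(seq, k=2):
    """Normalized k-mer frequencies over ACGT (an assumed feature choice)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with N or other ambiguity codes
            counts[window] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with farthest-point seeding; returns one label per point."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels

def cluster_fasta(records, labels):
    """Group records by cluster label and return FASTA text per cluster."""
    grouped = {}
    for (header, seq), label in zip(records.items(), labels):
        grouped.setdefault(label, []).append(f">{header}\n{seq}")
    return {label: "\n".join(entries) + "\n" for label, entries in grouped.items()}

records = parse_fasta(EXAMPLE_FASTA)
labels = kmeans([kmer_profile(s) for s in records.values()], k=2)
queries = cluster_fasta(records, labels)  # one BLAST-ready FASTA string per cluster
```

On the toy input, the two AT-rich records end up in one cluster and the two GC-rich records in the other. Each value in `queries` is a FASTA string that could serve as the query for a BLAST request. Real data would likely need longer k-mers and a considered choice of `k` clusters.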

Notably, this was used to find a complete gene that a haloarchaeal virus uses to pack its capsids with DNA and to distinguish self from non-self DNA within a haloarchaeal genomic sequence.