seq2Fun provides functions to annotate protein or coding sequences with KEGG pathways.
Protein and coding sequences can be annotated with KEGG pathways through the KAAS website. seq2Fun aims to do the same thing with the BLAST+ and DIAMOND software.
BLAST+ or DIAMOND should be installed before installing the seq2Fun package.
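Before installing, it can help to confirm that the executables are visible to R. The following is a minimal sketch in base R, not part of seq2Fun: `find_tools` is a hypothetical helper, and the executable names `blastp`, `makeblastdb`, and `diamond` are assumed to be the names used on your system.

```r
# Hypothetical helper, not part of seq2Fun: report which tools are on the PATH.
# The default executable names are assumptions; adjust them for your system.
find_tools <- function(tools = c("blastp", "makeblastdb", "diamond")) {
  paths <- unname(Sys.which(tools))  # "" for any tool that is not found
  data.frame(
    tool  = tools,
    path  = paths,
    found = !is.na(paths) & nzchar(paths),
    stringsAsFactors = FALSE
  )
}

status <- find_tools()
print(status)
if (!any(status$found)) {
  message("Neither BLAST+ nor DIAMOND was found on the PATH.")
}
```

If a tool is installed but not reported as found, its directory is probably missing from the `PATH` seen by the R session.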
library(devtools)
install_github("guokai8/seq2Fun")
library(seq2Fun)
## check if blast+ or diamond tools have been installed
blast_help()
head(listspecies())
###Find the correct species name
db <- preparedb(species = "Arabidopsis thaliana", seqtype = "AA", savedb = TRUE)
## This may take several minutes
str(db, 2)
### savedb = TRUE writes the sequence file to the working directory
makeblastdb(db, dbtype = "prot")
###make blast db, set runblast = FALSE if you prefer diamond
seqs <- db$db[sample(500, 10)] ## randomly choose 10 of the first 500 sequences
ann <- seq2fun(query = seqs, db = db, type = "blastp", evalue = 1e-10, num_threads = 2)
## set bidirectional = TRUE if you prefer bidirectional blast
## set runblast = FALSE if you prefer diamond
head(ann)
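Bidirectional BLAST is commonly understood as keeping only reciprocal best hits: a query and a subject are paired when each is the other's top hit. The sketch below illustrates that idea on toy hit tables in base R. It is not seq2Fun's implementation; the data, the column names, and the `best_hit` helper are invented for illustration.

```r
# Toy hit tables (made-up data): each row is one hit with a bit score.
forward <- data.frame(
  query    = c("q1", "q1", "q2", "q3"),
  subject  = c("s1", "s2", "s2", "s3"),
  bitscore = c(200, 150, 180, 90),
  stringsAsFactors = FALSE
)
backward <- data.frame(
  query    = c("s1", "s2", "s3"),
  subject  = c("q1", "q2", "q4"),
  bitscore = c(195, 175, 80),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the highest-scoring hit for each query.
best_hit <- function(hits) {
  hits <- hits[order(hits$query, -hits$bitscore), ]
  hits[!duplicated(hits$query), c("query", "subject")]
}

fwd <- best_hit(forward)
bwd <- best_hit(backward)

# A pair is reciprocal when the best backward hit of the subject is the query.
rbh <- merge(fwd, bwd,
             by.x = c("query", "subject"),
             by.y = c("subject", "query"))
print(rbh)
```

In this toy data, `q3` is dropped because the best backward hit of `s3` is `q4`, so only the `q1`/`s1` and `q2`/`s2` pairs remain.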
seq2Fun downloads and uses KEGG data. Non-academic use may require a KEGG license agreement (details at http://www.kegg.jp/kegg/legal.html).
For any questions please contact [email protected]