Benjamin Buchfink edited this page Dec 16, 2020 · 29 revisions

Synopsis

# downloading the tool
wget http://github.com/bbuchfink/diamond/releases/download/v2.0.5/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz
# creating a diamond-formatted database file
./diamond makedb --in reference.fasta -d reference
# running a search in blastp mode
./diamond blastp -d reference -q queries.fasta -o matches.tsv
# running a search in blastx mode
./diamond blastx -d reference -q reads.fasta -o matches.tsv

Some important points to consider:

  • Repeat masking is applied to the query and reference sequences by default. To disable it, use --masking 0.
  • DIAMOND is optimized for large input files of more than 1 million proteins. It also works on smaller files, but the algorithm will not reach its full efficiency there.
  • The program may use a considerable amount of memory and temporary disk space. If it fails because it runs out of either one, set a lower value for the block size parameter -b.
  • You can adjust the sensitivity using the options --mid-sensitive, --sensitive, --more-sensitive, --very-sensitive and --ultra-sensitive.
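
The options above can be combined in a single call. The following is a minimal sketch rather than a recommended configuration: it reuses the file names from the synopsis, and the block size of 1.0 is an illustrative value assumed to be lower than the default. The helper diamond_cmd is hypothetical and only prints the command so that it can be inspected before running it.

```shell
#!/bin/sh
# Sketch: a blastp search with masking disabled, a reduced block size
# and a higher sensitivity mode. File names are taken from the synopsis.
DB=reference
QUERY=queries.fasta
OUT=matches.tsv

# Hypothetical helper: prints the command line instead of executing it.
# -b 1.0 is an illustrative value (assumed to be below the default).
diamond_cmd() {
    echo "./diamond blastp -d $DB -q $QUERY -o $OUT --masking 0 -b 1.0 --sensitive"
}

# Show the command that would be run.
diamond_cmd
```

To run the search, paste the printed command into the shell. If it still runs out of memory or temporary disk space, try an even smaller value for -b.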