Advanced settings

Nadja Brait edited this page Jul 18, 2024 · 1 revision

All parameters can be adjusted in the configuration file config.yaml, so workflows can be customized without modifying the underlying code.

config.yaml:

db_dir: "databases" # relative to workflow base or absolute
search:
  # NOTE: with custom databases make sure to set: taxonlist: ""  (empty string)
  db: "rvdb80.dmnd"
  taxonlist: "--taxonlist 2732396,2731342" # screen only for Orthornavirae and Monodnaviria
  # taxonlist: "" <<< use with custom dbs that haven't been built with a diamond taxonomy !
  evalue: 1e-4
  min_length_aa: 83
  # expert
  other_args: "-c1 -b6 -F15 --sensitive --max-hsps 100 --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
  # devel - sensitivity adjustments for <40% identity - see diamond options - negatively affects speed
  # other_args: "-c1 -b6 -F15 --ultra-sensitive --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
  chop_window: 60000
  chop_step: 50000

retrosearch:
  db: "uniref50.dmnd"
  evalue: 1e-4
  # expert
  other_args: "-c1 -b6 -F15 --sensitive --max-hsps 100 --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
  # devel - only retro against Anopheles
  # other_args: "--taxonlist 7164 -c1 -b6 -F15 --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
  # devel - sensitivity adjustments for <40% identity - see diamond options - negatively affects speed
  # other_args: "-c1 -b6 -F15 --ultra-sensitive --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"

mask:
  # set to "" to disable masking
  # rest same as `search` above
  db: "mask.dmnd"

validate:
  # Options
  #   -b --min-bitscore-frac=<0:1>  Minimum bitscore relative to top hit per locus
  #                                 to include hit in validation [default: 0.5]
  #   -E --eve-score-high=<0:100>   Minimum eve-score for high-confidence validatEVEs
  #                                 [default: 30]
  #   -e --eve-score-low=<0:100>    Minimum eve-score for low-confidence validatEVEs
  #                                 [default: 10]
  #   -r --retro-score-low=<0:100>  Minimum retro-score for low-confidence validatEVEs
  #                                 even if with high eve-score [default: 10]
  #   -m --maybe-score-frac=<0:1>   Relative weight of maybe-viral hints in eve-score
  #                                 computation [default: 0.2]
  # Example for stricter search, 75% of top-bitscore, minimum eve-score of 50
  # args: "-b 0.75 -E 50"
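
As a sketch of the note in the search section, a run against a custom database might use a fragment like the one below. The file name my_custom_db.dmnd is hypothetical, and the remaining search keys are assumed to stay at the values shown in config.yaml.

```yaml
search:
  # hypothetical custom database, located in db_dir
  db: "my_custom_db.dmnd"
  # empty string: the database was not built with a diamond taxonomy,
  # so no --taxonlist filter can be applied
  taxonlist: ""
```

Leaving the default --taxonlist value in place with such a database is exactly the situation the note in config.yaml warns about.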

Sensitivity settings

detectEVE employs DIAMOND, which offers several sensitivity settings to accommodate different applications. The default mode for detectEVE is --sensitive, which is tailored to hits of >40% identity; it can be changed to --very-sensitive or --ultra-sensitive for higher sensitivity. Higher sensitivity settings will, however, increase memory usage and running time. The mode is set through other_args in the search and retrosearch sections of config.yaml.
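
A minimal sketch of switching the search step to a more sensitive mode is shown below. It assumes that only the sensitivity flag in other_args needs to change and keeps all other flags from the default line; the commented "devel" line shipped in config.yaml additionally drops --max-hsps 100, so treat the exact flag set here as an assumption rather than a recommended configuration.

```yaml
search:
  # --sensitive replaced by --very-sensitive; all other flags copied from the default
  other_args: "-c1 -b6 -F15 --very-sensitive --max-hsps 100 --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
```

The same substitution could be applied to other_args under retrosearch if the retro-search should run at the higher sensitivity as well.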

Memory Usage

DIAMOND's --block-size/-b is the main parameter for controlling the program’s memory and disk space usage. detectEVE's default is -b6, which improves performance but increases the use of memory and temporary disk space. The program can be expected to use roughly 20 times this number in memory (in GB). The parameter can be decreased to reduce memory use, but this will also decrease performance. For the --very-sensitive and --ultra-sensitive modes, DIAMOND recommends -b0.4 as a default.
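
To illustrate the trade-off, the fragment below lowers the block size for a machine with less memory. The figures in the comments simply apply the 20x rule of thumb stated above; actual usage will depend on the data and the machine, and the choice of -b2 is only an example value.

```yaml
search:
  # default -b6: roughly 6 x 20 = 120 GB by the rule of thumb
  # here    -b2: roughly 2 x 20 =  40 GB, at the cost of longer running time
  other_args: "-c1 -b2 -F15 --sensitive --max-hsps 100 --range-culling -k 20 --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle"
```

When combining a lower block size with --very-sensitive or --ultra-sensitive, the -b0.4 value mentioned above would take the place of -b2 in the same string.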