f516fcb · Dec 3, 2021


How to use

  1. Place the training data in data/Challenge and the test data in data/Evaluation.
  2. Run src/IO.py to generate the csv files used in the following steps.
  3. Run src/model-size.py to get the model sizes for a range of parameters.
  4. Run src/table-maker.py to get the estimated performance of the models for a range of parameters.
  5. Select the best parameter pair (d_cn, k_var) using the tables from steps 3 and 4.
  6. Run src/method.py with the selected parameters to get the final model.
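
The steps above can be sketched as a small driver script. This is only an illustration: the helper names (`exploration_commands`, `final_command`, `run_all`) are hypothetical and not part of the repository, and the flag values are copied from the sample usages in the Scripts section rather than being recommended settings. Step 5 is a manual decision, so the sketch keeps the final training command separate from the exploratory ones.

```python
import shlex
import subprocess

# Grid flags copied from the sample usages of model-size.py and table-maker.py.
GRID_FLAGS = [
    "--cn-start", "0", "--cn-step", "0.01", "--cn-stop", "0.4",
    "--snpeff-start", "0", "--snpeff-step", "10", "--snpeff-stop", "600",
]


def exploration_commands(python="python3"):
    """Commands for steps 2 to 4: csv generation, model sizes, performance tables."""
    return [
        [python, "src/IO.py"],
        [python, "src/model-size.py", *GRID_FLAGS,
         "--path", "out/modelTMP/num_cands.csv"],
        [python, "src/table-maker.py", *GRID_FLAGS,
         "--repeat", "10", "--path", "out/table/"],
    ]


def final_command(cn, snpeff, python="python3"):
    """Command for step 6, once a parameter pair has been chosen in step 5."""
    return [python, "src/method.py", "--cn", str(cn), "--snpeff", str(snpeff),
            "--repeat", "10", "--path", "out/model/"]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing script.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(exploration_commands())        # steps 2 to 4 (dry run)
    run_all([final_command(0.1, 260)])     # step 6 with the sample values
```

With `dry_run=True` the script only prints the commands, which makes it easy to check them against the sample usages before running anything.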

Scripts

IO.py

Reads the training data in data/Challenge and the test data in data/Evaluation, and generates the csv files for SNPeff. More precisely, this script outputs three csv files:

  1. out/SNPeff/SNPeff_train.csv
  2. out/SNPeff/SNPeff_test.csv
  3. out/SNPeff/variant_gene_list.csv

Sample usage

python3 src/IO.py

method.py

Given the training data in data/Challenge, the test data in data/Evaluation, the csv files in out/SNPeff, and a parameter pair (d_cn, k_var) from argv, this script outputs the shallow network models trained by our method.

Sample usage

python3 src/method.py --cn 0.1 --snpeff 260 --repeat 10 --path out/model/

table-maker.py

Given the training data in data/Challenge, the csv files in out/SNPeff, and n and a range of (d_cn, k_var) from argv, this script outputs tables that describe the performance of our method under n-fold cross validation for each (d_cn, k_var). More precisely, this script outputs four csv files:

  1. [path]/exact_aucs.csv
  2. [path]/exact_accs.csv
  3. [path]/approx_aucs.csv
  4. [path]/approx_accs.csv

Sample usage

python3 src/table-maker.py --cn-start 0 --cn-step 0.01 --cn-stop 0.4 --snpeff-start 0 --snpeff-step 10 --snpeff-stop 600 --repeat 10 --path out/table/
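
The start/step/stop flags above describe a grid of (d_cn, k_var) pairs. The following is a minimal sketch of how such a grid could be enumerated, assuming that the --cn-* flags control d_cn, the --snpeff-* flags control k_var, and that the stop values are inclusive; none of this is stated above, and the scripts may treat the bounds differently. The function names `axis` and `parameter_grid` are hypothetical.

```python
from itertools import product


def axis(start, step, stop, inclusive=True):
    """Evenly spaced values from start to stop.

    Values are computed from an integer index rather than by repeated
    addition, so floating-point error does not accumulate along the axis.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    n_steps = int(round((stop - start) / step))
    count = n_steps + 1 if inclusive else n_steps
    return [round(start + i * step, 10) for i in range(count)]


def parameter_grid(cn_range, snpeff_range, inclusive=True):
    """All (d_cn, k_var) pairs for the given (start, step, stop) triples."""
    cn_values = axis(*cn_range, inclusive)
    snpeff_values = axis(*snpeff_range, inclusive)
    return list(product(cn_values, snpeff_values))


# Ranges taken from the sample usage above.
grid = parameter_grid((0, 0.01, 0.4), (0, 10, 600))
```

Under the inclusive assumption, the sample ranges give 41 values of d_cn and 61 values of k_var, so each output table would cover 2501 parameter pairs.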

model-size.py

Given the training data in data/Challenge, the csv files in out/SNPeff, and a range of (d_cn, k_var) from argv, this script outputs a table of the model sizes for each (d_cn, k_var).

Sample usage

python3 src/model-size.py --cn-start 0 --cn-step 0.01 --cn-stop 0.4 --snpeff-start 0 --snpeff-step 10 --snpeff-stop 600 --path out/modelTMP/num_cands.csv
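
Step 5 of "How to use" combines the model-size table with the performance tables, but the selection rule is not spelled out here. One plausible rule, shown purely as an assumption, is to pick the pair with the highest cross-validated AUC among the pairs whose model size stays within a budget. The sketch assumes both tables have already been loaded into dictionaries keyed by (d_cn, k_var); the csv layout is not described in this document, so no parsing is shown. The function name `select_best`, the `max_size` budget, and the toy numbers are all hypothetical and are not results from the method.

```python
def select_best(auc_table, size_table, max_size):
    """Return the (d_cn, k_var) pair with the highest AUC within a size budget.

    auc_table and size_table map (d_cn, k_var) -> value. Pairs missing from
    either table are ignored. Ties on AUC go to the smaller model.
    """
    candidates = [
        pair for pair in auc_table
        if pair in size_table and size_table[pair] <= max_size
    ]
    if not candidates:
        raise ValueError("no parameter pair satisfies the size budget")
    return max(candidates, key=lambda pair: (auc_table[pair], -size_table[pair]))


# Toy values for illustration only.
toy_auc = {(0.1, 260): 0.80, (0.2, 300): 0.85, (0.3, 500): 0.90}
toy_size = {(0.1, 260): 120, (0.2, 300): 200, (0.3, 500): 900}

best = select_best(toy_auc, toy_size, max_size=500)
```

Here the pair (0.3, 500) has the highest toy AUC but exceeds the budget, so the rule falls back to (0.2, 300). The selected values would then be passed to src/method.py.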