This repository contains the code for David Poole, Ali Mohammad Mehr, and Wan Shing Martin Wang, "Conditioning on “and nothing else”: Simple Models of Missing Data between Naive Bayes and Logistic Regression," ICML 2020 Workshop Artemiss submission.
pip install -r requirements.txt
will install Python dependencies for the code. The main dependencies include sklearn and pymc3.
Running ./run_tests.sh 1 200 10
will run the tests and produce the output graphs. The script accepts three arguments:
- start_id: The id of the first test. Suggested value: 1
- end_id: The id of the last test. Suggested value: 200
- The number of tests to run concurrently. Suggested value: 10 (if your system has at least 10 CPU threads)
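To illustrate how the three arguments relate, here is a minimal Python sketch of running an inclusive range of test ids with a bounded number of concurrent workers. It is not the contents of run_tests.sh: the function names run_single_test and run_tests are hypothetical, and the placeholder test does no real work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_single_test(test_id: int) -> int:
    """Hypothetical stand-in for one test; the real work lives in the repository's scripts."""
    return test_id


def run_tests(start_id: int, end_id: int, num_concurrent: int) -> List[int]:
    """Run tests start_id..end_id (both inclusive), at most num_concurrent at a time."""
    test_ids = range(start_id, end_id + 1)
    # Assumption: a thread pool is used here only for illustration; a real run
    # would more likely launch a separate process per test.
    with ThreadPoolExecutor(max_workers=num_concurrent) as pool:
        return list(pool.map(run_single_test, test_ids))


if __name__ == "__main__":
    finished = run_tests(1, 200, 10)
    print(len(finished))
```

With the suggested values, the id range 1 to 200 covers 200 tests, and no more than 10 of them are in flight at any moment.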
At the end, the script runs python Read-test-results-and-plot-graphs.py
to produce the graphs shown in the paper and to print the average logloss comparisons.
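For readers unfamiliar with the metric, the following is a minimal sketch of an average logloss for binary labels. It assumes labels in {0, 1} and predicted probabilities for class 1; the function name average_log_loss and the clipping constant eps are illustrative choices, and the repository's plotting script may compute the metric differently.

```python
import math
from typing import Sequence


def average_log_loss(y_true: Sequence[int], p_pred: Sequence[float], eps: float = 1e-15) -> float:
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities away from 0 and 1 so the logarithm stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 0, 1, 1]
    print(average_log_loss(labels, [0.9, 0.2, 0.8, 0.6]))  # more confident, correct predictions
    print(average_log_loss(labels, [0.5, 0.5, 0.5, 0.5]))  # uninformative predictions
```

Lower values indicate better-calibrated predictions; an uninformative predictor that always outputs 0.5 scores log(2).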
Using the suggested argument values for ./run_tests.sh
will run 200 tests, up to 10 at a time on separate CPU threads. Note that the full run of 200 tests takes more than 6 hours in this configuration.