Tests #43
base: master
Conversation
Any notebook that requires datasets to run should place the code for downloading/extracting that data into download_data.sh. This allows systematic downloading and caching of datasets for testing.
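As a minimal sketch of what an entry in download_data.sh could look like (using the ISC dataset from Zenodo as the example; the script's exact contents may differ):

    #!/usr/bin/env bash
    # Fetch and unpack the example dataset into data/, skipping work already done.
    mkdir -p data
    wget --no-clobber https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz
    tar --skip-old-files -xzf brainiak-aperture-isc-data.tgz -C data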
The path construction for the dataset was prepending the user's $HOME directory. I removed this to make the notebook more portable and consistent with the other notebooks.
This makes it easier to track whether the dataset has already been downloaded, so we can cache things on della.
The htfa notebook expects the dataset to be in /data, which is where it resides in the Docker image. I extracted the download commands for the data files into download_data.sh and modified them to extract the data into a local data/ folder. I then modified the notebook to look for the data in both of these locations and raise an exception otherwise.
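A sketch of the lookup logic in the notebook (variable names are illustrative):

    import os

    # Check the Docker image location first, then the local folder created by
    # download_data.sh; fail loudly if the dataset is missing.
    candidate_dirs = ['/data', 'data']
    data_dir = next((d for d in candidate_dirs if os.path.isdir(d)), None)
    if data_dir is None:
        raise FileNotFoundError(
            'Dataset not found in /data or data/; run download_data.sh first.')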
The rt-cloud repo is a dependency of the notebook. It is not pip-installable, so I added it as a git submodule. I have also extracted a dependency list from rt-cloud/environment.yml to include with the other notebook dependencies.
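For reference, adding the submodule would look something like this (repository URL and submodule path are assumed):

    git submodule add https://github.com/brainiak/rt-cloud.git rt-cloud
    git submodule update --init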
…into tests (Conflicts: notebooks/real-time/rtcloud_notebook.ipynb)
One final issue to fix: the last cell in the notebook executes a Python script on the command line. Even when this script fails, the cell in the notebook does not fail.
Importing the main function and running it directly ensures that tests fail if there are errors in sample.py. If a cell's command-line execution fails, testbook does not consider that a cell failure (is this a bug?).
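A sketch of the workaround and the corresponding test (notebook path assumed):

    # In the notebook, instead of `!python sample.py` (whose non-zero exit
    # status testbook does not treat as a cell failure), import and call
    # main() directly so any exception propagates and fails the cell:
    from sample import main
    main()

and in the test module:

    from testbook import testbook

    # execute=True runs every cell; an exception from main() now fails the test.
    @testbook('notebooks/real-time/rtcloud_notebook.ipynb', execute=True)
    def test_rtcloud_notebook(tb):
        pass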
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter notebooks. Powered by ReviewNB.
Great work, @davidt0x! I have a couple of suggestions.
notebooks/isc/ISC.ipynb (Outdated)
@@ -92,8 +92,7 @@
  "source": [
  "# Download and extract example data from Zenodo\n",
  "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
- "!tar -xzf brainiak-aperture-isc-data.tgz\n",
- "!rm brainiak-aperture-isc-data.tgz"
+ "!tar -xzf brainiak-aperture-isc-data.tgz\n"
How about adding --skip-old-files to minimize I/O? Same goes for any other extraction code in notebooks.
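For example, the extraction cell could become (a sketch):

    !tar --skip-old-files -xzf brainiak-aperture-isc-data.tgz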
notebooks/isc/ISC.ipynb (Outdated)
@@ -92,8 +92,7 @@
  "source": [
  "# Download and extract example data from Zenodo\n",
  "!wget https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz\n",
Should we have --no-clobber so the file is not downloaded again? Same goes for any other download code in notebooks.
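For example (a sketch):

    !wget --no-clobber https://zenodo.org/record/4300904/files/brainiak-aperture-isc-data.tgz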
requirements.txt (Outdated)
@@ -0,0 +1,32 @@
+ testbook
I think it is easier for notebook authors in the future if each notebook has its own requirements file.
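For example, a layout along these lines (paths illustrative):

    notebooks/isc/requirements.txt
    notebooks/htfa/requirements.txt
    notebooks/real-time/requirements.txt

This keeps each notebook's dependencies self-contained, at the cost of some duplication across files.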
@@ -111,7 +111,7 @@
  "source": [
  "*1.2 Load participant data*<a id=\"load_ppt\"></a>\n",
  "\n",
- "Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed in the home directory with the name Corr_MVPA "
+ "Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed same directory as this notebook with the name Corr_MVPA "
@CameronTEllis FYI: please note a minor update to the data directory to make it compatible with automated testing.
Ok, I think I have addressed your comments, @mihaic. Can you take a look?
"Clear any pre-existing plot for this run using 'clearRunPlot(runNum)'\n", | ||
"###################################################################################\n", | ||
"/tmp/notebook-simdata/labels.npy\n", | ||
"Collected training data for TR 0\n", |
Hi David, can you clear the outputs created from running the notebook before checking in? I think it's in the Jupyter menu: Cell -> All Output -> Clear.
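Equivalently from the command line, nbconvert can strip outputs (notebook path assumed):

    jupyter nbconvert --clear-output --inplace notebooks/real-time/rtcloud_notebook.ipynb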
Done
Set up pytest tests using testbook for all notebooks.