This repository contains resources for the Open Science workshop on teaching Jupyter Notebooks.
Throughout this workshop, we will be using Jupyter Notebooks. Although the vlabs you will be using already have Jupyter set up, these notes are provided in case you want to set it up on your own computer.
The Jupyter Notebook is an interactive computing environment that enables users to author notebooks, which contain a complete and self-contained record of a computation. Because code, results and commentary live in one document, notebooks are easy to share. A notebook may contain:
- Live code
- Interactive widgets
- Plots
- Narrative text
- Equations
- Images
- Video
Note that "Jupyter" is a loose acronym for Julia, Python, and R, the primary languages it supports. Other languages are supported as well.
Notebooks allow a computational researcher to create reproducible documentation of their research. As bioinformatics is data-centric, using Jupyter Notebooks increases research transparency and so promotes open science.
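To make the idea of a "self-contained record" more concrete, here is a minimal sketch of what a notebook file looks like on disk: an `.ipynb` file is a JSON document holding a list of cells. The helper `make_notebook` and the cell contents are made up for illustration, and the sketch assumes notebook format version 4.4; in practice Jupyter writes these files for you.

```python
import json


def make_notebook(cells):
    """Build a minimal nbformat-4 notebook dict from (cell_type, source) pairs.

    Hypothetical helper for illustration only; Jupyter normally creates
    this structure when you save a notebook.
    """
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells additionally store their outputs and run order.
            cell["execution_count"] = None
            cell["outputs"] = []
        nb_cells.append(cell)
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# One narrative cell and one code cell (contents are invented examples).
notebook = make_notebook([
    ("markdown", "# GC count\nNarrative text goes here."),
    ("code", "seq = 'ATGC'\nprint(seq.count('G') + seq.count('C'))"),
])

# This JSON text is what would be saved in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

Because the narrative and the code sit side by side in one file, sharing the file shares the whole analysis.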
- Clone this repository to your working directory: `git clone https://github.com/BioinfoNet/TeachingJupyterNotebooks.git`
- Download the Python 3 version of Anaconda for your operating system. Use this link
- Follow the install instructions for your operating system. Here are the instructions.
- Afterwards, navigate to the cloned folder using `cd` and `ls`. Then run this command in your terminal: `conda env create -f environment.yml`. This will create an environment called jupyter-notebook-tutorial. You can activate it like this: `source activate jupyter-notebook-tutorial` (with conda 4.4 or later, `conda activate jupyter-notebook-tutorial` also works).
- In your terminal, from the directory where you cloned this repository, run this command: `jupyter notebook jupyter-notebook-slides.ipynb`
Alternatively, you can install the packages needed for the demonstrated tools yourself.
- Open your terminal. Type these commands to get Nbextensions: `pip install jupyter_contrib_nbextensions` followed by `jupyter-contrib nbextension install --sys-prefix`
- Now get the interactive dataframe tool called qgrid: `pip install qgrid==0.3.3`
- You can also search GitHub to find out what each package you've installed does. For instance, here are the repositories for jupyter-themes (https://github.com/dunovank/jupyter-themes), Nbextensions (https://github.com/ipython-contrib/jupyter_contrib_nbextensions) and qgrid (https://github.com/quantopian/qgrid). These were also covered during the presentation.
- Then get Bioconda. This provides tools commonly used in bioinformatics, e.g. samtools, bowtie and bwa. Follow the steps in this link
- To get a Python library built specifically for bioinformatics, install scikit-bio: `pip install scikit-bio`
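After working through the install steps above, it can be handy to confirm from Python that the packages are actually visible in the active environment. The following is a small sketch, not part of the workshop material: `check_packages` is a made-up helper name, and the module names listed are the usual import names for these packages (`skbio` for scikit-bio).

```python
from importlib.util import find_spec


def check_packages(module_names):
    """Return {module_name: True/False} depending on whether each
    top-level module can be found in the current environment."""
    return {name: find_spec(name) is not None for name in module_names}


# Import names assumed for the packages installed in the steps above.
status = check_packages(
    ["notebook", "jupyter_contrib_nbextensions", "qgrid", "skbio"]
)
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

If something shows as missing, check that the jupyter-notebook-tutorial environment is activated before rerunning the install command.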
The repository contains a number of files that support the Jupyter notebook. They include:
- `README.md`: Markdown text explaining how to make use of these resources.
- `environment.yml`: Instructions for recreating the creator's environment on your own system.
- `jupyter-notebook-slides.ipynb`: Contains the presentation that shows the reader how to use notebooks, mostly with bioinformatics examples. If you're having problems starting up this notebook, try opening `jupyter-notebook-slides2.ipynb`.
- `files`: Has a variety of files, including notebooks and FASTA and FASTQ files, among others.
- `storeddf.ipynb`: Contains dataframes with counts of specific bases in the 16S rRNA gene of several microbes.
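To give a feel for the kind of per-base counts described for `storeddf.ipynb`, here is a minimal standard-library sketch. It is not the code used in that notebook: the helpers `parse_fasta` and `base_counts` are hypothetical, and the sequences are short invented fragments rather than real 16S rRNA genes.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into {record_name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def base_counts(sequence, bases="ACGT"):
    """Count how often each base occurs in a sequence."""
    counts = Counter(sequence)
    return {base: counts.get(base, 0) for base in bases}


# Invented example data, not taken from the repository.
example = """>microbe_a made-up fragment
ACGTAC
GGT
>microbe_b made-up fragment
TTGACA
"""

table = {name: base_counts(seq) for name, seq in parse_fasta(example).items()}
print(table)
```

A nested dictionary like `table` can be passed directly to `pandas.DataFrame` if you want the counts as a dataframe.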
This was a quick introduction. To learn more about Jupyter notebooks and what you can do with them, check the following resources: