
JupyterHub Binder deployment strategies on AWS

Bhavya Kandimalla edited this page Oct 4, 2019 · 32 revisions

Notes for deploying JupyterHub/Binder on AWS



Deploying JupyterHub

1. select an Amazon Machine Image (AMI) and instance type

open ports 80 (HTTP), 443 (HTTPS), and 22 (SSH) in the instance's security group
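
If the security group is managed from the command line rather than the console, the three ports could be opened with the AWS CLI. The sketch below is an illustration only: the security group ID is a made-up placeholder, the `0.0.0.0/0` source range is an assumption (SSH is usually restricted to a known address range), and by default the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Hypothetical security group ID; replace with the group attached to your instance.
SG_ID="${SG_ID:-sg-0123456789abcdef0}"
# Assumed source range; narrow this (especially for port 22) where possible.
CIDR="${CIDR:-0.0.0.0/0}"

# Print (default) or run (RUN=1) one ingress rule per port given as argument.
open_ports() {
    local port
    local -a rule
    for port in "$@"; do
        rule=(aws ec2 authorize-security-group-ingress
              --group-id "$SG_ID" --protocol tcp --port "$port" --cidr "$CIDR")
        if [ "${RUN:-0}" = "1" ]; then
            "${rule[@]}"
        else
            echo "${rule[*]}"
        fi
    done
}

open_ports 80 443 22
```

Running it with `RUN=1 SG_ID=<your group>` would apply the rules; without `RUN=1` it is a dry run that lets you review the commands first.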

2. extend AWS instance volume size

reference: https://hackernoon.com/tutorial-how-to-extend-aws-ebs-volumes-with-no-downtime-ec7d9e82426e

a. login to AWS console
b. choose "EC2" from services list
c. click on "Volumes" under ELASTIC BLOCK STORE menu (on the left)
d. choose the volume to resize, right click on "Modify Volume"
e. set the new size for volume

    # extend from 8GB to 50GB
    # need at least ~15-20GB

f. click on modify
g. make sure partition is extended

    lsblk       # list block devices and partition sizes
    # -OR-
    df -h       # show filesystem sizes and usage
    # if the partition is not extended, see the reference above
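
The check in step g can be scripted. The following is a minimal sketch that compares the root filesystem size against the ~15GB lower bound mentioned above. The suggested `growpart`/`resize2fs` commands are only printed, and the device names `/dev/xvda` and `/dev/xvda1` as well as an ext4 filesystem are assumptions; check `lsblk` for the actual names on your instance.

```shell
#!/usr/bin/env bash
# Size in whole GiB of the filesystem that holds the given path (POSIX df output).
fs_size_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($2 / 1024 / 1024) }'
}

MIN_GB=15   # lower bound suggested in the notes above
size_gb="$(fs_size_gb /)"

if [ "$size_gb" -lt "$MIN_GB" ]; then
    echo "root filesystem is ${size_gb}G, below ${MIN_GB}G"
    # Assumed device names and an ext4 filesystem; adjust to what lsblk reports.
    echo "possible fix: sudo growpart /dev/xvda 1 && sudo resize2fs /dev/xvda1"
else
    echo "root filesystem is ${size_gb}G"
fi
```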

3. set up Let's Encrypt & nginx

reference: https://github.com/dandi/infrastructure/wiki/Girder-setup-on-aws

install pre-requisites:

apt-get update && apt-get upgrade -y                                                         # update package list and upgrade installed packages
apt-get install -y git python3.7 python3-setuptools python3-pip nginx vim fail2ban

set up nginx:

vim /etc/nginx/sites-enabled/hub.dandiarchive.org

edit nginx site file:

reference: https://jupyterhub.readthedocs.io/en/stable/reference/config-proxy.html

# top-level http config for websocket headers
# If Upgrade is defined, Connection = upgrade
# If Upgrade is empty, Connection = close
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    server_name    hub.dandiarchive.org;
    location / {
        # proxy_pass http://localhost:8080/;
        proxy_pass http://localhost:8000/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # websocket headers
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
    }

    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/hub.dandiarchive.org/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/hub.dandiarchive.org/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
            
    ssl_session_cache shared:SSL:50m;
    ssl_stapling on;
    ssl_stapling_verify on;
    add_header Strict-Transport-Security max-age=15768000;
}

server {
    if ($host = hub.dandiarchive.org) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

    listen 80;
    server_name    hub.dandiarchive.org;
    return 404; # managed by Certbot
}

restart nginx:

nginx -t                    # test nginx configuration
service nginx restart       # restart nginx
service nginx status        # check nginx status

set up Let's Encrypt:

apt-get install -y software-properties-common
add-apt-repository universe
add-apt-repository -y ppa:certbot/certbot
apt-get update
apt-get install -y certbot python-certbot-nginx
certbot --nginx
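
Once certbot has rewritten the site file, the proxy and the redirect can be checked from the command line. This is a rough sketch, assuming `curl` is installed; the `probe` helper is a hypothetical name introduced here, and the usage lines are left as comments because they need network access to the host.

```shell
#!/usr/bin/env bash
# Print "<status code> <redirect target>" for a URL without downloading the body.
probe() {
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Usage, from a machine that can reach the host:
#   probe http://hub.dandiarchive.org/     # the port-80 server block should answer 301 with an https:// target
#   probe https://hub.dandiarchive.org/    # a 2xx or 3xx here suggests nginx is reaching the service on port 8000
#   sudo certbot renew --dry-run           # rehearse certificate renewal without replacing anything
```

A 502 from the second probe usually means nginx is running but nothing is listening on port 8000 yet, which is expected until JupyterHub is started in the later steps.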

3. install docker

reference: https://phoenixnap.com/kb/install-kubernetes-on-ubuntu

apt-get update && apt-get upgrade -y    # update package list
apt-get install docker.io
docker -v                               # check docker version
sudo groupadd docker                    # add docker group (may already exist after installing docker.io)
sudo gpasswd -a $USER docker            # add $USER to the docker group (takes effect at next login)
systemctl enable docker                 # set docker to launch at boot
systemctl status docker                 # check docker is running
systemctl start docker                  # start docker if it is not running
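
Group membership added with `gpasswd` only applies to new login sessions, so `docker` may still require `sudo` right after the steps above. A small sketch for checking whether the current session already has the group (the `in_group` helper is a hypothetical name introduced here):

```shell
#!/usr/bin/env bash
# Succeed if the current session's group list contains the named group.
in_group() {
    id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

if in_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```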

4. create jupyterhub docker image

reference: https://medium.com/@bluedme/connecting-jupyterhub-to-auth0-e92f0bb6efb0

docker pull jupyterhub/jupyterhub                                               # download jupyterhub container
docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub   # launch jupyterhub server
docker exec -it jupyterhub bash                                                 # open a bash shell inside the running container
useradd --create-home <USER>                                                    # create a user to log in to the JupyterHub server
passwd <USER>                                                                   # set that user's password

username: dandi
password: JupyterDemo1357

conda install notebook                                                         # install jupyter notebook
conda install jupyterlab                                                       # install jupyter lab
apt-get update && apt-get upgrade -y                                           # update package list 
apt-get install python3-pip                                                    # install pip
exit                                                                           # exit container

docker restart jupyterhub                                                      # restart jupyterhub server

docker ps                                                                      # list running containers (to find the container ID)
docker commit <container ID> <jupyterhub/jupyterhub:1>                         # save the container's changes as a new image
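
The interactive `docker exec` plus `docker commit` sequence above could also be captured as a Dockerfile so the image can be rebuilt later. The sketch below is an alternative under stated assumptions, not the method used on this page: it presumes `conda` is available in the base image (as the commands above do), uses the example user name `dandi`, and leaves the password to be set separately. It only writes the Dockerfile to a temporary directory and prints the build command.

```shell
#!/usr/bin/env bash
# Write a Dockerfile that repeats the in-container steps from this section.
build_dir="$(mktemp -d)"

cat > "${build_dir}/Dockerfile" <<'EOF'
FROM jupyterhub/jupyterhub
# Same packages as the interactive steps (assumes conda exists in the base image)
RUN conda install -y notebook jupyterlab
RUN apt-get update && apt-get install -y python3-pip
# Example login user; its password still has to be set with passwd
RUN useradd --create-home dandi
CMD ["jupyterhub"]
EOF

echo "Dockerfile written to ${build_dir}"
echo "build with: docker build -t jupyterhub/jupyterhub:1 ${build_dir}"
```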

5. install and start minikube

install kubectl:

reference: https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-on-linux

    curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl   # download latest release
    chmod +x ./kubectl                                                                                                                                                          # make the kubectl binary executable
    sudo mv ./kubectl /usr/local/bin/kubectl                                                                                                                                    # move the binary in to your PATH
    kubectl version                                                                                                                                                             # check kubectl is installed and version is up-to-date

install minikube:

reference: https://kubernetes.io/docs/tasks/tools/install-minikube/

    curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64        # download latest release
    chmod +x minikube                                                                                     # make the minikube binary executable
    sudo mkdir -p /usr/local/bin/                                                                         # make sure the install directory exists
    sudo install minikube /usr/local/bin/                                                                 # install the binary into your PATH

check minikube version and start:

    minikube version
    sudo minikube start --vm-driver=none        # start minikube without VM. if command returns error, see reference

FOLLOW EITHER STEP 6 (POD) OR STEP 7 (DEPLOYMENT):

6. deploy jupyterhub to minikube (pod)

reference: https://sweetcode.io/learning-kubernetes-getting-started-minikube/
reference: https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/

setup pod

check which nodes and pods are up:

kubectl get nodes
kubectl get pods                            # no pods should be deployed to the cluster yet

install pre-requisites:

apt-get install socat                       # required for port-forwarding

edit pod configurations:

vim pod.yaml            # create the pod configuration file

apiVersion: v1
kind: Pod
metadata:
  name: pod-jupyter-test
  labels:
    app: pod-jupyter-test
spec:  # specification of the pod's contents
  restartPolicy: Never
  containers:
  - name: pod-jupyter-test
    image: jupyterhub/jupyterhub
    ports:
    - containerPort: 8000

deploy pod

kubectl create -f pod.yaml
kubectl get pods                                            # list pods
kubectl describe pod pod-jupyter-test                       # describe pods
nohup kubectl port-forward pod-jupyter-test 8000:8000 &     # run port forwarder in the background (even after logout)
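
Port forwarding fails if the pod is not running yet, so it can help to wait for the pod before starting the forwarder. A minimal sketch, assuming the pod name and port from the steps above; the `forward_when_ready` helper, the 120-second timeout, and the `port-forward.log` file name are choices made for this example.

```shell
#!/usr/bin/env bash
# Wait for a pod to become Ready, then forward a local port to it in the background.
# Arguments: pod name, port (used for both the local and the container side).
forward_when_ready() {
    local pod="$1" port="$2"
    kubectl wait --for=condition=Ready "pod/${pod}" --timeout=120s || return 1
    nohup kubectl port-forward "pod/${pod}" "${port}:${port}" > port-forward.log 2>&1 &
    echo "port-forward running in background, PID $!"
}

# Usage:
#   forward_when_ready pod-jupyter-test 8000
```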

7. deploy jupyterhub to minikube (deployment)

reference: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

setup deployment

tag the docker image and push it to Docker Hub

sudo docker login                                                              # log in to your Docker Hub account
<dockerhub credentials go here>
sudo docker ps                                                                 # list docker containers
sudo docker tag <image_id or image_name> <username/image:tag>                  # tag the image with your Docker Hub repository name
sudo docker push <username/image:tag>                                          # push the tagged image to Docker Hub

edit deployment configurations:

vim deployment.yaml            # create the deployment configuration file

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-jupyter-test
  labels:
    app: pod-jupyter-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pod-jupyter-test
  template:
    metadata:
      labels:
        app: pod-jupyter-test
    spec:
      containers:
      - name: testserver
        image: bkandimalla/dandi
        ports:
        - containerPort: 8000

launch deployment

kubectl create -f deployment.yaml                                             
kubectl get deployment                                                        # list deployments
kubectl get pods                                                              # list pods
kubectl describe deployment deployment-jupyter-test                           # describe deployment
nohup kubectl port-forward deployment/deployment-jupyter-test 8000:8000 &     # run port forwarder in the background (even after logout)
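
To confirm that the deployment actually rolled out, the commands above can be followed by a rollout check. This is a sketch under the assumption that the names match `deployment.yaml` as written on this page; the `run` helper and the 120-second timeout are choices made for this example, and commands are only printed unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
DEPLOY="deployment-jupyter-test"   # metadata.name from deployment.yaml
LABEL="app=pod-jupyter-test"       # label shared by the pods in the template

# Echo each command; execute it as well only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

run kubectl rollout status "deployment/${DEPLOY}" --timeout=120s   # wait for the replicas to come up
run kubectl get pods -l "$LABEL" -o wide                            # list the pods the deployment created
```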