This has been based on the `z2jh-aws-ansible` setup. Deployment steps:
- Make sure to do all the operations in the same AWS zone that you will use in step 5 in the `group_vars/all` file.
- Create an HTTPS certificate using AWS Certificate Manager. Attaching it to load balancers is free, and JupyterHub supports proxy offloading to this certificate.
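If you prefer the CLI over the console, a certificate can be requested along these lines (a sketch; `hub.example.org` and the region are placeholders for your own values):

```
# Request an ACM certificate for the hub's domain, validated via DNS.
aws acm request-certificate \
    --domain-name hub.example.org \
    --validation-method DNS \
    --region us-east-2
```

The command prints the certificate ARN, which is the "AWS certificate ID" encrypted into the ansible variables later.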
- Create the GitHub App id/token. We have done this through a bot GitHub user account (dandibot).
- Set up an AWS CI instance with authorized roles (see the blog post for details):
- AmazonEC2FullAccess
- AmazonSQSFullAccess
- IAMFullAccess
- AmazonS3FullAccess
- AmazonVPCFullAccess
- AmazonElasticFileSystemFullAccess
- AmazonRoute53FullAccess
- AmazonEventBridgeFullAccess

Then add the CI instance's public DNS name to the `hosts` file, and install git on the CI instance.
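The managed policies above can be attached to the instance's IAM role in the console or via the AWS CLI, for example (a sketch; `ci-instance-role` is an illustrative role name):

```
# Attach each required managed policy to the CI instance's IAM role.
for policy in AmazonEC2FullAccess AmazonSQSFullAccess IAMFullAccess \
              AmazonS3FullAccess AmazonVPCFullAccess \
              AmazonElasticFileSystemFullAccess \
              AmazonRoute53FullAccess AmazonEventBridgeFullAccess; do
  aws iam attach-role-policy \
      --role-name ci-instance-role \
      --policy-arn "arn:aws:iam::aws:policy/${policy}"
done
```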
- Install ansible locally and create a password for ansible to encrypt some of the ansible variables:

```
openssl rand -hex 32 > ansible_password
```

This password is used to encrypt values such as GitHub tokens and the AWS certificate ID, using the following form:

```
ansible-vault encrypt_string "string_to_encrypt"
```
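For example, to produce a vault block for a GitHub token that can be pasted into `group_vars/all` (a sketch; the variable name and token value are placeholders):

```
# Emits a !vault-encrypted value under the given variable name.
ansible-vault encrypt_string \
    --vault-password-file ansible_password \
    --name 'github_token' \
    'the-plaintext-token'
```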
- Update the variables and some YAML files. Specifically this involves `group_vars/all` and `config.yaml.j2`. Also note that the namespace has to be unique across any JupyterHub instances created with this setup.
- Create the following IAM policy:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup",
"ec2:DescribeLaunchTemplateVersions",
"ec2:DescribeInstanceTypes"
],
"Resource": "*"
}
]
}
```
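The policy can be created from this JSON with the AWS CLI, which prints the ARN needed in the next step (a sketch; the policy and file names are illustrative):

```
# Save the JSON above as autoscaler-policy.json, then:
aws iam create-policy \
    --policy-name cluster-autoscaler \
    --policy-document file://autoscaler-policy.json
# The response contains the policy ARN to put into z2jh.yml.
```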
- Update `z2jh.yml` to reflect the new policy ARN.
- Run the playbook:

```
ansible-playbook -i hosts z2jh.yml -v --vault-password-file ansible_password
```
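Once the playbook finishes, a quick sanity check from the CI instance (substitute the namespace you configured earlier):

```
# The hub and proxy pods should be Running in your namespace.
kubectl get pods -n <your-namespace>
```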
To use this repo for a reprohub deployment:

```
cd z2jh-aws-ansible
cp -r ../dandi-info/. .
ansible-playbook -i hosts z2jh.yml -v --vault-password-file ansible_password
```
To tear down everything:

```
ansible-playbook -i hosts teardown.yml -v --vault-password-file ansible_password -t all-fixtures
```

To remove kubernetes without removing the shared EFS:

```
ansible-playbook -i hosts teardown.yml -v --vault-password-file ansible_password -t kubernetes
```
Files:

- `group_vars/all`: ansible file containing variables for the various templates
- `cluster-autoscaler-multi-asg.yaml.j2`: k8s cluster autoscaler spec
- `config.yaml.j2`: z2jh JupyterHub configuration
- `hosts`: ansible inventory providing the IP of the control host
- `nodes[1-3].yaml.j2`: k8s node specs for on-demand nodes in multiple zones
- `pod.yaml.j2`: k8s pod for introspecting shared storage
- `pv_efs.yaml.j2`: k8s persistent volume spec for EFS
- `pvc_efs.yaml.j2`: k8s persistent volume claim for EFS
- `spot-ig.yaml.j2`: k8s non-GPU spec for compute nodes
- `spot-ig-gpu.yaml.j2`: k8s GPU spec for compute nodes
- `storageclass.yaml.j2`: k8s EFS storageclass
- `teardown.yml`: ansible playbook for tearing down the cluster
- `z2jh.yml`: ansible playbook for starting up the cluster
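Once the cluster is up, the rendered storage pieces can be inspected together (a sketch; the playbook renders and applies the `.j2` templates, and the pod name below is whatever `pod.yaml.j2` defines):

```
# Check that the EFS storageclass, PV, and PVC exist and are bound.
kubectl get storageclass,pv,pvc
# Use the introspection pod to look at the shared EFS mount.
kubectl exec -it <introspection-pod> -- df -h
```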