4. Using Dockerised Knetminer with Amazon AWS and ElasticBeanStalk

Ajit Singh edited this page Sep 23, 2019 · 1 revision

Knetminer can be deployed on AWS Elastic Beanstalk either through the Beanstalk CLI or through the Beanstalk section of the AWS Console.

Prerequisites

Installing AWS Beanstalk CLI - ONLY FOR CLI APPROACH

To start a Knetminer deployment on AWS Beanstalk from the command line, you need to install the AWS Beanstalk CLI (the eb command).

Dataset and Permissions

To use a dataset of your choice for this deployment:

  • The dataset directory needs to be uploaded to an S3 bucket. The dataset directory should follow the convention explained in the section.
  • Either the bucket needs to be public, or an appropriate IAM policy (allowing listing and reading of the S3 bucket) needs to be attached to the Beanstalk IAM role being used.
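
As a rough sketch of the second option above, the helper below prints a read-only policy document for a single bucket. The function name knetminer_s3_read_policy and the bucket name used in the usage note are hypothetical, and the sketch assumes that listing the bucket and reading its objects are the only permissions the deployment needs.

```shell
#!/bin/sh
# Print a minimal read-only IAM policy document for one S3 bucket.
# Assumption: s3:ListBucket and s3:GetObject are sufficient for the dataset copy.
# Usage: knetminer_s3_read_policy BUCKET_NAME
knetminer_s3_read_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

A possible workflow, with placeholder names throughout: upload the dataset with `aws s3 sync ./my-dataset s3://my-knetminer-bucket/my-dataset/`, write the policy with `knetminer_s3_read_policy my-knetminer-bucket > policy.json`, then attach it with `aws iam put-role-policy --role-name aws-elasticbeanstalk-ec2-role --policy-name knetminer-s3-read --policy-document file://policy.json`. The role name shown is the Beanstalk default instance role; substitute the role your environment actually uses.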

Deployment

Assuming the Knetminer GitHub repository has been downloaded, follow the instructions below to customise and deploy Knetminer.

Beanstalk S3 configuration

Edit the Beanstalk S3 configuration file so that it uses the right S3 bucket for the dataset.

cd knetminer/common/quickstart/aws
vi .ebextensions/01_s3.config

Change the S3 bucket path from

aws s3 cp s3://knetminer-testing-bucket/arabadopsis/ /home/ec2-user/knetminer-dataset --recursive

to

aws s3 cp s3://<MY-KNETMINER-BUCKET>/<DATASET-FOLDER>/ /home/ec2-user/knetminer-dataset --recursive
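
If you prefer not to edit the file by hand, the same substitution can be scripted. This is a minimal sketch: the function name set_dataset_source is hypothetical, and it assumes the config file contains the copy command in the form shown above and that the bucket and folder names contain no `|` or `&` characters.

```shell
#!/bin/sh
# Point the "aws s3 cp" command in an .ebextensions file at another S3 location.
# Usage: set_dataset_source CONFIG_FILE BUCKET DATASET_FOLDER
set_dataset_source() {
  config_file="$1"
  bucket="$2"
  folder="$3"
  tmp_file="${config_file}.tmp"
  # Replace only the s3:// source argument; the destination path and
  # the --recursive flag on the same line are left untouched.
  sed "s|aws s3 cp s3://[^ ]*|aws s3 cp s3://${bucket}/${folder}/|" "$config_file" > "$tmp_file" &&
    mv "$tmp_file" "$config_file"
}
```

For example, `set_dataset_source .ebextensions/01_s3.config my-knetminer-bucket my-dataset`, run from knetminer/common/quickstart/aws, rewrites the source path in place.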

Beanstalk EC2 instance configuration

Depending on the dataset size, you need to pick an appropriate AWS instance type for Beanstalk to use when deploying Knetminer. AWS instance types, their specifications and their pricing in different AWS regions can be found at https://aws.amazon.com/ec2/pricing/on-demand/ . Below are some sample instance types with different CPU and memory configurations.

INSTANCE-TYPE  vCPUs  MEMORY
t2.medium      2      4 GiB
m4.large       2      8 GiB
m4.xlarge      4      16 GiB
m4.2xlarge     8      32 GiB

Edit the Beanstalk instance configuration file to use the right instance type:

cd knetminer/common/quickstart/aws
vi .ebextensions/00_instance.config

Use the required instance type value in the line below:

InstanceType: <INSTANCE-TYPE>

Example:

InstanceType: m4.xlarge

Add or delete the EC2KeyName entry. It is OPTIONAL and only required to log on (SSH) to the Beanstalk EC2 instance for troubleshooting. Delete the line if SSH login to the instance is not required.

EC2KeyName: <SSH-KEY-NAME>

Example:

EC2KeyName: mysshkeyname
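
The two edits described in this section can also be scripted. The sketch below assumes only that the config file contains `InstanceType:` and `EC2KeyName:` lines as shown above; the function names set_instance_type and remove_ssh_key are hypothetical.

```shell
#!/bin/sh
# Set the InstanceType value in an .ebextensions file, keeping its indentation.
# Usage: set_instance_type CONFIG_FILE INSTANCE_TYPE
set_instance_type() {
  config_file="$1"
  instance_type="$2"
  tmp_file="${config_file}.tmp"
  sed "s|^\([[:space:]]*InstanceType:\).*|\1 ${instance_type}|" "$config_file" > "$tmp_file" &&
    mv "$tmp_file" "$config_file"
}

# Delete the optional EC2KeyName line when SSH access is not needed.
# Usage: remove_ssh_key CONFIG_FILE
remove_ssh_key() {
  config_file="$1"
  tmp_file="${config_file}.tmp"
  sed '/^[[:space:]]*EC2KeyName:/d' "$config_file" > "$tmp_file" &&
    mv "$tmp_file" "$config_file"
}
```

For example, `set_instance_type .ebextensions/00_instance.config m4.xlarge` followed by `remove_ssh_key .ebextensions/00_instance.config` selects a 4 vCPU, 16 GiB instance without SSH access.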

Edit Docker run file - ONLY FOR PREDEFINED DATASET

When using a predefined dataset from the Knetminer GitHub repository, add a Command entry to the Dockerrun JSON file.

vi Dockerrun.aws.json

Change

  "Entrypoint": "./runtime-helper.sh"
}

to

  "Entrypoint": "./runtime-helper.sh",
  "Command": "arabidopsis /root/knetminer-dataset"
}

Create a new AWS Beanstalk environment - via AWS Console

Prepare the code zip file

To deploy via the AWS Console, you need to prepare a zip file containing the Docker files along with the customised configuration files above.

cd knetminer/common/quickstart/aws
zip -r code.zip .

Log on to the Beanstalk section of the AWS Console. You can either create a new application or select an existing one. In the selected application, start a new environment by clicking the 'Actions' button on the right-hand side of the page and selecting 'Create environment'. Use the following values in the New Environment wizard.

  • Environment: 'Web server environment'
  • Environment name: a user-friendly name (e.g. knetminer-test)
  • Domain: a user-friendly DNS prefix (e.g. knetminer-test)
  • Platform: Preconfigured platform -> Docker
  • Application code: Select 'Upload your code' and select the code.zip file created above.

Create a new AWS Beanstalk environment - via CLI

You can now create a new environment in the selected AWS Beanstalk application. This step provisions the AWS resources (instance, load balancer) and adds the Knetminer Docker container to the launched AWS instance.

eb create

This will prompt for:

  • unique environment name - provide a user-friendly name (e.g. knetminer-test)
  • DNS CNAME prefix - can be left at its default value
  • load balancer type - can be left at its default value
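
The prompts can largely be avoided by passing the values on the command line. The following is a sketch rather than a tested recipe: it assumes the directory may not yet be initialised for the EB CLI (hence the `eb init` step), and the application name, environment name and region are placeholders. By default it only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) the EB CLI commands for a scripted deployment.
# All names below are placeholders, not values taken from the Knetminer repository.
APP_NAME="${APP_NAME:-knetminer}"
ENV_NAME="${ENV_NAME:-knetminer-test}"
EB_REGION="${EB_REGION:-eu-west-2}"

# Echo the command in dry-run mode, run it for real when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

deploy_knetminer() {
  # Register the current directory as a Beanstalk application on the Docker platform.
  run eb init "$APP_NAME" --platform docker --region "$EB_REGION"
  # Create the environment, reusing its name as the DNS CNAME prefix.
  run eb create "$ENV_NAME" --cname "$ENV_NAME"
}
```

Calling `deploy_knetminer` from knetminer/common/quickstart/aws prints the two commands for review; `RUN=1 deploy_knetminer` runs them. The EB CLI may still ask for AWS credentials if none are configured.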

Browsing Knetminer UI

The deployment process provisions the AWS resources (instance, load balancer) and adds the Knetminer Docker container to the launched instance. This usually takes about 15 minutes, depending on the dataset size.

Log on to the Beanstalk section of the AWS Console, browse to the application and the newly launched environment, and find its URL (e.g. knetminer-test.eu-west-2.elasticbeanstalk.com). Copy the URL and append /client (e.g. knetminer-test.eu-west-2.elasticbeanstalk.com/client) to browse the Knetminer UI.
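
The URL construction can be sketched from the command line as well. The helper name knetminer_ui_url is hypothetical, and the sketch assumes the environment is served over plain HTTP, which is not stated above.

```shell
#!/bin/sh
# Build the Knetminer UI address from a Beanstalk environment host name.
# Assumption: the environment answers on plain HTTP.
# Usage: knetminer_ui_url ENVIRONMENT_HOST
knetminer_ui_url() {
  host="${1#http://}"   # tolerate a leading scheme
  host="${host%/}"      # tolerate a trailing slash
  echo "http://${host}/client"
}
```

The host name is also shown as CNAME in the output of `eb status`. A quick reachability check could then be `curl -sSf -o /dev/null "$(knetminer_ui_url knetminer-test.eu-west-2.elasticbeanstalk.com)"`, which exits non-zero if the page cannot be fetched.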

Delete Knetminer environment

A deployed Knetminer AWS Beanstalk environment can be terminated via:

AWS Console

In the Beanstalk section of the AWS Console, browse to the environment, click the 'Actions' button on the right-hand side of the page and select 'Terminate environment' from the drop-down list.

AWS Beanstalk CLI

eb terminate <environment-name> # e.g: eb terminate knetminer-test

For further help, browse https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.terminating.html