AWS Workstation Setup

David Brainard edited this page Jul 10, 2017 · 7 revisions

Here are some instructions for how to set up a workstation for working with AWS. This will allow you to do things like create and connect to AWS EC2 instances, and access files stored on AWS S3.

Install Python and AWS CLI

The main tool for connecting your workstation to AWS is the AWS Command Line Interface. If you have pip installed, then it should be easy:

sudo pip install awscli

On macOS / OS X, you might have python but not pip. In that case you can use Homebrew to install a regular version of Python that comes with pip:

brew install python

Then the pip command should work as above.

While you are here and thinking about brew, you also need to install jq:

brew install jq
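
After the installs above, a quick check that everything landed on your PATH can save some confusion later. Here is a minimal sketch; the function name check_tools is made up for this page and is not part of any of the tools.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when all are found, 1 otherwise.
# (check_tools is a hypothetical helper name, not part of the AWS CLI.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For this page you would probably call it as: check_tools python pip aws jq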

Configure AWS Command Line Interface

You also need to configure the AWS Command Line Interface to work with our AWS account:

aws configure

It will prompt you for some information:

AWS Access Key ID: ??? 
AWS Secret Access Key: ??? 
Default region name: us-east-1
Default output format: json

The first two values come from the lab's AWS account administrator. The admin should make you an IAM user account and grant it the permissions you'll need.

The default region should be us-east-1, and the default output format should be json.
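
To confirm that the configuration took, you can read the stored settings back and ask AWS who it thinks you are. This is only a sketch: the function name show_aws_setup is hypothetical, and it assumes your IAM user is allowed to call sts get-caller-identity (which normally needs no special permission).

```shell
# Print the settings that "aws configure" stored, then ask AWS who we are.
# The last call fails if the access key ID or secret access key is wrong.
# (show_aws_setup is a hypothetical helper name.)
show_aws_setup() {
    echo "region: $(aws configure get region)"
    echo "output: $(aws configure get output)"
    aws sts get-caller-identity --query Arn --output text
}
```

If the keys are good, the last line printed should be the ARN of your IAM user.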

Work with EC2 Instances

Once your AWS CLI is configured you should be able to run jobs on EC2 instances.

Here's an example script for running a simple calculation on a short-lived EC2 instance.
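
Independent of that script, a simple way to see whether your account can talk to EC2 at all is to list the instances that are currently running. The sketch below is an assumption about what you might want to see; the function name list_running_instances is made up, and your IAM user needs permission to describe instances.

```shell
# Show ID, type, and public DNS name of every running instance in the
# default region, as a table.
# (list_running_instances is a hypothetical helper name.)
list_running_instances() {
    aws ec2 describe-instances \
        --filters "Name=instance-state-name,Values=running" \
        --query "Reservations[].Instances[].[InstanceId,InstanceType,PublicDnsName]" \
        --output table
}
```

An empty table just means nothing is running right now; a permissions error means the admin still needs to grant you EC2 access.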

Mount S3 Buckets Locally

Once your AWS CLI is configured you should be able to access S3 data.

A nice way to do this is to mount an S3 bucket to your local file system and access it like a regular directory. We can do this with a tool called yas3fs.

When it works, the S3 bucket will look like it's part of the local file system. You'll be able to browse the files and open them in the Finder, Matlab, etc. The data will only be transferred when you actually open a file, so it's OK if you can't fit the whole bucket on your local drive.
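
Before getting into mounting, it can help to confirm that the plain AWS CLI can see the bucket, since that separates permission problems from yas3fs problems. A minimal sketch, with a made-up function name peek_bucket:

```shell
# List the top level of a bucket without downloading anything.
# Usage: peek_bucket <bucket-name>
# (peek_bucket is a hypothetical helper name.)
peek_bucket() {
    aws s3 ls "s3://$1/"
}
```

For example, peek_bucket render-toolbox-test should print the folders and files at the top of that bucket, assuming your account has access to it.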

macOS / OS X

Here are instructions for setting up yas3fs on macOS / OS X.

Install osxfuse

You need to have osxfuse installed. This lets programs like yas3fs mount things on the local file system from user space, without you having to get involved with the kernel.

Go to the releases page, download the .dmg, and run the installer.

Install yas3fs

Now you can install "Yet Another S3 File System", aka yas3fs. It should be easy with pip:

sudo pip install yas3fs

Mount a Bucket

Finally, you should be able to mount an S3 bucket. You need the name of the bucket, for example render-toolbox-test. You also need to pick a local folder where the bucket will show up, for example ~/Desktop/render-toolbox-test. You can change either value as needed.

yas3fs s3://render-toolbox-test ~/Desktop/render-toolbox-test --mkdir

You should see the volume in the Finder, and you should be able to browse and open files. Browsing might be a little slow and you might have to wait several seconds for all of the files in a folder to appear. This is expected because S3 is not really a file system and not optimized for browsing folders.
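
If you mount several buckets, it may be convenient to wrap the command so the local folder is always named after the bucket. This is just a sketch of one way to do it; mount_point_for and mount_bucket are hypothetical names, and the only yas3fs option used is the --mkdir flag shown above.

```shell
# Build the local folder for a bucket: <parent>/<bucket>.
# The parent defaults to ~/Desktop, as in the example above.
# (mount_point_for and mount_bucket are hypothetical helper names.)
mount_point_for() {
    echo "${2:-$HOME/Desktop}/$1"
}

# Mount s3://<bucket> at that folder, creating the folder if needed.
# Usage: mount_bucket <bucket-name> [parent-folder]
mount_bucket() {
    yas3fs "s3://$1" "$(mount_point_for "$1" "$2")" --mkdir
}
```

With these, mount_bucket render-toolbox-test runs the same command as the example above.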

Unmount a Bucket

When done, you can unmount the bucket:

umount ~/Desktop/render-toolbox-test
rmdir ~/Desktop/render-toolbox-test
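
If you are not sure whether the bucket is still mounted, you can check first and only then unmount. The sketch below assumes the output of the mount command contains " on <folder> " for each mounted volume, which is the usual format on macOS and Linux; is_mounted and unmount_bucket are made-up names.

```shell
# Succeed only if something is mounted at the given absolute path.
# Assumes "mount" prints lines of the form "<source> on <path> ...".
# (is_mounted and unmount_bucket are hypothetical helper names.)
is_mounted() {
    mount | grep -qF " on $1 "
}

# Unmount and remove the folder, but only when it is really a mount point.
# Usage: unmount_bucket <absolute-path>
unmount_bucket() {
    if is_mounted "$1"; then
        umount "$1" && rmdir "$1"
    else
        echo "not mounted: $1" >&2
        return 1
    fi
}
```

Pass the full path, for example unmount_bucket "$HOME/Desktop/render-toolbox-test", because the mount listing shows absolute paths.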

Linux

Here are instructions for setting up yas3fs on Linux. This example worked on Linux Mint. Ubuntu and other Ubuntu-based distributions should work the same way. Other Linux distributions should be pretty similar, apart from the package manager.

Install yas3fs

To install yas3fs and configure it:

sudo apt-get -y install fuse python-pip 
sudo pip install yas3fs
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf
sudo chmod a+r /etc/fuse.conf
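
The sed line above uncomments the user_allow_other option in /etc/fuse.conf. If you want to confirm that the edit took effect, a small check like the following could be used. The function name fuse_allows_other is hypothetical.

```shell
# Succeed if user_allow_other is enabled (uncommented) in a fuse.conf file.
# The file defaults to /etc/fuse.conf; pass another path to check a copy.
# (fuse_allows_other is a hypothetical helper name.)
fuse_allows_other() {
    grep -q '^user_allow_other' "${1:-/etc/fuse.conf}"
}
```

For example: fuse_allows_other && echo "fuse.conf is ready"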

Mount a Bucket

To mount a bucket:

export AWS_ACCESS_KEY_ID=(same as above)
export AWS_SECRET_ACCESS_KEY=(same as above)

mkdir ~/Desktop/render-toolbox-working
yas3fs s3://render-toolbox-working ~/Desktop/render-toolbox-working --recheck-s3

I'm not sure why I had to explicitly export my AWS credentials, but this got it working. Likewise, I'm not sure why we need the --recheck-s3 flag, but this made it go.
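
If you do need the exports, one option is to pull the values out of the AWS CLI configuration instead of pasting the keys into the terminal. This is only a sketch: it assumes the keys were stored by aws configure in the default profile, and export_aws_credentials is a made-up name.

```shell
# Copy the access keys saved by "aws configure" into the environment
# variables used in the exports above. Fails if either key is not set
# in the AWS CLI configuration.
# (export_aws_credentials is a hypothetical helper name.)
export_aws_credentials() {
    AWS_ACCESS_KEY_ID=$(aws configure get aws_access_key_id) || return 1
    AWS_SECRET_ACCESS_KEY=$(aws configure get aws_secret_access_key) || return 1
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
}
```

Call export_aws_credentials in the same terminal where you run yas3fs, since exported variables only apply to that shell and the programs it starts.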

Unmount a Bucket

To unmount that bucket:

fusermount -u ~/Desktop/render-toolbox-working

AWS Instance

We can also mount S3 buckets from EC2 instances. This is nice for big jobs because the data transfers between S3 and EC2 are cheaper and faster than between S3 and your local workstation.

Setup on an EC2 instance goes a lot like the Linux setup above. The main difference is that the instance uses an IAM role for EC2 to gain AWS permissions, instead of a human user account with an access key ID and secret access key.

IAM Role

You must assign the IAM role when you start the instance. You can do this through the EC2 web interface, or through MatlabJobSupport. We have an example of this in our simple calculation example. Look for the iamProfile parameter.

For the Brainard Lab, you can generally use the default role, called ecsInstanceRole. This role has the necessary permissions including AmazonS3FullAccess.
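
Once the instance is up, you can check from inside it that the role was really attached. The sketch below assumes the AWS CLI is installed on the instance and that the identity ARN for an instance role has the form arn:aws:sts::<account>:assumed-role/<role>/<session>. The names role_from_arn and current_role are hypothetical.

```shell
# Extract the role name from an assumed-role ARN such as
#   arn:aws:sts::123456789012:assumed-role/ecsInstanceRole/i-0abc
# Prints the role name, or fails if the ARN is not an assumed role.
# (role_from_arn and current_role are hypothetical helper names.)
role_from_arn() {
    case "$1" in
        *:assumed-role/*/*) echo "$1" | cut -d/ -f2 ;;
        *) return 1 ;;
    esac
}

# Ask AWS which role the current credentials belong to.
current_role() {
    role_from_arn "$(aws sts get-caller-identity --query Arn --output text)"
}
```

On an instance started with the default role, current_role should print ecsInstanceRole. If it fails, the instance is probably running without a role, or with ordinary user keys.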

Install yas3fs

sudo apt-get update
sudo apt-get -y install fuse python-pip 
sudo pip install yas3fs
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf
sudo chmod a+r /etc/fuse.conf

Mount a Bucket

mkdir ~/render-toolbox-working
yas3fs s3://render-toolbox-working ~/render-toolbox-working

Unmount a Bucket

fusermount -u ~/render-toolbox-working