AWS Workstation Setup
Here are some instructions for how to set up a workstation for working with AWS. This will allow you to do things like create and connect to AWS EC2 instances, and access files stored on AWS S3.
The main tool for connecting your workstation to AWS is the AWS Command Line Interface. If you have pip installed, then it should be easy:
sudo pip install awscli
On macOS / OS X, you might have Python but not pip. In that case you can use Homebrew to install a regular version of Python that comes with pip:
brew install python
Then the pip command should work as above.
While you are here and thinking about brew, you also need to install jq:
brew install jq
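Before moving on, it can help to confirm that both tools ended up on your PATH. The following is a minimal sketch, not part of the original setup; the function name check_tools is made up for this example.

```shell
#!/bin/sh
# check_tools is a hypothetical helper name, not part of any AWS tooling.
# Prints one line per tool that is missing from PATH and returns
# non-zero if any of the named tools are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The two tools this page has asked for so far.
if check_tools aws jq; then
    echo "aws and jq are both installed"
fi
```

If either tool is reported missing, repeat the matching install step above before continuing.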
You also need to configure the AWS Command Line Interface to work with our AWS account:
aws configure
It will prompt you for some information:
AWS Access Key ID: ???
AWS Secret Access Key: ???
Default region name: us-east-1
Default output format: json
The first two you have to get from the lab's AWS account administrator. The admin should make you an IAM user account and grant you the permissions you'll need:
- For working with EC2 instances, you need AmazonEC2FullAccess.
- For accessing S3 data, you need AmazonS3FullAccess.
The default region should be us-east-1, and the default output format should be json.
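One way to check the result is to read the saved region back and ask AWS which identity the key pair belongs to. This is a sketch under a few assumptions: check_region is a made-up helper name, and the expected region is the one named in the text above. The aws configure get and aws sts get-caller-identity subcommands are part of the AWS CLI.

```shell
#!/bin/sh
# check_region is a hypothetical helper: compares the configured region
# (first argument) against the expected region (second argument).
check_region() {
    if [ "$1" = "$2" ]; then
        echo "region OK: $1"
    else
        echo "region is '$1', expected '$2'"
        return 1
    fi
}

# Assumption: the region named in the text above. Change it if yours differs.
expected_region="us-east-1"

if command -v aws >/dev/null 2>&1; then
    check_region "$(aws configure get region)" "$expected_region" || true
    # Ask AWS who we are; this fails if the key ID or secret is wrong.
    aws sts get-caller-identity --query Arn --output text \
        || echo "could not authenticate; try running 'aws configure' again"
else
    echo "aws CLI not found; install it first"
fi
```

If the last command prints an ARN ending in your IAM user name, the access key and secret were accepted.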
Once your AWS CLI is configured you should be able to run jobs on EC2 instances.
Here's an example script for running a simple calculation on a short-lived EC2 instance.
Once your AWS CLI is configured you should be able to access S3 data.
A nice way to do this is to mount an S3 bucket to your local file system and access it like a regular directory. We can do this with a tool called yas3fs.
When it works, the S3 bucket will look like it's part of the local file system. You'll be able to browse the files and open them in the Finder, Matlab, etc. The data will only be transferred when you actually open a file, so it's OK if you can't fit the whole bucket on your local drive.
Here are instructions for setting up yas3fs on macOS / OS X.
You need to have osxfuse installed. This allows you to mount things on the local file system, without getting involved with the kernel.
Go to the releases page, download the .dmg, and run the installer.
Now you can install "Yet Another S3 File System", aka yas3fs. It should be easy with pip:
sudo pip install yas3fs
Finally, you should be able to mount an S3 bucket. You need the name of the bucket, for example render-toolbox-test. You also need to pick a local folder where the bucket will show up, for example ~/Desktop/render-toolbox-test. You can change either value as needed.
yas3fs s3://render-toolbox-test ~/Desktop/render-toolbox-test --mkdir
You should see the volume in the Finder, and you should be able to browse and open files. Browsing might be a little slow and you might have to wait several seconds for all of the files in a folder to appear. This is expected because S3 is not really a file system and not optimized for browsing folders.
When done, you can unmount the bucket:
umount ~/Desktop/render-toolbox-test
rmdir ~/Desktop/render-toolbox-test
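If you mount several buckets, the mount and unmount commands above can be wrapped so that only the bucket name changes. This is a sketch only: the function names are invented for this example, and it assumes you want every bucket to appear under ~/Desktop as in the commands above.

```shell
#!/bin/sh
# All four function names here are hypothetical helpers for illustration.

# Build the S3 URL for a bucket name.
s3_url() { printf 's3://%s\n' "$1"; }

# Build the local mount point for a bucket name (assumes ~/Desktop).
mount_point() { printf '%s/Desktop/%s\n' "$HOME" "$1"; }

# Mount a bucket with the same yas3fs arguments used above.
mount_bucket() {
    yas3fs "$(s3_url "$1")" "$(mount_point "$1")" --mkdir
}

# Unmount a bucket and remove the now-empty mount point.
unmount_bucket() {
    umount "$(mount_point "$1")" && rmdir "$(mount_point "$1")"
}

# Show what would be run for the example bucket, without mounting anything.
echo "yas3fs $(s3_url render-toolbox-test) $(mount_point render-toolbox-test) --mkdir"
```

With these defined in your shell, mount_bucket render-toolbox-test and unmount_bucket render-toolbox-test would stand in for the longer commands.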
Here are instructions for setting up yas3fs on Linux. This example worked on Linux Mint. Ubuntu and other Debian-based distributions should work the same way. Other Linux distributions should be similar, though the package manager commands will differ.
To install yas3fs and configure it:
sudo apt-get -y install fuse python-pip
sudo pip install yas3fs
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf
sudo chmod a+r /etc/fuse.conf
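The sed command above uncomments the user_allow_other line in /etc/fuse.conf. To see what that edit does without touching the real file, here is a small sketch that applies the same substitution to a scratch copy. The helper name fuse_allows_other and the sample file contents are assumptions made for this example.

```shell
#!/bin/sh
# fuse_allows_other is a hypothetical helper: succeeds if the given
# file contains an uncommented user_allow_other line.
fuse_allows_other() {
    grep -q '^user_allow_other' "$1"
}

# Apply the same substitution to a scratch file, not to /etc/fuse.conf.
# The sample contents are made up to resemble a default fuse.conf.
scratch="$(mktemp)"
printf '# mount_max = 1000\n#user_allow_other\n' > "$scratch"
sed 's/^# *user_allow_other/user_allow_other/' "$scratch" > "$scratch.new"

if fuse_allows_other "$scratch.new"; then
    echo "user_allow_other is enabled after the edit"
fi
rm -f "$scratch" "$scratch.new"
```

After running the real setup commands, fuse_allows_other /etc/fuse.conf should succeed in the same way.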
To mount a bucket:
export AWS_ACCESS_KEY_ID=(same as above)
export AWS_SECRET_ACCESS_KEY=(same as above)
mkdir ~/Desktop/render-toolbox-working
yas3fs s3://render-toolbox-working ~/Desktop/render-toolbox-working --recheck-s3
I'm not sure why I had to explicitly export my AWS credentials, but this got it working. Likewise, I'm not sure why we need the --recheck-s3 flag, but this made it go.
To unmount that bucket:
fusermount -u ~/Desktop/render-toolbox-working
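The mount step above exports the credentials by hand. If you already ran aws configure, one possible shortcut is to read the saved values back with aws configure get instead of pasting them again. This is a sketch, assuming the credentials were saved in the default profile. The function name load_aws_credentials is made up for this example.

```shell
#!/bin/sh
# load_aws_credentials is a hypothetical helper. It copies the key pair
# saved by "aws configure" into the two environment variables exported
# in the mount step above. Returns non-zero, exporting nothing, if
# either value is missing.
load_aws_credentials() {
    key_id="$(aws configure get aws_access_key_id)" || return 1
    secret="$(aws configure get aws_secret_access_key)" || return 1
    [ -n "$key_id" ] && [ -n "$secret" ] || return 1
    export AWS_ACCESS_KEY_ID="$key_id"
    export AWS_SECRET_ACCESS_KEY="$secret"
}

if command -v aws >/dev/null 2>&1 && load_aws_credentials; then
    echo "credentials exported for this shell session"
else
    echo "no saved credentials found; run 'aws configure' first"
fi
```

Exported variables only affect the shell that sets them, so this would need to be pasted into, or sourced from, the same terminal where you then run yas3fs.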
We can also mount S3 buckets from EC2 instances. This is nice for big jobs because the data transfers between S3 and EC2 are cheaper and faster than between S3 and your local workstation.
Setup on an EC2 instance goes a lot like the Linux setup above. The main difference is that the instance uses an IAM role for EC2 to gain AWS permissions, instead of a human user account with an access key ID and secret access key.
You must assign the IAM role when you start the instance. You can do this through the EC2 web interface, or through MatlabJobSupport. We have an example of this in our simple calculation example. Look for the iamProfile parameter.
For the Brainard Lab, you can generally use the default role, called ecsInstanceRole. This role has the necessary permissions including AmazonS3FullAccess.
sudo apt-get update
sudo apt-get -y install fuse python-pip
sudo pip install yas3fs
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf
sudo chmod a+r /etc/fuse.conf
mkdir ~/render-toolbox-working
yas3fs s3://render-toolbox-working ~/render-toolbox-working
fusermount -u ~/render-toolbox-working