The project is organized as follows, per the LucidChart diagram:

- (A) A Lambda function (get-iss-position), deployed through a Docker image, runs hourly via an EventBridge trigger and pulls GPS location data for the International Space Station from NASA's Open Notify API. The extracted data is deposited as a parquet file in an S3 bucket (iss-position).
- (B) A Glue ETL job (iss-daily-avg-speed) is triggered by EventBridge at 1:30 AM UTC daily. It takes in the previous day's GPS data from the iss-position S3 bucket, then uses PySpark to compute the average hourly speed travelled by the ISS over that day. The result is written as a parquet file to a separate S3 bucket (iss-avg-speed).
- (C) Two tables are initialized in a Redshift Serverless data warehouse, iss_last_position and iss-avg-speed, holding the previous day's last recorded location and average hourly speed respectively. A Lambda function (update-redshift-tables) is triggered every time the iss-avg-speed S3 bucket is updated, inserting the previous day's last recorded location and average hourly speed into their respective Redshift tables.
- (D) The two Redshift tables feed a Google Looker Studio dashboard, which visualizes the trend in average hourly speed over the past week and the last recorded positions of the ISS over the past three days. Note that due to the cost of maintaining the Redshift Serverless data warehouse, the dashboard is currently fed by static CSV files exported from the Redshift tables before I took down the warehouse.
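As a rough illustration of step (B), the sketch below estimates average hourly speed from consecutive position samples using the haversine great-circle distance. It is plain Python rather than the project's PySpark job, and the function names, the (timestamp, latitude, longitude) sample layout, and the choice of the haversine formula are all assumptions for illustration, not the repository's actual code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; ignores the ISS's altitude


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def average_hourly_speed_kmh(samples):
    """Mean of the speeds between consecutive samples, in km/h.

    samples: iterable of (unix_timestamp_seconds, latitude_deg, longitude_deg).
    This layout is hypothetical; the real parquet schema may differ.
    """
    ordered = sorted(samples)  # order by timestamp
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(ordered, ordered[1:]):
        hours = (t1 - t0) / 3600
        if hours <= 0:
            continue  # skip duplicate timestamps
        speeds.append(haversine_km(lat0, lon0, lat1, lon1) / hours)
    if not speeds:
        raise ValueError("need at least two samples with distinct timestamps")
    return sum(speeds) / len(speeds)
```

Because each speed is derived only from the distance between two hourly positions, this kind of estimate inherits the accuracy limitation described in the notes at the end of this README.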
Before deploying, you will need:
- Docker Engine set up locally: https://docs.docker.com/engine/install/
- The AWS CLI configured for use with your AWS account (ex: with an access key assigned to your IAM user): https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html
These steps outline how to programmatically deploy this project to your AWS account.
- Clone this repo to your machine and change into the repo directory
git clone https://github.com/Jason-B-Jiang/where-is-iss.git
cd where-is-iss
- Open config.txt and fill in your AWS region, your AWS account ID, and the desired admin username and password for the Redshift data warehouse. For example:
AWS_REGION=us-east-1
AWS_ACCOUNT_ID=123456789012
REDSHIFT_ADMIN_PW=Abcd1234
REDSHIFT_ADMIN_USER=admin
- Run the set-up script to automatically deploy this project, along with all necessary AWS resources and IAM roles. Important note: certain Docker steps in the script run with sudo. Please enter your system password whenever prompted
chmod u+x SETUP.sh
./SETUP.sh
- (Optional) Invoke the Lambda function and Glue job to test the deployment
# Invoke Lambda function - should write to S3 bucket called "iss-position"
aws lambda invoke --function-name get-iss-position response.json
# Delete json response file
rm response.json
# Invoke Glue job - make note of JobRunId for tracking
# Should write to S3 bucket called "iss-avg-speed"
aws glue start-job-run --job-name iss-daily-avg-speed
aws glue get-job-run --job-name iss-daily-avg-speed --run-id <JobRunId>
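The `get-job-run` check above can also be scripted instead of being re-run by hand. The following is a minimal sketch, assuming boto3 is installed and configured with the same credentials as the AWS CLI. The helper name `wait_for_glue_job` and the 30-second polling interval are hypothetical and not part of this repo.

```python
import time

# Glue job run states after which the run will not change any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def wait_for_glue_job(glue_client, job_name, run_id, poll_seconds=30, sleep=time.sleep):
    """Poll a Glue job run until it reaches a terminal state, then return that state.

    glue_client is expected to behave like boto3.client("glue"); `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    while True:
        response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   glue = boto3.client("glue")
#   run = glue.start_job_run(JobName="iss-daily-avg-speed")
#   print(wait_for_glue_job(glue, "iss-daily-avg-speed", run["JobRunId"]))
```

Passing the client in as an argument keeps the helper free of any AWS dependency, which makes it easy to try out with a stub before pointing it at a real account.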
Tearing the project down removes ALL assets created on your AWS account during set-up (ex: IAM roles, policies, Lambda functions, S3 buckets, etc).
- Run the teardown script to automatically delete all AWS assets created
chmod u+x TEARDOWN.sh
./TEARDOWN.sh
- Because the hourly speed of the ISS is estimated via the absolute distance between recorded coordinates, the speed estimate is very inaccurate: the ISS completes an orbit roughly every 90 minutes, so after an hour it can end up rather close to its previous position even though it has travelled most of the way around the Earth.
- The Glue job for computing the previous day's average speed will fail if this project is deployed between 12:00 AM and 1:30 AM UTC, as there will be no location data recorded for the previous day.
- AWS QuickSight is a more obvious and direct choice for creating a visualization from the Redshift tables, but I opted for Google Looker Studio instead due to the cost of using QuickSight.
- Project deployment and teardown can be further automated through a tool like Terraform.
- This project has more moving parts and complexity than strictly required, as my main goal was to explore a variety of AWS tools.
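To put a rough number on the speed-estimate limitation noted above, the sketch below compares the arc actually travelled along an orbit with the shortest great-circle arc between the start and end points. It assumes an idealized circular orbit with a period of about 92.9 minutes and ignores Earth's rotation, so the result is illustrative only. The constant and function name are assumptions, not part of this project.

```python
ORBIT_PERIOD_MIN = 92.9  # approximate ISS orbital period in minutes (assumption)


def shortest_arc_fraction(elapsed_min, period_min=ORBIT_PERIOD_MIN):
    """Ratio of the shortest great-circle arc between the start and end points
    to the arc actually travelled, for a circular orbit over a non-rotating Earth.

    A value of 1.0 means endpoint distance captures the full path; smaller
    values mean an endpoint-based speed estimate undercounts the distance.
    """
    if elapsed_min <= 0:
        raise ValueError("elapsed_min must be positive")
    travelled_deg = 360.0 * elapsed_min / period_min
    wrapped_deg = travelled_deg % 360.0
    # The shortest arc never exceeds half a revolution.
    shortest_deg = min(wrapped_deg, 360.0 - wrapped_deg)
    return shortest_deg / travelled_deg


print(f"60-minute gap: {shortest_arc_fraction(60):.2f} of the path is captured")
```

Under these assumptions a 60-minute sampling gap captures only a little over half of the distance travelled, while gaps shorter than half an orbit (about 46 minutes) capture all of it. Sampling more often than hourly would therefore be one way to tighten the estimate.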
