1 change: 1 addition & 0 deletions .gitignore
@@ -7,6 +7,7 @@ node_modules/

# CDK
**/*cdk.out
**/*cdk.context.json

# onboard.sh artifacts
artifacts
106 changes: 106 additions & 0 deletions infrastructure/iac/aws-cdk/lambda_lightsail_poc/README.md
@@ -0,0 +1,106 @@
# POC With Lambda, SSM, and Lightsail

## Context

The rootski postgres database has been exposed publicly for some time.
While the database has been protected with a username/password, it is still
vulnerable to attacks since the world can reach the database.

We wanted to secure the database by blocking all traffic to the database lightsail
instance from the outside world. Specifically, we wanted only the lambda function
to be able to access the database.

We attempted to let our backend Lambda function and our database lightsail instance
reach each other by creating a "VPC Peer Connection" between the lightsail VPC
and the default VPC in us-west-2.

At the hackathon, we enabled the VPC Peer Connection and deployed the lambda into
the default VPC, only to discover that when you do that, lambda functions lose
their internet access! (unless you pay $400+/year for a NAT Gateway for your default VPC).

The behavior we observed was that, deployed into the default VPC, the lambda function
hung and timed out with no logs.

Dismayed, we decided to do a proof of concept to investigate whether it is possible
in general to achieve private peered network access with a lambda function and a
lightsail instance... and we did it!

## Problem

There are three services our lambda function needs to be able to access:

1. The postgres database running on a lightsail instance
2. AWS SSM parameter store to read configuration such as the database credentials
3. AWS Cognito to fetch the "JSON Web Keys" which are used to validate JWT tokens

## Solution

### 1. Accessing Lightsail by a private IP Address

First, we created a VPC peering connection between the lightsail VPC in us-west-2 and the Default VPC in us-west-2.

The CDK code in this POC creates a lightsail instance with a webserver (nginx) running on port 80.
This CDK code also creates a lambda function deployed into the Default VPC. It makes two requests to lightsail
where it tries to access the lightsail instance with the instance's:

1. public IP address, and it FAILS! This is expected, because the lambda has no public internet access.
2. private IP address, and it WORKS! This is expected, because the lambda's VPC is peered with the lightsail VPC.
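
The two requests described above could look roughly like the following. This is a minimal sketch, not the POC's actual ping lambda: the helper `can_reach`, the `handler` signature, and the environment variable names `LIGHTSAIL_PUBLIC_IP` and `LIGHTSAIL_PRIVATE_IP` are assumptions made for illustration.

```python
import http.client
import os
import urllib.request


def can_reach(url: str, timeout_seconds: float = 3.0) -> bool:
    """Return True if an HTTP GET to ``url`` succeeds within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, refused connections, and socket timeouts
        # are all OSError subclasses.
        return False


def handler(event, context):
    """Hypothetical Lambda entry point: probe nginx on both addresses."""
    # Assumed environment variable names; the real construct may pass them differently.
    public_ip = os.environ["LIGHTSAIL_PUBLIC_IP"]
    private_ip = os.environ["LIGHTSAIL_PRIVATE_IP"]
    return {
        "public_ip_reachable": can_reach(f"http://{public_ip}:80/"),
        "private_ip_reachable": can_reach(f"http://{private_ip}:80/"),
    }
```

A short timeout matters here: without one, a request to an unreachable address can hang until the lambda itself times out, which matches the "hung with no logs" behavior described earlier.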

SUCCESS! We should be able to place a firewall rule on the lightsail instance to block any incoming
traffic that does not come from IP addresses in the CIDR range `172.0.0.0/8` AKA "any IP address starting with `172`" or,
said differently, to allow *only clients on the same network as the lightsail instance* 🎉 🎉.
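
To make the firewall rule concrete, here is a small sketch using Python's `ipaddress` module to test which source addresses fall inside a CIDR range. The function name `is_allowed` and the sample addresses are illustrative. One caveat worth checking before adopting the rule: `172.0.0.0/8` also contains public addresses, while the private RFC 1918 block is the narrower `172.16.0.0/12`.

```python
import ipaddress

# Range named in this README: every address starting with 172.
README_RANGE = ipaddress.ip_network("172.0.0.0/8")
# Private (RFC 1918) portion of the 172.x.x.x space.
PRIVATE_172_BLOCK = ipaddress.ip_network("172.16.0.0/12")


def is_allowed(source_ip: str, allowed_range=README_RANGE) -> bool:
    """Return True if ``source_ip`` falls inside ``allowed_range``."""
    return ipaddress.ip_address(source_ip) in allowed_range


if __name__ == "__main__":
    for ip in ("172.31.5.10", "172.217.0.1", "8.8.8.8"):
        print(ip, is_allowed(ip), is_allowed(ip, PRIVATE_172_BLOCK))
```

An address such as `172.217.0.1` passes the `/8` check but not the `/12` check, so a tighter range (or the exact CIDR of the peered VPC) may be the safer choice.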

### 2. SSM VPC Endpoint

It turns out that most/all AWS services are accessed by publicly exposed endpoints.
So, for example, if you use `boto3` to try to read a parameter from SSM, `boto3` reaches
out to the public SSM endpoint hosted by AWS.

Here's the problem: since our lambda function didn't have access to the public internet,
it couldn't reach *any* publicly accessible endpoints, let alone the public SSM endpoint.
It turns out AWS has a solution called "VPC Endpoints", which let services
inside a VPC reach certain AWS services without the requests needing to leave your VPC!

In this POC, we "created" (enabled) the VPC Endpoint for the SSM service in the default VPC of us-west-2,
and we had the lambda try to read an SSM parameter. It works!
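
Reading the parameter from inside the lambda might look like the sketch below. The helper names `read_parameter` and `make_ssm_client` are assumptions for illustration; `get_parameter` is the standard `boto3` SSM client call, and the parameter name is the one this POC's stack creates. The client is passed in so the function can be exercised without AWS access.

```python
def read_parameter(ssm_client, name: str) -> str:
    """Fetch a single SSM parameter value using the given boto3 SSM client."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


def make_ssm_client(region_name: str = "us-west-2"):
    """Build a real SSM client; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above stays dependency-free

    return boto3.client("ssm", region_name=region_name)


# Inside the lambda, usage would be roughly:
#   value = read_parameter(make_ssm_client(), "/lightsail-poc/test-parameter")
```

With the VPC Endpoint enabled, the code itself does not change. Only the network path that `boto3` uses to reach SSM is different.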

### 3. Cognito JWT Keys

Unfortunately, AWS doesn't let you create a VPC endpoint for AWS Cognito. So our lambda won't
be able to reach Cognito to download the JSON Web Keys. But this could be okay!

Our *API Gateway* in charge of invoking our backend API lambda function *can* access Cognito.
We can have API Gateway validate tokens *before they are even passed to the lambda*.
Unauthenticated requests will simply never make it to our backend code.

The API Gateway will reject requests to auth-protected endpoints like `POST /breakdown` if there is no
valid JWT token from our cognito user pool in the headers.

This means our API code in the lambda function will not need to reach out to Cognito to download the
keys. Instead, it will simply trust that all tokens are valid, and use the contents to identify the
user.
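
As a rough illustration of "trust the token and use its contents", the sketch below decodes the payload segment of a JWT without checking the signature. The function name `read_unverified_claims` is hypothetical, and this is only reasonable under the assumption stated above: that API Gateway has already rejected every request with an invalid token.

```python
import base64
import json


def read_unverified_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

The backend could then identify the user from a claim such as `sub`. If the lambda is ever invoked by something other than the authenticated API Gateway route, this shortcut stops being safe.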

## Conclusion

It's sad that our lambda does not have internet access when in the Default VPC, but this is by
far the best solution to protect the backend database. This is important because the database
stores email addresses, which are PII. We can't leak those!

Here are the considerations we will need to make with rootski now:

1. The backend API code can only reach services that are in the lightsail VPC or
   on a list of AWS services that support VPC endpoints.
2. We will need to register our backend API endpoints in the API Gateway. Here,
   we will explicitly require certain endpoints to be authenticated, and others
   simply passed through to the backend. [EDIT] This won't actually work because
   of the `GET /breakdowns` endpoint. `GET /breakdowns` does not require auth,
   but it behaves differently if the user *is* authenticated. This means that

   1. we need to write a special lambda authorizer for this endpoint that
      allows the request to reach the backend if either of these conditions is met:

      - there is no token in the headers (request is not even claiming to be authenticated)
      - there is a token in the headers and the token is valid

   2. we need to allow the backend API to get the JWKs another way,
      maybe by writing a CRON Lambda that saves the JWKs to SSM daily.
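
The decision rule for the special authorizer can be sketched as follows. This shows only the allow/deny logic, not the response format API Gateway expects from a lambda authorizer. The names `extract_bearer_token` and `should_allow` are hypothetical, and the `is_valid_token` callback stands in for real JWT validation.

```python
from typing import Callable, Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from the Authorization header, or None if absent."""
    for key, value in headers.items():
        if key.lower() == "authorization":
            value = value.strip()
            if value.lower().startswith("bearer "):
                return value[len("bearer "):].strip() or None
            return value or None
    return None


def should_allow(headers: Mapping[str, str], is_valid_token: Callable[[str], bool]) -> bool:
    """Allow anonymous requests and requests carrying a valid token."""
    token = extract_bearer_token(headers)
    if token is None:
        # The request is not claiming to be authenticated.
        return True
    # A token was supplied, so it has to be valid.
    return is_valid_token(token)
```

A request with a present but invalid token is rejected rather than treated as anonymous, which matches the two conditions listed above.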
75 changes: 75 additions & 0 deletions infrastructure/iac/aws-cdk/lambda_lightsail_poc/app.py
@@ -0,0 +1,75 @@
"""
App defining an API Gateway and Lambda Function with a FastAPI app.

.. note::

    Eric recommends *never* putting node.try_get_context() calls inside of a stack.
    Hiding those calls inside of a stack makes it very unintuitive to figure out what inputs
    you need to actually create a stack. Instead, write stacks and constructs assuming any
    required inputs will be passed as arguments to the constructor. Make the calls to try_get_context
    in the app.py.

    In each stack/construct, use a dataclass, enum, or constants to define actual string constants
    used for ContextVars (inputs) and Cloudformation Outputs (outputs).
"""

import aws_cdk as cdk
from aws_cdk import Stack
from aws_cdk import aws_ssm as ssm
from constructs import Construct
from lambda_lightsail_poc.constructs.lightsail_instance import LightsailInstance
from lambda_lightsail_poc.constructs.ping_lambda import PingLambdaFunction
from lambda_lightsail_poc.constructs.ssm_vpc_endpoint import SsmVpcEndpoint


class LambdaLightsailPOCStack(Stack):
    def __init__(
        self,
        scope: Construct,
        construct_id: str,
        **kwargs,
    ):
        super().__init__(scope, construct_id, **kwargs)

        self.vpc_endpoint = SsmVpcEndpoint(
            self,
            "SSM-VPC-Endpoint",
        )

        self.lightsail_instance = LightsailInstance(
            self,
            construct_id="LightsailInstance",
            name_prefix="POC-",
        )

        self.test_ssm_param = ssm.StringParameter(
            self,
            "parameter-for-ping-lambda",
            parameter_name="/lightsail-poc/test-parameter",
            string_value="Hi friends! 😈",
        )

        self.lightsail_lambda_pinger = PingLambdaFunction(
            self,
            construct_id="PingLambdaFunction",
            lightsail_public_ip=self.lightsail_instance.static_ip.attr_ip_address,
            lightsail_private_ip=self.lightsail_instance.instance.attr_private_ip_address,
            test_ssm_parameter=self.test_ssm_param,
        )


if __name__ == "__main__":
    app = cdk.App()

    environment = cdk.Environment(
        account="091910621680",
        region="us-west-2",
    )

    LambdaLightsailPOCStack(
        app,
        "Lambda-Lightsail-POC-Stack-cdk",
        env=environment,
    )

    app.synth()
15 changes: 15 additions & 0 deletions infrastructure/iac/aws-cdk/lambda_lightsail_poc/cdk.json
@@ -0,0 +1,15 @@
{
  "app": "python3 app.py",
  "context": {
    "@aws-cdk/aws-apigateway:usagePlanKeyOrderInsensitiveId": true,
    "aws-cdk:enableDiffNoFail": "true",
    "@aws-cdk/core:stackRelativeExports": "true",
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true,
    "@aws-cdk/aws-secretsmanager:parseOwnedSecretName": true,
    "@aws-cdk/aws-kms:defaultKeyPolicies": true,
    "@aws-cdk/aws-ecs-patterns:removeDefaultDesiredCount": true,
    "@aws-cdk/aws-rds:lowercaseDbIdentifier": true,
    "@aws-cdk/aws-efs:defaultEncryptionAtRest": true,
    "@aws-cdk/aws-lambda:recognizeVersionProps": true
  }
}
@@ -0,0 +1,133 @@
from textwrap import dedent

import aws_cdk as cdk
from aws_cdk import Stack
from aws_cdk import aws_lightsail as lightsail
from constructs import Construct

LIGHTSAIL_USER_DATA_SCRIPT: str = dedent(
    """\
    #!/bin/bash

    set -x

    # act as the super user for this script
    sudo su

    # map python -> python2 (yum needs python2)
    unlink /usr/bin/python
    ln -sfn /usr/bin/python2 /usr/bin/python

    # update and install docker
    # NOTE, -y makes yum answer yes to all prompts
    # httpd-tools is for bcrypt via the htpasswd command for generating basic auth passwords for the /docs and traefik UIs
    yum update -y
    yum -y install docker git httpd-tools zsh
    usermod -a -G docker ec2-user # allow ec2-user to use docker commands

    # install docker-compose and make the binary executable
    curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/bin/docker-compose
    chmod +x /usr/bin/docker-compose

    # install ohmyzsh
    sh -c "$(wget https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh -O -) --unattended"

    # start docker
    service docker start

    cat << EOF > /tmp/docker-compose.yml
    version: "3.9"

    # run a basic webserver on port 80 to test network connectivity
    services:
      nginx:
        image: nginx
        ports:
          - 80:80
        deploy:
          labels:
            traefik.enable: "false"
          replicas: 1

    EOF

    docker swarm init
    docker stack deploy -c /tmp/docker-compose.yml nginx
    """
)


class LightsailInstance(Construct):
    def __init__(self, scope: cdk.Stack, construct_id: str, name_prefix: str, **kwargs):
        super().__init__(scope, construct_id, **kwargs)
        self.instance = lightsail.CfnInstance(
            self,
            id=name_prefix + "lightsail-instance-for-vpc-lambda",
            instance_name="lightsail-instance-for-vpc-lambda",
            key_pair_name="rootski.id_rsa",
            availability_zone="us-west-2a",
            networking=lightsail.CfnInstance.NetworkingProperty(
                ports=[
                    lightsail.CfnInstance.PortProperty(
                        access_direction="inbound",
                        cidrs=["0.0.0.0/0"],
                        common_name="SSH",
                        from_port=22,
                        protocol="tcp",
                        to_port=22,
                    ),
                    lightsail.CfnInstance.PortProperty(
                        access_direction="inbound",
                        cidrs=["0.0.0.0/0"],
                        common_name="Postgres",
                        from_port=8000,
                        protocol="tcp",
                        to_port=8000,
                    ),
                    # traefik
                    lightsail.CfnInstance.PortProperty(
                        access_direction="inbound",
                        cidrs=["0.0.0.0/0"],
                        common_name="Postgres",
                        from_port=80,
                        protocol="tcp",
                        to_port=80,
                    ),
                    lightsail.CfnInstance.PortProperty(
                        access_direction="outbound",
                        cidrs=["0.0.0.0/0"],
                        common_name="All Outbound Traffic",
                        from_port=0,
                        protocol="all",
                        to_port=65535,
                    ),
                ]
            ),
            # found using 'aws lightsail get-blueprints --profile rootski'
            blueprint_id="amazon_linux_2",
            # found using 'aws lightsail get-bundles --profile rootski'
            bundle_id="micro_2_0",
            user_data=LIGHTSAIL_USER_DATA_SCRIPT,
        )

        # free as long as the instance is running
        self.static_ip = lightsail.CfnStaticIp(
            self,
            id="Rootski-DB-Lightsail-StaticIp",
            static_ip_name=name_prefix + "Rootski-DB-Lightsail-StaticIp",
            attached_to=self.instance.ref,
        )


if __name__ == "__main__":

    class LightsailStack(Stack):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # name_prefix is a required argument of LightsailInstance
            self.lightsail_instance = LightsailInstance(self, "test-instance", name_prefix="test-")

    app = cdk.App()
    stack = LightsailStack(app, "test-stack")
    instance = stack.lightsail_instance

    print(instance.instance.user_data)