feat: Support AWS S3 Express One Zone buckets (#229)
# What
This change adds the `S3_SERVICE` configuration variable, which defaults to `s3` and may be set to either `s3` or `s3express`.

It also introduces the `virtual-v2` option for `S3_STYLE` to support the connectivity requirements of S3 Express One Zone (directory) buckets. We are introducing it as a successor to `virtual` and believe it should work well for all AWS usage, but we want to be cautious as we make this change.

Many thanks to @hveiga for driving the implementation of this feature in their original pull request.

Setting this variable to `s3express` changes the "service" used to sign requests with the V4 signature to `s3express`. The gateway currently works without this step, but it is advised in the AWS documentation [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-security-best-practices.html).
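
For reference, the relevant settings for a directory bucket look like this (values taken from `deployments/s3_express/settings.s3express.example` added in this change; the bucket name and zonal endpoint are placeholders):

```
# Directory (S3 Express One Zone) buckets use a zonal endpoint, the
# s3express signing service, and the virtual-v2 host style.
S3_BUCKET_NAME=my-bucket-name--usw2-az1--x-s3
S3_SERVER=s3express-usw2-az1.us-west-2.amazonaws.com
S3_SERVER_PORT=443
S3_SERVER_PROTO=https
S3_REGION=us-west-2
S3_STYLE=virtual-v2
S3_SERVICE=s3express
AWS_SIGS_VERSION=4
```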

## Other Changes
We are moving the determination of the hostname used to query S3 into the Docker entrypoint (or the bootstrap script for non-Docker installs). If `S3_STYLE` is set to `virtual-v2` (the recommended successor to the default `virtual` scheme), the hostname will be:
```
${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}
``` 
which will be used in these locations (a concrete example with placeholder values follows the list):
* The `proxy_pass` directive
* The HTTP `Host` header sent to AWS
* The `host` element of the canonical headers used in signing AWS signature V4 requests.
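
As a concrete sketch (hypothetical values: `S3_BUCKET_NAME=my-bucket`, `S3_SERVER=s3.us-west-2.amazonaws.com`, `S3_SERVER_PORT=443`), the entrypoint resolves these values as follows:

```
# S3_STYLE=virtual-v2: bucket-prefixed host with port, used for proxying, the Host header, and signing
S3_UPSTREAM=my-bucket.s3.us-west-2.amazonaws.com:443
S3_HOST_HEADER=my-bucket.s3.us-west-2.amazonaws.com:443

# S3_STYLE=path: the bucket appears in the request path rather than the hostname
S3_UPSTREAM=s3.us-west-2.amazonaws.com:443
S3_HOST_HEADER=s3.us-west-2.amazonaws.com:443

# S3_STYLE=virtual (legacy): bucket-prefixed Host header without the port
S3_UPSTREAM=s3.us-west-2.amazonaws.com:443
S3_HOST_HEADER=my-bucket.s3.us-west-2.amazonaws.com
```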

Based on my reading of https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html, AWS recommends that the bucket always be prepended to the hostname; the other addressing schemes exist only for backwards-compatibility reasons. However, please comment on discussion #231 if you have concerns.
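
For a quick side-by-side of the two addressing styles (hypothetical bucket, region, and object key):

```
# Virtual-hosted-style (AWS-recommended): the bucket is part of the hostname
https://my-bucket.s3.us-west-2.amazonaws.com/key.txt

# Path-style (legacy): the bucket is the first segment of the URI path
https://s3.us-west-2.amazonaws.com/my-bucket/key.txt
```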

Co-authored-by: @hveiga <[email protected]>
4141done authored Apr 25, 2024
1 parent 7f3064b commit 7f5df74
Showing 23 changed files with 362 additions and 43 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/main.yml
@@ -54,6 +54,9 @@ jobs:
test-oss:
runs-on: ubuntu-22.04
needs: build-oss-for-test
strategy:
matrix:
path_style: [virtual, virtual-v2]
steps:
- uses: actions/checkout@v4
- name: Install dependencies
@@ -82,7 +85,7 @@ jobs:
run: |
docker load --input ${{ runner.temp }}/oss.tar
- name: Run tests - stable njs version
run: ./test.sh --type oss
run: S3_STYLE=${{ matrix.path_style }} ./test.sh --type oss

build-latest-njs-for-test:
runs-on: ubuntu-22.04
43 changes: 42 additions & 1 deletion .gitignore
@@ -346,4 +346,45 @@ test-settings.*
s3-requests.http
httpRequests/

.bin/
.bin/

# Created by https://www.toptal.com/developers/gitignore/api/terraform
# Edit at https://www.toptal.com/developers/gitignore?templates=terraform

### Terraform ###
# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# password, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
*.tfvars
*.tfvars.json

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Include override files you do wish to add to version control using negated pattern
# !example_override.tf

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

# Ignore CLI configuration files
.terraformrc
terraform.rc
.tfplan
# End of https://www.toptal.com/developers/gitignore/api/terraform
3 changes: 2 additions & 1 deletion common/docker-entrypoint.d/00-check-for-required-env.sh
@@ -22,7 +22,7 @@ set -e

failed=0

required=("S3_BUCKET_NAME" "S3_SERVER" "S3_SERVER_PORT" "S3_SERVER_PROTO"
required=("S3_SERVICE" "S3_BUCKET_NAME" "S3_SERVER" "S3_SERVER_PORT" "S3_SERVER_PROTO"
"S3_REGION" "S3_STYLE" "ALLOW_DIRECTORY_LIST" "AWS_SIGS_VERSION"
"CORS_ENABLED")

@@ -122,6 +122,7 @@ if [ $failed -gt 0 ]; then
fi

echo "S3 Backend Environment"
echo "Service: ${S3_SERVICE}"
echo "Access Key ID: ${AWS_ACCESS_KEY_ID}"
echo "Origin: ${S3_SERVER_PROTO}://${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}"
echo "Region: ${S3_REGION}"
21 changes: 21 additions & 0 deletions common/docker-entrypoint.sh
@@ -68,6 +68,27 @@ if [ -z "${CORS_ALLOWED_ORIGIN+x}" ]; then
export CORS_ALLOWED_ORIGIN="*"
fi

# This is the primary logic to determine the s3 host used for the
# upstream (the actual proxying action) as well as the `Host` header
#
# It is currently slightly more complex than necessary because we are transitioning
# to a new logic which is defined by "virtual-v2". "virtual-v2" is the recommended setting
# for all deployments.

# S3_UPSTREAM needs the port specified. The port must
# correspond to https/http in the proxy_pass directive.
if [ "${S3_STYLE}" == "virtual-v2" ]; then
export S3_UPSTREAM="${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}"
export S3_HOST_HEADER="${S3_BUCKET_NAME}.${S3_SERVER}:${S3_SERVER_PORT}"
elif [ "${S3_STYLE}" == "path" ]; then
export S3_UPSTREAM="${S3_SERVER}:${S3_SERVER_PORT}"
export S3_HOST_HEADER="${S3_SERVER}:${S3_SERVER_PORT}"
else
export S3_UPSTREAM="${S3_SERVER}:${S3_SERVER_PORT}"
export S3_HOST_HEADER="${S3_BUCKET_NAME}.${S3_SERVER}"
fi


# Nothing is modified under this line

if [ -z "${NGINX_ENTRYPOINT_QUIET_LOGS:-}" ]; then
20 changes: 6 additions & 14 deletions common/etc/nginx/include/s3gateway.js
@@ -39,6 +39,7 @@ _requireEnvVars('S3_SERVER_PORT');
_requireEnvVars('S3_REGION');
_requireEnvVars('AWS_SIGS_VERSION');
_requireEnvVars('S3_STYLE');
_requireEnvVars('S3_SERVICE');


/**
@@ -86,7 +87,7 @@ const INDEX_PAGE = "index.html";
* Constant defining the service requests are being signed for.
* @type {string}
*/
const SERVICE = 's3';
const SERVICE = process.env['S3_SERVICE'] || "s3";

/**
* Transform the headers returned from S3 such that there isn't information
@@ -165,12 +166,7 @@ function s3date(r) {
function s3auth(r) {
const bucket = process.env['S3_BUCKET_NAME'];
const region = process.env['S3_REGION'];
let server;
if (S3_STYLE === 'path') {
server = process.env['S3_SERVER'] + ':' + process.env['S3_SERVER_PORT'];
} else {
server = process.env['S3_SERVER'];
}
const host = r.variables.s3_host;
const sigver = process.env['AWS_SIGS_VERSION'];

let signature;
@@ -180,7 +176,7 @@
let req = _s3ReqParamsForSigV2(r, bucket);
signature = awssig2.signatureV2(r, req.uri, req.httpDate, credentials);
} else {
let req = _s3ReqParamsForSigV4(r, bucket, server);
let req = _s3ReqParamsForSigV4(r, bucket, host);
signature = awssig4.signatureV4(r, awscred.Now(), region, SERVICE,
req.uri, req.queryParams, req.host, credentials);
}
@@ -221,15 +217,11 @@ function _s3ReqParamsForSigV2(r, bucket) {
* @see {@link https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html | AWS V4 Signing Process}
* @param r {NginxHTTPRequest} HTTP request object
* @param bucket {string} S3 bucket associated with request
* @param server {string} S3 host associated with request
* @param host {string} S3 host associated with request
* @returns {S3ReqParams} s3ReqParams object (host, uri, queryParams)
* @private
*/
function _s3ReqParamsForSigV4(r, bucket, server) {
let host = server;
if (S3_STYLE === 'virtual' || S3_STYLE === 'default' || S3_STYLE === undefined) {
host = bucket + '.' + host;
}
function _s3ReqParamsForSigV4(r, bucket, host) {
const baseUri = s3BaseUri(r);
const computed_url = !utils.parseBoolean(r.variables.forIndexPage)
? r.variables.uri_path
1 change: 1 addition & 0 deletions common/etc/nginx/nginx.conf
@@ -20,6 +20,7 @@ env S3_REGION;
env AWS_SIGS_VERSION;
env DEBUG;
env S3_STYLE;
env S3_SERVICE;
env ALLOW_DIRECTORY_LIST;
env PROVIDE_INDEX_PAGE;
env APPEND_SLASH_FOR_POSSIBLE_DIRECTORY;
15 changes: 7 additions & 8 deletions common/etc/nginx/templates/default.conf.template
@@ -19,11 +19,10 @@ map $uri_full_path $uri_path {
default $PREFIX_LEADING_DIRECTORY_PATH$uri_full_path;
}

map $S3_STYLE $s3_host_hdr {
virtual "${S3_BUCKET_NAME}.${S3_SERVER}";
path "${S3_SERVER}:${S3_SERVER_PORT}";
default "${S3_BUCKET_NAME}.${S3_SERVER}";
}
# S3_HOST_HEADER is set in the startup script
# (either ./common/docker-entrypoint.sh or ./standalone_ubuntu_oss_install.sh)
# based on the S3_STYLE configuration option.
js_var $s3_host ${S3_HOST_HEADER};

js_var $indexIsEmpty true;
js_var $forIndexPage true;
@@ -141,7 +140,7 @@ server {
proxy_set_header X-Amz-Security-Token $awsSessionToken;

# We set the host as the bucket name to inform the S3 API of the bucket
proxy_set_header Host $s3_host_hdr;
proxy_set_header Host $s3_host;

# Use keep alive connections in order to improve performance
proxy_http_version 1.1;
@@ -202,7 +201,7 @@ server {
proxy_set_header X-Amz-Security-Token $awsSessionToken;

# We set the host as the bucket name to inform the S3 API of the bucket
proxy_set_header Host $s3_host_hdr;
proxy_set_header Host $s3_host;

# Use keep alive connections in order to improve performance
proxy_http_version 1.1;
@@ -265,7 +264,7 @@ server {
proxy_set_header X-Amz-Security-Token $awsSessionToken;

# We set the host as the bucket name to inform the S3 API of the bucket
proxy_set_header Host $s3_host_hdr;
proxy_set_header Host $s3_host;

# Use keep alive connections in order to improve performance
proxy_http_version 1.1;
(additional template file; name not rendered in this view)
@@ -19,7 +19,7 @@ proxy_set_header Authorization $s3auth;
proxy_set_header X-Amz-Security-Token $awsSessionToken;

# We set the host as the bucket name to inform the S3 API of the bucket
proxy_set_header Host $s3_host_hdr;
proxy_set_header Host $s3_host;

# Use keep alive connections in order to improve performance
proxy_http_version 1.1;
25 changes: 25 additions & 0 deletions deployments/s3_express/.terraform.lock.hcl

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions deployments/s3_express/.tool-versions
@@ -0,0 +1 @@
terraform 1.8.1
45 changes: 45 additions & 0 deletions deployments/s3_express/README.md
@@ -0,0 +1,45 @@
# Purpose
This Terraform script sets up an AWS S3 Express One Zone bucket for testing.

## Usage
Use environment variables to authenticate:

```bash
export AWS_ACCESS_KEY_ID="anaccesskey"
export AWS_SECRET_ACCESS_KEY="asecretkey"
export AWS_REGION="us-west-2"
```

Generate a plan:
```bash
terraform plan -out=plan.tfplan \
  -var="bucket_name=my-bucket-name--usw2-az1--x-s3" \
  -var="region=us-west-2" \
  -var="availability_zone_id=usw2-az1" \
  -var="[email protected]"
```
> [!NOTE]
> AWS S3 Express One Zone is only available in [certain regions and availability zones](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-networking.html#s3-express-endpoints). If you get an error like `api error InvalidBucketName` even though you have met the [naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-bucket-naming-rules.html), you have likely chosen an unsupported region/availability zone combination.

If you are comfortable with the plan, apply it:
```
terraform apply "plan.tfplan"
```

Then build the image (you can also use the latest release):
```bash
docker build --file Dockerfile.oss --tag nginx-s3-gateway:oss --tag nginx-s3-gateway .
```

Configure and run the image:

```bash
docker run --rm --env-file ./settings.s3express.example --publish 80:80 --name nginx-s3-gateway \
nginx-s3-gateway:oss
```

Confirm that it is working. The Terraform script prepopulates the bucket with a single test object:
```bash
curl http://localhost:80/test.txt
```
51 changes: 51 additions & 0 deletions deployments/s3_express/main.tf
@@ -0,0 +1,51 @@
provider "aws" {
region = var.region
}

resource "aws_s3_directory_bucket" "example" {
bucket = var.bucket_name
location {
name = var.availability_zone_id
}

force_destroy = true
}

data "aws_partition" "current" {}
data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "example" {
statement {
effect = "Allow"

actions = [
"s3express:*",
]

resources = [
aws_s3_directory_bucket.example.arn,
]

principals {
type = "AWS"
identifiers = ["arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root"]
}
}
}

resource "aws_s3_bucket_policy" "example" {
bucket = aws_s3_directory_bucket.example.bucket
policy = data.aws_iam_policy_document.example.json
}

# The filemd5() function is available in Terraform 0.11.12 and later
# For Terraform 0.11.11 and earlier, use the md5() function and the file() function:
# etag = "${md5(file("path/to/file"))}"
# etag = filemd5("path/to/file")
resource "aws_s3_object" "example" {
bucket = aws_s3_directory_bucket.example.bucket
key = "test.txt"
source = "${path.root}/test_data/test.txt"
}


22 changes: 22 additions & 0 deletions deployments/s3_express/settings.s3express.example
@@ -0,0 +1,22 @@
S3_BUCKET_NAME=my-bucket-name--usw2-az1--x-s3
AWS_ACCESS_KEY_ID=ZZZZZZZZZZZZZZZZZZZZ
AWS_SECRET_ACCESS_KEY=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
AWS_SESSION_TOKEN=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
S3_SERVER=s3express-usw2-az1.us-west-2.amazonaws.com
S3_SERVER_PORT=443
S3_SERVER_PROTO=https
S3_REGION=us-west-2
S3_STYLE=virtual-v2
S3_SERVICE=s3express
DEBUG=true
AWS_SIGS_VERSION=4
ALLOW_DIRECTORY_LIST=false
PROVIDE_INDEX_PAGE=false
APPEND_SLASH_FOR_POSSIBLE_DIRECTORY=false
DIRECTORY_LISTING_PATH_PREFIX=""
PROXY_CACHE_MAX_SIZE=10g
PROXY_CACHE_SLICE_SIZE="1m"
PROXY_CACHE_INACTIVE=60m
PROXY_CACHE_VALID_OK=1h
PROXY_CACHE_VALID_NOTFOUND=1m
PROXY_CACHE_VALID_FORBIDDEN=30s
2 changes: 2 additions & 0 deletions deployments/s3_express/test_data/test.txt
@@ -0,0 +1,2 @@
Congratulations, friend. You are using Amazon S3 Express One Zone.
πŸš‚πŸš‚πŸš‚ Choo-choo~ πŸš‚πŸš‚πŸš‚