This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Use ephemeral EC2 runners for building multi-arch docker images #1442

Open
wants to merge 2 commits into
base: main
27 changes: 27 additions & 0 deletions .github/actions/ec2-runners/Dockerfile
@@ -0,0 +1,27 @@
# Copyright 2018-2022 Cargill Incorporated
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM ubuntu:focal

RUN apt-get update && apt-get install -yq --no-install-recommends \
    python3 \
    python3-pip \
 && pip3 install \
    botocore \
    boto3 \
    requests

COPY aws.py /aws.py

ENTRYPOINT ["/aws.py"]
171 changes: 171 additions & 0 deletions .github/actions/ec2-runners/README.md
@@ -0,0 +1,171 @@
# ec2-runners

Creates self-hosted GitHub Actions runners on EC2. Useful when
GitHub-hosted runners are too slow.
The runners are registered as self-hosted runners scoped to the repository in
which this action runs.

Provides two actions:

`start`:

* Creates two instances
* Bootstraps the buildx cluster
* Installs GHA runner software with the `--ephemeral` option

`stop`:

* Terminates any instances whose name matches the label provided
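
The `stop` behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the contents of `aws.py`: the function names are hypothetical, and it assumes the label appears in each instance's `Name` tag and that `ec2_client` is a boto3 EC2 client (for example `boto3.client("ec2")`). Pagination of `describe_instances` is ignored for brevity.

```python
from typing import Any, Dict, List


def name_tag_filters(label: str) -> List[Dict[str, Any]]:
    """Filters selecting live instances whose Name tag contains the label."""
    return [
        {"Name": "tag:Name", "Values": [f"*{label}*"]},
        {"Name": "instance-state-name", "Values": ["pending", "running"]},
    ]


def instance_ids(response: Dict[str, Any]) -> List[str]:
    """Flatten a describe_instances response into a list of instance IDs."""
    return [
        instance["InstanceId"]
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


def stop_cluster(ec2_client: Any, label: str) -> List[str]:
    """Terminate every instance matching the label and return their IDs."""
    response = ec2_client.describe_instances(Filters=name_tag_filters(label))
    ids = instance_ids(response)
    if ids:  # terminate_instances rejects an empty ID list
        ec2_client.terminate_instances(InstanceIds=ids)
    return ids
```

Passing the client in as an argument keeps the logic testable without AWS credentials.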

# Example usage

```yaml
name: GHA Buildx
on:
  - push
  - workflow_dispatch
jobs:
  start_cluster:
    name: Start buildx cluster
    runs-on: ubuntu-latest
    outputs:
      label: ${{ steps.start_buildx_cluster.outputs.label }}
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}

      - name: Start EC2 runners
        id: start_buildx_cluster
        uses: ./.github/actions/ec2-runners
        with:
          action: start
          amd_ami_id: ${{ secrets.AMD_AMI_ID }}
          amd_instance_type: t2.nano
          arm_ami_id: ${{ secrets.ARM_AMI_ID }}
          arm_instance_type: t4g.nano
          gh_personal_access_token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
          mode: buildx
          security_group_id: ${{ secrets.SECURITY_GROUP_ID }}
          subnet: ${{ secrets.SUBNET }}

  build_docker:
    name: Build docker
    needs: start_cluster
    runs-on: ${{ needs.start_cluster.outputs.label }}
    steps:
      - name: Debug
        run: docker buildx ls

  stop_cluster:
    name: Stop buildx cluster
    needs:
      - start_cluster
      - build_docker
    runs-on: ubuntu-latest
    if: ${{ always() }}
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}

      - name: Destroy cluster
        uses: ./.github/actions/ec2-runners
        with:
          action: stop
          label: ${{ needs.start_cluster.outputs.label }}
```

# Configuration

## Inputs

`action`

* `start` - deploy a new cluster
* `stop` - destroy a running cluster

`amd_ami_id`

AMI ID for the AMD instance. Should have Docker installed.

`amd_instance_type`

Instance Type for the AMD instance

`arm_ami_id`

AMI ID for the ARM instance. Should have Docker installed and daemon exposed on
port `2375`.

`arm_instance_type`

Instance Type for the ARM instance

`gh_personal_access_token`

GitHub Personal Access Token with the "repo" scope.
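
A token like this is typically exchanged for a short-lived runner registration token through GitHub's REST API. The sketch below shows one way that request could be built; it is illustrative only and not taken from `aws.py`. The function names are hypothetical, and it uses the standard library's `urllib` rather than the `requests` package the image installs, so that it has no third-party dependencies.

```python
import json
import urllib.request


def registration_token_request(repo: str, token: str) -> urllib.request.Request:
    """Build the POST request for a repo-scoped runner registration token.

    `repo` is "owner/name"; `token` is the personal access token.
    """
    url = f"https://api.github.com/repos/{repo}/actions/runners/registration-token"
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"token {token}",
        },
    )


def fetch_registration_token(repo: str, token: str) -> str:
    """Send the request and return the registration token from the JSON body."""
    with urllib.request.urlopen(registration_token_request(repo, token)) as resp:
        return json.load(resp)["token"]
```

The returned registration token is what the runner's `config.sh` would receive alongside the `--ephemeral` option.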

`label`

Label applied to the EC2 instances when they are created.
Ignored by the `start` action, which generates its own label.
Required by the `stop` action.

`mode`

* `buildx` - start a two node buildx cluster for multi-arch builds
* `single` - start a single self-hosted AMD runner

Defaults to `buildx`
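
For `buildx` mode, bootstrapping a two-node builder generally means creating a builder on the local (AMD) Docker daemon and appending the ARM daemon as a second node over TCP. The following sketch only assembles the `docker buildx` command lines; the builder name `multiarch` and the function name are hypothetical, and the actual commands run by `aws.py` may differ.

```python
from typing import List


def buildx_bootstrap_commands(arm_host: str, builder: str = "multiarch") -> List[List[str]]:
    """Return the docker CLI invocations for a two-node multi-arch builder."""
    return [
        # Node 1: the local daemon on the AMD instance builds linux/amd64.
        ["docker", "buildx", "create", "--name", builder,
         "--platform", "linux/amd64", "--use"],
        # Node 2: the ARM instance's daemon, reached on port 2375.
        ["docker", "buildx", "create", "--name", builder, "--append",
         "--platform", "linux/arm64", f"tcp://{arm_host}:2375"],
        # Start both nodes so the first build does not pay the startup cost.
        ["docker", "buildx", "inspect", "--bootstrap", builder],
    ]
```

Each inner list can be handed to `subprocess.run(cmd, check=True)` on the AMD instance.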

`security_group_id`

ID of the security group applied to the instances.
Must allow inbound traffic from the local subnet to port `2375`.
Must allow outbound traffic to connect to GitHub.

`subnet`

Subnet in which the instances are launched
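
The `args` list in this action's `action.yml` passes the inputs to the container entrypoint as positional arguments, in the same order they are listed above. A minimal sketch of how an entrypoint might unpack and validate them is shown below. The `Inputs` type and `parse_inputs` function are hypothetical, and the sketch assumes unset optional inputs arrive as empty strings.

```python
from typing import List, NamedTuple


class Inputs(NamedTuple):
    """Positional arguments, in the order action.yml lists them."""
    action: str
    amd_ami_id: str
    amd_instance_type: str
    arm_ami_id: str
    arm_instance_type: str
    gh_personal_access_token: str
    label: str
    mode: str
    security_group_id: str
    subnet: str


def parse_inputs(argv: List[str]) -> Inputs:
    """Map argv (without the program name) onto named fields and validate."""
    if len(argv) != len(Inputs._fields):
        raise SystemExit(f"expected {len(Inputs._fields)} arguments, got {len(argv)}")
    inputs = Inputs(*argv)
    if inputs.action not in ("start", "stop"):
        raise SystemExit(f"unknown action: {inputs.action!r}")
    if inputs.action == "stop" and not inputs.label:
        raise SystemExit("'label' is required when action is 'stop'")
    return inputs
```

In an entrypoint this would be called as `parse_inputs(sys.argv[1:])`.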

## Outputs

`label`

Random value generated when a new cluster is created.
It isolates jobs, so that each job runs only on the cluster it created.
Capture this output from the `start` action and pass it to the `stop` action
so that the instances are terminated.
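
As an illustration of how such a label could be generated and exposed as a step output, consider the sketch below. It is not the implementation in `aws.py`: the `gha-` prefix and both function names are hypothetical. It writes to the file named by `GITHUB_OUTPUT` when that variable is set and otherwise falls back to the older `::set-output` workflow command.

```python
import os
import secrets


def new_label(prefix: str = "gha-") -> str:
    """Return a random label: the prefix plus 8 hexadecimal characters."""
    return prefix + secrets.token_hex(4)


def write_output(name: str, value: str) -> None:
    """Expose a step output to later steps and jobs in the workflow."""
    output_path = os.environ.get("GITHUB_OUTPUT")
    if output_path:
        # Newer runners: append name=value to the output file.
        with open(output_path, "a", encoding="utf-8") as fh:
            fh.write(f"{name}={value}\n")
    else:
        # Older runners: emit the legacy workflow command on stdout.
        print(f"::set-output name={name}::{value}")
```

Calling `write_output("label", new_label())` during `start` is what would make `steps.<id>.outputs.label` available to the workflow.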

# Setup

Assumes you have two pre-built AMIs:

AMD runner: Docker installed

ARM runner: Docker installed and daemon exposed on port `2375`

Steps to expose the Docker daemon:

```
sudo vi /etc/docker/daemon.json

{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]
}
```

Then edit the `ExecStart` line of the systemd unit so that it passes no `-H` flag, which would conflict with the `hosts` key in `daemon.json`:

```
sudo vi /lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd --containerd=/run/containerd/containerd.sock
```
```
sudo systemctl daemon-reload

sudo systemctl restart docker.service
```
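
Once the daemon has restarted, one way to confirm from the AMD runner that the ARM daemon is reachable is a plain TCP connection check. The sketch below is a hypothetical helper, not part of this action; it only verifies that the port accepts connections, not that a Docker daemon is answering on it.

```python
import socket


def daemon_reachable(host: str, port: int = 2375, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A failed check usually points at the security group rules or at the daemon not listening on `0.0.0.0:2375`.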
78 changes: 78 additions & 0 deletions .github/actions/ec2-runners/action.yml
@@ -0,0 +1,78 @@
name: GHA Buildx
description: Provision a self-hosted buildx cluster for GHA
inputs:
  action:
    description: >-
      - 'start' - deploy a new cluster
      - 'stop' - destroy a running cluster
    required: true

  amd_ami_id:
    description: >-
      AMI ID for the AMD instance
    required: false

  amd_instance_type:
    description: >-
      Instance Type for the AMD instance
    required: false

  arm_ami_id:
    description: >-
      AMI ID for the ARM instance
    required: false

  arm_instance_type:
    description: >-
      Instance Type for the ARM instance
    required: false

  gh_personal_access_token:
    description: >-
      GitHub Personal Access Token
    required: true

  label:
    description: >-
      Label applied to the created EC2 instances.
      This is required when running the 'stop' action.
    required: false

  mode:
    description: >-
      'buildx' - start a two node buildx cluster for multi-arch builds.
      'single' - start a single self-hosted AMD runner.
      Defaults to 'buildx'.
    required: false
    default: 'buildx'

  security_group_id:
    description: >-
      Must allow outbound traffic to connect to GitHub
    required: false

  subnet:
    description: >-
      Subnet to apply to the instances
    required: false

outputs:
  label:
    description: >-
      Random value generated when creating a new cluster.
      Used to make sure jobs only run on the clusters they create.

runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{ inputs.action }}
    - ${{ inputs.amd_ami_id }}
    - ${{ inputs.amd_instance_type }}
    - ${{ inputs.arm_ami_id }}
    - ${{ inputs.arm_instance_type }}
    - ${{ inputs.gh_personal_access_token }}
    - ${{ inputs.label }}
    - ${{ inputs.mode }}
    - ${{ inputs.security_group_id }}
    - ${{ inputs.subnet }}