Commit 5e789ba

Migrate CI to docker buildx and other improvements (#4765)
* Migrate CI to docker buildx and other improvements

## Motivation

- Improve build times in forks, especially when rerunning builds because of a flaky test.
- Start using `docker buildx` to pave the way for multiplatform builds.

## Performance improvements

These timings were taken for the `kind_integration.yml` workflow when we merged and reran the lodash bump PR (#4762).

Before these improvements:

- when merging: `24:18`
- when rerunning after merge (docker cache warm): `19:00`
- when running the same changes in a fork (no docker cache): `32:15`

After these improvements:

- when merging: `25:38`
- when rerunning after merge (docker cache warm): `19:25`
- when running the same changes in a fork (docker cache warm): `19:25`

As explained below, non-forks and forks now use the same cache, so the important takeaway is that forks will always start with a warm cache and we'll no longer see long build times like the `32:15` above.

The downside is a slight increase in the build times for non-forks (up to a little more than a minute, depending on the case).

## Build containers in parallel

The `docker_build` job in the `kind_integration.yml`, `cloud_integration.yml` and `release.yml` workflows relied on running `bin/docker-build`, which builds all the containers in sequence. Now each container is built in parallel using a matrix strategy.

## New caching strategy

CI now uses `docker buildx` for building the container images, which allows using an external cache source for builds, a location in the filesystem in this case. That location gets cached using actions/cache, using the key `${{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}` and the restore key `${{ runner.os }}-buildx-${{ matrix.target }}-`.

For example, when building the `web` container, its image and all the intermediary layers get cached under the key `Linux-buildx-web-git-abc0123`. When that has been cached in the `main` branch, that cache will be available to all the child branches, including forks.
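The key and restore key above can be sketched with a few shell helpers. This is only an illustration: the function names are hypothetical, the real lookup is performed by actions/cache rather than by any script in this repo, and the sketch ignores branch scoping. It assumes the documented matching rule: an exact key hit wins, otherwise the most recently created cache whose key starts with the restore key is used.

```shell
# Hypothetical helpers; they only model how the cache keys are composed and matched.

# Exact cache key for one matrix target, e.g. Linux-buildx-web-git-abc0123.
cache_key() {
  local os="$1" target="$2" tag="$3"
  printf '%s-buildx-%s-%s\n' "$os" "$target" "$tag"
}

# Restore key: the exact key without the tag, used as a prefix.
restore_key() {
  local os="$1" target="$2"
  printf '%s-buildx-%s-\n' "$os" "$target"
}

# resolve_cache KEY PREFIX STORED_KEY...
# STORED_KEY arguments are assumed to be ordered oldest first.
# Prints the key that would be restored; returns 1 on a cache miss.
resolve_cache() {
  local key="$1" prefix="$2"
  shift 2
  local stored match=""
  for stored in "$@"; do
    if [ "$stored" = "$key" ]; then
      printf '%s\n' "$stored"   # exact hit
      return 0
    fi
    case $stored in
      "$prefix"*) match=$stored ;;   # remember the newest prefix match
    esac
  done
  if [ -n "$match" ]; then
    printf '%s\n' "$match"
    return 0
  fi
  return 1
}

# A fork branch asks for a tag that has never been cached, while main has one:
resolve_cache "$(cache_key Linux web git-def456)" "$(restore_key Linux web)" \
  "Linux-buildx-web-git-abc0123"
# prints Linux-buildx-web-git-abc0123
```

Because the target name is part of both keys, each matrix job only ever falls back to a cache produced for the same container.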
If a new branch in a fork asks for a key like `Linux-buildx-web-git-def456`, the key won't be found during the first CI run, but the system falls back to the key `Linux-buildx-web-git-abc0123` from `main`, so the build will start with a warm cache (more info about how keys are matched in the [actions/cache docs](https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)).

## Packet host no longer needed

To benefit from the warm caches in both non-forks and forks, as just explained, we have to stop doing the builds in Packet; everything now runs in the GitHub runner VMs. As a result there's no longer separate logic for non-forks and forks in the workflow files; `kind_integration.yml` was greatly simplified, but `cloud_integration.yml` and `release.yml` got a little bigger in order to use the actions artifacts as a repository for the images built. This bloat will be fixed when support for [composite actions](https://github.com/actions/runner/blob/users/ethanchewy/compositeADR/docs/adrs/0549-composite-run-steps.md) lands in GitHub.

## Local builds

You are still able to run `bin/docker-build` or any of the `docker-build.*` scripts. To make use of buildx, run those same scripts after setting the env var `DOCKER_BUILDKIT=1`. Using buildx requires that you have installed it, as instructed [here](https://github.com/docker/buildx).

## Other

- A new script `bin/docker-cache-prune` is used to remove unused images from the cache. Without that the cache grows constantly and we can rapidly hit the 5GB limit (when the limit is reached, the oldest entries get evicted).
- The `go-deps` dockerfile base image was changed from `golang:1.14.2` (Debian based) to `golang:1.14.2-alpine`, also to conserve cache space.
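As a rough illustration of the local builds described above, the helper below prints the command a build script could run depending on `DOCKER_BUILDKIT`. The function `build_cmd` and the `web/Dockerfile` and `web:dev` arguments are hypothetical, and the way the real `bin/_docker.sh` branches is an assumption. Only the `docker build` and `docker buildx build` flags shown (`-f`, `-t`, `--load`, `--cache-from`, `--cache-to`) are actual CLI options.

```shell
# Hypothetical helper: prints, but does not run, the build command for one image.
# DOCKER_BUILDKIT_CACHE mirrors the variable name set in the workflows.
build_cmd() {
  local file="$1" tag="$2"
  if [ "${DOCKER_BUILDKIT:-}" != "1" ]; then
    # Classic builder: no external cache source.
    printf 'docker build . -f %s -t %s\n' "$file" "$tag"
    return 0
  fi
  # buildx: --load puts the result into the local image store.
  local cmd="docker buildx build . -f $file -t $tag --load"
  if [ -n "${DOCKER_BUILDKIT_CACHE:-}" ]; then
    # Read from and write to a cache directory on the filesystem.
    cmd="$cmd --cache-from type=local,src=$DOCKER_BUILDKIT_CACHE"
    cmd="$cmd --cache-to type=local,dest=$DOCKER_BUILDKIT_CACHE"
  fi
  printf '%s\n' "$cmd"
}

# Example: what a buildx-enabled local build of one image might look like.
(export DOCKER_BUILDKIT=1 DOCKER_BUILDKIT_CACHE=/tmp/.buildx-cache
 build_cmd web/Dockerfile web:dev)
```

Printing the command instead of running it keeps the sketch usable on machines without buildx installed.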
# Addressed separately in #4875:

Got rid of the `go-deps` image and instead added something similar on top of all the Dockerfiles dealing with `go`, as a first stage for those Dockerfiles. That continues to serve as a way to pre-populate go's build cache, which speeds up the builds in the subsequent stages. That stage should in theory be rebuilt automatically only when `go.mod` or `go.sum` change, and now we don't require running `bin/update-go-deps-shas`. That script was removed along with all the logic elsewhere that used it, including the `go_dependencies` job in the `static_checks.yml` GitHub workflow.

The list of preinstalled modules was moved from `Dockerfile-go-deps` to a new script `bin/install-deps`. I couldn't find a way to generate that list dynamically, so whenever a slow-to-compile dependency is found, we have to make sure it's included in that list.

Although this simplifies the dev workflow, note that the real motivation behind this was a limitation in buildx's `docker-container` driver that forbids us from depending on images that haven't been pushed to a registry, so we have to resort to building the dependencies as a first stage in the Dockerfiles.
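The "rebuilt only when `go.mod` or `go.sum` change" behaviour can be pictured with a small fingerprint helper. This is an analogy, not Docker's implementation: Docker invalidates a `COPY` layer when the checksum of the copied files changes, and the sketch assumes the first stage copies only `go.mod` and `go.sum`. The function name `deps_fingerprint` is hypothetical.

```shell
# Hypothetical helper: fingerprint of the two files a dependency stage would copy.
# Uses POSIX cksum; any content checksum would serve for the illustration.
deps_fingerprint() {
  local dir="$1"
  cat "$dir/go.mod" "$dir/go.sum" | cksum
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
printf 'module example.com/demo\n' > "$demo/go.mod"
: > "$demo/go.sum"
before=$(deps_fingerprint "$demo")

# Editing source files leaves the fingerprint alone, so the stage stays cached.
echo 'package main' > "$demo/main.go"
[ "$(deps_fingerprint "$demo")" = "$before" ] && echo "source change: deps stage reused"

# Touching go.sum changes it, so the stage would be rebuilt.
printf 'example.com/dep v1.0.0 h1:abc\n' >> "$demo/go.sum"
[ "$(deps_fingerprint "$demo")" != "$before" ] && echo "go.sum change: deps stage rebuilt"

rm -rf "$demo"
```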
1 parent 46d22f8 commit 5e789ba

25 files changed, with 219 additions and 377 deletions

.dockerignore

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@
 **/node_modules
 bin
 !bin/fetch-proxy
+!bin/install-deps
 !bin/web
 **/Dockerfile*
 Dockerfile*

.github/workflows/cloud_integration.yml

Lines changed: 40 additions & 52 deletions
@@ -8,72 +8,62 @@ on:
     - main
 env:
   GH_ANNOTATION: true
+  DOCKER_BUILDKIT: 1
 jobs:
-  # todo: Keep in sync with `release.yml`
   docker_build:
-    name: Docker build
     runs-on: ubuntu-18.04
+    strategy:
+      matrix:
+        target: [proxy, controller, web, cni-plugin, debug, cli-bin, grafana]
+    name: Docker build (${{ matrix.target }})
     steps:
     - name: Checkout code
       # actions/checkout@v2
       uses: actions/checkout@722adc6
-    - name: Setup SSH config for Packet
+    - name: Set environment variables from scripts
       run: |
-        mkdir -p ~/.ssh/
-        touch ~/.ssh/id && chmod 600 ~/.ssh/id
-        echo "${{ secrets.DOCKER_SSH_CONFIG }}" > ~/.ssh/config
-        echo "${{ secrets.DOCKER_PRIVATE_KEY }}" > ~/.ssh/id
-        echo "${{ secrets.DOCKER_KNOWN_HOSTS }}" > ~/.ssh/known_hosts
-        ssh linkerd-docker docker version
+        . bin/_tag.sh
+        echo ::set-env name=TAG::$(CI_FORCE_CLEAN=1 bin/root-tag)
+
+        . bin/_docker.sh
+        echo ::set-env name=DOCKER_REGISTRY::$DOCKER_REGISTRY
+        echo ::set-env name=DOCKER_BUILDKIT_CACHE::${{ runner.temp }}/.buildx-cache
+    - name: Cache docker layers
+
+      uses: actions/cache@b820478
+      with:
+        path: ${{ env.DOCKER_BUILDKIT_CACHE }}
+        key: ${{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}
+        restore-keys: |
+          ${{ runner.os }}-buildx-${{ matrix.target }}-
     - name: Build docker images
       env:
-        DOCKER_HOST: ssh://linkerd-docker
         DOCKER_TRACE: 1
       run: |
-        export PATH="`pwd`/bin:$PATH"
-        bin/docker-build
-  # todo: Keep in sync with `release.yml`
-  docker_push:
-    name: Docker push
-    runs-on: ubuntu-18.04
-    needs: [docker_build]
-    steps:
-    - name: Checkout code
-      # actions/checkout@v2
-      uses: actions/checkout@722adc6
-    - name: Set environment variables from scripts
-      run: |
-        . bin/_tag.sh
-        echo ::set-env name=TAG::$(CI_FORCE_CLEAN=1 bin/root-tag)
+        docker buildx create --driver docker-container --use
+        bin/docker-build-${{ matrix.target }}
     - name: Configure gcloud
 
       uses: linkerd/linkerd2-action-gcloud@308c4df
       with:
         cloud_sdk_service_account_key: ${{ secrets.CLOUD_SDK_SERVICE_ACCOUNT_KEY }}
         gcp_project: ${{ secrets.GCP_PROJECT }}
         gcp_zone: ${{ secrets.GCP_ZONE }}
-    - name: Docker SSH setup
-      run: |
-        mkdir -p ~/.ssh/
-        touch ~/.ssh/id && chmod 600 ~/.ssh/id
-        echo "${{ secrets.DOCKER_SSH_CONFIG }}" > ~/.ssh/config
-        echo "${{ secrets.DOCKER_PRIVATE_KEY }}" > ~/.ssh/id
-        echo "${{ secrets.DOCKER_KNOWN_HOSTS }}" > ~/.ssh/known_hosts
-        ssh linkerd-docker docker version
     - name: Push docker images to registry
-      env:
-        DOCKER_HOST: ssh://linkerd-docker
       run: |
-        export PATH="`pwd`/bin:$PATH"
-        bin/docker-push-deps
-        bin/docker-push $TAG
-        bin/docker-retag-all $TAG main
-        bin/docker-push main
+        . bin/_docker.sh
+        docker_push "${{ matrix.target }}" "$TAG"
+        docker_retag "${{ matrix.target }}" "$TAG" main
+        docker_push "${{ matrix.target }}" main
+    - name: Prune docker layers cache
+      # changes generate new images while the existing ones don't get removed
+      # so we manually do that to avoid bloating the cache
+      run: bin/docker-cache-prune
   # todo: Keep in sync with `release.yml`
   cloud_integration_tests:
     name: Cloud integration tests
     runs-on: ubuntu-18.04
-    needs: [docker_push]
+    needs: [docker_build]
     steps:
     - name: Checkout code
       # actions/checkout@v2
@@ -90,14 +80,14 @@ jobs:
       id: install_cli
       run: |
         TAG="$(CI_FORCE_CLEAN=1 bin/root-tag)"
-        image="gcr.io/linkerd-io/cli-bin:$TAG"
-        id=$(bin/docker create $image)
-        bin/docker cp "$id:/out/linkerd-linux" "$HOME/.linkerd"
-        "$HOME/.linkerd" version --client
+        CMD="$PWD/target/release/linkerd2-cli-$TAG-linux"
+        bin/docker-pull-binaries $TAG
+        $CMD version --client
         # validate CLI version matches the repo
-        [[ "$TAG" == "$($HOME/.linkerd version --short --client)" ]]
+        [[ "$TAG" == "$($CMD version --short --client)" ]]
         echo "Installed Linkerd CLI version: $TAG"
-        echo "::set-output name=tag::$TAG"
+        echo "::set-env name=CMD::$CMD"
+        echo "::set-env name=TAG::$TAG"
     - name: Create GKE cluster
 
       uses: linkerd/linkerd2-action-gcloud@308c4df
@@ -107,17 +97,15 @@ jobs:
         gcp_zone: ${{ secrets.GCP_ZONE }}
         preemptible: false
         create: true
-        name: testing-${{ steps.install_cli.outputs.tag }}-${{ github.run_id }}
+        name: testing-${{ env.TAG }}-${{ github.run_id }}
         num_nodes: 2
     - name: Run integration tests
       env:
         GITCOOKIE_SH: ${{ secrets.GITCOOKIE_SH }}
       run: |
-        export PATH="`pwd`/bin:$PATH"
         echo "$GITCOOKIE_SH" | bash
-        version="$($HOME/.linkerd version --client --short | tr -cd '[:alnum:]-')"
-        bin/tests --skip-kind-create "$HOME/.linkerd"
+        bin/tests --skip-kind-create "$CMD"
     - name: CNI tests
       run: |
-        export TAG="$($HOME/.linkerd version --client --short)"
+        export TAG="$($CMD version --client --short)"
         go test -cover -race -v -mod=readonly ./cni-plugin/test -integration-tests

.github/workflows/kind_integration.yml

Lines changed: 25 additions & 50 deletions
@@ -9,10 +9,14 @@ on:
     - main
 env:
   GH_ANNOTATION: true
+  DOCKER_BUILDKIT: 1
 jobs:
   docker_build:
-    name: Docker build
     runs-on: ubuntu-18.04
+    strategy:
+      matrix:
+        target: [proxy, controller, web, cni-plugin, debug, cli-bin, grafana]
+    name: Docker build (${{ matrix.target }})
     steps:
     - name: Checkout code
       # actions/checkout@v2
@@ -24,39 +28,35 @@ jobs:
 
         . bin/_docker.sh
         echo ::set-env name=DOCKER_REGISTRY::$DOCKER_REGISTRY
-    - name: Setup SSH config for Packet
-      if: github.event_name == 'push' || !github.event.pull_request.head.repo.fork
-      run: |
-        mkdir -p ~/.ssh/
-        touch ~/.ssh/id && chmod 600 ~/.ssh/id
-        echo "${{ secrets.DOCKER_SSH_CONFIG }}" > ~/.ssh/config
-        echo "${{ secrets.DOCKER_PRIVATE_KEY }}" > ~/.ssh/id
-        echo "${{ secrets.DOCKER_KNOWN_HOSTS }}" > ~/.ssh/known_hosts
-        ssh linkerd-docker docker version
-        echo ::set-env name=DOCKER_HOST::ssh://linkerd-docker
+        echo ::set-env name=DOCKER_BUILDKIT_CACHE::${{ runner.temp }}/.buildx-cache
+    - name: Cache docker layers
+
+      uses: actions/cache@b820478
+      with:
+        path: ${{ env.DOCKER_BUILDKIT_CACHE }}
+        key: ${{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}
+        restore-keys: |
+          ${{ runner.os }}-buildx-${{ matrix.target }}-
     - name: Build docker images
       env:
         DOCKER_TRACE: 1
       run: |
-        export PATH="`pwd`/bin:$PATH"
-        bin/docker-build
+        docker buildx create --driver docker-container --use
+        bin/docker-build-${{ matrix.target }}
+    - name: Prune docker layers cache
+      # changes generate new images while the existing ones don't get removed
+      # so we manually do that to avoid bloating the cache
+      run: bin/docker-cache-prune
     - name: Create artifact with CLI and image archives
       env:
         ARCHIVES: /home/runner/archives
       run: |
         mkdir -p $ARCHIVES
-
-        for image in proxy controller web cni-plugin debug cli-bin grafana; do
-          docker save "$DOCKER_REGISTRY/$image:$TAG" > $ARCHIVES/$image.tar || tee save_fail &
-        done
-
+        docker save "$DOCKER_REGISTRY/${{ matrix.target }}:$TAG" > $ARCHIVES/${{ matrix.target }}.tar
         # save windows cli into artifacts
-        cp -r ./target/cli/windows/linkerd $ARCHIVES/linkerd-windows.exe
-
-        # Wait for `docker save` background processes to complete. Exit early
-        # if any job failed.
-        wait < <(jobs -p)
-        test -f save_fail && exit 1 || true
+        if [ '${{ matrix.target }}' == 'cli-bin' ]; then
+          cp -r ./target/cli/windows/linkerd $ARCHIVES/linkerd-windows.exe
+        fi
     # `with.path` values do not support environment variables yet, so an
     # absolute path is used here.
     #
@@ -92,7 +92,6 @@ jobs:
     - name: Run CLI Integration tests
       run: |
         go test --failfast --mod=readonly ".\test\cli" --linkerd=$PWD\image-archives\linkerd-windows.exe --cli-tests -v
-  # todo: Keep in sync with `release.yml`
   kind_integration_tests:
     strategy:
       matrix:
@@ -127,31 +126,12 @@ jobs:
 
         . bin/_docker.sh
         echo ::set-env name=DOCKER_REGISTRY::$DOCKER_REGISTRY
-    - name: Setup SSH config for Packet
-      if: github.event_name == 'push' || !github.event.pull_request.head.repo.fork
-      run: |
-        mkdir -p ~/.ssh/
-        touch ~/.ssh/id && chmod 600 ~/.ssh/id
-        echo "${{ secrets.DOCKER_SSH_CONFIG }}" > ~/.ssh/config
-        echo "${{ secrets.DOCKER_PRIVATE_KEY }}" > ~/.ssh/id
-        echo "${{ secrets.DOCKER_KNOWN_HOSTS }}" > ~/.ssh/known_hosts
-    - name: Download image archives (Forked repositories)
-      if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.fork
+    - name: Download image archives
       # actions/download-artifact@v1
       uses: actions/download-artifact@18f0f59
       with:
         name: image-archives
     - name: Load cli-bin image into local docker images
-      if: github.event_name == 'push' || !github.event.pull_request.head.repo.fork
-      run: |
-        # `docker load` only accepts input from STDIN, so pipe the image
-        # archive into the command.
-        #
-        # In order to pipe the image archive, set `DOCKER_HOST` for a single
-        # command and `docker save` the CLI image from the Packet host.
-        DOCKER_HOST=ssh://linkerd-docker docker save "$DOCKER_REGISTRY/cli-bin:$TAG" | docker load
-    - name: Load cli-bin image into local docker images (Forked repositories)
-      if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.fork
       run: docker load < image-archives/cli-bin.tar
     - name: Install CLI
       run: |
@@ -162,10 +142,5 @@ jobs:
         # Validate the CLI version matches the current build tag.
         [[ "$TAG" == "$($HOME/.linkerd version --short --client)" ]]
     - name: Run integration tests
-      if: github.event_name == 'push' || !github.event.pull_request.head.repo.fork
-      run: |
-        bin/tests --images --images-host ssh://linkerd-docker --name ${{ matrix.integration_test }} "$HOME/.linkerd"
-    - name: Run integration tests (Forked repositories)
-      if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.fork
       run: |
         bin/tests --images --name ${{ matrix.integration_test }} "$HOME/.linkerd"
