The continuous integration (CI) and continuous delivery (CD) pipeline of
GATK-SV is developed on GitHub Actions.
The CI/CD pipeline is defined via multiple workflows, each of which is
a .yml file stored under the .github/workflows directory. The workflows
are triggered automatically when a pull request (PR) is issued or merged.
The workflows automate testing, building, and deploying the pipeline,
and they currently cover the following areas.
- Lint Python scripts (pytest.yaml): uses flake8 to check whether the
  Python scripts follow the PEP 8 style guide.
- Test, build, and publish Docker images using build_docker.py
  (sv_pipeline_docker.yml).
The GATK-SV Docker images are built and published using
the build_docker.py script, which
is documented in this README and can be
executed locally.
The Docker Images workflow (DIW) automates testing, building, and
publishing the GATK-SV Docker images using build_docker.py:
the images are built when a PR is issued against the repository,
and they are published to the Google Cloud Platform (GCP) Container
Registry (GCR) (at us.gcr.io) when the PR is merged.
The DIW consists of three jobs:
- Determine Build Args. This job determines the arguments to be used by
  the build_docker.py script, specifically:
  - Given the size and the number of GATK-SV Docker images, DIW builds and
    publishes only the Docker images affected by the changes introduced in
    a PR. Accordingly, the files changed between the HEAD and the BASE
    commits of the PR'd branch are first determined using git diff (for
    details, please refer to the in-line comments for the step
    Determine Commit SHAs in DIW), and then the affected images are
    determined. These images are used as the values of the --targets
    argument of the build_docker.py script.
  - A step composes a tag for the Docker images following the
    DATE-HEAD_SHA_8 template, where DATE is YYYYMMDD extracted from the
    timestamp of the last commit on the PR'd branch, and HEAD_SHA_8 is
    the first eight characters of its commit SHA. For instance,
    20211201-86fe06fd.
- Test Images Build. This job is triggered when a commit is pushed to the
  PR'd branch; it builds the Docker images determined by the
  Determine Build Args job. This job fails if building the Docker images
  is unsuccessful. The Docker images built by this job are not published
  to GCR and are discarded once the job succeeds.
- Publish. This job is triggered when a PR is merged or a commit is pushed
  to the main branch. Similar to the Test Images Build job, this job builds
  Docker images and fails if the build process is unsuccessful. In
  addition, this job pushes the built images to GCR. To authorize access
  to GCR, this job assumes a GCP service account (SA) with read and write
  access to the GCR registry. The secrets related to the SA are defined as
  encrypted environment secrets.
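The tag template and the selection of affected images described above can be sketched in Python as follows. This is a minimal illustration and not the workflow's actual implementation: the function names compose_image_tag and select_affected_targets, the image names, and the directory-to-image mapping are all hypothetical and chosen only for this example.

```python
from datetime import datetime, timezone


def compose_image_tag(commit_time: datetime, head_sha: str) -> str:
    """Return a tag following the DATE-HEAD_SHA_8 template."""
    if len(head_sha) < 8:
        raise ValueError("expected a commit SHA with at least eight characters")
    # DATE is YYYYMMDD; HEAD_SHA_8 is the first eight characters of the SHA.
    return f"{commit_time.strftime('%Y%m%d')}-{head_sha[:8]}"


def select_affected_targets(changed_files, image_dirs):
    """Return the sorted names of images whose directory contains a changed file.

    image_dirs is a hypothetical mapping from an image name to the directory
    prefix that holds its sources; the real mapping may differ.
    """
    affected = set()
    for path in changed_files:
        for image, prefix in image_dirs.items():
            if path.startswith(prefix.rstrip("/") + "/"):
                affected.add(image)
    return sorted(affected)


if __name__ == "__main__":
    # Hypothetical inputs: in the workflow, the commit time and SHA come from
    # the last commit on the PR'd branch, and the file list from git diff.
    commit_time = datetime(2021, 12, 1, 15, 30, tzinfo=timezone.utc)
    head_sha = "86fe06fd" + "0" * 32
    print(compose_image_tag(commit_time, head_sha))  # 20211201-86fe06fd

    changed = ["dockerfiles/image-a/Dockerfile", "README.md"]
    dirs = {"image-a": "dockerfiles/image-a", "image-b": "dockerfiles/image-b"}
    print(select_affected_targets(changed, dirs))  # ['image-a']
```

The list returned by select_affected_targets plays the role of the values passed to the --targets argument, and the composed tag is what distinguishes images built from different commits.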
_This section describes configuring the Deploy environment to be used
by the Publish job and is intended for the edification of repository admins._
An SA is used to authorize DIW to access GCR. (A future extension may
adopt OpenID Connect [OIDC]-based authentication and authorization.)
To assume the SA, the Publish job needs the SA secrets (e.g., private
key and client email) and the project name. This information is defined
in a GitHub environment and is exposed to the Publish job as encrypted
environment secrets. The encrypted secrets are decrypted in the
environment context and are not exposed to user code, provided that
best practices are followed.
GitHub's environment secrets are a subset of repository-wide secrets,
which are in turn a subset of organization-level secrets. We store the
SA credentials as environment secrets because environments allow pausing
the execution of any job that accesses them until it is approved by
assigned individuals.
In order to set up the Deploy environment, you may take the following steps:
- Create an SA on GCP IAM. For simplicity, you may assign the service
  account the Editor role. However, to follow the principle of least
  privilege, you may instead assign Storage Object Admin,
  Storage Legacy Bucket Writer, and Storage Object Viewer as the minimum
  required permissions (ref).
- Get the service account's keys by going to the Service Accounts page,
  selecting the above-created service account, and opening the KEYS tab.
  Then click the ADD KEY button and choose Create new key. In the pop-up
  window, select the JSON type and click the CREATE button. This downloads
  a JSON file containing the secrets required to assume the service account.
- Base64-encode the service account's JSON secrets as follows:

      openssl base64 -in service-account.json -out service-account.txt

- Create an environment following these steps and name the environment
  Deploy.
- Create the following two encrypted secrets in the Deploy environment
  using these steps:
  - name: GCP_PROJECT_ID; value: the ID of the GCP project under which you
    will use the GCR registry.
  - name: GCP_GCR_SA_KEY; value: the above-created base64 encoding of the
    SA's secrets. After you set this encrypted secret, we recommend that
    you delete both the .json and .txt files containing the SA's secrets.
- [Optional] Under Environment protection rules on the Deploy
  environment's configuration page, you may check the Required reviewers
  checkbox and assign maintainers who can approve the execution of the
  instances of the jobs that require access to the Deploy environment.
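For reference, the Base64 encoding step above can also be reproduced and checked in Python. The sketch below is only an illustration under stated assumptions: the helper names encode_sa_key and decode_sa_key are hypothetical, the key contents are dummy values, and this is not the code that the Publish job runs. It relies on Python's standard base64 module, whose default decoder ignores the line breaks that tools such as openssl insert into their output.

```python
import base64
import json


def encode_sa_key(key_json: str) -> str:
    """Base64-encode the text of a service-account key file (line-wrapped output)."""
    return base64.encodebytes(key_json.encode("utf-8")).decode("ascii")


def decode_sa_key(encoded: str) -> dict:
    """Decode a Base64 string back into the key's JSON fields."""
    # b64decode discards newlines by default, so wrapped input is accepted.
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


if __name__ == "__main__":
    # Dummy stand-in for service-account.json; never commit or print a real key.
    dummy_key = json.dumps({
        "type": "service_account",
        "project_id": "example-project",
        "client_email": "diw@example-project.iam.gserviceaccount.com",
        "private_key": "-----BEGIN PRIVATE KEY-----\nnot-a-real-key\n-----END PRIVATE KEY-----\n",
    })
    encoded = encode_sa_key(dummy_key)
    decoded = decode_sa_key(encoded)
    # A successful round trip confirms the encoded value is intact.
    print(decoded["client_email"])
```

Decoding the stored value once before deleting the local .json and .txt files is a cheap way to confirm that the secret was not truncated when it was copied.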
Once the Deploy environment is set up, and the Required reviewers
option under the section Environment protection rules is checked,
with every push to the main branch (e.g., merging a PR), the
DIW execution will pause at the Publish job with the following
message:
Waiting for review: Deploy needs approval to start deploying changes.
If enabled, any Required reviewers will see the following
additional link that they can click to approve or reject running the
job.
Review pending deployments