chore: add rudimentary docs on the QEMU artifact
darora committed Nov 27, 2024
1 parent 36510ea commit 03eb5cf
Showing 3 changed files with 102 additions and 8 deletions.
3 changes: 3 additions & 0 deletions Makefile
@@ -14,6 +14,9 @@ disk/focal-raw.img: output-cloudimg/packer-cloudimg
container-disk-image: output-cloudimg/packer-cloudimg
docker build . -t supabase-postgres-test:$(GIT_SHA) -f ./Dockerfile-kubevirt

eks-node-container-disk-image: output-cloudimg/packer-cloudimg
sudo nerdctl build . -t supabase-postgres-test:$(GIT_SHA) --namespace k8s.io -f ./Dockerfile-kubevirt

host-disk: disk/focal-raw.img
sudo chown 107 -R disk

12 changes: 4 additions & 8 deletions ebssurrogate/scripts/qemu-bootstrap-nix.sh
@@ -1,10 +1,4 @@
#!/usr/bin/env bash
#
# This script creates filesystem and setups up chrooted
# enviroment for further processing. It also runs
# ansible playbook and finally does system cleanup.
#
# Adapted from: https://github.com/jen20/packer-ubuntu-zfs

set -o errexit
set -o pipefail
@@ -41,9 +35,8 @@ tee /etc/ansible/ansible.cfg <<EOF
callbacks_enabled = timer, profile_tasks, profile_roles
EOF
# Run Ansible playbook
#export ANSIBLE_LOG_PATH=/tmp/ansible.log && export ANSIBLE_DEBUG=True && export ANSIBLE_REMOTE_TEMP=/mnt/tmp
export ANSIBLE_LOG_PATH=/tmp/ansible.log && export ANSIBLE_REMOTE_TEMP=/mnt/tmp
ansible-playbook ./ansible/playbook.yml --extra-vars '{"nixpkg_mode": true, "debpkg_mode": false, "stage2_nix": false}' # $ARGS - I think this is being not passed in correctly
ansible-playbook ./ansible/playbook.yml --extra-vars '{"nixpkg_mode": true, "debpkg_mode": false, "stage2_nix": false}'
}

function setup_postgesql_env {
@@ -80,7 +73,10 @@ setup_postgesql_env
setup_locale
execute_playbook

####################
# stage 2 things
####################

function install_nix() {
sudo su -c "curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install --no-confirm \
--extra-conf \"substituters = https://cache.nixos.org https://nix-postgres-artifacts.s3.amazonaws.com\" \
95 changes: 95 additions & 0 deletions qemu_artifact.md
@@ -0,0 +1,95 @@
# QEMU artifact

We build a container image that contains a QEMU qcow2 disk image. This container image can be used with KubeVirt's [containerDisk](https://kubevirt.io/user-guide/storage/disks_and_volumes/#containerdisk) functionality to boot up VMs off the qcow2 image.

Container images are a convenient mechanism to ship the disk image to the nodes where they're needed.

Given the size of the image, the first VM using it on a node might take a while to come up, while the image is being pulled down. The image can be pre-fetched to avoid this; we might also switch to other deployment mechanisms in the future.
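As a rough sketch of pre-fetching (the image reference is a placeholder; use your published containerDisk image), the image can be pulled on each node via the CRI runtime before any VM references it:

```shell
# Pre-pull the containerDisk image so the first VM on this node
# doesn't have to wait for the (large) image pull.
# registry.example.com/... is a hypothetical image name.
sudo crictl pull registry.example.com/supabase-postgres-test:latest
```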

# Building QEMU artifact

## Creating a bare-metal instance

We launch an Ubuntu 22.04 bare-metal instance; we use the `c6g.metal` instance type in this case, but any ARM bare-metal instance type is sufficient for our purposes.

```shell
aws ec2 create-security-group --group-name "launch-wizard-1" --description "launch-wizard-1 created 2024-11-26T00:32:56.039Z" --vpc-id "vpc-0fbfcc428751ce76b"
aws ec2 authorize-security-group-ingress --group-id "sg-preview-1" --ip-permissions '{"IpProtocol":"tcp","FromPort":22,"ToPort":22,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}'
aws ec2 run-instances --image-id "ami-0a87daabd88e93b1f" --instance-type "c6g.metal" --key-name "darora-aps1" --block-device-mappings '{"DeviceName":"/dev/sda1","Ebs":{"Encrypted":false,"DeleteOnTermination":true,"Iops":3000,"SnapshotId":"snap-0fe84a34403e3da8b","VolumeSize":200,"VolumeType":"gp3","Throughput":125}}' --network-interfaces '{"AssociatePublicIpAddress":true,"DeviceIndex":0,"Groups":["sg-preview-1"]}' --tag-specifications '{"ResourceType":"instance","Tags":[{"Key":"Name","Value":"darora-pg-image"}]}' --metadata-options '{"HttpEndpoint":"enabled","HttpPutResponseHopLimit":2,"HttpTokens":"required"}' --private-dns-name-options '{"HostnameType":"ip-name","EnableResourceNameDnsARecord":true,"EnableResourceNameDnsAAAARecord":false}' --count "1"
```
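Once `run-instances` returns, something along these lines (the instance ID is a placeholder; use the `InstanceId` from the `run-instances` output) waits for the instance to come up and prints the hostname to SSH into:

```shell
# Placeholder instance ID; substitute the InstanceId returned above.
aws ec2 wait instance-running --instance-ids i-0123456789abcdef0
aws ec2 describe-instances --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[0].Instances[0].PublicDnsName' --output text
```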

## Install deps

On the instance, install the dependencies we require for producing QEMU artifacts:

```shell
sudo apt-get update
sudo apt-get install -y qemu-system qemu-system-arm qemu-utils qemu-efi-aarch64 libvirt-clients libvirt-daemon libqcow-utils software-properties-common git make libnbd-bin nbdkit fuse2fs cloud-image-utils awscli
sudo usermod -aG kvm ubuntu
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=arm64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install packer=1.11.2-1
sudo apt-get install -y docker.io
```
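A quick sanity check that the toolchain landed (exact version strings will vary):

```shell
# Verify the key tools are on PATH before kicking off a build.
qemu-system-aarch64 --version
packer --version
nbdkit --version
```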


Some dev deps that might be useful:

```shell
sudo apt-get install -y emacs ripgrep vim-tiny byobu
```


## Clone repo and build

Log out and log back in first to pick up the new `kvm` group membership!

```shell
git clone https://github.com/supabase/postgres.git
cd postgres
git checkout da/qemu-rebasing # choose appropriate branch here
make init container-disk-image
```
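To confirm the build produced the expected artifacts, you can check for the packer output and the tagged container image (the tag is the current git SHA, per the Makefile):

```shell
ls -lh output-cloudimg/packer-cloudimg   # qcow2 disk image produced by packer
docker images supabase-postgres-test     # containerDisk image tagged with the git SHA
```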

### Build process

The current AMI process involves a few steps:

1. The nix package is built and published via GHA (`.github/workflows/nix-build.yml`)
   - this builds Postgres along with the PG extensions we use.
2. "stage1" build (`amazon-arm64-nix.pkr.hcl`, invoked via `.github/workflows/ami-release-nix.yml`)
   - uses an upstream Ubuntu image to initialize the AMI
   - installs and configures the majority of the software that gets shipped as part of the AMI (e.g. gotrue, postgrest, ...)
3. "stage2" build (`stage2-nix-psql.pkr.hcl`, invoked via `.github/workflows/ami-release-nix.yml`)
   - uses the image published from (2)
   - installs and configures the software that is built and published using nix in (1)
   - cleans up build dependencies etc.

The QEMU artifact process collapses (2) and (3):

a. The nix package is built and published via GHA (`.github/workflows/nix-build.yml`)
b. packer build (`qemu-arm64-nix.pkr.hcl`)
   - uses an upstream Ubuntu live image as the base
   - performs the work that was previously split across the "stage1" and "stage2" builds
   - this work is executed via `ebssurrogate/scripts/qemu-bootstrap-nix.sh`

## Publish image for later use

Publish the built image to a registry of your choosing, and use the published image with KubeVirt.
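For example (the registry name is hypothetical; the local tag follows the Makefile's `GIT_SHA` convention):

```shell
# Tag and push the locally built image to your registry.
# registry.example.com is a placeholder for your actual registry.
GIT_SHA=$(git rev-parse HEAD)
docker tag "supabase-postgres-test:${GIT_SHA}" "registry.example.com/supabase-postgres-test:${GIT_SHA}"
docker push "registry.example.com/supabase-postgres-test:${GIT_SHA}"
```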


# Iterating on the QEMU artifact

For a tighter iteration loop on the Postgres artifact, the recommended workflow is to do so on an Ubuntu bare-metal node that's part of the EKS cluster that you're deploying to.

- Use the `host-disk` make target to build the raw image file on disk. (`/path/to/postgres/disk/focal-raw.img`)
- Update the VM spec to use `hostDisk` instead of `containerDisk`. Note that only one VM can use an image at a time, so you can't create multiple VMs backed by the same host disk.
- Enable the `HostDisk` feature flag for KubeVirt
- Deploy the VM to the node
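Regarding the feature flag step above: KubeVirt feature gates live in the KubeVirt custom resource. Assuming the default `kubevirt` resource in the `kubevirt` namespace, a patch along these lines enables `HostDisk`:

```shell
# NOTE: a merge patch replaces the featureGates list wholesale;
# include any gates that are already enabled on your cluster.
kubectl patch kubevirt kubevirt -n kubevirt --type merge -p \
  '{"spec":{"configuration":{"developerConfiguration":{"featureGates":["HostDisk"]}}}}'
```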

Additionally, to iterate on the container image itself, you can build the image on the bare-metal node (`eks-node-container-disk-image` target) rather than publishing it to ECR or a similar registry. However, this build can take a while, so iterating via host disks remains the fastest dev loop.

## Dependencies note

Installing `docker.io` on an EKS node might interfere with the k8s setup of the node. You can instead install `nerdctl` and `buildkit`:

```shell
curl -L -O https://github.com/containerd/nerdctl/releases/download/v2.0.0/nerdctl-2.0.0-linux-arm64.tar.gz
tar -xzf nerdctl-2.0.0-linux-arm64.tar.gz
sudo mv ./nerdctl /usr/local/bin/
curl -O -L https://github.com/moby/buildkit/releases/download/v0.17.1/buildkit-v0.17.1.linux-arm64.tar.gz
tar -xzf buildkit-v0.17.1.linux-arm64.tar.gz
sudo mv bin/* /usr/local/bin/
```

You'll need to run the buildkit daemon: `sudo buildkitd`
