Description
We've been hitting storage issues on the aarch64 multi-arch builder lately, which cause our builds to fail with messages similar to, but not limited to, the following:
[2024-07-17T16:24:06.840Z] Committing 01fcos: /home/jenkins/agent/workspace/build-arch/src/config/overlay.d/01fcos ... error: Writing content object: min-free-space-percent '3%' would be exceeded, at least 4.1 kB requested
OR
OSError: [Errno 28] No space left on device:
OR
qemu-img: error while writing at byte 2859466752: No space left on device
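For context, the first error comes from ostree's min-free-space-percent safeguard: it refuses new writes once the filesystem holding the repo would drop below the configured free-space threshold (3% by default). That knob lives in the repo's config file, roughly like this (illustrative snippet, not copied from the builder):

[core]
min-free-space-percent=3

Tuning that wouldn't really help here, though; the disk genuinely was full, as the other two errors show.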
I was able to log into the aarch64 builder today as the builder user and I found /sysroot
at 100% usage.
core@coreos-aarch64-builder:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p4 200G 200G 2.2M 100% /sysroot
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 126G 200K 126G 1% /dev/shm
...
...
I freed up some space today by running podman volume prune
after noticing that most of the storage was being taken up by unused podman volumes.
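For anyone hitting this in the future, podman itself can report where the space is going. Something along these lines should give a per-volume breakdown (the du path assumes the default rootless storage location for the builder user):

podman system df
podman system df --verbose
du -sh ~/.local/share/containers/storage/volumes/*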
builder@coreos-aarch64-builder:~$ podman volume prune
WARNING! This will remove all volumes not used by at least one container. The following volumes will be removed:
04ca0c2da268f19d45440991aebc0ca9f2518c09f2a0dcdbeae66cccc563a521
11e3d74469587125fd71ce12e2d84cf6210363e1ce50c432e5ac0da098089a2b
164a592f879a706839806895605af1b1e599c82a54d7a7e9cd1b11421f4201bb
f5fa83bd6c333d4e302f180c5aa838217c2cb41e98186b98ddaf2b92d83022bc
Are you sure you want to continue? [y/N] y
04ca0c2da268f19d45440991aebc0ca9f2518c09f2a0dcdbeae66cccc563a521
11e3d74469587125fd71ce12e2d84cf6210363e1ce50c432e5ac0da098089a2b
164a592f879a706839806895605af1b1e599c82a54d7a7e9cd1b11421f4201bb
f5fa83bd6c333d4e302f180c5aa838217c2cb41e98186b98ddaf2b92d83022bc
builder@coreos-aarch64-builder:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p4 200G 92G 109G 46% /sysroot
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 126G 400K 126G 1% /dev/shm
efivarfs 512K 4.6K 508K 1% /sys/firmware/efi/efivars
tmpfs 51G 9.9M 51G 1% /run
tmpfs 126G 0 126G 0% /tmp
/dev/nvme0n1p3 350M 265M 62M 82% /boot
tmpfs 26G 452K 26G 1% /run/user/1001
tmpfs 26G 60K 26G 1% /run/user/1002
tmpfs 26G 16K 26G 1% /run/user/1000
Hopefully this will be mitigated once we redeploy the multi-arch builders on AWS and increase the disk size from 200GB to at least 600GB. Landing coreos/fedora-coreos-pipeline#986 isn't required for that redeploy, but it would make it much easier. Regardless, it might be worth exploring whether we can reduce or prevent the accumulation of dangling volumes on the builders; one possible approach is sketched below.
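As a rough sketch (assuming the builders run systemd and that unattended pruning of unused volumes is acceptable), a user-level timer for the builder account could prune dangling volumes on a schedule. The unit names and cadence below are placeholders:

~/.config/systemd/user/podman-volume-prune.service
[Unit]
Description=Prune unused podman volumes

[Service]
Type=oneshot
ExecStart=/usr/bin/podman volume prune --force

~/.config/systemd/user/podman-volume-prune.timer
[Unit]
Description=Weekly prune of unused podman volumes

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target

Enabled with something like:

loginctl enable-linger builder
systemctl --user enable --now podman-volume-prune.timer

The obvious risk is pruning a volume that an in-flight build still expects, so the pipeline's own cleanup logic may be a better home for this than a blind timer.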