
aarch64 multi-arch builds fail due to no disk space left on builder #1554

@marmijo

Description

We've been hitting storage issues on the aarch64 multi-arch builder lately, which causes our builds to fail with errors similar to (but not limited to) the following:

[2024-07-17T16:24:06.840Z] Committing 01fcos: /home/jenkins/agent/workspace/build-arch/src/config/overlay.d/01fcos ... error: Writing content object: min-free-space-percent '3%' would be exceeded, at least 4.1 kB requested

OR

OSError: [Errno 28] No space left on device: 

OR

qemu-img: error while writing at byte 2859466752: No space left on device
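
For context on the first error: ostree refuses to write new objects once free space would drop below the repo's min-free-space-percent threshold. That threshold lives in the ostree repo config; the snippet below only illustrates where the knob is and is not the builder's actual config:

# <repo>/config (illustrative, not the builder's actual repo config)
[core]
repo_version=1
mode=bare-user
min-free-space-percent=3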

I was able to log into the aarch64 builder today as the builder user and I found /sysroot at 100% usage.

core@coreos-aarch64-builder:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p4  200G  200G  2.2M 100% /sysroot
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           126G  200K  126G   1% /dev/shm
...
...

I freed up some space today by running podman volume prune after noticing that most of the storage was being consumed by dangling podman volumes.

builder@coreos-aarch64-builder:~$ podman volume prune
WARNING! This will remove all volumes not used by at least one container. The following volumes will be removed:
04ca0c2da268f19d45440991aebc0ca9f2518c09f2a0dcdbeae66cccc563a521
11e3d74469587125fd71ce12e2d84cf6210363e1ce50c432e5ac0da098089a2b
164a592f879a706839806895605af1b1e599c82a54d7a7e9cd1b11421f4201bb
f5fa83bd6c333d4e302f180c5aa838217c2cb41e98186b98ddaf2b92d83022bc
Are you sure you want to continue? [y/N] y
04ca0c2da268f19d45440991aebc0ca9f2518c09f2a0dcdbeae66cccc563a521
11e3d74469587125fd71ce12e2d84cf6210363e1ce50c432e5ac0da098089a2b
164a592f879a706839806895605af1b1e599c82a54d7a7e9cd1b11421f4201bb
f5fa83bd6c333d4e302f180c5aa838217c2cb41e98186b98ddaf2b92d83022bc
builder@coreos-aarch64-builder:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p4  200G   92G  109G  46% /sysroot
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           126G  400K  126G   1% /dev/shm
efivarfs        512K  4.6K  508K   1% /sys/firmware/efi/efivars
tmpfs            51G  9.9M   51G   1% /run
tmpfs           126G     0  126G   0% /tmp
/dev/nvme0n1p3  350M  265M   62M  82% /boot
tmpfs            26G  452K   26G   1% /run/user/1001
tmpfs            26G   60K   26G   1% /run/user/1002
tmpfs            26G   16K   26G   1% /run/user/1000
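
For anyone hitting this again, a couple of commands that should show what's eating the space before reaching for a prune (a sketch of what I'd run, not output I captured today):

# Summarize how much disk podman images, containers, and volumes are using
builder@coreos-aarch64-builder:~$ podman system df

# List only the volumes no container references, i.e. the prune candidates
builder@coreos-aarch64-builder:~$ podman volume ls --filter dangling=true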


Hopefully this will be mitigated once we redeploy the multi-arch builders on AWS and increase the disk size from 200GB to at least 600GB. Redeploying the builder doesn't strictly require it, but landing coreos/fedora-coreos-pipeline#986 would make it much easier. Even so, it's probably worth exploring whether we can reduce or prevent the accumulation of dangling volumes on the builders; a rough sketch of one option follows below.
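
One option, purely as a sketch (the unit names and schedule here are made up, not something we deploy today), would be a small per-user systemd timer on the builders that force-prunes unused volumes on a schedule, ideally at a time when no build is in flight:

# ~builder/.config/systemd/user/podman-volume-prune.service (hypothetical unit)
[Unit]
Description=Prune dangling podman volumes

[Service]
Type=oneshot
ExecStart=/usr/bin/podman volume prune --force

# ~builder/.config/systemd/user/podman-volume-prune.timer (hypothetical unit)
[Unit]
Description=Periodically prune dangling podman volumes

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target

This would be enabled for the builder user with systemctl --user enable --now podman-volume-prune.timer (plus loginctl enable-linger builder so it runs without a login session). The main caveat is that volume prune removes anything not currently attached to a container, so the schedule would need to avoid racing an in-progress build.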
