Intermittent segmentation fault running Node emulated with target arm64, host amd64 #215
Comments
I stumbled on a similar problem in a different context and ecosystem: docker/setup-qemu-action#188. We have a Ruby app that depends on quite a few "native" extension packages, that is, packages that have to be built with gcc. Targeting we have consistent A similar thing happened with a Go project that needed some code built with gcc. I also managed to reproduce it locally with qemu v7. What worked was updating to qemu v8; give it a try and see if it helps. |
Thanks @smoke. This repo only supports up to v7 though right? In the meantime I will try and get a stack, I can see it has been removed in the core dump logout above. |
I've got a similar thing with regular segfaults in gcc while building C or Go applications. It appears that when I create an image with a more recent version of qemu (8.1.5) which is the latest in the repo and use that to install the emulators, the segfault does not happen. Is there a reason that the later versions (later than v7) have not been pushed to dockerhub as the latest version? |
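For anyone wanting to try the same thing on a GitHub Actions runner, a minimal sketch of installing the emulators from a pinned binfmt image might look like the steps below. It assumes the `tonistiigi/binfmt:qemu-v8.1.5` tag mentioned elsewhere in this thread and an arm64 target; the step names are made up for illustration.

```yaml
# Sketch only: install the emulator from a pinned binfmt image instead of :latest.
- name: Install arm64 emulator from pinned image
  run: docker run --privileged --rm tonistiigi/binfmt:qemu-v8.1.5 --install arm64
# Print the binfmt and QEMU versions shipped in that image, to confirm what was installed.
- name: Show versions in the pinned image
  run: docker run --privileged --rm tonistiigi/binfmt:qemu-v8.1.5 --version
```

Pinning the tag keeps the QEMU version from changing underneath a build, at the cost of having to bump it by hand.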
Hitting similar issues when compiling php extensions while building a base image on top of neither v7.0.0 nor 8.1.5 are working in this case. |
When you say locally, do you mean outside of a container, just on the CPU? I've just tried to reproduce with simply It could be a (Edited, realised I'm running into this on both ubuntu 20.04 and 22.04) |
I've also started hitting segmentation faults today when building a multi-platform Docker image on GitHub Actions with a In my case, the segmentation faults occur when compiling the CPP application (that is deployed via the Docker image) during the build of the
The error occurs at random steps in the compilation process. I've had no issues building this Docker image hundreds of times over the past year until today; the last successful build with the same code revision before the segmentation faults started happening was at 00:00 GMT today (23.01.2025), so it looks like something must've changed/been updated since then (or I somehow got lucky before and didn't hit the issue; that seems a bit unlikely though, considering I've had a segmentation fault in every single one of the six workflow runs I did today).
Edit: Using QEMU 8.x (as suggested by @smoke) instead of 7.x seems to work for me:
- name: Set up QEMU
  uses: docker/setup-qemu-action@v3
  with:
    image: tonistiigi/binfmt:qemu-v8.1.5
|
This works around the compilation issues observed with QEMU v7 (which is still used by docker/setup-qemu-action by default) as described in tonistiigi/binfmt#215.
This works around the compilation issues observed with QEMU 7.x (which is still used by docker/setup-qemu-action by default) as described in tonistiigi/binfmt#215.
@ajbarber I have used locally |
I'm also encountering this. I'm using the github qemu action to build a variety of ruby versions for different architectures. Since the last 3 days, Some of the errors I encountered
I run my actions on ubuntu-24.04, and coincidentally there was a runner image update right when this started. I compared builds and with It updates buildx from
|
By limiting it to the only used platform we hopefully can work around issues where files magically matched the ppc64 handler.
Example output from dmesg:
segfault at 116643c0 ip 00000000004fa380 sp 00007ffe80c32758 error 4 in qemu-ppc64-static[fa380,401000+340000] likely on CPU 6 (core 0, socket 0)
Xref: tonistiigi/binfmt#215
Signed-off-by: Felix Moessbauer <[email protected]>
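If the emulators are set up through docker/setup-qemu-action, one way to express this kind of restriction is the action's `platforms` input. This is a sketch under the assumption that arm64 is the only emulated target; the commit above does not say which mechanism it used.

```yaml
- name: Set up QEMU
  uses: docker/setup-qemu-action@v3
  with:
    # Register only the arm64 handler instead of the default "all".
    platforms: arm64
```

With fewer handlers registered, a file cannot be picked up by an emulator for an architecture the build never targets.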
This fixes a qemu build issue observed on the (non versioned) tonistiigi/binfmt:latest@sha256:f6b82a01e1... qemu-user-static deploy image.
Example failure as observed in the GitHub Actions:
Traceback (most recent call last):
  File "/usr/bin/py3compile", line 323, in <module>
    main()
  File "/usr/bin/py3compile", line 302, in main
    compile(files, versions,
  File "/usr/bin/py3compile", line 187, in compile
    cfn = interpreter.cache_file(fn, version)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/python3/debpython/interpreter.py", line 212, in cache_file
    (fname[:-3], self.magic_tag(version), last_char))
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/python3/debpython/interpreter.py", line 246, in magic_tag
    return self._execute('import imp; print(imp.get_tag())', version)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/python3/debpython/interpreter.py", line 359, in _execute
    raise Exception('{} failed with status code {}'.format(command, output['returncode']))
Exception: ('python3.11', '-c', 'import imp; print(imp.get_tag())') failed with status code -11
Xref: tonistiigi/binfmt#215
Signed-off-by: Felix Moessbauer <[email protected]>
Update: So having excluded qemu by running their binary directly, now I suspect the issue is somewhere within |
@ajbarber After some digging, I'm pretty sure this issue relates to a kernel hardening change. This also explains why various qemu versions are affected. More details can be found in this Debian bug: [1]. The bug first appeared after [2] was applied; that change was later reverted, and the revert was itself reverted [3] once a fix for QEMU was available in Debian. Ubuntu probably included just the kernel patch (the revert of the revert) but not the QEMU patch, which then broke things again. |
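To compare runner environments when chasing a kernel-side change like this, a diagnostic step along these lines can be added to a workflow. It is only a sketch; the handler name `qemu-aarch64` is an assumption about how the emulator was registered.

```yaml
- name: Print kernel version and binfmt_misc handler details
  run: |
    uname -r
    ls /proc/sys/fs/binfmt_misc/
    # Handler name is an assumption; adjust it to whatever the listing shows.
    cat /proc/sys/fs/binfmt_misc/qemu-aarch64 || true
```

Recording this output in both passing and failing runs makes it easier to tell a kernel update apart from a change in the emulator image.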
This apparently fixes sporadic crashes of arm64 image builds, see also [1] and [2]. Ubuntu's version of qemu-user does not seem to have this fixed yet either, therefore inject the current Debian package. In addition, this moves away from the floating docker.io/tonistiigi/binfmt:latest that docker/setup-qemu-action@v3 uses. This loose coupling is questionable, not only in the light of this issue. [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087822 [2] tonistiigi/binfmt#215 Co-Developed-by: Felix Moessbauer <[email protected]> Signed-off-by: Jan Kiszka <[email protected]>
Thanks very much @fmoessbauer. Reading some of the materials you linked, the qemu segfault in question depends not only on the release version but also on the configuration flags passed at build time. The qemu maintainers clearly say not to configure with binfmt/scripts/configure_qemu.sh Line 65 in 85908cc
To confirm @fmoessbauer's hypothesis, I also replicated the crash/no crash behaviour of I think qemu was forward patched in 8.1 to deal with things either way: https://gitlab.com/qemu-project/qemu/-/issues/1763#note_1508827541 So we need either to remove the line above in @tonistiigi do you accept PRs? |
Yes. Do you have example repro as well for this case? |
**Context:** The [aarch64 wheel build CI action has been failing](https://github.com/PennyLaneAI/pennylane-lightning/actions/workflows/wheel_linux_aarch64.yml) since circa 24 Jan 2025. They fail with a segmentation fault during the CIBW process. This has also been observed for similar wheel builds with QEMU with other repositories:
docker/setup-qemu-action#188
ssciwr/clang-format-wheel#124
tonistiigi/binfmt#215
tonistiigi/binfmt#165
and fix attempt: ssciwr/clang-format-wheel#125
It is due to using an old version (v7) of qemu that comes with binfmt. `setup-qemu-action` by default uses `binfmt:latest` image which has not been updated in 2 years.
**Description of the Change:** Use a newer QEMU image (v8) from binfmt.
**Benefits:** aarch64 wheel builds will succeed again, [e.g.](https://github.com/PennyLaneAI/pennylane-lightning/actions/runs/13019772888?pr=1056)
**Possible Drawbacks:**
**Related GitHub Issues:** [sc-83297]
---------
Co-authored-by: ringo-but-quantum <[email protected]>
Co-authored-by: Ali Asadi <[email protected]>
Reverting to
- name: Setup QEMU
  uses: docker/setup-qemu-action@53851d14592bedcffcf25ea515637cff71ef929a # v3.3.0
  with:
    image: tonistiigi/binfmt:qemu-v8.1.5-43
|
Reverting from what version? |
For me, using |
for me unfortunately doesn't work |
@ajbarber From @cagnulein in my case the GH runner image is |
Thanks @stefanprodan, I also tried this combination, but it always segfaults for me. |
This worked for me. As @stefanprodan pointed out, the default
|
relates to #165 (comment) |
Same issue unfortunately
|
This is a workaround for an upstream issue with QEMU and binfmt on 24.04 tonistiigi/binfmt#215
Avoid segfault probably related to tonistiigi/binfmt#215
Hello, thanks for binfmt.
When running
on a Dockerfile which installs node on ubuntu,
node --version
in the container build intermittently segfaults. I captured it happening with QEMU_STRACE=1
as below.
For comparison, I also provide a log of
node --version
working at the same point where the crash occurred above; the problem is intermittent.
Systemd coredump:
Dockerfile: