Allow specifying libcxx builder image. #110303
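@llvm/pr-subscribers-lldb @llvm/pr-subscribers-github-workflow
Author: Eric (EricWF)
Changes
This change attempts to shift the libc++ builders over to new backend infrastructure that allows running an arbitrary container for the libc++ job.
This has been a long time in the making, and support from GitHub and GKE is finally at the point where it's possible (hopefully).
This change should also demonstrate another important property: No Downtime Upgrades.
If this goes well, we'll be able to test the upgrade as a part of the PR process, and then committing it to main should (ideally) not break anything.
Full diff: https://github.com/llvm/llvm-project/pull/110303.diff
1 Files Affected: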
diff --git a/.github/workflows/libcxx-build-and-test.yaml b/.github/workflows/libcxx-build-and-test.yaml
index b5e60781e00064..64855dad7197da 100644
--- a/.github/workflows/libcxx-build-and-test.yaml
+++ b/.github/workflows/libcxx-build-and-test.yaml
@@ -49,7 +49,8 @@ env:
jobs:
stage1:
if: github.repository_owner == 'llvm'
- runs-on: libcxx-runners-8-set
+ runs-on: libcxx-runners-set
+ container: ghcr.io/libcxx/actions-builder:testing-2024-09-21
continue-on-error: false
strategy:
fail-fast: false
@@ -84,7 +85,8 @@ jobs:
**/crash_diagnostics/*
stage2:
if: github.repository_owner == 'llvm'
- runs-on: libcxx-runners-8-set
+ runs-on: libcxx-runners-set
+ container: ghcr.io/libcxx/actions-builder:testing-2024-09-21
needs: [ stage1 ]
continue-on-error: false
strategy:
@@ -160,20 +162,21 @@ jobs:
'benchmarks',
'bootstrapping-build'
]
- machine: [ 'libcxx-runners-8-set' ]
+ machine: [ 'libcxx-runners-set' ]
include:
- config: 'generic-cxx26'
- machine: libcxx-runners-8-set
+ machine: libcxx-runners-set
- config: 'generic-asan'
- machine: libcxx-runners-8-set
+ machine: libcxx-runners-set
- config: 'generic-tsan'
- machine: libcxx-runners-8-set
+ machine: libcxx-runners-set
- config: 'generic-ubsan'
- machine: libcxx-runners-8-set
+ machine: libcxx-runners-set
# Use a larger machine for MSAN to avoid timeout and memory allocation issues.
- config: 'generic-msan'
- machine: libcxx-runners-8-set
+ machine: libcxx-runners-set
runs-on: ${{ matrix.machine }}
+ container: ghcr.io/libcxx/actions-builder:testing-2024-09-21
steps:
- uses: actions/checkout@v4
- name: ${{ matrix.config }}
|
This is amazing! About the tests, I am not certain why the transitive includes test started failing, but I ran into something similar in #109720. I think this may be how we're running |
✅ With the latest revision this PR passed the C/C++ code formatter. |
e6e801a to 055dc12
Since we don't generate a full dependency graph of headers, we can greatly simplify the script that parses the result of --trace-includes. At the same time, we also unify the mechanism for detecting whether a header is a public/C compat/internal/etc header with the existing mechanism in header_information.py. As a drive-by this fixes the headers_in_modulemap.sh.py test which had been disabled by mistake because it used its own way of determining the list of libc++ headers. By consistently using header_information.py to get that information, problems like this shouldn't happen anymore. This should also unblock #110303, which was blocked because of a brittle implementation of the transitive includes check which broke when the repository was cloned at a path like /path/__something/more.
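To illustrate the kind of brittleness described above, here is a minimal sketch. It is an assumption about how a path-sensitive check can go wrong, not the actual libc++ script: the function names and the "leading double underscore means internal" rule are hypothetical. Classifying a header from its absolute path misfires when the checkout directory itself contains a `__` component, while classifying it relative to the include root does not.

```python
from pathlib import PurePosixPath


def is_internal_naive(header_path: str) -> bool:
    """Hypothetical brittle check: looks at every component of the absolute path."""
    return any(part.startswith("__") for part in PurePosixPath(header_path).parts)


def is_internal_relative(header_path: str, include_root: str) -> bool:
    """Hypothetical robust check: only looks at components below the include root."""
    relative = PurePosixPath(header_path).relative_to(include_root)
    return any(part.startswith("__") for part in relative.parts)


if __name__ == "__main__":
    # Assumed layout: the repository is cloned under a directory named "__something".
    root = "/path/__something/more/libcxx/include"
    print(is_internal_naive(root + "/vector"))
    print(is_internal_relative(root + "/vector", root))
    print(is_internal_relative(root + "/__algorithm/sort.h", root))
```

Centralizing this decision in one helper, as the commit does with header_information.py, means the rule only has to be right in one place.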
How do I constantly fudge up my git history.... Fixing and force-pushing shortly. |
Np! Haven't merged it yet though. Just waiting for CI to pass |
4b01f56 to ac41555
Hmm am I reading this right that the latest run still failed, despite the cherry-pick?
Ignore me |
When running in constrained environments like Docker, disabling ASLR might fail with errors like:
```
AssertionError: False is not true : launch failed (Cannot launch '/__w/.../lldb-dap/stackTrace/subtleFrames/TestDAP_subtleFrames.test_subtleFrames/a.out': personality set failed: Operation not permitted)
```
E.g., #110303
Hence we already run `settings set target.disable-aslr false` as part of the init-commands for the non-DAP tests (see #88312 and https://discourse.llvm.org/t/running-lldb-in-a-container/76801). But we never adjusted it for the DAP tests. As a result we get conflicting test logs like:
```
{
  "arguments": {
    "commandEscapePrefix": null,
    "disableASLR": true,
    ....
    "initCommands": [
      ...
      "settings set target.disable-aslr false",
```
Disabling ASLR by default in tests isn't useful (it's only really a debugging aid for users). So this patch sets `disableASLR=False` by default.
FYI, had to adjust the flag in one other place. Feel free to rebase the branch on |
ac41555 to c557438
It looks like it's still failing with the latest run :-( |
Argh that's unfortunate. How about we skip this test in libc++ CI to unblock this PR and I'll open a github issue to re-enable the test? @EricWF It's probably easiest if you just add llvm-project/libcxx/utils/ci/run-buildbot Line 397 in 8c4bc1e
But if you prefer me doing it separately, let me know. |
I have concerns about using the |
That's fair. In that case, @walter-erquinigo @clayborg Do you have any ideas on how to best debug this? Summary: the
Our theory was that this happened when trying to disable ASLR, so we're no longer doing that for the DAP tests. But we're still failing with the above. I'll try to raise a draft PR that mimics this but with some additional LLDB logging. |
This change attempts to shift the libc++ builders over to new backend infrastructure that allows running an arbitrary container for the libc++ job. This has been a long time in the making, and support from GitHub and GKE is finally at the point where it's possible (hopefully). This change should also demonstrate another important property: No Downtime Upgrades. If this goes well, we'll be able to test the upgrade as a part of the PR process, and then committing it to main should (ideally) not break anything.
ed532af to 947e12d
Hmm so I opened a draft PR with this change and explicitly set With server patch: https://github.com/llvm/llvm-project/actions/runs/11552969549/job/32154860810?pr=113891 So it does look like this is still |
Ooh that's because it's hardcoded in the llvm-project/lldb/tools/lldb-dap/lldb-dap.cpp Lines 2103 to 2104 in f147437
Fix should be simple enough. Just need to always pass the |
More context can be found in llvm#110303.
For DAP tests running in constrained environments (e.g., Docker containers), disabling ASLR isn't allowed. So we set `disableASLR=False` (since llvm#113593).
However, `dap_server.py` will currently only forward the value of `disableASLR` to the DAP executable if it's set to `True`. If the DAP executable wasn't provided a `disableASLR` field, it defaults to `true` too (https://github.com/llvm/llvm-project/blob/f14743794587db102c6d1b20f9c87a1ac20decfd/lldb/tools/lldb-dap/lldb-dap.cpp#L2103-L2104).
This means that passing `disableASLR=False` from the tests is currently not possible. This is also true for many of the other boolean arguments of `request_launch`. But this patch only addresses `disableASLR` for now since it's blocking a libc++ patch.
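The interaction described in this commit message can be modelled with a small sketch. Everything here is a simplified stand-in and an assumption: the function names are hypothetical and this is not the real `dap_server.py` or `lldb-dap.cpp` code. It only shows why "forward the field only when truthy" plus "default to true when the field is absent" makes `False` impossible to express.

```python
def build_launch_args_truthy_only(disableASLR=True):
    """Hypothetical client: forwards the field only when it is truthy."""
    args = {}
    if disableASLR:
        args["disableASLR"] = disableASLR
    return args


def build_launch_args_always(disableASLR=True):
    """Hypothetical client after the fix: always forwards the field."""
    return {"disableASLR": disableASLR}


def effective_disable_aslr(launch_args):
    """Hypothetical server side: a missing field defaults to true."""
    return launch_args.get("disableASLR", True)


if __name__ == "__main__":
    # The test asks for ASLR to stay enabled (disableASLR=False) in both cases.
    print(effective_disable_aslr(build_launch_args_truthy_only(False)))
    print(effective_disable_aslr(build_launch_args_always(False)))
```

With the truthy-only client the `False` value is dropped and the server default wins, which matches the behaviour reported in the thread. Always sending the field lets the test's choice reach the server.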
Just merged the fix. Let me know if you're still facing issues after the rebase |
@Michael137 Thanks for addressing this. I really appreciate it. |
@EricWF The CI failures are unrelated issues on I'll let you merge this since you likely want to adjust the capacity and other stuff before or closely after you merge, but as far as I'm concerned this is good to go. Thanks a whole lot for this improvement! |
@ldionne Squashed and merged. I'll be watching the bots closely |