
Segmentation fault on GitHub's macos-14 runner #8858

Closed

rami3l opened this issue Aug 9, 2024 · 3 comments
Assignees: tim-finnigan
Labels: bug (This issue is a bug.) · p2 (This is a standard priority issue) · response-requested (Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.)

Comments

rami3l commented Aug 9, 2024

Describe the bug

Hello there! Rustup maintainer here.

We use AWS S3 to manage our releases, and recently, when running aws --debug s3 cp --recursive with the aws CLI preinstalled on GitHub Actions runners, the command sometimes fails with a segfault.

The original Zulip post reporting this issue: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Strange.20error.20with.20Rustup's.20release.20process.20on.20macOS.20ARM64.

Expected Behavior

The command runs without issue and exits successfully.

Current Behavior

aws --debug s3 cp --recursive deploy/ s3://rustup-builds/5b5ec92726932b280e2f6e60f17df50644803a68
2024-08-06 13:56:38,328 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/2.17.18 Python/3.11.9 Darwin/23.5.0 exe/x86_64
2024-08-06 13:56:38,329 - MainThread - awscli.clidriver - DEBUG - Arguments entered to CLI: ['--debug', 's3', 'cp', '--recursive', 'deploy/', 's3://rustup-builds/5b5ec92726932b280e2f6e60f17df50644803a68']
2024-08-06 13:56:38,351 - MainThread - botocore.hooks - DEBUG - Event building-command-table.main: calling handler <function add_s3 at 0x112c41120>
[..]
2024-08-06 13:56:38,524 - ThreadPoolExecutor-1_0 - s3transfer.utils - DEBUG - Acquiring 0
2024-08-06 13:56:38,524 - ThreadPoolExecutor-0_9 - s3transfer.tasks - DEBUG - CompleteMultipartUploadTask(transfer_id=0, {'bucket': 'rustup-builds', 'key': '5b5ec92726932b280e2f6e60f17df50644803a68/dist/aarch64-apple-darwin/rustup-init', 'extra_args': {}}) about to wait for the following futures [<s3transfer.futures.ExecutorFuture object at 0x11370b890>, <s3transfer.futures.ExecutorFuture object at 0x11369cd10>, <s3transfer.futures.ExecutorFuture object at 0x1137156d0>]
2024-08-06 13:56:38,524 - ThreadPoolExecutor-0_9 - s3transfer.tasks - DEBUG - CompleteMultipartUploadTask(transfer_id=0, {'bucket': 'rustup-builds', 'key': '5b5ec92726932b280e2f6e60f17df50644803a68/dist/aarch64-apple-darwin/rustup-init', 'extra_args': {}}) about to wait for <s3transfer.futures.ExecutorFuture object at 0x11370b890>
2024-08-06 13:56:38,524 - ThreadPoolExecutor-1_0 - s3transfer.utils - DEBUG - Releasing acquire 0/None
/Users/runner/work/_temp/5fd0079c-2c9a-42bc-9d2f-99ce5c32b24d.sh: line 1:  3767 Segmentation fault: 11  aws --debug s3 cp --recursive deploy/ s3://rustup-builds/5b5ec92726932b280e2f6e60f17df50644803a68
Error: Process completed with exit code 139.

https://github.com/rust-lang/rustup/actions/runs/10267911140/job/28409617969

Reproduction Steps

Run the command on the macos-14 runner; it fails quite often.
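
Roughly speaking, a minimal sketch of the kind of setup that triggers it for us looks like this (the bucket name, prefix, and file sizes below are illustrative, not our actual release layout; AWS credentials are assumed to be configured in the environment):

# Create files large enough to exceed the multipart upload threshold (8 MB by
# default), then copy them recursively with debug logging enabled.
mkdir -p deploy
dd if=/dev/urandom of=deploy/blob-1 bs=1m count=64
dd if=/dev/urandom of=deploy/blob-2 bs=1m count=64
aws --debug s3 cp --recursive deploy/ s3://<your-bucket>/<some-prefix>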

Possible Solution

This could also be a problem with https://github.com/actions/runner-images updating to a faulty version of AWS CLI.

Please note that, according to its install script, the repo above installs the AWS CLI like so:

echo "Installing aws..."
awscliv2_pkg_path=$(download_with_retry "https://awscli.amazonaws.com/AWSCLIV2.pkg")
sudo installer -pkg "$awscliv2_pkg_path" -target /

https://github.com/actions/runner-images/blob/22143c7c6811f8936d42f49d964007df427788cb/images/macos/scripts/build/install-aws-tools.sh#L9-L11
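
If it helps, the architecture the preinstalled CLI actually runs as can be checked with something like the following (a rough sketch; exact paths may differ on the runner image):

# Show the architecture(s) of the installed aws binary (-L follows the symlink).
file -L "$(command -v aws)"
# The CLI also reports its own architecture; "exe/x86_64" on an arm64 host
# means it is running under Rosetta 2 translation.
aws --version
# 1 means the calling process is translated by Rosetta 2, 0 means native;
# -in suppresses the error on Intel Macs where the key doesn't exist.
sysctl -in sysctl.proc_translated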

Additional Information/Context

This doesn't seem to reproduce on other runners in our CI, so it could be ARM64 macOS-specific:

Current runner version: '2.317.0'
Operating System
  macOS
  14.5
  23F79
Runner Image
  Image: macos-14-arm64
  Version: 20240728.1
  Included Software: https://github.com/actions/runner-images/blob/macos-14-arm64/20240728.1/images/macos/macos-14-arm64-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/macos-14-arm64%2F20240728.1
Runner Image Provisioner
  2.0.374.1+4097a9592d27ce71de414581a65bffbda888dd1b

CLI version used

aws-cli/2.17.18 Python/3.11.9 Darwin/23.5.0 exe/x86_64 [sic]

Environment details (OS name and version, etc.)

macOS ARM64 14.5

@rami3l rami3l added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 9, 2024
@tim-finnigan tim-finnigan self-assigned this Aug 14, 2024
tim-finnigan (Contributor) commented Aug 14, 2024

Thanks for reaching out. I was not able to reproduce this issue. You mentioned that "it will fail quite often"; could you provide more details on those failures? Can you provide steps for reproducing this issue consistently?

@tim-finnigan tim-finnigan added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Aug 14, 2024
rami3l (Author) commented Aug 15, 2024

> Thanks for reaching out. I was not able to reproduce this issue. You mentioned that "it will fail quite often"; could you provide more details on those failures? Can you provide steps for reproducing this issue consistently?

@tim-finnigan Sorry for my late update!

As far as I know, it's really just a regular aws s3 cp call that involves multipart uploads, which is why I found this bug extremely confusing.

After a few days of investigation on my side, I've noticed that this problem is only reproducible when running an x64 version of the AWS CLI under Rosetta 2; after switching to the native version, the problem seems to be gone: rust-lang/rustup#3989
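
For completeness, the workaround looks roughly like this (a hypothetical sketch, not the exact change in rust-lang/rustup#3989; it assumes the official pkg installs a native or universal arm64-capable build when run on Apple Silicon, and that /usr/local/bin/aws is the resulting symlink):

# Reinstall the AWS CLI from the official installer instead of relying on the
# preinstalled copy, then verify the architecture of the resulting binary.
curl --retry 5 -fsSL "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o /tmp/AWSCLIV2.pkg
sudo installer -pkg /tmp/AWSCLIV2.pkg -target /
file -L /usr/local/bin/aws   # expect an arm64 (or universal) Mach-O, not x86_64 only
aws --version                # the arch tag should no longer read exe/x86_64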

I'm not sure why GitHub Actions is shipping the x64 version of AWS CLI though. Is that the expected behavior with https://awscli.amazonaws.com/AWSCLIV2.pkg?

Anyway, since this might not be a supported scenario, my best bet would be asking GitHub instead 🤔

Have a nice day :)

rami3l closed this as not planned (won't fix, can't repro, duplicate, stale) on Aug 15, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
