-
Notifications
You must be signed in to change notification settings - Fork 35
scylla_image_setup: avoid script error when perftune.py failed #787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scylla_image_setup: avoid script error when perftune.py failed #787
Conversation
|
@syuu1228 please rebase to latest |
|
|
||
| LOGGER = logging.getLogger(__name__) | ||
|
|
||
| def disable_perftune(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@syuu1228 don't we need perftune running ? what is the impact of disabling it?
|
@syuu1228 please rebase , it seems you 2 additional commits that don't suppose to be here |
7b3c918 to
c780e71
Compare
|
Rebased with latest next |
78ebc60 to
7dba512
Compare
adbb50a to
ab18dcf
Compare
|
Added code to handle special exit code, which is added scylla_sysconfig_setup on scylladb/scylladb#26344 |
|
next-machine-image passed: https://jenkins.scylladb.com/view/master/job/scylla-master/job/releng-testing/job/next-machine-image/544/ |
|
@avikivity Please review this as well, this is scylla-machine-image part of scylladb/seastar#2956 |
@avikivity ping review |
| except subprocess.CalledProcessError as e: | ||
| if e.returncode == 3: | ||
| disable_perftune() | ||
| LOGGER.warning('Failed to enable perftune.py, continue without using it') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yaronkaikov one problem this change may cause is that our tests will miss these failures.
We may need to introduce a specific check that this warning isn't there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have talked with @syuu1228 about this, we should probably raise this message only for the known instance types with the issues we know of, other wise when we hit a regression we would never find it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented the limitation. Now it avoid backtrace only on im4gn.8xlarge and im4gn.16xlarge, which are the target of this patch. On other instance type or on other IaaS, it just causes backtrace.
ec7b1dc to
14dd0ae
Compare
| except subprocess.CalledProcessError as e: | ||
| if e.returncode == 3: | ||
| disable_perftune() | ||
| LOGGER.warning('Failed to enable perftune.py, continue without using it') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have talked with @syuu1228 about this, we should probably raise this message only for the known instance types with the issues we know of, other wise when we hit a regression we would never find it
0feeab4 to
43b1037
Compare
|
next-machine-image passed: https://jenkins.scylladb.com/view/master/job/scylla-master/job/releng-testing/job/next-machine-image/577/ |
|
@gmizrahi Please test this PR, here's AMI ID and version tag: |
|
@yaronkaikov please review updated patch |
43b1037 to
498c324
Compare
The merge-base changed after approval.
Switch from print() to logging API on scylla_image_setup.
On EC2, some environments cause errors on perftune.py due to corrupted NUMA topology information. Even in such an environment, scylla_image_setup should not cause a script error. It should handle the error, print a warning message, and then continue the startup process. Related scylladb/seastar#2925
498c324 to
ba96671
Compare
@avikivity ping |
|
longevity failed, but the error is known issue so we can ignore this (described at #712 (comment)): https://jenkins.scylladb.com/view/master/job/scylla-master/job/releng-testing/job/longevity/job/longevity-100gb-4h-test/110/ |
|
@yaronkaikov all tests passed, I think we can merge this |
On EC2, some environments cause errors on perftune.py due to corrupted
NUMA topology information.
Even in such an environment, scylla_image_setup should not cause a script error.
It should handle the error, print a warning message, and then continue the startup process.
Related scylladb/seastar#2925
This is scylla-machine-image part of the fix for scylladb/seastar#2925