-
Notifications
You must be signed in to change notification settings - Fork 843
Fix flake in TestWaitForEvent/WaitForEvent_build_block_after_re-org
#4789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Yacov Manevich <[email protected]>
Signed-off-by: Yacov Manevich <[email protected]>
This reverts commit ef3c797.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses a flaky test by increasing the context timeout in TestWaitForEvent from 2 to 5 seconds. The test was experiencing intermittent failures on overloaded machines because the 2-second timeout was too close to the internal 2-second wait period, causing the context to expire prematurely in edge cases.
- Increased timeout from 2 to 5 seconds to provide sufficient buffer for the internal wait operation
- Prevents context deadline expiration on slower/overloaded systems
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
TestWaitForEvent/WaitForEvent_build_block_after_re-org
|
Closes #4757 |
JonathanOppenheimer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change the test timeout in coreth as well?
Signed-off-by: Yacov Manevich <[email protected]>
maru-ava
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super smelly to rely on tight timing instead of state validation, so hopefully this goes away at some point in the not-too-distant future.
Why this should be merged
Fixes the flake by giving more time. The wait for event internally waits for 2 seconds and the deadline for the context is also 2 seconds which is borderline in times where the machine is overloaded and the context may expire too soon even though it should've succeeded.
How this works
Increases the timeout.
How this was tested
Commented out the other tests and ran the test 1000 times in parallel in a loop. Can be seen here.
Need to be documented in RELEASES.md?