-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixup Executor::spin_all() regression fix #2517
Conversation
Checks if executors are busy waiting while they should block in spin_some or spin_all. Signed-off-by: Janosch Machowinski <[email protected]>
This test was strange. It looked like, it assumed that spin_all did not return instantly. Also it was racy, as the thread could terminate instantly. Signed-off-by: Janosch Machowinski <[email protected]>
…available Signed-off-by: Janosch Machowinski <[email protected]>
Co-authored-by: Tomoya Fujita <[email protected]> Signed-off-by: jmachowinski <[email protected]>
Signed-off-by: Janosch Machowinski <[email protected]>
Before, the method would not recollect available work in case of spin_some, spin_all. This would lead to the method behaving differently than to what the documentation states. Signed-off-by: Janosch Machowinski <[email protected]>
Signed-off-by: William Woodall <[email protected]>
Signed-off-by: William Woodall <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some notes for anyone reviewing this.
// clear the wait result and wait for work without blocking to collect the work | ||
// for the first time | ||
// both spin_some and spin_all wait for work at the beginning | ||
wait_result_.reset(); | ||
wait_for_work(std::chrono::milliseconds(0)); | ||
bool just_waited = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change wasn't part of the original report (busy wait in spin_all) but it is related and is needed. It essentially ensures that we don't start out spin_some or spin_all by iterating over existing work from a previous spin method. Ideally the other spin methods would reset the result before returning so that this wouldn't be necessary, but this is safer. We do want to do this because in the case of spin_some this could cause it to never wait, iterating over old work and exiting after. This isn't really what we want.
Also, it is safe to throw away a wait result and wait again as all the things we wait on need to be able to be waited on, be ready, not be handled, be waited on again, and then be ready again. Or else they need to be ok with not being handled. Put another way, being waited on and then being ready does not guarantee being executed, and it may require being waited on again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall change and test looks good. My only nit is that we need to probably put some of your comments here in the file for posterity.
Signed-off-by: William Woodall <[email protected]>
I added a comment in 8587653, so if the Rpr job comes back green I'll merge without new CI, since it was just a comment and Rpr checks style and linting. |
I will then follow up with a jazzy backport. |
@Mergifyio backport jazzy |
✅ Backports have been created
|
* test(Executors): Added tests for busy waiting Checks if executors are busy waiting while they should block in spin_some or spin_all. Signed-off-by: Janosch Machowinski <[email protected]> * fix: Reworked spinAll test This test was strange. It looked like, it assumed that spin_all did not return instantly. Also it was racy, as the thread could terminate instantly. Signed-off-by: Janosch Machowinski <[email protected]> * fix(Executor): Fixed spin_all not returning instantly is no work was available Signed-off-by: Janosch Machowinski <[email protected]> * Update rclcpp/test/rclcpp/executors/test_executors.cpp Co-authored-by: Tomoya Fujita <[email protected]> Signed-off-by: jmachowinski <[email protected]> * test(executors): Added test for busy waiting while calling spin Signed-off-by: Janosch Machowinski <[email protected]> * fix(executor): Reset wait_result on every call to spin_some_impl Before, the method would not recollect available work in case of spin_some, spin_all. This would lead to the method behaving differently than to what the documentation states. Signed-off-by: Janosch Machowinski <[email protected]> * restore previous test logic for now Signed-off-by: William Woodall <[email protected]> * refactor spin_some_impl's logic and improve busy wait tests Signed-off-by: William Woodall <[email protected]> * added some more comments about the implementation Signed-off-by: William Woodall <[email protected]> --------- Signed-off-by: Janosch Machowinski <[email protected]> Signed-off-by: jmachowinski <[email protected]> Signed-off-by: William Woodall <[email protected]> Co-authored-by: Janosch Machowinski <[email protected]> Co-authored-by: jmachowinski <[email protected]> Co-authored-by: Tomoya Fujita <[email protected]> (cherry picked from commit f7185dc)
* test(Executors): Added tests for busy waiting Checks if executors are busy waiting while they should block in spin_some or spin_all. Signed-off-by: Janosch Machowinski <[email protected]> * fix: Reworked spinAll test This test was strange. It looked like, it assumed that spin_all did not return instantly. Also it was racy, as the thread could terminate instantly. Signed-off-by: Janosch Machowinski <[email protected]> * fix(Executor): Fixed spin_all not returning instantly is no work was available Signed-off-by: Janosch Machowinski <[email protected]> * Update rclcpp/test/rclcpp/executors/test_executors.cpp Co-authored-by: Tomoya Fujita <[email protected]> Signed-off-by: jmachowinski <[email protected]> * test(executors): Added test for busy waiting while calling spin Signed-off-by: Janosch Machowinski <[email protected]> * fix(executor): Reset wait_result on every call to spin_some_impl Before, the method would not recollect available work in case of spin_some, spin_all. This would lead to the method behaving differently than to what the documentation states. Signed-off-by: Janosch Machowinski <[email protected]> * restore previous test logic for now Signed-off-by: William Woodall <[email protected]> * refactor spin_some_impl's logic and improve busy wait tests Signed-off-by: William Woodall <[email protected]> * added some more comments about the implementation Signed-off-by: William Woodall <[email protected]> --------- Signed-off-by: Janosch Machowinski <[email protected]> Signed-off-by: jmachowinski <[email protected]> Signed-off-by: William Woodall <[email protected]> Co-authored-by: Janosch Machowinski <[email protected]> Co-authored-by: jmachowinski <[email protected]> Co-authored-by: Tomoya Fujita <[email protected]> (cherry picked from commit f7185dc) Co-authored-by: William Woodall <[email protected]>
} | ||
EXPECT_GT(this->waitable->get_count(), 0u); | ||
|
||
first_check_passed = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to lock cv_m while accessing the share state (e.g. first_check_passed). If you don't do this, you get race conditions
cv.wait_for(lk, 10s); | ||
} | ||
EXPECT_GT(this->waitable->get_count(), 0u); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is racy / does not really check for busy waits. You need to wait a bit here, to give a busy waiting executor time to call is_ready a lot of times. As the test is implemented right now, it could pass even if the executor is busy waiting.
This is a follow up of #2509 which addresses #2508.