Restart caclmgrd whenever catch exception in child thread or in main thread #194
base: master
…thread Signed-off-by: Zhaohui Sun <[email protected]>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
scripts/caclmgrd:858
- The parameter name 'exception_queue' is clear and appropriate.
def check_and_update_control_plane_acls(self, namespace, num_changes, exception_queue):
scripts/caclmgrd:1055
- The exception handling logic is correctly implemented and ensures that the process is terminated if an exception occurs in any child thread.
namespace, error, _ = exception_queue.get_nowait()
scripts/caclmgrd
Outdated
msg = traceback.format_exception(exc_type, exc_value, exc_traceback)
for tb_line in msg:
    for tb_line_split in tb_line.splitlines():
        self.log_error(tb_line_split)
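For reference, the logging loop in the snippet above can be tried outside the daemon. This is a minimal sketch, not the PR's code: `format_exception_lines` is a hypothetical helper that returns the lines instead of passing them to `self.log_error`, and the `ZeroDivisionError` is only a stand-in failure.

```python
import sys
import traceback


def format_exception_lines(exc_type, exc_value, exc_traceback):
    """Return the formatted traceback as a flat list of single lines.

    Hypothetical helper: caclmgrd logs each line with self.log_error
    instead of collecting them.
    """
    lines = []
    # format_exception returns strings that may each contain several
    # newline-separated lines, so split them before logging.
    for tb_line in traceback.format_exception(exc_type, exc_value, exc_traceback):
        lines.extend(tb_line.splitlines())
    return lines


try:
    1 / 0  # stand-in failure for the demo
except ZeroDivisionError:
    logged = format_exception_lines(*sys.exc_info())
```

Splitting on line boundaries keeps each log record to one line, which is presumably why the snippet uses the nested loop.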
The main thread still has to keep checking for the next DB updates, so it can't join the child thread, wait for its result, and handle the exception there. That's why I chose an exception queue: the main thread only checks whether the queue is empty each time before it starts checking for DB updates.
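The exception-queue approach described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `worker` and `poll_for_child_failure` are hypothetical names, the namespace `asic0` is made up, and the layout of the queued tuple (namespace, error, traceback text) is only inferred from the `namespace, error, _ = exception_queue.get_nowait()` line quoted in the review.

```python
import queue
import threading
import traceback


def worker(namespace, exception_queue):
    """Hypothetical child-thread body that reports failures to the main thread."""
    try:
        raise RuntimeError("simulated failure in " + namespace)
    except Exception as e:
        # Hand the failure over instead of letting the thread die silently.
        exception_queue.put((namespace, repr(e), traceback.format_exc()))


def poll_for_child_failure(exception_queue):
    """Non-blocking check; returns the queued tuple, or None if the queue is empty."""
    try:
        return exception_queue.get_nowait()
    except queue.Empty:
        return None


exception_queue = queue.Queue()
nothing_yet = poll_for_child_failure(exception_queue)

t = threading.Thread(target=worker, args=("asic0", exception_queue))
t.start()
t.join()  # joined here only to make the demo deterministic

failure = poll_for_child_failure(exception_queue)
```

In the daemon, the main loop would not join the thread. It would call the non-blocking check once per iteration, before looking for DB updates, so it never stalls waiting on a child.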
@qiluo-msft could you please help review? Thanks.
Description
If an exception occurs in a child thread of caclmgrd, the whole caclmgrd service gets stuck.
There is no way to recover until the caclmgrd service is restarted manually.
With this change, when caclmgrd detects an exception in a child thread or in the main thread, it kills its own process so that the systemd service restarts it.
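A rough sketch of the self-termination step is shown below. It rests on assumptions: the description does not say which signal is used, so `SIGKILL` is a guess, `terminate_self` is a hypothetical name, and the restart itself depends on the service unit being configured to restart the process, which is not shown on this page. The `kill` and `getpid` parameters exist only so the function can be exercised without ending the calling process.

```python
import os
import signal


def terminate_self(kill=os.kill, getpid=os.getpid, sig=signal.SIGKILL):
    """Send a signal to the current process so the service manager can restart it.

    Hypothetical helper; the signal choice is an assumption, not taken
    from the PR.
    """
    kill(getpid(), sig)


# Dry run with a fake kill function, so nothing is actually terminated.
recorded = []
terminate_self(kill=lambda pid, sig: recorded.append((pid, sig)))
```

Killing the whole process, rather than trying to restart a single thread, leaves recovery to the service manager and avoids running on with partially applied state.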
Microsoft work item
27122359
Test evidence