Race-condition while calculating the object state. #10364

Open
w1ll-i-code opened this issue Mar 10, 2025 · 1 comment · May be fixed by #10372
Labels
area/checks Check execution and results

Comments

@w1ll-i-code

Describe the bug

The function Checkable::ProcessCheckResult in lib/icinga/checkable-check.cpp acquires and releases the mutex on the object several times. If two check results arrive at the same time (e.g. one from a running check and one through the /v1/actions/process-check-result endpoint), both can update the object before either of them dispatches its events to the IDO and notifications. As a result, the second state is dispatched multiple times.
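The interleaving described above can be illustrated with a small, deterministic sketch. It does not use Icinga 2's real classes: `FakeCheckable`, `StoreState`, `DispatchStoredState` and `StoreAndSnapshot` are hypothetical names, and the two "check results" are replayed in a fixed order instead of on real threads. The snapshot variant is only one possible way to avoid the problem and is not necessarily what the linked pull request does.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a checkable object; NOT Icinga 2's real class.
class FakeCheckable {
public:
    // Racy variant, step 1: store the new state, then release the lock.
    void StoreState(const std::string& state) {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_State = state;
    }

    // Racy variant, step 2: lock again and dispatch whatever is stored *now*.
    void DispatchStoredState(std::vector<std::string>& sink) {
        std::string current;
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            current = m_State;
        }
        sink.push_back(current);
    }

    // Snapshot variant: store the state and copy it in one critical section,
    // so the caller dispatches exactly the state it wrote.
    std::string StoreAndSnapshot(const std::string& state) {
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_State = state;
        return m_State;
    }

private:
    std::mutex m_Mutex;
    std::string m_State;
};

// Replays the problematic order: both results are stored before either dispatches.
std::vector<std::string> RunRacyInterleaving() {
    FakeCheckable obj;
    std::vector<std::string> dispatched;
    obj.StoreState("WARNING");            // result A, e.g. from the scheduled check
    obj.StoreState("OK");                 // result B, e.g. from the API endpoint
    obj.DispatchStoredState(dispatched);  // A dispatches, but reads B's state
    obj.DispatchStoredState(dispatched);  // B dispatches its own state
    return dispatched;
}

// Same order of arrival, but each result dispatches its own snapshot.
std::vector<std::string> RunSnapshotInterleaving() {
    FakeCheckable obj;
    std::vector<std::string> dispatched;
    std::string a = obj.StoreAndSnapshot("WARNING");
    std::string b = obj.StoreAndSnapshot("OK");
    dispatched.push_back(a);
    dispatched.push_back(b);
    return dispatched;
}
```

In this sketch `RunRacyInterleaving()` yields the second state twice, while `RunSnapshotInterleaving()` yields each state once.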

To Reproduce

This setup is used in production as a countdown that sends a notification if no event has come in within the last few minutes.

  1. Create a check that always returns a warning.
  2. Create a service that runs that check every 30 seconds (or less, to generate more events).
  3. Repeatedly send check results for that service to the /v1/actions/process-check-result endpoint.
  4. After some time, the scenario described above should occur.
  5. Using multiple services increases the chance of hitting it.
  6. The chance is also higher right after a restart: many services need to rerun their checks, while any retry logic in front of the process-check-result endpoint sends the new state at the same time.
  7. The occurrence can be detected by reading the icinga:history:stream:state Redis stream from Icinga DB and checking whether an event was sent twice.
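The duplicate check from the last step could look roughly like the sketch below. It assumes the stream entries have already been read and parsed into a plain struct. The `StateEvent` type, its field names, and the choice of (object, state, event time) as the duplicate key are all assumptions made for illustration, not the actual layout of the icinga:history:stream:state entries.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, already-parsed stream entry; field names are assumptions.
struct StateEvent {
    std::string object_id;       // identifier of the host or service
    int state;                   // numeric state carried by the event
    std::int64_t event_time_ms;  // event timestamp in milliseconds
};

// Returns every event whose (object, state, time) key was already seen earlier
// in the input, i.e. the second and later copies of a repeated event.
std::vector<StateEvent> FindDuplicateEvents(const std::vector<StateEvent>& events) {
    std::set<std::tuple<std::string, int, std::int64_t>> seen;
    std::vector<StateEvent> duplicates;
    for (const auto& ev : events) {
        auto key = std::make_tuple(ev.object_id, ev.state, ev.event_time_ms);
        // insert() reports via .second whether the key was newly added.
        if (!seen.insert(key).second) {
            duplicates.push_back(ev);
        }
    }
    return duplicates;
}
```

If the duplicated events turn out to carry different timestamps, the key would need to be adapted, for example by comparing consecutive events per object instead.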

Expected behavior

I expect all state changes to trigger the appropriate events, even if they happened at the same time.

Screenshots

Image

Your Environment

Include as many relevant details about the environment you experienced the problem in

  • Version used (icinga2 --version): r2.14.3-1
  • Operating System and version: Red Hat Enterprise Linux 8.10
@w1ll-i-code
Author

The screenshot is from the IDO, but the same thing happens with Icinga DB, because the bug occurs before the events are dispatched to either backend.

@w1ll-i-code w1ll-i-code linked a pull request Mar 12, 2025 that will close this issue
@yhabteab yhabteab added the area/checks Check execution and results label Mar 13, 2025