Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dependency with timeperiod may incorrectly make redundancy group reachable #10377

Open
julianbrost opened this issue Mar 13, 2025 · 0 comments
Open
Labels
area/runtime Downtimes, comments, dependencies, events

Comments

@julianbrost
Copy link
Contributor

Dependency::IsAvailable() currently simply returns true both in situations where the dependency is satisfied and where it should be more or less ignored:

bool Dependency::IsAvailable(DependencyType dt) const
{
Checkable::Ptr parent = GetParent();
Host::Ptr parentHost;
Service::Ptr parentService;
tie(parentHost, parentService) = GetHostService(parent);
/* ignore if it's the same checkable object */
if (parent == GetChild()) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Parent and child " << (parentService ? "service" : "host") << " are identical.";
return true;
}
/* ignore pending */
if (!parent->GetLastCheckResult()) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Parent " << (parentService ? "service" : "host") << " '" << parent->GetName() << "' hasn't been checked yet.";
return true;
}
if (GetIgnoreSoftStates()) {
/* ignore soft states */
if (parent->GetStateType() == StateTypeSoft) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Parent " << (parentService ? "service" : "host") << " '" << parent->GetName() << "' is in a soft state.";
return true;
}
} else {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' failed: Parent " << (parentService ? "service" : "host") << " '" << parent->GetName() << "' is in a soft state.";
}
int state;
if (parentService)
state = ServiceStateToFilter(parentService->GetState());
else
state = HostStateToFilter(parentHost->GetState());
/* check state */
if (state & GetStateFilter()) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Parent " << (parentService ? "service" : "host") << " '" << parent->GetName() << "' matches state filter.";
return true;
}
/* ignore if not in time period */
TimePeriod::Ptr tp = GetPeriod();
if (tp && !tp->IsInside(Utility::GetTime())) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Outside time period.";
return true;
}
if (dt == DependencyCheckExecution && !GetDisableChecks()) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Checks are not disabled.";
return true;
} else if (dt == DependencyNotification && !GetDisableNotifications()) {
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' passed: Notifications are not disabled";
return true;
}
Log(LogNotice, "Dependency")
<< "Dependency '" << GetName() << "' failed. Parent "
<< (parentService ? "service" : "host") << " '" << parent->GetName() << "' is "
<< (parentService ? Service::StateToString(parentService->GetState()) : Host::StateToString(parentHost->GetState()));
return false;
}

Before the introduction of redundancy groups, this was perfectly fine as both situations could be handled identically. However, with redundancy groups, that's no longer the case and this can result in strange effects.

Imagine the following situation: one child, depending on two parents in a redundancy group where both parents are critical. Intuitively, in that case the redundancy group is failed and the child should be considered unreachable. But what happens if (at least) one of the dependencies has a timeperiod configured and it's currently outside of that period? Dependency::IsAvailable() returns true and thus the redundancy groups finds an available dependency and treats the whole group as available.

Instead, the individual dependencies should probably have three states instead: available, failed, and irrelevant (because outside of the timeperiod, there may be other cases where it's irrelevant, timeperiods are just what caught my attention). Combining the state of individual dependency objects should then ignore irrelevant ones.

refs https://github.com/Icinga/icinga2/pull/10290#discussion_r1950625454[^1]

@julianbrost julianbrost added the area/runtime Downtimes, comments, dependencies, events label Mar 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/runtime Downtimes, comments, dependencies, events
Projects
None yet
Development

No branches or pull requests

1 participant