notification_interval in escalation is not used #303

lgrn · 2018-11-12T14:49:53Z

Configure a service to have notification_interval 5 and a unique contact.
Tie two "1:0" escalations ("start on first notification, never end") to the service with notification_interval 10 and 15 respectively, and unique contacts.

The service now has two overlapping escalations, each with unique contacts, as well as its own service-specific contact and notification_interval. Now make the service go CRITICAL.

Expected behavior: Since escalations are active, the service's contact and notification_interval settings should presumably be overridden by what the escalation(s) dictate. Each contact should get notifications at its own interval, determined by the escalation it belongs to (10 or 15 minutes). The service contact and service notification interval should not be used.

What happens instead: Contacts are correctly taken from the escalations (the service contact is not notified), but the notification_interval is taken from the service, and applied for both of these contacts, instead of the notification_interval specified in the escalation.

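Log example, where "CONTACT1" and "CONTACT2" are from the two escalations respectively. Service notifications for both contacts are sent and logged together, 600 seconds apart, which I attribute to the notification_interval of 5 on the service object. (Note that 600 seconds is 10 minutes of wall-clock time; it corresponds to 5 intervals only if interval_length is 120 rather than the default 60.)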

 [1542029872] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.012ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542029872] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.012ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542030472] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542030472] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542031072] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542031072] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542031672] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542031672] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542032272] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542032272] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.010ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542032872] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.012ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542032872] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.012ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542033472] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.020ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
 [1542033472] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL - 127.0.0.1: rta 0.020ms, lost 0% :: 1.3.3.7: rta nan, lost 100%
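
The spacing between the notifications above can be checked with a short script. This is only a minimal sketch: the `notification_gaps` helper and the regular expression are illustrative assumptions, not part of Naemon, and the sample lines are the first three timestamps from the log with the plugin output shortened.

```python
import re
from collections import defaultdict

# First three notification rounds from the log above.
# The trailing plugin output is shortened; it is not needed for the timing.
LOG = """\
 [1542029872] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
 [1542029872] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
 [1542030472] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
 [1542030472] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
 [1542031072] SERVICE NOTIFICATION: CONTACT1;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
 [1542031072] SERVICE NOTIFICATION: CONTACT2;apanarenapa;ping;CRITICAL;service-notify;CRITICAL
"""

# Captures the epoch timestamp and the contact name of a notification line.
LINE_RE = re.compile(r"^\s*\[(\d+)\] SERVICE NOTIFICATION: ([^;]+);")


def notification_gaps(log_text):
    """Return {contact: [seconds between consecutive notifications]}."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            times[match.group(2)].append(int(match.group(1)))
    return {
        contact: [later - earlier for earlier, later in zip(ts, ts[1:])]
        for contact, ts in times.items()
    }


print(notification_gaps(LOG))
# {'CONTACT1': [600, 600], 'CONTACT2': [600, 600]}
```

Both contacts show the same gap, so neither is on a separate schedule of its own.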

Escalations:

define serviceescalation{
     host_name                      apanarenapa
     service_description            ping
     contacts                       CONTACT2
     first_notification             1
     last_notification              0
     notification_interval          10
     escalation_options             c,r,u,w
     }
 
define serviceescalation{
     host_name                      apanarenapa
     service_description            ping
     contacts                       CONTACT1
     first_notification             1
     last_notification              0
     notification_interval          15
     escalation_options             c,r,u,w
     }

Service object:

define service{
     use                            default-service
     host_name                      apanarenapa
     service_description            ping
     check_command                  check_ping!500,90%!800,100%
     check_interval                 1
     notification_interval          5
     contacts                       never_used
     }

Template:

define service{
     is_volatile                    0
     max_check_attempts             3
     check_interval                 5
     retry_interval                 1
     active_checks_enabled          1
     passive_checks_enabled         1
     check_period                   24x7
     parallelize_check              0
     obsess                         0
     check_freshness                0
     event_handler_enabled          1
     flap_detection_enabled         1
     process_perf_data              1
     retain_status_information      1
     retain_nonstatus_information   1
     notification_interval          0
     notification_period            24x7
     notification_options           c,f,r,s,u,w
     notifications_enabled          1
     contacts                       never_used
     register                       0
     name                           default-service
     }

OP5 Jira: https://jira.op5.com/browse/MON-11356


vpber · 2021-11-03T09:46:47Z

This is somewhat by design. From the Naemon documentation (https://www.naemon.org/documentation/usersguide/objectdefinitions.html#serviceescalation):
"Note: If multiple escalation entries for a host overlap for one or more notification ranges, the smallest notification interval from all escalation entries is used."

So the expected behavior is actually:
Since escalations are active, the service's contact and notification interval settings should be overridden by what the escalation(s) dictate. Because the escalations overlap, all contacts should get notifications at the lowest defined escalation interval (10 min), regardless of which escalation they belong to (10 or 15 minutes). The service contact and service notification interval should not be used.

The error in your case is that Naemon does not use the lowest defined escalation interval (10 min) but uses the "original" service notification interval (5 min) instead.

Also I guess the documentation should say "multiple escalation entries for a service" instead of "host" here.
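
To make the documented rule concrete, here is a rough model of it in Python. It is a sketch of the behavior described in the quoted note, not Naemon's actual code: the `Escalation` class and the `applies` and `effective_interval` functions are hypothetical names, and the model ignores escalation_options, escalation_period, and any special-case interval values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Escalation:
    first_notification: int
    last_notification: int  # 0 means "never end"
    notification_interval: float


def applies(esc: Escalation, notification_number: int) -> bool:
    """True if the escalation's notification range covers this notification."""
    if notification_number < esc.first_notification:
        return False
    return esc.last_notification == 0 or notification_number <= esc.last_notification


def effective_interval(
    service_interval: float,
    escalations: List[Escalation],
    notification_number: int,
) -> float:
    """Smallest interval among the applicable escalations.

    Falls back to the service's own interval only when no escalation applies.
    """
    active = [
        esc.notification_interval
        for esc in escalations
        if applies(esc, notification_number)
    ]
    return min(active) if active else service_interval


# The two "1:0" escalations from this issue, with the service interval of 5.
escalations = [Escalation(1, 0, 10), Escalation(1, 0, 15)]
print(effective_interval(5, escalations, notification_number=1))
```

Under this model the configuration in this issue yields 10, not the service's 5, for every notification, because both escalations start at the first notification and never end.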

sni transferred this issue from naemon/naemon Aug 19, 2019
