Description
We found the root cause of the most recent dhcpcd crashes, which occurred when running dhcpcd --exit and the resulting SIGTERM was handled. The problem is in eloop_start(), where exitnow is checked too late once SIGTERM has been processed, so execution enters eloop_run_ppoll() -> ppoll() after stop_all_interfaces() has already completed.
    int
    eloop_start(struct eloop *eloop, sigset_t *signals)
    {
        int error;
        struct eloop_timeout *t;
        struct timespec ts, *tsp;

        assert(eloop != NULL);
    #ifdef HAVE_KQUEUE
        UNUSED(signals);
    #endif
        for (;;) {
            if (eloop->exitnow)    /* <-- the check in question */
                break;
    #ifndef HAVE_KQUEUE
            if (_eloop_nsig != 0) {
                int n = _eloop_sig[--_eloop_nsig];

                if (eloop->signal_cb != NULL)
                    eloop->signal_cb(n, eloop->signal_cb_ctx);
                continue;
            }
    #endif
            ..
            error = eloop_run_ppoll(eloop, tsp, signals);
As a consequence, we would sometimes detect a network event via ppoll() and still invoke the corresponding callback (e.g. for REPLY6) after SIGTERM had already been handled.
We suggest moving this check:

    if (eloop->exitnow)
        break;

to just below the if (_eloop_nsig != 0) block.
That way, once the SIGTERM callback dhcpcd_signal_cb() has finished and control is back in eloop_start(), the loop exits immediately instead of falling through to the ppoll() call.
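
To make the suggested ordering concrete, here is a minimal, self-contained sketch. It is not the dhcpcd code: struct toy_eloop, toy_poll(), toy_signal_cb() and toy_eloop_start() are hypothetical stand-ins for the real eloop state, eloop_run_ppoll(), dhcpcd_signal_cb() and eloop_start(), and the kqueue branch is left out. It only illustrates that with the exitnow check placed after the pending-signal block, the poll step is not reached once the signal callback has requested exit.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for dhcpcd's eloop state; not the real definition. */
struct toy_eloop {
	int exitnow;                     /* set to request loop exit */
	void (*signal_cb)(int, void *);  /* plays the role of dhcpcd_signal_cb() */
	void *signal_cb_ctx;
	int pending_sig[8];              /* plays the role of _eloop_sig */
	size_t npending;                 /* plays the role of _eloop_nsig */
	int poll_calls;                  /* how many times the poll step ran */
};

/* Stand-in for eloop_run_ppoll(): only counts calls. It requests exit
 * after three rounds so that a loop without signals still terminates. */
static int
toy_poll(struct toy_eloop *eloop)
{
	if (++eloop->poll_calls >= 3)
		eloop->exitnow = 1;
	return 0;
}

/* Stand-in for the SIGTERM handling path: just request exit. */
static void
toy_signal_cb(int sig, void *ctx)
{
	struct toy_eloop *eloop = ctx;

	(void)sig;
	eloop->exitnow = 1;
}

static int
toy_eloop_start(struct toy_eloop *eloop)
{
	assert(eloop != NULL);

	for (;;) {
		/* Deliver one pending signal per iteration, as in the excerpt. */
		if (eloop->npending != 0) {
			int n = eloop->pending_sig[--eloop->npending];

			if (eloop->signal_cb != NULL)
				eloop->signal_cb(n, eloop->signal_cb_ctx);
			continue;
		}

		/* Proposed position: checked after the signal block and
		 * immediately before polling. */
		if (eloop->exitnow)
			break;

		if (toy_poll(eloop) == -1)
			return -1;
	}
	return 0;
}
```

If a SIGTERM is queued in pending_sig before toy_eloop_start() is called, the callback sets exitnow and the loop breaks on the next iteration with poll_calls still at 0.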