Improve stop by adding step to first send TERM to just the process #198

benhoyt · 2023-02-28T22:09:08Z

Currently when a service is stopped, Pebble does the following:

1. Send SIGTERM to the process group (negative pid). If the process pid exits, consider that a success.
2. If the pid hasn't exited before a kill delay, send SIGKILL to the process group.

We want to make that a bit more graceful by changing it to the following:

1. Send SIGTERM to only the individual process pid. If all the pids in the process tree exit, consider that a success.
2. If some of the pids in the tree don't exit, send SIGTERM to each process in the tree that is still alive.
3. If some of the pids are still alive after the kill delay, send SIGKILL to each process in the tree that is still alive.

Ideally we'd use the cgroup process tree to enumerate subprocesses, as that includes things like daemon processes. However, if that's too time-consuming, we could start with a simplification: for steps 2 and 3, wait for the process group and send signals to the process group (negative pid) instead of the tree.

Context: this is in part to better handle the issue in https://bugs.launchpad.net/juju/+bug/2008443 (but should also make stopping nicer in general).

See also my "Using cgroups in Pebble - design notes" doc.

See also #149.

benhoyt · 2023-08-21T03:15:56Z

It turns out that cgroups are hard to use in containers (in Docker they require privileged containers, in K8s/containerd they require a special setting), so while the cgroups approach might be good for using Pebble on machines, it's no good for the K8s use case (e.g. Juju sidecar charms). We have fixed the initial issue surfaced by Patroni that brought this up in #149. But leaving this open to consider future work improving the process tree and termination handling (whether it's via cgroups, /proc, or an injected environment variable).

benhoyt · 2024-09-30T01:06:33Z

This relies on improving the process tree handling with cgroups (which we decided not to do), and with the improvements we've made earlier to service handling, this hasn't been asked for or needed. Going to close for now -- we can always revisit this in future.

benhoyt added the Feature and Low Priority labels on Mar 13, 2024

benhoyt closed this as not planned on Sep 30, 2024
