TLS: use fewer inotify instances and/or be more resilient to them not working #2513

nyh opened this issue Oct 21, 2024 · 3 comments

nyh commented Oct 21, 2024

Recently some ScyllaDB test runs failed (scylladb/scylladb#21199) with the error:

std::system_error (error system:24, could not create inotify instance: Too many open files)

It turns out that Seastar's TLS implementation internally uses inotify to automatically recognize when the certificate files have changed. The error message is misleading: for inotify_init(), an EMFILE errno (which gets printed as "Too many open files") does not refer to the limit on the number of open files, but to a separate per-user limit on the number of inotify instances. This limit is configured in /proc/sys/fs/inotify/max_user_instances and is often fairly low - e.g., on my Fedora 40 it defaults to just 128. It appears that Seastar creates an inotify instance per shard, and since the ScyllaDB test framework ran many tests in parallel, we ran out of inotify instances.
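
To make the distinction concrete, here is a minimal standalone demo (plain C++ against the Linux inotify API, not Seastar code): it reads the per-user instance limit and then calls inotify_init1() in a loop until it fails, at which point strerror(EMFILE) produces exactly the confusing "Too many open files" text.

```cpp
// Standalone demo: inotify_init1() fails with EMFILE once the per-user
// limit in /proc/sys/fs/inotify/max_user_instances is reached, even though
// the process is nowhere near its open-file (RLIMIT_NOFILE) limit.
#include <sys/inotify.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <cstdio>
#include <fstream>
#include <vector>

int main() {
    int limit = 0;
    std::ifstream("/proc/sys/fs/inotify/max_user_instances") >> limit;
    std::printf("per-user inotify instance limit: %d\n", limit);

    std::vector<int> fds;
    for (;;) {
        int fd = inotify_init1(IN_NONBLOCK);
        if (fd < 0) {
            // errno is EMFILE here, which strerror() renders as
            // "Too many open files" - the misleading message from the issue.
            std::printf("inotify_init1 failed after %zu instances: %s\n",
                        fds.size(), std::strerror(errno));
            break;
        }
        fds.push_back(fd);
    }
    for (int fd : fds) {
        close(fd);
    }
}
```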

The ScyllaDB testers easily solved this problem by increasing /proc/sys/fs/inotify/max_user_instances, but I'm worried that this problem can hit other Seastar users as well, who won't be aware that Seastar TLS even uses inotify, or that /proc/sys/fs/inotify/max_user_instances is so low. I want to propose that we consider two options, perhaps even both:

  1. Use fewer inotify instances - perhaps even just one per process instead of one per shard (e.g., have just shard 0 be responsible for the very rare inotify work). On modern many-shard machines, this will make a big difference.
  2. Be forgiving towards a failing inotify_init() in TLS: make it a logged warning (and don't use the phrase "Too many open files" in the logged message), not an exception. When inotify_init() fails, certificate reloading would either not work at all, or - if we consider it an important feature - we could implement something similar without inotify (e.g., check once a minute whether the files changed); see the rough sketch after this list.
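
A very rough sketch of what the polling fallback in option 2 could look like (plain C++, not Seastar's actual API; polled_file_watcher is a made-up name used only for illustration): on inotify_init() failure we would log a warning and drive something like this from a periodic timer instead of throwing.

```cpp
// Sketch of a polling fallback when inotify is unavailable: remember the
// mtime of each watched file and compare on every poll() call, which would
// be driven by e.g. a once-a-minute timer on shard 0.
#include <chrono>
#include <filesystem>
#include <functional>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

struct polled_file_watcher {                       // hypothetical helper
    std::vector<fs::path> files;
    std::function<void()> on_change;
    std::vector<fs::file_time_type> last_seen;

    polled_file_watcher(std::vector<fs::path> f, std::function<void()> cb)
        : files(std::move(f)), on_change(std::move(cb)) {
        for (const auto& p : files) {
            std::error_code ec;
            last_seen.push_back(fs::last_write_time(p, ec));
        }
    }

    // Called periodically; triggers the certificate rebuild callback when
    // any watched file's modification time has changed.
    void poll() {
        for (size_t i = 0; i < files.size(); ++i) {
            std::error_code ec;
            auto now = fs::last_write_time(files[i], ec);
            if (!ec && now != last_seen[i]) {
                last_seen[i] = now;
                on_change();
            }
        }
    }
};
```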

CC @elcallio

bhalevy commented Dec 8, 2024

@regevran please assign

elcallio self-assigned this Dec 8, 2024
bhalevy commented Dec 8, 2024

Is it possible the inotify instances are leaking somehow when running the tests via test.py?

elcallio commented Dec 8, 2024

No. The problem is that we use a few per shard (one for each file referenced - typically 3-4 per TLS config). With, say, 3-4 TLS configs in scylla.yaml, 8 shards, and tests running in parallel, this adds up. With ludicrously small default settings we hit the ceiling.
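
Taking the upper end of those figures as a back-of-envelope (illustrative numbers only):

```
4 files per config × 4 TLS configs × 8 shards = 128 inotify instances per server process
```

which already matches the Fedora default limit of 128 before any parallel test runs are counted.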

I've done some work to maybe make the TLS usage in scylla use a shard-0-only solution for TLS inotify, but I got sidetracked. In any case, however you design it, it will be slightly hackish...

elcallio pushed a commit to elcallio/seastar that referenced this issue Dec 11, 2024
…ose rebuild

Refs scylladb#2513

Adds a more advanced callback type, which takes the actual reloading builder as an
argument (into which new files are loaded), and allows a proper future-wait in the
callback.

Exposes certificate rebuilding (via the builder) to allow a "manual", quick reload of certs.

The point of these seemingly small changes is to allow client code to, for example,
limit the actual reloadable_certs (and by extension the inotify watches) to shard 0 (or whatever),
and simply use this as a trigger for a manual reload of the other shards.

Note: we cannot do any magical "shard-0-only" file monitor in the objects themselves,
not without making the certs/builders sharded or similarly stored (which contradicts the
general design of light objects, copyable between shards etc). But with this, in a calling
app in which certs _are_ held in a sharded manner, we can fairly easily delegate non-shard-0
ops in a way that fits that topology.

Note: a builder can be _called_ from any shard (as long as it is safe in its originating
shard), but the objects returned are only valid on the current shard.

Similarly, it is safe to share the reloading builder across shards _in the callback_,
since rebuilding is blocked for the duration of the call.
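
For illustration, a rough application-side sketch of the pattern the commit message describes (my_cert_holder and rebuild_from() are hypothetical application types; only sharded<>, invoke_on_all() and tls::credentials_builder are real Seastar facilities, and the exact shape of the new callback API may differ): shard 0 alone holds the inotify-backed reloadable credentials, and its reload callback fans the rebuild out to every shard.

```cpp
// Hypothetical application-side sketch of the shard-0-only delegation
// pattern. my_cert_holder and rebuild_from() are made-up names; the real
// pieces are seastar::sharded<>, invoke_on_all() and tls::credentials_builder.
#include <seastar/core/sharded.hh>
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/future.hh>
#include <seastar/net/tls.hh>

struct my_cert_holder {                            // per-shard state (hypothetical)
    seastar::shared_ptr<seastar::tls::server_credentials> creds;

    // Rebuild this shard's credentials from the (shared) reloading builder.
    // Per the commit notes, sharing the builder inside the callback is safe
    // because rebuilding is blocked for the duration of the call.
    seastar::future<> rebuild_from(seastar::tls::credentials_builder& b) {
        creds = b.build_server_credentials();
        return seastar::make_ready_future<>();
    }
};

// Only shard 0 installs the inotify-backed reloadable credentials; its reload
// callback then fans out to every shard via the sharded<> container.
seastar::future<> on_certs_reloaded(seastar::sharded<my_cert_holder>& holders,
                                    seastar::tls::credentials_builder& b) {
    return holders.invoke_on_all([&b] (my_cert_holder& h) {
        return h.rebuild_from(b);
    });
}
```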
elcallio pushed a commit to elcallio/seastar that referenced this issue Dec 16, 2024