Provide ability to replace Unix.select in Picos_select #129

Open
edwintorok opened this issue May 3, 2024 · 2 comments

edwintorok commented May 3, 2024

Using Unix.select is problematic: if any file descriptor's numeric value is 1024 or higher (FD_SETSIZE on most systems), the call fails with an exception.
Note that the limit applies to the value of the file descriptor, not to the number of file descriptors that select is watching: the call can fail even if you only ever watch a single file descriptor, if that descriptor is opened late enough to get a number >= 1024.

This is not a concern for short-lived programs, but can be a real problem for long-lived daemons.
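
To make the value-versus-count distinction concrete, here is a minimal sketch. It models descriptors as plain ints and assumes FD_SETSIZE is 1024 (typical, but platform dependent); `fd_setsize` and `fits_in_fd_set` are hypothetical names for illustration, not part of Unix or Picos.

```ocaml
(* Assumption: FD_SETSIZE is 1024, as on typical Linux builds. *)
let fd_setsize = 1024

(* select can only represent descriptors numbered 0 .. fd_setsize - 1.
   What matters is each descriptor's number, not how many are watched. *)
let fits_in_fd_set (fds : int list) : bool =
  List.for_all (fun fd -> fd >= 0 && fd < fd_setsize) fds

let () =
  (* 1000 low-numbered descriptors: representable. *)
  let many_low = List.init 1000 (fun i -> i) in
  (* A single descriptor opened late in a daemon's life: not representable. *)
  let one_high = [ 1500 ] in
  Printf.printf "many low: %b, one high: %b\n"
    (fits_in_fd_set many_low) (fits_in_fd_set one_high)
```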

It'd be better to use poll(3p), which is part of POSIX.1-2017 and available nearly everywhere.
This is available in the iomux library.

Unfortunately, Unix.select in the stdlib cannot be efficiently reimplemented using poll, because the two have different interfaces. Still, we shouldn't encourage more uses of select, other than for compatibility with legacy applications (I am currently working on removing all uses of Unix.select from such a legacy application...)
I know of a project that replaced select with poll in an application by writing a drop-in replacement that kept the same signature, and it was a very bad idea for performance: it allocates a lot, and it performs O(watched-fds) work on every call.
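
A rough sketch of why a select-shaped wrapper over poll is costly. The `poll_fn` type below is a hypothetical stand-in for a poll-like primitive (it is not the iomux API), and descriptors are modelled as ints. The point is only the shape of the work: every call rebuilds the interest set and the result lists from scratch.

```ocaml
(* Hypothetical poll-like primitive: takes (fd, want_read, want_write)
   entries plus a timeout, returns (fd, can_read, can_write) for ready fds. *)
type poll_fn = (int * bool * bool) array -> float -> (int * bool * bool) list

let select_via_poll (poll : poll_fn) rds wrs timeout =
  (* Merge both interest lists into one table: allocated on every call. *)
  let interest = Hashtbl.create 16 in
  List.iter (fun fd -> Hashtbl.replace interest fd (true, false)) rds;
  List.iter
    (fun fd ->
      let r =
        match Hashtbl.find_opt interest fd with
        | Some (r, _) -> r
        | None -> false
      in
      Hashtbl.replace interest fd (r, true))
    wrs;
  (* Rebuild the entry array from scratch: O(watched fds) per call. *)
  let entries =
    Array.of_seq
      (Seq.map (fun (fd, (r, w)) -> (fd, r, w)) (Hashtbl.to_seq interest))
  in
  let ready = poll entries timeout in
  (* Convert the results back into fresh lists: more allocation. *)
  let ready_r =
    List.filter_map (fun (fd, r, _) -> if r then Some fd else None) ready
  in
  let ready_w =
    List.filter_map (fun (fd, _, w) -> if w then Some fd else None) ready
  in
  (ready_r, ready_w)
```

A poll-shaped API avoids this by letting the caller register interest once and keep the registration across calls.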

More efficient OS specific implementations are also possible:

As you can see there are plenty of choices, and the interfaces are quite distinct, and most of these are modelled more after a poll-like interface than a select-like interface. It is probably out of scope for this project to unify all these mechanisms, but it'd be good if the default is something better than select.

A neutral approach might be to use poll by default, and provide the ability for the user to override the implementation (via a functor, at main thread configure time, or via a settable global, etc.).

A picos_select.core functor with an implementation in picos_select.iomux might provide the most flexibility:

  • you get an acceptable default that is available on a lot of platforms
  • if for some reason you don't want to depend on iomux in your application, you can choose not to, by linking with picos_select.core and providing your own implementation for the poll signature.
  • for backwards compat picos_select could pick iomux by default
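
The functor idea above could look roughly like the following. Everything here is a hypothetical sketch: the `POLL` signature, `Make`, and `Fake_poll` are illustrative names, descriptors are modelled as ints (a real version would use Unix.file_descr), and the stand-in backend simply reports every registered descriptor as ready.

```ocaml
(* Hypothetical signature that a backend (iomux, select, ...) would implement. *)
module type POLL = sig
  type t
  val create : unit -> t
  val add : t -> int -> unit (* register interest in reading from fd *)
  val remove : t -> int -> unit
  val wait : t -> timeout:float -> int list (* fds that are ready *)
end

(* Core logic, written once against the signature. *)
module Make (P : POLL) = struct
  let poller = P.create ()
  let callbacks : (int, unit -> unit) Hashtbl.t = Hashtbl.create 16

  (* Run [k] once, the next time [fd] is reported readable. *)
  let on_readable fd k =
    Hashtbl.replace callbacks fd k;
    P.add poller fd

  (* One iteration of the event loop. *)
  let step ~timeout =
    List.iter
      (fun fd ->
        match Hashtbl.find_opt callbacks fd with
        | Some k ->
            Hashtbl.remove callbacks fd;
            P.remove poller fd;
            k ()
        | None -> ())
      (P.wait poller ~timeout)
end

(* Stand-in backend for illustration: every registered fd is "ready". *)
module Fake_poll : POLL = struct
  type t = (int, unit) Hashtbl.t
  let create () = Hashtbl.create 16
  let add t fd = Hashtbl.replace t fd ()
  let remove t fd = Hashtbl.remove t fd
  let wait t ~timeout:_ = List.of_seq (Hashtbl.to_seq_keys t)
end

module Select = Make (Fake_poll)
```

An application that doesn't want the default backend would apply `Make` to its own module instead of `Fake_poll`.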

(There might be other ways, I haven't looked at how Dune's virtual libraries could be used for this).

If you want I can attempt to open a draft PR with a proof of concept, but thought to open an issue describing the problem first, and maybe you already have a design on how you'd like this handled.

polytypic commented May 4, 2024

Thanks for the thorough summary! I would also add iocp as a desirable IO backend.

Yes, I totally agree that Unix.select is highly problematic (slow, limited, ...).

My idea with Picos_select and Picos_stdio has been to

  • verify and validate that this approach (i.e. using automatically managed per domain IO (sys)threads) can be implemented and performs well (enough),
  • provide examples of what can be done and how things can be structured,
  • not try to design a (completely) new IO library, and
  • use only what OCaml and Stdlib offers,

in order to keep complexity and scope rather bounded.

IOW, Picos_select and Picos_stdio are primarily examples (for the community) and validation of the core Picos framework. I hope that they can also be useful and good enough for some, possibly even many, applications that don't strictly need the best possible IO performance, i.e. they hopefully work as an MVP.

OTOH, I'm aware of various alternatives and limitations of Unix.select and the idea has then been to provide more advanced high level async IO libraries separately. This is actually something that is on my TODO list for Picos and I do hope and plan to start working on those eventually/soon. Alternatively, I would also be happy to help people do such work if someone wants to step in.

Also, I hadn't thought about making Picos_select use multiple backends, but that is also a potentially useful option to consider. The Picos_select API isn't based on list of FDs and should probably be amenable to being implemented in terms of various other mechanisms aside from select.

@edwintorok
Contributor Author

> [...]
>
>   • use only what OCaml and Stdlib offers,
>     [...]

Perhaps it'd be useful to separate the timeout handling and priority queue from file descriptor handling (I understand that these are deeply tied together and will be part of the same event loop).

AFAICT cancel_after and wakeup rely on having a pipe as a signaling mechanism, but don't themselves rely on select.
So as long as the user supplies their own pipes and event loops this module/submodule could provide an implementation of cancel_after and a function to compute the next timeout (to be used in event loops).
This doesn't require directly supporting new backends as part of Picos, or introduce any new dependencies (except perhaps as optional ones in a Picos_poll example), and would prevent introducing a dependency on Unix.select for applications that don't desire to have it.
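
As a sketch of what such a backend-independent timeout module might expose (hypothetical names throughout; this is not the Picos API, and the caller passes the current time in explicitly so that no clock or Unix dependency is needed):

```ocaml
(* Pending timeouts keyed by (deadline, id); the id keeps keys unique. *)
module Key = struct
  type t = float * int
  let compare : t -> t -> int = Stdlib.compare
end

module Q = Map.Make (Key)

type t = { mutable queue : (unit -> unit) Q.t; mutable next_id : int }

let create () = { queue = Q.empty; next_id = 0 }

(* Register [action] to run once [seconds] have elapsed after [now]. *)
let schedule t ~now ~seconds action =
  let key = (now +. seconds, t.next_id) in
  t.next_id <- t.next_id + 1;
  t.queue <- Q.add key action t.queue

(* How long an event loop may block; None means nothing is pending. *)
let next_timeout t ~now =
  match Q.min_binding_opt t.queue with
  | None -> None
  | Some ((deadline, _), _) -> Some (Float.max 0.0 (deadline -. now))

(* Run and remove every action whose deadline has passed. *)
let rec run_expired t ~now =
  match Q.min_binding_opt t.queue with
  | Some (((deadline, _) as key), action) when deadline <= now ->
      t.queue <- Q.remove key t.queue;
      action ();
      run_expired t ~now
  | _ -> ()
```

An event loop built on select, poll, or anything else would pass the result of `next_timeout` as its blocking timeout and call `run_expired` after each wakeup.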

Then you can of course have a minimal example based on Unix.select (and perhaps I could help write an equivalent example for iomux, or one of the other libraries), but one could experiment with different backends without having to rewrite the timeout handling. And you could more directly compare the performance of various backends, since only one thing at a time would change (the IO backend), and not the data structure used to compute timeouts, etc.

> The Picos_select API isn't based on list of FDs and should probably be amenable to being implemented in terms of various other mechanisms aside from select.
> [...]

Indeed, it might be possible to keep the same interface and replace just the implementation, but I'd like to avoid reinventing the (timing) wheel, hence the suggestion to decouple cancel_after from Unix.select.
