-
Notifications
You must be signed in to change notification settings - Fork 305
Async write prevents Rust bindings and component composition #870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This was quite the topic of contention IIRC -- heads up I opened a thread on the BCA Zulip#wasi > converting fs async writes to sync writes returning streams to try and discuss this and get some more insight here, since I remember this being debated for STDIN/STDOUT and what the usual interface for those functions should be. |
|
If you're thinking of 757, then I don't think these are related. In this PR here, the relevant interfaces still use |
|
I talked about this with @lukewagner, @dicej, @tschneidereit, @rvolosatovs, and @TartanLlama in more depth today. We talked about this specifically as well as thoughts more generally on this change. I want to write down what we talked about to both provide thoughts on this PR specifically as well as more general guidance we were thinking of might be appropriate for WASI. The conclusions we personally reached on this were, and y'all please correct me if I'm misrepresenting anything:
One thing this rationale does not apply to, however, is the stdio So, overall, that's a long-winded way of saying "lgtm r+" on this (unless anyone disagrees with me). The only change I'd personaly like to see as a result is expanded documentation on the resource-related methods here clarifying that should the resource be dropped the implementation will continue to read/write the file (this is applicable to other preexisting APIs too) and consume/work on the streams in question. Put another way, closing a file does not close pending operations automatically. |
|
Agreed with @alexcrichton's summary of the design principles for when to use |
Yeah, this PR surfaces the underspecified semantics of At the component-model level, drop only means that a specific component instance relinquishes its access to a resource. It does not prescribe any particular meaning or required side effects beyond that. In particular, it is perfectly valid for the resource to continue to exist on the host side after drop. The concrete semantics therefore have to be defined here at the WASI layer, and they may differ per resource type. A useful source of inspiration here is Linux (and POSIX/Unix more generally). The documentation distinguishes clearly between file descriptors and open file descriptions:
The POSIX From a semantics perspective, WASI can model something very similar. We can define an underlying file or socket "object", where the various WASI resources and derived streams are reference-counted handles to that object. The underlying object is only released back to the operating system once the final reference is dropped. Concretely, for files, the following keep the underlying file object alive:
For TCP sockets, the following keep the underlying socket object alive:
Notably, the following do not keep the original object alive:
You note this as just one possible implementation strategy. However, closing files and sockets has observable side effects:
As a result, reference counting is effectively the only viable implementation strategy if POSIX-compatible semantics are desired.
They indeed don't keep a |
At the moment (pre-runtime-instantiation), I think we have a model that, once created, component instances basically live as long as their containing Store (both in the spec and impl meanings of the word). (Hypothetically a GC could collect unreachable component instances, but I don't think anyone has built one and we definitely don't design for it.) So I would think this isn't a problem in the short term. In a future with runtime instantiation (where component instances can be created as resources that can be explicitly promptly freed via |
Agreed on all counts of what you describe @badeend, and I believe what you describe is the current semantics in Wasmtime as implemented for WASIp3. Or at least that's the intention.
Good point! Personally I think that's ok, iunno if others feel different. Orthogonally, one nice part about stream-y things not being |
|
@alexcrichton True, although fwiw, there's some farther-future promise pipelining-like optimizations that we probably want eventually that would provide the same tail-call behavior for async function call results (specifically the ability to pass a But I suppose yet another reason to prefer |
|
Thanks for the extended discussion and laying out the rationale here everyone -- I think I have one last possibly silly question... Should we be returning a The latter is of course much more tedious to write, but it seems like we should be able to save some cycles by enabling an error to be returned without constructing the future and resolving it at all. That said, platforms could optimize such a result in this very specific case but I wonder if the default should be to force poll creation. While not the same, the last time a similar API surface conversation came up in HTTP we went in between |
|
These methods are now synchronous. They can start an operation, but they can't perform any I/O themselves anymore. The only way the extra outer |
seems like failure to allocate memory (or some other failure to allocate) fits with returning an outer result |
Yup, this is exactly what I meant -- the "somehow failed before doing any I/O" is not so easy to rule out. For example if it is possible to do a completely synchronous call and get a hint or know about failure ahead of time it would be nice to be able to "exit early" in a sense and avoid creating a future and doing the relevant back-and-forth at all. I'm thinking less about figuring out all the ways this case could happen and whether we want the API to require the creation of a future or not. |
Several WASI interfaces (stdio, filesystem, sockets) use the following pattern:
readis synchronous and returns astreamandfuturethat are independent of the implicitthishandle.writeis asynchronous and implicitly borrowsthisfor the entire duration of consuming the input stream.Problem: Component composition
Because
writekeepsthisborrowed until the input stream is fully consumed, the resource and its owning component must stay alive for the entire stream operation. This prevents a component from performing a one-time setup, forwarding streams, and then returning immediately. In middleware-style patterns, the component cannot "step out of the way" once the streams are connected up, since the asyncwriteties the stream's lifetime to the component's presence.Problem: Rust lifetimes
On the Rust side, this pattern makes it impossible to wrap such a resource into a single struct without resorting to
unsafecode. Thewritecall produces a future that captures a borrow ofthis, which makes theFutureself-referential with respect to the resource being stored. For example:Conceptually, the
send_resultfuture is a self-borrow ofsocket. This pattern cannot be expressed safely in Rust today and prevents ergonomic bindings for common I/O abstractions likeAsyncReadandAsyncWrite.Solution
The most straight-forward thing I could come up with is to turn the
writemethods synchronous and return afutureinstead, mirroring thereadmethods. Other thoughts are welcome too of course.From e.g.:
write: async func(data: stream<u8>) -> result<_, error-code>;To:
write: func(data: stream<u8>) -> future<result<_, error-code>>;WDTY?