@badeend
Member

@badeend badeend commented Jan 10, 2026

Several WASI interfaces (stdio, filesystem, sockets) use the following pattern:

resource example {
  read: func() -> tuple<stream<u8>, future<result<_, error-code>>>;
  write: async func(data: stream<u8>) -> result<_, error-code>;
}
  • read is synchronous and returns a stream and future that are independent of the implicit this handle.
  • write is asynchronous and implicitly borrows this for the entire duration of consuming the input stream.

Problem: Component composition

Because write keeps this borrowed until the input stream is fully consumed, the resource and its owning component must stay alive for the entire stream operation. This prevents a component from performing a one-time setup, forwarding streams, and then returning immediately. In middleware-style patterns, the component cannot "step out of the way" once the streams are connected up, since the async write ties the stream's lifetime to the component's presence.

Problem: Rust lifetimes

On the Rust side, this pattern makes it impossible to wrap such a resource into a single struct without resorting to unsafe code. The write call produces a future that captures a borrow of this, which makes the Future self-referential with respect to the resource being stored. For example:

struct MyTcpStream {
    socket: TcpSocket,
    send: StreamWriter<u8>,
    send_result: Pin<Box<dyn Future<Output = Result<(), ErrorCode>>>>,
    receive: StreamReader<u8>,
    receive_result: FutureReader<Result<(), ErrorCode>>,
}
impl MyTcpStream {
    pub fn from(socket: TcpSocket) -> Self {
        let (send, send_reader) = wit_stream::new();
        let send_result = Box::pin(socket.send(send_reader));
        let (receive, receive_result) = socket.receive();
        Self {
            socket, // ERROR! `socket` is still borrowed by the `send` call.
            send,
            send_result,
            receive,
            receive_result,
        }
    }
}

Conceptually, the send_result future is a self-borrow of socket. This pattern cannot be expressed safely in Rust today and prevents ergonomic bindings for common I/O abstractions like AsyncRead and AsyncWrite.

Solution

The most straightforward thing I could come up with is to make the write methods synchronous and have them return a future instead, mirroring the read methods. Other thoughts are welcome too, of course.

From e.g.:
write: async func(data: stream<u8>) -> result<_, error-code>;
To:
write: func(data: stream<u8>) -> future<result<_, error-code>>;
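
To illustrate why the proposed signature helps on the Rust side, here is a minimal sketch. All types below (TcpSocket, StreamWriter, StreamReader, FutureReader, ErrorCode, new_stream) are hypothetical zero-sized stand-ins, not the real generated bindings; the only point is that a synchronous send returning an owned future handle no longer borrows the socket, so the wrapper struct type-checks without unsafe code.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for generated binding types; real bindings differ.
pub struct TcpSocket;
pub struct StreamWriter<T>(PhantomData<T>);
pub struct StreamReader<T>(PhantomData<T>);
pub struct FutureReader<T>(PhantomData<T>);
#[derive(Debug)]
pub struct ErrorCode;

/// Hypothetical stand-in for `wit_stream::new()`: a connected writer/reader pair.
pub fn new_stream<T>() -> (StreamWriter<T>, StreamReader<T>) {
    (StreamWriter(PhantomData), StreamReader(PhantomData))
}

impl TcpSocket {
    /// Assumed shape of the proposed binding: synchronous, returning an owned
    /// future handle whose type carries no lifetime tied to `&self`.
    pub fn send(&self, _data: StreamReader<u8>) -> FutureReader<Result<(), ErrorCode>> {
        FutureReader(PhantomData)
    }

    /// Mirrors the existing read-style method: stream plus completion future.
    pub fn receive(&self) -> (StreamReader<u8>, FutureReader<Result<(), ErrorCode>>) {
        (StreamReader(PhantomData), FutureReader(PhantomData))
    }
}

pub struct MyTcpStream {
    pub socket: TcpSocket,
    pub send: StreamWriter<u8>,
    pub send_result: FutureReader<Result<(), ErrorCode>>,
    pub receive: StreamReader<u8>,
    pub receive_result: FutureReader<Result<(), ErrorCode>>,
}

impl MyTcpStream {
    pub fn from(socket: TcpSocket) -> Self {
        let (send, send_reader) = new_stream();
        // The borrow of `socket` ends when `send` returns.
        let send_result = socket.send(send_reader);
        let (receive, receive_result) = socket.receive();
        // Moving `socket` into the struct is now accepted by the borrow checker.
        Self { socket, send, send_result, receive, receive_result }
    }
}
```

With these stand-ins, MyTcpStream::from(TcpSocket) compiles and the socket can later be moved out of the struct or dropped independently of the stored futures.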

WDYT?

@badeend badeend requested review from a team as code owners January 10, 2026 15:15
@github-actions github-actions bot added P-filesystem Proposal: wasi-filesystem P-sockets Proposal: wasi-sockets P-cli Proposal: wasi-cli labels Jan 10, 2026
@vados-cosmonic
Contributor

This was quite the topic of contention IIRC -- heads up, I opened a thread on the BCA Zulip (#wasi > "converting fs async writes to sync writes returning streams") to try and discuss this and get some more insight here, since I remember this being debated for STDIN/STDOUT and what the usual interface for those functions should be.

@badeend
Member Author

badeend commented Jan 11, 2026

If you're thinking of 757, then I don't think these are related. In this PR here, the relevant interfaces still use streams. It's just about async vs. future.

@alexcrichton
Contributor

I talked about this with @lukewagner, @dicej, @tschneidereit, @rvolosatovs, and @TartanLlama in more depth today. We talked about this change specifically as well as the more general questions it raises. I want to write down what we discussed, both to provide thoughts on this PR and to capture the more general guidance we think might be appropriate for WASI. The conclusions we personally reached were as follows, and y'all please correct me if I'm misrepresenting anything:

  • Using async func vs func() -> future is subtle, but rightfully so. That we have to deal with this question is not unreasonable given the subtle, but important, semantic differences.
  • An important difference between the two signatures is the transferability of borrow<T> values. With async you can't transfer resources during the operation, but with func() -> future you can transfer while the operation is in progress. This is applicable here because methods always implicitly have a self: borrow<_> parameter.
  • For methods doing stream-y things it feels that transferability of the resource is an important concern, so this alone feels reasonable enough for motivating this change.
  • Using async func implicitly disallows transferring/closing a resource during the operation, but func() -> future does not. This means that this change is also coupled with an implicit question of "what happens when a resource is closed after an operation is started?"
  • Our conclusion was that this should behave similarly to other WASI APIs, such as tcp-socket.listen. Specifically, implementations are expected to keep things alive and running as necessary; for example, write-via-stream would continue to run even if the original file/socket were closed. Reference counting is a way of implementing this for example.
  • A future possibility for alleviating the reference counting and/or shifting slightly would be a function that consumes a tcp-socket, for example, and produces a stream<u8>. This consumption sidesteps the question of what to do on close and effectively encapsulates the socket in the stream<u8>. There are possible ideas of this being per-resource WIT functions or perhaps component model builtins, but this didn't dissuade us from thinking we can change things as-is for now.

One thing this rationale does not apply to, however, is the stdio write-via-stream functions. In those cases there's no borrow<T> to worry about transferring/closing. There's still sort of the middleware use case of "you gotta be there in the middle", however, which could be a concern. We didn't talk too much specifically about this part but I'm personally coming around to feeling it's fine to update them in this PR.

So, overall, that's a long-winded way of saying "lgtm r+" on this (unless anyone disagrees with me). The only change I'd personally like to see as a result is expanded documentation on the resource-related methods here, clarifying that should the resource be dropped, the implementation will continue to read/write the file (this is applicable to other preexisting APIs too) and consume/work on the streams in question. Put another way, closing a file does not close pending operations automatically.

@lukewagner
Member

Agreed with @alexcrichton's summary of the design principles for when to use async functions or not in WASI, which covers the changes in the PR for filesystem and sockets. But I didn't catch what the reason was for removing async from stdio, given that they're non-member functions and so the transfer use cases don't apply?

@badeend
Member Author

badeend commented Jan 22, 2026

"what happens when a resource is closed after an operation is started?"

Yeah, this PR surfaces the underspecified semantics of drop.

At the component-model level, drop only means that a specific component instance relinquishes its access to a resource. It does not prescribe any particular meaning or required side effects beyond that. In particular, it is perfectly valid for the resource to continue to exist on the host side after drop.

The concrete semantics therefore have to be defined here at the WASI layer, and they may differ per resource type.

A useful source of inspiration here is Linux (and POSIX/Unix more generally). The documentation distinguishes clearly between file descriptors and open file descriptions:

Open file descriptions: The term open file description is the one used by POSIX to refer to the entries in the system-wide table of open files. (..) When a file descriptor is duplicated (using dup(2) or similar), the duplicate refers to the same open file description as the original file descriptor. (..) Such sharing can also occur between processes: a child process created via fork(2) inherits duplicates of its parent's file descriptors, and those duplicates refer to the same open file descriptions. close(fd) closes a file descriptor. If fd is the last file descriptor referring to the underlying open file description, the resources associated with the open file description are freed.

The POSIX close syscall does not necessarily close the file; it merely drops a reference. Only when the last reference is dropped is the underlying object actually released.
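
The descriptor/description distinction can be observed from Rust's standard library, as a small sketch: File::try_clone creates a second handle to the same underlying file handle (dup-style on Unix), so dropping the original handle neither closes the file nor resets the shared offset. The temporary file name below is an arbitrary choice for the demonstration.

```rust
use std::fs::{self, File};
use std::io::{self, Write};

/// Writes through an original handle and a duplicate, closing the original
/// in between, then returns the resulting file contents.
pub fn write_through_duplicate() -> io::Result<String> {
    // Arbitrary scratch path, chosen only for this demonstration.
    let path = std::env::temp_dir().join(format!("dup_demo_{}.txt", std::process::id()));

    let mut original = File::create(&path)?;
    original.write_all(b"hello")?;

    // Second handle referring to the same underlying open file.
    let mut duplicate = original.try_clone()?;

    // Closing the first handle only drops one reference.
    drop(original);

    // The duplicate still works and continues at the shared offset.
    duplicate.write_all(b" world")?;
    drop(duplicate); // last handle gone: the file is actually closed now

    let contents = fs::read_to_string(&path)?;
    fs::remove_file(&path)?;
    Ok(contents)
}
```

The second write lands after the first rather than overwriting it, because both handles share one file position.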

From a semantics perspective, WASI can model something very similar. We can define an underlying file or socket "object", where the various WASI resources and derived streams are reference-counted handles to that object. The underlying object is only released back to the operating system once the final reference is dropped.

Concretely, for files, the following keep the underlying file object alive:

  • The descriptor resource itself
  • The stream & future returned by descriptor::read-via-stream
  • The stream passed into + the future returned by descriptor::write-via-stream
  • The stream passed into + the future returned by descriptor::append-via-stream
  • The stream & future returned by descriptor::read-directory

For TCP sockets, the following keep the underlying socket object alive:

  • The tcp-socket itself
  • The stream & future returned by tcp-socket::receive
  • The stream passed into + the future returned by tcp-socket::send
  • The stream returned by tcp-socket::listen

Notably, the following do not keep the original object alive:

  • The descriptor returned by descriptor::open-at, which represents a distinct file object from the base descriptor used to open it
  • The client sockets returned by tcp-socket::listen, which are distinct objects from the listening socket used to accept them
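
The keep-alive rules for TCP sockets listed above can be modelled with plain reference counting, as in this sketch. The type names (SocketObject, ReceiveStream, SendResult) and the released flag are hypothetical and exist only to make the release point observable; a real host would hold an OS socket and would likely use a thread-safe count.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical host-side socket object. `released` is flipped when the
/// object would be handed back to the operating system.
struct SocketObject {
    released: Rc<Cell<bool>>,
}

impl Drop for SocketObject {
    fn drop(&mut self) {
        self.released.set(true);
    }
}

// Each guest-visible handle holds one reference to the underlying object.
#[allow(dead_code)]
pub struct TcpSocket(Rc<SocketObject>);
#[allow(dead_code)]
pub struct ReceiveStream(Rc<SocketObject>);
#[allow(dead_code)]
pub struct SendResult(Rc<SocketObject>);

impl TcpSocket {
    pub fn new(released: Rc<Cell<bool>>) -> Self {
        TcpSocket(Rc::new(SocketObject { released }))
    }

    /// Models `tcp-socket::receive`: the returned stream keeps the object alive.
    pub fn receive(&self) -> ReceiveStream {
        ReceiveStream(Rc::clone(&self.0))
    }

    /// Models `tcp-socket::send`: the returned future keeps the object alive.
    pub fn send(&self) -> SendResult {
        SendResult(Rc::clone(&self.0))
    }
}

/// Drops the handles one by one and reports whether the underlying object
/// had been released after each drop.
pub fn demo() -> (bool, bool, bool) {
    let released = Rc::new(Cell::new(false));
    let socket = TcpSocket::new(Rc::clone(&released));
    let rx = socket.receive();
    let tx = socket.send();

    drop(socket);
    let after_socket = released.get();
    drop(rx);
    let after_rx = released.get();
    drop(tx);
    let after_tx = released.get();

    (after_socket, after_rx, after_tx)
}
```

In this model, dropping the tcp-socket handle while a send or receive is in flight leaves the underlying object untouched; it is released only when the last derived handle goes away.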

Reference counting is a way of implementing this for example

You note this as just one possible implementation strategy. However, closing files and sockets has observable side effects:

  • Closing a file releases advisory locks
  • Closing a socket releases its local port binding

As a result, reference counting is effectively the only viable implementation strategy if POSIX-compatible semantics are desired.


I didn't catch what the reason was for removing async from stdio, given that they're non-member functions and so the transfer use cases don't apply?

They indeed don't keep a this borrow alive. But they do require that the entire component instance remains alive.

@lukewagner
Member

They indeed don't keep a this borrow alive. But they do require that the entire component instance remains alive.

At the moment (pre-runtime-instantiation), I think we have a model that, once created, component instances basically live as long as their containing Store (both in the spec and impl meanings of the word). (Hypothetically a GC could collect unreachable component instances, but I don't think anyone has built one and we definitely don't design for it.) So I would think this isn't a problem in the short term.

In a future with runtime instantiation (where component instances can be created as resources that can be explicitly promptly freed via resource.drop) where a runtime-instantiated component wants to virtualize stdio: in general it will be necessary to keep the component instance alive as long as it has streams going in or out (since destroying the instance would forcibly drop the paired ends of the stdio streams and any linear-memory state used to implement them), so I think the borrow of the component instance implied by async is still fine and even necessary. (This wouldn't be a problem for our other WASI HTTP/Socket/Filesystem streaming use cases since you wouldn't expect to implement individual request/response/descriptor/socket resources with individual runtime component instances -- the component instance holding the linear memory and readable/writable ends of streams would be owned by the (longer-lived) component instance defining and implementing all the individual resource types. Thus, it's the non-method-ness of stdio functions that is the key determinant here.)

@alexcrichton
Contributor

From a semantics perspective, WASI can model something very similar.

Agreed on all counts of what you describe @badeend, and I believe what you describe is the current semantics in Wasmtime as implemented for WASIp3. Or at least that's the intention.

As a result, reference counting is effectively the only viable implementation strategy if POSIX-compatible semantics are desired.

Good point! Personally I think that's ok, iunno if others feel different.


Orthogonally, one nice part about stream-y things not being async irrespective of borrow, like stdio, is that you can get "tail call" semantics, in the sense that if you forward the call somewhere else the async task doesn't have to stay alive. With a plain func you can return and get your task off the component model stack/your own runtime stack. That way you could, in theory, polyfill with a bit less overhead.

@lukewagner
Member

@alexcrichton True, although fwiw, there's some farther-future promise pipelining-like optimizations that we probably want eventually that would provide the same tail-call behavior for async function call results (specifically the ability to pass a future<T> when lifting a T value and then, complementarily, the ability to extract a future<T> from a func(...) -> T subtask). So hopefully this would go away as a design consideration.

But I suppose yet another reason to prefer stdio.write-via-stream returning a future is simply to have better symmetry with stdio.read-via-stream, which also returns a future. So lgtm as-is from me too.

@vados-cosmonic
Contributor

vados-cosmonic commented Jan 25, 2026

Thanks for the extended discussion and laying out the rationale here everyone -- I think I have one last possibly silly question... Should we be returning a future<result<_, error-code>> or a result<future<result<_, error-code>>, error-code> ?

The latter is of course much more tedious to write, but it seems like we should be able to save some cycles by enabling an error to be returned without constructing the future and resolving it at all. That said, platforms could optimize such a result in this very specific case but I wonder if the default should be to force poll creation.

While not the same, the last time a similar API surface conversation came up in HTTP we went back and forth between result<future<...>> and future<result<...>> at least once. We settled on future<result<...>> because, IIRC, users who want trailers at all will poll the future (and things will fall as they may), while everyone else can ignore it completely and never incur any additional cost.

@badeend
Member Author

badeend commented Jan 25, 2026

These methods are now synchronous. They can start an operation, but they can't perform any I/O themselves anymore. The only way the extra outer result would avoid creating or polling a future is if the operation somehow failed before doing any I/O.
I don't think that will (or even can) happen in practice.
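
For comparison, the two return shapes under discussion can be sketched in Rust as follows. Everything here is hypothetical: the ErrorCode variant, the valid flag standing in for "failed before doing any I/O", and the toy block_on executor are illustrative only and are not how generated bindings or a real runtime behave.

```rust
use std::future::{ready, Future, Ready};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, PartialEq)]
pub enum ErrorCode {
    InvalidArgument,
}

// Already-resolved future, standing in for a real pending operation.
type WriteFuture = Ready<Result<(), ErrorCode>>;

/// Shape in this PR: future<result<_, error-code>>.
/// Every failure, early or late, is reported through the future.
pub fn write_flat(valid: bool) -> WriteFuture {
    if valid {
        ready(Ok(()))
    } else {
        ready(Err(ErrorCode::InvalidArgument))
    }
}

/// Alternative shape: result<future<result<_, error-code>>, error-code>.
/// An early failure is returned without constructing a future.
pub fn write_nested(valid: bool) -> Result<WriteFuture, ErrorCode> {
    if !valid {
        return Err(ErrorCode::InvalidArgument);
    }
    Ok(ready(Ok(())))
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor that polls in a loop; only suitable for this demonstration.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

With the flat shape a caller always has one error path, at the cost of creating and resolving a future even for an early failure; with the nested shape the caller handles two error paths but can bail out before any future exists.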

@programmerjake
Contributor

The only way the extra outer result would avoid creating or polling a future is if the operation somehow failed before doing any I/O.
I don't think that will (or even can) happen in practice.

seems like failure to allocate memory (or some other failure to allocate) fits with returning an outer result

@vados-cosmonic
Contributor

These methods are now synchronous. They can start an operation, but they can't perform any I/O themselves anymore. The only way the extra outer result would avoid creating or polling a future is if the operation somehow failed before doing any I/O.

Yup, this is exactly what I meant -- the "somehow failed before doing any I/O" is not so easy to rule out. For example if it is possible to do a completely synchronous call and get a hint or know about failure ahead of time it would be nice to be able to "exit early" in a sense and avoid creating a future and doing the relevant back-and-forth at all.

I'm thinking less about figuring out all the ways this case could happen and more about whether we want the API to require the creation of a future or not.
