client example not handling closed connection #22

Closed
belm0 opened this issue Sep 9, 2018 · 16 comments

belm0 commented Sep 9, 2018

The client example is not handling the case where server suddenly drops the connection, since it's blocked waiting for keyboard input.

Also, there seems to be no reason to run input on a background thread, only to then await the result.

There should be two tasks running in parallel: one to accept and process commands, and another to loop on get_message() and queue the results.

belm0 commented Sep 9, 2018

Taking a step back, it's odd that the API defers the closed connection event until the next call to get_message(). The exception should be surfaced from the API as soon as it happens, so that the nursery passed to connect() gets cancelled. This is another argument for using start() to invoke connect(), as mentioned in #20.

In fact, the API should promote use of an async context manager:

async with trio_websocket.open_connection(host, port) as connection:
    try:
        ...
    except ConnectionClosed:
        pass

@mehaase mehaase self-assigned this Oct 5, 2018
mehaase commented Oct 5, 2018

I think raising at the WebSocket I/O call is the most user-friendly way to do this (see the database example in this post). Cancelling the nursery is pretty unfriendly, because the caller has to handle ConnectionClosed and Cancelled correctly, and the code path that catches Cancelled cannot get the WebSocket's close code.

Maybe there needs to be an optional cancel scope for use cases like the example client, where we would prefer for our handler to be cancelled.

mehaase added a commit that referenced this issue Oct 11, 2018
This optional scope is cancelled when the connection is closed.
mehaase added a commit that referenced this issue Oct 11, 2018
If the server closes the connection, the client will quit immediately
rather than waiting for the next user command on stdin.
mehaase commented Oct 11, 2018

I've written up a quick implementation with an optional cancel_scope argument for client connections. This scope is cancelled when the connection closes. I updated the example client. Here's a sample run where the server closed the connection while the client was waiting for the user to enter a command.

(venv) $ python examples/client.py ws://localhost:8000/asdf
DEBUG:root:Connecting to WebSocket…
DEBUG:trio-websocket:Connecting to ws://localhost:8000/asdf?
DEBUG:trio-websocket:conn#0 sending 158 bytes
DEBUG:trio-websocket:conn#0 received 110 bytes
DEBUG:trio-websocket:conn#0 received event: ConnectionEstablished
DEBUG:root:Connected!
cmd> DEBUG:trio-websocket:conn#0 received zero bytes (connection closed)
DEBUG:trio-websocket:conn#0 websocket closed <ConnectionClosed <CloseReason code=1006 name=ABNORMAL_CLOSURE reason=None>>
DEBUG:trio-websocket:conn#0 reader task finished
DEBUG:root:Connection closed

I don't love this implementation. It doesn't provide an easy way for the user to know that the server closed the connection. It also adds another argument to the WebSocketConnection constructor and the client factory functions. Definitely welcoming any feedback here...

Other possible implementations:

  • Add a [synchronous] callback function that is called when the connection closes. This still pollutes the arguments for the constructor/factories, but it is more general than cancellation and the close reason could be passed to the callback.
  • Add an async def wait_closed() function that blocks until the connection closes and then returns the close reason. This does not pollute the arguments for the constructor/factories, is more general than cancellation, and provides the close reason to the caller. I think this is my favorite approach but I worry that it is over-engineered, especially since I don't have any use cases for this "handle connection closed immediately" behavior other than the example client.

Thoughts?

belm0 commented Oct 12, 2018

Passing in an optional trio.Event() would be more general than a cancel scope, while avoiding callbacks. I've got a decently sized codebase built from day one on trio, and there's not a single callback to be found.

I think the "handle closed connection immediately" use case is real. For example, what if the client only sends a command occasionally, but at those times needs a response as fast as possible? It's important to keep a connection active, reconnecting as needed.

However, I'm not convinced any new API is necessary, since we already get a signal when the connection closes, by way of the open_websocket_url context manager exiting. Rather, it may just be an issue with the structure of this example.

I tried to address this simply in #52, but it breaks the regular send and receive case, and I can't figure out why.

njsmith commented Oct 13, 2018

I'm not sure how to handle this kind of automagical cancel-on-disconnect situation. (It also comes up on the server side – if a connection drops, should you immediately cancel the connection handler? In an HTTP server, should you pre-emptively cancel a request handler if the client disappears and won't be able to receive the response?)

In general, Trio's bias is toward delivering notifications by someone doing await wait_for_event(), not via callbacks or cancellation or whatever. So my gut feeling is that if you want to notice when the connection drops, maybe have a task to listen for that or something? But sometimes my gut feelings are wrong. Maybe it really is too annoying, and people really do prefer automagic cancellation.

In this case, wouldn't it make more sense anyway to have a background task that listens for incoming messages and prints them? And if it gets an EOF, it could raise SystemExit or something?
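That background-reader shape might look like this (a sketch; ConnectionClosed here is a local stand-in for trio-websocket's exception):

```python
class ConnectionClosed(Exception):
    """Stand-in for the real trio_websocket.ConnectionClosed."""

async def print_messages(conn):
    # Background task: print incoming messages, and end the whole program
    # as soon as the server closes the connection.
    try:
        while True:
            print(await conn.get_message())
    except ConnectionClosed:
        raise SystemExit('server closed the connection')
```

Run in the same nursery as the command loop, the SystemExit tears down the client without any new library API.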

belm0 commented Oct 13, 2018

(Ignore my previous comment about a signal already being available; I was confused by outdated dependencies.)

Putting this particular example aside, aggressive reconnect seems like something people will want and, if not supported automatically by the library, there should at least be an example users can look at.

Monitoring in a separate task sounds reasonable, but at a minimum WebSocketConnection should expose a trio.Event for closed, so that users don't have to poll is_closed(). (This is along the lines of wait_closed() proposed by @mehaase, but more general.)

njsmith commented Oct 13, 2018

What's the difference between await conn.close_event.wait() and await conn.wait_closed()?

belm0 commented Oct 13, 2018

Not much. I was thinking it might be easier to pass an event around, or there may be utilities that take an event. But just using thunks is probably more common. And the event is more dangerous, since it's a mutable property.

njsmith commented Oct 13, 2018

I'm not 100% sure wait_closed is even a good idea actually, versus using receive.

The thing is, underneath, the only way to actually watch for the socket being closed is to read events off of it. But since you don't want to buffer an arbitrary number of events, if some events arrive and the user hasn't called receive, then eventually we'll have to stop reading from the socket until they do call receive. Which means wait_closed is unreliable unless some other task is regularly calling receive. But if a task is regularly calling receive, then they'll get notified of closure, so the wait_closed seems unnecessary.

mehaase commented Oct 15, 2018

@njsmith: I'm not 100% sure wait_closed is even a good idea actually, versus using receive.

This point about userspace buffering is a very convincing argument in favor of waiting for a ConnectionClosed exception over any variant of wait_closed().

@belm0 I think the "handle closed connection immediately" use case is real. For example, what if the client only sends a command occasionally, but at those times needs a response as fast as possible. It's important to keep a connection active, reconnecting as needed.

Yes, I see there are two scenarios here. One is the "userspace buffer is full" scenario that Nathaniel discussed. The other is a "no activity on the socket" scenario that you brought up. In your scenario, I think we should recommend that users create a heartbeat task that sends pings. We can provide a recipe for this in the docs.

async def heartbeat(ws):
    while True:
        await trio.sleep(N)
        await ws.ping()

The ping() method will raise ConnectionClosed, so the user can start the heartbeat task in the same nursery as the handler, and when the connection closes it will cancel the nursery.

I'm not sure the current ping() is ideal: it sends a ping frame and then returns. So the "recipe" above doesn't actually check whether any pong frames are received, but it would at least detect a dropped connection.

Maybe ping() should be modified to wait for a corresponding pong()? The WebSocket RFC allows a payload with a ping frame, so we could send a random nonce and then continue when we receive a pong frame with the same nonce.

@njsmith: In this case, wouldn't it make more sense anyway to have a background task that listens for incoming messages and prints them? and if it gets an EOF, then it could raise SystemExit or something?

Yes, this is exactly how it should be. I'll create a new PR with this design.

njsmith commented Oct 15, 2018

Maybe ping() should be modified to wait for a corresponding pong()

One subtlety to watch out for here:

If an endpoint receives a Ping frame and has not yet sent Pong frame(s) in response to previous Ping frame(s), the endpoint MAY elect to send a Pong frame for only the most recently processed Ping frame.

So if we did do this, we'd want to make sure that receiving a pong for a later ping would also wake up earlier pings. (Probably not too hard, but it would require that the ping frames be unique and that we keep a central list of outstanding pings.)

mehaase commented Oct 16, 2018

Oh, wow, good catch in the spec. I think that completely blows up my desire for each call to ping() to wait for a corresponding pong, since the remote endpoint isn't required to always send one. A couple of other ideas:

  1. A pong wakes up all tasks that are waiting on ping(). If the user cares about matching each ping to a pong, then they are responsible for only having one ping/pong in flight at a time, or they can implement their own logic to match pings and pongs together.
  2. Place a counter/timestamp in each ping: each pong wakes up pings less than or equal to the pong's counter/timestamp.

Given that users may want to use the ping payload for their own purposes, I lean towards #1. This would require minimal changes to the current implementation and would mostly consist of a recipe in the docs, to go along with the heartbeat recipe mentioned above.

mehaase added a commit that referenced this issue Oct 16, 2018
This is the second attempt at this. Instead of modifying
trio-websocket to support cancelling a handler, this commit
modifies only the example client itself: it now has one task
that gets messages and prints them, and a second task that
gets user input and sends messages/commands/etc.
mehaase added a commit that referenced this issue Oct 16, 2018
As suggested in this issue thread, there needs to be a way for
clients that send infrequent messages to make sure that the
connection is still open. Ping/pong are the natural way to do
this, but the ping behavior needs to be a bit more robust.

This commit adds a `wait_pong()` method that waits for a pong
to arrive and returns its payload.
mehaase added a commit that referenced this issue Oct 16, 2018
Add a heartbeat recipe to the README. Also add a heartbeat mode to
the example client.
mehaase commented Oct 16, 2018

PRs #55 and #56 should cover this issue. The first one fixes the client so that it exits immediately when the connection is closed (with no change to trio-websocket itself), and the second one improves ping/pong and documents a heartbeat recipe. Let me know what you think.

njsmith commented Oct 16, 2018

I checked how aaugustin's websockets library handles the ping thing, and I wasn't disappointed. It looks like its approach is:

  • Keep an OrderedDict of outstanding pings
  • When sending a ping, assign a random payload that's not currently outstanding
  • Or, the user can provide a custom payload, but if it duplicates one of the outstanding payloads, raise an error
  • ping waits for a reply (by returning a Future; since it's asyncio you have the option of ignoring this)
  • When a pong is received, use the OrderedDict to tell whether it's a solicited pong, and if so then alert the appropriate call to ping + all previous calls

My inclination would be to copy this design, except make ping an async function so that waiting for the pong is mandatory. If you want to do a fire-and-forget ping, that's what pong is for...

njsmith commented Oct 16, 2018

Here's the source i was looking at: https://github.com/aaugustin/websockets/blob/master/src/websockets/protocol.py

In particular check out the methods ping, pong, and read_data_frame (which handles incoming pongs).

mehaase added a commit that referenced this issue Oct 16, 2018
As suggested in the discussion thread, this commit mimics the
ping/pong API in aaugustin's websockets. A call to ping waits for
a response with the same payload. (An exception is raised if the
payload matches a ping that's already in flight.) A payload is
randomly generated if omitted by the caller. Add a new pong()
method that can be used for sending an unsolicited pong.
mehaase added a commit that referenced this issue Oct 16, 2018
Add a heartbeat recipe to the README. Also add a heartbeat mode to
the example client.
mehaase commented Oct 16, 2018

I've submitted PR #58 that mimics aaugustin's design. I'll leave 56 and 58 open for now so we can compare the two approaches.

mehaase added a commit that referenced this issue Oct 23, 2018
@mehaase mehaase closed this as completed Oct 23, 2018