Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConnectionClosed exception from serve_websocket() #96

Closed
belm0 opened this issue Dec 11, 2018 · 4 comments
Closed

ConnectionClosed exception from serve_websocket() #96

belm0 opened this issue Dec 11, 2018 · 4 comments
Labels
bug Something isn't working

Comments

@belm0
Copy link
Member

belm0 commented Dec 11, 2018

Hit a case where serve_websocket is leaking a ConnectionClosed exception. My handler doesn't appear to be involved in this trace. What am I missing?

  ...
  File "/.../site-packages/trio_websocket/_impl.py", line 327, in serve_websocket
    await server.run(task_status=task_status)
  File "/.../site-packages/trio_websocket/_impl.py", line 1088, in run
    await trio.sleep_forever()
  File "/.../site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/.../site-packages/trio/_highlevel_serve_listeners.py", line 129, in serve_listeners
    task_status.started(listeners)
  File "/.../site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/.../site-packages/trio/_highlevel_serve_listeners.py", line 27, in _run_handler
    await handler(stream)
  File "/.../site-packages/trio_websocket/_impl.py", line 1116, in _handle_connection
    await connection.aclose()
  File "/.../site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/.../site-packages/trio_websocket/_impl.py", line 921, in _reader_task
    await handler(event)
  File "/.../site-packages/trio_websocket/_impl.py", line 808, in _handle_connection_closed_event
    await self._write_pending()
  File "/.../site-packages/trio_websocket/_impl.py", line 959, in _write_pending
    raise ConnectionClosed(self._close_reason) from None
  trio_websocket._impl.ConnectionClosed: <CloseReason code=1006 name=ABNORMAL_CLOSURE reason=None>
@belm0
Copy link
Member Author

belm0 commented Jan 14, 2019

From the trace, it's clearly related to a newly incoming connection. There is probably server code using _write_pending but not handling ConnectionClosed. @mehaase would you take a look?

This is a trio-websocket reliability issue for my use case, happening enough to be a problem.

(Note that async stack traces can mislead by incorrectly attributing code location to the last line of an async block. E.g. the exception in _handle_connection could well be anywhere in the nursery scope, such as the nursery.start_soon(connection._reader_task) call, yet Python attributes it to the last line, 1116, await connection.aclose().)

@mehaase
Copy link
Contributor

mehaase commented Jan 21, 2019

I'll look at it this week.

@mehaase
Copy link
Contributor

mehaase commented Jan 22, 2019

I think I've managed to reproduce this. Here is an implementation of a faulty server:

from random import random

import trio
from trio_websocket import serve_websocket, ConnectionClosed


async def main():
    ''' Main entry point. '''
    print('Starting websocket server…')
    await serve_websocket(handler, 'localhost', 8000, ssl_context=None)


async def handler(request):
    ''' Reverse incoming websocket messages and send them back. '''
    try:
        ws = await request.accept()
        await ws.send_message('foo')
        await trio.sleep(random()/1000)
        await trio.aclose_forcefully(ws._stream)
    except ConnectionClosed:
        pass

if __name__ == '__main__':
    trio.run(main)

The client is this:

import trio
from trio_websocket import connect_websocket, ConnectionClosed

async def main():
    async with trio.open_nursery() as nursery:
        for n in range(1000):
            print('connection {}'.format(n))
            try:
                connection = await connect_websocket(nursery, 'localhost', 8000,
                    '/', use_ssl=False)
                await connection.get_message()
                await connection.aclose()
            except ConnectionClosed:
                print('connection closed')

if __name__ == '__main__':
    trio.run(main)

The client connects to the server repeatedly. Within a small number of iterations (often less than 10), the server crashes with the following exception.

$ python bad-server.py
Starting websocket server…
Traceback (most recent call last):
  File "bad-server.py", line 24, in <module>
    trio.run(main)
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_core/_run.py", line 1337, in run
    raise runner.main_task_outcome.error
  File "bad-server.py", line 10, in main
    await serve_websocket(handler, 'localhost', 8000, ssl_context=None)
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 327, in serve_websocket
    await server.run(task_status=task_status)
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 1091, in run
    await trio.sleep_forever()
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_highlevel_serve_listeners.py", line 129, in serve_listeners
    task_status.started(listeners)
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_highlevel_serve_listeners.py", line 27, in _run_handler
    await handler(stream)
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 1119, in _handle_connection
    await connection.aclose()
  File "/home/mhaase/code/trio-websocket/venv/lib/python3.6/site-packages/trio/_core/_run.py", line 397, in __aexit__
    raise combined_error_from_nursery
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 924, in _reader_task
    await handler(event)
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 811, in _handle_connection_closed_event
    await self._write_pending()
  File "/home/mhaase/code/trio-websocket/trio_websocket/_impl.py", line 962, in _write_pending
    raise ConnectionClosed(self._close_reason) from None
trio_websocket._impl.ConnectionClosed: <CloseReason code=1006 name=ABNORMAL_CLOSURE reason=None>

The root cause is that the reader task is getting a connection closed event, and it wants to call _write_pending(), but before it can do that the stream is already broken. The reader task does not catch ConnectionClosed, which causes the exception to propagate out and cancel the entire nursery.

Keeping mind that I want to overhaul the close logic in #90 at a later date, I'll try to keep the fix for this bug as small as possible. The simplest fix is that the reader task should catch ConnectionClosed when it calls _write_pending() directly or when it calls it indirectly through a handler function. While chasing this down, I also noticed that WebSocketConnection.aclose() can also leak a ConnectionClosed exception, so I'll fix that, too.

With these changes, the scripts above run without error. I'll work on a test case tomorrow. The bug is triggered by a race (it only occurs if the server has a random delay added), so building a test case should be a matter of carefully sequencing events.

@mehaase mehaase added the bug Something isn't working label Jan 22, 2019
@belm0
Copy link
Member Author

belm0 commented Jan 22, 2019

Thank you! That does seem to match, and my app has a non-trivial amount of stuff on the trio scheduler which could cause the delay.

mehaase added a commit that referenced this issue Jan 24, 2019
@belm0 belm0 closed this as completed Feb 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants