@LucasArmandVast LucasArmandVast commented Nov 12, 2025

When parallel_requests = False, we implement a FIFO queue in the PyWorker backend to ensure requests are handled sequentially, in arrival order. This replaces the semaphore-based pseudo-queue we had before, which behaved as first-in-random-out.

This has been tested with the default serverless templates (comfy, vLLM, TGI) on prod.
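
For readers unfamiliar with the pattern, here is a minimal sketch of a FIFO gate built from a deque of asyncio.Event objects. It is not the PyWorker code: the class name FifoGate and the methods acquire and release are hypothetical, and the real backend keeps the queue on self.queue inside the request handler.

```python
import asyncio
from collections import deque


class FifoGate:
    """Hypothetical helper: admits one waiter at a time, in arrival order."""

    def __init__(self):
        self.queue = deque()  # holds one asyncio.Event per waiting request

    async def acquire(self):
        event = asyncio.Event()
        self.queue.append(event)
        if self.queue[0] is event:
            event.set()  # nobody is ahead of us, proceed immediately
        await event.wait()
        return event

    def release(self, event):
        # Pop the current head and wake the next waiter, if any.
        if self.queue and self.queue[0] is event:
            self.queue.popleft()
            if self.queue:
                self.queue[0].set()


async def demo():
    gate = FifoGate()
    order = []

    async def handle(i):
        event = await gate.acquire()
        try:
            # Later arrivals have less work, yet still finish after earlier ones.
            await asyncio.sleep(0.01 * (3 - i))
            order.append(i)
        finally:
            gate.release(event)

    await asyncio.gather(*(handle(i) for i in range(3)))
    return order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Each request only ever waits on its own event, so the wake-up order is decided by the position in the deque rather than by the scheduler.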

@LucasArmandVast LucasArmandVast marked this pull request as ready for review November 12, 2025 20:19
Bump pyworker version

def advance_queue_after_completion(event: asyncio.Event):
    """Pop current head and wake next waiter, if any."""
    if self.queue and self.queue[0] is event:
Contributor:

Do we need to check whether [0] is an event? Small nit.

Contributor Author:

This is not checking whether it is an event. It verifies that the event we pass in is in fact the current head of the queue.
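
A small sketch of what that identity guard buys, assuming the queue is a deque of asyncio.Event objects. The free function advance is a hypothetical stand-in for advance_queue_after_completion, which in the PR closes over self.queue.

```python
import asyncio
from collections import deque


def advance(queue, event):
    """Pop `event` only if it is the current head, then wake the next waiter."""
    if queue and queue[0] is event:  # identity check, not a type check
        queue.popleft()
        if queue:
            queue[0].set()
        return True
    return False


async def demo():
    head, second = asyncio.Event(), asyncio.Event()
    queue = deque([head, second])
    stale = advance(queue, second)  # not the head: nothing is popped or woken
    after_stale = (len(queue), second.is_set())
    real = advance(queue, head)     # the head: popped, and `second` is woken
    return stale, after_stale, real, len(queue), second.is_set()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the guard, a call made on behalf of a request that is not at the head would pop someone else's entry and wake a waiter out of turn.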

lib/backend.py Outdated
Comment on lines 243 to 255
if disconnect_task in first_done and not event.is_set():
    was_head = (self.queue and self.queue[0] is event)
    try:
        self.queue.remove(event)
    except ValueError:
        pass
    if was_head and self.queue:
        self.queue[0].set()

    for t in first_pending:
        t.cancel()
    await asyncio.gather(*first_pending, return_exceptions=True)
    return web.Response(status=499)
Contributor:

Isn't this duplicating advance_queue_after_completion?

lib/backend.py Outdated
Comment on lines 278 to 288
except asyncio.CancelledError:
    # Cleanup if request was cancelled
    was_head = (self.queue and self.queue[0] is event)
    try:
        self.queue.remove(event)
    except ValueError:
        pass
    if was_head and self.queue:
        self.queue[0].set()

    return web.Response(status=499)
Contributor:

Same here?
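
One possible way to address the duplication, sketched under assumptions: the two cleanup blocks differ from advance_queue_after_completion in that they must also remove an event that is still waiting behind the head, so a single helper could cover both cases. The name drop_from_queue is hypothetical, and the queue is assumed to be a list or deque of asyncio.Event objects.

```python
import asyncio
from collections import deque


def drop_from_queue(queue, event):
    """Remove `event` wherever it sits; wake the new head if `event` was the head."""
    was_head = bool(queue) and queue[0] is event
    try:
        queue.remove(event)
    except ValueError:
        return  # already removed, nothing to clean up
    if was_head and queue:
        queue[0].set()


async def demo():
    a, b, c = asyncio.Event(), asyncio.Event(), asyncio.Event()
    queue = deque([a, b, c])
    drop_from_queue(queue, b)  # a waiter in the middle disconnects
    middle = (len(queue), c.is_set())
    drop_from_queue(queue, a)  # the head disconnects: `c` becomes head and is woken
    head = (len(queue), c.is_set())
    drop_from_queue(queue, a)  # repeated cleanup is a harmless no-op
    return middle, head, len(queue)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Both the disconnect branch and the except asyncio.CancelledError branch could then call the same helper before returning the 499 response.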

@Colter-Downing (Contributor)

From GPT-5.1 =)

The main one is how cancel_api_call_if_disconnected is used. That function does await request.wait_for_disconnection(), logs, marks the request as canceled in metrics, and then raises asyncio.CancelledError. You always run it in a background task, disconnect_task = create_task(cancel_api_call_if_disconnected()), and then you race that task against other tasks using asyncio.wait. However, whenever disconnect_task “wins” the race, you never actually await it or gather it. In both the first race and the second race you only ever call gather on the “pending” tasks, not on the “done” set. If cancel_api_call_if_disconnected completes by raising CancelledError, that exception will just sit on the finished task, and in CPython that usually produces “Task exception was never retrieved” warnings. Your outer except asyncio.CancelledError around the handler does not help, because that only catches cancellation of the handler coroutine itself, not exceptions inside a separate Task.
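
A minimal sketch of the race with every task's outcome retrieved, not the PR's actual handler. The names wait_for_disconnect, do_work, and race are hypothetical, and the disconnect watcher here returns a marker instead of raising. One caveat worth checking against the claim above: a task whose coroutine raises asyncio.CancelledError ends up in the cancelled state, and asyncio does not log "Task exception was never retrieved" for cancelled tasks, so that warning is more likely to appear for other exception types. Gathering the done set is still cheap insurance.

```python
import asyncio


async def wait_for_disconnect(disconnected):
    # Hypothetical stand-in for cancel_api_call_if_disconnected: it returns a
    # marker instead of raising, so no exception is left on the task.
    await disconnected.wait()
    return "disconnected"


async def do_work(delay):
    await asyncio.sleep(delay)
    return "done"


async def race(work_delay, disconnect_after):
    disconnected = asyncio.Event()
    # Simulate the client dropping the connection after `disconnect_after` seconds.
    asyncio.get_running_loop().call_later(disconnect_after, disconnected.set)

    disconnect_task = asyncio.create_task(wait_for_disconnect(disconnected))
    work_task = asyncio.create_task(do_work(work_delay))

    done, pending = await asyncio.wait(
        {disconnect_task, work_task}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()
    # Await both tasks, done and pending, so every result or exception is retrieved.
    await asyncio.gather(disconnect_task, work_task, return_exceptions=True)

    if disconnect_task in done:
        return "disconnected"  # the real handler would return a 499 response here
    return work_task.result()


if __name__ == "__main__":
    print(asyncio.run(race(0.01, 0.2)))
    print(asyncio.run(race(0.2, 0.01)))
```

Returning a marker from the watcher and branching on it in the handler keeps the control flow in one place, instead of relying on an exception raised inside a separate task.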

3 participants