Conversation

@r-bit-rry
Contributor

This PR fixes issue #3185.
The code calls await event_gen.aclose(), but OpenAI's AsyncStream doesn't have an aclose() method; it has close() (which is async).
When clients cancel streaming requests, the server tries to clean up with:

await event_gen.aclose()  # ❌ AsyncStream doesn't have aclose()!

But AsyncStream has never had a public aclose() method. The error message literally tells us:

AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'?
                                                                            ^^^^^^^^
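
For context, a minimal sketch of the defensive cleanup discussed below (the helper name close_event_stream is illustrative, not the exact PR diff), assuming event_gen may be either a PEP 525 async generator or an OpenAI AsyncStream:

async def close_event_stream(event_gen):
    # Async generators (PEP 525) expose aclose(); OpenAI's AsyncStream exposes an async close().
    if hasattr(event_gen, "aclose"):
        await event_gen.aclose()
    elif hasattr(event_gen, "close"):
        await event_gen.close()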

Verification

  • The reproduction script reproduce_issue_3185.sh can be used to verify the fix.
  • Manual checks and validation against the original OpenAI library code.

@meta-cla bot added the CLA Signed label Nov 26, 2025
Collaborator

@mattf left a comment


@r-bit-rry a chain of hasattr checks like this suggests we've done something wrong in the design. have we, or can we just call close()?

@r-bit-rry
Contributor Author

r-bit-rry commented Nov 27, 2025

@r-bit-rry a chain of hasattr checks like this suggests we've done something wrong in the design. have we, or can we just call close()?

It really comes down to what we want to support. Since this was never strictly typed, I'm assuming there are other objects that can be produced by the sse_generator.
I'd go a step further and ask why we even assume a close method exists.

and on a more serious note @mattf:
PEP 525 defines aclose() for async generators.
The OpenAI and Anthropic SDKs deviate from the standard and use close() instead (while under the hood they both call httpx.Response.aclose()).
We cannot control third-party SDK design choices.

so it's our decision whether we want to enforce certain typings and act on them, or let this pattern "catch all".
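
To make the distinction concrete, a small standard-library illustration of the PEP 525 behavior referenced above (nothing here is specific to this codebase):

async def numbers():
    yield 1

gen = numbers()
print(hasattr(gen, "aclose"), hasattr(gen, "close"))  # True False: async generators define aclose(), not close()
# openai's AsyncStream is the mirror image: it defines an async close() that wraps httpx.Response.aclose()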

Collaborator

@mattf left a comment


as proposed, the hasattr chain will cover up an api contract bug somewhere in the system.

an AsyncStream is making it to a place where only AsyncIterators should be.

i did a little sleuthing and i think there is a bug in at least _maybe_overwrite_id. there are multiple provider impls, so there may be others.

will you find the places where the api contract is being violated and patch them?

also, will you create a regression test that at least tests the openai mixin provider?

@r-bit-rry
Contributor Author

@mattf sure thing, I'll start working on those

@r-bit-rry
Contributor Author

@mattf As I see it, we have two options for avoiding the hasattr chain when handling the AsyncStream object:
option 1: wrapping

async def wrap_async_stream(stream):
    # Re-yield each item so the result is a true async generator, which has aclose() (PEP 525).
    async for item in stream:
        yield item

this is explicit and simple, but carries a small per-chunk overhead

option 2: adapter pattern

class AsyncStreamAdapter(AsyncIterator[T]):  # AsyncIterator, TypeVar from typing; T = TypeVar("T")
    def __init__(self, stream): self._stream = stream
    def __aiter__(self): return self
    async def __anext__(self): return await self._stream.__anext__()
    async def aclose(self): await self._stream.close()  # delegate aclose() to the SDK's close()

direct delegation, no re-yielding, and more explicit intent
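
Either way, the change at each return site stays small. A rough sketch of how it could be applied where the raw SDK stream is currently handed back (the function and variable names here are assumptions for illustration):

def _as_async_iterator(sdk_stream):
    # Both options produce an object exposing aclose(), satisfying the AsyncIterator contract.
    return wrap_async_stream(sdk_stream)       # option 1: re-yield wrapper
    # return AsyncStreamAdapter(sdk_stream)    # option 2: adapter delegating aclose() to close()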

regarding the locations of violations that will need patching, these are the places I was able to spot:

  1. _maybe_overwrite_id()
     Location: src/llama_stack/providers/utils/inference/openai_mixin.py:251
     When overwrite_completion_id=False and stream=True, it returned an AsyncStream (has close()) instead of an AsyncIterator (has aclose()).

  2. PassthroughInferenceAdapter - openai_chat_completion()
     Location: src/llama_stack/providers/remote/inference/passthrough/passthrough.py:123-136
     Returned the raw client.chat.completions.create() response.

  3. PassthroughInferenceAdapter - openai_completion()
     Location: src/llama_stack/providers/remote/inference/passthrough/passthrough.py:108-121
     Returned the raw client.completions.create() response.

  4. LiteLLMOpenAIMixin - openai_chat_completion()
     Location: src/llama_stack/providers/utils/inference/litellm_openai_mixin.py:222-278
     Returned the raw litellm.acompletion() result.

  5. LiteLLMOpenAIMixin - openai_completion()
     Location: src/llama_stack/providers/utils/inference/litellm_openai_mixin.py:179-220
     Returned the raw litellm.atext_completion() result.
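
As a starting point for the regression test requested above, a self-contained sketch of the kind of check that would catch these violations; FakeAsyncStream stands in for openai.AsyncStream, and all names here are illustrative rather than taken from the existing test suite:

import asyncio

class FakeAsyncStream:
    # Stands in for openai.AsyncStream: async-iterable, has an async close(), but no aclose().
    def __init__(self, items): self._items = iter(items)
    def __aiter__(self): return self
    async def __anext__(self):
        try:
            return next(self._items)
        except StopIteration:
            raise StopAsyncIteration
    async def close(self): pass

async def wrap_async_stream(stream):  # option 1 wrapper from above
    async for item in stream:
        yield item

async def main():
    wrapped = wrap_async_stream(FakeAsyncStream([1, 2, 3]))
    assert hasattr(wrapped, "aclose")               # the contract the server-side cleanup relies on
    assert [chunk async for chunk in wrapped] == [1, 2, 3]
    await wrapped.aclose()

asyncio.run(main())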

@mattf
Collaborator

mattf commented Nov 27, 2025

@r-bit-rry great finds! it looks like we're violating the api contract and using # ignore to cover it up...resulting in the bug. what fix do you suggest?
