Conversation

@r-bit-rry
Contributor

This PR fixes issue #3185.
The code calls await event_gen.aclose(), but OpenAI's AsyncStream doesn't have an aclose() method; it has close() (which is async).
When clients cancel streaming requests, the server tries to clean up with:

await event_gen.aclose()  # ❌ AsyncStream doesn't have aclose()!

But AsyncStream has never had a public aclose() method. The error message literally tells us:

AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'?
                                                                            ^^^^^^^^
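
For context, a minimal sketch of the defensive cleanup discussed below (the helper name close_event_stream is illustrative, not the exact PR diff), assuming event_gen may be either a PEP 525 async generator or an OpenAI AsyncStream:

async def close_event_stream(event_gen):
    # Async generators (PEP 525) expose aclose(); OpenAI's AsyncStream exposes an async close().
    if hasattr(event_gen, "aclose"):
        await event_gen.aclose()
    elif hasattr(event_gen, "close"):
        await event_gen.close()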

Verification

  • The reproduction script reproduce_issue_3185.sh can be used to verify the fix.
  • Manual checks and validation against the original OpenAI library code.

@meta-cla bot added the CLA Signed label Nov 26, 2025
Collaborator

@mattf left a comment


@r-bit-rry a chain of hasattr checks like this suggests we've done something wrong in the design. have we, or can we just call close()?

@r-bit-rry
Contributor Author

r-bit-rry commented Nov 27, 2025

@r-bit-rry a chain of hasattr checks like this suggests we've done something wrong in the design. have we, or can we just call close()?

It really comes down to what we want to support. Since this was never strictly typed, I'm assuming there are other objects that can be produced by the sse_generator.
I'd go a step further and ask why we even assume a close method exists.

and on a more serious note @mattf:
PEP 525 defines aclose() for async generators.
The OpenAI and Anthropic SDKs deviate from the standard and use close() instead (while under the hood they both call httpx.Response.aclose()).
We cannot control third-party SDK design choices.

so it's our decision whether we want to enforce certain typings and act on them, or let this pattern "catch all".
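
To make the distinction concrete, a small standard-library illustration of the PEP 525 behavior referenced above (nothing here is specific to this codebase):

async def numbers():
    yield 1

gen = numbers()
print(hasattr(gen, "aclose"), hasattr(gen, "close"))  # True False: async generators define aclose(), not close()
# openai's AsyncStream is the mirror image: it defines an async close() that wraps httpx.Response.aclose()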

Collaborator

@mattf left a comment


as proposed, the hasattr chain will cover up an api contract bug somewhere in the system.

an AsyncStream is making it to a place where only AsyncIterators should be.

i did a little sleuthing and i think there is a bug in at least _maybe_overwrite_id. there are multiple provider impls, so there may be others.

will you find the places where the api contract is being violated and patch them?

also, will you create a regression test that at least tests the openai mixin provider?

@r-bit-rry
Contributor Author

@mattf sure thing, I'll start working on those

@r-bit-rry
Contributor Author

@mattf As I see it, we have two options for avoiding the hasattr chain when handling the AsyncStream object:
option 1: wrapping

async def wrap_async_stream(stream):
    # Re-yield each item so the result is a true async generator, which has aclose() (PEP 525).
    async for item in stream:
        yield item

this is explicit and simple, but carries a small per-chunk overhead

option 2: adapter pattern

class AsyncStreamAdapter(AsyncIterator[T]):  # AsyncIterator, TypeVar from typing; T = TypeVar("T")
    def __init__(self, stream): self._stream = stream
    def __aiter__(self): return self
    async def __anext__(self): return await self._stream.__anext__()
    async def aclose(self): await self._stream.close()  # delegate aclose() to the SDK's close()

direct delegation, no re-yielding, and more explicit intent
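
Either way, the change at each return site stays small. A rough sketch of how it could be applied where the raw SDK stream is currently handed back (the function and variable names here are assumptions for illustration):

def _as_async_iterator(sdk_stream):
    # Both options produce an object exposing aclose(), satisfying the AsyncIterator contract.
    return wrap_async_stream(sdk_stream)       # option 1: re-yield wrapper
    # return AsyncStreamAdapter(sdk_stream)    # option 2: adapter delegating aclose() to close()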

regarding the locations of violations that will need patching, these are the places I was able to spot:

  1. _maybe_overwrite_id()
     Location: src/llama_stack/providers/utils/inference/openai_mixin.py:251
     When overwrite_completion_id=False and stream=True, it returned an AsyncStream (has close()) instead of an AsyncIterator (has aclose()).

  2. PassthroughInferenceAdapter - openai_chat_completion()
     Location: src/llama_stack/providers/remote/inference/passthrough/passthrough.py:123-136
     Returned the raw client.chat.completions.create() response.

  3. PassthroughInferenceAdapter - openai_completion()
     Location: src/llama_stack/providers/remote/inference/passthrough/passthrough.py:108-121
     Returned the raw client.completions.create() response.

  4. LiteLLMOpenAIMixin - openai_chat_completion()
     Location: src/llama_stack/providers/utils/inference/litellm_openai_mixin.py:222-278
     Returned the raw litellm.acompletion() result.

  5. LiteLLMOpenAIMixin - openai_completion()
     Location: src/llama_stack/providers/utils/inference/litellm_openai_mixin.py:179-220
     Returned the raw litellm.atext_completion() result.
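
As a starting point for the regression test requested above, a self-contained sketch of the kind of check that would catch these violations; FakeAsyncStream stands in for openai.AsyncStream, and all names here are illustrative rather than taken from the existing test suite:

import asyncio

class FakeAsyncStream:
    # Stands in for openai.AsyncStream: async-iterable, has an async close(), but no aclose().
    def __init__(self, items): self._items = iter(items)
    def __aiter__(self): return self
    async def __anext__(self):
        try:
            return next(self._items)
        except StopIteration:
            raise StopAsyncIteration
    async def close(self): pass

async def wrap_async_stream(stream):  # option 1 wrapper from above
    async for item in stream:
        yield item

async def main():
    wrapped = wrap_async_stream(FakeAsyncStream([1, 2, 3]))
    assert hasattr(wrapped, "aclose")               # the contract the server-side cleanup relies on
    assert [chunk async for chunk in wrapped] == [1, 2, 3]
    await wrapped.aclose()

asyncio.run(main())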

@mattf
Collaborator

mattf commented Nov 27, 2025

@r-bit-rry great finds! it looks like we're violating the api contract and using # ignore to cover it up...resulting in the bug. what fix do you suggest?
