Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions client-sdks/stainless/openapi.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4151,6 +4151,16 @@ components:
type: array
- type: 'null'
nullable: true
reasoning_content:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdoern can we get away by adding just the reasoning field? while we are doing this, can you think about how we should add Gemini-3's "encrypted thought summaries" field also?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I can probably just use reasoning, I can look into the gemini field

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm actually @ashwinb , OpenAIChoiceDelta (used for streaming) already has reasoning_content. so should we mimic that and support both reasoning and reasoning_content for streaming/non-streaming?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added support for thought_signatures b99eb2a

pulled this from gemini docs: https://ai.google.dev/gemini-api/docs/thought-signatures

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a somewhat non-trivial decision to make here in terms of the API shape for reasoning. Gemini chooses extra_content to transport all new fields, whereas vLLM has clearly gone another way and used the fact that OpenAI's SDK uses "TypedDicts" which are amenable to adding sub-fields directly in other places (reasoning_content). I somewhat like vLLM's decision better and would rather add two sub-fields: (1) reasoning and (2) thought_signature to both the streaming (ChunkDelta) and non-streaming (Chunk) fields.

Do folks who have worked closer to inference have thoughts here @mattf @bbrowning?

Copy link
Collaborator Author

@cdoern cdoern Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I am fine with either approach here. One half-baked idea I have is that each provider could have a custom version of this class that extends it vLLMOpenAIMessageParams(OpenAIMessageParams) or something and we could add custom content there somehow?

the gemini docs outline their thought_signature support as:

{
      "role": "model",
        "tool_calls": [
          {
            "extra_content": {
              "google": {
                "thought_signature": "<Signature A>"
              }
            },
            "function": {
              "arguments": "{\"flight\":\"AA100\"}",
              "name": "check_flight"
            },
            "id": "function-call-1",
            "type": "function"
          }
        ]
    }

so I wonder if this will work without extra_content?

anyOf:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIAssistantMessageParam
type: object
OpenAIChatCompletionContentPartImageParam:
Expand Down Expand Up @@ -4218,6 +4228,11 @@ components:
title: OpenAIChatCompletionToolCallFunction
- type: 'null'
title: OpenAIChatCompletionToolCallFunction
extra_content:
anyOf:
- additionalProperties: true
type: object
- type: 'null'
type: object
title: OpenAIChatCompletionToolCall
description: Tool call specification for OpenAI-compatible chat completion responses.
Expand Down Expand Up @@ -4880,6 +4895,11 @@ components:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIChoiceDelta
type: object
OpenAIChunkChoice:
Expand Down Expand Up @@ -11747,6 +11767,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down Expand Up @@ -11776,6 +11804,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down
36 changes: 36 additions & 0 deletions docs/static/deprecated-llama-stack-spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -994,6 +994,16 @@ components:
type: array
- type: 'null'
nullable: true
reasoning_content:
anyOf:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIAssistantMessageParam
type: object
OpenAIChatCompletionContentPartImageParam:
Expand Down Expand Up @@ -1061,6 +1071,11 @@ components:
title: OpenAIChatCompletionToolCallFunction
- type: 'null'
title: OpenAIChatCompletionToolCallFunction
extra_content:
anyOf:
- additionalProperties: true
type: object
- type: 'null'
type: object
title: OpenAIChatCompletionToolCall
description: Tool call specification for OpenAI-compatible chat completion responses.
Expand Down Expand Up @@ -1723,6 +1738,11 @@ components:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIChoiceDelta
type: object
OpenAIChunkChoice:
Expand Down Expand Up @@ -8590,6 +8610,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down Expand Up @@ -8619,6 +8647,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down
36 changes: 36 additions & 0 deletions docs/static/experimental-llama-stack-spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -859,6 +859,16 @@ components:
type: array
- type: 'null'
nullable: true
reasoning_content:
anyOf:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIAssistantMessageParam
type: object
OpenAIChatCompletionContentPartImageParam:
Expand Down Expand Up @@ -926,6 +936,11 @@ components:
title: OpenAIChatCompletionToolCallFunction
- type: 'null'
title: OpenAIChatCompletionToolCallFunction
extra_content:
anyOf:
- additionalProperties: true
type: object
- type: 'null'
type: object
title: OpenAIChatCompletionToolCall
description: Tool call specification for OpenAI-compatible chat completion responses.
Expand Down Expand Up @@ -1588,6 +1603,11 @@ components:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIChoiceDelta
type: object
OpenAIChunkChoice:
Expand Down Expand Up @@ -7582,6 +7602,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down Expand Up @@ -7611,6 +7639,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down
36 changes: 36 additions & 0 deletions docs/static/llama-stack-spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3172,6 +3172,16 @@ components:
type: array
- type: 'null'
nullable: true
reasoning_content:
anyOf:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIAssistantMessageParam
type: object
OpenAIChatCompletionContentPartImageParam:
Expand Down Expand Up @@ -3239,6 +3249,11 @@ components:
title: OpenAIChatCompletionToolCallFunction
- type: 'null'
title: OpenAIChatCompletionToolCallFunction
extra_content:
anyOf:
- additionalProperties: true
type: object
- type: 'null'
type: object
title: OpenAIChatCompletionToolCall
description: Tool call specification for OpenAI-compatible chat completion responses.
Expand Down Expand Up @@ -3901,6 +3916,11 @@ components:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIChoiceDelta
type: object
OpenAIChunkChoice:
Expand Down Expand Up @@ -10417,6 +10437,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down Expand Up @@ -10446,6 +10474,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down
36 changes: 36 additions & 0 deletions docs/static/stainless-llama-stack-spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4151,6 +4151,16 @@ components:
type: array
- type: 'null'
nullable: true
reasoning_content:
anyOf:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIAssistantMessageParam
type: object
OpenAIChatCompletionContentPartImageParam:
Expand Down Expand Up @@ -4218,6 +4228,11 @@ components:
title: OpenAIChatCompletionToolCallFunction
- type: 'null'
title: OpenAIChatCompletionToolCallFunction
extra_content:
anyOf:
- additionalProperties: true
type: object
- type: 'null'
type: object
title: OpenAIChatCompletionToolCall
description: Tool call specification for OpenAI-compatible chat completion responses.
Expand Down Expand Up @@ -4880,6 +4895,11 @@ components:
- type: string
- type: 'null'
nullable: true
reasoning:
anyOf:
- type: string
- type: 'null'
nullable: true
title: OpenAIChoiceDelta
type: object
OpenAIChunkChoice:
Expand Down Expand Up @@ -11747,6 +11767,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down Expand Up @@ -11776,6 +11804,14 @@ components:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
type: array
- type: 'null'
reasoning_content:
anyOf:
- type: string
- type: 'null'
reasoning:
anyOf:
- type: string
- type: 'null'
type: object
title: OpenAIAssistantMessageParam
description: A message containing the model's (assistant) response in an OpenAI-compatible chat completion request.
Expand Down
9 changes: 8 additions & 1 deletion src/llama_stack_api/inference.py
Original file line number Diff line number Diff line change
Expand Up @@ -438,6 +438,7 @@ class OpenAIChatCompletionToolCall(BaseModel):
id: str | None = None
type: Literal["function"] = "function"
function: OpenAIChatCompletionToolCallFunction | None = None
extra_content: dict[str, Any] | None = None


@json_schema_type
Expand All @@ -448,12 +449,16 @@ class OpenAIAssistantMessageParam(BaseModel):
:param content: The content of the model's response
:param name: (Optional) The name of the assistant message participant.
:param tool_calls: List of tool calls. Each tool call is an OpenAIChatCompletionToolCall object.
:param reasoning_content: (Optional) The reasoning content from the model (for vLLM ≤ v0.8.4)
:param reasoning: (Optional) The reasoning content from the model (for vLLM ≥ v0.9.x)
"""

role: Literal["assistant"] = "assistant"
content: OpenAIChatCompletionTextOnlyMessageContent | None = None
name: str | None = None
tool_calls: list[OpenAIChatCompletionToolCall] | None = None
reasoning_content: str | None = None
reasoning: str | None = None


@json_schema_type
Expand Down Expand Up @@ -605,14 +610,16 @@ class OpenAIChoiceDelta(BaseModel):
:param refusal: (Optional) The refusal of the delta
:param role: (Optional) The role of the delta
:param tool_calls: (Optional) The tool calls of the delta
:param reasoning_content: (Optional) The reasoning content from the model (non-standard, for o1/o3 models)
:param reasoning_content: (Optional) The reasoning content from the model (for vLLM ≤ v0.8.4)
:param reasoning: (Optional) The reasoning content from the model (for vLLM ≥ v0.9.x)
"""

content: str | None = None
refusal: str | None = None
role: str | None = None
tool_calls: list[OpenAIChatCompletionToolCall] | None = None
reasoning_content: str | None = None
reasoning: str | None = None


@json_schema_type
Expand Down
Loading