feat:Adopt GA session schemas; remove event schemas; update OpenAPI refs #223
Walkthrough
Replaced prerelease session schemas with GA variants in the OpenAPI spec, removed two public event schemas, adjusted anyOf references to GA types, removed one client event type, tweaked descriptions (including idle turn-detection notes), and dropped a const metadata flag.
Sequence Diagram(s)
sequenceDiagram
    autonumber
    actor Client
    participant API as Realtime API
    rect rgb(235, 245, 255)
        note over Client,API: GA session creation flow
        Client->>API: POST /sessions (create)
        alt Standard session
            API-->>Client: 201 RealtimeSessionCreateResponseGA
        else Transcription session
            API-->>Client: 201 RealtimeTranscriptionSessionCreateResponseGA
        end
    end
    rect rgb(245, 235, 255)
        note over Client,API: Session lifecycle events
        Client->>API: WebSocket subscribe
        API-->>Client: ...updates (e.g., TranscriptionSessionUpdated)
        note right of API: "Created" event removed from public stream
        opt Idle turn detection
            API-->>Client: timeout_triggered (on idle threshold)
        end
    end
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks (2 passed, 1 warning)
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/libs/tryAGI.OpenAI/openapi.yaml (1)
24582-24590: GA session schema needs a concrete `type` (and likely `object`) to satisfy discriminator and const semantics.
`required: [client_secret, type]` is declared, but `type` isn’t defined in `properties` (as shown). Add a constant `type: "realtime"`; consider also const-encoding `object: "realtime.session"`.
Apply:
 RealtimeSessionCreateResponseGA:
   required:
     - client_secret
     - type
   type: object
   properties:
+    type:
+      type: string
+      enum: [realtime]
+      description: The type of session. Always `realtime` for realtime sessions.
+      x-stainless-const: true
+    object:
+      type: string
+      enum: [realtime.session]
+      description: The object type. Always `realtime.session`.
+      x-stainless-const: true
     audio:
Also verify `client_secret` is actually returned by this endpoint; if not, drop it from `required`.
🧹 Nitpick comments (1)
src/libs/tryAGI.OpenAI/openapi.yaml (1)
24450-24582: Non-GA and GA session objects coexist; consider deprecating or removing the old one.
If nothing references `RealtimeSessionCreateResponse` anymore, mark it deprecated or remove it to avoid confusion and accidental use.
I can search the repo to confirm references before you remove it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (79)
Each file below is excluded by `!**/generated/**`:
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI..JsonSerializerContext.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeClientEvent.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagerness.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagernessNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionTypeNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAIncludeItem.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAIncludeItemNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAMaxOutputTokens.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAMaxOutputTokensNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAModel.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAModelNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAOutputModalitie.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAOutputModalitieNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGATracingEnum.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGATracingEnumNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGATypeNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateResponseGAIncludeItem.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateResponseGAIncludeItemNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateResponseGAType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateResponseGATypeNullable.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.Session2.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonSerializerContextTypes.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeClientEvent.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeClientEventSessionUpdate.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeServerEventTranscriptionSessionCreated.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateRequestGAAudioInputTurnDetection.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponse.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseAudio.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseAudioInput.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseAudioInputNoiseReduction.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseAudioInputTurnDetection.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseAudioOutput.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGA.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGA.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudio.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudio.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInput.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInput.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputNoiseReduction.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputNoiseReduction.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetection.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetection.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagerness.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetectionType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioOutput.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioOutput.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAClientSecret.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAClientSecret.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAIncludeItem.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAMaxOutputTokens.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAModel.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAOutputModalitie.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGATracingEnum.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGATracingEnum2.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGATracingEnum2.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGATracingEnumMetadata.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGATracingEnumMetadata.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseOutputModalities.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseOutputModalities.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseTracingEnum2.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseTracingEnumMetadata.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseTurnDetection.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseTurnDetection.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGA.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGA.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudio.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudio.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInput.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInput.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInputNoiseReduction.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInputNoiseReduction.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInputTurnDetection.Json.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAAudioInputTurnDetection.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAIncludeItem.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateResponseGAType.g.cs
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.Session2.g.cs
📒 Files selected for processing (1)
src/libs/tryAGI.OpenAI/openapi.yaml (6 hunks)
🔇 Additional comments (3)
src/libs/tryAGI.OpenAI/openapi.yaml (3)
24319-24319: New event mention: verify `timeout_triggered` is defined.
You now document an emitted `timeout_triggered` event. Ensure a corresponding event schema/name exists elsewhere, or rename here to match the actual event.
Would you like me to search and list where this event is (or isn’t) defined across the spec?
24622-24622: Repeat of new event mention: verify `timeout_triggered` exists.
Same concern as earlier idle-timeout description: ensure the event is defined and consistently named.
24955-24957: Validate `RealtimeAudioFormats` supports string literal `pcm16` used in example.
The example uses `"format": "pcm16"`, while other examples show an object form. Please confirm `#/components/schemas/RealtimeAudioFormats` permits the string variant and includes `pcm16`.
If not, align the example or widen the schema (e.g., `oneOf` string enum | object form).
Also applies to: 25006-25006
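The widening proposed in the audio-format comment (string enum or object form) can be illustrated with a small check. This is only a sketch under stated assumptions: the function name `accepts_audio_format` is hypothetical, the only string literal assumed to be allowed is `pcm16`, and the object form is assumed to carry a string `type` field. None of this is taken from the actual `RealtimeAudioFormats` definition.

```python
def accepts_audio_format(value, string_formats=("pcm16",)):
    """Accept either a known string literal or an object with a string `type`.

    Hypothetical helper: `string_formats` is an assumed allow-list, not the
    enum from the real spec.
    """
    if isinstance(value, str):
        # String variant: must be one of the allowed literals.
        return value in string_formats
    if isinstance(value, dict):
        # Object variant: require a string `type` discriminating the format.
        return isinstance(value.get("type"), str)
    return False
```

If the real schema accepts only one of the two shapes, the simpler fix is to align the example with that shape instead.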
description: 'The event type, must be `session.update`.'
x-stainless-const: true
description: "Send this event to update the session’s configuration.\nThe client may send this event at any time to update any field\nexcept for `voice` and `model`. `voice` can be updated only if there have been no other\naudio outputs yet. \n\nWhen the server receives a `session.update`, it will respond\nwith a `session.updated` event showing the full, effective configuration.\nOnly the fields that are present in the `session.update` are updated. To clear a field like\n`instructions`, pass an empty string. To clear a field like `tools`, pass an empty array.\nTo clear a field like `turn_detection`, pass `null`.\n"
description: "Send this event to update the session’s configuration.\nThe client may send this event at any time to update any field\nexcept for `voice` and `model`. `voice` can be updated only if there have been no other audio outputs yet.\n\nWhen the server receives a `session.update`, it will respond\nwith a `session.updated` event showing the full, effective configuration.\nOnly the fields that are present in the `session.update` are updated. To clear a field like\n`instructions`, pass an empty string. To clear a field like `tools`, pass an empty array.\nTo clear a field like `turn_detection`, pass `null`.\n"
Doc says you can clear `turn_detection` with null, but schema doesn’t allow null.
The description instructs clients to send `turn_detection: null`, yet the schema blocks nulls (no `nullable: true`). This will break clients/codegen.
Follow-up: See suggested fixes on Lines 24561-24576 and 24966-24981 to mark these fields nullable.
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around line 21728, the documentation
tells clients to send turn_detection: null but the schema does not permit nulls;
update the OpenAPI schema by adding nullable: true (or using type: ['object',
'null'] where appropriate) to the turn_detection property definitions referenced
in the file and to the other occurrences indicated around lines 24561-24576 and
24966-24981 so the field accepts null; ensure any component/schema definitions
and request/response examples remain consistent with the nullable change.
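A mismatch of this kind (description mentions `null`, schema does not permit it) can be caught mechanically. The following is a minimal sketch, assuming the schema has already been loaded into plain Python dicts; the function name `find_null_doc_mismatches` and the trimmed `audio_input` fragment are hypothetical and only mirror the shape quoted in the review.

```python
import re


def find_null_doc_mismatches(schema, path=""):
    """Return paths of properties whose description mentions `null` but whose
    schema does not allow null (OpenAPI 3.0 `nullable: true`, or a 3.1-style
    type list containing "null")."""
    mismatches = []
    for name, prop in schema.get("properties", {}).items():
        prop_path = f"{path}.{name}" if path else name
        description = prop.get("description", "")
        type_value = prop.get("type")
        allows_null = prop.get("nullable") is True or (
            isinstance(type_value, list) and "null" in type_value
        )
        if re.search(r"`null`", description) and not allows_null:
            mismatches.append(prop_path)
        # Recurse into nested object schemas.
        mismatches.extend(find_null_doc_mismatches(prop, prop_path))
    return mismatches


# Hypothetical, trimmed-down fragment shaped like the schema under review.
audio_input = {
    "type": "object",
    "properties": {
        "turn_detection": {
            "type": "object",
            "description": "Configuration for turn detection. Can be set to `null` to turn off.",
            "properties": {"threshold": {"type": "number"}},
        }
    },
}
```

Running the check on `audio_input` reports `turn_detection`; once `nullable: true` is added to that property, the list comes back empty.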
- $ref: '#/components/schemas/RealtimeSessionCreateResponseGA'
- $ref: '#/components/schemas/RealtimeTranscriptionSessionCreateResponseGA'
description: "The session configuration for either a realtime or transcription session.\n"
discriminator:
  propertyName: type
Discriminator likely broken: GA “session” schema lacks a concrete `type` value + no mapping.
`discriminator.propertyName: type` is set, but `RealtimeSessionCreateResponseGA` appears to not define a `type` property/value, while the transcription GA variant does. Add a constant `type: "realtime"` and explicit discriminator mapping.
Apply:
 discriminator:
   propertyName: type
+  mapping:
+    realtime: '#/components/schemas/RealtimeSessionCreateResponseGA'
+    transcription: '#/components/schemas/RealtimeTranscriptionSessionCreateResponseGA'
And add `type` in the GA session schema (see Lines 24582-24590 comment).
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 22211-22215 (and update the
GA session schema at lines ~24582-24590), the discriminator is misconfigured:
discriminator.propertyName is set to "type" but RealtimeSessionCreateResponseGA
does not declare a constant type value and there's no mapping entry. Fix by
adding a required, constant property type: "realtime" to the
RealtimeSessionCreateResponseGA schema, and add an explicit
discriminator.mapping entry mapping the discriminator value "realtime" to the
RealtimeSessionCreateResponseGA component (e.g.,
"#/components/schemas/RealtimeSessionCreateResponseGA") so the OAS discriminator
can resolve the subtype correctly.
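To see why the missing `type` value matters, here is a rough sketch of how a consumer might resolve a subtype from an explicit discriminator mapping. The helper `resolve_discriminator` is hypothetical and deliberately ignores the implicit schema-name matching that OpenAPI also permits; the mapping values are the ones proposed in the suggestion above.

```python
def resolve_discriminator(payload, discriminator):
    """Return the schema $ref selected by an explicit discriminator mapping,
    or None when the payload lacks the discriminating property or its value
    has no mapping entry."""
    value = payload.get(discriminator["propertyName"])
    if value is None:
        return None
    return discriminator.get("mapping", {}).get(value)


# Mapping as proposed in the review suggestion.
discriminator = {
    "propertyName": "type",
    "mapping": {
        "realtime": "#/components/schemas/RealtimeSessionCreateResponseGA",
        "transcription": "#/components/schemas/RealtimeTranscriptionSessionCreateResponseGA",
    },
}
```

A payload without `type`, which is what a schema lacking that property would describe, resolves to `None`, so the consumer cannot pick between the two GA variants.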
turn_detection:
  type: object
  properties:
    prefix_padding_ms:
      type: integer
      description: "Amount of audio to include before the VAD detected speech (in\nmilliseconds). Defaults to 300ms.\n"
    silence_duration_ms:
      type: integer
      description: "Duration of silence to detect speech stop (in milliseconds). Defaults\nto 500ms. With shorter values the model will respond more quickly,\nbut may jump in on short pauses from the user.\n"
    threshold:
      type: number
      description: "Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A\nhigher threshold will require louder audio to activate the model, and\nthus might perform better in noisy environments.\n"
    type:
      type: string
      description: "Type of turn detection, only `server_vad` is currently supported.\n"
  description: "Configuration for turn detection. Can be set to `null` to turn off. Server\nVAD means that the model will detect the start and end of speech based on\naudio volume and respond at the end of user speech.\n"
Schema-Doc mismatch: allow null for `turn_detection` and add bounds for `threshold`.
- Description: “Can be set to `null` to turn off.” But `nullable` isn’t set.
- `threshold` is described as 0.0–1.0; encode that as min/max for better validation.
Apply:
 turn_detection:
   type: object
+  nullable: true
   properties:
     prefix_padding_ms:
       type: integer
     silence_duration_ms:
       type: integer
     threshold:
       type: number
+      minimum: 0
+      maximum: 1
     type:
       type: string
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 24561 to 24576, the schema
for turn_detection incorrectly omits nullable and the threshold lacks numeric
bounds; update the turn_detection property to include nullable: true (or
explicitly allow ["object","null"] if using oneOf/anyOf style) so it can be set
to null as described, and add minimum: 0.0 and maximum: 1.0 to the threshold
property (ensuring type: number/float) to enforce the documented 0.0–1.0 range.
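The effect of the two proposed constraints can be sketched as a hand-written validator. This is an illustration only, assuming the suggested `nullable: true` and the inclusive 0 to 1 bounds are adopted; `validate_turn_detection` is a hypothetical name, and a real project would more likely rely on a JSON Schema validator or on the generated client.

```python
def validate_turn_detection(value):
    """Return a list of problems for a `turn_detection` value under the
    suggested schema (nullable object, threshold bounded to [0, 1])."""
    if value is None:
        return []  # permitted once the schema is marked nullable
    if not isinstance(value, dict):
        return ["turn_detection must be an object or null"]
    problems = []
    threshold = value.get("threshold")
    if threshold is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
            problems.append("threshold must be a number")
        elif not 0 <= threshold <= 1:
            problems.append("threshold must be between 0 and 1")
    for key in ("prefix_padding_ms", "silence_duration_ms"):
        item = value.get(key)
        if item is not None and (isinstance(item, bool) or not isinstance(item, int)):
            problems.append(f"{key} must be an integer")
    return problems
```

With these rules, `None` and `{"type": "server_vad", "threshold": 0.5}` both pass, while a threshold of `1.5` is rejected.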
description: "Configuration for turn detection. Can be set to `null` to turn off. Server\nVAD means that the model will detect the start and end of speech based on\naudio volume and respond at the end of user speech.\n"
description: "A Realtime session configuration object.\n"
x-oaiMeta:
  example: "{\n \"id\": \"sess_001\",\n \"object\": \"realtime.session\",\n \"expires_at\": 1742188264,\n \"model\": \"gpt-realtime\",\n \"output_modalities\": [\"audio\"],\n \"instructions\": \"You are a friendly assistant.\",\n \"tools\": [],\n \"tool_choice\": \"none\",\n \"max_output_tokens\": \"inf\",\n \"tracing\": \"auto\",\n \"truncation\": \"auto\",\n \"prompt\": null,\n \"audio\": {\n \"input\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"transcription\": { \"model\": \"whisper-1\" },\n \"noise_reduction\": null,\n \"turn_detection\": null\n },\n \"output\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"voice\": \"alloy\",\n \"speed\": 1.0\n }\n }\n}\n"
Example contains fields not in schema (`truncation`, `prompt`).
The example JSON shows `truncation` and `prompt` which aren’t defined in this schema block. This will confuse users and can fail example validation.
Apply:
- "truncation": "auto",
- "prompt": null,
(Or add these fields to the schema if they are intended.)
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around line 24579, the example JSON
includes fields "truncation" and "prompt" that are not defined in the
corresponding schema; remove those fields from the example to match the schema,
or if they are intended properties, add "truncation" and/or "prompt" definitions
to the schema (with appropriate types and descriptions) and update any required
lists so the example validates against the schema.
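Drift between an embedded example and its schema can be detected by comparing top-level keys. The sketch below assumes the example is a JSON string, as in `x-oaiMeta.example`; the function name `undeclared_example_keys` is hypothetical, and the `declared_schema` property list is an illustrative stand-in, not the real property set of the schema under review.

```python
import json


def undeclared_example_keys(example_json, schema):
    """Return the sorted top-level keys of a JSON example that the schema
    does not declare under `properties`."""
    example = json.loads(example_json)
    declared = set(schema.get("properties", {}))
    return sorted(key for key in example if key not in declared)


# Hypothetical stand-in for the schema's declared properties.
declared_schema = {
    "type": "object",
    "properties": {
        "id": {}, "object": {}, "expires_at": {}, "model": {},
        "tracing": {}, "audio": {},
    },
}

# Trimmed version of the example quoted in the review.
example_json = (
    '{"id": "sess_001", "object": "realtime.session", '
    '"tracing": "auto", "truncation": "auto", "prompt": null}'
)
```

Under these assumptions the check flags `prompt` and `truncation`, the same two fields the review comment calls out.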
  type: object
  properties:
    prefix_padding_ms:
      type: integer
      description: "Amount of audio to include before the VAD detected speech (in\nmilliseconds). Defaults to 300ms.\n"
    silence_duration_ms:
      type: integer
      description: "Duration of silence to detect speech stop (in milliseconds). Defaults\nto 500ms. With shorter values the model will respond more quickly,\nbut may jump in on short pauses from the user.\n"
    threshold:
      type: number
      description: "Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A\nhigher threshold will require louder audio to activate the model, and\nthus might perform better in noisy environments.\n"
    type:
      type: string
      description: "Type of turn detection, only `server_vad` is currently supported.\n"
  description: "Configuration for turn detection. Can be set to `null` to turn off. Server\nVAD means that the model will detect the start and end of speech based on\naudio volume and respond at the end of user speech.\n"
description: "Configuration for input audio for the session.\n"
🛠️ Refactor suggestion
Transcription GA: allow null for `turn_detection` and add bounds for `threshold`.
Mirror the realtime session changes: set `nullable: true` and encode 0–1 range.
Apply:
 turn_detection:
   type: object
+  nullable: true
   properties:
     prefix_padding_ms:
       type: integer
     silence_duration_ms:
       type: integer
     threshold:
       type: number
+      minimum: 0
+      maximum: 1
     type:
       type: string
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 24966 to 24981, the
transcription GA schema needs to allow null for the turn_detection field and
enforce a 0–1 range for the threshold; change the turn_detection property to
include nullable: true and add minimum: 0 and maximum: 1 (or inclusive bounds
appropriate to your schema flavor) under the threshold property so the OpenAPI
spec mirrors the realtime session behavior.
Summary by CodeRabbit