Minh/out of band transcription #2250
minh-hoque commented on Nov 20, 2025
- Added a new section to the realtime prompting guide on handling background noise
- Added out-of-band realtime model transcription cookbook
…enAI Realtime API, including setup instructions, prompts, and audio streaming functionality. Also, add an accompanying image for visual reference.
…g OpenAI Realtime API. The notebook includes detailed setup instructions, prompts for transcription, and audio streaming functionality, enhancing user experience and accuracy in transcription tasks.
…by specifying that `conversation: "none"` refers to the main conversation session state. This enhances understanding of session state usage in transcription tasks.
…I Realtime API, updating the title for clarity and enhancing the registry with metadata including date, authors, and tags for better organization and discoverability.
💡 Codex Review
Here are some automated review suggestions for this pull request.
…cription resources.
…th a new version for improved visual representation.
…etails to emphasize the advantages of using the Realtime model for transcription, including reduced mismatch, greater steerability, and session context awareness. Adjusted phrasing for clarity and improved user understanding of transcription tasks.
… by modifying execution count handling and enhancing the description of transcription model performance, highlighting specific instances of accuracy in comparison to the traditional model.
…d trade-offs for the Realtime model, including cost profiles and implementation complexity. Update the image for better visual representation and add unique IDs for markdown and code cells to improve organization.
…tion notebook to specify "negligible input prompt" for better understanding of pricing structure.
…on iterating the `REALTIME_MODEL_TRANSCRIPTION_PROMPT` for tailored use cases, emphasizing the removal of Policy Number formatting rules for better applicability.
…on notebook to streamline instructions and focus on clear audio transcription.
anurag-openai
left a comment
LGTM. I would add some examples showing where the transcription model fails but the realtime model does better, e.g. foreign names.
@emre-openai Modified the title and added a conclusion, thank you! Not sure which part you are referring to; 5, 6, and 8 seem very concise.
…ok to reflect focus on transcribing user audio with a separate realtime request, enhancing clarity and purpose description.
…urpose description to clarify the use of the same websocket session for out-of-band audio transcription. Additionally, add a conclusion section outlining the benefits and considerations for implementing out-of-band transcription, improving guidance for users.
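For context, the out-of-band pattern these commits describe can be sketched as a `response.create` event sent over the same websocket session, with `conversation: "none"` so the transcription output is not written to the main conversation state. This is a minimal sketch, not the notebook's code: the prompt text, the helper name, and the `metadata` tag are hypothetical, and the `output_modalities` and `instructions` field names are assumptions about the Realtime API event shape that should be checked against the current API reference.

```python
import json

# Hypothetical placeholder prompt; the notebook's actual
# REALTIME_MODEL_TRANSCRIPTION_PROMPT is not shown in this thread.
TRANSCRIPTION_PROMPT = "Transcribe the user's most recent audio verbatim."


def build_out_of_band_transcription_event(prompt: str) -> dict:
    """Build a response.create event that does not touch the main conversation.

    `conversation: "none"` comes from the PR discussion. The other fields
    are assumptions about the event shape, labeled here for illustration.
    """
    return {
        "type": "response.create",
        "response": {
            # Keep the output out of the main conversation session state.
            "conversation": "none",
            # Assumed field name: request text output only.
            "output_modalities": ["text"],
            # Assumed field name: per-response instructions for transcription.
            "instructions": prompt,
            # Hypothetical tag so the client can recognize this response later.
            "metadata": {"purpose": "transcription"},
        },
    }


event = build_out_of_band_transcription_event(TRANSCRIPTION_PROMPT)
# In a live session this JSON string would be sent over the open websocket.
payload = json.dumps(event, indent=2)
print(payload)
```

Tagging the request with metadata is one way for a client to tell the transcription response apart from the assistant's regular replies when both arrive on the same socket.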