@minh-hoque
Contributor

  • Added a new section to the realtime prompting guide on handling background noise
  • Added an out-of-band realtime model transcription cookbook

…enAI Realtime API, including setup instructions, prompts, and audio streaming functionality. Also, add an accompanying image for visual reference.
…g OpenAI Realtime API. The notebook includes detailed setup instructions, prompts for transcription, and audio streaming functionality, enhancing user experience and accuracy in transcription tasks.
…by specifying that `conversation: "none"` refers to the main conversation session state. This enhances understanding of session state usage in transcription tasks.
…I Realtime API, updating the title for clarity and enhancing the registry with metadata including date, authors, and tags for better organization and discoverability.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

…th a new version for improved visual representation.
…etails to emphasize the advantages of using the Realtime model for transcription, including reduced mismatch, greater steerability, and session context awareness. Adjusted phrasing for clarity and improved user understanding of transcription tasks.
… by modifying execution count handling and enhancing the description of transcription model performance, highlighting specific instances of accuracy in comparison to the traditional model.
…d trade-offs for the Realtime model, including cost profiles and implementation complexity. Update the image for better visual representation and add unique IDs for markdown and code cells to improve organization.
…tion notebook to specify "negligible input prompt" for better understanding of pricing structure.
…on iterating the `REALTIME_MODEL_TRANSCRIPTION_PROMPT` for tailored use cases, emphasizing the removal of Policy Number formatting rules for better applicability.
…on notebook to streamline instructions and focus on clear audio transcription.
@emre-openai
Contributor

  • The term “Out-of-Band” in the title may be unfamiliar to readers outside the realtime transcription space. If it isn’t widely used in the realtime world, consider a clearer alternative such as “Dual-Model Transcription with the OpenAI Realtime API” or “Multi-Model Transcription with the OpenAI Realtime API.” The definition appears early in the guide, but a title that communicates the concept on its own would improve clarity and click-through.
  • Consider renaming 'Key details:' and giving the improvement described below it more emphasis.
  • The bullets under 5, 6, and 8 seem very concise; add more detail if possible.
  • The ending could be stronger. Instead of the current “From the above example, we can notice:”, consider closing with a more impactful Conclusion section that covers the main takeaways and when to apply this pattern.

@anurag-openai anurag-openai left a comment

LGTM. I would add some examples showing where the transcription model fails but the realtime model does better, e.g. foreign names.

@minh-hoque
Contributor Author

minh-hoque commented Nov 20, 2025

@emre-openai I modified the title and added a conclusion, thank you!

I'm not sure which part you mean by the comment about 5, 6, and 8 being very concise.

…ok to reflect focus on transcribing user audio with a separate realtime request, enhancing clarity and purpose description.
…urpose description to clarify the use of the same websocket session for out-of-band audio transcription. Additionally, add a conclusion section outlining the benefits and considerations for implementing out-of-band transcription, improving guidance for users.
@minh-hoque minh-hoque merged commit 22fda54 into main Nov 20, 2025
1 check passed
@minh-hoque minh-hoque deleted the minh/out-of-band-transcription branch November 20, 2025 23:37
4 participants