-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(dialogflow-cx): Example implementation of Streaming Detect Intent with continuous microphone input and audio output #13053
base: main
Are you sure you want to change the base?
Conversation
…rophone input and speaker output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi team,
I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, here's a summary of the changes for you and other reviewers to quickly get up to speed:
This pull request implements a real-time bidirectional streaming audio interface with Google Cloud Dialogflow CX. The primary goal is to create an example that captures audio from a user's microphone, streams it to Dialogflow CX for intent detection, and plays back the synthesized audio responses.
Here's a breakdown of the changes:
- Intent: Provide a complete example implementation of Streaming Detect Intent with continuous microphone input and audio output. This addresses issue Streaming Detect Intent with continuous microphone input #13041.
- Changes:
- Added a new Python file:
dialogflow-cx/streaming_detect_intent_infinite.py
. This file contains the core logic for the bidirectional audio streaming with Dialogflow CX. It's a substantial addition (lines 1-727). - Updated
dialogflow-cx/requirements.txt
(lines 1-7) to include necessary dependencies:termcolor==2.5.0
andpyaudio==0.2.14
. Thegoogle-cloud-dialogflow-cx
library was also updated to version 1.38.0.
- Added a new Python file:
- Activity: The author, ffeldhaus, has indicated that tests and lint checks have passed.
The example is well-documented and includes instructions on setting up the environment and running the script. It also handles potential errors and provides informative logging.
I'll be checking for things like error handling, code style, and overall clarity of the example. I'll post my detailed review shortly.
And now, a little haiku to lighten the mood:
Code flows like a stream,
Dialog's voice, a clear response,
Smooth talk, all in sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request introduces a continuous streaming example for Dialogflow CX, addressing issue #13041. The implementation is comprehensive and well-structured, demonstrating bidirectional streaming, audio input/output, and graceful exit handling. The code is generally clean and follows best practices. However, there are a few minor suggestions for improvement.
Update example invocation Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
Add validation of agent name Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
Internal b/389902797 |
Here is the summary of changes. You are about to add 1 region tag.
This comment is generated by snippet-bot.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR also needs tests. Since it's an "infinite" streaming CLI app, there's no need to test it end-to-end, but unit tests around the most important functions would be nice.
It's a little tricky since most functions are methods depending on the internal state of an object, so each test might need to create its own object.
I'm okay with more minimal testing around AudioIO
since that's what streams from the microphone and can get tricky. However the Dialogflow code should be tested, it seems it could simply use a bytes generator with some fixed data, or maybe it could be read from a file?
I added a simple test similar to the existing test in streaming_detect_intent_partial_response_test.py reusing the some hello.wav resource file as input and the same existing Dialogflow Agent. I reused and extended the MockPyAudio from transcribe_streaming_infinite_v2_test.py. Please review again, all comments so far should be addressed. |
Description
Fixes #13041
Checklist
nox -s py-3.9
(see Test Environment Setup)nox -s lint
(see Test Environment Setup)README.rst is not updated as it seems to be auto generated.