Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dialogflow-cx): Example implementation of Streaming Detect Intent with continuous microphone input and audio output #13053

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

ffeldhaus
Copy link

Description

Fixes #13041

Checklist

README.rst is not updated as it seems to be auto generated.

Sorry, something went wrong.

@ffeldhaus ffeldhaus requested review from a team as code owners January 14, 2025 11:21
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi team,

I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, here's a summary of the changes for you and other reviewers to quickly get up to speed:

This pull request implements a real-time bidirectional streaming audio interface with Google Cloud Dialogflow CX. The primary goal is to create an example that captures audio from a user's microphone, streams it to Dialogflow CX for intent detection, and plays back the synthesized audio responses.

Here's a breakdown of the changes:

  • Intent: Provide a complete example implementation of Streaming Detect Intent with continuous microphone input and audio output. This addresses issue Streaming Detect Intent with continuous microphone input #13041.
  • Changes:
    • Added a new Python file: dialogflow-cx/streaming_detect_intent_infinite.py. This file contains the core logic for the bidirectional audio streaming with Dialogflow CX. It's a substantial addition (lines 1-727).
    • Updated dialogflow-cx/requirements.txt (lines 1-7) to include necessary dependencies: termcolor==2.5.0 and pyaudio==0.2.14. The google-cloud-dialogflow-cx library was also updated to version 1.38.0.
  • Activity: The author, ffeldhaus, has indicated that tests and lint checks have passed.

The example is well-documented and includes instructions on setting up the environment and running the script. It also handles potential errors and provides informative logging.

I'll be checking for things like error handling, code style, and overall clarity of the example. I'll post my detailed review shortly.


And now, a little haiku to lighten the mood:

Code flows like a stream,
Dialog's voice, a clear response,
Smooth talk, all in sync.

@product-auto-label product-auto-label bot added the samples Issues that are directly related to samples. label Jan 14, 2025
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request introduces a continuous streaming example for Dialogflow CX, addressing issue #13041. The implementation is comprehensive and well-structured, demonstrating bidirectional streaming, audio input/output, and graceful exit handling. The code is generally clean and follows best practices. However, there are a few minor suggestions for improvement.

ffeldhaus and others added 3 commits January 14, 2025 12:29
Update example invocation

Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
Add validation of agent name

Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
@ffeldhaus ffeldhaus changed the title Example implementation of Streaming Detect Intent with continuous microphone input and audio output feat(dialogflow-cx): Example implementation of Streaming Detect Intent with continuous microphone input and audio output Jan 14, 2025
@glasnt
Copy link
Contributor

glasnt commented Jan 16, 2025

Internal b/389902797

@glasnt glasnt added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jan 16, 2025
Copy link

snippet-bot bot commented Jan 17, 2025

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

Sorry, something went wrong.

Copy link
Contributor

@davidcavazos davidcavazos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR also needs tests. Since it's an "infinite" streaming CLI app, there's no need to test it end-to-end, but unit tests around the most important functions would be nice.

It's a little tricky since most functions are methods depending on the internal state of an object, so each test might need to create its own object.

I'm okay with more minimal testing around AudioIO since that's what streams from the microphone and can get tricky. However the Dialogflow code should be tested, it seems it could simply use a bytes generator with some fixed data, or maybe it could be read from a file?

@glasnt glasnt assigned davidcavazos and unassigned glasnt Jan 28, 2025
@ffeldhaus
Copy link
Author

This PR also needs tests. Since it's an "infinite" streaming CLI app, there's no need to test it end-to-end, but unit tests around the most important functions would be nice.

It's a little tricky since most functions are methods depending on the internal state of an object, so each test might need to create its own object.

I'm okay with more minimal testing around AudioIO since that's what streams from the microphone and can get tricky. However the Dialogflow code should be tested, it seems it could simply use a bytes generator with some fixed data, or maybe it could be read from a file?

I added a simple test similar to the existing test in streaming_detect_intent_partial_response_test.py reusing the some hello.wav resource file as input and the same existing Dialogflow Agent.

I reused and extended the MockPyAudio from transcribe_streaming_infinite_v2_test.py.

Please review again, all comments so far should be addressed.

@glasnt glasnt added kokoro:force-run Add this label to force Kokoro to re-run the tests. and removed do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Feb 16, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 16, 2025
@glasnt glasnt added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 25, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
samples Issues that are directly related to samples.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Streaming Detect Intent with continuous microphone input
4 participants