- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Overview
v1.12 of the Speech SDK added support for a new `KeywordRecognizer`. This lets applications perform keyword spotting before authenticating with the speech service and will greatly improve cold start latencies, especially in real-world environments that involve token retrieval from an intermediate source.
The UWP Voice Assistant sample app, most specifically the DirectLineSpeechDialogBackend, should integrate this new KeywordRecognizer functionality to demonstrate its use in an easily reusable way.
As a high-level summary of the work involved:
- Backend initialization should no longer create a DialogServiceConnector immediately. Instead, it should create a KeywordRecognizer.
- When an audio turn starts with confirmation required, the input audio should be plumbed into the KeywordRecognizer (via the same sink the connector currently uses directly).
- Confirmation timeouts should be tied to this new KeywordRecognizer rather than to the connector.
- Upon confirmation (the recognized event), the connector should be initialized just in time, an `AudioDataStream` should be retrieved from the `KeywordRecognitionResult`, and the stream data should be injected into the connector via a semi-persistent adapter object (`AudioDataStream` cannot currently work independently as a stream input source).
- Everything else should then generally work the same way!
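The flow summarized above can be sketched as follows. This is only an illustration in Python using stand-in classes, not Speech SDK code: `ReplayAdapter`, `StubConnector`, `Backend`, and `on_keyword_confirmed` are hypothetical names, and `io.BytesIO` stands in for the audio retrieved from the keyword result. It shows the two ideas that matter: the connector is not created until the keyword is confirmed, and an adapter drains the pull-style keyword audio into the connector's push-style input.

```python
import io
from typing import BinaryIO, Callable, Optional


class ReplayAdapter:
    """Hypothetical adapter: drains a pull-style stream into a push-style sink."""

    def __init__(self, source: BinaryIO, chunk_size: int = 3200) -> None:
        self._source = source
        self._chunk_size = chunk_size

    def pump(self, write: Callable[[bytes], None]) -> int:
        """Copy all remaining audio to `write`; return the number of bytes copied."""
        total = 0
        while True:
            chunk = self._source.read(self._chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            write(chunk)
            total += len(chunk)
        return total


class StubConnector:
    """Stand-in for the dialog connector; it only records the audio it is given."""

    def __init__(self) -> None:
        self.received = bytearray()

    def write(self, data: bytes) -> None:
        self.received.extend(data)


class Backend:
    """Hypothetical backend that defers connector creation until confirmation."""

    def __init__(self, connector_factory: Callable[[], StubConnector]) -> None:
        self._connector_factory = connector_factory
        self.connector: Optional[StubConnector] = None  # not created at startup

    def on_keyword_confirmed(self, keyword_audio: BinaryIO) -> int:
        # Just-in-time initialization: create the connector on first confirmation.
        if self.connector is None:
            self.connector = self._connector_factory()
        # Inject the buffered keyword audio into the connector via the adapter.
        return ReplayAdapter(keyword_audio).pump(self.connector.write)


backend = Backend(StubConnector)
bytes_sent = backend.on_keyword_confirmed(io.BytesIO(b"\x00" * 8000))
```

In the real sample, the confirmation timeout and the recognized event would come from the KeywordRecognizer itself; the sketch leaves those out and only models what happens once confirmation has arrived.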