
Async Functionality in Cordova #6

Open
esgraham opened this issue Jan 14, 2020 · 2 comments

@esgraham (Collaborator) commented Jan 14, 2020

Description

As an architect, I want to understand how Cordova implements async functionality, so that I can determine whether the plugin should implement the Speech SDK's async functions.

Acceptance Criteria

  • Find documentation that details how Cordova handles async functionality so that we can review it and make a decision.
  • Find documentation that details how the Cognitive Services Speech SDK implements async services so that we can review it and decide whether to use those functions in the plugin.
  • Update README.md with the final decision on whether and how to implement the async functions from the Speech SDK.
@esgraham added the documentation (Improvements or additions to documentation) and design (User Case around a design discussion) labels on Jan 14, 2020
@rozele self-assigned this on Jan 21, 2020
@rozele (Collaborator) commented Jan 21, 2020

Android

Recognize from microphone

Recommendation: recognizeOnceAsync

  • Cordova plugins are invoked on the WebCore thread on Android, not the main UI thread; see the threading section of this Android overview.
  • The Java Speech SDK only has a recognizeOnceAsync method that returns a Future<T>.
  • Calling the get() method on the future will block the current thread.
  • We don't want to block the WebCore thread, so we should create an ExecutorService to get the results from the Future on a background thread. The ExecutorService used in the Cognitive Services samples is the cached thread pool (Executors.newCachedThreadPool()). A sketch of this pattern follows this list.
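
Below is a minimal sketch of that pattern, assuming a hypothetical helper class owned by the Cordova plugin and a Cordova CallbackContext; the class and method names are illustrative, not part of the SDK:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.cordova.CallbackContext;

import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognitionResult;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;

public class SpeechRecognitionHandler {
    // Cached thread pool, as used in the Cognitive Services samples.
    private final ExecutorService executorService = Executors.newCachedThreadPool();

    // Called from CordovaPlugin.execute(), i.e. from the WebCore thread.
    public void recognizeOnce(SpeechConfig speechConfig, CallbackContext callbackContext) {
        SpeechRecognizer recognizer = new SpeechRecognizer(
                speechConfig, AudioConfig.fromDefaultMicrophoneInput());
        Future<SpeechRecognitionResult> future = recognizer.recognizeOnceAsync();

        // Resolve the Future on a background thread so the WebCore thread never blocks.
        executorService.submit(() -> {
            try {
                SpeechRecognitionResult result = future.get(); // blocks only this worker thread
                callbackContext.success(result.getText());
            } catch (Exception e) {
                callbackContext.error(e.getMessage());
            } finally {
                recognizer.close();
            }
        });
    }
}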

Stop recognizing from microphone

Recommendation: stopContinuousRecognitionAsync

  • Similar to the other methods, leverage the ExecutorService set up for speech recognition to "await" the get() call on the Future returned by stopContinuousRecognitionAsync (see the sketch below).
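
As a sketch, assuming the same hypothetical handler class, executorService, and CallbackContext as above, stopping recognition might look like this:

// Reuse the executor created for recognition so the WebCore thread is never blocked.
public void stopRecognition(SpeechRecognizer recognizer, CallbackContext callbackContext) {
    Future<Void> future = recognizer.stopContinuousRecognitionAsync();
    executorService.submit(() -> {
        try {
            future.get(); // "await" completion on the background thread
            callbackContext.success();
        } catch (Exception e) {
            callbackContext.error(e.getMessage());
        }
    });
}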

Play text to speech audio

Recommendation: SpeakTextAsync and SpeakSsmlAsync

  • The Java Speech SDK has 8 methods for speaking text:
    • Return before audio played vs. return after audio completed
    • Async vs. sync
    • Text vs. SSML
  • We should use the async variants that return after the audio has completed and expose one for text and one for SSML. We can revisit later whether to also expose the variants that return before the audio is played.
  • We should leverage the same ExecutorService that we set up for speech recognition for resolving the Future (see the sketch after this list).
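
A sketch of the text case, assuming the same hypothetical handler, speechConfig, and executorService as in the recognition example plus the corresponding SpeechSynthesizer imports (the SSML case would mirror it with SpeakSsmlAsync):

// Synthesize to the default speaker; per the recommendation above, the Future
// completes after the audio has finished playing.
public void speakText(String text, CallbackContext callbackContext) {
    SpeechSynthesizer synthesizer = new SpeechSynthesizer(speechConfig);
    Future<SpeechSynthesisResult> future = synthesizer.SpeakTextAsync(text);

    executorService.submit(() -> {
        try {
            SpeechSynthesisResult result = future.get();
            if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                callbackContext.success();
            } else {
                callbackContext.error(result.getReason().toString());
            }
        } catch (Exception e) {
            callbackContext.error(e.getMessage());
        } finally {
            synthesizer.close();
        }
    });
}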

Stop playing text to speech audio

Recommendation: AudioTrack

  • TODO: We need to investigate whether calling cancel(true) on the Future<T> returned by the SpeakTextAsync method will stop audio that is already playing; a sketch of such a probe follows this list.
  • If it does not, we'll need to set up a custom playback mechanism on Android (similar to what was done for iOS) so that playback can be cancelled.
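
One possible way to run that investigation (purely illustrative; the method and log tag names are made up):

// Start synthesis, give playback a moment to begin, then cancel the Future and
// listen for whether the speaker actually goes silent.
public void probeCancelBehavior(SpeechSynthesizer synthesizer) throws Exception {
    Future<SpeechSynthesisResult> future =
            synthesizer.SpeakTextAsync("A sentence long enough that we can try to interrupt it.");
    Thread.sleep(500);                     // let playback start
    boolean cancelled = future.cancel(true);
    // If audio keeps playing even though cancelled is true, the SDK Future does not
    // stop playback and we need the custom AudioTrack path discussed later in this issue.
    android.util.Log.d("SpeechPlugin", "cancel(true) returned " + cancelled);
}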

iOS

Recognize from microphone

Recommendation: recognizeOnce

  • Cordova plugins run on the main UI thread, so long-running tasks should be invoked on a background thread (see this iOS overview).
  • Using either recognizeOnceAsync or recognizeOnce seems to block the calling thread until speech recognition is complete, so the simplest option is to just use recognizeOnce and run it on a background thread.

Stop recognizing from microphone

Recommendation: stopContinuousRecognition

  • stopContinuousRecognition will block the calling thread until it completes. I believe this is the desired behavior, so we may not need to move this call onto a background thread, but it could lead to a poor UX if the stop/cancel operation takes too long.

Play text to speech audio

Recommendation: speakText and speakSsml

  • iOS does not have an async variant for these methods. Same recommendation as for Android.

Stop playing text to speech audio

Recommendation: AVAudioPlayer.stop

  • There currently is no cancellation option that can be invoked from the Speech SDK.
  • We have experimented with AVAudioPlayer, on which we can call the stop method to cancel any active audio.
  • It may be fine to do this from the main thread.

@rozele (Collaborator) commented Jan 22, 2020

A minor modification to the above: I actually recommend we not use the Speech SDK for text-to-speech playback, because it does not support cancellation. Using the Speech SDK to fetch the audio data and then play it on something like AVAudioPlayer (iOS) or AudioTrack (Android) is also not as efficient as piping the bytes directly from a REST call to Cognitive Services.

I believe we can still use the Speech SDK to stream audio efficiently if we leverage the PushAudioOutputStream and route the bytes written to the output stream directly into the AVAudioPlayer / AudioTrack. The configuration would look something like this (a sketch of the callback follows the snippet):

// Subscription key and region elided.
SpeechConfig speechConfig = SpeechConfig.fromSubscription(...);
// Custom callback that receives the synthesized audio bytes as they are produced.
CustomPushAudioOutputStreamCallback callback = new CustomPushAudioOutputStreamCallback();
PushAudioOutputStream outputStream = PushAudioOutputStream.create(callback);
// Route synthesizer output to the push stream instead of the default speaker.
AudioConfig audioConfig = AudioConfig.fromStreamOutput(outputStream);
SpeechSynthesizer synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
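
For Android, a rough sketch of what the hypothetical CustomPushAudioOutputStreamCallback could look like, assuming the synthesizer is configured for a raw 16 kHz / 16-bit / mono PCM output format (e.g. via speechConfig.setSpeechSynthesisOutputFormat) so the bytes can be written straight into an AudioTrack; the custom class and its stopPlayback method are illustrative:

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

import com.microsoft.cognitiveservices.speech.audio.PushAudioOutputStreamCallback;

// Forwards synthesized PCM bytes from the Speech SDK directly to an AudioTrack.
public class CustomPushAudioOutputStreamCallback extends PushAudioOutputStreamCallback {
    private static final int SAMPLE_RATE = 16000;

    private final AudioTrack audioTrack = new AudioTrack(
            AudioManager.STREAM_MUSIC,
            SAMPLE_RATE,
            AudioFormat.CHANNEL_OUT_MONO,
            AudioFormat.ENCODING_PCM_16BIT,
            AudioTrack.getMinBufferSize(
                    SAMPLE_RATE, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT),
            AudioTrack.MODE_STREAM);

    public CustomPushAudioOutputStreamCallback() {
        audioTrack.play();
    }

    @Override
    public int write(byte[] dataBuffer) {
        // Push each chunk into the audio hardware buffer as it arrives.
        return audioTrack.write(dataBuffer, 0, dataBuffer.length);
    }

    @Override
    public void close() {
        audioTrack.stop();
        audioTrack.release();
    }

    // Exposed so the plugin can implement "stop playing text to speech audio".
    public void stopPlayback() {
        audioTrack.stop();
    }
}

With something like this in place, cancelling playback on Android becomes a call to stopPlayback() (i.e. AudioTrack.stop()) rather than an attempt to cancel the SDK's Future, and the same idea maps to AVAudioPlayer on iOS.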
