From 86aff278fe4ea8f2fb8fd999074f9e8c826a2f3b Mon Sep 17 00:00:00 2001
From: Jeff Tang
Date: Mon, 14 Mar 2022 09:52:02 -0700
Subject: [PATCH] update of streaming ASR for PT 1.11 (#240)

* initial commit

* Revert "initial commit"

This reverts commit 5a65775315ad581637837a408f80607fa5e3fed7.

* main readme and helloworld/demo app readme updates

* update of streaming ASR for PT 1.11
---
 README.md                                  |  4 ++++
 StreamingASR/README.md                     | 16 ++++++++--------
 StreamingASR/StreamingASR/app/build.gradle |  2 +-
 StreamingASR/save_model_for_mobile.py      |  2 +-
 4 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 09bdfd18..7c068a04 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,10 @@ The [PyTorch demo app](https://github.com/pytorch/android-demo-app/tree/master/P
 
 [Speech Recognition](https://github.com/pytorch/android-demo-app/tree/master/SpeechRecognition) demonstrates how to convert Facebook AI's wav2vec 2.0, one of the leading models in speech recognition, to TorchScript and how to use the scripted model in an Android app to perform speech recognition.
 
+### Streaming Speech Recognition
+
+[Streaming Speech Recognition](https://github.com/pytorch/android-demo-app/tree/master/StreamingASR) demonstrates how to use a new torchaudio pipeline to perform streaming speech recognition, powered by Java Native Interface (JNI) calls to a C++ audio processing library for the mel spectrogram transform.
+
 ### Video Classification
 
 [TorchVideo](https://github.com/pytorch/android-demo-app/tree/master/TorchVideo) demonstrates how to use a pre-trained video classification model, available at the newly released [PyTorchVideo](https://github.com/facebookresearch/pytorchvideo), on Android to see video classification results, updated per second while the video plays, on tested videos, videos from the Photos library, or even real-time videos.
diff --git a/StreamingASR/README.md b/StreamingASR/README.md
index ec2e9e7f..33ac5d39 100644
--- a/StreamingASR/README.md
+++ b/StreamingASR/README.md
@@ -6,9 +6,9 @@ In the Speech Recognition Android [demo app](https://github.com/pytorch/android-
 
 ## Prerequisites
 
-* PyTorch 1.10.1 and torchaudio 0.10.1 or above (Optional)
+* PyTorch 1.11 and torchaudio 0.11 or above (Optional)
 * Python 3.8 (Optional)
-* Android Pytorch library org.pytorch:pytorch_android_lite:1.10.0
+* Android PyTorch library org.pytorch:pytorch_android_lite:1.11.0
 * Android Studio 4.0.1 or later
 
 ## Quick Start
@@ -22,7 +22,7 @@ git clone https://github.com/pytorch/android-demo-app
 cd android-demo-app/StreamingASR
 ```
 
-If you don't have PyTorch 1.10.1 and torchaudio 0.10.1 installed or want to have a quick try of the demo app, you can download the optimized scripted model file [streaming_asr.ptl](https://drive.google.com/file/d/1awT_1S6H5IXSOOqpFLmpeg0B-kQVWG2y/view?usp=sharing), then drag and drop it to the `StreamingASR/app/src/main/assets` folder inside `android-demo-app/StreamingASR`, and continue to Step 3.
+If you don't have PyTorch 1.11 and torchaudio 0.11 installed, or just want to quickly try the demo app, you can download the optimized scripted model file [streaming_asr.ptl](https://drive.google.com/file/d/1awT_1S6H5IXSOOqpFLmpeg0B-kQVWG2y/view?usp=sharing), then drag and drop it to the `StreamingASR/app/src/main/assets` folder inside `android-demo-app/StreamingASR`, and continue to Step 3.
 
 Also you need to download [Eigen](https://eigen.tuxfamily.org/), a C++ template library for linear algebra, for Android NDK build required to run the app (see last section of this README for more info):
 ```
@@ -32,16 +32,16 @@ git clone https://github.com/jeffxtang/eigen
 
 ### 2. Test and Prepare the Model
 
-To install PyTorch 1.10.1, torchaudio 0.10.1, and other required Python packages (numpy and pyaudio), do something like this:
+To install PyTorch 1.11, torchaudio 0.11, and other required Python packages (numpy and pyaudio), do something like this:
 
 ```
-conda create -n pt1.10.1 python=3.8.5
-conda activate pt1.10.1
+conda create -n pt1.11 python=3.8.5
+conda activate pt1.11
 pip install torch torchaudio numpy pyaudio
 ```
 
 Now download the streaming ASR model file
-[scripted_wrapper_tuple_no_transform.pt](https://drive.google.com/file/d/1_49DwHS_a3p3THGdHZj3TXmjNJj60AhP/view?usp=sharing) (the script used to create the model will be published soon) to the `android-demo-app/StreamingASR` directory.
+[scripted_wrapper_tuple_no_transform.pt](https://drive.google.com/file/d/1_49DwHS_a3p3THGdHZj3TXmjNJj60AhP/view?usp=sharing) to the `android-demo-app/StreamingASR` directory.
 
 To test the model, run `python run_sasr.py`. After you see:
 ```
@@ -59,7 +59,7 @@ mv streaming_asr.ptl StreamingASR/app/src/main/assets
 
 ### 3. Build and run with Android Studio
 
-Start Android Studio, open the project located in `android-demo-app/StreamingASR/StreamingASR`, build and run the app on an Android device. After the app runs, tap the Start button and start saying something. Some example recognition results are:
+Start Android Studio, open the project located in `android-demo-app/StreamingASR/StreamingASR`, build and run the app on an Android device (not an emulator). After the app runs, tap the Start button and start saying something. Some example recognition results are:
 
 ![](screenshot1.png)
 ![](screenshot2.png)
diff --git a/StreamingASR/StreamingASR/app/build.gradle b/StreamingASR/StreamingASR/app/build.gradle
index 5318c3e7..288702a7 100644
--- a/StreamingASR/StreamingASR/app/build.gradle
+++ b/StreamingASR/StreamingASR/app/build.gradle
@@ -50,5 +50,5 @@ dependencies {
     androidTestImplementation 'androidx.test.ext:junit:1.1.3'
     androidTestImplementation 'androidx.test.espresso:espresso-core:3.4.0'
 
-    implementation 'org.pytorch:pytorch_android_lite:1.10.0'
+    implementation 'org.pytorch:pytorch_android_lite:1.11'
 }
\ No newline at end of file
diff --git a/StreamingASR/save_model_for_mobile.py b/StreamingASR/save_model_for_mobile.py
index 7191fb76..f6412417 100644
--- a/StreamingASR/save_model_for_mobile.py
+++ b/StreamingASR/save_model_for_mobile.py
@@ -9,5 +9,5 @@ def get_demo_wrapper():
 wrapper = get_demo_wrapper()
 scripted_model = torch.jit.script(wrapper)
 optimized_model = optimize_for_mobile(scripted_model)
-optimized_model._save_for_lite_interpreter("sasr.ptl")
+optimized_model._save_for_lite_interpreter("streaming_asr.ptl")
 print("Done _save_for_lite_interpreter")
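
Note: the `save_model_for_mobile.py` hunk above only shows the lines following `def get_demo_wrapper():`. For reference, below is a minimal sketch of the complete export script, assuming the helper simply loads the TorchScript wrapper file downloaded in Step 2 with `torch.jit.load` (the helper's actual body lies outside the diff context):

```
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

def get_demo_wrapper():
    # Assumption: the helper just loads the TorchScript wrapper downloaded
    # in Step 2; its real body is not shown in this patch.
    return torch.jit.load("scripted_wrapper_tuple_no_transform.pt")

wrapper = get_demo_wrapper()
scripted_model = torch.jit.script(wrapper)             # script the wrapper for serialization
optimized_model = optimize_for_mobile(scripted_model)  # apply mobile-specific graph optimizations
optimized_model._save_for_lite_interpreter("streaming_asr.ptl")
print("Done _save_for_lite_interpreter")
```

The resulting `streaming_asr.ptl` is in the lite-interpreter format that the `org.pytorch:pytorch_android_lite` runtime loads from the app's assets folder.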
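
Similarly, `run_sasr.py` (used in Step 2 to test the model) is not part of this diff. The sketch below only illustrates the general shape of the pyaudio capture loop such a test script drives; the sample rate, chunk size, and `process_chunk` are hypothetical placeholders, not the repo's actual code:

```
import pyaudio

SAMPLE_RATE = 16000      # assumption: the ASR pipeline consumes 16 kHz mono audio
FRAMES_PER_CHUNK = 1600  # assumption: 0.1 s of audio per read

def process_chunk(chunk_bytes):
    # Hypothetical placeholder: feed one chunk of raw audio to the streaming
    # ASR model and print any newly decoded text.
    pass

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=SAMPLE_RATE,
                    input=True, frames_per_buffer=FRAMES_PER_CHUNK)
try:
    while True:
        process_chunk(stream.read(FRAMES_PER_CHUNK))
except KeyboardInterrupt:
    pass
finally:
    stream.stop_stream()
    stream.close()
    audio.terminate()
```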