Releases: k2-fsa/sherpa-onnx
Release v1.9.10
What's Changed
- Fix CI tests for Python and JNI. by @csukuangfj in #554
- Add a new Persian tts model by @csukuangfj in #555
- Add TTS demo for C# API by @csukuangfj in #557
Full Changelog: v1.9.9...v1.9.10
Release v1.9.9
What's Changed
- Fix kws ci by @pkufool in #540
- Fix cmake variables to point to the project root directory. by @csukuangfj in #545
- add blank_penalty for offline transducer by @chiiyeh in #542
- add hotwords docstring to offline_recognizer and online_recognizer by @chiiyeh in #546
- add blank_penalty for online transducer by @chiiyeh in #548
- Fixes issue #535, fix hexa 1-char tokens in ASR output. by @vesis84 in #550
- Ensure input for speaker ID is a valid number. by @csukuangfj in #552
- Run TTS engine service without starting the app. by @csukuangfj in #553
Full Changelog: v1.9.8...v1.9.9
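The blank_penalty entries above (#542, #548) concern transducer decoding. As a rough illustration only, the sketch below assumes the penalty is subtracted from the blank token's logit before the greedy argmax, and that the blank token has ID 0. The function name `pick_token` is hypothetical and is not part of the sherpa-onnx API.

```python
def pick_token(logits, blank_penalty=0.0, blank_id=0):
    """Return the argmax token ID after penalizing the blank logit.

    Assumption: the penalty is subtracted from the blank logit, so a
    larger penalty makes non-blank tokens more likely to be emitted.
    """
    adjusted = list(logits)  # copy so the caller's list is not modified
    adjusted[blank_id] -= blank_penalty
    return max(range(len(adjusted)), key=adjusted.__getitem__)


logits = [2.0, 1.5, 0.3]  # index 0 is the (assumed) blank token
print(pick_token(logits))                     # no penalty
print(pick_token(logits, blank_penalty=1.0))  # blank logit reduced by 1.0
```

With no penalty the blank token has the highest score; with a penalty of 1.0 its score drops below that of token 1, which is then selected instead.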
Release v1.9.8
What's Changed
- Add missing field for two-pass APK. by @csukuangfj in #511
- Fix Byte BPE string results for Python. by @csukuangfj in #512
- Fix #510 by @csukuangfj in #513
- Use high_freq -400 in computing fbank features. by @csukuangfj in #515
- Use NDK 22.1 for android build by @csukuangfj in #518
- Add runtime support for wespeaker models by @csukuangfj in #516
- Support exporting models to onnx from 3D-Speaker by @csukuangfj in #522
- Fix publishing nuget packages. by @csukuangfj in #525
- Add C++ runtime for models from 3d-speaker by @csukuangfj in #523
- Export speaker verification models from NeMo to ONNX by @csukuangfj in #526
- Add C++ runtime for speaker verification models from NeMo by @csukuangfj in #527
- Android TTS APKs for Persian by @csukuangfj in #529
- Fix setting speaker ID for Android TTS Engine. by @csukuangfj in #530
- Add a Persian and a Slovenian model from Piper for Android TTS. by @csukuangfj in #531
- Add Python API examples for speaker recognition with VAD and ASR. by @csukuangfj in #532
- Refactor the UI of Android TTS engine by @csukuangfj in #533
- decoder for open vocabulary keyword spotting by @pkufool in #505
- Change model url from modelscope to github by @pkufool in #538
- Add Android demo for speaker recognition by @csukuangfj in #536
Full Changelog: v1.9.7...v1.9.8
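The "high_freq -400" entry above (#515) is easier to read with the Kaldi-style convention in mind, where a non-positive high_freq is taken as an offset from the Nyquist frequency. The sketch below assumes that convention applies here; the helper name `effective_high_freq` is hypothetical.

```python
def effective_high_freq(sample_rate, high_freq):
    """Resolve a Kaldi-style high_freq setting to an absolute cutoff in Hz.

    Assumption: values <= 0 are offsets from the Nyquist frequency
    (half the sample rate); positive values are used as given.
    """
    nyquist = sample_rate / 2
    return high_freq if high_freq > 0 else nyquist + high_freq


# For 16 kHz audio, Nyquist is 8000 Hz, so -400 resolves to 7600 Hz.
print(effective_high_freq(16000, -400))
```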
kws-models
Refactor the UI of Android TTS engine (#533)
Release v1.9.7
What's Changed
- Replace Android system TTS engine by @csukuangfj in #508
- Build text-to-speech engine APKs by @csukuangfj in #509
Full Changelog: v1.9.5...v1.9.7
Release v1.9.5
What's Changed
- Fix building wheels for Linux. by @csukuangfj in #484
- Fix CI by @csukuangfj in #485
- Print informative error messages for sherpa-onnx-alsa on errors. by @csukuangfj in #486
- Keep multiple threads from calling into espeak-ng at the same time by @csukuangfj in #489
- Fix whisper test script for the latest onnxruntime. by @csukuangfj in #494
- Release Python GIL in C++ class constructor by @csukuangfj in #493
- Support streaming zipformer CTC by @csukuangfj in #496
Full Changelog: v1.9.4...v1.9.5
Release v1.9.4
What's Changed
- Give an informative log for whisper on exceptions. by @csukuangfj in #473
- convert wespeaker models to sherpa-onnx by @csukuangfj in #475
- Fix releasing go packages by @csukuangfj in #476
- Support playing as it is generating for Android by @csukuangfj in #477
- Fix android tts audio buffer size and fix CI. by @csukuangfj in #478
- Add two GLaDOS TTS models by @csukuangfj in #481
- Play generated audio using alsa for TTS by @csukuangfj in #482
Full Changelog: v1.9.1...v1.9.4
Release v1.9.1
What's Changed
- Remove the 30-second constraint from whisper. by @csukuangfj in #471
- Support distil-small.en whisper by @csukuangfj in #472
Full Changelog: v1.9.0...v1.9.1
Speaker recognition models
This release contains speaker recognition models for sherpa-onnx.
Each model has its own license. Please see the corresponding repository for the specific license of a given model.
Release v1.9.0
What's Changed
- Build building for iOS by @csukuangfj in #430
- Judge before UseCachedDecoderOut by @HieDean in #431
- Build MFC examples for Windows x86 (Win32) by @csukuangfj in #434
- Replace Clone() with View() by @HieDean in #432
- Refactor CI scripts about building wheels by @csukuangfj in #436
- support nodejs by @csukuangfj in #438
- Add Swift API for TTS by @csukuangfj in #439
- Text-to-speech for iOS by @csukuangfj in #443
- Lock before push_back the deque for thread safety by @HieDean in #445
- Update to onnxruntime 1.16.3 by @csukuangfj in #446
- Fix reading tokens.txt on Windows by @csukuangfj in #448
- Fix nodejs on Windows by @csukuangfj in #450
- Release GIL to support multithreading in Python websocket servers. by @csukuangfj in #451
- Support piper-phonemize by @csukuangfj in #452
- Use piper-phonemize to convert text to token IDs by @csukuangfj in #453
- Fix CI by @csukuangfj in #456
- Play generated audio as it is generating. by @csukuangfj in #457
- Break text into sentences for tts. by @csukuangfj in #460
- Support playing generated audio as it is generating for MFC. by @csukuangfj in #462
- Fix building for .Net by @csukuangfj in #463
- Use espeak-ng for coqui-ai/TTS VITS English models. by @csukuangfj in #466
- Support Ukrainian VITS models from coqui-ai/TTS by @csukuangfj in #469
- Release v1.9.0 by @csukuangfj in #470
Full Changelog: v1.8.10...v1.9.0