Improve ASR with XXLMConfig: multi-programming-language support and usage in the docs or sample code #1587

Open
fengzhi09 opened this issue Dec 3, 2024 · 0 comments


fengzhi09 commented Dec 3, 2024

Background:

  • ASR (Automatic Speech Recognition) converts audio to text, but its raw output is not reliable in every scenario.
    For example, in the text "D7851次列车制动" ("train D7851 brakes"), the letter 'D' should be pronounced the same as the last character '动' (dòng, 4th tone).
  • I can't simply increase the weight of 'D' whenever the audio is pronounced 'dòng' (4th tone), because the same sound must still be transcribed as '动' at the end of the sentence.
  • I'm considering adding a language model (LM) to the final step of ASR (text generation) to correct the output text automatically.
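
A minimal sketch of the kind of LM-based correction described above, assuming the recognizer can return an n-best list of hypotheses with acoustic log-scores. Everything here is hypothetical and only for illustration: `CharBigramLM`, `rescore`, the toy corpus, and the scores are made up and are not part of this project's API. A real setup would use a proper LM rather than a three-sentence bigram model.

```python
import math
from collections import Counter


class CharBigramLM:
    """Toy character-bigram LM with add-one smoothing (illustration only)."""

    def __init__(self, corpus):
        self.unigrams = Counter()  # how often each symbol appears as a left context
        self.bigrams = Counter()   # how often each (prev, cur) pair appears
        self.vocab = set()
        for sentence in corpus:
            chars = ["<s>"] + list(sentence) + ["</s>"]
            self.vocab.update(chars)
            for prev, cur in zip(chars, chars[1:]):
                self.unigrams[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def log_prob(self, text):
        """Smoothed log-probability of `text` under the bigram model."""
        chars = ["<s>"] + list(text) + ["</s>"]
        v = len(self.vocab) + 1  # +1 reserves mass for unseen symbols
        total = 0.0
        for prev, cur in zip(chars, chars[1:]):
            total += math.log((self.bigrams[(prev, cur)] + 1) / (self.unigrams[prev] + v))
        return total


def rescore(nbest, lm, lm_scale=0.5):
    """Pick the hypothesis maximizing acoustic score + lm_scale * LM score.

    nbest: list of (text, acoustic_log_score) pairs (hypothetical format).
    """
    return max(nbest, key=lambda h: h[1] + lm_scale * lm.log_prob(h[0]))[0]


# Made-up domain text and made-up recognizer output.
lm = CharBigramLM(["D301次列车进站", "D2205次列车制动", "G12次列车制动"])
nbest = [
    ("动7851次列车制动", -1.0),  # best acoustic score, but wrong first character
    ("D7851次列车制D", -1.1),
    ("D7851次列车制动", -1.2),
]

print(rescore(nbest, lm, lm_scale=0.0))  # acoustic score only: 动7851次列车制动
print(rescore(nbest, lm, lm_scale=0.5))  # with the LM: D7851次列车制动
```

The LM has seen train numbers start with 'D' and sentences end with '制动', so it can break the tie that the acoustic score alone cannot. `lm_scale` controls how much the LM is allowed to override the acoustic ranking.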

What I've found in this project that may help me:

My suggestions:

  1. Provide XXLMConfig support in multiple programming languages.
  2. Include usage instructions for XXLMConfig in the documentation, or provide sample code.
  3. Provide a simple, standard LM as a sample, similar to how VAD (Voice Activity Detection) was introduced for non-streaming ASR.

fengzhi09 changed the title from "ASR Improve With LM:provide LMConfig muti-program-lanuage support and usage in the doc or sample codes" to the current title on Dec 3, 2024