fast_beam_search_LG outputs empty for sentence with oov words #1151
If you use LG, the decoder can only recognize words that are present in L. If a word is not in L, you will never find it in the output.
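The point above can be checked directly: before decoding, compare the transcript words against the lexicon behind LG. A minimal sketch, assuming a Kaldi/icefall-style `words.txt` with one `<word> <id>` pair per line (the toy lexicon below is made up):

```python
# Sketch: check which words of a transcript the LG lexicon covers.
# Any word missing from words.txt can never appear in an LG decoding output.

def load_lexicon(words_txt_lines):
    """Parse words.txt-style lines ("<word> <id>") into a set of known words."""
    return {line.split()[0] for line in words_txt_lines if line.strip()}

def find_oov(transcript, lexicon):
    """Return the transcript words that the LG graph cannot emit."""
    return [w for w in transcript.lower().split() if w not in lexicon]

# Toy example: 'montfichet' is missing from the lexicon, so no path
# through LG can produce it.
lexicon = load_lexicon(["the 1", "gate 2", "of 3"])
print(find_oov("the gate of montfichet", lexicon))  # -> ['montfichet']
```

Running a check like this over the test set would flag the six problematic utterances before decoding.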
@Slyne Can you first try decoding with …
Thank you, guys! I was wondering, what is the rule of thumb for setting the parameters? For example, …
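For reference, icefall's `fast_beam_search` decoding exposes three main knobs: `--beam` (the log-likelihood beam), `--max-contexts`, and `--max-states`. The values below are only commonly seen starting points from the recipes, not verified recommendations, and the script path is hypothetical; defaults vary between recipes:

```shell
# Sketch of a fast_beam_search invocation; tune the knobs per dataset.
./pruned_transducer_stateless2/decode.py \
  --decoding-method fast_beam_search \
  --beam 4 \
  --max-contexts 4 \
  --max-states 8
```

Larger values trade decoding speed for a lower chance of pruning away the correct path.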
I have a possibly related issue. I am trying long-form decoding on the TED-LIUM dataset (30 s chunks). I tried decoding with …
Notice that when decoding on the provided segments, the LG method gets a better WER, but when decoding longer chunks, it gets much worse. I found that this increase in WER is mainly due to long deleted segments such as:
Such long deletions do not happen when decoding on the original segments. I have set … Update: it seems the … After correcting the parameters, the LG decoding WER (on 30 s chunks) improved to 8%, but I still see a lot of long deleted segments such as the one above.
@pkufool Do you have any suggestions about this?
@desh2608 I don't have any ideas about this; could you share the model and some bad cases with me? I can help debug this issue. BTW, I do have some small fixes to fast_beam_search, see k2-fsa/k2#1237; maybe you can try them. I am not sure whether this will help with your problem.
Thanks. Let me try it again with the updated k2. If the problem persists, I will share a minimal reproducible example.
I also see some related PRs: k2-fsa/k2#1134 and k2-fsa/k2#1218. Should I pull those as well?
No, just the latest one.
@pkufool Unfortunately, the problem still persists even after pulling your latest changes. In order to help you reproduce the issue, I have created a package containing the following:
Download link: Google Drive. Since I am decoding with artificial 30 s chunks, the "utterances" do not have corresponding reference texts. However, based on the start and end times of the utterances, the problematic utterance spans 446 s to 484 s in the recording, which roughly includes the following segments from the STM file:
I first thought the problem only happens with the last utterance in the batch for some reason, but other batches have utterances in the middle of the batch with long deleted segments. Please let me know if you need anything else that would be useful for debugging the problem.
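For context, the artificial 30-second chunks described above can be produced along these lines (a sketch, not the exact script used here; the sample rate and chunk length are assumptions):

```python
# Sketch: cut a long recording into fixed-length chunks for long-form decoding.
# Assumes audio is a flat sequence of samples at a known sample rate.

def make_chunks(samples, sample_rate=16000, chunk_seconds=30):
    """Split samples into consecutive chunks of chunk_seconds each;
    the final chunk may be shorter."""
    step = sample_rate * chunk_seconds
    return [samples[i:i + step] for i in range(0, len(samples), step)]

# Toy check: a 70-"second" signal at 1 Hz splits into chunks of 30, 30, 10.
chunks = make_chunks(list(range(70)), sample_rate=1, chunk_seconds=30)
print([len(c) for c in chunks])  # -> [30, 30, 10]
```

Since the chunk boundaries ignore the STM segmentation, a chunk can start or end mid-utterance, which is why the chunks have no exact reference texts.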
I reverted the commit from k2-fsa/k2#1237 since it was causing errors during MWER training (see the stack trace below).
OK, that change has not been fully tested yet. Thanks for your log.
It looks like the deletion issue has been encountered before: #420 (comment). Update: I followed Dan's advice from the linked thread to increase the log-likelihood beam, and it solved the deletion issue.
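The fix above can be illustrated with a toy pruning step: beam search drops hypotheses whose log-likelihood falls more than `beam` below the best one, so a beam that is too tight can prune the only hypothesis covering a long stretch of audio, which then surfaces as a long deletion. A minimal sketch (the hypothesis texts, scores, and beam values are made up):

```python
def prune(hyps, beam):
    """Keep hypotheses whose score is within `beam` of the best one."""
    best = max(score for _, score in hyps)
    return [(text, score) for text, score in hyps if score >= best - beam]

# The correct long hypothesis temporarily scores worse than a short one.
hyps = [("short wrong hyp", -10.0), ("long correct hypothesis", -16.0)]

print(len(prune(hyps, beam=4.0)))   # a tight beam prunes the long hypothesis
print(len(prune(hyps, beam=20.0)))  # a larger beam keeps both alive
```

This is why increasing the log-likelihood beam removed the long deletions: the correct paths survived pruning long enough for their scores to recover.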
Hi developers,
I'm not sure if this issue is from my side or not.
I use k2 + fast_beam_search and can get good results. However, after adding LG.pt to the decoding graph, I get results for most audio files in the LibriSpeech test_clean dataset, but for six sentences I get an empty result. So I checked the transcripts of those six empty-hypothesis sentences:
I found that 'montfichet', 'fitzooth', 'ghisizzle', and 'timascheff's' are not in `words.txt`. I'm actually not sure if it's due to the OOV words. Could you please help me figure out under what circumstances it will output an empty result? `ngram_lm_scale` has already been set to `0.0`.