I'm using the provided scorer generator, `generate_scorer_package`. I'm also using a sub-word tokenizer (e.g., SentencePiece) to build a unigram language model, where the decoder predicts the size of the language model. How can I adapt the scorer so that it supports sub-word units? Will the scorer work if I fill the alphabet file with the sub-word units? Or should I rely on a trick such as encoding the unigram language model's units with an ASCII table, re-encoding the corpus, and using an alphabet based on that encoding mapping? Thank you.
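To make the last option concrete, here is a minimal sketch of the re-encoding idea: each sub-word unit is mapped to one surrogate character, the tokenized corpus is rewritten with those characters, and the alphabet is the list of surrogates. Everything in it is an assumption for illustration: the `PIECES` list is a made-up vocabulary (in practice it would come from the sub-word model), the helper names are hypothetical, and it uses the Unicode Private Use Area rather than ASCII, since ASCII has only 95 printable characters. Whether `generate_scorer_package` accepts such an alphabet is exactly what I'm unsure about.

```python
# Hypothetical sub-word vocabulary; a real one would be read from the tokenizer model.
PIECES = ["▁he", "llo", "▁wor", "ld"]

# Unicode Private Use Area: U+E000..U+F8FF (6400 code points).
PUA_START, PUA_END = 0xE000, 0xF8FF


def build_mapping(pieces, base=PUA_START):
    """Assign each sub-word piece a unique single-character surrogate."""
    if base + len(pieces) - 1 > PUA_END:
        raise ValueError("vocabulary does not fit in the Private Use Area")
    return {piece: chr(base + i) for i, piece in enumerate(pieces)}


def encode_line(tokenized_words, mapping):
    """Re-encode one corpus line.

    tokenized_words is a list of words, each word a list of pieces.
    Each piece becomes one character; words stay space-separated.
    """
    return " ".join("".join(mapping[p] for p in word) for word in tokenized_words)


def decode_line(encoded, mapping):
    """Invert encode_line: surrogate characters back to pieces."""
    inverse = {char: piece for piece, char in mapping.items()}
    return [[inverse[c] for c in word] for word in encoded.split(" ")]


def alphabet_lines(mapping):
    """Surrogate characters, one per entry (assumed one symbol per alphabet line)."""
    return list(mapping.values())


mapping = build_mapping(PIECES)
line = encode_line([["▁he", "llo"], ["▁wor", "ld"]], mapping)
```

With this, the language model would be trained on the re-encoded corpus, and decoder output would be passed through `decode_line` to recover the original sub-word units.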