Repetition of words in the generation process #12

Open
nathias2004 opened this issue May 1, 2020 · 6 comments

Comments

@nathias2004

Hi...
I have trained and tested this model rigorously, using the Quora dataset and my own dataset. I see a lot of repeated words within the same sentence during generation.
Examples: "I want to order to order a phone", "how to clear cookies caches and caches", etc.

Any help in this regard would be appreciated :)
Thanks

@TobiasLee

It seems that the language model itself is not trained well.
Is the loss on the training set still decreasing?

@nathias2004
Author

Yes, it is still decreasing.

@TobiasLee

Well, I guess you can train the model longer and then check whether the results improve.

@FranxYao
Owner

@TobiasLee Thanks for helping to answer!

Training longer is indeed the quickest thing to try, but the model may still repeat itself even after it has converged properly.

A quick fix is a simple rejection rule applied at decoding time (loosely in the spirit of rejection sampling: https://en.wikipedia.org/wiki/Rejection_sampling):

if: the current token is the same as the token from the previous step,
then: choose the next most probable token instead.

In our case, the rejection can be implemented by masking out the probability of the previous token before taking the argmax.
Modify:

dec_index = tf.argmax(dec_dist, axis=1, output_type=tf.int32)

to something like this (pseudocode, not strictly TensorFlow):
dec_dist[prev_token] = 0  # mask the token chosen at the previous step
current_token = argmax(dec_dist)
prev_token = current_token  # remember it for the next step
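
The masking rule above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's decoder: `masked_argmax`, `greedy_decode`, and `toy_step` are hypothetical names, and `toy_step` stands in for the real model by returning a fixed distribution over a three-token vocabulary.

```python
from typing import Callable, List, Optional, Sequence


def masked_argmax(dec_dist: Sequence[float], prev_token: Optional[int]) -> int:
    """Return the most probable token id, skipping the previous step's token.

    Assumes the vocabulary has at least two tokens.
    """
    best_id, best_p = -1, float("-inf")
    for token_id, p in enumerate(dec_dist):
        if token_id == prev_token:
            continue  # "reject" the token emitted at the previous step
        if p > best_p:
            best_id, best_p = token_id, p
    return best_id


def greedy_decode(step_fn: Callable[[List[int]], Sequence[float]],
                  max_len: int) -> List[int]:
    """Greedy decoding that never emits the same token twice in a row."""
    tokens: List[int] = []
    prev_token: Optional[int] = None
    for _ in range(max_len):
        dec_dist = step_fn(tokens)  # distribution over the vocabulary
        current_token = masked_argmax(dec_dist, prev_token)
        tokens.append(current_token)
        prev_token = current_token
    return tokens


def toy_step(tokens: List[int]) -> Sequence[float]:
    """Hypothetical stand-in for the decoder: always prefers token 2."""
    return [0.1, 0.3, 0.6]


print(greedy_decode(toy_step, 4))  # prints [2, 1, 2, 1]
```

Note that this only blocks immediate repeats. The toy run still alternates between two tokens, and a pattern such as "to order to order" from the report above contains no identical adjacent tokens, so it would presumably need a wider check, for example over the last few emitted tokens.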

Hope this helps.

@FranxYao
Owner

For further discussion of architectures that prevent repetition, and their influence on sentence quality, see: https://www.aclweb.org/anthology/N18-1017/

@nathias2004
Author

@FranxYao Thanks for the references, I will look into them.
