Hello @Kevin-Patyk,
Do the text inputs need to be preprocessed before they go to the tokenizer?
I tested this by passing an input text through the tokenizer and then generating a summary, but the output is not good. Maybe the input text needs preprocessing before summarization.
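import torch
from transformers import AutoTokenizer, BigBirdPegasusForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("google/bigbird-pegasus-large-arxiv")
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-arxiv", attention_type="original_full")
model = model.to(device)
# text_ip is the input document to summarize (a string defined elsewhere)
inputs = tokenizer(text_ip, return_tensors='pt', truncation=True).to(device)
prediction = model.generate(**inputs)  # max_length of the output is 256
prediction = tokenizer.batch_decode(prediction)
print(prediction)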
Here is the output:
['<s> the problem of machine learning is to find a way to learn from data.<n> this paper studies the problem of finding a way to learn a way to learn a way to learn a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to learn a way to learn.<n> we study the problem of finding a way to']
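To put a number on the looping in the output above, one option is to count word n-grams that occur more than once. This is only a rough sketch: `repeated_ngrams` is a hypothetical helper, not part of `transformers`, and it splits on whitespace rather than using the model's tokenizer.

```python
from collections import Counter


def repeated_ngrams(text, n=4):
    """Return a dict mapping each word n-gram that occurs more than once to its count.

    Tokens are whitespace-separated words, which is an assumption made for
    simplicity; the model's own subword tokens would give different n-grams.
    """
    tokens = text.split()
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {gram: count for gram, count in counts.items() if count > 1}


if __name__ == "__main__":
    sample = "we study the problem of finding a way to learn a way to learn a way to learn."
    for gram, count in repeated_ngrams(sample).items():
        print(" ".join(gram), count)
```

A summary with no loops should give an empty dict for a reasonably large `n`, while the output above would give many entries.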
Hello,
BigBird Pegasus, when creating summaries of text, repeats the same sentence over and over. I have reproduced this with text on the Hugging Face model hub, and the same problem is reported on Stack Overflow (https://stackoverflow.com/questions/68911203/big-bird-pegasus-summarization-output-is-repeating-itself). Additionally, below are some images from the Hugging Face hub.
I am doing text summarization for my thesis and I am not sure why this is happening, but apparently it has been an issue for 6 months. Is there a way to prevent this from happening?
Thank you.
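One thing that may be worth trying is passing repetition-related arguments to `generate`. The sketch below assumes a model and tokenizer loaded as in the snippet in this thread. `num_beams`, `no_repeat_ngram_size`, `repetition_penalty`, `max_length` and `early_stopping` are existing `generate` parameters, but the values here are untested guesses for this checkpoint, and `summarize` and `GENERATION_KWARGS` are hypothetical names. It is not confirmed that this fixes the underlying issue.

```python
# Assumed starting values, not tuned for bigbird-pegasus-large-arxiv.
GENERATION_KWARGS = {
    "num_beams": 4,             # beam search instead of greedy decoding
    "no_repeat_ngram_size": 3,  # forbid any 3-gram from appearing twice
    "repetition_penalty": 1.2,  # down-weight tokens that were already generated
    "max_length": 256,
    "early_stopping": True,
}


def summarize(text, model, tokenizer, device="cpu", **overrides):
    """Generate a summary with repetition-discouraging settings.

    `model` and `tokenizer` are expected to behave like the Hugging Face
    objects used earlier in this thread; `overrides` replaces any default
    in GENERATION_KWARGS.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)
    kwargs = {**GENERATION_KWARGS, **overrides}
    output_ids = model.generate(**inputs, **kwargs)
    # skip_special_tokens drops markers such as <s> from the decoded text
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Usage would look like `summarize(text_ip, model, tokenizer, device=device)`, or with an override such as `no_repeat_ngram_size=4`. Blocking n-grams only suppresses the visible loop, so the summary quality should still be checked by hand.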