
how to use EncoderDecoderModel to do en-de translation? #8944

Open
CharizardAcademy opened this issue Dec 6, 2020 · 19 comments
@CharizardAcademy

I have trained an EncoderDecoderModel from Hugging Face to do an English-German translation task. I tried to overfit a small dataset (100 parallel sentences) and used model.generate() followed by tokenizer.decode() to perform the translation. The output looks like proper German sentences, but it is definitely not the correct translation.

Here is the code for building the model:

from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# both encoder and decoder are randomly initialized BERT architectures
encoder_config = BertConfig()
decoder_config = BertConfig()
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
model = EncoderDecoderModel(config=config)

Here is the code for testing the model:

import torch

model.eval()
input_ids = torch.tensor(tokenizer.encode(input_text)).unsqueeze(0)
# note: generation is started from the decoder's pad token here
output_ids = model.generate(input_ids.to('cuda'), decoder_start_token_id=model.config.decoder.pad_token_id)
output_text = tokenizer.decode(output_ids[0])
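
For reference, the warm-starting blog post linked further down in this thread sets the special-token IDs on the config before calling generate(); a minimal sketch of that setup, assuming a standard bert-base-cased tokenizer (the original post does not say which tokenizer was used):

from transformers import BertTokenizerFast

# assumed tokenizer; swap in whichever tokenizer the model was trained with
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")

# BERT has no dedicated BOS/EOS tokens, so CLS/SEP are reused for generation
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id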

Example input: "iron cement is a ready for use paste which is laid as a fillet by putty knife or finger in the mould edges ( corners ) of the steel ingot mould ."

Ground truth translation: "iron cement ist eine gebrauchs ##AT##-##AT## fertige Paste , die mit einem Spachtel oder den Fingern als Hohlkehle in die Formecken ( Winkel ) der Stahlguss -Kokille aufgetragen wird ."

What the model outputs after training for 100 epochs: "[S] wenn sie den unten stehenden link anklicken, sehen sie ein video uber die erstellung ansprechender illustrationen in quarkxpress" (roughly: "if you click the link below, you will see a video about creating appealing illustrations in QuarkXPress"), which is total nonsense.

Where is the problem?

@LysandreJik
Member

Hello, thanks for opening an issue! We try to keep the GitHub issues for bugs/feature requests.
Could you ask your question on the forum instead?

Thanks!

cc @patrickvonplaten who might have an idea.

@patrickvonplaten
Contributor

This blog post should also help with fine-tuning a warm-started encoder-decoder model: https://huggingface.co/blog/warm-starting-encoder-decoder . But as @LysandreJik said, the forum is the better place to ask.
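
The key point of that post is warm-starting: both halves are initialized from pretrained checkpoints instead of fresh BertConfig objects. A minimal sketch using the public from_encoder_decoder_pretrained API:

from transformers import EncoderDecoderModel

# warm-start encoder and decoder from pretrained BERT weights; the decoder
# is adapted automatically (causal masking, cross-attention layers added)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-cased", "bert-base-cased"
)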

@zmf0507

zmf0507 commented Dec 17, 2020

@patrickvonplaten the blog post mentions a notebook for the machine translation task, but clicking the link just redirects back to the blog. I think there might be a mistake in the notebook link. Can you please share the translation task notebook on the WMT dataset?

@patrickvonplaten
Contributor

Hey @zmf0507 - yeah, I sadly haven't found the time yet to make this notebook.

@zmf0507

zmf0507 commented Dec 25, 2020

@patrickvonplaten please let me know here when you make one. Despite being so popular, Hugging Face doesn't provide any tutorial/notebook for machine translation. I think a lot of people might be looking for similar resources. It will help a lot. Thanks

@patrickvonplaten
Contributor

We now have one for mBART: https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb -> I'll try to make one for EncoderDecoder as well when I find time :-)
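
For readers who just want working translation right away, the mBART usage from the Transformers docs looks roughly like the sketch below; it uses the public facebook/mbart-large-en-ro checkpoint (en-ro, since no official en-de mBART fine-tune is referenced in this thread):

from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-en-ro", src_lang="en_XX", tgt_lang="ro_RO"
)
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")

inputs = tokenizer("UN Chief Says There Is No Military Solution in Syria", return_tensors="pt")
# force generation to start with the target-language code token
generated = model.generate(**inputs, decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))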

@zmf0507

zmf0507 commented Dec 26, 2020

sure. thanks a lot :)

@zmf0507

zmf0507 commented Feb 14, 2021

@patrickvonplaten is there any encoder-decoder notebook for the translation task yet? Thanks

@patrickvonplaten
Contributor

I'm sadly not finding the time to do so at the moment :-/

I'll put this up as a "Good First Issue" now in case someone from the community finds time to make such a notebook.

A notebook for EncoderDecoderModel translation should look very similar to this notebook: https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Leveraging_Pre_trained_Checkpoints_for_Encoder_Decoder_Models.ipynb - one only has to replace the summarization dataset with a translation dataset.
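
The dataset swap is small in code terms. A sketch of what the translation preprocessing might look like, assuming the wmt16 de-en config from the datasets library and a multilingual BERT tokenizer (both are assumptions, not taken from the notebook):

from datasets import load_dataset
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")
train_data = load_dataset("wmt16", "de-en", split="train[:1%]")

def preprocess(batch):
    # each wmt16 row looks like {"translation": {"de": ..., "en": ...}}
    sources = [pair["en"] for pair in batch["translation"]]
    targets = [pair["de"] for pair in batch["translation"]]
    model_inputs = tokenizer(sources, padding="max_length", truncation=True, max_length=128)
    labels = tokenizer(targets, padding="max_length", truncation=True, max_length=128).input_ids
    # replace pad tokens with -100 so they are ignored by the loss
    model_inputs["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in seq] for seq in labels
    ]
    return model_inputs

train_data = train_data.map(preprocess, batched=True, remove_columns=["translation"])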

@zmf0507

zmf0507 commented Feb 19, 2021

@patrickvonplaten thanks for the update.
Can you tell me if there is any work on keyphrase/keyword generation (a seq2seq task) using Hugging Face? I am looking for tutorials and examples where I can try and play around with keyphrase generation. This task is not mentioned on the Hugging Face notebooks page either.
Please let me know

@patrickvonplaten
Contributor

My best advice would be to ask this question on the forum - I sadly don't know of any work related to this.

@parambharat
Contributor

@patrickvonplaten: Here's my attempt; it modifies the condensed version of BERT2BERT.ipynb to use the WMT dataset and the BLEU-4 score for the en-de translation task.
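
For the scoring side, a minimal sketch of corpus-level BLEU with sacrebleu (the library usually reported for WMT); the hypotheses and references below are placeholders:

import sacrebleu

hypotheses = ["das ist ein test"]        # one model output per source sentence
references = [["das ist ein Test"]]      # one reference stream, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)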

@Nid989

Nid989 commented Mar 22, 2022

> We now have one for mBART: https://colab.research.google.com/github/vasudevgupta7/huggingface-tutorials/blob/main/translation_training.ipynb -> I'll try to make one for EncoderDecoder as well when I find time :-)

Inferring the model training details from BERT2BERT for CNN/DailyMail is not sufficient. We experimented with an MT model on the MuST-C data for en-fr, but the predictions were almost random and the model was not able to capture the core meaning of its input sequence.

@Nid989

Nid989 commented Mar 22, 2022

If anyone has a complete notebook based on the Encoder-Decoder model for MT, please share. Thank you.

@xueqianyi

Has anyone performed the translation task correctly using bert2bert? TAT

@patrickvonplaten
Contributor

@xueqianyi - maybe you have more luck on https://discuss.huggingface.co/ ?

@ydshieh
Collaborator

ydshieh commented Aug 23, 2022

Just an extra comment here: bert2bert is not very helpful for MT, since BERT is pretrained only on English data.
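
Following that observation, one option is to warm-start from a multilingual checkpoint instead; a sketch (the checkpoint choice is a suggestion, not something tested in this thread):

from transformers import BertTokenizerFast, EncoderDecoderModel

# multilingual BERT has seen German during pretraining, unlike English-only BERT
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-cased", "bert-base-multilingual-cased"
)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")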

@desaibhargav

desaibhargav commented Aug 25, 2022

Hi there, I'm a Data Science grad student at Luddy. I was looking to contribute to open source in my free time and came across this issue. I put a rough notebook together, linking it here @xueqianyi @CharizardAcademy. I would love to polish it to the standard upheld in the HF community if it's indeed helpful.

Just some comments (I did NOT spend a lot of time on this, so your observations MIGHT differ):

  1. The translation quality depends a lot on model capacity, though even with base BERT the translations are fairly decent and definitely not gibberish. Tweaking the decoding parameters helps too.

  2. I've trained on only 1M examples due to compute constraints, but I believe a few multiples more might work out better. When I trained with 0.1M and 0.5M examples, I saw consistent improvements in the BLEU score with every increase.

  3. The length of the tensors fed into the model (post-tokenization) has an impact on the translation quality too. Specifically, max_length=64 or higher produces a lot of repetition, especially for short sentences, because most examples (95%) in this particular dataset (the 1M subset) are below 32 tokens. Hence I recommend spending some time tweaking the decoding parameters, no_repeat_ngram_size, max_length, and length_penalty in particular; see the sketch after this list.

  4. Also, the model seems to think President Obama and President Bush are the same person, EVERY TIME. xD
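
A sketch of the kind of generate() call point 3 refers to; the parameter values are illustrative, not tuned:

output_ids = model.generate(
    input_ids,
    max_length=32,             # stay close to the dataset's actual length distribution
    num_beams=4,
    no_repeat_ngram_size=3,    # curbs the repetition described above
    length_penalty=1.0,
    early_stopping=True,
)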

@mahita2104

I would like to work on this issue.
