Task547_spl_translation_entk_en and task560_alt_translation_en_entk #589

yeganehkordi · 2021-11-10T17:33:51Z

The instruction is incomplete, and it seems the task is to remove the space before punctuation marks. @PhaniRohithaKaza Can you explain the tokens?

PhaniRohithaKaza · 2021-11-10T21:54:29Z

@yeganehkordi The task is framed as sentence translation: it converts a given English sentence into its tokens. But since the tokens are also in English, the input and output don't differ. I'm confused by this task too. @swarooprm, can you please look into it and help us?

yeganehkordi · 2021-11-10T22:56:11Z

Yeah, except for punctuation, they seem to be the same.

Palipoor · 2022-02-08T18:50:56Z

Yeah, "English tokens" is one of the translated versions in their data. I think we can drop these two tasks. @danyaljj
Also, some of the other tasks from this dataset have two "Domains". I can fix these in a PR.

danyaljj · 2022-02-09T01:56:11Z

Sounds good, thank you!

danyaljj · 2022-02-12T22:15:39Z

Moving some of the comments from #709 to here:

@yeganehkordi's comment:

I'm in favor of not dropping en_entk and entk_en tasks. We already have simpler tasks than these tasks. I think we can change their definition and keep them.

@Palipoor's comment:

I think those tasks being simple (just adding or removing the space before punctuation marks) is one thing, but the other is that models are probably going to process these at the token level. So it seems both the input and the output will end up being encoded the same way, which makes the task pointless.

swarooprm · 2022-02-12T22:32:50Z

In my opinion, we could keep en_entk and entk_en tasks and mention in the definition that the task is to remove/add space before punctuation. This is a simple task, but not an invalid task.

Let's not worry about how models are going to process it. E.g. even if the encoder ignores the space, the decoder has to decode it back with space.

My opinion is not very strong, and I am also fine if we decide to delete it. However, we should note that creating a task requires significant effort, while deleting is easy. So we should avoid deleting as much as possible and think about repairing instead.
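To make the proposed "remove/add space before punctuation" definition concrete, here is a minimal sketch of both directions. The function names and the punctuation set are assumptions for illustration only; the actual tokenization used in the dataset may differ (for example around quotes or apostrophes).

```python
import re

# Assumed punctuation set; not taken from the dataset itself.
PUNCT = r"[.,!?;:]"

def en_to_entk(text: str) -> str:
    """en -> entk: insert a space before each punctuation mark."""
    # Only insert when the mark directly follows a non-space character.
    return re.sub(rf"(?<=\S)({PUNCT})", r" \1", text)

def entk_to_en(text: str) -> str:
    """entk -> en: remove whitespace before each punctuation mark."""
    return re.sub(rf"\s+({PUNCT})", r"\1", text)

print(en_to_entk("Hello, world."))    # Hello , world .
print(entk_to_en("Hello , world ."))  # Hello, world.
```

Under these assumptions the two functions are inverses on simple sentences, which matches the observation that the input and output differ only around punctuation.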

yeganehkordi · 2022-02-13T08:06:56Z

I think in the worst-case scenario, we can shuffle the tokens in the input and change these tasks to "order generation" tasks.
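A rough sketch of what that conversion could look like, assuming whitespace-separated tokens. The helper name and the `input`/`output` keys are hypothetical and only illustrate the idea; they are not the repository's actual conversion script.

```python
import random

def make_order_generation_example(tokenized_sentence: str, seed: int = 0) -> dict:
    """Shuffle the tokens of a sentence; the target is the original order."""
    tokens = tokenized_sentence.split()
    shuffled = tokens[:]
    # A seeded generator keeps the shuffled instances reproducible.
    random.Random(seed).shuffle(shuffled)
    return {"input": " ".join(shuffled), "output": " ".join(tokens)}

example = make_order_generation_example("The cat sat on the mat .")
print(example["input"])   # tokens in shuffled order
print(example["output"])  # The cat sat on the mat .
```

Note that a shuffle can occasionally return the original order, especially for short sentences, so such instances might need to be filtered or reshuffled.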

yeganehkordi added the urgent label Nov 10, 2021

yeganehkordi changed the title ~~Task547_spl_translation_entk_en~~ Task547_spl_translation_entk_en and task560_alt_translation_en_entk Nov 10, 2021

Palipoor mentioned this issue Feb 9, 2022

Remove duplicate and incorrect domains for alt tasks and drop two tasks #709

Merged
