WMT data download #226

Open
0-KaiKai-0 opened this issue Nov 29, 2022 · 1 comment

Comments

@0-KaiKai-0

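Could you explain what the dataset "size" described in the paper Universal Conditional Masked Language Pre-training for Neural Machine Translation refers to, and could you provide the download source for the data used in the paper?
[image]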

@jingmu123
Contributor

jingmu123 commented Dec 1, 2022

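Hello, this data was downloaded from the official WMT website and then cleaned. "Size" refers to the amount of data used for training, consistent with the mBART paper. Because our Google Drive space is limited, we cannot currently provide the processed WMT data; we may upload it to another cloud drive later. You can also download the data yourself and process it following the instructions in the README. Thanks.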
