Hello,

The paper says: "For the general distillation, we set the maximum sequence length to 128 and use English Wikipedia (2,500M words) as the text corpus and perform the intermediate layer distillation for 3 epochs with the supervision from a pre-trained BERT BASE and keep other hyper-parameters the same as BERT pre-training (Devlin et al., 2019)." Could you provide a download link for this pre-training corpus?
Also, during the pre-training (general distillation) stage, is it acceptable to omit the --do_lower_case flag of general_distill.py? The vocab.txt shipped with the released model is a lowercase vocabulary, so I would like to know whether a trained, case-sensitive TinyBERT model (i.e. a "tinybert-cased") is currently available.
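For context, here is a minimal sketch of the tokenizer behaviour I am asking about. It uses the Hugging Face `transformers` tokenizer purely for illustration, and I am assuming that --do_lower_case in general_distill.py maps onto the same lowercasing option in the repo's own tokenizer; the model names below are hypothetical stand-ins for a cased teacher/vocab.

```python
# Illustration only: assuming --do_lower_case controls the same lowercasing
# option as in the standard BERT tokenizer. The Hugging Face tokenizer is used
# here just to show the intended behaviour; the repo's tokenizer class may
# live under a different import path.
from transformers import BertTokenizer

# Hypothetical cased setup: cased vocab/checkpoint with lowercasing disabled
# (what I would like to use for a "tinybert-cased" general distillation).
cased_tokenizer = BertTokenizer.from_pretrained("bert-base-cased", do_lower_case=False)
print(cased_tokenizer.tokenize("TinyBERT keeps Case information"))

# Currently released setup, for comparison: lowercase vocab with lowercasing on.
uncased_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
print(uncased_tokenizer.tokenize("TinyBERT keeps Case information"))
```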
Thanks!