embedding training config file #21
Comments
Hi, the basic difference between the configurations is the db paths. For embeddings, we use literature data rather than official data as the training data. Yes, please use the
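To see for yourself which settings differ between two config files, you can diff them key by key. The following is only a minimal sketch: the key names and db paths are hypothetical placeholders, not the actual contents of the repository's config files, and `flatten` and `diff_configs` are helper names made up for this illustration.

```python
import json

MISSING = "<missing>"  # marker for a key present in only one config


def flatten(cfg, prefix=""):
    """Flatten nested dicts into {'outer.inner': value} pairs."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def diff_configs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    fa, fb = flatten(a), flatten(b)
    return {
        key: (fa.get(key, MISSING), fb.get(key, MISSING))
        for key in sorted(set(fa) | set(fb))
        if fa.get(key, MISSING) != fb.get(key, MISSING)
    }


# Hypothetical config contents; in practice, load each file with json.load().
official = json.loads('{"train_txt_db": "/txt/official_train.db", "learning_rate": 5e-05}')
literature = json.loads('{"train_txt_db": "/txt/literature_train.db", "learning_rate": 5e-05}')

for key, (old, new) in diff_configs(official, literature).items():
    print(f"{key}: {old} -> {new}")
```

If the two configs really differ only in their db paths, the loop should list only the path entries.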
OK, thanks for your reply. I have replaced the config file in the terminal. I have another question: in the evaluation stage, what does pretrained/Chinese-word-vector/embeddings refer to?
Also, I could not find chengyu_synonym_dict in train_embedding.py ... Sorry to bother you; I look forward to your reply.
Please refer to https://github.com/VisualJoyce/ChengyuBERT#learning-and-evaluating-chinese-idiom-embeddings. This is a different paper focusing on embedding learning and evaluation. The data has been shared online.
Thanks for your work!
I cannot find the file train-embeddings-base-1gpu.json mentioned in ReadMe.md, but I did find a bert-wwm-ext_literature file. Does the bert-wwm-ext_literature file replace the former?
Thanks a lot!