How to fine tune the pre-trained GloVe vectors on a custom corpus #189

Open
smousav9 opened this issue Jun 19, 2021 · 4 comments

Comments

@smousav9

Hi,

I hope you are doing well. I need to fine-tune the pre-trained GloVe vectors on a custom corpus, and I was wondering how to do it with the GloVe library. My understanding of fine-tuning is that the word vectors are initialized, at the start of training, to the values of the pre-trained word vectors.
There is a parameter in "glove.c" named "load_init_param". If it is set to "1", the code looks for an "-init-param-file" file and reads the parameters from it. I tried to work out what format the initialization file should have, and whether the initial word vectors are part of these initialization parameters, but since C is not my programming language I could not follow all the details. I would appreciate it if someone could help me initialize the word vectors with the pre-trained word vectors so that I can fine-tune GloVe on my corpus.
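
To make the idea of initializing from pre-trained vectors concrete, here is a minimal Python sketch of the preparation step: one row per word of the custom vocabulary, in vocabulary order, reusing the pre-trained vector where one exists. The function name `build_init_rows`, the small random fallback for words missing from the pre-trained set, and the in-memory dictionary are all assumptions for illustration; none of this is taken from glove.c, and it does not cover bias terms or context vectors.

```python
import random


def build_init_rows(vocab, pretrained, dim, seed=0):
    """Return one initial vector per vocabulary word, in vocabulary order.

    vocab      -- list of words from the custom corpus
    pretrained -- dict mapping word -> list of floats (pre-trained vectors)
    dim        -- expected vector dimension

    Words without a usable pre-trained vector get small random values.
    This fallback is a hypothetical choice, not glove.c's behavior.
    """
    rng = random.Random(seed)
    rows = []
    for word in vocab:
        vec = pretrained.get(word)
        if vec is None or len(vec) != dim:
            # No pre-trained vector of the right size: use a random start.
            vec = [(rng.random() - 0.5) / dim for _ in range(dim)]
        rows.append(list(vec))
    return rows


if __name__ == "__main__":
    rows = build_init_rows(["the", "zzz"], {"the": [0.1, 0.2]}, dim=2)
    print(rows[0])       # pre-trained vector reused for "the"
    print(len(rows[1]))  # random vector of length 2 for "zzz"
```

The row order matters in this sketch because the rows are matched to words by position only, so it has to follow the same vocabulary order that training uses.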

Thanks
Maryam

@AngledLuffa
Contributor

AngledLuffa commented Jun 20, 2021 via email

@smousav9
Author

Thank you for the help

@smousav9
Author

smousav9 commented Jun 21, 2021

@AngledLuffa
Thank you for the previous comment and clarification. However, I am still struggling to convert the initial txt file to a bin file. Is there an existing C or Python script that does this? glove.c reads the initialization parameter file in binary format. I am creating my initialization file as txt using Python, and I need to convert it to a bin file so that the program can read it. How can I do that?
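
For reference, a minimal Python sketch of one way such a txt-to-bin conversion could look. It assumes the text file has one `word v1 v2 ...` entry per line and that the binary file is just the values written back to back as native 8-byte doubles. That binary layout is an assumption and should be checked against how glove.c actually reads the file (including whether it expects bias terms and context vectors); the function names are hypothetical.

```python
import struct


def parse_text_vectors(lines):
    """Parse 'word v1 v2 ...' lines into (words, rows of floats)."""
    words, rows = [], []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        words.append(parts[0])
        rows.append([float(x) for x in parts[1:]])
    return words, rows


def pack_rows(rows):
    """Flatten rows and pack them as native-order 8-byte doubles.

    Assumed layout: values back to back, no header, no word strings.
    """
    flat = [value for row in rows for value in row]
    return struct.pack(f"={len(flat)}d", *flat)


def convert(txt_path, bin_path):
    """Read a text vector file and write the packed binary file."""
    with open(txt_path, encoding="utf-8") as fin:
        _, rows = parse_text_vectors(fin)
    with open(bin_path, "wb") as fout:
        fout.write(pack_rows(rows))


if __name__ == "__main__":
    _, rows = parse_text_vectors(["the 0.1 0.2", "cat 0.3 0.4"])
    print(len(pack_rows(rows)))  # 4 values * 8 bytes each
```

Only the numbers are written in this sketch, not the words, so the binary file carries no record of which row belongs to which word.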

Also, there is a shuffling step before training. How does that affect the initialization step?

Regards,
Maryam

@smousav9 smousav9 reopened this Jun 21, 2021
@AngledLuffa
Contributor

AngledLuffa commented Jul 1, 2021 via email
