Merging cooccurrence files: processed 0 lines. Unable to open file overflow_1021.bin. #155

Open
summerZXH opened this issue Oct 21, 2019 · 2 comments

Comments

@summerZXH

$ build/cooccur -memory 4.0 -vocab-file vocab.txt -verbose 2 -window-size 15 < ./train_data/data.txt > cooccurrence.bin
COUNTING COOCCURRENCES
window size: 15
context: symmetric
max product: 13752509
overflow length: 38028356
Reading vocab from file "vocab.txt"...loaded 7459765 words.
Building lookup table...table contains 215584655 elements.
Processed 7713292825 tokens.
Writing cooccurrences to disk............1141 files in total.
Merging cooccurrence files: processed 0 lines.Unable to open file overflow_1021.bin.

When I train GloVe on a large dataset (46 GB), I hit the problem above. Does anyone know why?

@ousou
Contributor

ousou commented Jan 23, 2020

This seems to be the same error as mentioned in pull request #138. The issue is probably that too many files are open at the same time. It can be fixed by raising the limit on open files, for instance: ulimit -n 2048. See the PR for more info.
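
The numbers in the log above fit this explanation. Assuming the temporary files are numbered from zero and the soft limit is the common default of 1024 (neither is confirmed in this thread), overflow_0000.bin through overflow_1020.bin plus stdin, stdout and stderr already use 1024 descriptors, so overflow_1021.bin would be the first file that cannot be opened. A rough sketch for checking the limit before rerunning; the helper name enough_descriptors and the allowance of three descriptors for the standard streams are illustrative assumptions, not part of GloVe:

```shell
# Hypothetical helper: succeeds if the soft open-file limit leaves room
# for N temporary files plus stdin, stdout and stderr (assumed overhead: 3).
enough_descriptors() {
    needed=$(( $1 + 3 ))
    limit=$(ulimit -n)          # soft limit of the current shell
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$needed" ]
}

# 1141 is the "files in total" figure reported by cooccur in the log above.
if enough_descriptors 1141; then
    echo "limit is high enough for 1141 files"
else
    echo "limit too low: raise it, e.g. with 'ulimit -n 2048', in the shell that runs cooccur"
fi
```

ulimit only affects the current shell and the processes started from it, so it has to be run in the same session as cooccur.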

@jairajrouth

Hello guys,
I am a bit new to GloVe and am interested in seeing how this project works. I am using an online editor to run it. After running the demo.sh script, I can see that the build folder was created and that the text8 and vocab.txt files were created. Next I wanted to try the cooccur tool. I ran the command below and got the message shown. I was expecting a file with the cooccurrences of the words in the vocab.txt that I passed. Am I doing anything wrong with the command? Please let me know.
All suggestions are welcome.

@ousou & @summerZXH, as you were also working on this, do you have any idea what I am doing wrong?

Thanks and Regards.
Jai

./build/cooccur -vocab-file vocab.txt <------- Command
COUNTING COOCCURRENCES <------- Message
window size: 15
context: symmetric
max product: 10485784
overflow length: 28521267
Reading vocab from file "vocab.txt"...loaded 71290 words.
Building lookup table...table contains 75253375 elements.
Processing token: 0


3 participants