Huge number of files created #6
Comments
Hmm, yes, it does create a bunch of files. There is one file for each move of every game of every model. The advantage is that the trainer only has to parse a model's directory once (which is usually pretty fast) and can then open each sample file only once per training batch. During training, samples are drawn at random from any move of any game, and random access gets slow very quickly once the data is huge. I could zip the files of past models, since they are no longer used after some point (though DeepMind say they keep the last 500k games, which would correspond to 40M files in the current layout). Do you have any other ideas to make it better?
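For illustration, a minimal sketch (not the project's actual code) of how a one-file-per-move layout gets sampled during training: the directory is listed once, and then each batch opens a handful of move files at random. The directory layout, file naming, and `.npy` array format are assumptions for the example.

```python
# Sketch of random-batch sampling from a one-file-per-move layout.
# Assumes each move was saved as its own .npy file under games_dir.
import os
import random
import numpy as np

def list_samples(games_dir):
    # One pass over the directory; each entry is one (game, move) sample file.
    return [os.path.join(games_dir, f) for f in os.listdir(games_dir)]

def sample_batch(paths, batch_size=64):
    # Random access: every batch touches batch_size arbitrary files on disk,
    # which is where the I/O cost shows up once the file count is large.
    chosen = random.sample(paths, batch_size)
    return [np.load(p) for p in chosen]
```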
Could it be put in an SQLite database?
It could. But for now I won't do it, as I feel a filesystem is best: it can be split across machines quite easily (I'm pondering using AWS to reach the infamous 0.4 s/move claimed by AlphaGo Zero).
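For comparison, here is a hypothetical sketch of the SQLite alternative raised above, storing one row per move with the position as a blob; the table and column names are made up for the example.

```python
# Hypothetical sketch: one row per move, position stored as raw bytes,
# so the whole dataset lives in a single file instead of millions of small ones.
import sqlite3
import numpy as np

conn = sqlite3.connect("games.sqlite")
conn.execute("""
    CREATE TABLE IF NOT EXISTS moves (
        id INTEGER PRIMARY KEY,
        model TEXT,
        game INTEGER,
        move INTEGER,
        position BLOB
    )
""")

def insert_move(model, game, move, position):
    # Store the numpy array as raw bytes; shape/dtype must be known at read time.
    conn.execute(
        "INSERT INTO moves (model, game, move, position) VALUES (?, ?, ?, ?)",
        (model, game, move, position.tobytes()),
    )
    conn.commit()

def sample_batch(batch_size=64):
    # ORDER BY RANDOM() is simple but scans the table; fine as an illustration.
    rows = conn.execute(
        "SELECT position FROM moves ORDER BY RANDOM() LIMIT ?", (batch_size,)
    ).fetchall()
    return [np.frombuffer(r[0], dtype=np.float32) for r in rows]
```

A single database file avoids the filesystem overhead of millions of small files, though splitting it across machines would be harder, which matches the trade-off described above.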
It ran for a couple of days and found several new best models. However, it also created numerous files (502,586 items totalling 5.6 GB). The models directory is large, and the games directory has most of the files. Perhaps zipping would be worthwhile. In any case, I'm happy to restart it once you've had a chance to make more improvements. Thanks again for sharing.