Huge number of files created #6

Open
brianprichardson opened this issue Nov 26, 2017 · 3 comments

Comments

@brianprichardson

It ran for a couple of days and found several new best models. However, it also created a huge number of files (502,586 items totalling 5.6 GB). The models directory is large, but the games directory holds most of the files. Perhaps zipping them would be worthwhile. In any case, I'm happy to restart it after you have had a chance to make more improvements. Thanks again for sharing.

@Narsil
Owner

Narsil commented Nov 27, 2017

Hmm yes it does create a bunch of files. There is a file for each move of every game of every model.

The benefit is that the trainer only has to list a model's directory once (which is usually pretty fast) and then opens each file only once per training batch. During training, samples are drawn at random from any move of any game, and random access gets slow quickly once the data is huge.
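To make that access pattern concrete, here is a minimal sketch. The directory layout (`game_<g>/move_<m>.bin`) and the helper names `index_move_files` and `sample_batch` are assumptions for illustration, not the project's actual code.

```python
import os
import random
import tempfile


def index_move_files(model_dir):
    """Walk the model directory once and return the path of every move file."""
    paths = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    return sorted(paths)


def sample_batch(paths, batch_size, rng=random):
    """Pick batch_size move files uniformly at random and read each one once."""
    chosen = rng.sample(paths, batch_size)
    batch = []
    for path in chosen:
        with open(path, "rb") as f:
            batch.append(f.read())
    return batch


# Tiny demo with a hypothetical layout: <model_dir>/game_<g>/move_<m>.bin
with tempfile.TemporaryDirectory() as model_dir:
    for g in range(3):
        game_dir = os.path.join(model_dir, f"game_{g}")
        os.makedirs(game_dir)
        for m in range(4):
            with open(os.path.join(game_dir, f"move_{m}.bin"), "wb") as f:
                f.write(bytes([g, m]))

    paths = index_move_files(model_dir)  # done once per model
    batch = sample_batch(paths, batch_size=5, rng=random.Random(0))
    print(len(paths), len(batch))
```

The index is built once, so each batch only costs `batch_size` file opens, at the price of one small file per move.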

I could zip the files of past models, since they are no longer used after some point (though DeepMind says they use the last 500k games, which would correspond to 40M files in the current architecture).
But it's not really my current focus, as I feel there are still other optimizations that could be done.
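For scale, 40M files over 500k games works out to 80 files (one per move) per game. Zipping a retired model's directory could look roughly like the sketch below; `archive_model` and the `model_0/game_0/move_<m>.bin` layout are hypothetical names used only for this example.

```python
import os
import shutil
import tempfile
import zipfile

# Figures quoted above: 500k games corresponding to 40M move files.
GAMES = 500_000
FILES = 40_000_000
moves_per_game = FILES // GAMES  # 80 files, i.e. moves, per game


def archive_model(model_dir):
    """Zip a past model's directory next to it, then delete the loose files."""
    model_dir = os.path.abspath(model_dir)
    # Creates <model_dir>.zip containing everything under model_dir.
    archive = shutil.make_archive(model_dir, "zip", root_dir=model_dir)
    shutil.rmtree(model_dir)
    return archive


# Demo on a throwaway directory (hypothetical layout).
with tempfile.TemporaryDirectory() as root:
    model_dir = os.path.join(root, "model_0")
    os.makedirs(os.path.join(model_dir, "game_0"))
    for m in range(3):
        with open(os.path.join(model_dir, "game_0", f"move_{m}.bin"), "wb") as f:
            f.write(bytes([m]))

    archive = archive_model(model_dir)
    dir_removed = not os.path.exists(model_dir)
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(moves_per_game, dir_removed)
```

This only makes sense for models that training no longer samples from, since reading a move back means opening the archive first.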

Do you have any other ideas to make it better?

@tianshuo

Could it be put in a sqlite database?
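As a rough idea of what that might look like, here is a minimal sketch using Python's built-in `sqlite3` module. The `moves` table, its columns, and `sample_batch_sqlite` are assumptions made up for this example, and the rowid sampling assumes rows are only ever inserted, never deleted.

```python
import random
import sqlite3

# In-memory database for the demo; a real run would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moves ("
    " model INTEGER, game INTEGER, move INTEGER, data BLOB,"
    " PRIMARY KEY (model, game, move))"
)

# One row per move instead of one file per move.
rows = [(0, g, m, bytes([g, m])) for g in range(3) for m in range(4)]
conn.executemany("INSERT INTO moves VALUES (?, ?, ?, ?)", rows)
conn.commit()


def sample_batch_sqlite(conn, batch_size, rng=random):
    """Fetch batch_size random moves by sampling rowids.

    Assumes rowids run 1..N without gaps (no deletions) and that
    batch_size stays below SQLite's bound-parameter limit.
    """
    total = conn.execute("SELECT COUNT(*) FROM moves").fetchone()[0]
    rowids = rng.sample(range(1, total + 1), batch_size)
    placeholders = ",".join("?" * batch_size)
    cur = conn.execute(
        f"SELECT data FROM moves WHERE rowid IN ({placeholders})", rowids
    )
    return [row[0] for row in cur]


batch = sample_batch_sqlite(conn, batch_size=5, rng=random.Random(0))
print(len(batch))
```

That would turn hundreds of thousands of small files into a single database file, with random reads handled by the rowid index.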

@Narsil
Owner

Narsil commented Dec 27, 2017

It could. But for now I won't do it, as I feel a filesystem is the best option because it can be split across machines quite easily (I'm pondering using AWS to try to reach the infamous 0.4s/move claimed by AlphaGo Zero).
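One way such a split could work is to assign each game directory to a machine by a stable hash of its name. This is only a sketch under that assumption; `shard_for`, `split_games`, and the `game_<n>` names are hypothetical and not part of the project.

```python
import zlib


def shard_for(game_id, num_machines):
    """Map a game directory name to a machine index with a stable hash."""
    return zlib.crc32(game_id.encode("utf-8")) % num_machines


def split_games(game_ids, num_machines):
    """Group game directory names by the machine that should hold them."""
    shards = [[] for _ in range(num_machines)]
    for gid in game_ids:
        shards[shard_for(gid, num_machines)].append(gid)
    return shards


game_ids = [f"game_{n}" for n in range(10)]
shards = split_games(game_ids, num_machines=3)
print([len(s) for s in shards])
```

Because `zlib.crc32` is deterministic across runs and machines, every worker can compute where a game lives without any shared index.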


3 participants