
me-unsolicited

Hello, please consider this pull request, which I implemented based on this comment.

Mainly, I added a GameHistoryDao class that creates "replay_buffer.db" with a simple key->value table and stores the games there like a dictionary. If this approach is acceptable, it could be optimized further by separating reanalysed_predicted_root_values, priorities, and game_priority into their own columns, so that updates avoid serializing/deserializing the full observation history each time. However, that would require more invasive changes to the existing code.
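The PR's actual code isn't shown in this thread, but a minimal sketch of the key->value approach described above might look like the following (class and method names here are illustrative, not necessarily those used in the PR; it stores each pickled game history as a BLOB keyed by an integer id):

```python
import pickle
import sqlite3


class GameHistoryDao:
    """Sketch of a key->value store for game histories in SQLite."""

    def __init__(self, path="replay_buffer.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS game_history (id INTEGER PRIMARY KEY, game BLOB)"
        )

    def save(self, game_id, game_history):
        # REPLACE keeps the dictionary-like semantics: writing the same key
        # overwrites the previous value.
        self.conn.execute(
            "REPLACE INTO game_history (id, game) VALUES (?, ?)",
            (game_id, pickle.dumps(game_history)),
        )
        self.conn.commit()

    def load(self, game_id):
        row = self.conn.execute(
            "SELECT game FROM game_history WHERE id = ?", (game_id,)
        ).fetchone()
        return pickle.loads(row[0]) if row else None
```

The drawback noted above follows directly from this layout: because the whole game history lives in one pickled BLOB, updating any single field (e.g. a priority) means deserializing and reserializing the entire object.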

@ahainaut
Collaborator

Hi @me-unsolicited ,
Thank you for this new feature. After reviewing and testing the code, we found that it considerably slows down training. We will therefore have to wait to merge this PR until we find a way to speed up training while keeping the replay buffer on disk.

@me-unsolicited
Author

@ahainaut ,
Thanks for the feedback! I made some improvements and it runs much faster now.

Changes:

  1. Use SQL to efficiently sample from the prioritized replay buffer.
  2. Store priorities and predicted values in separate columns from the full object.
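The actual diff isn't shown in this thread, but as a sketch of how change 1 could work, assuming a `game_priority` column (per change 2) and SQLite's window functions (available since SQLite 3.25): a cumulative sum over the priorities lets the database pick the sampled row, so only the chosen game's BLOB is ever deserialized.

```python
import pickle
import random
import sqlite3


def sample_game(conn):
    """Sample one game with probability proportional to game_priority,
    deserializing only the selected row's BLOB."""
    (total,) = conn.execute(
        "SELECT SUM(game_priority) FROM game_history"
    ).fetchone()
    threshold = random.uniform(0, total)
    # The window function builds a running total of priorities; the first
    # row whose cumulative sum crosses the threshold is the sample.
    row = conn.execute(
        """
        SELECT id, game FROM (
            SELECT id, game,
                   SUM(game_priority) OVER (ORDER BY id) AS cumulative
            FROM game_history
        )
        WHERE cumulative >= ?
        ORDER BY cumulative, id
        LIMIT 1
        """,
        (threshold,),
    ).fetchone()
    return row[0], pickle.loads(row[1])
```

Change 2 then pays off on the write path as well: a priority update becomes a plain `UPDATE game_history SET game_priority = ? WHERE id = ?`, with no need to touch the serialized observation history.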
