add retrace #972
Codecov Report
Coverage Diff

| | main | #972 | +/- |
|---|---|---|---|
| Coverage | 0.02% | 23.28% | +23.25% |
| Files | 210 | 224 | +14 |
| Lines | 7437 | 7718 | +281 |
| Hits | 2 | 1797 | +1795 |
| Misses | 7435 | 5921 | -1514 |
I'm closing this as I am not planning on finishing this soon, and the upcoming breaking changes mean one might as well start from scratch to implement this.
This PR adds a function to compute the Retrace Bellman operator as described in this paper. The paper presents it as an algorithm, so I put it in RLZoo, but it can be argued that it is more of a component used by RL algorithms. I'm thus open to moving it to RLCore.
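For readers unfamiliar with the operator, here is a minimal sketch of how Retrace targets can be computed along a single trajectory. It assumes the paper in question is the Retrace(λ) work of Munos et al. (2016), and it is written in plain Python for illustration only: it is not the code in this PR and does not reflect the RLZoo or RLCore API. The function name `retrace_targets` and all of its parameters are hypothetical. The recursion used is `acc_t = delta_t + gamma * c_{t+1} * acc_{t+1}` with `delta_t = r_t + gamma * E_pi[Q(x_{t+1}, .)] - Q(x_t, a_t)` and truncated importance weights `c_t = lam * min(1, pi(a_t|x_t) / mu(a_t|x_t))`.

```python
def retrace_targets(rewards, q_taken, expected_next_q, pi_probs, mu_probs,
                    gamma=0.99, lam=1.0):
    """Hypothetical helper: Retrace targets for one trajectory of length n.

    rewards[t]         : r_t
    q_taken[t]         : Q(x_t, a_t) for the action actually taken
    expected_next_q[t] : E_{a ~ pi} Q(x_{t+1}, a), 0.0 if x_{t+1} is terminal
    pi_probs[t]        : target policy probability pi(a_t | x_t)
    mu_probs[t]        : behaviour policy probability mu(a_t | x_t), > 0
    """
    n = len(rewards)
    # Truncated importance weights c_t = lam * min(1, pi / mu).
    c = [lam * min(1.0, p / m) for p, m in zip(pi_probs, mu_probs)]
    targets = [0.0] * n
    acc = 0.0  # discounted, trace-weighted sum of future TD errors
    for t in reversed(range(n)):
        # Expected one-step TD error under the target policy.
        delta = rewards[t] + gamma * expected_next_q[t] - q_taken[t]
        # No correction flows in from beyond the end of the trajectory.
        next_c = c[t + 1] if t + 1 < n else 0.0
        acc = delta + gamma * next_c * acc
        targets[t] = q_taken[t] + acc
    return targets
```

With `lam=0` the traces are cut immediately and the targets reduce to the one-step expected backup `r_t + gamma * E_pi[Q(x_{t+1}, .)]`. With `lam=1` and identical target and behaviour policies, every weight equals 1 and the full sequence of TD errors is accumulated.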
The tests are incomplete. For now I only test whether the function returns the expected operators given toy networks. It also needs to be tested with actual networks to exercise the API.
PR #966 must also be merged, because this PR uses the new `target` API.
PR Checklist