Roadmap

Rewrite core
Distributed RL
- Some References:
  - Ray RLlib: https://github.com/ray-project/ray/blob/master/doc/source/rllib.rst
    - papers: Ray RLlib: http://proceedings.mlr.press/v80/liang18b.html; Ray: https://arxiv.org/abs/1712.05889
    - talks: a good introduction to distributed RL (first ~25 mins), followed by a discussion of Ray RLlib (next ~15 mins; the rest of the talk covers imitation learning), given by two of the RLlib authors: https://www.youtube.com/watch?v=Y6feXBY6_XQ&t=0s&list=PLkFD6_40KJIxJMR-j5A1mkxK26gh_qg37&index=6
    - blogs: https://rise.cs.berkeley.edu/blog/scaling-multi-agent-rl-with-rllib/
  - Ape-X: https://openreview.net/forum?id=H1Dy---0Z
    - Uber single-machine implementation: https://github.com/uber-research/ape-x
  - R2D2: an extension of Ape-X with recurrent networks: https://openreview.net/forum?id=r1lyTjAqYX
  - IMPALA: https://icml.cc/Conferences/2018/Schedule?showEvent=3093
  - Uber Faster Neuro-evolution: https://eng.uber.com/accelerated-neuroevolution/
  - Dopamine: A Research Framework for Deep Reinforcement Learning: https://arxiv.org/pdf/1812.06110.pdf

    From the paper's conclusions: "From the above taxonomy we conclude that there is a natural trade-off between simplicity and complexity in a deep RL research framework. The resulting choices empower some research objectives, possibly at the detriment of others."
  - facebookresearch/Horizon (paper): limited set of algorithms, but well tested.
  - Surreal (code, paper): with Kubernetes integration.
Hyperparameter optimisation
Support for easily deploying experiments
More environments
- Bullet
- Box2D
