- Fixed Topology Neural Network Search.
- Q-Learning.
- Deep RL Policy Network.
- Deep Q-Learning (DQN).
- Deep Q-Learning (DQN) +[target network, reward clipping, frame skipping].
- Deep Deterministic Policy Gradient (DDPG).
- Deep Neuroevolution. [working, no results to show!]
Deep Q-Learning with frame skipping (the same action is repeated for 3 frames), a target network updated every 2 episodes (episode % 2 == 0), and reward clipping to (-1, 1). Landing at episode 720:
Parameters for the run below:
- Refresh the target network every 10 episodes.
- Skip 3 frames.
- Minibatch size 32.
- At episode 460.
Link: https://youtu.be/fXbqDDaJDvg
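
The three DQN additions used in the runs above (frame skipping, reward clipping, periodic target-network refresh) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this repository: `ToyEnv`, `clip_reward`, `step_with_frame_skip` and `should_refresh_target` are hypothetical names, the environment is a stand-in with made-up rewards, and the "weights" are a placeholder dictionary rather than a real network. Clipping each frame's reward before summing is also an assumption; clipping the summed reward is an equally plausible reading.

```python
import random

FRAME_SKIP = 3        # repeat each chosen action for 3 frames
TARGET_REFRESH = 10   # refresh the target network every 10 episodes
BATCH_SIZE = 32       # minibatch size (listed for completeness, unused in this sketch)


class ToyEnv:
    """Hypothetical stand-in environment with fixed-length episodes."""

    def __init__(self, length=10):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        # Rewards are deliberately outside [-1, 1] so that clipping has an effect.
        reward = 5.0 if action == 1 else -5.0
        done = self.t >= self.length
        return self.t, reward, done


def clip_reward(reward, low=-1.0, high=1.0):
    """Clamp a raw reward into [low, high]."""
    return max(low, min(high, reward))


def step_with_frame_skip(env, action, skip=FRAME_SKIP):
    """Repeat `action` for up to `skip` frames.

    Returns the last observed state, the sum of the per-frame clipped
    rewards (an assumption, see the note above), and the done flag.
    """
    total = 0.0
    state, done = None, False
    for _ in range(skip):
        state, reward, done = env.step(action)
        total += clip_reward(reward)
        if done:
            break
    return state, total, done


def should_refresh_target(episode, every=TARGET_REFRESH):
    """True on episodes where the target network should be refreshed."""
    return episode % every == 0


# Placeholder "networks": a real agent would hold model weights here.
online_weights = {"w": 0.0}
target_weights = dict(online_weights)
refreshes = []

random.seed(0)
env = ToyEnv(length=10)
for episode in range(1, 21):
    env.reset()
    done = False
    while not done:
        action = random.choice([0, 1])   # stand-in for an epsilon-greedy policy
        state, reward, done = step_with_frame_skip(env, action)
    online_weights["w"] += 1.0           # stand-in for gradient updates
    if should_refresh_target(episode):
        target_weights = dict(online_weights)
        refreshes.append(episode)
```

Passing `every=2` to `should_refresh_target` gives the `episode % 2 == 0` schedule of the first run instead of the 10-episode schedule of the second.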
- Welcoming the Era of Deep Neuroevolution: https://eng.uber.com/deep-neuroevolution/?lipi=urn%3Ali%3Apage%3Ad_flagship3_feed%3BzyxkMF5OTd%2BI48jAyJJ%2B2A%3D%3D
- MIT Deep-RL self-driving cars: https://selfdrivingcars.mit.edu
- Deep RL lecture by David Silver UCL: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Resources_files/deep_rl.pdf
- DQN Nature paper, 'Human-level control through deep reinforcement learning' (2015)
- DDPG paper, 'Continuous control with deep reinforcement learning': https://arxiv.org/abs/1509.02971
- Deep Neuroevolution @ UberAI Labs, https://arxiv.org/abs/1712.06567