torch_rl

Reinforcement learning algorithms for spiking networks and artificial neural networks.

Currently implemented

Deep deterministic policy gradients with hindsight experience replay
Stochastic policy gradient with hindsight experience replay
Biased hindsight policy gradient
Proximal policy optimization on GPU
Covariance matrix adaptation evolution strategy (CMA-ES)
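
Several of the entries above rely on hindsight experience replay. As a rough illustration of the idea only, and not this repository's API, here is a minimal sketch of the "final" relabeling strategy: each transition of an episode is copied with its goal replaced by the state the episode actually ended in, and the sparse reward is recomputed. The names `Transition`, `sparse_reward`, and `relabel_with_final_goal` are hypothetical, and the sketch assumes that goals live in the same space as states.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """Hypothetical goal-conditioned transition (not a torch_rl type)."""
    state: Tuple[float, ...]
    action: int
    next_state: Tuple[float, ...]
    goal: Tuple[float, ...]
    reward: float


def sparse_reward(achieved: Tuple[float, ...],
                  goal: Tuple[float, ...],
                  tol: float = 1e-6) -> float:
    """Return 0.0 if the achieved state matches the goal within tol, else -1.0."""
    dist = max(abs(a - g) for a, g in zip(achieved, goal))
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, substituting the final achieved state as the goal.

    Assumption: the achieved goal of a transition is simply its next_state.
    """
    if not episode:
        return []
    hindsight_goal = episode[-1].next_state
    return [
        t._replace(goal=hindsight_goal,
                   reward=sparse_reward(t.next_state, hindsight_goal))
        for t in episode
    ]


# A two-step episode that never reaches its original goal (5.0,).
episode = [
    Transition((0.0,), 1, (1.0,), (5.0,), -1.0),
    Transition((1.0,), 1, (2.0,), (5.0,), -1.0),
]

# Store both the original and the relabeled transitions.
relabeled = relabel_with_final_goal(episode)
replay_buffer = episode + relabeled
```

In the relabeled copies, the last transition counts as a success for the substituted goal, so an off-policy learner receives a non-trivial reward signal even from an episode that failed on its original goal.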

In progress...

Distributed proximal policy optimization