Categorical DQN

An attempt at a CNTK implementation of Categorical DQN, from the paper 'A Distributional Perspective on Reinforcement Learning'.
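For readers unfamiliar with the paper, the core step of Categorical DQN is projecting the Bellman-updated return distribution back onto a fixed set of atoms. The snippet below is a minimal pure-Python sketch of that projection for a single transition. It is not taken from this repository's CNTK code: the function name `project_distribution`, its parameters, and the default values for `gamma`, `v_min` and `v_max` are assumptions chosen for illustration.

```python
import math


def project_distribution(probs, reward, done, gamma=0.99, v_min=-10.0, v_max=10.0):
    """Project the target distribution for one transition onto the fixed support.

    probs  -- next-state atom probabilities (list of floats, at least 2 atoms)
    reward -- immediate reward for the transition
    done   -- True if the transition ended the episode
    All names and defaults here are illustrative, not the repository's API.
    """
    n_atoms = len(probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    projected = [0.0] * n_atoms

    for j, p in enumerate(probs):
        z_j = v_min + j * delta_z
        # Bellman update of the atom; terminal transitions keep only the reward.
        target = reward if done else reward + gamma * z_j
        # Clip to the support range.
        target = min(max(target, v_min), v_max)
        # Fractional index of the updated atom on the support.
        b = (target - v_min) / delta_z
        # Guard against floating-point drift past the last index.
        b = min(max(b, 0.0), float(n_atoms - 1))
        lower = int(math.floor(b))
        upper = int(math.ceil(b))
        if lower == upper:
            # The updated atom lands exactly on a support point.
            projected[lower] += p
        else:
            # Split the probability mass between the two neighbouring atoms.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)

    return projected
```

The projected distribution would then typically serve as the target in a cross-entropy loss against the predicted distribution for the action taken. How this repository wires that into CNTK is not shown here.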

Dependencies

To train a model for CartPole from OpenAI Gym, use:

python -m experiments.train_cartpole

To watch the trained model in action, use:

python -m experiments.watch_cartpole

Here are the results from a sample run:

Not currently planned. If you run it and get results, I'll be happy to include them.