Example Categorical DQN implementation with ReLAx

This repository contains an implementation of a categorical deep Q-network (Categorical DQN) with ReLAx.
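As a reminder of what makes the categorical variant different from plain DQN, the core step is projecting the Bellman-shifted return distribution back onto a fixed set of atoms. The following is a minimal NumPy sketch of that projection, not the ReLAx implementation: the function name `project_distribution`, its arguments, and the support range used in the usage lines are all illustrative assumptions.

```python
import numpy as np


def project_distribution(next_probs, rewards, dones, support, gamma=0.99):
    """Project r + gamma * z onto a fixed, evenly spaced support (illustrative sketch).

    next_probs: (batch, n_atoms) probabilities of the next-state return distribution
    rewards:    (batch,) immediate rewards
    dones:      (batch,) 1.0 where the transition is terminal, else 0.0
    support:    (n_atoms,) evenly spaced atom values
    """
    v_min, v_max = support[0], support[-1]
    n_atoms = support.shape[0]
    delta_z = (v_max - v_min) / (n_atoms - 1)

    # Shift and shrink every atom, then clip to the support range.
    tz = rewards[:, None] + gamma * (1.0 - dones[:, None]) * support[None, :]
    tz = np.clip(tz, v_min, v_max)

    # Fractional index of each shifted atom on the original support.
    b = np.clip((tz - v_min) / delta_z, 0, n_atoms - 1)
    lower = np.floor(b).astype(np.int64)
    upper = np.ceil(b).astype(np.int64)

    # If b lands exactly on an atom, lower == upper and that atom gets all the mass.
    same = (lower == upper).astype(next_probs.dtype)

    projected = np.zeros_like(next_probs)
    batch_idx = np.broadcast_to(np.arange(next_probs.shape[0])[:, None], lower.shape)
    # Split each atom's probability between its two neighbouring atoms.
    np.add.at(projected, (batch_idx, lower), next_probs * (upper - b + same))
    np.add.at(projected, (batch_idx, upper), next_probs * (b - lower))
    return projected


# Usage with a hypothetical 51-atom support on [-10, 10].
support = np.linspace(-10.0, 10.0, 51)
next_probs = np.full((2, 51), 1.0 / 51)
target = project_distribution(
    next_probs,
    rewards=np.array([1.0, 0.0]),
    dones=np.array([0.0, 1.0]),
    support=support,
)
```

Each row of `target` remains a valid probability distribution and would serve as the cross-entropy target for the predicted distribution of the action taken.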

The Categorical DQN actor was trained on the Breakout-v0 Atari Gym environment for 3M environment steps.

Note: For demonstration purposes, training was run for only 3M steps. In papers, DQN and its extensions are trained for 200M steps, which may take several days. That is why the performance here is lower than reported in papers.

The graph of average return vs. environment step is shown below (logged every 50k steps):

The distribution of estimated Q-values vs data Q-values is shown below:
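For readers who want to reproduce a comparison of this kind, here is a rough sketch of how the two quantities could be computed. It assumes that "estimated Q-values" are the expectations of the predicted categorical distributions and that "data Q-values" are discounted returns-to-go observed in collected episodes; both readings, as well as the helper names `expected_q_values` and `discounted_returns`, are assumptions and are not taken from ReLAx.

```python
import numpy as np


def expected_q_values(probs, support):
    """Collapse categorical distributions into scalar Q-values.

    probs:   (batch, n_actions, n_atoms) predicted atom probabilities
    support: (n_atoms,) atom values
    """
    return (probs * support).sum(axis=-1)


def discounted_returns(rewards, gamma=0.99):
    """Discounted return-to-go for every step of a single episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Usage with made-up numbers (not results from this repository).
support = np.linspace(-10.0, 10.0, 51)
probs = np.full((1, 4, 51), 1.0 / 51)   # uniform distribution for 4 actions
q_est = expected_q_values(probs, support)
q_data = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
```

Plotting histograms of the estimated values against the returns-to-go for the same state-action pairs gives a quick visual check for over- or under-estimation.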

Resulting Policy:

categorical_dqn_run.mp4
