# A Repo for Classical to State-of-the-Art Deep Q-Network Algorithms

DQN is an algorithm that addresses the question: how do we make reinforcement learning look more like supervised learning? Two problems consistently show up in value-based reinforcement learning:

  1. Data is not independent and identically distributed (the IID assumption is violated)
  2. Non-stationarity of the targets

To overcome these problems, the algorithms need tweaks that make the data look more IID and keep the targets (approximately) fixed.

Solutions (which form part of the DQNs we see today):

To make the target values more stationary, we can keep a separate network whose weights are frozen for multiple steps and use it exclusively to compute the targets, i.e., a target network.
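A minimal sketch of this idea, assuming PyTorch and a hypothetical two-layer Q-network (the layer sizes, `sync_every`, and `gamma` values are illustrative, not taken from this repo):

```python
import copy
import torch
import torch.nn as nn

# Online network that is trained, plus a frozen copy used only for targets.
online_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = copy.deepcopy(online_net)

def sync_target(step, sync_every=1_000):
    """Copy the online weights into the target network every `sync_every` steps."""
    if step % sync_every == 0:
        target_net.load_state_dict(online_net.state_dict())

def td_target(reward, next_state, done, gamma=0.99):
    """Targets come from the fixed network, so they stay stationary between syncs."""
    with torch.no_grad():
        max_next_q = target_net(next_state).max(dim=1).values
    return reward + gamma * (1.0 - done) * max_next_q
```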

Use "replay" of already seen experiences (Experience Replay) , often referred to as the replay buffer or a replay memory and holds experience samples for several steps, allowing the sampling of mini-batches from a broad-set of past experiences.

DQN with Replay memory
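A sketch of a single DQN update tying the two ideas together, reusing the hypothetical `online_net`, `target_net`, `td_target`, and `ReplayBuffer` from the sketches above (optimizer choice and hyperparameters are assumptions, not this repo's settings):

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-4)

def dqn_update(buffer, batch_size=32):
    # Sample a decorrelated mini-batch of past transitions.
    states, actions, rewards, next_states, dones = buffer.sample(batch_size)
    states      = torch.as_tensor(states, dtype=torch.float32)
    actions     = torch.as_tensor(actions, dtype=torch.int64).unsqueeze(1)
    rewards     = torch.as_tensor(rewards, dtype=torch.float32)
    next_states = torch.as_tensor(next_states, dtype=torch.float32)
    dones       = torch.as_tensor(dones, dtype=torch.float32)

    # Q(s, a) from the online network; targets from the frozen target network.
    q_values = online_net(states).gather(1, actions).squeeze(1)
    targets  = td_target(rewards, next_states, dones)

    loss = F.smooth_l1_loss(q_values, targets)  # Huber loss, common in DQN implementations
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```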

## Papers associated with the novel algorithms

- Playing Atari with Deep Reinforcement Learning [Paper]
- Deep Reinforcement Learning with Double Q-learning [Paper]
- Dueling Network Architectures for Deep Reinforcement Learning [Paper]
- Prioritized Experience Replay [Paper]
- Noisy Networks for Exploration [Paper]
- A Distributional Perspective on Reinforcement Learning (Categorical DQN) [Paper]
- Rainbow: Combining Improvements in Deep Reinforcement Learning [Paper]
- Distributional Reinforcement Learning with Quantile Regression [Paper]
- Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [Paper]