Some reinforcement learning algorithms implemented with TensorFlow 2
For practice I chose the OpenAI Gym LunarLander environment. It is representative and doesn't skip frames like some other environments.
- Advantage Actor-Critic with online update, N-step returns backup and entropy bonus
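A minimal sketch of how the N-step return, advantage, and entropy bonus can fit together in the actor and critic losses; the names and hyperparameters here are illustrative, not the repository's actual code:

```python
import tensorflow as tf

def n_step_return(rewards, bootstrap_value, gamma=0.99):
    # G_t = r_t + gamma * r_{t+1} + ... + gamma^N * V(s_{t+N})
    g, returns = bootstrap_value, []
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return tf.stack(returns[::-1])

def actor_critic_losses(logits, actions, values, returns, entropy_beta=0.01):
    advantages = returns - values                        # A(s,a) = G - V(s)
    log_pi = tf.nn.log_softmax(logits)
    log_probs = tf.gather(log_pi, actions, batch_dims=1)
    entropy = -tf.reduce_sum(tf.exp(log_pi) * log_pi, axis=-1)
    policy_loss = -tf.reduce_mean(
        log_probs * tf.stop_gradient(advantages) + entropy_beta * entropy)
    value_loss = tf.reduce_mean(tf.square(advantages))   # critic regression to G
    return policy_loss, value_loss
```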
- Proximal Policy Optimization (PPO) + Generalized Advantage Estimator (GAE)
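For reference, GAE reduces to a single backward pass over one rollout; this helper only illustrates the formula and is not the code used in the repository:

```python
import numpy as np

def gae(rewards, values, next_value, dones, gamma=0.99, lam=0.95):
    # delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
    # A_t     = delta_t + gamma * lam * (1 - done_t) * A_{t+1}
    values = np.append(values, next_value)
    advantages = np.zeros(len(rewards), dtype=np.float32)
    last_adv = 0.0
    for t in reversed(range(len(rewards))):
        nonterminal = 1.0 - dones[t]
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        last_adv = delta + gamma * lam * nonterminal * last_adv
        advantages[t] = last_adv
    returns = advantages + values[:-1]   # value-function targets for PPO
    return advantages, returns
```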
- Soft Actor-Critic with Value network (alpha term regularization taken from SAC v2)
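The SAC v2 alpha regularization mentioned above is usually a separate gradient step on log(alpha) against a target entropy; a hedged sketch with illustrative names (the target entropy of -2 assumes the 2-D continuous LunarLander action space):

```python
import tensorflow as tf

log_alpha = tf.Variable(0.0)                      # optimize log(alpha) to keep alpha > 0
target_entropy = -2.0                             # assumption: -dim(action space)
alpha_optimizer = tf.keras.optimizers.Adam(3e-4)

def update_alpha(log_probs):
    # log_probs: log pi(a|s) of actions freshly sampled from the current policy
    with tf.GradientTape() as tape:
        alpha_loss = -tf.reduce_mean(
            tf.exp(log_alpha) * tf.stop_gradient(log_probs + target_entropy))
    grads = tape.gradient(alpha_loss, [log_alpha])
    alpha_optimizer.apply_gradients(zip(grads, [log_alpha]))
    return tf.exp(log_alpha)                      # current temperature for the actor loss
```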
- APE-X DPG
- APE-X with Soft Actor-Critic
- Curiosity based on Random Network Distillation (with Soft Actor-Critic)
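The RND curiosity bonus is the prediction error of a trainable network against a frozen, randomly initialized one; a minimal sketch (layer sizes are arbitrary, not those used in this repo):

```python
import tensorflow as tf

def make_rnd_nets(obs_dim, embed_dim=64):
    def net():
        return tf.keras.Sequential([
            tf.keras.layers.Dense(128, activation='relu', input_shape=(obs_dim,)),
            tf.keras.layers.Dense(embed_dim)])
    target, predictor = net(), net()
    target.trainable = False                       # target stays random and frozen
    return target, predictor

def intrinsic_reward(target, predictor, obs):
    # curiosity bonus = || f_target(s) - f_predictor(s) ||^2, averaged over features
    return tf.reduce_mean(tf.square(target(obs) - predictor(obs)), axis=-1)
```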
- Recurrent Experience Replay in Distributed Reinforcement Learning (R2D2) with SAC.
- Orchestrator
- Agent
- Learner
- Agent buffer: responsible for collecting trajectories; an important part of the whole algorithm.
Note 1: for this experiment the famous LunarLander environment was altered to produce 'stacked' states. This is achieved by adding linearly interpolated states between 'state' and 'next_state' (see the sketch after Note 2).
Note 2: Because the original paper says nothing about behavior near the trajectory end, the simplest approach was taken: the length of the last trajectory may vary, but it always contains at least 2 records.
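A sketch of what Notes 1 and 2 describe; the function and parameter names are illustrative, not the agent buffer's actual API:

```python
import numpy as np

def stack_with_interpolation(state, next_state, n_substeps=4):
    # Note 1: build a 'stacked' observation by linearly interpolating
    # between state and next_state, since the env returns single frames.
    alphas = np.linspace(0.0, 1.0, n_substeps, endpoint=False)
    frames = [state + a * (next_state - state) for a in alphas]
    return np.stack(frames, axis=0)               # shape: (n_substeps, obs_dim)

def split_episode(records, rollout_len):
    # Note 2: chop one episode into fixed-size trajectories; the last one
    # may be shorter, but only trajectories with at least 2 records are kept.
    chunks = []
    for start in range(0, len(records), rollout_len):
        chunk = records[start:start + rollout_len]
        if len(chunk) >= 2:
            chunks.append(chunk)
    return chunks
```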
- Regularizing Action Policies for Smooth Control implementation based on Soft Actor-Critic
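The paper adds temporal and spatial smoothness penalties to the actor loss; a hedged sketch assuming `policy` returns mean actions (the coefficients and noise scale below are illustrative):

```python
import tensorflow as tf

def smoothness_penalty(policy, states, next_states, sigma=0.05,
                       lambda_t=1.0, lambda_s=1.0):
    a = policy(states)
    a_next = policy(next_states)                   # temporal: consecutive states
    a_near = policy(states + tf.random.normal(tf.shape(states), stddev=sigma))
    temporal = tf.reduce_mean(tf.reduce_sum(tf.square(a - a_next), axis=-1))
    spatial = tf.reduce_mean(tf.reduce_sum(tf.square(a - a_near), axis=-1))
    return lambda_t * temporal + lambda_s * spatial   # added to the SAC actor loss
```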
- Active Dendrites networks implementation (arXiv paper)
- Modified LunarLander environment: LunarLander multitask. This implementation has two tasks: the original landing task and a new one, lift-off. The latter requires the lander to fly off from the landing pad.
- Active Dendrites and k-Winner-Takes-All layers (sketched after this list)
- The training script
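A simplified sketch of the two layer types; the gating here takes the sigmoid of the strongest dendritic segment response, and class and argument names are illustrative rather than the repository's actual API:

```python
import tensorflow as tf

class KWinnersTakeAll(tf.keras.layers.Layer):
    """Keeps the k largest activations per sample and zeroes the rest."""
    def __init__(self, k, **kwargs):
        super().__init__(**kwargs)
        self.k = k

    def call(self, x):
        kth_largest = tf.math.top_k(x, k=self.k).values[..., -1:]
        return tf.where(x >= kth_largest, x, tf.zeros_like(x))

class ActiveDendritesDense(tf.keras.layers.Layer):
    """Dense layer whose units are gated by dendritic segments attending to a
    context vector (e.g. a task embedding)."""
    def __init__(self, units, num_segments, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units)
        self.units, self.num_segments = units, num_segments

    def build(self, input_shape):
        _, context_shape = input_shape            # expects inputs = (features, context)
        self.segments = self.add_weight(
            name='segments',
            shape=(self.units, self.num_segments, int(context_shape[-1])))

    def call(self, inputs):
        x, context = inputs
        feed = self.dense(x)                                       # (batch, units)
        seg = tf.einsum('bc,usc->bus', context, self.segments)     # segment responses
        gate = tf.sigmoid(tf.reduce_max(seg, axis=-1))             # strongest segment per unit
        return feed * gate
```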