Regular updates on deep learning, reinforcement learning, and their applications to combinatorial optimization problems.
This repository provides insights into a range of reinforcement learning algorithms, with clean and well-structured implementations.

- Deep Q Network (DQN)
  Implementation of the foundational DQN algorithm, which combines Q-learning with deep neural networks for decision making.
- Double DQN (DDQN)
  A more stable variant of DQN that reduces overestimation bias in the Q-value estimates.
- Dueling DQN
  An enhancement to DQN that separates the state-value and advantage functions, improving the network's performance (a sketch of the dueling head and the Double DQN target follows this list).
- REINFORCE
  A Monte Carlo policy gradient method that directly optimizes policy performance.
- REINFORCE with Baseline
  A variant of REINFORCE that reduces variance by subtracting a learned baseline value.
- Actor-Critic
  Combines the benefits of value-based and policy-based methods by learning both a policy and a value function.
- Advantage Actor-Critic (A2C)
  A synchronous version of Actor-Critic that uses the advantage function to optimize the policy.
- Proximal Policy Optimization (PPO)
  A state-of-the-art policy gradient method that ensures stable learning by limiting the size of each policy update (the policy-gradient losses are sketched after this list).
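The value-based entries above share two ideas that are easy to show in code: a dueling head that splits the state value V(s) from the action advantages A(s, a), and the Double DQN target, where the online network selects the next action and the target network evaluates it. The following is a minimal PyTorch sketch for illustration, not the repository's exact code; layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling architecture: a shared trunk feeding separate value and advantage streams."""
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # A(s, a)

    def forward(self, obs):
        h = self.trunk(obs)
        v, a = self.value(h), self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)

def double_dqn_target(online_net, target_net, reward, next_obs, done, gamma=0.99):
    """Double DQN target: the online network picks the action, the target network scores it."""
    with torch.no_grad():
        next_action = online_net(next_obs).argmax(dim=-1, keepdim=True)
        next_q = target_net(next_obs).gather(-1, next_action).squeeze(-1)
        return reward + gamma * (1.0 - done) * next_q
```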
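Likewise, the policy-gradient entries differ mainly in their loss functions. Below is a hedged sketch of the REINFORCE loss (with an optional baseline), the A2C actor-critic loss, and PPO's clipped surrogate; the coefficients and tensor shapes are illustrative assumptions.

```python
import torch

def reinforce_loss(log_probs, returns, baseline=None):
    """REINFORCE: maximize E[log pi(a|s) * G]; subtracting a baseline reduces variance."""
    advantage = returns - baseline if baseline is not None else returns
    return -(log_probs * advantage.detach()).mean()

def a2c_loss(log_probs, values, returns, value_coef=0.5):
    """Advantage Actor-Critic: policy term weighted by A = G - V(s), plus a critic regression term."""
    advantage = returns - values
    policy_loss = -(log_probs * advantage.detach()).mean()
    value_loss = advantage.pow(2).mean()
    return policy_loss + value_coef * value_loss

def ppo_clip_loss(new_log_probs, old_log_probs, advantage, clip_eps=0.2):
    """PPO clipped surrogate: limits how far the updated policy can move from the old one."""
    ratio = (new_log_probs - old_log_probs).exp()
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()
```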

- Graph Convolutional Network (GCN)
  Reference: Kipf, T. N., Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016. (A minimal GCN layer is sketched after this list.)
- LSTM and Pointer Network, A2C, Greedy & Sampling for TSP
  Reference: Bello, I., Pham, H., Le, Q. V., et al. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016. (A simplified pointer decoding step is sketched after this list.)
- Embedding and Pointer Network, REINFORCE with Rollout Baseline, Greedy & Sampling for VRP
  Reference: Nazari, M., Oroojlooy, A., Snyder, L., et al. Reinforcement learning for solving the vehicle routing problem. Advances in Neural Information Processing Systems, 2018, 31.
- Multi-Head Self-Attention, REINFORCE, Active Search for TSP
  Reference: Bello, I., Pham, H., Le, Q. V., et al. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016. Active search solves each instance from scratch, continuing to learn and adjust the model by DRL during inference (see the active search sketch after this list).
- Multi-Head Self-Attention, REINFORCE, Greedy & Sampling for TSP
  Reference: Kool, W., Van Hoof, H., Welling, M. Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475, 2018.
- Multi-Head Self-Attention, REINFORCE with Rollout Baseline, Greedy & Sampling for TSP
  Reference: Kool, W., Van Hoof, H., Welling, M. Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475, 2018. The rollout baseline is a more effective strategy than a moving-average baseline (see the rollout baseline sketch after this list).
- Multi-Head Self-Attention, REINFORCE with Rollout Baseline, Greedy & Sampling for VRP
  Reference: Kool, W., Van Hoof, H., Welling, M. Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475, 2018.
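
A minimal single-layer GCN in the propagation form of Kipf & Welling, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), shown here as a dense-matrix sketch for illustration rather than an optimized sparse implementation:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution layer: normalize the adjacency (with self-loops) and propagate features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adj, h):
        # adj: (n, n) adjacency matrix, h: (n, in_dim) node features
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)   # add self-loops
        deg_inv_sqrt = a_hat.sum(dim=-1).pow(-0.5)                 # D^{-1/2}
        norm_adj = deg_inv_sqrt.unsqueeze(-1) * a_hat * deg_inv_sqrt.unsqueeze(0)
        return torch.relu(norm_adj @ self.linear(h))
```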
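The pointer-network and attention models above all decode a route one node at a time. The sketch below shows a simplified decoding step: score each node against the decoder query, mask visited nodes, and pick the next node greedily or by sampling. The plain dot-product scoring is a simplification; the referenced papers use additive attention (Bello et al., Nazari et al.) or scaled dot-product attention with tanh clipping (Kool et al.).

```python
import torch

def pointer_step(query, node_embeddings, visited_mask, sample=True):
    """One decoding step of a pointer mechanism over the remaining nodes."""
    # query: (batch, d), node_embeddings: (batch, n, d), visited_mask: (batch, n) bool
    scores = torch.einsum("bd,bnd->bn", query, node_embeddings)  # compatibility scores
    scores = scores.masked_fill(visited_mask, float("-inf"))     # forbid revisiting nodes
    probs = torch.softmax(scores, dim=-1)
    if sample:
        action = torch.multinomial(probs, 1).squeeze(-1)  # sampling decoding
    else:
        action = probs.argmax(dim=-1)                     # greedy decoding
    return action, probs
```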
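A schematic of active search as used with the Bello et al. model: the policy keeps being updated by REINFORCE on a single test instance at inference time, while the best tour seen so far is retained. `model.sample` and `tour_length` are hypothetical interfaces used only to show the control flow.

```python
def active_search(model, optimizer, instance, n_steps=100, batch_size=64, alpha=0.9):
    """Active search: keep optimizing the policy on one test instance during inference.
    model.sample(instance, batch_size) -> (tours, summed log-probs) and
    tour_length(instance, tours) are hypothetical interfaces for this sketch."""
    best_len, best_tour = float("inf"), None
    baseline = None
    for _ in range(n_steps):
        tours, log_probs = model.sample(instance, batch_size)
        lengths = tour_length(instance, tours)
        if lengths.min().item() < best_len:                 # keep the best tour found so far
            best_len = lengths.min().item()
            best_tour = tours[lengths.argmin()]
        mean_len = lengths.mean()
        baseline = mean_len if baseline is None else alpha * baseline + (1 - alpha) * mean_len
        loss = ((lengths - baseline).detach() * log_probs).mean()  # REINFORCE update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return best_tour, best_len
```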
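And a sketch of the greedy rollout baseline from Kool et al.: the advantage compares the cost of a sampled tour with the cost of a greedy rollout by a frozen copy of the best model so far (which the paper periodically replaces after a paired t-test). `model.sample`, `baseline_model.greedy`, and `tour_length` are hypothetical interfaces.

```python
import torch

def rollout_baseline_loss(model, baseline_model, instances):
    """REINFORCE with a greedy rollout baseline: advantage = sampled cost - greedy cost.
    model.sample, baseline_model.greedy, and tour_length are hypothetical interfaces."""
    tours, log_probs = model.sample(instances)               # stochastic rollout by current policy
    with torch.no_grad():
        greedy_tours, _ = baseline_model.greedy(instances)   # deterministic rollout, frozen weights
    advantage = tour_length(instances, tours) - tour_length(instances, greedy_tours)
    return (advantage.detach() * log_probs).mean()
```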
Stay tuned for more updates!