A modular implementation of both the DQN and DDPG reinforcement learning algorithms, trained on the Trackmania (2020) videogame.
trackmania_dqn.mp4
The above video shows the agent after only a few hours of training on a simple map.
DQN has a discrete action space, so it uses keyboard input: it can steer left, right or straight, with nothing in between.
DDPG on the other hand uses analog input, and can control how sharply it turns.
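To illustrate the difference between the two action spaces, here is a minimal sketch. The names (`DISCRETE_ACTIONS`, `discrete_to_steer`, `continuous_to_steer`) and the `[-1, 1]` steering range are assumptions made for this example, not the project's actual code.

```python
# Hypothetical action mappings; names and ranges are illustrative only.
DISCRETE_ACTIONS = {0: "left", 1: "straight", 2: "right"}

# Keyboard-style steering: each discrete action is a full-lock value.
_STEER_VALUES = {"left": -1.0, "straight": 0.0, "right": 1.0}


def discrete_to_steer(action_index: int) -> float:
    """Map a DQN action index to one of three fixed steering values."""
    return _STEER_VALUES[DISCRETE_ACTIONS[action_index]]


def continuous_to_steer(raw_output: float) -> float:
    """Clip a DDPG actor output to an assumed analog range of [-1, 1]."""
    return max(-1.0, min(1.0, raw_output))
```

With this mapping, the DQN agent can only ever output -1.0, 0.0 or 1.0, while the DDPG agent can output any value in between, such as 0.3 for a gentle right turn.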
The algorithms are implemented modularly, so they can be used with any environment, including those from Gymnasium (formerly OpenAI Gym). They were battle-tested on CartPole, Pendulum, MountainCar (discrete + continuous) and Lunar Lander (discrete + continuous) before Trackmania.
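As a rough sketch of what "usable with any environment" can look like, the loop below only relies on the Gymnasium-style `reset()` and `step()` return values. `ToyEnv`, `RandomAgent` and `run_episode` are hypothetical stand-ins written for this example; the project's real agent interface may differ.

```python
import random


class ToyEnv:
    """Stand-in environment following the Gymnasium reset/step convention."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0.0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        terminated = False                 # this toy task has no terminal state
        truncated = self.t >= self.horizon  # it only stops on a time limit
        return float(self.t), reward, terminated, truncated, {}


class RandomAgent:
    """Hypothetical agent interface: act() picks an action, observe() stores data."""

    def __init__(self, n_actions, seed=0):
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        self.transitions = []

    def act(self, obs):
        return self.rng.randrange(self.n_actions)

    def observe(self, obs, action, reward, next_obs, terminated):
        self.transitions.append((obs, action, reward, next_obs, terminated))


def run_episode(env, agent):
    """Run one episode against any env exposing reset() and step()."""
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(obs)
        next_obs, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Store `terminated` only, so a learner can still bootstrap
        # from states that were cut off by a time limit.
        agent.observe(obs, action, reward, next_obs, terminated)
        total += reward
        obs = next_obs
    return total
```

Because `run_episode` never looks inside the environment, swapping `ToyEnv` for a Gymnasium environment or a Trackmania wrapper with the same two methods should not require changes to the loop.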
DQN is implemented with ε-decay, Polyak updates, Prioritised Experience Replay and Double-DQN.
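For readers unfamiliar with these components, here is a minimal plain-Python sketch of three of them: ε-decay, a Polyak (soft) target update, and the Double-DQN target. The function names, the exponential decay schedule and all default hyperparameters are assumptions chosen for illustration and are not taken from this repository. Prioritised Experience Replay is left out to keep the example short.

```python
def decayed_epsilon(step, eps_start=1.0, eps_end=0.05, decay=0.999):
    """Exponentially decay epsilon from eps_start toward eps_end."""
    return eps_end + (eps_start - eps_end) * decay ** step


def polyak_update(target_params, online_params, tau=0.005):
    """Soft update: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN target for a single transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain DQN.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]
```

In a real implementation the parameters and Q-values would be tensors rather than Python lists, but the arithmetic is the same.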