This project implements multiple AI agents, each based on a different reinforcement learning method, for the Gymnasium (formerly OpenAI Gym) Lunar Lander environment, a classic rocket-landing trajectory optimization problem.
Experimental exploration of gradient-based methods in RL. Features a derivation of a simple naive algorithm and a REINFORCE implementation, both in the CartPole environment. Bridges the supervised and reinforcement learning paradigms.
This notebook trains an agent to navigate a maze and reach a target location. It uses Gym-MiniGrid's fourRoom-v0 environment as the maze. The agent is trained with the vanilla policy gradient (REINFORCE) algorithm.
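All three projects above mention REINFORCE. As a rough illustration of its core computation, here is a minimal sketch in plain Python. It is not taken from any of these repositories: the function names `discounted_returns` and `reinforce_loss`, and the choice of a per-episode sum without a baseline, are assumptions made for the example.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Return-to-go G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each step reuses the return of the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs: List[float], returns: List[float]) -> float:
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t (hypothetical helper).

    Minimizing this with respect to the policy parameters follows the
    REINFORCE gradient estimate; here the log-probabilities are plain floats,
    so the function only shows the arithmetic, not the differentiation.
    """
    if len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must have the same length")
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    print(discounted_returns(episode_rewards, gamma=0.5))
```

In a real agent the log-probabilities would come from a differentiable policy network, and an autodiff framework would compute the gradient of this loss.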