Setup for a project/competition amongst students to train a winning Reinforcement Learning agent for the classic game Bomberman.
First run:
conda create -n bomberman python=3.8
conda activate bomberman
conda install scipy numpy matplotlib
pip install scikit-learn pygame tqdm
Then follow this tutorial or just execute the following steps:
- Install TensorFlow according to the documentation
- Install the CUDA Toolkit v11.2.1
- Install NVIDIA cuDNN v8.1.0 according to the documentation
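Once the steps above are done, it can help to confirm that TensorFlow actually sees the GPU. The following is a minimal sketch, not part of this repository: the helper name `visible_gpus` is hypothetical, and it relies only on `tf.config.list_physical_devices`, which is available in TensorFlow 2.x.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns None if TensorFlow is not installed in the active environment.
    (Hypothetical helper for illustration; not part of this repository.)
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif gpus:
        print("TensorFlow sees these GPUs:", gpus)
    else:
        print("TensorFlow found no GPU; check the CUDA Toolkit and cuDNN installation.")
```

An empty list usually points to a mismatch between the installed TensorFlow version and the CUDA Toolkit or cuDNN versions.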
Also, please create the following directories (from the repository root) if you want to start the training process:
mkdir -p agent_code/big_bertha_expert/buffers
mkdir -p agent_code/big_bertha_v1/buffers
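The same directories can also be created from Python, which behaves identically on Windows, Linux, and macOS. This is a small sketch under the assumption that it is run from the repository root; the function name `create_buffer_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Agent folders named in this README that need a "buffers" directory.
AGENTS = ("big_bertha_expert", "big_bertha_v1")


def create_buffer_dirs(repo_root="."):
    """Create agent_code/<agent>/buffers for each agent and return the paths.

    Existing directories are left untouched, so the call is safe to repeat.
    """
    created = []
    for agent in AGENTS:
        buffer_dir = Path(repo_root) / "agent_code" / agent / "buffers"
        buffer_dir.mkdir(parents=True, exist_ok=True)
        created.append(buffer_dir)
    return created


if __name__ == "__main__":
    for path in create_buffer_dirs():
        print("ready:", path)
```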
The pickle files for the pre-initialized replay buffers were too large to upload. To obtain them, please contact the owners of this repository.
Install the following helper tools for development purposes:
pip install isort
pip install tensorboard
conda install pydot
conda install graphviz
Open TensorBoard after training with:
cd agent_code/big_bertha_v1
tensorboard --logdir tensorboard_logs
The training process of the model is based on Deep Q-learning from Demonstrations. Therefore, two agents exist:
1. big_bertha_expert (similar to the rule-based agent): our network is pretrained with the experience replay buffer filled by the expert.
2. big_bertha_v1 (our DQN): the pretrained network from step 1 is trained further with standard Reinforcement Learning.
Step 1 is the Imitation Learning part and step 2 is standard Reinforcement Learning.
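To illustrate the two-phase idea, here is a minimal sketch of a replay buffer that is pre-filled with expert transitions and then extended with the agent's own experience. It is an assumption-laden illustration, not the buffer used in this repository: the class `DemoReplayBuffer` and its methods are hypothetical, sampling is uniform, and the additional loss terms and prioritized sampling of the published Deep Q-learning from Demonstrations method are omitted.

```python
import random
from collections import deque


class DemoReplayBuffer:
    """Hypothetical replay buffer for illustration only.

    Expert demonstrations are kept permanently; the agent's own
    transitions live in a bounded queue where the oldest are evicted first.
    """

    def __init__(self, capacity, seed=None):
        self.demo = []                     # expert transitions, never evicted
        self.own = deque(maxlen=capacity)  # agent transitions, FIFO eviction
        self.rng = random.Random(seed)

    def add_demo(self, transition):
        """Store a transition generated by the expert agent (phase 1)."""
        self.demo.append(transition)

    def add(self, transition):
        """Store a transition generated by the learning agent (phase 2)."""
        self.own.append(transition)

    def sample(self, batch_size):
        """Draw a uniform random batch from expert and agent transitions."""
        pool = self.demo + list(self.own)
        return self.rng.sample(pool, min(batch_size, len(pool)))


if __name__ == "__main__":
    buffer = DemoReplayBuffer(capacity=1000, seed=0)
    # Phase 1: only expert data is available, so pretraining batches are pure demonstrations.
    buffer.add_demo(("state", "expert_action", 1.0, "next_state", False))
    # Phase 2: the agent adds its own experience and batches mix both sources.
    buffer.add(("state", "agent_action", 0.0, "next_state", False))
    print(buffer.sample(2))
```

In this sketch, pretraining corresponds to sampling while only `demo` is populated, and the subsequent Reinforcement Learning phase corresponds to sampling after `add` has been called during self-play.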