For this project, the Reacher environment is used.
In this environment, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of your agent is to maintain its position at the target location for as many time steps as possible.
The observation space consists of 33 variables corresponding to the position, rotation, velocity, and angular velocity of the arm. Each action is a vector with four numbers, corresponding to the torques applicable to the two joints. Every entry in the action vector should be a number between -1 and 1.
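To illustrate the action format, here is a minimal sketch (the variable names are illustrative, not from the repository) showing how a raw actor output can be clipped into the valid range before being sent to the environment:

```python
import numpy as np

# A raw action from the actor network: 4 torque values
raw_action = np.array([1.7, -0.3, 0.05, -2.2])

# Every entry must lie in [-1, 1] before being sent to the environment
action = np.clip(raw_action, -1.0, 1.0)
# all entries are now within [-1, 1]
```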
For this project, I have used an environment that contains 20 identical agents, each with its own copy of the environment. This version is useful for algorithms that use multiple (non-interacting, parallel) copies of the same agent to distribute the task of gathering experience.
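With 20 parallel agents, the environment returns a batch of states and expects a batch of actions at every step. The following sketch shows only the array shapes involved (sizes taken from the description above; the random actions stand in for the actor's output):

```python
import numpy as np

num_agents, state_size, action_size = 20, 33, 4

# One state vector per agent, stacked into a (20, 33) batch
states = np.zeros((num_agents, state_size))

# One action vector per agent, clipped to the valid torque range
actions = np.clip(np.random.randn(num_agents, action_size), -1.0, 1.0)

# One reward per agent per step; each agent's score accumulates independently
rewards = np.full(num_agents, 0.1)
scores = np.zeros(num_agents)
scores += rewards
```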
- Install Anaconda on Windows using the Windows installer at https://www.anaconda.com/products/individual#windows

- Create (and activate) a new environment with Python 3.6.
  - Windows:
    ```
    conda create --name drl python=3.6
    activate drl
    ```
- Clone the repository, and navigate to the `setup/` folder. Then, install several dependencies.
  ```
  git clone https://github.com/KanikaGera/Navigation-Game-RL.git
  cd setup
  conda install pytorch=0.4.1 cuda90 -c pytorch
  pip install .
  ```
- Create an IPython kernel for the `drl` environment.
  ```
  python -m ipykernel install --user --name drl --display-name "drl"
  ```
- Before running code in a notebook, change the kernel to match the `drl` environment by using the drop-down `Kernel` menu.
- `Continous_Control.ipynb` uses the unityagents library to interact with the Unity environment and train the agent.
- `model.py` contains the structure of the RL model, coded in PyTorch.
- `ddpg_agent.py` contains the DDPG algorithm implementation.
- `saved/checkpoint_actor.pth` is the saved trained model, with weights for the actor network.
- `saved/checkpoint_critic.pth` is the saved trained model, with weights for the critic network.
- `saved/scores.list` contains the scores saved while training the model.
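As a hedged sketch (assuming `scores.list` was serialized with Python's `pickle`, which the repository does not state explicitly), the saved scores could be written and read back like this:

```python
import os
import pickle
import tempfile

# Simulated per-episode scores (in the repository these come from training)
scores = [1.2, 5.6, 12.3, 24.8, 31.0]

# Round-trip through pickle, the assumed serialization format
path = os.path.join(tempfile.gettempdir(), "scores.list")
with open(path, "wb") as f:
    pickle.dump(scores, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

assert loaded == scores
```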
- Install the dependencies by following the commands in Getting Started.
- Download the environment from one of the links below. You need only select the environment that matches your operating system:
  - Twenty (20) Agents:
    - Windows (32-bit): click here
    - Windows (64-bit): click here

  (For Windows users) Check out this link if you need help determining whether your computer is running a 32-bit or 64-bit version of the Windows operating system.
- Place the file in the GitHub repository, in the main folder, and unzip (or decompress) the file.
- Run Jupyter Notebook.
- Open `Continous_Control.ipynb`.
- Run the cells to train the model.
The Deep Deterministic Policy Gradient (DDPG) algorithm is used to train the agent. A report is attached in the main folder.
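DDPG maintains target copies of the actor and critic networks that slowly track the learned networks through soft updates. A minimal NumPy sketch of the soft-update rule follows (`tau = 1e-3` is a common default, not necessarily the value used in this repository):

```python
import numpy as np

def soft_update(target_params, local_params, tau=1e-3):
    """theta_target <- tau * theta_local + (1 - tau) * theta_target."""
    return [tau * l + (1.0 - tau) * t
            for t, l in zip(target_params, local_params)]

# Toy example with one "weight matrix" per network
target = [np.zeros((2, 2))]
local = [np.ones((2, 2))]
target = soft_update(target, local, tau=0.5)
# target is now halfway between the two parameter sets
```

Blending only a small fraction of the local weights into the target at each step keeps the learning targets nearly stationary, which stabilizes training.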