Yellow Banana Collector

Introduction

Trained Agent

In this task, the agent's goal is to collect as many yellow bananas as possible while avoiding blue bananas.
There are two variations of this task. In the Ray-Tracing Banana Collector, the state from the environment is a small (37-dimensional) vector of hand-crafted features: the agent's velocity, along with ray-based perception of objects around the agent's forward direction. In the Pixel-Based Banana Collector, the agent instead receives raw pixels (an RGB image) as the state. The task is described in more detail below.

The task is episodic, and in order to solve the environment, the agent must get an average score of +13 over 100 consecutive episodes.
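The solve criterion above can be tracked with a rolling window of episode scores. The following is a minimal sketch, not code from this repository; the helper name `make_solve_tracker` is hypothetical, and it assumes that the criterion only applies once a full window of 100 episodes has been recorded.

```python
from collections import deque


def make_solve_tracker(target=13.0, window=100):
    """Return a function that records episode scores and reports whether
    the average over the last `window` episodes has reached `target`."""
    recent = deque(maxlen=window)  # oldest score is dropped automatically

    def record(score):
        recent.append(score)
        # Assumption: fewer than `window` episodes never counts as solved.
        return len(recent) == window and sum(recent) / window >= target

    return record


record = make_solve_tracker()
solved = [record(14.0) for _ in range(100)]
print(solved[98], solved[99])  # False True
```

Calling `record` once at the end of each episode keeps the check independent of whichever learning algorithm produces the scores.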

Action Space

For both the Ray-based and Pixel-based tasks, the actions available to the agent are the same:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.

At each time step, the agent must provide an action to the environment.
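The action indices listed above can be given readable names in code. This is an illustrative sketch only: the `Action` enum and `random_action` helper are hypothetical names, and a uniformly random policy is shown purely as a baseline, not as the agent used in this project.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    """Discrete actions, using the indices listed above."""
    FORWARD = 0
    BACKWARD = 1
    TURN_LEFT = 2
    TURN_RIGHT = 3


def random_action(rng=random):
    """Pick one of the four actions uniformly at random."""
    return Action(rng.randrange(len(Action)))


action = random_action()
print(action.name, int(action))
```

Because `Action` is an `IntEnum`, its members can be passed wherever a plain integer action index is expected.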

Reward Structure

For both the Ray-based and Pixel-based tasks, the reward structure is the same as well. A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of the agent is to collect as many yellow bananas as possible while avoiding blue bananas.
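As a small worked example of the reward structure, the score of an episode is the sum of the per-banana rewards. The sketch below assumes a hypothetical representation in which each collected banana is recorded as the string `"yellow"` or `"blue"`; the environment itself reports numeric rewards rather than colors.

```python
# Reward per collected banana, as described above.
REWARDS = {"yellow": 1, "blue": -1}


def episode_score(collected):
    """Sum the rewards for a sequence of collected banana colors."""
    return sum(REWARDS[color] for color in collected)


print(episode_score(["yellow", "yellow", "blue", "yellow"]))  # 2
```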

State Space: Ray Tracing Banana Collector

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select among the four discrete actions listed under Action Space.

State Space: Pixel-Based Banana Collector

The state space is a 3D tensor of size (84, 84, 3) representing an RGB image of width=height=84 pixels.
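To give a sense of scale, the number of values in one pixel-based observation can be compared with the 37-dimensional ray-based state. This is simple arithmetic on the shapes stated in this document; the constant names are chosen for illustration.

```python
from math import prod

RAY_STATE_SHAPE = (37,)          # ray-based feature vector
PIXEL_STATE_SHAPE = (84, 84, 3)  # height, width, RGB channels

ray_size = prod(RAY_STATE_SHAPE)
pixel_size = prod(PIXEL_STATE_SHAPE)
print(ray_size, pixel_size)  # 37 21168
```

A single image therefore carries more than 500 times as many raw values as the ray-based state, which is one reason the pixel-based variation is usually treated as the harder learning problem.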

Preparing the Unity ML-Agent environments

If you're running on Linux and wish to download both environments, you can run the script setup_linux.sh. If you're running on a different OS, please follow the instructions below for each environment.

Ray-Tracing Banana Collector
  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    (For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

    (For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.

  2. Place the file in /tasks/banana_collector/environments/Banana_Linux, and unzip (or decompress) the file.

Pixel-based Banana Collector
  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

  2. Place the file in /tasks/banana_collector/environments/VisualBanana_Linux, and unzip (or decompress) the file.

(For AWS) If you'd like to train the agent on AWS, you must follow the instructions to set up X Server, and then download the environment for the Linux operating system above.
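After unpacking, a quick check can confirm that both environment directories are where the instructions above expect them. This is a hedged sketch, not part of the repository: it assumes the paths given above are relative to the repository root, and the `missing_environments` helper and the `"ray"`/`"pixel"` labels are hypothetical.

```python
from pathlib import Path

# Directories named in the setup steps, assumed relative to the repo root.
ENV_DIRS = {
    "ray": Path("tasks/banana_collector/environments/Banana_Linux"),
    "pixel": Path("tasks/banana_collector/environments/VisualBanana_Linux"),
}


def missing_environments(repo_root="."):
    """Return the labels of environment directories that do not exist."""
    root = Path(repo_root)
    return [name for name, rel in ENV_DIRS.items() if not (root / rel).is_dir()]


print(missing_environments())  # an empty list means both directories exist
```

Run it from the repository root, or pass the root explicitly, for example `missing_environments("/path/to/repo")`.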