diff --git a/Project.toml b/Project.toml
index 765ca8c..e30a002 100644
--- a/Project.toml
+++ b/Project.toml
@@ -1,7 +1,7 @@
 name = "GridWorlds"
 uuid = "e15a9946-cd7f-4d03-83e2-6c30bacb0043"
-authors = ["Sriram"]
-version = "0.2.0"
+authors = ["Siddharth Bhatia and contributors"]
+version = "0.3.0"
 
 [deps]
 Crayons = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f"
diff --git a/README.md b/README.md
index 21e03c3..222ea95 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # GridWorlds
 
-This package aims to provide grid world environments for reinforcement learning research in Julia. The focus of this package is on being **lightweight** and **efficient**. This package is inspired by [gym-minigrid](https://github.com/maximecb/gym-minigrid)
+This package aims to provide grid world environments for reinforcement learning research in Julia. The focus of this package is on being **lightweight** and **efficient**. This package is inspired by [gym-minigrid](https://github.com/maximecb/gym-minigrid).
 
-### Table of contents:
+## Table of contents:
 
 * [Getting Started](#getting-started)
 * [Design](#design)
@@ -22,7 +22,7 @@ This package aims to provide grid world environments for reinforcement learning
 1. [DynamicObstacles](#dynamicobstacles)
 1. [Sokoban](#sokoban)
 
-### Getting Started
+## Getting Started
 
 ```julia
 using GridWorlds
@@ -51,17 +51,17 @@ play(env, file_name = "example.gif", frame_rate = 5)
 
 ## Design
 
-#### Reinforcement Learing API for the environments
+### Reinforcement Learing API for the environments
 
 This package uses the API provided in [`ReinforcementLearningBase.jl`](https://github.com/JuliaReinforcementLearning/ReinforcementLearningBase.jl) so that it can seamlessly work with the rest of the [JuliaReinforcementLearning](https://github.com/JuliaReinforcementLearning) ecosystem.
 
-#### Representation of a grid-world
+### Representation of a grid-world
 
 A grid-world environment instance (often named `env`) contains within it an instance of `GridWorldBase` (often named `world`), which represents the grid-world. A `world` contains a 3-D boolean array (`BitArray{3}`) (often named `grid`) of size `(num_objects, height, width)`. Each tile of the `grid` can have multiple objects in it, indicated by a multi-hot encoding along the first dimension of the `grid`. The objects in the `world` do not contain any fields. Any related information for such objects that is needed is cached separately as fields of `env`.
 
-`env` contains fields called `world` and `agent` (along with some other fields). The point here is to note that an `agent` is stored separately as a field in `env` instead of an object contained in `world`. You __can__ create a custom field-less agent object and store it in the `world` if you want, but we usually store it as a field in the `env`, since an `agent` often has other information that need caching.
+`env` contains fields called `world` and `agent` (along with some other fields). The point here is to note that an `agent` is stored separately as a field in `env` instead of an object contained in `world`. You *can* create a custom field-less agent object and store it in the `world` if you want, but we usually store it as a field in the `env`, since an `agent` often has other information that need caching.
 
-#### Customizing an existing environment
+### Customizing an existing environment
 
 The behaviour of environments is easily customizable. Here are some of the things that one may typically want to customize:
 
@@ -69,11 +69,11 @@ The behaviour of environments is easily customizable. Here are some of the thing
 1. You can set the navigation style trait (for environments where it makes sense) by `GridWorlds.get_navigation_style(::Type{<:SomeEnv}) = GridWorlds.DIRECTED_NAVIGATION` or `GridWorlds.get_navigation_style(::Type{<:SomeEnv}) = GridWorlds.UNDIRECTED_NAVIGATION`.
 1. You can override specific `ReinforcementLearningBase` methods for customization. For example, the default implementation of the `ReinforcementLearingBase.reset!` method for an environment is appropriately randomized (like the goal position and agent start position in `EmptyRoom`). In case you need some custom behaviour, you can do so by simply overriding the `ReinforcementLearningBase.reset!` method, and reusing the rest of the behaviour (like what happens upon taking some action) as it is. You may also want to customize the `ReinforcementLearningBase.state` method to return the entire grid, or only the agent's view, or anything else you wish. See [RLBase API defaults](https://github.com/JuliaReinforcementLearning/GridWorlds.jl/blob/2e8975c85ce3534c2151121a0791be1ec53a8d31/src/abstract_grid_world.jl#L64) in `abstract_grid_world.jl` for examples.
 
-#### Rendering
+### Rendering
 
 `GridWorlds.jl` offers two modes of rendering:
 
-1. ##### Terminal Rendering
+1. #### Terminal Rendering
 
 While rendering a gridworld environment in the terminal, we display only one character per tile. If multiple objects are present in the same tile, we go by a priority implied by the order of the corresponding objects (lower the index, higher the priority) in the `objects` attribute (which is a tuple, and hence it is ordered) of the `GridWorldBase` instance.
 
@@ -83,56 +83,56 @@ The behaviour of environments is easily customizable. Here are some of the thing
 
 
 
-1. ##### Makie Rendering
+1. #### Makie Rendering
 
 If available, one can optionally use [`Makie.jl`](https://github.com/JuliaPlots/Makie.jl) in order to render an environment, play with it interactively, and save animations. See the examples given below in List of Environments.
 
 ## List of Environments
 
-1. #### EmptyRoom
+1. ### EmptyRoom
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### GridRooms
+1. ### GridRooms
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### SequentialRooms
+1. ### SequentialRooms
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### GoToDoor
+1. ### GoToDoor
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### DoorKey
+1. ### DoorKey
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### CollectGems
+1. ### CollectGems
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### DynamicObstacles
+1. ### DynamicObstacles
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
 
-1. #### Sokoban
+1. ### Sokoban
 
-DirectedNavigation | UndirectedNavigation
------------- | -------------
- |
+ DirectedNavigation | UndirectedNavigation
+ ------------ | -------------
+ |
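
Note on the README text touched by this diff: the Design section describes a multi-hot `BitArray{3}` grid of size `(num_objects, height, width)` and a terminal renderer that shows one character per tile, chosen by object order. The following is a minimal, self-contained sketch of that idea in plain Julia. It does not use `GridWorlds.jl`; the object types `Wall`, `Goal`, `Empty`, the `to_char` symbols, and the helpers `tile_char` and `render` are all hypothetical names introduced for illustration, not the package's API.

```julia
# Hypothetical field-less object types (assumptions, not GridWorlds.jl definitions).
struct Wall end
struct Goal end
struct Empty end

# Objects live in a tuple, so their order is fixed.
# Lower index means higher rendering priority.
const OBJECTS = (Wall(), Goal(), Empty())

# Hypothetical one-character symbols for terminal output.
to_char(::Wall) = '#'
to_char(::Goal) = 'G'
to_char(::Empty) = '.'

height, width = 3, 4

# 3-D boolean array of size (num_objects, height, width).
grid = falses(length(OBJECTS), height, width)

grid[3, :, :] .= true   # every tile holds Empty
grid[1, 1, :] .= true   # the top row also holds Wall (multi-hot)
grid[2, 3, 4] = true    # the bottom-right tile also holds Goal (multi-hot)

# Character of the highest-priority (lowest-index) object present on tile (i, j).
function tile_char(grid, objects, i, j)
    idx = findfirst(grid[:, i, j])
    return idx === nothing ? ' ' : to_char(objects[idx])
end

# Render the whole grid, one character per tile, one text line per grid row.
function render(grid, objects)
    rows = [join(tile_char(grid, objects, i, j) for j in axes(grid, 3)) for i in axes(grid, 2)]
    return join(rows, "\n")
end

println(render(grid, OBJECTS))
```

In this sketch, tiles in the top row contain both `Wall` and `Empty`, and only `Wall` is drawn because it comes first in the tuple. Reordering `OBJECTS` (and the matching first-dimension indices) would change which object wins on shared tiles.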