Merge pull request #9 from JuliaReinforcementLearning/refactor
Refactor to allow containers to be directly used as spaces
findmyway authored Aug 5, 2022
2 parents ddfd7ab + 9ee91b5 commit 00e365d
Showing 9 changed files with 457 additions and 156 deletions.
115 changes: 91 additions & 24 deletions README.md
@@ -5,70 +5,137 @@
[![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle)
[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/C/CommonRLSpaces.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)

## Introduction

A space is simply a set of objects. In a reinforcement learning context, spaces define the sets of possible states, actions, and observations.

In Julia, spaces can be represented by a variety of objects. For instance, a small discrete action set might be represented with `["up", "left", "down", "right"]`, or an interval of real numbers might be represented with an object from the [`IntervalSets`](https://github.com/JuliaMath/IntervalSets.jl) package. In general, the space defined by any Julia object is the set of objects `x` for which `x in space` returns `true`.

In addition to establishing the definition above, this package provides three useful tools:

1. Traits to communicate about the properties of spaces, e.g. whether they are continuous or discrete, how many subspaces they have, and how to interact with them.
2. Functions such as `product` for constructing more complex spaces.
3. Constructors for spaces whose elements are arrays, such as `ArraySpace` and `Box`.

## Concepts and Interface

### Interface for all spaces

Since a space is simply a set of objects, a wide variety of common Julia types, including `Vector`, `Set`, `Tuple`, and `Dict`<sup>1</sup>, can represent a space.
Because of this inclusive definition, there is a very minimal interface that all spaces are expected to implement. Specifically, it consists of:
- `in(x, space)`, which tests whether `x` is a member of the set `space` (this can also be called with the `x in space` syntax).
- `rand(space)`, which returns a valid member of the set<sup>2</sup>.
- `eltype(space)`, which returns the type of the elements in the space.

In addition, the `SpaceStyle` trait is always defined. Calling `SpaceStyle(space)` will return either a `FiniteSpaceStyle`, `ContinuousSpaceStyle`, `HybridSpaceStyle`, or an `UnknownSpaceStyle` object.
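
As a rough illustration of the minimal interface, the sketch below uses only Base containers and the `Random` standard library, so it does not depend on this package. The variable names and the seed are arbitrary choices for the example, and `SpaceStyle` is deliberately left out because it is defined by the package rather than by Base.

```julia
using Random

# Plain Base containers used directly as spaces (illustrative names).
actions = ("up", "left", "down", "right")   # Tuple of strings
states  = Set(1:5)                           # Set of integers
table   = Dict(:a => 1, :b => 2)             # Dict: its elements are key-value Pairs

rng = MersenneTwister(0)   # seeded only so the sketch is repeatable

a = rand(rng, actions)     # some element of `actions`
s = rand(rng, states)      # some element of `states`
p = rand(rng, table)       # a Pair such as :a => 1

# Membership tests and element types, as required by the interface.
a_ok = a in actions
s_ok = s in states
p_ok = p in table
types = (eltype(actions), eltype(states), eltype(table))
```

On a 64-bit machine `types` should come out as `(String, Int64, Pair{Symbol, Int64})`, which matches the footnote about `Dict` elements being `Pair`s.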

### Finite discrete spaces

Spaces with a finite number of elements have `FiniteSpaceStyle`. These spaces are guaranteed to be iterable, implementing Julia's [iteration interface](https://docs.julialang.org/en/v1/manual/interfaces/). In particular, `collect(space)` returns all elements in an array.
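
One way iterability might be used is to enumerate every action and pick the best one. The following is a minimal sketch with a plain tuple as the space; the score table `q` and its numbers are made up for the example and are not part of the package.

```julia
# A finite space given as a plain tuple, plus hypothetical action values.
space = (:up, :left, :down, :right)
q = Dict(:up => 0.1, :left => 0.7, :down => -0.2, :right => 0.3)  # made-up numbers

elements = collect(space)             # Vector{Symbol} holding every element
scores = [q[a] for a in elements]     # one score per element, in the same order
best = elements[argmax(scores)]       # element with the highest score
```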

### Continuous spaces

Continuous spaces represent sets that have an uncountable number of elements; they have a `SpaceStyle` of type `ContinuousSpaceStyle`. CommonRLSpaces does not adopt a rigorous mathematical definition of a continuous set, but, roughly, elements in the interior of a continuous space have other elements very close to them.

Continuous spaces have some additional interface functions:

- `bounds(space)` returns lower and upper bounds in a tuple. For example, if `space` is a unit circle, `bounds(space)` will return `([-1.0, -1.0], [1.0, 1.0])`. This allows agents to choose policies that appropriately cover the space, e.g. a normal distribution with a mean of `mean(bounds(space))` and a standard deviation of half the distance between the bounds.
- `clamp(x, space)` returns an element of `space` that is near `x`. For example, if `space` is a unit circle, `clamp([2.0, 0.0], space)` might return `[1.0, 0.0]`. This gives an agent a convenient way to find a valid action when it samples actions from a distribution that doesn't match the space exactly (e.g. a normal distribution).
- `clamp!(x, space)`, similar to `clamp`, but clamps `x` in place.
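
The sampling pattern described in this list can be sketched with plain vectors. Here `lower` and `upper` are hypothetical stand-ins for the result of `bounds(space)` on a two-dimensional box-shaped space, and element-wise `clamp.` from Base stands in for `clamp(x, space)`; for a non-rectangular space such as a circle the real method would have to do something different.

```julia
using Random

# Hypothetical stand-ins for `bounds(space)` of a 2-D box-shaped space.
lower, upper = [-1.0, -1.0], [1.0, 1.0]

center = (lower .+ upper) ./ 2    # midpoint of the bounds
spread = (upper .- lower) ./ 2    # half the distance between the bounds

rng = MersenneTwister(1)          # seeded only so the sketch is repeatable
proposal = center .+ spread .* randn(rng, 2)   # Gaussian draw, may fall outside

# Element-wise clamping pulls the proposal back inside the bounds.
action = clamp.(proposal, lower, upper)

# The example point from the list above, clamped the same way.
clamped_example = clamp.([2.0, 0.0], lower, upper)
```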

### Hybrid spaces

The interface for hybrid continuous-discrete spaces is currently planned, but not yet defined. If the space style is not `FiniteSpaceStyle` or `ContinuousSpaceStyle`, it is `UnknownSpaceStyle`.

### Spaces of arrays

[need to figure this out, but I think `elsize(space)` should return the size of the arrays in the space]

### Cartesian products of spaces

The Cartesian product of two spaces `a` and `b` can be constructed with `c = product(a, b)`.

The exact form of the resulting space is unspecified and should be considered an implementation detail. The only guarantees are (1) that there will be one unique element of `c` for every combination of one object from `a` and one object from `b` and (2) that the resulting space conforms to the interface above.

The `TupleSpaceProduct` constructor provides a specialized Cartesian product where each element is a tuple, i.e. `TupleSpaceProduct(a, b)` has elements of type `Tuple{eltype(a), eltype(b)}`.
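
For finite spaces, the first guarantee can be illustrated with `Iterators.product` from Base. This is not the package's `product` function and says nothing about the type it returns; it is only a sketch of "one unique element per combination" where each element happens to be a tuple.

```julia
# Two small finite spaces (illustrative values).
a = (:cat, :dog)
b = 1:3

# Base's lazy Cartesian product, collected into a flat vector of tuples.
combos = vec(collect(Iterators.product(a, b)))

n_combos = length(combos)         # one element per (a, b) combination
has_pair = (:dog, 2) in combos    # membership test on the product
```

Every element of `combos` has type `Tuple{Symbol, Int64}` on a 64-bit machine, i.e. `Tuple{eltype(a), eltype(b)}`.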

---

<sup>1</sup>Note: the elements of a space represented by a `Dict` are key-value `Pair`s.
<sup>2</sup>[TODO: should we make any guarantees about whether `rand(space)` is drawn from a uniform distribution?]

## Usage

### Construction

|Category|Style|Example|
|:---|:----|:-----|
-|Enumerable discrete space| `DiscreteSpaceStyle{()}()` | `Space((:cat, :dog))`, `Space(0:1)`, `Space(1:2)`, `Space(Bool)`|
-|Multi-dimensional discrete space| `DiscreteSpaceStyle{(3,4)}()` | `Space((:cat, :dog), 3, 4)`, `Space(0:1, 3, 4)`, `Space(1:2, 3, 4)`, `Space(Bool, 3, 4)`|
-|Multi-dimensional variable discrete space| `DiscreteSpaceStyle{(2,)}()` | `Space(SVector((:cat, :dog), (:litchi, :longan, :mango))`, `Space([-1:1, (false, true)])`|
-|Continuous space| `ContinuousSpaceStyle{()}()` | `Space(-1.2..3.3)`, `Space(Float32)`|
-|Multi-dimensional continuous space| `ContinuousSpaceStyle{(3,4)}()` | `Space(-1.2..3.3, 3, 4)`, `Space(Float32, 3, 4)`|
+|Enumerable discrete space| `FiniteSpaceStyle{()}()` | `(:cat, :dog)`, `0:1`, `["a","b","c"]` |
+|One dimensional continuous space| `ContinuousSpaceStyle{()}()` | `-1.2..3.3`, `Interval(1.0, 2.0)` |
+|Multi-dimensional discrete space| `FiniteSpaceStyle{(3,4)}()` | `ArraySpace((:cat, :dog), 3, 4)`, `ArraySpace(0:1, 3, 4)`, `ArraySpace(1:2, 3, 4)`, `ArraySpace(Bool, 3, 4)`|
+|Multi-dimensional variable discrete space| `FiniteSpaceStyle{(2,)}()` | `product((:cat, :dog), (:litchi, :longan, :mango))`, `product(-1:1, (false, true))`|
+|Multi-dimensional continuous space| `ContinuousSpaceStyle{(2,)}()` or `ContinuousSpaceStyle{(3,4)}()` | `Box([-1.0, -2.0], [2.0, 4.0])`, `product(-1.2..3.3, -4.6..5.0)`, `ArraySpace(-1.2..3.3, 3, 4)`, `ArraySpace(Float32, 3, 4)` |
+|Multi-dimensional hybrid space| `HybridSpaceStyle{(2,),()}()` | `product(-1.2..3.3, -4.6..5.0, [:cat, :dog])`, `product(Box([-1.0, -2.0], [2.0, 4.0]), [1,2,3])`|

### API

```julia
julia> using CommonRLSpaces

-julia> s = Space((:litchi, :longan, :mango))
-Space{Tuple{Symbol, Symbol, Symbol}}((:litchi, :longan, :mango))
+julia> s = (:litchi, :longan, :mango)

julia> rand(s)
:litchi

julia> rand(s) in s
true

-julia> size(s)
-()
+julia> length(s)
+3
```

```julia
-julia> s = Space(UInt8, 2,3)
-Space{Matrix{UnitRange{UInt8}}}(UnitRange{UInt8}[0x00:0xff 0x00:0xff 0x00:0xff; 0x00:0xff 0x00:0xff 0x00:0xff])
+julia> s = ArraySpace(1:5, 2,3)
+CommonRLSpaces.RepeatedSpace{UnitRange{Int64}, Tuple{Int64, Int64}}(1:5, (2, 3))

julia> rand(s)
-2×3 Matrix{UInt8}:
-0x7b 0x38 0xf3
-0x6a 0xe1 0x28
+2×3 Matrix{Int64}:
+4 1 1
+3 2 2

julia> rand(s) in s
true

julia> SpaceStyle(s)
-DiscreteSpaceStyle{(2, 3)}()
+FiniteSpaceStyle()

-julia> size(s)
+julia> elsize(s)
(2, 3)
(2, 3)
```

```julia
-julia> s = Space(SVector(-1..1, 0..1))
-Space{SVector{2, ClosedInterval{Int64}}}(ClosedInterval{Int64}[-1..1, 0..1])
+julia> s = product(-1..1, 0..1)
+Box{StaticArraysCore.SVector{2, Float64}}([-1.0, 0.0], [1.0, 1.0])

julia> rand(s)
-2-element SVector{2, Float64} with indices SOneTo(2):
-0.5563101538643473
-0.9227368869418011
+2-element StaticArraysCore.SVector{2, Float64} with indices SOneTo(2):
+0.03049072910834738
+0.6295234114874269

julia> rand(s) in s
true

julia> SpaceStyle(s)
-ContinuousSpaceStyle{(2,)}()
+ContinuousSpaceStyle()

-julia> size(s)
+julia> elsize(s)
(2,)
+
+julia> bounds(s)
+([-1.0, 0.0], [1.0, 1.0])
+
+julia> clamp([5, 5], s)
+2-element StaticArraysCore.SizedVector{2, Float64, Vector{Float64}} with indices SOneTo(2):
+1.0
+1.0
```
29 changes: 26 additions & 3 deletions src/CommonRLSpaces.jl
@@ -2,11 +2,34 @@ module CommonRLSpaces

using Reexport

-@reexport using FillArrays
@reexport using IntervalSets
-@reexport using StaticArrays
-@reexport import Base: OneTo

+using StaticArrays
+using FillArrays
+using Random
+import Base: clamp

export
SpaceStyle,
AbstractSpaceStyle,
FiniteSpaceStyle,
ContinuousSpaceStyle,
UnknownSpaceStyle,
bounds,
elsize

include("basic.jl")

export
Box,
ArraySpace

include("array.jl")

export
product,
TupleProduct

include("product.jl")

end
81 changes: 81 additions & 0 deletions src/array.jl
@@ -0,0 +1,81 @@
abstract type AbstractArraySpace end
# Maybe AbstractArraySpace should have an eltype parameter so that you could call convert(AbstractArraySpace{Float32}, space)

"""
    Box(lower, upper)

A Box represents a space of real-valued arrays bounded element-wise above by `upper` and below by `lower`, e.g. `Box([-1, -2], [3, 4])` represents the two-dimensional vector space that is the Cartesian product of the two closed sets: ``[-1, 3] \\times [-2, 4]``.

The elements of a Box are always `AbstractArray`s with `AbstractFloat` elements. `Box`es always have `ContinuousSpaceStyle`, and products of `Box`es with `Box`es or `ClosedInterval`s are `Box`es when the dimensions are compatible.
"""
struct Box{A<:AbstractArray{<:AbstractFloat}} <: AbstractArraySpace
    lower::A
    upper::A

    Box{A}(lower, upper) where {A<:AbstractArray} = new(lower, upper)
end

function Box(lower, upper; convert_to_static::Bool=false)
    @assert size(lower) == size(upper)
    sz = size(lower)
    continuous_lower = convert(AbstractArray{float(eltype(lower))}, lower)
    continuous_upper = convert(AbstractArray{float(eltype(upper))}, upper)
    if convert_to_static
        final_lower = SArray{Tuple{sz...}}(continuous_lower)
        final_upper = SArray{Tuple{sz...}}(continuous_upper)
    else
        final_lower, final_upper = promote(continuous_lower, continuous_upper)
    end
    return Box{typeof(final_lower)}(final_lower, final_upper)
end

# By default, convert builtin arrays to static
Box(lower::Array, upper::Array) = Box(lower, upper; convert_to_static=true)

SpaceStyle(::Box) = ContinuousSpaceStyle()

function Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:Box})
    box = sp[]
    return box.lower + rand_similar(rng, box.lower) .* (box.upper-box.lower)
end

rand_similar(rng::AbstractRNG, a::StaticArray) = rand(rng, typeof(a))
rand_similar(rng::AbstractRNG, a::AbstractArray) = rand(rng, eltype(a), size(a)...)

Base.in(x::AbstractArray, b::Box) = all(b.lower .<= x .<= b.upper)

Base.eltype(::Box{A}) where A = A
elsize(b::Box) = size(b.lower)

bounds(b::Box) = (b.lower, b.upper)
Base.clamp(x::AbstractArray, b::Box) = clamp.(x, b.lower, b.upper)

Base.convert(t::Type{<:Box}, i::ClosedInterval) = t(SA[minimum(i)], SA[maximum(i)])

struct RepeatedSpace{B, S<:Tuple} <: AbstractArraySpace
    base_space::B
    elsize::S
end

"""
    ArraySpace(base_space, size...)

Create a space of Arrays with shape `size`, where each element of the array is drawn from `base_space`.
"""
ArraySpace(base_space, size...) = RepeatedSpace(base_space, size)

SpaceStyle(s::RepeatedSpace) = SpaceStyle(s.base_space)

Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RepeatedSpace}) = rand(rng, sp[].base_space, sp[].elsize...)

Base.in(x::AbstractArray, s::RepeatedSpace) = all(entry in s.base_space for entry in x)
Base.eltype(s::RepeatedSpace) = AbstractArray{eltype(s.base_space), length(s.elsize)}
Base.eltype(s::RepeatedSpace{<:AbstractInterval}) = AbstractArray{Random.gentype(s.base_space), length(s.elsize)}
elsize(s::RepeatedSpace) = s.elsize

function bounds(s::RepeatedSpace)
    bs = bounds(s.base_space)
    return (Fill(first(bs), s.elsize...), Fill(last(bs), s.elsize...))
end

Base.clamp(x::AbstractArray, s::RepeatedSpace) = map(entry -> clamp(entry, s.base_space), x)