Refactor to allow containers to be directly used as spaces #9

Merged
merged 14 commits into from
Aug 5, 2022
79 changes: 62 additions & 17 deletions README.md
@@ -5,39 +5,85 @@
[![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle)
[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/C/CommonRLSpaces.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)

## Introduction

A space is simply a set of objects. In a reinforcement learning context, spaces define the sets of possible states, actions, and observations.

In Julia, spaces can be represented by a variety of objects. For instance, a small discrete action set might be represented with `["up", "left", "down", "right"]`, or an interval of real numbers might be represented with an object from the `IntervalSets` package. In general, the space defined by any Julia object is the set of objects `x` for which `x in space` returns `true`.

In addition to establishing the definition above, this package provides three useful tools:
1. Traits to communicate about the properties of spaces, e.g. whether they are continuous or discrete, how many dimensions they have, and how to interact with them.
2. Functions such as `product` for constructing more complex spaces.
3. Constructors for spaces whose elements are arrays, such as `ArraySpace` and `Box`.

## Concepts and Interface

### Interface for all spaces

Since a space is simply a set of objects, a wide variety of common Julia types including `Vector`, `Set`, `Tuple`, and `Dict`<sup>1</sup> can represent a space.
Because of this inclusive definition, there is a very minimal interface that all spaces are expected to implement. Specifically, it consists of
- `in(x, space)`, which tests whether `x` is a member of the set `space` (this can also be called with the `x in space` syntax).
- `rand(space)`, which returns a valid member of the set<sup>2</sup>.
- `eltype(space)`, which returns the type of the elements in the space.
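
The three functions above can be exercised on plain Base collections without any package-specific wrapper. The following is a minimal sketch that uses only Base and the `Random` standard library; the variable names and the fixed seed are illustrative assumptions.

```julia
using Random

# Ordinary Julia collections acting as spaces (no wrapper type assumed).
spaces = (
    ["up", "left", "down", "right"],  # Vector
    Set([1, 2, 3]),                   # Set
    (:cat, :dog),                     # Tuple
    Dict(:a => 1, :b => 2),           # Dict: its elements are key-value Pairs
)

rng = MersenneTwister(1)  # seed chosen arbitrarily for reproducibility
for space in spaces
    x = rand(rng, space)         # draw a member of the space
    @assert x in space           # membership test
    @assert x isa eltype(space)  # the draw matches the declared element type
end
```

Note that no claim is made here about the distribution `rand` samples from; the sketch only checks that each draw is a valid member.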

In addition, the `SpaceStyle` trait is always defined. Calling `SpaceStyle(space)` will return either a `FiniteSpaceStyle`, `ContinuousSpaceStyle`, `HybridSpaceStyle`, or an `UnknownSpaceStyle` object.
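
A trait like this is typically consumed by dispatching on the returned style object rather than on the concrete space type. The sketch below illustrates that pattern with local stand-in definitions that mirror two of the style names; the `describe` helper is hypothetical and not part of the package.

```julia
# Local stand-ins mirroring the trait names above (assumed for illustration only).
abstract type AbstractSpaceStyle end
struct FiniteSpaceStyle <: AbstractSpaceStyle end
struct UnknownSpaceStyle <: AbstractSpaceStyle end

SpaceStyle(::Any) = UnknownSpaceStyle()          # fallback for arbitrary objects
SpaceStyle(::Tuple) = FiniteSpaceStyle()
SpaceStyle(::AbstractVector) = FiniteSpaceStyle()

# Hypothetical helper: dispatch on the trait instead of the space's type.
describe(space) = describe(SpaceStyle(space), space)
describe(::FiniteSpaceStyle, space) = "finite space with $(length(space)) elements"
describe(::AbstractSpaceStyle, space) = "space of unknown structure"

describe((:cat, :dog))  # takes the FiniteSpaceStyle method
describe("abc")         # falls back to the generic method
```

With this pattern, supporting a new space type only requires defining its `SpaceStyle` method; code written against the trait does not need to change.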

### Finite discrete spaces

Spaces with a finite number of elements have `FiniteSpaceStyle`. These spaces are guaranteed to be iterable, implementing Julia's [iteration interface](https://docs.julialang.org/en/v1/manual/interfaces/). In particular, `collect(space)` will return all elements in an array.
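
Because finite spaces can be enumerated, an agent can search them exhaustively. The following sketch uses only Base; the `scores` table is a made-up set of action values used purely for illustration.

```julia
space = (:cat, :dog)

elements = collect(space)  # all members of the space, as a Vector
@assert all(e in space for e in elements)

# Hypothetical action values; pick the highest-scoring element by enumeration.
scores = Dict(:cat => 0.2, :dog => 0.9)
best = elements[argmax([scores[a] for a in elements])]
```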

### Continuous spaces

Continuous spaces represent sets that have an uncountable number of elements; they have a `SpaceStyle` of type `ContinuousSpaceStyle`. CommonRLSpaces does not adopt a rigorous mathematical definition of a continuous set, but, roughly, elements in the interior of a continuous space have other elements very close to them.
Continuous spaces have two additional interface functions:
- `bounds(space)` returns upper and lower bounds in a tuple. For example, if `space` is a unit circle, `bounds(space)` will return `([-1.0, -1.0], [1.0, 1.0])`. This allows agents to choose policies that appropriately cover the space, e.g. a normal distribution with a mean of `mean(bounds(space))` and a standard deviation of half the distance between the bounds.
- `clamp(x, space)` returns an element of `space` that is near `x`. For example, if `space` is a unit circle, `clamp([2.0, 0.0], space)` might return `[1.0, 0.0]`. This gives an agent a convenient way to find a valid action when it samples actions from a distribution that doesn't match the space exactly (e.g. a normal distribution).
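
To make the two functions concrete, here is a sketch of how a custom continuous space might implement them. The `UnitDisk` type is hypothetical, the local `bounds` function stands in for the package's function of the same name, and radial projection is just one possible choice for `clamp`.

```julia
using LinearAlgebra: norm

# Hypothetical space type used only for illustration; not part of the package.
struct UnitDisk end

Base.in(x::AbstractVector, ::UnitDisk) = length(x) == 2 && norm(x) <= 1

# Local stand-in for the package's `bounds`: an axis-aligned bounding box.
bounds(::UnitDisk) = ([-1.0, -1.0], [1.0, 1.0])

# Points outside the disk are projected radially onto its boundary.
Base.clamp(x::AbstractVector, ::UnitDisk) = norm(x) <= 1 ? x : x / norm(x)

space = UnitDisk()
lower, upper = bounds(space)
center = (lower .+ upper) ./ 2  # midpoint of the bounding box
spread = (upper .- lower) ./ 2  # half the distance between the bounds

proposal = center .+ spread .* [2.0, 0.0]  # a proposal that lands outside the space
action = clamp(proposal, space)            # nearest valid action under this projection
@assert action in space
```

The bounding box is only an outer approximation of the disk, which is why `clamp` is still needed after sampling within the bounds.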

### Hybrid spaces

[need to figure this out]
Review comment (Member):

For hybrid spaces, I think we can still use `product(s1, s2)` to create a hybrid space.

To make each subspace easy to access, `s1` and `s2` could be named spaces, and the hybrid space would need to support `getindex`.


### Spaces of arrays

[need to figure this out, but I think `elsize(space)` should return the size of the arrays in the space]

---

<sup>1</sup>Note: the elements of a space represented by a `Dict` are key-value `Pair`s.
<sup>2</sup>[TODO: should we make any guarantees about whether `rand(space)` is drawn from a uniform distribution?]

## Usage

### Construction

|Category|Style|Example|
|:---|:----|:-----|
|Enumerable discrete space| `DiscreteSpaceStyle{()}()` | `Space((:cat, :dog))`, `Space(0:1)`, `Space(1:2)`, `Space(Bool)`|
|Multi-dimensional discrete space| `DiscreteSpaceStyle{(3,4)}()` | `Space((:cat, :dog), 3, 4)`, `Space(0:1, 3, 4)`, `Space(1:2, 3, 4)`, `Space(Bool, 3, 4)`|
|Multi-dimensional variable discrete space| `DiscreteSpaceStyle{(2,)}()` | `Space(SVector((:cat, :dog), (:litchi, :longan, :mango)))`, `Space([-1:1, (false, true)])`|
|Continuous space| `ContinuousSpaceStyle{()}()` | `Space(-1.2..3.3)`, `Space(Float32)`|
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(3,4)}()` | `Space(-1.2..3.3, 3, 4)`, `Space(Float32, 3, 4)`|
|Enumerable discrete space| `FiniteSpaceStyle{()}()` | `(:cat, :dog)`, `0:1`, `["a","b","c"]` |
|One dimensional continuous space| `ContinuousSpaceStyle{()}()` | `-1.2..3.3`, `Interval(1.0, 2.0)` |
|Multi-dimensional discrete space| `FiniteSpaceStyle{(3,4)}()` | `ArraySpace((:cat, :dog), 3, 4)`, `ArraySpace(0:1, 3, 4)`, `ArraySpace(1:2, 3, 4)`, `ArraySpace(Bool, 3, 4)`|
|Multi-dimensional variable discrete space| `FiniteSpaceStyle{(2,)}()` | `product((:cat, :dog), (:litchi, :longan, :mango))`, `product(-1:1, (false, true))`|
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(2,)}()` or `ContinuousSpaceStyle{(3,4)}()` | `Box([-1.0, -2.0], [2.0, 4.0])`, `product(-1.2..3.3, -4.6..5.0)`, `ArraySpace(-1.2..3.3, 3, 4)`, `ArraySpace(Float32, 3, 4)` |
|Multi-dimensional hybrid space| `HybridSpaceStyle{(2,),()}()` | `product(-1.2..3.3, -4.6..5.0, [:cat, :dog])`, `product(Box([-1.0, -2.0], [2.0, 4.0]), [1,2,3])`|

### API

```julia
julia> using CommonRLSpaces

julia> s = Space((:litchi, :longan, :mango))
Space{Tuple{Symbol, Symbol, Symbol}}((:litchi, :longan, :mango))
julia> s = (:litchi, :longan, :mango)

julia> rand(s)
:litchi

julia> rand(s) in s
true

julia> size(s)
()
julia> length(s)
3
```

```julia
julia> s = Space(UInt8, 2,3)
Space{Matrix{UnitRange{UInt8}}}(UnitRange{UInt8}[0x00:0xff 0x00:0xff 0x00:0xff; 0x00:0xff 0x00:0xff 0x00:0xff])
julia> s = ArraySpace(UInt8, 2,3)

julia> rand(s)
2×3 Matrix{UInt8}:
@@ -48,15 +94,14 @@ julia> rand(s) in s
true

julia> SpaceStyle(s)
DiscreteSpaceStyle{(2, 3)}()
FiniteSpaceStyle{(2, 3)}()

julia> size(s)
julia> elsize(s)
# Review comment (Member): we may also have eltype implemented
(2, 3)
```

```julia
julia> s = Space(SVector(-1..1, 0..1))
Space{SVector{2, ClosedInterval{Int64}}}(ClosedInterval{Int64}[-1..1, 0..1])
julia> s = product(-1..1, 0..1)

julia> rand(s)
2-element SVector{2, Float64} with indices SOneTo(2):
@@ -69,6 +114,6 @@ true
julia> SpaceStyle(s)
ContinuousSpaceStyle{(2,)}()

julia> size(s)
julia> elsize(s)
(2,)
```
26 changes: 23 additions & 3 deletions src/CommonRLSpaces.jl
@@ -2,11 +2,31 @@ module CommonRLSpaces

using Reexport

@reexport using FillArrays
@reexport using IntervalSets
@reexport using StaticArrays
@reexport import Base: OneTo

using StaticArrays
using FillArrays

export
SpaceStyle,
AbstractSpaceStyle,
FiniteSpaceStyle,
ContinuousSpaceStyle,
UnknownSpaceStyle,
bounds,
elsize

include("basic.jl")

export
Box,
ArraySpace

include("array.jl")

export
product

include("product.jl")

end
69 changes: 69 additions & 0 deletions src/array.jl
@@ -0,0 +1,69 @@
abstract type AbstractArraySpace end
# Maybe AbstractArraySpace should have an eltype parameter so that you could call convert(AbstractArraySpace{Float32}, space)

struct Box{A<:AbstractArray} <: AbstractArraySpace
lower::A
upper::A

Box{A}(lower, upper) where {A<:AbstractArray} = new(lower, upper)
end

function Box(lower, upper; convert_to_static::Bool=false)
@assert size(lower) == size(upper)
sz = size(lower)
continuous_lower = convert(AbstractArray{float(eltype(lower))}, lower)
continuous_upper = convert(AbstractArray{float(eltype(upper))}, upper)
if convert_to_static
final_lower = SArray{Tuple{sz...}}(continuous_lower)
final_upper = SArray{Tuple{sz...}}(continuous_upper)
else
final_lower, final_upper = promote(continuous_lower, continuous_upper)
end
return Box{typeof(final_lower)}(final_lower, final_upper)
end

# By default, convert builtin arrays to static
Box(lower::Array, upper::Array) = Box(lower, upper; convert_to_static=true)

SpaceStyle(::Box) = ContinuousSpaceStyle()

function Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:Box})
box = sp[]
return box.lower + rand_similar(rng, box.lower) .* (box.upper-box.lower)
end

rand_similar(rng::AbstractRNG, a::StaticArray) = rand(rng, typeof(a))
rand_similar(rng::AbstractRNG, a::AbstractArray) = rand(rng, eltype(a), size(a)...)

Base.in(x::AbstractArray, b::Box) = all(b.lower .<= x .<= b.upper)

Base.eltype(::Box{A}) where A = A
elsize(b::Box) = size(b.lower)

bounds(b::Box) = (b.lower, b.upper)
Base.clamp(x::AbstractArray, b::Box) = clamp.(x, b.lower, b.upper)

Base.convert(t::Type{<:Box}, i::ClosedInterval) = t(SA[minimum(i)], SA[maximum(i)])

struct RepeatedSpace{B, S<:Tuple} <: AbstractArraySpace
base_space::B
elsize::S
end

ArraySpace(base_space, size...) = RepeatedSpace(base_space, size)

SpaceStyle(s::RepeatedSpace) = SpaceStyle(s.base_space)

Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RepeatedSpace}) = rand(rng, sp[].base_space, sp[].elsize...)

Base.in(x::AbstractArray, s::RepeatedSpace) = all(entry in s.base_space for entry in x)
Base.eltype(s::RepeatedSpace) = AbstractArray{eltype(s.base_space), length(s.elsize)}
Base.eltype(s::RepeatedSpace{<:AbstractInterval}) = AbstractArray{Random.gentype(s.base_space), length(s.elsize)}
elsize(s::RepeatedSpace) = s.elsize

function bounds(s::RepeatedSpace)
bs = bounds(s.base_space)
return (Fill(first(bs), s.elsize...), Fill(last(bs), s.elsize...))
end

Base.clamp(x::AbstractArray, s::RepeatedSpace) = map(entry -> clamp(entry, s.base_space), x)
98 changes: 16 additions & 82 deletions src/basic.jl
@@ -1,97 +1,31 @@
export Space, SpaceStyle, DiscreteSpaceStyle, ContinuousSpaceStyle, TupleSpace, NamedSpace, DictSpace

using Random

struct Space{T}
s::T
end

Space(s::Type{T}) where {T<:Integer} = Space(typemin(T):typemax(T))
Space(s::Type{T}) where {T<:AbstractFloat} = Space(typemin(T) .. typemax(T))

Space(x, dims::Int...) = Space(Fill(x, dims))
Space(x::Type{T}, dim::Int, extra_dims::Int...) where {T<:Integer} = Space(Fill(typemin(x):typemax(T), dim, extra_dims...))
Space(x::Type{T}, dim::Int, extra_dims::Int...) where {T<:AbstractFloat} = Space(Fill(typemin(x) .. typemax(T), dim, extra_dims...))
Space(x::Type{T}, dim::Int, extra_dims::Int...) where {T} = Space(Fill(T, dim, extra_dims...))

Base.size(s::Space) = size(SpaceStyle(s))
Base.length(s::Space) = length(SpaceStyle(s), s)
Base.getindex(s::Space, i...) = getindex(SpaceStyle(s), s, i...)
Base.:(==)(s1::Space, s2::Space) = s1.s == s2.s

#####

abstract type AbstractSpaceStyle{S} end

struct DiscreteSpaceStyle{S} <: AbstractSpaceStyle{S} end
struct ContinuousSpaceStyle{S} <: AbstractSpaceStyle{S} end
abstract type AbstractSpaceStyle end

SpaceStyle(::Space{<:Tuple}) = DiscreteSpaceStyle{()}()
SpaceStyle(::Space{<:AbstractVector{<:Number}}) = DiscreteSpaceStyle{()}()
SpaceStyle(::Space{<:AbstractInterval}) = ContinuousSpaceStyle{()}()
struct FiniteSpaceStyle <: AbstractSpaceStyle end
struct ContinuousSpaceStyle <: AbstractSpaceStyle end
struct UnknownSpaceStyle <: AbstractSpaceStyle end

SpaceStyle(s::Space{<:AbstractArray{<:Tuple}}) = DiscreteSpaceStyle{size(s.s)}()
SpaceStyle(s::Space{<:AbstractArray{<:AbstractRange}}) = DiscreteSpaceStyle{size(s.s)}()
SpaceStyle(s::Space{<:AbstractArray{<:AbstractInterval}}) = ContinuousSpaceStyle{size(s.s)}()
SpaceStyle(space::Any) = UnknownSpaceStyle()

Base.size(::AbstractSpaceStyle{S}) where {S} = S
Base.length(::DiscreteSpaceStyle{()}, s) = length(s.s)
Base.getindex(::DiscreteSpaceStyle{()}, s, i...) = getindex(s.s, i...)
Base.length(::DiscreteSpaceStyle, s) = mapreduce(length, *, s.s)

#####
SpaceStyle(::Tuple) = FiniteSpaceStyle()
SpaceStyle(::NamedTuple) = FiniteSpaceStyle()

Random.rand(rng::Random.AbstractRNG, s::Space) = rand(rng, s.s)

Random.rand(
rng::Random.AbstractRNG,
s::Union{
<:Space{<:AbstractArray{<:Tuple}},
<:Space{<:AbstractArray{<:AbstractRange}},
<:Space{<:AbstractArray{<:AbstractInterval}}
}
) = map(x -> rand(rng, x), s.s)

Base.in(x, s::Space) = x in s.s
Base.in(x, s::Space{<:Type}) = x isa s.s

Base.in(
x,
s::Union{
<:Space{<:AbstractArray{<:Tuple}},
<:Space{<:AbstractArray{<:AbstractRange}},
<:Space{<:AbstractArray{<:AbstractInterval}}
}
) = size(x) == size(s) && all(x -> x[1] in x[2], zip(x, s.s))

function Random.rand(rng::AbstractRNG, s::Interval{:closed,:closed,T}) where {T}
if s == typemin(T) .. typemax(T)
rand(T)
function SpaceStyle(x::Union{AbstractArray,AbstractDict,AbstractSet,AbstractRange})
if Base.IteratorSize(x) isa Union{Base.HasLength, Base.HasShape} && length(x) < Inf
return FiniteSpaceStyle()
else
r = rand(rng)

if r == 0.0
r = rand(Bool)
end

r * (s.right - s.left) + s.left
return UnknownSpaceStyle()
end
end

Base.iterate(s::Space, args...) = iterate(SpaceStyle(s), s, args...)
Base.iterate(::DiscreteSpaceStyle{()}, s::Space, args...) = iterate(s.s, args...)

#####
SpaceStyle(::AbstractInterval) = ContinuousSpaceStyle()

const TupleSpace = Tuple{Vararg{Space}}
const NamedSpace = NamedTuple{<:Any,<:TupleSpace}
const VectorSpace = Vector{<:Space}
const DictSpace = Dict{<:Any,<:Space}
function elsize end # note: different than Base.elsize

Random.rand(rng::AbstractRNG, s::Union{TupleSpace,NamedSpace,VectorSpace}) = map(x -> rand(rng, x), s)
Random.rand(rng::AbstractRNG, s::DictSpace) = Dict(k => rand(rng, s[k]) for k in keys(s))
function bounds end

Base.in(xs::Tuple, ts::TupleSpace) = length(xs) == length(ts) && all(((x, s),) -> x in s, zip(xs, ts))
Base.in(xs::AbstractVector, ts::VectorSpace) = length(xs) == length(ts) && all(((x, s),) -> x in s, zip(xs, ts))
Base.in(xs::NamedTuple{names}, ns::NamedTuple{names,<:TupleSpace}) where {names} = all(((x, s),) -> x in s, zip(xs, ns))
Base.in(xs::Dict, ds::DictSpace) = length(xs) == length(ds) && all(k -> haskey(ds, k) && xs[k] in ds[k], keys(xs))
bounds(i::AbstractInterval) = (infimum(i), supremum(i))
Base.clamp(x, i::AbstractInterval) = IntervalSets.clamp(x, i)
23 changes: 23 additions & 0 deletions src/product.jl
@@ -0,0 +1,23 @@
product(i1::ClosedInterval, i2::ClosedInterval) = Box(SA[minimum(i1), minimum(i2)], SA[maximum(i1), maximum(i2)])

product(b::Box, i::ClosedInterval) = product(b, convert(Box, i))
product(i::ClosedInterval, b::Box) = product(convert(Box, i), b)
product(b1::Box{<:AbstractVector}, b2::Box{<:AbstractVector}) = Box(vcat(b1.lower, b2.lower), vcat(b1.upper, b2.upper))
function product(b1::Box, b2::Box)
if size(b1.lower, 2) == size(b2.lower, 2) # same number of columns
return Box(vcat(b1.lower, b2.lower), vcat(b1.upper, b2.upper))
else
return GenericrSpaceProduct((b1, b2))
end
end

# handle case of 3 or more
product(s1, s2, s3, args...) = product(product(s1, s2), s3, args...)

struct GenericrSpaceProduct{T<:Tuple}
members::T
end

# handle any case not covered above
product(s1, s2) = GenericrSpaceProduct((s1, s2))
product(s1::GenericrSpaceProduct, s2) = GenericrSpaceProduct((s1.members..., s2))