Display - fix #14 (#22)

* added DisplaySimulator code
* documentation, etc for display
* added a few comments
* more permissive NamedTupleTools compat
* added display.md to git
JuliaPOMDP · zsunberg · Nov 19, 2019 · 27c515f
1 parent a464bab
commit 27c515f
Showing 11 changed files with 226 additions and 14 deletions.
diff --git a/Project.toml b/Project.toml
@@ -1,6 +1,6 @@
 name = "POMDPSimulators"
 uuid = "e0d0a172-29c6-5d4e-96d0-f262df5d01fd"
-version = "0.3.1"
+version = "0.3.2"
 
 [deps]
 BeliefUpdaters = "8bb6e9a1-7d73-552c-a44a-e5dc5634aac4"
@@ -14,7 +14,13 @@ ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 
 [compat]
+BeliefUpdaters = "0.1"
+DataFrames = "0.19"
+NamedTupleTools = "0"
+POMDPModelTools = "0.1, 0.2"
+POMDPPolicies = "0.2"
 POMDPs = "0.7.3, 0.8"
+ProgressMeter = "1"
 julia = "1.0"
 
 [extras]

diff --git a/docs/make.jl b/docs/make.jl
@@ -4,10 +4,18 @@ using POMDPSimulators
 makedocs(
     modules = [POMDPSimulators],
     format = Documenter.HTML(),
-    sitename = "POMDPSimulators.jl"
+    sitename = "POMDPSimulators.jl",
+    pages = ["index.md",
+             "which.md",
+             "rollout.md",
+             "parallel.md",
+             "history_recorder.md",
+             "histories.md",
+             "stepthrough.md",
+             "display.md",
+             "sim.md"]
 )
 
 deploydocs(
     repo = "github.com/JuliaPOMDP/POMDPSimulators.jl.git",
 )
-
diff --git a/docs/src/display.md b/docs/src/display.md
@@ -0,0 +1,42 @@
+# Display
+
+## `DisplaySimulator`
+
+The `DisplaySimulator` displays each step of a simulation in real time through a multimedia display such as a Jupyter notebook or [ElectronDisplay](https://github.com/queryverse/ElectronDisplay.jl).
+Specifically, it uses `POMDPModelTools.render` and the built-in Julia [`display` function](https://docs.julialang.org/en/v1/base/io-network/#Base.Multimedia.display) to visualize each step.
+
+Example:
+```julia
+using POMDPs
+using POMDPModels
+using POMDPPolicies
+using POMDPSimulators
+using ElectronDisplay
+ElectronDisplay.CONFIG.single_window = true
+
+ds = DisplaySimulator()
+m = SimpleGridWorld()
+simulate(ds, m, RandomPolicy(m))
+```
+
+```@docs
+DisplaySimulator
+```
+
+## Display-specific tips
+
+The following tips may be helpful when using particular displays.
+
+### Jupyter notebooks
+
+By default, in a Jupyter notebook, the visualizations of all steps are displayed in the output box one after another. To make the output animated instead, where the image is overwritten at each step, one may use
+```julia
+DisplaySimulator(predisplay=(d)->IJulia.clear_output(true))
+```
+
+### ElectronDisplay
+
+By default, ElectronDisplay will open a new window for each new step. To prevent this, use
+```julia
+ElectronDisplay.CONFIG.single_window = true
+```
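
Note on the `display` and `predisplay` keywords documented above: the following is a minimal, self-contained sketch of how a custom `AbstractDisplay` and a `predisplay` hook interact. It does not use POMDPSimulators itself. `RecordingDisplay` and `show_frame` are hypothetical names introduced for illustration; `show_frame` only imitates the dispatch logic of `perform_display` in `src/display.jl` from this commit.

```julia
# Hypothetical display that records frames instead of drawing them.
struct RecordingDisplay <: AbstractDisplay
    frames::Vector{Any}
end
RecordingDisplay() = RecordingDisplay(Any[])

# `display(d, x)` is the method a custom AbstractDisplay implements.
Base.display(d::RecordingDisplay, x) = (push!(d.frames, x); nothing)

# Stand-in that imitates `perform_display`: call the hook, then display.
function show_frame(predisplay, d, vis)
    predisplay(d)            # receives the display, or `nothing`
    if d === nothing
        display(vis)         # fall back to the default display stack
    else
        display(d, vis)
    end
end

d = RecordingDisplay()
calls = Ref(0)               # counts how often the hook runs
for vis in ["frame 1", "frame 2", "frame 3"]
    show_frame(_ -> (calls[] += 1), d, vis)
end
```

In this sketch the hook runs once per frame, which is why a hook such as `IJulia.clear_output(true)` produces an animation rather than a growing list of outputs.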
diff --git a/docs/src/histories.md b/docs/src/histories.md
@@ -59,4 +59,4 @@ will produce a vector of the distances traveled on each step (assuming the state
 
 `state_hist(h)`, `action_hist(h)`, `observation_hist(h)` `belief_hist(h)`, and `reward_hist(h)` will return vectors of the states, actions, and rewards, and `undiscounted_reward(h)` and `discounted_reward(h)` will return the total rewards collected over the trajectory. `n_steps(h)` returns the number of steps in the history. `exception(h)` and `backtrace(h)` can be used to hold an exception if the simulation failed to finish.
 
-`view(h, range)` (e.g. `view(h, 1:n_steps(h)-4)`) can be used to create a view of the history object `h` that only contains a certain range of steps. The object returned by `view` is a `SimHistory` that can be iterated through and manipulated just like a complete `SimHistory`.
+`view(h, range)` (e.g. `view(h, 1:n_steps(h)-4)`) can be used to create a view of the history object `h` that only contains a certain range of steps. The object returned by `view` is an `AbstractSimHistory` that can be iterated through and manipulated just like a complete `SimHistory`.
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -7,4 +7,13 @@ Examples can be found in the [simulation tutorial in the POMDPExamples package](
 If you are just getting started, probably the easiest way to begin is the [`stepthrough` function](@ref Stepping-through). Otherwise, consult the [Which Simulator Should I Use?](@ref) page.
 
 ```@contents
+Pages = ["index.md",
+         "which.md",
+         "rollout.md",
+         "parallel.md",
+         "history_recorder.md",
+         "histories.md",
+         "stepthrough.md",
+         "display.md",
+         "sim.md"]
 ```
diff --git a/docs/src/which.md b/docs/src/which.md
@@ -18,12 +18,12 @@ Use the [History Recorder](@ref History-Recorder).
 
 Use the [`stepthrough` function](@ref Stepping-through).
 
-## I want to interact with a MDP or POMDP environment from the policy's perspective
+## I want to visualize a simulation.
 
-Use the [`sim` function](@ref sim-function).
+Use the [`DisplaySimulator`](@ref Display).
 
-## I want to visualize a simulation.
+Also see the [POMDPGifs package](https://github.com/JuliaPOMDP/POMDPGifs.jl) for creating gif animations.
 
-Visualization is not implemented directly in this package. However, the [Blink POMDP Simulator package](https://github.com/JuliaPOMDP/BlinkPOMDPSimulator.jl) contains a simulator for real-time visualization, and the [POMDPGifs package](https://github.com/JuliaPOMDP/POMDPGifs.jl) includes tools for creating gif animations. Additionally, histories produced by a [`HistoryRecorder`](@ref) or [`sim`](@ref) are can be visualized using the [`render`](https://juliapomdp.github.io/POMDPModelTools.jl/latest/visualization.html#POMDPModelTools.render) function from [POMDPModelTools](https://github.com/JuliaPOMDP/POMDPModelTools.jl).
+## I want to interact with a MDP or POMDP environment from the policy's perspective
 
-See the [Visualization Tutorial in POMDPExamples](https://github.com/JuliaPOMDP/POMDPExamples.jl) for more info.
+Use the [`sim` function](@ref sim-function).
diff --git a/src/POMDPSimulators.jl b/src/POMDPSimulators.jl
@@ -54,4 +54,8 @@ export
     problem
 include("parallel.jl")
 
+export
+    DisplaySimulator
+include("display.jl")
+
 end # module
diff --git a/src/display.jl b/src/display.jl
@@ -0,0 +1,115 @@
+struct DisplaySimulator
+    display::Union{AbstractDisplay, Nothing}
+    render_kwargs
+    max_fps::Float64
+    predisplay::Function
+    extra_initial::Bool
+    extra_final::Bool
+    stepsim::StepSimulator
+end
+
+"""
+    DisplaySimulator(;kwargs...)
+
+Create a simulator that displays each step of a simulation.
+
+Given a POMDP or MDP model `m`, this simulator roughly works like
+
+    for step in stepthrough(m, ...)
+        display(render(m, step))
+    end
+
+# Keyword Arguments
+- `display::AbstractDisplay`: the display to use for the first argument to the `display` function. If this is `nothing`, `display(...)` will be called without an `AbstractDisplay` argument.
+- `render_kwargs::NamedTuple`: keyword arguments for `POMDPModelTools.render(...)`
+- `max_fps::Number=10`: maximum number of frames to be displayed per second - `sleep` will be used to skip extra time, so this is not designed for high precision
+- `predisplay::Function`: function to call before every call to `display(...)`. The only argument to this function will be the display (if it is specified) or `nothing`
+- `extra_initial::Bool=false`: if `true`, display an extra step at the beginning with only elements `t`, `sp`, and `bp` for POMDPs (this can be useful to see the initial state if `render` displays only `sp` and not `s`).
+- `extra_final::Bool=true`: if `true`, display an extra step at the end with only elements `t`, `done`, `s`, and `b` for POMDPs (this can be useful to see the final state if `render` displays only `s` and not `sp`).
+- `max_steps::Integer`: maximum number of steps to run for
+- `spec::NTuple{Symbol}`: specification of what step elements to display (see `eachstep`)
+- `rng::AbstractRNG`: random number generator
+
+See the POMDPSimulators documentation for more tips about using specific displays.
+"""
+function DisplaySimulator(;display=nothing,
+                           render_kwargs=NamedTuple(),
+                           max_fps=10,
+                           predisplay=(d)->nothing,
+                           extra_initial=false,
+                           extra_final=true,
+                           max_steps=nothing,
+                           spec=CompleteSpec(),
+                           rng=Random.GLOBAL_RNG
+                         )
+    stepsim = StepSimulator(rng, max_steps, spec)
+    return DisplaySimulator(display,
+                            render_kwargs,
+                            max_fps,
+                            predisplay,
+                            extra_initial,
+                            extra_final,
+                            stepsim)
+end
+
+function simulate(sim::DisplaySimulator, m, args...)
+    rsum = 0.0
+    disc = 1.0
+    dt = 1/sim.max_fps
+    tm = time()
+    isinitial = true
+    last = NamedTuple() # for extra_final
+
+    for step in simulate(sim.stepsim, m, args...)
+        if isinitial && sim.extra_initial
+            isinitial = false
+            istep = initialstep(m, step)
+            vis = render(m, istep; sim.render_kwargs...)
+            perform_display(sim, vis)
+            sleep_until(tm += dt)
+        end
+
+        vis = render(m, step; sim.render_kwargs...)
+        perform_display(sim, vis)
+        rsum += disc*get(step, :r, missing)
+        disc *= discount(m)
+        sleep_until(tm += dt)
+
+        last = step # save for extra final
+    end
+
+    if sim.extra_final
+        fstep = finalstep(m, last)
+        vis = render(m, fstep; sim.render_kwargs...)
+        perform_display(sim, vis)
+    end
+
+    if ismissing(rsum)
+        return nothing
+    else
+        return rsum
+    end
+end
+
+sleep_until(t) = sleep(max(t-time(), 0.0))
+
+initialstep(m::MDP, step) = (t=0, sp=get(step, :s, missing))
+initialstep(m::POMDP, step) = (t=0,
+                               sp=get(step, :s, missing),
+                               bp=get(step, :b, missing))
+finalstep(m::MDP, last) = (done=true,
+                           t=get(last, :t, missing) + 1,
+                           s=get(last, :sp, missing))
+finalstep(m::POMDP, last) = (done=true,
+                             t=get(last, :t, missing) + 1,
+                             s=get(last, :sp, missing),
+                             b=get(last, :bp, missing))
+
+function perform_display(sim::DisplaySimulator, vis)
+    sim.predisplay(sim.display)
+    if sim.display===nothing
+        display(vis)
+    else
+        display(sim.display, vis)
+    end
+end
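
Note on the return value of `simulate(sim::DisplaySimulator, ...)`: the reward accumulation relies on `missing` propagation, so a spec that omits `:r` makes the function return `nothing`. Below is a small sketch of just that logic, assuming steps are plain `NamedTuple`s. `discounted_sum` is a hypothetical helper, not part of the package.

```julia
# Accumulate the discounted reward over step NamedTuples, mirroring the
# loop body of `simulate(sim::DisplaySimulator, ...)` in this commit.
function discounted_sum(steps, gamma)
    rsum = 0.0
    disc = 1.0
    for step in steps
        # `get` on a NamedTuple returns the default when the key is absent;
        # any arithmetic with `missing` yields `missing`.
        rsum += disc * get(step, :r, missing)
        disc *= gamma
    end
    return ismissing(rsum) ? nothing : rsum
end

with_r = [(s=1, r=1.0), (s=2, r=2.0), (s=3, r=4.0)]
without_r = [(s=1,), (s=2,)]

total = discounted_sum(with_r, 0.5)      # 1.0 + 0.5*2.0 + 0.25*4.0
nothing_total = discounted_sum(without_r, 0.5)
```

Once a single step lacks `:r`, the running sum stays `missing` for the rest of the loop, so the result is `nothing` rather than a partial total.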
diff --git a/src/stepthrough.jl b/src/stepthrough.jl
@@ -1,7 +1,7 @@
 # StepSimulator
 # maintained by @zsunberg
 
-mutable struct StepSimulator <: Simulator
+struct StepSimulator <: Simulator
     rng::AbstractRNG
     max_steps::Union{Nothing,Any}
     spec
@@ -11,7 +11,7 @@ function StepSimulator(spec; rng=Random.GLOBAL_RNG, max_steps=nothing)
 end
 
 function simulate(sim::StepSimulator, mdp::MDP{S}, policy::Policy, init_state::S=initialstate(mdp, sim.rng)) where {S}
-    symtuple = convert_spec(sim.spec, MDP)
+    symtuple = convert_spec(sim.spec, typeof(mdp))
     max_steps = something(sim.max_steps, typemax(Int64))
     return MDPSimIterator(symtuple, mdp, policy, sim.rng, init_state, max_steps)
 end
@@ -23,7 +23,7 @@ end
 
 function simulate(sim::StepSimulator, pomdp::POMDP, policy::Policy, bu::Updater, dist::Any, is=initialstate(pomdp, sim.rng))
     initial_belief = initialize_belief(bu, dist)
-    symtuple = convert_spec(sim.spec, POMDP)
+    symtuple = convert_spec(sim.spec, typeof(pomdp))
     max_steps = something(sim.max_steps, typemax(Int64))
     return POMDPSimIterator(symtuple, pomdp, policy, bu, sim.rng, initial_belief, is, max_steps)
 end
@@ -166,8 +166,20 @@ end
 
 convert_spec(spec::Symbol) = spec
 
-default_spec(m::MDP) = tuple(nodenames(DDNStructure(m))..., :t, :action_info)
-default_spec(m::POMDP) = tuple(nodenames(DDNStructure(m))..., :t, :action_info, :b, :bp, :update_info)
+"""
+    CompleteSpec()
+
+Default placeholder for a complete step output specification. Will include all DDNNodes, plus all known possible outputs in each step.
+"""
+struct CompleteSpec end
+
+convert_spec(::CompleteSpec, T::Type{M}) where M <: MDP = default_spec(T)
+convert_spec(::CompleteSpec, T::Type{M}) where M <: POMDP = default_spec(T)
+
+default_spec(m::Union{MDP,POMDP}) = default_spec(typeof(m))
+default_spec(T::Type{M}) where M <: MDP = tuple(nodenames(DDNStructure(T))..., :t, :action_info)
+default_spec(T::Type{M}) where M <: POMDP = tuple(nodenames(DDNStructure(T))..., :t, :action_info, :b, :bp, :update_info)
+
 
 """
     stepthrough(problem, policy, [spec])
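
Note on `CompleteSpec`: the placeholder lets a simulator be constructed before the model type is known, with the concrete tuple of symbols resolved later by dispatch on the model type. The sketch below shows the pattern in isolation. `ToyMDP`, `ToyPOMDP`, `GridMDP`, and `TigerPOMDP` are hypothetical stand-ins for the POMDPs.jl types, the default tuples are made up (the real ones come from the DDN node names), and passing explicit tuples through unchanged is a simplification.

```julia
# Hypothetical stand-ins for POMDPs.MDP and POMDPs.POMDP.
abstract type ToyMDP end
abstract type ToyPOMDP end
struct GridMDP <: ToyMDP end
struct TigerPOMDP <: ToyPOMDP end

# Placeholder meaning "every known output for this model type".
struct CompleteSpec end

# Made-up defaults, selected by dispatch on the model *type*.
default_spec(::Type{<:ToyMDP}) = (:s, :a, :sp, :r, :t)
default_spec(::Type{<:ToyPOMDP}) = (:s, :a, :sp, :o, :r, :t, :b, :bp)

# Resolve the placeholder only once the model type is available.
convert_spec(::CompleteSpec, T::Type) = default_spec(T)
convert_spec(spec::Tuple, T::Type) = spec  # simplification: pass through

mdp_spec = convert_spec(CompleteSpec(), typeof(GridMDP()))
pomdp_spec = convert_spec(CompleteSpec(), TigerPOMDP)
custom_spec = convert_spec((:s, :r), TigerPOMDP)
```

Passing `typeof(mdp)` rather than the abstract type, as the diff above does, is what allows a model-specific default to be chosen.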

diff --git a/test/runtests.jl b/test/runtests.jl
@@ -21,3 +21,6 @@ end
 @testset "parallel" begin
     include("test_parallel.jl")
 end
+@testset "display" begin
+    include("test_display.jl")
+end
diff --git a/test/test_display.jl b/test/test_display.jl
@@ -0,0 +1,13 @@
+ds = DisplaySimulator(max_steps=10,
+                      extra_initial=true,
+                      extra_final=true,
+                      rng=MersenneTwister(4))
+m = BabyPOMDP()
+@test simulate(ds, m, Starve()) ≈ 0.0
+
+ds = DisplaySimulator(max_steps=1,
+                      extra_initial=true,
+                      extra_final=true,
+                      rng=MersenneTwister(4))
+m = SimpleGridWorld()
+@test simulate(ds, m, FunctionPolicy(s->first(actions(m)))) ≈ 0.0