-
Regarding the idea of a closer connection between the two: the algorithms for RL are very different from those in DiffEq, so I'm not sure how easy it would be to connect them. However, there are other nice things already created in DiffEq, such as the setup of types for CUDA and parallelization, etc. So it is something to consider.
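(Not part of the original reply, but to illustrate the kind of infrastructure meant here: a minimal sketch of DifferentialEquations.jl's ensemble interface, which already parallelises many trajectories across threads and, via DiffEqGPU.jl, on CUDA. The toy decay problem and the parameter sweep are made up for this example.)

```julia
using DifferentialEquations

# Toy scalar ODE: exponential decay with rate p
f(u, p, t) = -p * u
prob = ODEProblem(f, 1.0, (0.0, 1.0), 1.0)

# Give each trajectory its own decay rate
prob_func(prob, i, repeat) = remake(prob, p = 0.5 + 0.1 * i)
ensemble = EnsembleProblem(prob, prob_func = prob_func)

# 100 trajectories in parallel on CPU threads; DiffEqGPU.jl's
# EnsembleGPUArray can target CUDA with the same problem setup
sols = solve(ensemble, Tsit5(), EnsembleThreads(), trajectories = 100)
```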
-
According to this blog, the simulation flow of ReinforcementLearning.jl is very flexible: hooks can be called at the various stages of a run.
To me, it looks like callbacks in DifferentialEquations.jl.
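For instance, a hook that records something after every action is conceptually close to a `DiscreteCallback` that fires after every accepted solver step. A minimal sketch (the logging and the toy ODE are just for illustration):

```julia
using DifferentialEquations

# Fire after every accepted step, roughly the role of a POST_ACT_STAGE hook
condition(u, t, integrator) = true
log_step!(integrator) = @info "step" t = integrator.t u = integrator.u
cb = DiscreteCallback(condition, log_step!)

f(u, p, t) = -u
prob = ODEProblem(f, 1.0, (0.0, 1.0))
sol = solve(prob, Tsit5(), callback = cb)
```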
How about basing this package on DifferentialEquations.jl?
Although it would add a heavy dependency, it would provide extensive functionality and extensibility for continuous-time simulation (ODEs, SDEs, RODEs, and so on). DifferentialEquations.jl can also handle discrete-time simulations, as sketched below.
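As a minimal sketch of the discrete-time side (the map and the numbers are made up for illustration), discrete dynamics are expressed with `DiscreteProblem` and the `FunctionMap` solver:

```julia
using DifferentialEquations

# Discrete-time map u_{n+1} = 0.9 * u_n
f(u, p, t) = 0.9 .* u

prob = DiscreteProblem(f, [1.0, -2.0], (0.0, 10.0))
sol = solve(prob, FunctionMap(), dt = 1.0)   # one map application per unit of t
```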
In addition, DifferentialEquations.jl is a widely used package for scientific machine learning and is compatible with other ML packages; see DiffEqFlux.jl for neural ODEs, for example.
Note that this proposal may be biased toward continuous-time simulation of dynamical systems (as that is what I'm interested in).
(Note)
I'm not sure what an appropriate form of `Base.run` would be for continuous-time simulation, because there might be no difference between `policy(PRE_ACT_STAGE, ...)`, `env(action)`, and `policy(POST_ACT_STAGE, ...)` (they would be integrated into one callback). Probably the `action` is injected into the simulator as parameters, e.g., a parameterised simulation within one time step.
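A rough sketch of that idea (the dynamics, the toy policy, and the step size are hypothetical; only the documented callback interface of DifferentialEquations.jl is assumed): the policy call, `env(action)`, and the post-act bookkeeping collapse into a single `PeriodicCallback` that writes the new action into the problem parameters at the start of each control interval, so the whole PRE_ACT/act/POST_ACT cycle becomes one callback invocation.

```julia
using DifferentialEquations

# Double-integrator dynamics; p[1] holds the most recent action
dynamics(u, p, t) = [u[2], p[1]]

# Toy proportional-derivative "policy"
toy_policy(u) = -u[1] - u[2]

# One callback plays the role of PRE_ACT_STAGE / env(action) / POST_ACT_STAGE:
# every 0.1 time units it injects a fresh action into the parameters
update_action!(integrator) = (integrator.p[1] = toy_policy(integrator.u))
cb = PeriodicCallback(update_action!, 0.1)

u0 = [1.0, 0.0]
prob = ODEProblem(dynamics, u0, (0.0, 5.0), [toy_policy(u0)])
sol = solve(prob, Tsit5(), callback = cb)
```

This is essentially the sampled-data control pattern: the action is held constant over each interval, which is what "parameterised simulation within one time step" would amount to.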
(Note2)
Of course, there is an alternative: custom usage of `DifferentialEquations.jl` only for `env(action)` (that is, system propagation).