See lqr_infinite_horizon.ipynb
TODO: add an entropy term to the value function to encourage exploration. (This requires knowing the whole policy distribution!)
See lqr.py
We solve the control problem by minimising the cost J, where the terminal cost g is convex. The policy alpha is parametrised by a neural network, and we use the method of successive approximations (MSA) based on the Pontryagin maximum principle. Algorithm (a code sketch follows the list):
- Start with an initial policy.
- Solve the adjoint BSDE for the processes (Y_t, Z_t) using deep learning.
- Update the policy by maximising the Hamiltonian (analogous to Q-learning in model-free RL).
- Go back to step 2.
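A minimal PyTorch sketch of this loop, specialised to the drunk-agent LQR example described below. All names, network sizes and hyper-parameters are illustrative assumptions, not the repo's API, and the BSDE step is simplified: for this problem dH/dx = 0, so Y_t reduces to the conditional expectation E[2 X_T | X_t], which we estimate by plain regression instead of a general deep BSDE solver.

```python
import torch
import torch.nn as nn

T, N, batch = 1.0, 20, 512   # horizon, time steps, Monte Carlo batch (assumed)
dt = T / N

def mlp():
    return nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))

policy = mlp()    # a_theta(t, x)
adjoint = mlp()   # Y_phi(t, x), first BSDE component

def simulate(policy):
    """Step 1: forward-simulate dX_t = a_t dt + dW_t under the current policy."""
    x = torch.zeros(batch, 1)
    ts, xs = [torch.zeros(batch, 1)], [x]
    for n in range(N):
        t = torch.full((batch, 1), n * dt)
        with torch.no_grad():
            a = policy(torch.cat([t, x], dim=-1))
        x = x + a * dt + torch.randn(batch, 1) * dt ** 0.5
        ts.append(t + dt)
        xs.append(x)
    return torch.cat(ts), torch.cat(xs), x   # all (t, X_t) pairs and the terminal X_T

def solve_bsde(t, x, x_T, steps=300):
    """Step 2 (simplified): the adjoint BSDE here gives Y_t = E[2 X_T | X_t];
    estimate it by regressing 2 X_T on (t, X_t) along the simulated paths."""
    target = 2.0 * x_T.repeat(N + 1, 1)
    opt = torch.optim.Adam(adjoint.parameters(), lr=1e-3)
    for _ in range(steps):
        loss = ((adjoint(torch.cat([t, x], dim=-1)) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

def update_policy(t, x, steps=300):
    """Step 3: optimise the Hamiltonian pointwise in (t, x). Since the diffusion is
    uncontrolled, the Z-dependent term drops and H(x, a, y) = y * a + a^2; with the
    cost-minimisation sign convention we minimise H (equivalently, maximise -H)."""
    with torch.no_grad():
        y = adjoint(torch.cat([t, x], dim=-1))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(steps):
        a = policy(torch.cat([t, x], dim=-1))
        loss = (y * a + a ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

for it in range(10):   # Step 4: iterate until the policy stops changing
    t, x, x_T = simulate(policy)
    solve_bsde(t, x, x_T)
    update_policy(t, x)
```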
Drunk agents trying to reach the origin (aka LQR: dX_t = a_t dt + dW_t, with running cost f(x,a) = a^2, and final cost g(x) = x^2)
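For this finite-horizon problem the optimal control is available in closed form, which makes a handy benchmark for the learned policy. Assuming the cost is J = E[int_0^T a_t^2 dt + X_T^2], the HJB equation with the ansatz V(t, x) = p(t) x^2 + q(t) gives p(t) = 1/(1 + T - t), q(t) = log(1 + T - t), and the optimal feedback a*(t, x) = -p(t) x (a standard Riccati computation, worth re-deriving before relying on it):

```python
import math

# Closed-form benchmark under the assumptions stated above:
# V(t, x) = x^2 / (1 + T - t) + log(1 + T - t),  a*(t, x) = -x / (1 + T - t).
def optimal_control(t, x, T=1.0):
    return -x / (1.0 + T - t)

def optimal_value(t, x, T=1.0):
    return x ** 2 / (1.0 + T - t) + math.log(1.0 + T - t)
```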
The code is loop-heavy: the BSDE solver and the Hamiltonian evaluation should be vectorised across time steps.
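As an illustration of the kind of vectorisation meant here (shapes and names are assumptions, not the repo's actual tensors): evaluate the policy and the Hamiltonian on every (time step, sample) pair in one batched call instead of looping over time in Python.

```python
import torch

# Assumed shapes: t, x, y all (batch, N, 1) — the time grid, the simulated
# states and the BSDE component Y along each path.
def hamiltonian_all_steps(policy, t, x, y):
    a = policy(torch.cat([t, x], dim=-1))   # one call over every (t, x) pair
    return y * a + a ** 2                    # H for drift a and running cost a^2

# Usage with dummy tensors; nn.Linear broadcasts over the leading dimensions,
# so no explicit loop over the N time steps is needed.
batch, N = 256, 20
policy = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
t = torch.rand(batch, N, 1)
x = torch.randn(batch, N, 1)
y = torch.randn(batch, N, 1)
H = hamiltonian_all_steps(policy, t, x, y)   # shape (batch, N, 1), no time loop
```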