paper_summaries/Model_Based_Reinforcement_Learning (quanvuong/paper_summaries)

When_to_Trust_Your_Model_Model_based_policy_optimization

Uses short rollouts from the learned model to train the policy.
Argues that short rollouts allow more policy gradient steps per environment step than model-free RL.
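
A minimal sketch of the short-rollout idea, not the paper's implementation. It assumes rollouts start from states sampled from real environment data, and `toy_model_step` and `toy_policy` are hypothetical stand-ins for a learned dynamics model and a policy. The point is that a handful of real states yields many synthetic transitions that can feed policy updates.

```python
import numpy as np

def short_model_rollouts(model_step, policy, start_states, horizon):
    """Roll the model forward `horizon` steps from each start state.

    Returns a list of (state, action, reward, next_state) tuples.
    """
    transitions = []
    states = np.asarray(start_states, dtype=float)
    for _ in range(horizon):
        actions = policy(states)
        next_states, rewards = model_step(states, actions)
        transitions.extend(zip(states, actions, rewards, next_states))
        states = next_states
    return transitions

# Hypothetical stand-in for a learned dynamics and reward model.
def toy_model_step(states, actions):
    next_states = 0.9 * states + 0.1 * actions
    rewards = -np.sum(next_states ** 2, axis=1)
    return next_states, rewards

# Hypothetical stand-in for the current policy.
def toy_policy(states):
    return -states

rng = np.random.default_rng(0)
real_states = rng.normal(size=(4, 2))  # assumed to come from real environment data
model_buffer = short_model_rollouts(toy_model_step, toy_policy, real_states, horizon=3)
# 4 start states x 3 model steps = 12 synthetic transitions
```

Keeping `horizon` small limits how far model errors can compound, while the resulting buffer still supplies many transitions per real environment step.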

Exploring_Model_based_Planning_with_Policy_Networks

Argues that performing the cross-entropy method (CEM) in the parameter space of a policy network is easier than in the action space.
Visualizes the reward function surface and uses this visualization to justify their claims.
The visualization techniques look interesting and can be re-used in other studies.
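
A minimal sketch of CEM applied to policy parameters, not the paper's implementation. The one-dimensional dynamics, the linear policy `a = theta[0] * s`, and all hyperparameters are assumptions chosen for illustration. Each candidate is a parameter vector, scored by the return of a model rollout under the policy it defines.

```python
import numpy as np

def cem(score_fn, mean, std, n_iters=20, pop_size=64, n_elite=8, seed=0):
    """Maximize score_fn with the cross-entropy method over a diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    mean = np.array(mean, dtype=float)
    std = np.array(std, dtype=float)
    for _ in range(n_iters):
        samples = mean + std * rng.normal(size=(pop_size, mean.size))
        scores = np.array([score_fn(s) for s in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]  # highest-scoring candidates
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor keeps sampling well-defined
    return mean

def rollout_return(theta, s0=1.0, horizon=10):
    """Return of a linear policy on a toy model s' = s + a with reward -(s')^2."""
    s, total = s0, 0.0
    for _ in range(horizon):
        a = theta[0] * s
        s = s + a
        total += -s ** 2
    return total

# Search over the policy parameter instead of a sequence of actions.
best_theta = cem(rollout_return, mean=[0.0], std=[1.0])
```

In this toy problem the best parameter is `theta[0] = -1`, which drives the state to zero in one step. Searching in action space would reuse the same `cem` routine with a score function that takes a length-`horizon` action sequence instead of `theta`.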

Using_Inaccurate_Models_in_Reinforcement_Learning

Differentiates through the learned model to obtain the policy gradient.
Evaluates this policy gradient along a true trajectory (collected from the real system) to remove one source of gradient approximation error.
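
A minimal sketch of this idea on a scalar linear system, not the paper's implementation. The dynamics `s' = a*s + b*u`, the linear policy `u = theta*s`, the cost `sum_t s_t^2`, and all numeric values are assumptions chosen for illustration. The derivatives come from the (inaccurate) model, while the states they are evaluated at come from a rollout in the true system.

```python
import numpy as np

def rollout(a, b, theta, s0, horizon):
    """States visited under s' = a*s + b*u with the linear policy u = theta*s."""
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        states.append(a * s + b * theta * s)
    return np.array(states)

def model_gradient_along(states, a_model, b_model, theta):
    """d(sum_{t>=1} s_t^2)/d(theta), with model derivatives evaluated at `states`."""
    ds_dtheta = 0.0  # the initial state does not depend on theta
    grad = 0.0
    for t in range(len(states) - 1):
        # Chain rule through s_{t+1} = a*s_t + b*theta*s_t using the model's a and b.
        ds_dtheta = (a_model + b_model * theta) * ds_dtheta + b_model * states[t]
        grad += 2.0 * states[t + 1] * ds_dtheta
    return grad

# Assumed values: the model's `a` is deliberately wrong.
a_true, b_true = 1.0, 0.5
a_model, b_model = 0.9, 0.5
theta, s0, horizon = -0.5, 1.0, 5

real_states = rollout(a_true, b_true, theta, s0, horizon)     # from the true system
model_states = rollout(a_model, b_model, theta, s0, horizon)  # from the model only

grad_at_real = model_gradient_along(real_states, a_model, b_model, theta)
grad_at_model = model_gradient_along(model_states, a_model, b_model, theta)
```

`grad_at_model` carries errors from both the model's derivatives and the model's predicted states. `grad_at_real` keeps only the derivative error, since the states are the ones the real system actually visited.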