This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Replies: 1 comment
@mxnet-label-bot add [Question, Example]
I am referring to the Gluon example: actor critic.
According to the code in actor_critic.py, the true return of each state is calculated directly from the observed rewards of the episode, which is a Monte Carlo method without bootstrapping.
So I think the name should be REINFORCE with Baseline, not Actor Critic, as stated in Section 13.5 of the book Reinforcement Learning: An Introduction.
I also found that PyTorch has the same issue with their example. But anyway, it is just a naming problem. If most people think this should also be treated as Actor Critic, then never mind~
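To make the distinction concrete, here is a minimal sketch in plain Python. It is not the code from actor_critic.py; the function names, the rewards, and the critic's value estimates are all made up for illustration. The first function computes full Monte Carlo returns from the episode's rewards only, and the second computes one-step bootstrapped targets that rely on the critic's estimate of the next state.

```python
def monte_carlo_returns(rewards, gamma):
    """Discounted return G_t for every step, accumulated backwards from the episode end."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def one_step_td_targets(rewards, values, gamma):
    """Bootstrapped targets r_t + gamma * V(s_{t+1}); the value after the final step is 0."""
    next_values = list(values[1:]) + [0.0]
    return [r + gamma * v for r, v in zip(rewards, next_values)]


# Hypothetical three-step episode (illustrative numbers only).
rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.5, 0.5]  # assumed critic estimates V(s_t)
gamma = 0.5

mc = monte_carlo_returns(rewards, gamma)
td = one_step_td_targets(rewards, values, gamma)
print("Monte Carlo returns:", mc)
print("One-step TD targets:", td)
```

Only the second function uses the value estimates to build the target, which is the bootstrapping that, as I read Section 13.5, distinguishes an actor-critic method from REINFORCE with a baseline.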