Replies: 2 comments
-
I'm not sure if I understand your question correctly. But the idea behind it is that, after each …
-
I'm afraid there's a misunderstanding here. The 1:N loop is to collect the performance of each independent rollout, so I don't think anything should be passed to the next iteration.
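For concreteness, the pattern might look roughly like this. It is a minimal sketch, not the notebook's exact code: `create_agent` comes from the discussion below, while `CliffWalkingEnv` and the function name `collect_performance` are stand-ins.

```julia
using ReinforcementLearning, Statistics

# N independent rollouts: each iteration builds a brand-new agent, so
# nothing learned in run m leaks into run m+1; only the scores are kept.
function collect_performance(N, n_episodes)
    rewards = zeros(N, n_episodes)
    for m in 1:N
        env = CliffWalkingEnv()                 # hypothetical constructor
        agent = create_agent(env)               # fresh agent every run
        hook = TotalRewardPerEpisode()
        run(agent, env, StopAfterEpisode(n_episodes), hook)
        rewards[m, :] = hook.rewards            # record this run's curve
    end
    vec(mean(rewards; dims = 1))                # average learning curve
end
```

Averaging over N independent replicas like this smooths out the randomness of any single run; the agents never share state.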
------------------ Original ------------------
From: GrutmanE
Date: Thu, May 20, 2021, 6:02 AM
Subject: Re: [JuliaReinforcementLearning/ReinforcementLearning.jl] Agent training (#299)

Thanks for your reply. Let me clarify some more.
I do not see how the learner gets updated. How does the information from run number m get passed to run number m+1? To rephrase my question: some mutable variable in the scope of the 1:N loop in the repeated_run function must be passed to the learner to make it different from the previous iteration, right?
In the case of tabular Q-learning, there must be a Q matrix with dimensions size(environment) × size(action space) that needs to be maintained and updated.
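Regarding the quoted Q-matrix point: such a table does exist, but it is owned by the agent's learner and mutated in place while a single run proceeds; it is never threaded through the 1:N loop. A minimal hand-rolled illustration (not the package's actual learner types):

```julia
# Minimal tabular Q-learning core: the Q matrix is the learner's own
# mutable state, updated in place after every environment step.
mutable struct TabularQ
    Q::Matrix{Float64}   # n_states × n_actions
    α::Float64           # learning rate
    γ::Float64           # discount factor
end

TabularQ(n_states, n_actions; α = 0.1, γ = 1.0) =
    TabularQ(zeros(n_states, n_actions), α, γ)

# One Q-learning update; nothing is returned to an outer loop,
# the learner simply mutates its own Q table.
function update!(l::TabularQ, s, a, r, s′)
    target = r + l.γ * maximum(l.Q[s′, :])
    l.Q[s, a] += l.α * (target - l.Q[s, a])
end
```

Because each iteration of 1:N constructs a fresh learner with a zeroed Q table, there is nothing to pass from run m to run m+1.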
-
One of the Pluto notebooks in the Zoo (Chapter06_Cliff_Walking.jl) has the following function. It seems repeated_runs uses create_agent (by calling a constructor) to create an identical agent in each iteration of 1:N, but clearly this cannot be so. What is the trick, please? In other words, how is the information passed along?
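The snippet itself did not survive in this thread; a minimal sketch of the shape being described, with `repeated_runs` constructing a fresh agent via `create_agent` inside the 1:N loop (the environment constructor and episode count are assumed names), is:

```julia
# Sketch only: each pass through 1:N constructs a NEW agent, so the
# agents are identical at birth (same zero-initialized Q table) but
# diverge as each one learns during its own run.
function repeated_runs(N, n_episodes)
    hooks = TotalRewardPerEpisode[]
    for _ in 1:N
        env = CliffWalkingEnv()          # hypothetical constructor
        agent = create_agent(env)        # a new agent, not a shared one
        hook = TotalRewardPerEpisode()
        run(agent, env, StopAfterEpisode(n_episodes), hook)
        push!(hooks, hook)               # keep only the performance
    end
    hooks
end
```

(The replies above explain that the runs are deliberately independent: each agent mutates its own learner state in place during its run, so nothing needs to be passed between iterations.)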
P.S.
For completeness, adding the create_agent function:
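The attached definition is also missing here; for flavor, a create_agent along these lines could be written with the package's tabular components. This assumes the v0.8-era API, and names and keyword arguments may differ between versions, so treat it as an illustrative sketch rather than the notebook's code:

```julia
using ReinforcementLearning
using Flux: Descent

# Illustrative sketch: a fresh tabular Q-learning agent whose Q table
# starts at zero, so every call yields an agent with no prior knowledge.
create_agent(env; α = 0.1, ϵ = 0.1) = Agent(
    policy = QBasedPolicy(
        learner = TDLearner(
            approximator = TabularQApproximator(
                n_state = length(state_space(env)),
                n_action = length(action_space(env)),
                opt = Descent(α),            # step size of the TD update
            ),
            method = :SARS,                  # one-step Q-learning target
        ),
        explorer = EpsilonGreedyExplorer(ϵ), # ϵ-greedy action selection
    ),
    trajectory = VectorSARTTrajectory(),
)
```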