cart_pole #7

wsredniawa · 2020-03-21T17:44:47Z

No description provided.

ziemowit-s · 2020-03-24T10:50:53Z

cart_pole_run.py

+        # print(observation)
+        obs,reward,done,info=env.step(action)
+        # print(obs)
+        obs = np.array([1,1,1,1])*100


The obs returned by env.step() is immediately overwritten with a constant array. Why?
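If the constant was only a debugging placeholder, one option might be to pass the real observation through instead. A minimal sketch, assuming the agent wants a non-negative 2x2 input (as the `abs(obs.reshape((2,2)))` call elsewhere in this diff suggests); `encode_observation` is a hypothetical helper name and the sample values are made up:

```python
import numpy as np


def encode_observation(obs):
    """Reshape a 4-element observation into a non-negative 2x2 array.

    Hypothetical helper: mirrors abs(obs.reshape((2, 2))) from the diff.
    """
    obs = np.asarray(obs, dtype=float)
    return np.abs(obs.reshape((2, 2)))


# Stand-in for the array returned by env.step(action); values are made up.
obs = np.array([0.02, -0.5, 0.03, 0.7])
encoded = encode_observation(obs)
```

This keeps the information from the environment instead of feeding the agent the same input on every step.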

ziemowit-s · 2020-03-24T10:52:01Z

cart_pole_run.py

+    for t in range(100):
+        env_vis.append(env.render(mode='rgb_array'))
+        # print(observation)
+        obs,reward,done,info=env.step(action)


Your own reward is overwritten by the env reward on every step, so only the env reward reaches agent.step().

ziemowit-s · 2020-03-24T10:53:56Z

cart_pole_run.py

+        obs,reward,done,info=env.step(action)
+        # print(obs)
+        obs = np.array([1,1,1,1])*100
+        output_spikes_ms=agent.step(observation=abs(obs.reshape((2,2))), reward=reward)


With the new version of Agent, I think we should use:

agent.reward_step() for the reward only

agent.step() for the step only
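A rough sketch of what that split could look like in the loop. The real Agent signatures are not shown in this PR, so `StubAgent` below is a hypothetical stand-in, and the `reward=` / `observation=` keyword arguments are assumptions:

```python
import numpy as np


class StubAgent:
    """Hypothetical stand-in for the project's Agent; real signatures may differ."""

    def __init__(self):
        self.rewards = []
        self.observations = []

    def reward_step(self, reward):
        # Assumed: receives only the reward.
        self.rewards.append(reward)

    def step(self, observation):
        # Assumed: receives only the observation and returns output spike times.
        self.observations.append(observation)
        return np.array([])


def run_one_step(agent, obs, env_reward):
    """Deliver the reward and the observation through separate calls."""
    agent.reward_step(reward=env_reward)
    encoded = np.abs(np.asarray(obs, dtype=float).reshape((2, 2)))
    return agent.step(observation=encoded)


agent = StubAgent()
# Made-up observation and reward standing in for env.step(action) results.
output_spikes_ms = run_one_step(agent, [0.02, -0.5, 0.03, 0.7], 1.0)
```

Keeping the two calls separate also makes it easier to swap in a custom reward later without touching the observation path.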

cart_pole

1a495cd

ziemowit-s force-pushed the master branch from 53ecac7 to ef41799 Compare March 24, 2020 09:48

ziemowit-s requested changes Mar 24, 2020
