Hi,
I have just forked this repo and am trying to run MPE simple spread (3 landmarks, 2 agents) to test for convergence. I only changed "n_rollout_threads" and "n_eval_rollout_threads" to 1; the other parameters are unchanged. I have found two issues:
The reward for MPE simple spread ranges from -1300 to -800.
After 2e6 steps, the reward stays around -800.
I have also tried to visualize the evaluation result using render(), but the agents do not head for the landmarks at all; they just move around in one small area.
Can anyone help with this? It would be really appreciated.
Thank you
Hi, n_rollout_threads needs to be 128, or at least over 50.
Thank you for your reply. My PC is unable to handle 128 parallel environments; currently the maximum I can set is 80. I am now running simple spread (3 landmarks, 3 agents). I have set "episode_length" to 25 (it was accidentally set to 200, which led to average episode rewards of about -800 to -1200), and "num_env_steps" is set to 10e6. However, my average episode reward plateaus slightly above -150 (average episode reward graph is attached); I seem unable to reach -120, which is the current benchmark.
Is it because my agents don't get enough experience, since I only have 80 parallel environments (n_rollout_threads)?
Or are there other critical training parameters that I need to tune?
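For what it's worth, the difference between the two reward ranges you saw is mostly explained by the episode_length change: simple spread's reward accumulates per step, so the average episode reward scales roughly linearly with episode length. A minimal back-of-envelope sketch (my own arithmetic, and `projected_episode_reward` is just an illustrative helper, not something from this repo):

```python
def projected_episode_reward(reward, old_len, new_len):
    """Scale an average episode reward from one episode length to another,
    assuming a roughly constant per-step reward (as in simple spread)."""
    per_step = reward / old_len  # average reward collected per environment step
    return per_step * new_len


# ~-150 over a 25-step episode is about -6 per step:
print(-150 / 25)  # -6.0

# At episode_length=200 the same per-step reward projects to about -1200,
# which matches the -800 to -1200 range observed with the accidental setting:
print(projected_episode_reward(-150, 25, 200))  # -1200.0
```

So the raw episode-reward numbers are only comparable between runs that use the same episode_length.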
Thank you