
MAPPO simple spread not converging #112

Open
zhangmingcheng28 opened this issue Jul 11, 2024 · 2 comments

Comments

@zhangmingcheng28

Hi,
I have just forked this repo, and I am trying to run MPE simple spread (3 landmarks, 2 agents) and test for convergence. I only changed "n_rollout_threads" and "n_eval_rollout_threads" to 1; the other parameters were left unchanged. I have found two issues:

  1. The reward for MPE simple spread has only improved from -1300 to -800.
  2. After 2e6 steps, the reward stays around -800.
    I have tried to visualize the evaluation result using render(), but the agents did not go for the landmarks at all; they just seem to move around in one small area.
    Can anyone help with these matters? Really appreciated.
    Thank you
@zoeyuchao
Member

Hi, n_rollout_threads needs to be 128 or at least over 50.
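For reference, a minimal sketch of a launch command with a larger thread count. The script name here is hypothetical (check the repo's actual entry point), but the flag names match the parameters discussed in this thread:

```shell
# Hypothetical launch command; replace train_mpe.py with the repo's actual
# training script. Flag names follow the parameters named in this thread.
python train_mpe.py \
    --env_name MPE \
    --scenario_name simple_spread \
    --n_rollout_threads 128 \
    --n_eval_rollout_threads 1 \
    --episode_length 25 \
    --num_env_steps 10000000
```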

@zhangmingcheng28
Author

zhangmingcheng28 commented Jul 19, 2024

> Hi, n_rollout_threads needs to be 128 or at least over 50.

Thank you for your reply. My PC is unable to handle 128 parallel environments; currently the maximum I can set is 80. I am still running on simple spread (3 landmarks, 3 agents). I have set "episode_length" to 25 (it was accidentally set to 200, which led to average episode rewards of around -800 to -1200), and "num_env_steps" is set to 10e6. However, my average episode reward is only slightly above -150 (the average episode reward graph is attached):
[W&B chart: average episode reward, 7/19/2024]
I seem unable to achieve -120, which is the current benchmark.
Is it because my agents don't have enough experience, since I only have 80 parallel environments (n_rollout_threads)?
Or are there other critical training parameters that I need to tune?
Thank you
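As a side note on the numbers in this comment: if the per-step reward is roughly constant, the average episode reward scales linearly with episode_length, which is consistent with the two runs reported above. A quick back-of-envelope check (pure arithmetic, not tied to the repo's code):

```python
# Per-step reward implied by the episode_length=25 run (~ -150 per episode).
per_step_reward = -150 / 25          # -6.0 reward per environment step

# The same per-step reward over episode_length=200 predicts ~ -1200,
# which matches the -800 to -1200 range seen in the accidental 200-step run.
predicted_200 = per_step_reward * 200
print(predicted_200)                 # -1200.0
```

This is why episode rewards from runs with different episode_length values are not directly comparable against the benchmark.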
