Can you share training log of DMC envs? #45

Open

return-sleep opened this issue Nov 24, 2023 · 1 comment

@return-sleep
Thank you for your work. Would you be willing to share the training logs, in particular how the reward loss and the image loss change as the number of training steps increases, and what values they eventually converge to? When I ran the algorithm on my own dataset, I noticed that the world model's reward loss looks larger than expected, and I'm not sure whether that is reasonable. The reward loss converges around 0.5, while the reward range is only -1 to 1, so this seems like a relatively large prediction error.
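To put the number in perspective, this is how I sanity-check the error directly in reward units; it is only a minimal sketch, and `reward_dist` / `true_reward` are placeholder names for illustration, not this repository's actual API:

```python
import torch

# Minimal sketch: decode the reward head's prediction and measure the error
# directly in reward units, since the raw loss value (e.g. a negative
# log-likelihood) is not on the same scale as the rewards themselves.
# `reward_dist` and `true_reward` are placeholder names for illustration.
def reward_prediction_error(reward_dist, true_reward):
    pred = reward_dist.mean                           # point prediction per step
    mae = (pred - true_reward).abs().mean()           # mean absolute error
    rmse = ((pred - true_reward) ** 2).mean().sqrt()  # root mean squared error
    return mae.item(), rmse.item()
```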

@NM512 (Owner) commented Jan 8, 2024

Hello,

Thank you for your interest in this work and for sharing your observations regarding the reward loss in your experiments.

I understand your concern about the world model's reward loss appearing larger than expected, especially considering the reward range of -1 to 1. A reward loss converging around 0.5 can indeed be indicative of a significant prediction error.

To address this, I'd like to let you know that I have recently updated the network weight initialization, aligning it with the original repository. This adjustment could influence the training dynamics and loss values you're observing. The details of this update can be found in the recent commit: Network Weight Initialization Adjustment.
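For context, the change involved is a variance-scaling, truncated-normal initializer of the kind used in the original DreamerV3 code; the sketch below is an illustration under that assumption, not the exact code from the commit:

```python
import math
import torch.nn as nn

# Illustrative sketch (not the exact commit): fan-in variance-scaling,
# truncated-normal initialization for linear layers, with zeroed biases.
def trunc_normal_init(module):
    if isinstance(module, nn.Linear):
        fan_in = module.weight.shape[1]
        std = 1.0 / math.sqrt(fan_in)
        nn.init.trunc_normal_(module.weight, mean=0.0, std=std, a=-2 * std, b=2 * std)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Applied once after building the networks, e.g. world_model.apply(trunc_normal_init)
```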

I recommend rerunning your experiments with this latest update. This adjustment may lead to improvements in the reward loss metrics on your dataset.

If you continue to observe unusual reward loss values or are unable to identify the cause, I would be more than willing to assist further. You can share your logs with me for a closer examination. Alternatively, if you're interested in comparing your results with mine, I can provide my training logs from after the modifications mentioned above. In either case, please contact me via e-mail.
