Dynamic programming #12
So how can we implement policy iteration and value iteration?
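For a tabular MDP, both iterations are short to write down. Below is a minimal value-iteration sketch (the function and array names are illustrative, not from this repository); a greedy policy is extracted from the converged Q-values, which is also the policy-improvement step of policy iteration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, tol=1e-8):
    """Tabular value iteration.

    P: transitions, shape (A, S, S), P[a, s, s2] = Pr(s2 | s, a)
    R: rewards, shape (S, A)
    Returns the value function V and a greedy policy.
    """
    V = np.zeros(R.shape[0])
    while True:
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * E[V(s')]
        Q = R + gamma * np.einsum('asn,n->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
```

Policy iteration differs only in that it evaluates the current policy to convergence before each greedy improvement, rather than taking a max backup every sweep.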
@lamare3423 What do you mean by "add a dynamic Bellman equation for the reward function"? Do you want to customize the reward function?
@Souphis I want to understand something: if we use a dynamic reward function, can the agent be more successful? Is that true? For example, how could we customize the reward function for your work? Should we write the code that builds the reward function with dynamic programming in our main function, or code it into the agent we use? Do you have any examples? For instance, how can the reward function be turned into a dynamic reward function with the DDPG algorithm, and does it help?
@lamare3423 Oh, okay, so you want to change the reward function during learning? There are two solutions:
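One common pattern for changing the reward during learning (not necessarily what @Souphis had in mind) is to keep the reward a function of the training step, so its weighting shifts on a schedule. A hypothetical sketch for the obstacle-avoidance setting, with made-up weights:

```python
def dynamic_reward(distance, collision, step, total_steps):
    """Illustrative time-varying reward for a goal-reaching robot.

    Early in training the goal term is down-weighted; it grows linearly
    so the agent is pushed harder toward the target as training proceeds.
    All constants here are placeholders, not tuned values.
    """
    progress = min(step / total_steps, 1.0)
    goal_weight = 0.5 + 0.5 * progress          # grows from 0.5 to 1.0
    collision_penalty = -10.0 if collision else 0.0
    return -goal_weight * distance + collision_penalty
```

Because the environment computes the reward, the agent's update rule does not change; only the data it sees does.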
@Souphis First of all, thank you for all the information. I'm working on a mobile robot that avoids obstacles and drives to a target, and I'm trying to handle the situations I mentioned above with what I've managed so far. I use PyRep and a DDPG agent. I don't know how to make the changes you suggest. What should I change in the agent itself and in its network updates? For example, I created a function called build critic train method in my DDPG agent code; do I need to make changes related to the reward function in that part?
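For what it's worth, in standard DDPG the reward only enters the critic update through the sampled transitions, so a changed reward function usually requires no change to the critic-training code itself. A minimal sketch of the critic target computation (names are illustrative, not taken from this repository):

```python
import numpy as np

def critic_targets(rewards, next_q_values, dones, gamma=0.99):
    """DDPG critic targets from a sampled batch.

    y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1}))   for non-terminal steps
    y_i = r_i                                        at episode ends
    The reward values come from the replay buffer, so redefining the
    environment's reward function changes the data, not this formula.
    """
    return rewards + gamma * next_q_values * (1.0 - dones)
```

The critic is then regressed toward these targets; only if you wanted the *agent itself* to reshape rewards (e.g. add a shaping term inside the update) would the training method need editing.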
Hi sir, I am preparing a master's thesis ("A deep reinforcement learning approach based on dynamic path planning for mobile robots").
How can we add a dynamic Bellman equation for the reward function? It would give us more sensitive rewards.
Thank you.
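One principled way to make rewards "more sensitive" without changing what the Bellman equation converges to is potential-based reward shaping (Ng, Harada, Russell, 1999): adding F(s, s') = gamma * Phi(s') - Phi(s) to the environment reward provably preserves the optimal policy. A sketch, where the choice of potential (negative distance to goal) is just one reasonable option for a navigation task:

```python
def shaped_reward(r, phi_s, phi_s_next, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).

    This adds a dense learning signal while leaving the optimal
    policy of the underlying MDP unchanged.
    """
    return r + gamma * phi_s_next - phi_s

def potential(distance_to_goal):
    # Example potential for a mobile robot: higher (less negative)
    # when closer to the target, so steps toward the goal earn a bonus.
    return -distance_to_goal
```

With this potential, every step that reduces the distance to the target yields a positive shaping term, which is typically what "more sensitive rewards" is after.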