Description
I am confused about several things in the code and how they relate to reproducing the results in the paper.
My first question is why the code uses the reach-avoid value function from "https://github.com/HJReachability/safety_rl/" instead of the traditional HJ value function from "https://ieeexplore.ieee.org/document/8794107". As far as I understand, we are not using a reach-while-avoid (RWA) policy as the controller; we are using the vanilla HJ value function as a safety filter on top of the Dreamer controller. So I am confused about why the RWA controller is used in the code. I expected the HJ value function to be learned with the method from "https://ieeexplore.ieee.org/document/8794107". I may have misunderstood something, so please let me know!
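To make the distinction I am asking about concrete, here is a minimal sketch of the two discounted one-step backups as I understand them. It is not taken from this repository. The function names, argument names, and the sign convention (positive margin means safe or inside the target) are my own assumptions, and `v_next_best` stands for the best next-state value over actions.

```python
def safety_backup(l_safe, v_next_best, gamma):
    """Avoid-only (vanilla HJ safety) discounted backup.

    l_safe:      safety margin at the current state (assumed >= 0 when safe)
    v_next_best: max over actions of the next-state value (assumed given)
    gamma:       discount factor in [0, 1)
    """
    return (1.0 - gamma) * l_safe + gamma * min(l_safe, v_next_best)


def reach_avoid_backup(l_target, g_safe, v_next_best, gamma):
    """Reach-avoid discounted backup in the same sign convention.

    l_target: target margin (assumed >= 0 inside the target set)
    g_safe:   safety margin (assumed >= 0 outside the failure set)
    """
    terminal = min(l_target, g_safe)
    # Succeed by being in the target now or later, but never below g_safe.
    future = min(g_safe, max(l_target, v_next_best))
    return (1.0 - gamma) * terminal + gamma * future


if __name__ == "__main__":
    # Hypothetical numbers, only to show that the two backups differ.
    print(safety_backup(l_safe=1.0, v_next_best=0.5, gamma=0.9))
    print(reach_avoid_backup(l_target=-1.0, g_safe=2.0, v_next_best=0.5, gamma=0.9))
```

The first backup only needs a safety margin, which is what I expected for a safety filter; the second also needs a target set, which is why seeing it in the code surprised me.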
Also, will the code be updated to include the evaluation needed to reproduce the results in the paper? A large part of the code required for that is missing.