-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hyperparameter Optimization Module #164
base: main
Are you sure you want to change the base?
Conversation
…ve unnecessary config in cartpole_stab.yaml 2. add hpo module in test_build.py
…ckpoint in ppo.py. 4. Boolean var in ppo_sampler.
…dd an example of hpo for gpmpc.
… avg return in base_experiment.py. 3. use BaseExperiment class in hpo example 3. add hp study bash script and jupyter notebook for gpmpc.
… config. 3. add done_on_max_steps in base_experiment.py. 4. remove _run() and use BaseExperiment in hpo.
…SQLite. 4. Unittest for HPO for iLQR, PPO, and GPMPC on cartpole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good! I have a couple thoughts though:
-
It seems like there is a lot of controller-specific code. For example, there is a separate sampler for each controller, and in the HPO code, there are if statements depending on the algorithm being optimized. I'm wondering if there is a way to make this more generic by making an HPO yaml more complex and then having the underlying code make classes from the arguments in these yamls? For example, I think that hpo_sampler.py could almost entirely be defined in a yaml and then having a generic class for sampler that parses the yaml appropriately? It feels like there is a lot of repeated code that could be simplified and the addition of future algos simpler?
-
the HPO class is defined in both hpo_optuna.py and hpo_vizier.py which seem to share a lot of code. I think there should really be a parent HPO class, and then child sublcasses for the different use cases?
-
I'm not totally sure why the files are being removed from the examples. Is it because you are replacing them with better hyper parameters?
|
||
class HPO(object): | ||
|
||
def __init__(self, algo, task, load_study, output_dir, task_config, hpo_config, algo_config, safety_filter=None, sf_config=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put each argument on a separate line and provide types and defaults for all if possible.
Gs = np.inf | ||
for i in range(self.hpo_config.repetitions): | ||
# np.random.seed() | ||
seed = np.random.randint(0, 10000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be careful about this for reproducibility. In the envs, we usually create a random seeding object that is used for all randomness. Can we incorporate this somehow to ensure everything is exactly reproducible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see hpo_experiment.py
.
seed = np.random.randint(0, 10000) | ||
# update the agent config with sample candidate hyperparameters | ||
# new agent with the new hps | ||
for hp in sampled_hyperparams: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this more generic, is there a way to make the keys part of the sampled_hyperparams dictionary? Could hp object also include the name like 'q_mpc', 'r_mpc'? Otherwise, this logic has to be updated whenever a new controller is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
from safe_control_gym.utils.registration import make | ||
from safe_control_gym.utils.utils import mkdirs | ||
|
||
class HPO(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is defined in both hpo_optuna and hpo_vizier, naming should be more precise instead of having the same class defined in two places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Adam's comments and believe they should be addressed before the next round of reviews. The changes are too broad to do only one review. In general I think there is too much code, both repeated and non-repeated. I think the gym needs to generally simple and concise for people (other than us) to use it. There is also a lot of commented out code which I don't think should be there in completed code. Also, two stylistic comments: the precommit hooks need to be run to format everything consistently, and the docstrings and comments should follow our standard style, as Adam pointed out.
device: cpu | ||
restore: null | ||
device: cuda | ||
restore: null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there should be a newline at the end of the file. I think the pre-commit hooks handle that. If not, you can configure vs code to always add one when a file is saved, which is what I do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the pre-commit hook should fix this, which means you haven't yet run the hook. This should be done ASAP as it may change a lot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the hooks have still not been run, since there are double quotes here and there.
Looks much cleaner! Nice! I think the samplers could still be made a little more flexible? Let me know if you think this is feasible. |
General comments: to get HPO module fully tested, I am waiting for another PR (quadrotor interface) to be approved. After that I will run unit-test for HPO on new env interface and also run pre-commits hook. |
This PR is created for two primary purposes: