This repository was archived by the owner on May 9, 2025. It is now read-only.

[question] Tuning for GAIL and custom envs with time bottlenecks #92

@prabhasak

Hello. I use SB and the zoo actively for GAIL. My CustomEnv, built using AirSim, trains in (almost) real time, so I have spent months trying to find the right set of hyperparameters (HPs) for GAIL to imitate expert trajectories (generated from an optimal TRPO policy). I have some specific questions regarding TRPO and GAIL:

  1. Since GAIL uses TRPO, I made a copy of the zoo TRPO HPs and called it GAIL. Can I do better? I have had luck imitating simple Gym envs with GAIL, but have had a hard time imitating MuJoCo envs.
  2. Training my CustomEnv for 1e6 timesteps takes ~1.5 days, so I've been avoiding tuning. Would you recommend tuning for GAIL? Do I just copy the TRPO sampler for GAIL? Is there anything else I can do to speed up tuning?
  3. With both a lack of tuned HPs and real-time training, is there any other avenue I can try to get GAIL to work on my CustomEnv?
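
To make question 2 concrete, here is a rough sketch of what "copying the TRPO sampler for GAIL" could look like, together with the wall-clock cost of a tuning run at the training speed quoted above. It uses only the standard library rather than the zoo's own sampler code. The search-space values are purely illustrative and not tuned. The key names are assumed to match the keyword arguments of TRPO and GAIL in SB, and `sample_gail_params` and `tuning_days` are hypothetical helper names.

```python
import random

# Hypothetical TRPO-style search space. Values are illustrative, not tuned.
TRPO_SPACE = {
    "gamma": [0.95, 0.98, 0.99, 0.995, 0.999],
    "timesteps_per_batch": [256, 512, 1024, 2048],
    "max_kl": [0.005, 0.01, 0.02, 0.05],
    "lam": [0.9, 0.95, 0.98, 1.0],
    "vf_stepsize": [1e-4, 3e-4, 1e-3],
}

# Extra discriminator settings, assumed to be GAIL keyword arguments.
GAIL_EXTRA_SPACE = {
    "hidden_size_adversary": [64, 100, 256],
    "adversary_entcoeff": [1e-4, 1e-3, 1e-2],
    "g_step": [1, 3, 5],
    "d_step": [1, 2, 3],
    "d_stepsize": [1e-4, 3e-4, 1e-3],
}


def sample_params(space, rng):
    """Pick one candidate value per hyperparameter, uniformly at random."""
    return {name: rng.choice(choices) for name, choices in space.items()}


def sample_gail_params(rng):
    """Reuse the TRPO space as-is and add the discriminator settings."""
    params = sample_params(TRPO_SPACE, rng)
    params.update(sample_params(GAIL_EXTRA_SPACE, rng))
    return params


def tuning_days(n_trials, steps_per_trial, days_per_million_steps=1.5):
    """Wall-clock days for sequential trials at ~1.5 days per 1e6 steps."""
    return n_trials * steps_per_trial / 1e6 * days_per_million_steps


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled trial is reproducible
    print(sample_gail_params(rng))
    print(tuning_days(n_trials=10, steps_per_trial=100_000))
```

Under these assumptions the cost scales linearly with the total number of timesteps, so ten sequential trials of 1e5 timesteps each cost about as much as one full 1e6-timestep run. Whether such short trials are informative enough for GAIL on this env is exactly what I am unsure about.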

Any help is greatly appreciated. Thank you for these awesome repos!

CustomEnv info:
obs: 6-dim, continuous
action: 3-dim, continuous
rewards: dense, large reward at goal
