[Question] Vectorized custom environments that output (num_envs, obs_size) without stacking #2066
Comments
This is not a hack; it is the way to go for a vectorized env. We have several working examples (most of them already in our docs):
Thanks for the directions. Much appreciated. Can I clarify this paragraph in the docs:
Does it mean that env.step() must itself contain the logic to reset env[i] and return obs[i] as the first observation of a new episode when done[i] is True?
Maybe the easiest way to answer your question is to have a look at stable-baselines3/stable_baselines3/common/vec_env/dummy_vec_env.py, lines 68 to 73 in b7c64a1.
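For readers without the source open, here is a minimal sketch of the auto-reset pattern that file is being pointed to for. It is not the SB3 code itself: `CountdownEnv` and `step_all` are hypothetical names, and the toy env uses a simplified 4-tuple `step()` return. The idea is that when a sub-env reports done, the loop stores the last observation under `info["terminal_observation"]`, resets that sub-env immediately, and returns the reset observation in its place.

```python
class CountdownEnv:
    """Hypothetical toy env: an episode ends after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return float(self.t)

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return float(self.t), 1.0, done, {}


def step_all(envs, actions):
    """Step every sub-env once and auto-reset the ones that finished."""
    obs_batch, rewards, dones, infos = [], [], [], []
    for env, action in zip(envs, actions):
        obs, reward, done, info = env.step(action)
        if done:
            # Keep the true last observation, then start a new episode.
            info["terminal_observation"] = obs
            obs = env.reset()
        obs_batch.append(obs)
        rewards.append(reward)
        dones.append(done)
        infos.append(info)
    return obs_batch, rewards, dones, infos
```

With this pattern the training loop never has to call `reset()` again after the initial one, which is why a `print` inside `reset()` of the vectorized env itself may fire only once.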
I think my customised environment does not lend itself to creating self.envs as a list of individual envs. In any case, I found a workaround using the VecEnvWrapper mentioned in the docs. In my custom env all episodes reset at the same time because the episode length is fixed, so I added a reset() call to the step() function of the VecEnvWrapper, guarded by an if done.all() condition. Without that condition and reset() call, the envs do not reset automatically.
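The code from this comment did not survive in the thread, so the following is only a sketch of the described workaround, not the poster's code. `FixedLengthBatchEnv` and `AutoResetWrapper` are hypothetical names, and the wrapper is duck-typed with NumPy only; with SB3 one would presumably subclass `VecEnvWrapper` and put the same logic in `step_wait()`.

```python
import numpy as np


class FixedLengthBatchEnv:
    """Hypothetical batched env: every episode lasts `episode_len` steps."""

    def __init__(self, num_envs, obs_size, episode_len):
        self.num_envs = num_envs
        self.obs_size = obs_size
        self.episode_len = episode_len
        self.t = 0
        self.reset_calls = 0

    def reset(self):
        self.t = 0
        self.reset_calls += 1
        return np.zeros((self.num_envs, self.obs_size), dtype=np.float32)

    def step(self, actions):
        self.t += 1
        obs = np.full((self.num_envs, self.obs_size), self.t, dtype=np.float32)
        rewards = np.ones(self.num_envs, dtype=np.float32)
        dones = np.full(self.num_envs, self.t >= self.episode_len)
        infos = [{} for _ in range(self.num_envs)]
        return obs, rewards, dones, infos


class AutoResetWrapper:
    """Resets the wrapped env inside step() once all episodes are done."""

    def __init__(self, venv):
        self.venv = venv

    def reset(self):
        return self.venv.reset()

    def step(self, actions):
        obs, rewards, dones, infos = self.venv.step(actions)
        if dones.all():
            # Preserve the final observations before they are replaced.
            for i, info in enumerate(infos):
                info["terminal_observation"] = obs[i].copy()
            obs = self.venv.reset()
        return obs, rewards, dones, infos
```

The `dones.all()` guard only makes sense when all sub-envs finish together; with mixed episode lengths a per-index reset is needed instead.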
❓ Question
I have a question about vectorized custom environments where the step() and reset() functions already return outputs of shape (n_envs, obs_size) or (n_envs,) for states, rewards, dones, etc.
Reading the documentation, all the helper functions seem to be built for stacking multiple envs that run independently. I have tried a hack where I inherit from VecEnv. Although it runs, reset() is never called again even when the dones are True. Running the code, you will see that reset(), which contains a print('reset'), is called only once despite episodes ending.
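One way to get the expected behaviour in a natively batched env is to do the reset per index inside `step()` itself. The sketch below is an illustration under assumptions, not the original sample code: `BatchedCounterEnv` and its `episode_lens` parameter are hypothetical, and in SB3 this logic would presumably live in `step_wait()` of the `VecEnv` subclass.

```python
import numpy as np


class BatchedCounterEnv:
    """Hypothetical natively vectorized env with per-index auto-reset."""

    def __init__(self, num_envs, obs_size, episode_lens):
        self.num_envs = num_envs
        self.obs_size = obs_size
        self.episode_lens = np.asarray(episode_lens)
        self.t = np.zeros(num_envs, dtype=np.int64)

    def _observe(self):
        # Observation of shape (num_envs, obs_size): the step counter, repeated.
        return np.repeat(self.t[:, None], self.obs_size, axis=1).astype(np.float32)

    def reset(self):
        self.t[:] = 0
        return self._observe()

    def step(self, actions):
        self.t += 1
        rewards = np.ones(self.num_envs, dtype=np.float32)
        dones = self.t >= self.episode_lens
        last_obs = self._observe()
        infos = [
            {"terminal_observation": last_obs[i]} if dones[i] else {}
            for i in range(self.num_envs)
        ]
        # Reset only the finished sub-envs; the others keep running.
        self.t[dones] = 0
        return self._observe(), rewards, dones, infos
```

Here `obs[i]` returned alongside `dones[i] == True` is already the first observation of the next episode, so no external `reset()` call is required.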
My question is