
RL-Zoo3 v1.8.0: New Documentation, OpenRL Benchmark, Multi-Env HerReplayBuffer

@araffin araffin released this 08 Apr 16:09
483319b

Release 1.8.0 (2023-04-07)

We ran a large, open-source benchmark of all algorithms on all environments from the RL Zoo: Open RL Benchmark

New documentation: https://rl-baselines3-zoo.readthedocs.io/en/master/

Warning
Stable-Baselines3 (SB3) v1.8.0 will be the last one to use Gym as a backend.
Starting with v2.0.0, Gymnasium will be the default backend (though SB3 will have compatibility layers for Gym envs).
You can find a migration guide here.
If you want to try the SB3 v2.0 alpha version, you can take a look at PR #1327.

Breaking Changes

  • Upgraded to SB3 >= 1.8.0
  • Upgraded to new HerReplayBuffer implementation that supports multiple envs
  • Removed TimeFeatureWrapper for Panda and Fetch envs, as the new replay buffer should handle timeout.
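Configurations written for the previous HerReplayBuffer may still pass `online_sampling` or `max_episode_length`, which this release removes (see the last item under Other). A minimal sketch of how such a kwargs dictionary could be cleaned up is shown below. The helper `migrate_her_kwargs` and the constant `REMOVED_HER_KWARGS` are hypothetical names introduced for illustration and are not part of the RL Zoo or SB3; the example values are illustrative too.

```python
# Arguments that the release notes list as removed for HerReplayBuffer.
# (Hypothetical constant, introduced only for this sketch.)
REMOVED_HER_KWARGS = ("online_sampling", "max_episode_length")


def migrate_her_kwargs(old_kwargs: dict) -> dict:
    """Return a copy of old_kwargs without the removed arguments.

    Hypothetical helper: it only filters dictionary keys and does not
    import or call Stable-Baselines3.
    """
    return {k: v for k, v in old_kwargs.items() if k not in REMOVED_HER_KWARGS}


# Illustrative pre-1.8.0 style replay buffer kwargs.
old_kwargs = {
    "online_sampling": True,
    "goal_selection_strategy": "future",
    "n_sampled_goal": 4,
}
new_kwargs = migrate_her_kwargs(old_kwargs)
print(new_kwargs)
```

The original dictionary is left untouched, so the old configuration can still be inspected or logged while the filtered copy is the one passed on.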

New Features

  • Tuned hyperparameters for RecurrentPPO on Swimmer
  • Documentation is now built using Sphinx and hosted on Read the Docs
  • Open RL Benchmark

Bug fixes

  • Set highway-env version to 1.5 and setuptools to v65.5 for the CI
  • Removed use_auth_token for push to hub util
  • Reverted from v3 to v2 for HumanoidStandup, Reacher, InvertedPendulum and InvertedDoublePendulum since they were not part of the mujoco refactoring (see openai/gym#1304)
  • Fixed gym-minigrid policy (from MlpPolicy to MultiInputPolicy)

Documentation

Other

  • Added support for ruff (fast alternative to flake8) in the Makefile
  • Removed Gitlab CI file
  • Replaced deprecated optuna.suggest_loguniform(...) by optuna.suggest_float(..., log=True)
  • Switched to ruff and pyproject.toml
  • Removed online_sampling and max_episode_length arguments when using HerReplayBuffer
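For readers unfamiliar with the `log=True` option mentioned in the Optuna item above, the following is a minimal sketch of what log-uniform sampling means: the logarithm of the value is drawn uniformly, so each order of magnitude is about equally likely. It uses only the Python standard library and is not Optuna's implementation; `sample_log_uniform` is a hypothetical name, and the bounds are illustrative. On an Optuna trial object, the equivalent request is typically written as `trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)`.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform in [log(low), log(high)].

    Hypothetical helper for illustration only, not Optuna's sampler.
    """
    if not (0 < low <= high):
        raise ValueError("log-uniform sampling needs 0 < low <= high")
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(1000)]

# Share of samples below 1e-3, the midpoint of the range in log space.
fraction_below_mid = sum(s < 1e-3 for s in samples) / len(samples)
print(fraction_below_mid)
```

Because the range spans four orders of magnitude, roughly half of the samples should fall below `1e-3`, whereas a plain uniform draw over the same bounds would almost never produce such small values.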