
rlberry-v0.3.0

Released by @TimotheeMathieu on 03 Jun.

Release of version 0.3.0 of rlberry.

New in 0.3.0

PR #206

  • Creation of a Deep RL tutorial in the user guide.

PR #132

  • New tracker class rlberry.agents.bandit.tools.BanditTracker to track statistics used by bandit algorithms (see the sketch below).
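The exact BanditTracker API isn't reproduced here; below is a minimal, self-contained stand-in (hypothetical class and method names) illustrating the kind of per-arm statistics such a tracker maintains:

```python
import numpy as np

# Hypothetical stand-in for rlberry.agents.bandit.tools.BanditTracker:
# the real class's API may differ; this only illustrates the statistics
# a bandit tracker typically maintains.
class MiniTracker:
    def __init__(self, n_arms):
        self.n_pulls = np.zeros(n_arms, dtype=int)  # pulls per arm
        self.sum_rewards = np.zeros(n_arms)         # cumulative reward per arm
        self.t = 0                                  # total number of plays

    def update(self, arm, reward):
        self.n_pulls[arm] += 1
        self.sum_rewards[arm] += reward
        self.t += 1

    def mean(self, arm):
        # Empirical mean reward of an arm (0 if never pulled).
        return self.sum_rewards[arm] / max(1, self.n_pulls[arm])
```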

PR #191

  • Option to generate a profile with rlberry.manager.AgentManager (see the profiling sketch below).
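As an illustration, a profile of an AgentManager run can also be produced with the standard library's cProfile; this is generic Python profiling, not the new built-in option, whose exact name isn't shown here. The DQNAgent import path is an assumption:

```python
import cProfile
import pstats

from rlberry.envs import gym_make
from rlberry.manager import AgentManager
from rlberry.agents.torch import DQNAgent  # import path assumed

env = (gym_make, dict(id="CartPole-v1"))
manager = AgentManager(DQNAgent, env, fit_budget=10_000, n_fit=1)

# Profile the fit and print the 10 most expensive calls.
cProfile.run("manager.fit()", "fit.prof")
pstats.Stats("fit.prof").sort_stats("cumtime").print_stats(10)
```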

PR #148, #161, #180

  • Miscellaneous improvements to A2C.
  • New Stable-Baselines3 wrapper rlberry.agents.stable_baselines.StableBaselinesAgent to use Stable-Baselines3 agents within rlberry (see the sketch below).
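A minimal sketch of the wrapper, following the pattern in the rlberry docs; the algo_cls/policy init kwargs are assumptions, as is the fit budget:

```python
from stable_baselines3 import A2C
from rlberry.envs import gym_make
from rlberry.agents.stable_baselines import StableBaselinesAgent
from rlberry.manager import AgentManager

env = (gym_make, dict(id="CartPole-v1"))
manager = AgentManager(
    StableBaselinesAgent,
    env,
    init_kwargs=dict(algo_cls=A2C, policy="MlpPolicy"),  # kwarg names assumed
    fit_budget=10_000,
    n_fit=1,
)
manager.fit()  # trains the wrapped Stable-Baselines3 A2C
```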

PR #119

  • Improved documentation for agents.torch.utils.
  • New replay buffer rlberry.agents.utils.replay.ReplayBuffer, aiming to replace the code in utils/memories.py (see the sketch after this list).
  • New DQN implementation, aiming to fix reproducibility and compatibility issues.
  • Implement Q(lambda) in the DQN agent.
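A sketch of the new replay buffer, following the usage pattern in the rlberry docs; the method names (setup_entry, append, end_episode, sample) and constructor signature are assumed from that pattern:

```python
import numpy as np
from rlberry.agents.utils import replay
from rlberry.envs import gym_make

env = gym_make("CartPole-v1")
rng = np.random.default_rng(42)
buffer = replay.ReplayBuffer(max_replay_size=100_000, rng=rng)
buffer.setup_entry("observations", np.float32)
buffer.setup_entry("actions", np.uint32)
buffer.setup_entry("rewards", np.float32)

# Fill the buffer with random-policy transitions.
observation = env.reset()
for _ in range(500):
    action = env.action_space.sample()
    next_observation, reward, done, _ = env.step(action)
    buffer.append(
        {"observations": observation, "actions": action, "rewards": reward}
    )
    observation = next_observation
    if done:
        buffer.end_episode()
        observation = env.reset()

# Sample 32 sub-trajectories of length 10 each.
batch = buffer.sample(batch_size=32, chunk_size=10)
```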

Feb 22, 2022 (PR #126)

  • Set up rlberry.__version__ (currently 0.3.0dev0).
  • Record the rlberry version in an AgentManager attribute.
  • Override the __eq__ method of the AgentManager class to allow checking equality of AgentManagers (a generic sketch of the pattern follows).
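A generic sketch of the pattern (hypothetical class, not rlberry's actual implementation): record the library version as an attribute and take it into account in __eq__.

```python
import rlberry

class VersionedObject:  # hypothetical illustration
    def __init__(self, config):
        self.config = config
        self.rlberry_version = rlberry.__version__  # recorded at creation

    def __eq__(self, other):
        # Two objects are equal only if built with the same config
        # under the same rlberry version.
        return (
            isinstance(other, VersionedObject)
            and self.config == other.config
            and self.rlberry_version == other.rlberry_version
        )
```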

Feb 14-15, 2022 (PR #97, #118)

  • (feat) Add basic bandit environments and agents. See rlberry.agents.bandits.IndexAgent and rlberry.envs.bandits.Bandit.
  • Thompson Sampling bandit algorithm with Gaussian or Beta prior.
  • Base class for bandit algorithms with custom save & load functions, rlberry.agents.bandits.BanditWithSimplePolicy. (A self-contained index-computation sketch follows this list.)
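To illustrate what an index agent computes, without relying on rlberry's exact tracker API, here is a self-contained UCB-style index loop on a toy Gaussian bandit; everything below is illustrative, not library code:

```python
import numpy as np

def ucb_index(sum_rewards, n_pulls, t, c=2.0):
    # Empirical mean plus an exploration bonus that shrinks with pulls.
    means = sum_rewards / np.maximum(1, n_pulls)
    bonus = np.sqrt(c * np.log(max(t, 2)) / np.maximum(1, n_pulls))
    return means + bonus

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])      # toy 3-armed Gaussian bandit
sum_rewards = np.zeros(3)
n_pulls = np.zeros(3)
for t in range(1, 1001):
    arm = int(np.argmax(ucb_index(sum_rewards, n_pulls, t)))
    reward = rng.normal(true_means[arm])    # Gaussian feedback
    sum_rewards[arm] += reward
    n_pulls[arm] += 1
print(n_pulls)  # the best arm (index 2) should dominate
```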

Feb 11, 2022 (PR #83, #95)

  • (fix) Fixed a bug in FiniteMDP.sample(): the terminal state was checked with self.state instead of the given state.
  • (feat) Option to use 'fork' or 'spawn' in rlberry.manager.AgentManager.
  • (feat) AgentManager output_dir now has a timestamp and a short ID by default.
  • (feat) Gridworld can be constructed from a string layout.
  • (feat) New max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method (see the sketch after this list).
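A sketch combining the two process-related options; the mp_context kwarg name for choosing 'fork'/'spawn' is an assumption, as is the DQNAgent import path:

```python
from rlberry.envs import gym_make
from rlberry.manager import AgentManager
from rlberry.agents.torch import DQNAgent  # import path assumed

env = (gym_make, dict(id="CartPole-v1"))
manager = AgentManager(
    DQNAgent,
    env,
    fit_budget=10_000,
    n_fit=4,               # train 4 instances of the agent
    max_workers=2,         # at most 2 concurrent processes/threads
    mp_context="spawn",    # kwarg name assumed for the 'fork'/'spawn' option
)
manager.fit()
```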

Feb 04, 2022

  • Add rlberry.manager.read_writer_data to load an agent's writer data from pickle files and make it simpler to customize rlberry.manager.plot_writer_data (see the sketch after this list).
  • Fix bug: DQN should take a tuple as environment.
  • Add a quickstart tutorial to the docs.
  • Add the tabular RLSVI algorithm, rlberry.agents.RLSVIAgent.
  • Add the Posterior Sampling for Reinforcement Learning (PSRL) agent for tabular MDPs, rlberry.agents.PSRLAgent.
  • Add a page in the docs to help contributors.
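A sketch of the writer-data workflow; the Chain export, the "episode_rewards" tag, and the title kwarg are assumptions:

```python
from rlberry.envs import Chain               # simple finite MDP, export assumed
from rlberry.agents import RLSVIAgent
from rlberry.manager import AgentManager, plot_writer_data

env = (Chain, dict())
manager = AgentManager(RLSVIAgent, env, fit_budget=1_000, n_fit=2)
manager.fit()

# Plot a metric logged by the agents' writers during fit.
plot_writer_data(manager, tag="episode_rewards", title="RLSVI on Chain")
```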