
rlberry-v0.2.1

Released by @omardrwch on 19 Nov 11:42

New in v0.2

Improved interface and tools for parallel execution (#50)

  • AgentStats renamed to AgentManager.
  • AgentManager can handle agents that cannot be pickled.
  • The Agent interface now requires an eval() method instead of policy(), to handle more general agents (e.g., reward-free agents, POMDPs); see the sketch after this list.
  • Multi-processing and multi-threading are now done with ProcessPoolExecutor and ThreadPoolExecutor (allowing, for example, nested processes). Processes are created with spawn (JAX does not work with fork, see #51).
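
To illustrate the new interface, here is a minimal sketch of an agent written against it. It assumes the Agent base class exposes self.env, self.eval_env and an optional writer attribute (the latter is described in the v0.2.1 notes below), and it follows the gym step API of the time; the agent itself is a toy that acts at random.

```python
import numpy as np

from rlberry.agents import Agent


class RandomAgent(Agent):
    """Toy agent illustrating the new interface: eval() replaces policy()."""

    name = "RandomAgent"

    def __init__(self, env, **kwargs):
        Agent.__init__(self, env, **kwargs)

    def fit(self, budget, **kwargs):
        # Interact with self.env for `budget` timesteps (no actual learning here).
        observation = self.env.reset()
        episode_reward, episode = 0.0, 0
        for _ in range(budget):
            action = self.env.action_space.sample()
            observation, reward, done, info = self.env.step(action)
            episode_reward += reward
            if done:
                # Log to the writer when one is available (see the v0.2.1 notes below).
                if self.writer is not None:
                    self.writer.add_scalar("episode_rewards", episode_reward, episode)
                observation = self.env.reset()
                episode_reward, episode = 0.0, episode + 1

    def eval(self, n_simulations=5, **kwargs):
        # Return the average episodic reward over a few Monte Carlo rollouts.
        episode_rewards = np.zeros(n_simulations)
        for sim in range(n_simulations):
            observation = self.eval_env.reset()
            done = False
            while not done:
                action = self.eval_env.action_space.sample()
                observation, reward, done, info = self.eval_env.step(action)
                episode_rewards[sim] += reward
        return episode_rewards.mean()
```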

New experimental features (see #51, #62)

  • JAX implementation of DQN and of a replay buffer based on Reverb.
  • rlberry.network: server and client interfaces to exchange messages via sockets.
  • RemoteAgentManager to train agents on a remote server and gather the results locally (using rlberry.network).

Logging and rendering:

  • Data logging with the new DefaultWriter, and improved evaluation and plotting methods in rlberry.manager.evaluation; see the sketch after this list.
  • Fixed a rendering bug with OpenGL (bf606b4).
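
A minimal sketch of the evaluation workflow, reusing the RandomAgent sketched above. It assumes that AgentManager accepts the environment as a (constructor, kwargs) tuple and that evaluate_agents is re-exported from rlberry.manager (it is defined in rlberry.manager.evaluation per the note above); the GridWorld import path and arguments are illustrative and may differ.

```python
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager, evaluate_agents

# Environment passed as (constructor, kwargs), so each worker process can build its own copy.
env = (GridWorld, dict(nrows=5, ncols=5))

manager = AgentManager(
    RandomAgent,          # the toy agent sketched above
    train_env=env,
    fit_budget=1000,      # forwarded to Agent.fit()
    n_fit=4,              # four independent instances, trained in parallel (spawned processes)
    seed=42,
)
manager.fit()

# evaluate_agents() calls eval() on each trained instance and plots the results.
evaluate_agents([manager], n_simulations=10, show=True)
```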

Bug fixes.

New in v0.2.1 (#65)

Features:

  • Agent and AgentManager both have a unique_id attribute (useful for creating unique output files/directories).
  • DefaultWriter is now initialized in the base class Agent and (optionally) wraps a tensorboard SummaryWriter.
  • AgentManager has an option enable_tensorboard that activates tensorboard logging in each of its agents (through their writer attribute). The tensorboard log_dirs are assigned automatically by AgentManager.
  • RemoteAgentManager receives the tensorboard data created on the server when the method get_writer_data() is called. This is done via a zip-file transfer with rlberry.network.
  • BaseWrapper and gym_make now have an option wrap_spaces. If set to True, this option converts gym.spaces to rlberry.spaces, which provides classes with better seeding (using numpy's default_rng instead of RandomState).
  • AgentManager: new method get_agent_instances() that returns the trained agent instances.
  • plot_writer_data: possibility to set xtag (the tag used for the x-axis); see the sketch after this list.
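
Putting the v0.2.1 additions together, a minimal sketch reusing the RandomAgent and the environment tuple from the sketches above. Apart from enable_tensorboard, unique_id, get_agent_instances() and xtag, which are named in these notes, the keyword names and logged tags are assumptions.

```python
from rlberry.manager import AgentManager, plot_writer_data

manager = AgentManager(
    RandomAgent,
    train_env=env,               # (constructor, kwargs) tuple, as above
    fit_budget=1000,
    n_fit=2,
    enable_tensorboard=True,     # each instance also writes tensorboard logs
    seed=123,
)
manager.fit()

# unique_id is used to build unique output files/directories.
print(manager.unique_id)

# Retrieve the trained agent instances directly.
agents = manager.get_agent_instances()

# Plot data collected by DefaultWriter; the tag must match something the agents logged
# (here, the "episode_rewards" scalar from the toy agent), and xtag selects the x-axis
# column (assumed here to be "global_step" in the writer's data).
plot_writer_data(manager, tag="episode_rewards", xtag="global_step", show=True)
```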

Bug fixes:

  • Fixed an agent initialization bug in AgentHandler (eval_env was missing in the kwargs passed to agent_class).