
Commit cb05540

[RLlib] Cleanup examples folder #1. (ray-project#44067)
1 parent 1dbcacb commit cb05540

File tree: 104 files changed, +3973 additions, −6516 deletions


.buildkite/rllib.rayci.yml

Lines changed: 0 additions & 11 deletions
@@ -106,17 +106,6 @@ steps:
       --test-env=RLLIB_NUM_GPUS=1
     depends_on: rllibgpubuild
 
-  - label: ":brain: rllib: rlmodule tests"
-    tags: rllib_directly
-    instance_type: large
-    commands:
-      - bazel run //ci/ray_ci:test_in_docker -- //rllib/... rllib
-        --parallelism-per-worker 3
-        --only-tags rlm
-        --test-env RLLIB_ENABLE_RL_MODULE=1
-        --test-env RAY_USE_MULTIPROCESSING_CPU_COUNT=1
-    depends_on: rllibbuild
-
   - label: ":brain: rllib: data tests"
     if: build.branch != "master"
     tags: data

doc/source/ray-overview/getting-started.md

Lines changed: 1 addition & 1 deletion
@@ -303,7 +303,7 @@ pip install -U "ray[rllib]" tensorflow  # or torch
 ```
 ````
 
-```{literalinclude} ../../../rllib/examples/documentation/rllib_on_ray_readme.py
+```{literalinclude} ../rllib/doc_code/rllib_on_ray_readme.py
 :end-before: __quick_start_end__
 :language: python
 :start-after: __quick_start_begin__
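
The snippet pulled into the Getting Started page here is RLlib's PPO-on-CartPole quick start. Below is a minimal sketch of that pattern for orientation only; it is written against the Ray 2.x `PPOConfig` API and is an approximation, not the verbatim contents of `rllib/doc_code/rllib_on_ray_readme.py`.

```python
# Hedged sketch of the RLlib quick-start pattern (PPO on CartPole); an
# approximation, not the exact contents of rllib_on_ray_readme.py.
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")       # built-in Gymnasium env
    .rollouts(num_rollout_workers=2)  # two parallel sampling workers
    .framework("torch")
)
algo = config.build()

# Train for a few iterations and report the mean episode return.
for _ in range(3):
    result = algo.train()
    print(result["episode_reward_mean"])

algo.stop()
```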

doc/source/rllib/package_ref/env.rst

Lines changed: 3 additions & 3 deletions
@@ -29,11 +29,11 @@ For example, if you provide a custom `gym.Env <https://github.com/openai/gym>`_
 
 Here is a simple example:
 
-.. literalinclude:: ../../../../rllib/examples/documentation/custom_gym_env.py
+.. literalinclude:: ../doc_code/custom_gym_env.py
    :language: python
 
-.. start-after: __sphinx_doc_model_construct_1_begin__
-.. end-before: __sphinx_doc_model_construct_1_end__
+.. start-after: __rllib-custom-gym-env-begin__
+.. end-before: __rllib-custom-gym-env-end__
 
 However, you may also conveniently sub-class any of the other supported RLlib-specific
 environment types. The automated paths from those env types (or callables returning instances of those types) to
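
The "simple example" the page points at lives in `doc_code/custom_gym_env.py`; as a hedged sketch of the same idea (the toy env class and its reward logic below are made up for illustration), a custom `gymnasium.Env` can be handed to RLlib directly:

```python
# Hedged sketch: define a toy gymnasium.Env and train on it with RLlib.
# The env itself ("GuessTheTargetEnv") is illustrative, not from the docs.
import gymnasium as gym
import numpy as np
from ray.rllib.algorithms.ppo import PPOConfig


class GuessTheTargetEnv(gym.Env):
    """Agent sees a target value and is rewarded for matching it."""

    def __init__(self, env_config=None):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, (1,), np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, (1,), np.float32)
        self._target = np.zeros(1, np.float32)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        self._target = self.observation_space.sample()
        self._t = 0
        return self._target, {}

    def step(self, action):
        # Negative distance between action and the shown target as reward.
        reward = -float(np.abs(action[0] - self._target[0]))
        self._t += 1
        truncated = self._t >= 10
        self._target = self.observation_space.sample()
        return self._target, reward, False, truncated, {}


# RLlib instantiates the class itself, passing the env_config dict.
config = PPOConfig().environment(GuessTheTargetEnv, env_config={})
algo = config.build()
print(algo.train()["episode_reward_mean"])
```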

doc/source/rllib/rllib-connector.rst

Lines changed: 2 additions & 2 deletions
@@ -236,7 +236,7 @@ With connectors essentially checkpointing all the transformations used during tr
 policies can be easily restored without the original algorithm for local inference,
 as demonstrated by the following Cartpole example:
 
-.. literalinclude:: ../../../rllib/examples/connectors/v1/run_connector_policy.py
+.. literalinclude:: ../../../rllib/examples/_old_api_stack/connectors/run_connector_policy.py
    :language: python
    :start-after: __sphinx_doc_begin__
    :end-before: __sphinx_doc_end__
@@ -255,7 +255,7 @@ different environments to work together at the same time.
 Here is an example demonstrating adaptation of a policy trained for the standard Cartpole environment
 for a new mock Cartpole environment that returns additional features and requires extra action inputs.
 
-.. literalinclude:: ../../../rllib/examples/connectors/v1/adapt_connector_policy.py
+.. literalinclude:: ../../../rllib/examples/_old_api_stack/connectors/adapt_connector_policy.py
    :language: python
    :start-after: __sphinx_doc_begin__
    :end-before: __sphinx_doc_end__
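
Both referenced examples revolve around the same pattern: restore a trained policy from a checkpoint and run it locally, with its saved connectors reproducing the preprocessing used during training. A hedged, simplified sketch of that pattern follows; the checkpoint path is a placeholder and the connector-specific wiring from the example files is omitted.

```python
# Hedged sketch: restore a single policy from an Algorithm checkpoint and run
# it for local inference. The checkpoint path below is a placeholder.
import gymnasium as gym
from ray.rllib.policy.policy import Policy

policy = Policy.from_checkpoint(
    "/tmp/cartpole_ckpt/policies/default_policy"  # placeholder path
)

env = gym.make("CartPole-v1")
obs, _ = env.reset()
terminated = truncated = False
episode_return = 0.0
while not (terminated or truncated):
    # Policy.compute_single_action returns (action, rnn_state_out, extra_info).
    action, _, _ = policy.compute_single_action(obs, explore=False)
    obs, reward, terminated, truncated, _ = env.step(action)
    episode_return += reward
print("episode return:", episode_return)
```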

doc/source/rllib/rllib-examples.rst

Lines changed: 28 additions & 28 deletions
@@ -14,23 +14,8 @@ Tuned Examples
 --------------
 
 - `Tuned examples <https://github.com/ray-project/ray/blob/master/rllib/tuned_examples>`__:
-  Collection of tuned hyperparameters by algorithm.
-- `MuJoCo and Atari benchmarks <https://github.com/ray-project/rl-experiments>`__:
-  Collection of reasonably optimized Atari and MuJoCo results.
+  Collection of tuned hyperparameters sorted by algorithm.
 
-Blog Posts
-----------
-
-- `Attention Nets and More with RLlib’s Trajectory View API <https://medium.com/distributed-computing-with-ray/attention-nets-and-more-with-rllibs-trajectory-view-api-d326339a6e65>`__:
-  This blog describes RLlib's new "trajectory view API" and how it enables implementations of GTrXL (attention net) architectures.
-- `Reinforcement Learning with RLlib in the Unity Game Engine <https://medium.com/distributed-computing-with-ray/reinforcement-learning-with-rllib-in-the-unity-game-engine-1a98080a7c0d>`__:
-  A how-to on connecting RLlib with the Unity3D game engine for running visual- and physics-based RL experiments.
-- `Lessons from Implementing 12 Deep RL Algorithms in TF and PyTorch <https://medium.com/distributed-computing-with-ray/lessons-from-implementing-12-deep-rl-algorithms-in-tf-and-pytorch-1b412009297d>`__:
-  Discussion on how we ported 12 of RLlib's algorithms from TensorFlow to PyTorch and what we learnt on the way.
-- `Scaling Multi-Agent Reinforcement Learning <http://bair.berkeley.edu/blog/2018/12/12/rllib>`__:
-  This blog post is a brief tutorial on multi-agent RL and its design in RLlib.
-- `Functional RL with Keras and TensorFlow Eager <https://medium.com/riselab/functional-rl-with-keras-and-tensorflow-eager-7973f81d6345>`__:
-  Exploration of a functional paradigm for implementing reinforcement learning (RL) algorithms.
 
 Environments and Adapters
 -------------------------
@@ -47,7 +32,7 @@ Environments and Adapters
 Custom- and Complex Models
 --------------------------
 
-- `Custom Keras model <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_keras_model.py>`__:
+- `Custom Keras model <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/custom_keras_model.py>`__:
   Example of using a custom Keras model.
 - `Registering a custom model with supervised loss <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_model_loss_and_metrics.py>`__:
   Example of defining and registering a custom model with a supervised loss.
@@ -83,9 +68,9 @@ Training Workflows
 
 Evaluation:
 -----------
-- `Custom evaluation function <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_eval.py>`__:
+- `Custom evaluation function <https://github.com/ray-project/ray/blob/master/rllib/examples/evaluation/custom_evaluation.py>`__:
   Example of how to write a custom evaluation function that is called instead of the default behavior, which is running with the evaluation worker set through n episodes.
-- `Parallel evaluation and training <https://github.com/ray-project/ray/blob/master/rllib/examples/parallel_evaluation_and_training.py>`__:
+- `Parallel evaluation and training <https://github.com/ray-project/ray/blob/master/rllib/examples/evaluation/evaluation_parallel_to_training.py>`__:
   Example showing how the evaluation workers and the "normal" rollout workers can run (to some extend) in parallel to speed up training.
 
 
@@ -113,23 +98,23 @@ Serving and Offline
 Multi-Agent and Hierarchical
 ----------------------------
 
-- `Simple independent multi-agent setup vs a PettingZoo env <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_independent_learning.py>`__:
+- `Simple independent multi-agent setup vs a PettingZoo env <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/independent_learning.py>`__:
   Setup RLlib to run any algorithm in (independent) multi-agent mode against a multi-agent environment.
-- `More complex (shared-parameter) multi-agent setup vs a PettingZoo env <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_parameter_sharing.py>`__:
+- `More complex (shared-parameter) multi-agent setup vs a PettingZoo env <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/parameter_sharing.py>`__:
   Setup RLlib to run any algorithm in (shared-parameter) multi-agent mode against a multi-agent environment.
-- `Rock-paper-scissors <https://github.com/ray-project/ray/blob/master/rllib/examples/rock_paper_scissors_multiagent.py>`__:
+- `Rock-paper-scissors <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/rock_paper_scissors.py>`__:
   Example of different heuristic and learned policies competing against each other in rock-paper-scissors.
-- `Two-step game <https://github.com/ray-project/ray/blob/master/rllib/examples/two_step_game.py>`__:
+- `Two-step game <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/two_step_game.py>`__:
   Example of the two-step game from the `QMIX paper <https://arxiv.org/pdf/1803.11485.pdf>`__.
 - `PettingZoo multi-agent example <https://github.com/Farama-Foundation/PettingZoo/blob/master/tutorials/Ray/rllib_pistonball.py>`__:
   Example on how to use RLlib to learn in `PettingZoo <https://www.pettingzoo.ml>`__ multi-agent environments.
 - `PPO with centralized critic on two-step game <https://github.com/ray-project/ray/blob/master/rllib/examples/centralized_critic.py>`__:
   Example of customizing PPO to leverage a centralized value function.
 - `Centralized critic in the env <https://github.com/ray-project/ray/blob/master/rllib/examples/centralized_critic_2.py>`__:
   A simpler method of implementing a centralized critic by augmentating agent observations with global information.
-- `Hand-coded policy <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_custom_policy.py>`__:
+- `Hand-coded policy <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/custom_heuristic_rl_module.py>`__:
   Example of running a custom hand-coded policy alongside trainable policies.
-- `Weight sharing between policies <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_cartpole.py>`__:
+- `Weight sharing between policies <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_and_self_play/multi_agent_cartpole.py>`__:
   Example of how to define weight-sharing layers between two different policies.
 - `Multiple algorithms <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_two_trainers.py>`__:
   Example of alternating training between DQN and PPO.
@@ -140,11 +125,11 @@ Multi-Agent and Hierarchical
 Special Action- and Observation Spaces
 --------------------------------------
 
-- `Nested action spaces <https://github.com/ray-project/ray/blob/master/rllib/examples/nested_action_spaces.py>`__:
+- `Nested action spaces <https://github.com/ray-project/ray/blob/master/rllib/examples/connectors/connector_v2_nested_action_spaces.py>`__:
   Learning in arbitrarily nested action spaces.
-- `Parametric actions <https://github.com/ray-project/ray/blob/master/rllib/examples/parametric_actions_cartpole.py>`__:
+- `Parametric actions <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/parametric_actions_cartpole.py>`__:
   Example of how to handle variable-length or parametric action spaces.
-- `Using the "Repeated" space of RLlib for variable lengths observations <https://github.com/ray-project/ray/blob/master/rllib/examples/complex_struct_space.py>`__:
+- `Using the "Repeated" space of RLlib for variable lengths observations <https://github.com/ray-project/ray/blob/master/rllib/examples/_old_api_stack/complex_struct_space.py>`__:
   How to use RLlib's `Repeated` space to handle variable length observations.
 - `Autoregressive action distribution example <https://github.com/ray-project/ray/blob/master/rllib/examples/autoregressive_action_dist.py>`__:
   Learning with auto-regressive action dependencies (e.g. 2 action components; distribution for 2nd component depends on the 1st component's actually sampled value).
@@ -185,3 +170,18 @@ Community Examples
   Example of training in StarCraft2 maps with RLlib / multi-agent.
 - `Traffic Flow <https://berkeleyflow.readthedocs.io/en/latest/flow_setup.html>`__:
   Example of optimizing mixed-autonomy traffic simulations with RLlib / multi-agent.
+
+
+Blog Posts
+----------
+
+- `Attention Nets and More with RLlib’s Trajectory View API <https://medium.com/distributed-computing-with-ray/attention-nets-and-more-with-rllibs-trajectory-view-api-d326339a6e65>`__:
+  Blog describing RLlib's new "trajectory view API" and how it enables implementations of GTrXL (attention net) architectures.
+- `Reinforcement Learning with RLlib in the Unity Game Engine <https://medium.com/distributed-computing-with-ray/reinforcement-learning-with-rllib-in-the-unity-game-engine-1a98080a7c0d>`__:
+  How-To guide about connecting RLlib with the Unity3D game engine for running visual- and physics-based RL experiments.
+- `Lessons from Implementing 12 Deep RL Algorithms in TF and PyTorch <https://medium.com/distributed-computing-with-ray/lessons-from-implementing-12-deep-rl-algorithms-in-tf-and-pytorch-1b412009297d>`__:
+  Discussion on how the Ray Team ported 12 of RLlib's algorithms from TensorFlow to PyTorch and the lessons learned.
+- `Scaling Multi-Agent Reinforcement Learning <http://bair.berkeley.edu/blog/2018/12/12/rllib>`__:
+  Blog post of a brief tutorial on multi-agent RL and its design in RLlib.
+- `Functional RL with Keras and TensorFlow Eager <https://medium.com/riselab/functional-rl-with-keras-and-tensorflow-eager-7973f81d6345>`__:
+  Exploration of a functional paradigm for implementing reinforcement learning (RL) algorithms.
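
Several of the multi-agent entries relocated above (independent learning, parameter sharing, multi-agent CartPole) share one configuration idiom: declare the set of policies plus a mapping function from agent ID to policy ID. A hedged sketch of that idiom using RLlib's built-in MultiAgentCartPole follows; the import path has moved between Ray versions and is an assumption here, as is the `num_agents` config key.

```python
# Hedged sketch of an independent-learning multi-agent setup: one policy per
# agent, no parameter sharing. Import path of MultiAgentCartPole is an assumption.
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.examples.env.multi_agent import MultiAgentCartPole

config = (
    PPOConfig()
    .environment(MultiAgentCartPole, env_config={"num_agents": 2})
    .multi_agent(
        # Agent IDs in MultiAgentCartPole are 0..num_agents-1.
        policies={"policy_0", "policy_1"},
        policy_mapping_fn=lambda agent_id, *a, **kw: f"policy_{agent_id}",
    )
)
algo = config.build()
print(algo.train()["episode_reward_mean"])
```

Parameter sharing is the same setup with a single policy ID that the mapping function always returns.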

doc/source/rllib/rllib-replay-buffers.rst

Lines changed: 5 additions & 5 deletions
@@ -71,7 +71,7 @@ Here are three ways of specifying a type:
 .. dropdown:: **Changing a replay buffer configuration**
     :animate: fade-in-slide-down
 
-    .. literalinclude:: ../../../rllib/examples/documentation/replay_buffer_demo.py
+    .. literalinclude:: doc_code/replay_buffer_demo.py
        :language: python
        :start-after: __sphinx_doc_replay_buffer_type_specification__begin__
       :end-before: __sphinx_doc_replay_buffer_type_specification__end__
@@ -102,7 +102,7 @@ Advanced buffer types add functionality while trying to retain compatibility thr
 The following is an example of the most basic scheme of interaction with a :py:class:`~ray.rllib.utils.replay_buffers.replay_buffer.ReplayBuffer`.
 
 
-.. literalinclude:: ../../../rllib/examples/documentation/replay_buffer_demo.py
+.. literalinclude:: doc_code/replay_buffer_demo.py
    :language: python
    :start-after: __sphinx_doc_replay_buffer_basic_interaction__begin__
    :end-before: __sphinx_doc_replay_buffer_basic_interaction__end__
@@ -113,7 +113,7 @@ Building your own ReplayBuffer
 
 Here is an example of how to implement your own toy example of a ReplayBuffer class and make SimpleQ use it:
 
-.. literalinclude:: ../../../rllib/examples/documentation/replay_buffer_demo.py
+.. literalinclude:: doc_code/replay_buffer_demo.py
    :language: python
    :start-after: __sphinx_doc_replay_buffer_own_buffer__begin__
    :end-before: __sphinx_doc_replay_buffer_own_buffer__end__
@@ -132,7 +132,7 @@ When later calling the ``sample()`` method, num_items will relate to said storag
 
 Here is a full example of how to modify the storage_unit and interact with a custom buffer:
 
-.. literalinclude:: ../../../rllib/examples/documentation/replay_buffer_demo.py
+.. literalinclude:: doc_code/replay_buffer_demo.py
    :language: python
    :start-after: __sphinx_doc_replay_buffer_advanced_usage_storage_unit__begin__
    :end-before: __sphinx_doc_replay_buffer_advanced_usage_storage_unit__end__
@@ -145,7 +145,7 @@ the same way as the parent's config.
 Here is an example of how to create an :py:class:`~ray.rllib.utils.replay_buffers.multi_agent_replay_buffer.MultiAgentReplayBuffer` with an alternative underlying :py:class:`~ray.rllib.utils.replay_buffers.replay_buffer.ReplayBuffer`.
 The :py:class:`~ray.rllib.utils.replay_buffers.multi_agent_replay_buffer.MultiAgentReplayBuffer` can stay the same. We only need to specify our own buffer along with a default call argument:
 
-.. literalinclude:: ../../../rllib/examples/documentation/replay_buffer_demo.py
+.. literalinclude:: doc_code/replay_buffer_demo.py
    :language: python
    :start-after: __sphinx_doc_replay_buffer_advanced_usage_underlying_buffers__begin__
    :end-before: __sphinx_doc_replay_buffer_advanced_usage_underlying_buffers__end__
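
All of the replay-buffer snippets relocated above hang off the same `replay_buffer_config` mechanism. A hedged sketch of the first pattern, choosing a buffer type and its parameters for a DQN config, follows; the values are illustrative, not recommended defaults, and exact defaults differ by algorithm and Ray version.

```python
# Hedged sketch: pick a replay buffer type and its parameters via
# replay_buffer_config. Values are illustrative only.
from ray.rllib.algorithms.dqn import DQNConfig

config = (
    DQNConfig()
    .environment("CartPole-v1")
    .training(
        replay_buffer_config={
            # Either a registered name (as here) or the buffer class itself.
            "type": "MultiAgentPrioritizedReplayBuffer",
            "capacity": 50_000,
            "prioritized_replay_alpha": 0.6,
            "prioritized_replay_beta": 0.4,
        }
    )
)
algo = config.build()
print(algo.train()["episode_reward_mean"])
```

The same key also accepts a custom `ReplayBuffer` subclass as the `"type"`, which is the pattern the "Building your own ReplayBuffer" section demonstrates.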
