README.md: 5 additions & 5 deletions

@@ -25,7 +25,7 @@
 `tmrl` is a python library designed to facilitate the implementation of deep RL applications in real-time settings such as robots and video games. Full tutorial [here](readme/tuto_library.md) and documentation [here](https://tmrl.readthedocs.io/en/latest/).
 
 :ok_hand:**ML developers who are TM enthusiasts with no interest in learning this huge thing:**\
-`tmrl` provides a Gym environment for TrackMania that is easy to use. Fast-track for you guys [here](#trackmania-gym-environment).
+`tmrl` provides a Gymnasium environment for TrackMania that is easy to use. Fast-track for you guys [here](#trackmania-gymnasium-environment).
 
 :earth_americas:**Everyone:**\
 `tmrl` hosts the [TrackMania Roborace League](readme/competition.md), a vision-based AI competition where participants design real-time self-racing AIs in the TrackMania video game.
 [TrackMania training details](#trackmania-training-details)
@@ -93,8 +93,8 @@ These models learn the physics from histories or observations equally spaced in
 `tmrl` is a complete framework designed to help you successfully implement deep RL in your [real-time applications](#real-time-gym-framework) (e.g., robots...).
 A complete tutorial toward doing this is provided [here](readme/tuto_library.md).
 
-***TrackMania Gym environment:**
-`tmrl` comes with a real-time Gym environment for the TrackMania2020 video game, based on [rtgym](https://pypi.org/project/rtgym/). Once `tmrl` is installed, it is easy to use this environment in your own training framework. More information [here](#trackmania-gym-environment).
+***TrackMania Gymnasium environment:**
+`tmrl` comes with a real-time Gymnasium environment for the TrackMania2020 video game, based on [rtgym](https://pypi.org/project/rtgym/). Once `tmrl` is installed, it is easy to use this environment in your own training framework. More information [here](#trackmania-gymnasium-environment).
 
 ***Distributed training:**
 `tmrl` is based on a single-server / multiple-clients architecture.
@@ -157,7 +157,7 @@ Follow the link for information about the competition, including the current lea
 
 Regardless of whether they want to compete or not, ML developers will find the [competition tutorial script](https://github.com/trackmania-rl/tmrl/blob/master/tmrl/tuto/competition/custom_actor_module.py) useful for creating advanced training pipelines in TrackMania.
 
-## TrackMania Gym environment
+## TrackMania Gymnasium environment
 In case you only wish to use the `tmrl` Real-Time Gym environment for TrackMania in your own training framework, this is made possible by the `get_environment()` method:
 
 _(NB: the game needs to be set up as described in the [getting started](readme/get_started.md) instructions)_
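
For reference, below is a minimal usage sketch of the `get_environment()` call mentioned in this hunk, following the Gymnasium API (`reset()` returning `(obs, info)`, `step()` returning a 5-tuple with `terminated` and `truncated`). It assumes `tmrl` is installed and the game is set up as per the getting-started instructions, and it simply samples random actions; the README's own snippet may differ.

```python
from time import sleep
from tmrl import get_environment

env = get_environment()  # rtgym-based real-time Gymnasium environment for TrackMania 2020
sleep(1.0)               # leave yourself time to give focus to the TrackMania window

obs, info = env.reset()
for _ in range(200):                     # rtgym enforces real-time stepping
    act = env.action_space.sample()      # replace with your own policy
    obs, rew, terminated, truncated, info = env.step(act)
    if terminated or truncated:
        obs, info = env.reset()
```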

readme/Install.md: 1 addition & 1 deletion

@@ -17,7 +17,7 @@ _(Note for ML developers: in case you are not interested in using support for Tr
 
 The following instructions are for installing `tmrl` with support for the TrackMania 2020 video game.
 
-You will first need to install [TrackMania 2020](https://www.trackmania.com/) (obviously), and also a small community-supported utility called [Openplanet for TrackMania](https://openplanet.nl/) (the Gym environment needs this utility to compute the reward).
+You will first need to install [TrackMania 2020](https://www.trackmania.com/) (obviously), and also a small community-supported utility called [Openplanet for TrackMania](https://openplanet.nl/) (the Gymnasium environment needs this utility to compute the reward).

readme/competition.md: 1 addition & 1 deletion

@@ -69,7 +69,7 @@ We choose whether to accept your entry based on reproducibility and novelty.
 ### Current iteration (Beta)
 The `tmrl` competition is an open research initiative, currently in its first iteration :hatching_chick:
 
-In this iteration, competitors race on the `tmrl-test` track (plain road) by solving the `Full` version of the [TrackMania 2020 Gym environment](https://github.com/trackmania-rl/tmrl#gym-environment) (the `LIDAR` version is also accepted).
+In this iteration, competitors race on the `tmrl-test` track (plain road) by solving the `Full` version of the [TrackMania 2020 Gym environment](https://github.com/trackmania-rl/tmrl#trackmania-gymnasium-environment) (the `LIDAR` version is also accepted).
 
 - The `action space` is the default TrackMania 2020 continuous action space (3 floats between -1.0 and 1.0).
 - The `observation space` is a history of 4 raw snapshots along with the speed, gear, rpm and 2 previous actions. The choice of camera is up to you as long as you use one of the default. You are allowed to use colors if you wish (set the `"IMG_GRAYSCALE"` entry to `false` in `config.json`). You may also customize the actual image dimensions (`"IMG_WIDTH"` and `"IMG_HEIGHT"`), and the game window dimensions (`"WINDOW_WIDTH"` and `"WINDOW_HEIGHT"`) if you need to. However, the window dimensions must remain between `(256, 128)` and `(958, 488)` (dimensions greater than `(958, 488)` are **not** allowed).
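
To make the camera and window settings above concrete, here is a hedged sketch of the corresponding `config.json` entries. The key names are the ones quoted in the competition rules; the values are illustrative examples within the allowed bounds, not recommended or default settings.

```python
import json

# Illustrative values only; window dimensions must stay within (256, 128) .. (958, 488).
camera_settings = {
    "IMG_GRAYSCALE": False,   # false enables color observations
    "IMG_WIDTH": 64,          # dimensions of the images fed to the model
    "IMG_HEIGHT": 64,
    "WINDOW_WIDTH": 958,      # game window dimensions (maximum allowed here)
    "WINDOW_HEIGHT": 488,
}
print(json.dumps(camera_settings, indent=2))
```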

readme/tuto_library.md: 13 additions & 13 deletions

@@ -162,7 +162,7 @@ As soon as the server is instantiated, it listens for incoming connections from
 In RL, a task is often called an "environment".
 `tmrl` is meant for asynchronous remote training of real-time applications such as robots.
 Thus, we use [Real-Time Gym](https://github.com/yannbouteiller/rtgym) (`rtgym`) to wrap our robots and video games into a Gym environment.
-You can also probably use other environments as long as they are registered as Gym environments and have a relevant substitute for the `default_action` attribute.
+You can also probably use other environments as long as they are registered as Gymnasium environments and have a relevant substitute for the `default_action` attribute.
 
 To build your own environment (e.g., an environment for your own robot or video game), follow the [rtgym tutorial](https://github.com/yannbouteiller/rtgym#tutorial).
 If you need inspiration, you can find our `rtgym` interfaces for TrackMania in [custom_gym_interfaces.py](https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_gym_interfaces.py).
@@ -173,7 +173,7 @@ _(NB: you need `opencv-python` installed)_
 
 ```python
 from rtgym import RealTimeGymInterface, DEFAULT_CONFIG_DICT, DummyRCDrone
@@ … @@
-Now that we have our robot encapsulated in a Gym environment, we will create an RL actor.
+Now that we have our robot encapsulated in a Gymnasium environment, we will create an RL actor.
 In `tmrl`, this is done within a `RolloutWorker` object.
 
 One to several `RolloutWorkers` can coexist in `tmrl`, each one typically encapsulating a robot, or, in the case of a video game, an instance of the game
@@ -290,7 +290,7 @@ import tmrl.config.config_constants as cfg # constants from the config.json fil
 class RolloutWorker:
     def __init__(
             self,
-            env_cls=None, # class of the Gym environment
+            env_cls=None, # class of the Gymnasium environment
             actor_module_cls=None, # class of a module containing the policy
             sample_compressor: callable=None, # compressor for sending samples over the Internet
             server_ip=None, # ip of the central server
@@ -315,16 +315,16 @@ In this tutorial, we will implement a similar `RolloutWorker` for our dummy dron
 
 The first argument of our `RolloutWorker` is `env_cls`.
 
-This expects a Gym environment class, which can be partially instantiated with `partial()`.
-Furthermore, this Gym environment needs to be wrapped in the `GenericGymEnv` wrapper (which by default just changes float64 to float32 in observations).
+This expects a Gymnasium environment class, which can be partially instantiated with `partial()`.
+Furthermore, this Gymnasium environment needs to be wrapped in the `GenericGymEnv` wrapper (which by default just changes float64 to float32 in observations).
 
 With our dummy drone environment, this translates to:
 We can create a dummy environment to retrieve the action and observation spaces:
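
A hedged sketch of what this can look like in code, based on the names used in this part of the tutorial (`GenericGymEnv`, tmrl's `partial()` helper, and an rtgym interface class defined earlier, called `MyRealTimeInterface` here purely for illustration). The exact import paths and keyword names are assumptions and should be checked against the installed `tmrl` version.

```python
from rtgym import DEFAULT_CONFIG_DICT

from tmrl.util import partial        # tmrl's partial() helper (functools.partial also works)
from tmrl.envs import GenericGymEnv  # wrapper that casts float64 observations to float32

# rtgym configuration whose "interface" entry points to the RealTimeGymInterface
# implemented earlier in the tutorial (the class name is illustrative here):
my_config = DEFAULT_CONFIG_DICT.copy()
my_config["interface"] = MyRealTimeInterface

# partially instantiated Gymnasium environment class, as expected by env_cls:
env_cls = partial(GenericGymEnv, id="real-time-gym-v1", gym_kwargs={"config": my_config})

# a dummy instance is enough to retrieve the action and observation spaces:
dummy_env = env_cls()
act_space = dummy_env.action_space
obs_space = dummy_env.observation_space
```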
@@ -505,7 +505,7 @@ This is done by setting the `Server` IP as the localhost IP, i.e., `"127.0.0.1"`
 _(NB: We have set the values for `server_ip` and `server_port` earlier in this tutorial.)_
 
 In the current iteration of `tmrl`, samples are gathered locally in a buffer by the `RolloutWorker` and are sent to the `Server` only at the end of an episode.
-In case your Gym environment is never `terminated` (or only after too long), `tmrl` enables forcing reset after a time-steps threshold.
+In case your Gymnasium environment is never `terminated` (or only after too long), `tmrl` enables forcing reset after a time-steps threshold.
 For instance, let us say we don't want an episode to last more than 1000 time-steps:
 
 _(Note 1: This is for the sake of illustration, in fact, this cannot happen in our RC drone environment)_
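
As a hedged sketch of this forced-reset threshold: the `max_samples_per_episode` parameter name and the `tmrl.networking` import path are taken from the tutorial and should be double-checked against your installed version; `env_cls` comes from the earlier step and `MyActorModule` stands in for the policy class defined elsewhere in the tutorial.

```python
from tmrl.networking import RolloutWorker

rollout_worker = RolloutWorker(
    env_cls=env_cls,                  # partially instantiated environment class from above
    actor_module_cls=MyActorModule,   # policy class defined earlier in the tutorial (illustrative name)
    server_ip="127.0.0.1",            # the Server runs on the same machine in this tutorial
    max_samples_per_episode=1000,     # force a reset if an episode exceeds 1000 time-steps
)
```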
@@ -694,13 +694,13 @@ class TorchTrainingOffline:
 `TorchTrainingOffline` requires other (possibly partially instantiated) classes as arguments: a dummy environment, a `TorchMemory`, and a `TrainingAgent`
 
 #### Dummy environment:
-`env_cls`: Most of the time, the dummy environment class that you need to pass here is the same class as for the `RolloutWorker` Gym environment:
+`env_cls`: Most of the time, the dummy environment class that you need to pass here is the same class as for the `RolloutWorker` Gymnasium environment:
 This dummy environment will only be used by the `Trainer` to retrieve the observation and action spaces (`reset()` will not be called).
 Alternatively, you can pass this information as a Tuple:
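
A hedged sketch of the two options described above, reusing the names from the previous steps; in particular, the tuple ordering (observation space first, then action space) is an assumption to verify against the tutorial code.

```python
# Option 1: reuse the same partially instantiated environment class as the RolloutWorker;
# the Trainer only instantiates it to read the spaces, and reset() is never called.
trainer_env_cls = env_cls

# Option 2: pass the spaces directly, so the Trainer never builds an environment at all.
trainer_env_cls = (obs_space, act_space)
```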
@@ -750,7 +750,7 @@ class TorchMemory(ABC):
     """
     Outputs a decompressed RL transition.
 
-    This transition is the same as the output by the Gym environment (after observation preprocessing).
+    This transition is the same as the output by the Gymnasium environment (after observation preprocessing).
 
     Args:
         item: int: indices of the transition that the Trainer wants to sample
@@ -826,7 +826,7 @@ In this tutorial, we will privilege memory usage and thus we will implement our
 The `append_buffer()` method will simply store the compressed sample components in `self.data`.
 
 `append_buffer()` is passed a [buffer](https://github.com/trackmania-rl/tmrl/blob/c1f740740a7d57382a451607fdc66d92ba62ea0c/tmrl/networking.py#L198) object that contains a list of compressed `(act, new_obs, rew, terminated, truncated, info)` samples in its `memory` attribute.
-`act` is the action that was sent to the `step()` method of the Gym environment to yield `new_obs`, `rew`, `terminated`, `truncated`, and `info`.
+`act` is the action that was sent to the `step()` method of the Gymnasium environment to yield `new_obs`, `rew`, `terminated`, `truncated`, and `info`.
 Here, we decompose our samples in their relevant components, append these components to the `self.data` list, and clip `self.data` when `self.memory_size` is exceeded:
 
 ```python
@@ -904,7 +904,7 @@ Finally, if we have enough samples, we need to remove the length of the action b
 Furthermore, the `get_transition()` method outputs a full RL transition, which includes the previous observation. Thus, we must subtract 1 to get the number of full transitions that we can actually output.
 
 Alright, let us finally implement `get_transition()`, where we have chosen sample decompression would happen.
-This method outputs full transitions as if they were output by the Gym environment
+This method outputs full transitions as if they were output by the Gymnasium environment
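
To make the counting argument above concrete, here is a hedged sketch of the corresponding `__len__()` logic for the `TorchMemory` subclass built in this tutorial. Attribute names such as `self.data` and `self.act_buf_len` follow the tutorial's conventions; this is an illustration, not the repository's exact code.

```python
def __len__(self):
    # self.data[0] holds one entry per collected time-step
    if len(self.data) == 0:
        return 0  # no samples appended yet
    # drop the first act_buf_len time-steps (their action buffer is incomplete),
    # and one more because a full transition also needs the previous observation
    res = len(self.data[0]) - self.act_buf_len - 1
    return max(0, res)
```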