Commit 43620a4

Upgraded to rtgym 0.8 (gymnasium)

1 parent 05ec924 commit 43620a4

21 files changed: +63 −67 lines

README.md

Lines changed: 5 additions & 5 deletions
@@ -25,7 +25,7 @@
 `tmrl` is a python library designed to facilitate the implementation of deep RL applications in real-time settings such as robots and video games. Full tutorial [here](readme/tuto_library.md) and documentation [here](https://tmrl.readthedocs.io/en/latest/).

 - :ok_hand: **ML developers who are TM enthusiasts with no interest in learning this huge thing:**\
-  `tmrl` provides a Gym environment for TrackMania that is easy to use. Fast-track for you guys [here](#trackmania-gym-environment).
+  `tmrl` provides a Gymnasium environment for TrackMania that is easy to use. Fast-track for you guys [here](#trackmania-gymnasium-environment).

 - :earth_americas: **Everyone:**\
   `tmrl` hosts the [TrackMania Roborace League](readme/competition.md), a vision-based AI competition where participants design real-time self-racing AIs in the TrackMania video game.
@@ -44,7 +44,7 @@
 - [Security (important)](#security)
 - [TrackMania applications](#autonomous-driving-in-trackmania)
   - [TrackMania Roborace League](readme/competition.md)
-  - [TrackMania Gym environment](#trackmania-gym-environment)
+  - [TrackMania Gymnasium environment](#trackmania-gymnasium-environment)
   - [LIDAR environment](#lidar-environment)
   - [Full environment](#full-environment)
 - [TrackMania training details](#trackmania-training-details)
@@ -93,8 +93,8 @@ These models learn the physics from histories or observations equally spaced in
 `tmrl` is a complete framework designed to help you successfully implement deep RL in your [real-time applications](#real-time-gym-framework) (e.g., robots...).
 A complete tutorial toward doing this is provided [here](readme/tuto_library.md).

-* **TrackMania Gym environment:**
-  `tmrl` comes with a real-time Gym environment for the TrackMania2020 video game, based on [rtgym](https://pypi.org/project/rtgym/). Once `tmrl` is installed, it is easy to use this environment in your own training framework. More information [here](#trackmania-gym-environment).
+* **TrackMania Gymnasium environment:**
+  `tmrl` comes with a real-time Gymnasium environment for the TrackMania2020 video game, based on [rtgym](https://pypi.org/project/rtgym/). Once `tmrl` is installed, it is easy to use this environment in your own training framework. More information [here](#trackmania-gymnasium-environment).

 * **Distributed training:**
   `tmrl` is based on a single-server / multiple-clients architecture.
@@ -157,7 +157,7 @@ Follow the link for information about the competition, including the current lea

 Regardless of whether they want to compete or not, ML developers will find the [competition tutorial script](https://github.com/trackmania-rl/tmrl/blob/master/tmrl/tuto/competition/custom_actor_module.py) useful for creating advanced training pipelines in TrackMania.

-## TrackMania Gym environment
+## TrackMania Gymnasium environment
 In case you only wish to use the `tmrl` Real-Time Gym environment for TrackMania in your own training framework, this is made possible by the `get_environment()` method:

 _(NB: the game needs to be set up as described in the [getting started](readme/get_started.md) instructions)_
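The substance of this commit is the move from the `gym` to the `gymnasium` API. As a rough sketch of what that API change means for user code (a hand-rolled stand-in environment, not tmrl's actual TrackMania environment): `reset()` now returns `(observation, info)`, and `step()` returns a 5-tuple that separates `terminated` (natural episode end) from `truncated` (time-limit cutoff):

```python
class DummyEnv:
    """Stand-in following the Gymnasium reset/step signatures (illustration only)."""
    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return 0.0, {}  # Gymnasium: reset returns (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                    # natural episode end (never here)
        truncated = self.t >= self.max_steps  # time-limit cutoff
        return float(self.t), 1.0, terminated, truncated, {}

env = DummyEnv()
obs, info = env.reset()
total = 0.0
terminated = truncated = False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(None)
    total += reward
print(total)  # 3.0
```

The same loop shape applies to the environment returned by `get_environment()` once the game is set up.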

readme/Install.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ _(Note for ML developers: in case you are not interested in using support for Tr

 The following instructions are for installing `tmrl` with support for the TrackMania 2020 video game.

-You will first need to install [TrackMania 2020](https://www.trackmania.com/) (obviously), and also a small community-supported utility called [Openplanet for TrackMania](https://openplanet.nl/) (the Gym environment needs this utility to compute the reward).
+You will first need to install [TrackMania 2020](https://www.trackmania.com/) (obviously), and also a small community-supported utility called [Openplanet for TrackMania](https://openplanet.nl/) (the Gymnasium environment needs this utility to compute the reward).


 ### Install TrackMania 2020:

readme/competition.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ We choose whether to accept your entry based on reproducibility and novelty.
 ### Current iteration (Beta)
 The `tmrl` competition is an open research initiative, currently in its first iteration :hatching_chick:

-In this iteration, competitors race on the `tmrl-test` track (plain road) by solving the `Full` version of the [TrackMania 2020 Gym environment](https://github.com/trackmania-rl/tmrl#gym-environment) (the `LIDAR` version is also accepted).
+In this iteration, competitors race on the `tmrl-test` track (plain road) by solving the `Full` version of the [TrackMania 2020 Gym environment](https://github.com/trackmania-rl/tmrl#trackmania-gymnasium-environment) (the `LIDAR` version is also accepted).

 - The `action space` is the default TrackMania 2020 continuous action space (3 floats between -1.0 and 1.0).
 - The `observation space` is a history of 4 raw snapshots along with the speed, gear, rpm and 2 previous actions. The choice of camera is up to you as long as you use one of the default. You are allowed to use colors if you wish (set the `"IMG_GRAYSCALE"` entry to `false` in `config.json`). You may also customize the actual image dimensions (`"IMG_WIDTH"` and `"IMG_HEIGHT"`), and the game window dimensions (`"WINDOW_WIDTH"` and `"WINDOW_HEIGHT"`) if you need to. However, the window dimensions must remain between `(256, 128)` and `(958, 488)` (dimensions greater than `(958, 488)` are **not** allowed).
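The window-dimension rule above is easy to get wrong. A minimal sketch of a pre-flight check for a competitor's `config.json` entries (the function name is illustrative; the config keys mirror those quoted in the rules):

```python
# Hypothetical helper: validate the competition's window-dimension bounds,
# which must stay between (256, 128) and (958, 488) inclusive.
def window_dims_allowed(config):
    w, h = config["WINDOW_WIDTH"], config["WINDOW_HEIGHT"]
    return 256 <= w <= 958 and 128 <= h <= 488

print(window_dims_allowed({"WINDOW_WIDTH": 958, "WINDOW_HEIGHT": 488}))   # True
print(window_dims_allowed({"WINDOW_WIDTH": 1024, "WINDOW_HEIGHT": 488}))  # False
```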

readme/tuto_library.md

Lines changed: 13 additions & 13 deletions
@@ -162,7 +162,7 @@ As soon as the server is instantiated, it listens for incoming connections from
 In RL, a task is often called an "environment".
 `tmrl` is meant for asynchronous remote training of real-time applications such as robots.
 Thus, we use [Real-Time Gym](https://github.com/yannbouteiller/rtgym) (`rtgym`) to wrap our robots and video games into a Gym environment.
-You can also probably use other environments as long as they are registered as Gym environments and have a relevant substitute for the `default_action` attribute.
+You can also probably use other environments as long as they are registered as Gymnasium environments and have a relevant substitute for the `default_action` attribute.

 To build your own environment (e.g., an environment for your own robot or video game), follow the [rtgym tutorial](https://github.com/yannbouteiller/rtgym#tutorial).
 If you need inspiration, you can find our `rtgym` interfaces for TrackMania in [custom_gym_interfaces.py](https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_gym_interfaces.py).
@@ -173,7 +173,7 @@ _(NB: you need `opencv-python` installed)_

 ```python
 from rtgym import RealTimeGymInterface, DEFAULT_CONFIG_DICT, DummyRCDrone
-import gym.spaces as spaces
+import gymnasium.spaces as spaces
 import numpy as np
 import cv2
 from threading import Thread
@@ -276,7 +276,7 @@ my_config["benchmark_polyak"] = 0.2

 ## Rollout workers

-Now that we have our robot encapsulated in a Gym environment, we will create an RL actor.
+Now that we have our robot encapsulated in a Gymnasium environment, we will create an RL actor.
 In `tmrl`, this is done within a `RolloutWorker` object.

 One to several `RolloutWorkers` can coexist in `tmrl`, each one typically encapsulating a robot, or, in the case of a video game, an instance of the game
@@ -290,7 +290,7 @@ import tmrl.config.config_constants as cfg  # constants from the config.json fil
 class RolloutWorker:
     def __init__(
             self,
-            env_cls=None,  # class of the Gym environment
+            env_cls=None,  # class of the Gymnasium environment
             actor_module_cls=None,  # class of a module containing the policy
             sample_compressor: callable = None,  # compressor for sending samples over the Internet
             server_ip=None,  # ip of the central server
@@ -315,16 +315,16 @@ In this tutorial, we will implement a similar `RolloutWorker` for our dummy dron

 The first argument of our `RolloutWorker` is `env_cls`.

-This expects a Gym environment class, which can be partially instantiated with `partial()`.
-Furthermore, this Gym environment needs to be wrapped in the `GenericGymEnv` wrapper (which by default just changes float64 to float32 in observations).
+This expects a Gymnasium environment class, which can be partially instantiated with `partial()`.
+Furthermore, this Gymnasium environment needs to be wrapped in the `GenericGymEnv` wrapper (which by default just changes float64 to float32 in observations).

 With our dummy drone environment, this translates to:

 ```python
 from tmrl.util import partial
 from tmrl.envs import GenericGymEnv

-env_cls=partial(GenericGymEnv, id="real-time-gym-v0", gym_kwargs={"config": my_config})
+env_cls=partial(GenericGymEnv, id="real-time-gym-v1", gym_kwargs={"config": my_config})
 ```

 We can create a dummy environment to retrieve the action and observation spaces:
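The `env_cls` idiom in the hunk above (a class plus frozen constructor kwargs, instantiated later by the worker) can be sketched with the standard library's `functools.partial`; `FakeEnv` and the `"time_step_duration"` config key are illustrative stand-ins, not tmrl's actual classes:

```python
from functools import partial

class FakeEnv:
    """Hypothetical stand-in for GenericGymEnv (illustration only)."""
    def __init__(self, id=None, gym_kwargs=None):
        self.id = id
        self.config = (gym_kwargs or {}).get("config", {})

# Freeze the constructor arguments now; instantiate later, exactly like
# env_cls in the diff above.
env_cls = partial(FakeEnv, id="real-time-gym-v1",
                  gym_kwargs={"config": {"time_step_duration": 0.05}})
env = env_cls()  # a RolloutWorker would do this when it actually needs the env
print(env.id)  # real-time-gym-v1
```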
@@ -505,7 +505,7 @@ This is done by setting the `Server` IP as the localhost IP, i.e., `"127.0.0.1"`
 _(NB: We have set the values for `server_ip` and `server_port` earlier in this tutorial.)_

 In the current iteration of `tmrl`, samples are gathered locally in a buffer by the `RolloutWorker` and are sent to the `Server` only at the end of an episode.
-In case your Gym environment is never `terminated` (or only after too long), `tmrl` enables forcing reset after a time-steps threshold.
+In case your Gymnasium environment is never `terminated` (or only after too long), `tmrl` enables forcing reset after a time-steps threshold.
 For instance, let us say we don't want an episode to last more than 1000 time-steps:

 _(Note 1: This is for the sake of illustration, in fact, this cannot happen in our RC drone environment)_
@@ -694,13 +694,13 @@ class TorchTrainingOffline:
 `TorchTrainingOffline` requires other (possibly partially instantiated) classes as arguments: a dummy environment, a `TorchMemory`, and a `TrainingAgent`

 #### Dummy environment:
-`env_cls`: Most of the time, the dummy environment class that you need to pass here is the same class as for the `RolloutWorker` Gym environment:
+`env_cls`: Most of the time, the dummy environment class that you need to pass here is the same class as for the `RolloutWorker` Gymnasium environment:

 ```python
 from tmrl.util import partial
 from tmrl.envs import GenericGymEnv

-env_cls = partial(GenericGymEnv, id="real-time-gym-v0", gym_kwargs={"config": my_config})
+env_cls = partial(GenericGymEnv, id="real-time-gym-v1", gym_kwargs={"config": my_config})
 ```
 This dummy environment will only be used by the `Trainer` to retrieve the observation and action spaces (`reset()` will not be called).
 Alternatively, you can pass this information as a Tuple:
@@ -750,7 +750,7 @@ class TorchMemory(ABC):
         """
         Outputs a decompressed RL transition.

-        This transition is the same as the output by the Gym environment (after observation preprocessing).
+        This transition is the same as the output by the Gymnasium environment (after observation preprocessing).

         Args:
             item: int: indices of the transition that the Trainer wants to sample
@@ -826,7 +826,7 @@ In this tutorial, we will privilege memory usage and thus we will implement our
 The `append_buffer()` method will simply store the compressed sample components in `self.data`.

 `append_buffer()` is passed a [buffer](https://github.com/trackmania-rl/tmrl/blob/c1f740740a7d57382a451607fdc66d92ba62ea0c/tmrl/networking.py#L198) object that contains a list of compressed `(act, new_obs, rew, terminated, truncated, info)` samples in its `memory` attribute.
-`act` is the action that was sent to the `step()` method of the Gym environment to yield `new_obs`, `rew`, `terminated`, `truncated`, and `info`.
+`act` is the action that was sent to the `step()` method of the Gymnasium environment to yield `new_obs`, `rew`, `terminated`, `truncated`, and `info`.
 Here, we decompose our samples in their relevant components, append these components to the `self.data` list, and clip `self.data` when `self.memory_size` is exceeded:

 ```python
@@ -904,7 +904,7 @@ Finally, if we have enough samples, we need to remove the length of the action b
 Furthermore, the `get_transition()` method outputs a full RL transition, which includes the previous observation. Thus, we must subtract 1 to get the number of full transitions that we can actually output.

 Alright, let us finally implement `get_transition()`, where we have chosen sample decompression would happen.
-This method outputs full transitions as if they were output by the Gym environment
+This method outputs full transitions as if they were output by the Gymnasium environment
 (after observation preprocessing if used):

 ```python
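The append-and-clip behavior that the tutorial describes for `append_buffer()` (store sample components, then trim the oldest once `memory_size` is exceeded) can be sketched in a few lines; the class and sample shapes here are illustrative, not tmrl's actual `TorchMemory`:

```python
class TinyMemory:
    """Hypothetical minimal replay buffer illustrating append-and-clip."""
    def __init__(self, memory_size):
        self.memory_size = memory_size
        self.data = []

    def append_buffer(self, samples):
        self.data.extend(samples)
        overflow = len(self.data) - self.memory_size
        if overflow > 0:
            self.data = self.data[overflow:]  # drop the oldest entries

mem = TinyMemory(memory_size=3)
mem.append_buffer([("act", 1), ("act", 2)])
mem.append_buffer([("act", 3), ("act", 4)])
print(mem.data)  # [('act', 2), ('act', 3), ('act', 4)]
```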

setup.py

Lines changed: 4 additions & 8 deletions
@@ -101,21 +101,17 @@ def url_retrieve(url: str, outfile: Path, overwrite: bool = False):
 install_req = [
     'numpy',
     'torch',
-    'imageio',
-    'imageio-ffmpeg',
     'pandas',
-    'gym>=0.26.0',
-    'rtgym>=0.7',
+    'gymnasium',
+    'rtgym>=0.8',
     'pyyaml',
     'wandb',
     'requests',
     'opencv-python',
-    'scikit-image',
     'keyboard',
     'pyautogui',
     'pyinstrument',
-    'tlspyo>=0.2.5',
-    'matplotlib'
+    'tlspyo>=0.2.5'
 ]

 if platform.system() == "Windows":
@@ -131,7 +127,7 @@ def url_retrieve(url: str, outfile: Path, overwrite: bool = False):

 setup(
     name='tmrl',
-    version='0.4.2',
+    version='0.5.0',
     description='Network-based framework for real-time robot learning',
     long_description=README,
     long_description_content_type='text/markdown',
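Since the commit bumps the `rtgym` floor to `>=0.8`, a quick runtime check of an installed package's minimum version can be done with the standard library alone; `satisfies_min` is an illustrative helper (naive numeric comparison, not a full PEP 440 parser):

```python
from importlib.metadata import version, PackageNotFoundError

def satisfies_min(pkg, minimum):
    """Return True iff `pkg` is installed with version >= `minimum`
    (naive comparison on the leading dotted numeric components)."""
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        return False
    parse = lambda s: tuple(int(p) for p in s.split(".")[:2] if p.isdigit())
    return parse(installed) >= parse(minimum)

print(satisfies_min("surely-not-installed-xyz", "0.8"))  # False
```

For real dependency resolution, pip enforces the `install_req` specifiers shown above at install time.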

tmrl/__init__.py

Lines changed: 3 additions & 3 deletions
@@ -33,9 +33,9 @@

 def get_environment():
     """
-    Default TMRL Gym environment for TrackMania 2020.
+    Default TMRL Gymnasium environment for TrackMania 2020.

     Returns:
-        Gym.Env: An instance of the default TMRL Gym environment
+        gymnasium.Env: An instance of the default TMRL Gym environment
     """
-    return GenericGymEnv(id="real-time-gym-v0", gym_kwargs={"config": CONFIG_DICT})
+    return GenericGymEnv(id="real-time-gym-v1", gym_kwargs={"config": CONFIG_DICT})
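The tutorial notes that the `GenericGymEnv` wrapper returned here "by default just changes float64 to float32 in observations". A sketch of that cast (an illustrative re-implementation, not tmrl's actual code; assumes `numpy` is available):

```python
import numpy as np

def cast_obs(obs):
    """Recursively cast float64 arrays in an observation to float32."""
    if isinstance(obs, np.ndarray) and obs.dtype == np.float64:
        return obs.astype(np.float32)
    if isinstance(obs, (tuple, list)):
        return type(obs)(cast_obs(o) for o in obs)
    return obs

obs = (np.zeros(3), np.float32(1.0))  # np.zeros defaults to float64
out = cast_obs(obs)
print(out[0].dtype)  # float32
```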

tmrl/__main__.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ def main(args):
     config_modifiers = args.config
     for k, v in config_modifiers.items():
         config[k] = v
-    rw = RolloutWorker(env_cls=partial(GenericGymEnv, id="real-time-gym-v0", gym_kwargs={"config": config}),
+    rw = RolloutWorker(env_cls=partial(GenericGymEnv, id="real-time-gym-v1", gym_kwargs={"config": config}),
                        actor_module_cls=cfg_obj.POLICY,
                        sample_compressor=cfg_obj.SAMPLE_COMPRESSOR,
                        device='cuda' if cfg.CUDA_INFERENCE else 'cpu',

tmrl/actor.py

Lines changed: 4 additions & 4 deletions
@@ -20,8 +20,8 @@ class ActorModule(ABC):
     def __init__(self, observation_space, action_space):
         """
         Args:
-            observation_space (Gym.spaces.Space): observation space (here for your convenience)
-            action_space (Gym.spaces.Space): action space (here for your convenience)
+            observation_space (gymnasium.spaces.Space): observation space (here for your convenience)
+            action_space (gymnasium.spaces.Space): action space (here for your convenience)
         """
         self.observation_space = observation_space
         self.action_space = action_space
@@ -121,8 +121,8 @@ class TorchActorModule(ActorModule, torch.nn.Module, ABC):
     def __init__(self, observation_space, action_space, device="cpu"):
         """
         Args:
-            observation_space (Gym.spaces.Space): observation space (here for your convenience)
-            action_space (Gym.spaces.Space): action space (here for your convenience)
+            observation_space (gymnasium.spaces.Space): observation space (here for your convenience)
+            action_space (gymnasium.spaces.Space): action space (here for your convenience)
             device: device where your model should live and where observations for `act` will be collated
         """
         super().__init__(observation_space, action_space)  # ActorModule
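The constructor contract shown in these hunks (an actor stores the two spaces passed in for later use) can be sketched with a pure-Python stand-in; `MiniActorModule`, `RandomActor`, and the `act(obs, test=False)` signature are illustrative, not tmrl's actual classes:

```python
from abc import ABC, abstractmethod
import random

class MiniActorModule(ABC):
    """Hypothetical mini version of the ActorModule contract."""
    def __init__(self, observation_space, action_space):
        self.observation_space = observation_space  # kept for convenience
        self.action_space = action_space

    @abstractmethod
    def act(self, obs, test=False):
        """Return an action for observation `obs`."""

class RandomActor(MiniActorModule):
    def act(self, obs, test=False):
        return random.choice(self.action_space)

actor = RandomActor(observation_space=None, action_space=[-1.0, 0.0, 1.0])
print(actor.act(obs=None) in [-1.0, 0.0, 1.0])  # True
```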

tmrl/config/config_objects.py

Lines changed: 2 additions & 2 deletions
@@ -66,7 +66,7 @@
 else:
     SAMPLE_COMPRESSOR = get_local_buffer_sample_tm20_imgs

-# to preprocess observations that come out of the gym environment:
+# to preprocess observations that come out of the gymnasium environment:
 if cfg.PRAGMA_LIDAR:
     if cfg.PRAGMA_PROGRESS:
         OBS_PREPROCESSOR = obs_preprocessor_tm_lidar_progress_act_in_obs
@@ -144,7 +144,7 @@ def sac_v2_entropy_scheduler(agent, epoch):
     agent.entopy_target = start_ent + (end_ent - start_ent) * epoch / end_epoch


-ENV_CLS = partial(GenericGymEnv, id="real-time-gym-v0", gym_kwargs={"config": CONFIG_DICT})
+ENV_CLS = partial(GenericGymEnv, id="real-time-gym-v1", gym_kwargs={"config": CONFIG_DICT})

 if cfg.PRAGMA_LIDAR:  # lidar
     TRAINER = partial(
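The context line from `sac_v2_entropy_scheduler` above is a linear interpolation of the entropy target from `start_ent` to `end_ent` over `end_epoch` epochs. A standalone sketch (the default values and the hold past `end_epoch` are illustrative additions, not tmrl's actual constants):

```python
def entropy_target(epoch, start_ent=-1.0, end_ent=-3.0, end_epoch=100):
    """Linearly interpolate the SAC entropy target, holding after end_epoch."""
    if epoch >= end_epoch:
        return end_ent  # hold at the final value (added for illustration)
    return start_ent + (end_ent - start_ent) * epoch / end_epoch

print(entropy_target(0))    # -1.0
print(entropy_target(50))   # -2.0
print(entropy_target(200))  # -3.0
```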

tmrl/custom/custom_gym_interfaces.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@

 # third-party imports
 import cv2
-import gym.spaces as spaces
+import gymnasium.spaces as spaces
 import numpy as np

0 commit comments
