update mazev5 branch #229

Merged: 24 commits, Oct 1, 2024

Commits
54043fc Fix reproducibility of FetchPickAndPlace-v2 (amacati, Feb 10, 2024)
e89b53e Add tests for same environment rollout determinism (amacati, Feb 10, 2024)
5e14ea1 Fix reproducibility for all robotics environments (amacati, Feb 11, 2024)
02ffe14 Revert changes to mujoco-py. Simplify _reset_sim. Remove mujoco-py en… (amacati, Feb 12, 2024)
11db01a Remove unnecessary forward call (amacati, Feb 15, 2024)
007afa8 Merge branch 'Farama-Foundation:main' into main (amacati, Feb 26, 2024)
9dd88f7 Bump versions of affected environments. Change documentation to refle… (amacati, Feb 29, 2024)
c6b54d6 Fix typo, clarify version change with issue (amacati, Mar 1, 2024)
b451ba7 update to gymnasium==1.0a2 (Kallinteris-Andreas, May 24, 2024)
e0f3fb8 Add asserts for mj_id2name calls in mujoco_utils (DavidPL1, May 29, 2024)
dace06b update changelog (Kallinteris-Andreas, May 29, 2024)
f0c1a32 update changelog (Kallinteris-Andreas, May 29, 2024)
5f6c5e2 Update reach.py (Kallinteris-Andreas, May 29, 2024)
7ab8f63 Update slide.py (Kallinteris-Andreas, May 29, 2024)
5da9ab6 Update reach.py (Kallinteris-Andreas, May 29, 2024)
e525caa Merge pull request #208 from amacati/main (Kallinteris-Andreas, May 29, 2024)
0a213bb Merge pull request #218 from DavidPL1/add-asserts (Kallinteris-Andreas, Jun 1, 2024)
58175c3 Fixed typo (timofriedl, Jul 12, 2024)
6fc76a6 Merge pull request #221 from timofriedl/patch-1 (Kallinteris-Andreas, Jul 12, 2024)
bfa3cd4 Update README.md (Kallinteris-Andreas, Jul 25, 2024)
629c589 Update pyproject.toml (Kallinteris-Andreas, Jul 29, 2024)
ff32e96 Merge pull request #225 from Farama-Foundation/Kallinteris-Andreas-pa… (Kallinteris-Andreas, Aug 29, 2024)
ecf2a01 Merge branch 'main' into Kallinteris-Andreas-patch-3 (Kallinteris-Andreas, Aug 30, 2024)
b1acee9 Merge pull request #223 from Farama-Foundation/Kallinteris-Andreas-pa… (Kallinteris-Andreas, Aug 30, 2024)
5 changes: 2 additions & 3 deletions README.md
@@ -24,6 +24,7 @@ We support and test for Python 3.8, 3.9, 3.10 and 3.11 on Linux and macOS. We wi

* [Fetch](https://robotics.farama.org/envs/fetch/) - A collection of environments with a 7-DoF robot arm that has to perform manipulation tasks such as Reach, Push, Slide or Pick and Place.
* [Shadow Dexterous Hand](https://robotics.farama.org/envs/shadow_dexterous_hand/) - A collection of environments with a 24-DoF anthropomorphic robotic hand that has to perform object manipulation tasks with a cube, egg-object, or pen. There are variations of these environments that also include data from 92 touch sensors in the observation space.
+* [MaMuJoCo](https://robotics.farama.org/envs/MaMuJoCo/) - A collection of multi agent factorizations of the [Gymnasium/MuJoCo](https://gymnasium.farama.org/environments/mujoco/) environments and a framework for factorizing robotic environments, uses the [pettingzoo.ParallelEnv](https://pettingzoo.farama.org/api/parallel/) API.

The [D4RL](https://github.com/Farama-Foundation/D4RL) environments are now available. These environments have been refactored and may not have the same action/observation spaces as the original, please read their documentation:

@@ -32,8 +33,6 @@ The [D4RL](https://github.com/Farama-Foundation/D4RL) environments are now avail
The different tasks involve hammering a nail, opening a door, twirling a pen, or picking up and moving a ball.
* [Franka Kitchen](https://robotics.farama.org/envs/franka_kitchen/) - Multitask environment in which a 9-DoF Franka robot is placed in a kitchen containing several common household items. The goal of each task is to interact with the items in order to reach a desired goal configuration.

-* [MaMuJoCo](https://robotics.farama.org/envs/MaMuJoCo/) - A collection of multi agent factorizations of the [Gymnasium/MuJoCo](https://gymnasium.farama.org/environments/mujoco/) environments and a framework for factorizing robotic environments, uses the [pettingzoo.ParallelEnv](https://pettingzoo.farama.org/api/parallel/) API.
-
**WIP**: generate new `D4RL` environment datasets with [Minari](https://github.com/Farama-Foundation/Minari).

## Multi-goal API
@@ -54,7 +53,7 @@ goal, e.g. state derived from the simulation.
```python
import gymnasium as gym

-env = gym.make("FetchReach-v2")
+env = gym.make("FetchReach-v3")
env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

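The diff view truncates the Multi-goal API example above; here is a minimal runnable sketch of the dictionary observation space and goal-conditioned reward, assuming the standard `observation`/`achieved_goal`/`desired_goal` keys and the `compute_reward` method these environments expose:

```python
import gymnasium as gym
import gymnasium_robotics
import numpy as np

gym.register_envs(gymnasium_robotics)

env = gym.make("FetchReach-v3")
obs, info = env.reset(seed=0)

# Multi-goal observations are dictionaries with three keys.
print(obs["observation"].shape)    # proprioceptive robot state
print(obs["achieved_goal"].shape)  # goal achieved in the current state
print(obs["desired_goal"].shape)   # goal the agent should reach

obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

# The reward is a pure function of the goals, which enables hindsight relabeling.
recomputed = env.unwrapped.compute_reward(obs["achieved_goal"], obs["desired_goal"], info)
assert np.isclose(recomputed, reward)
```
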
2 changes: 1 addition & 1 deletion docs/content/multi-goal_api.md
@@ -25,7 +25,7 @@ import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make("FetchReach-v2")
+env = gym.make("FetchReach-v3")
env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

8 changes: 4 additions & 4 deletions docs/envs/fetch/index.md
@@ -7,10 +7,10 @@ lastpage:

The Fetch environments are based on the 7-DoF [Fetch Mobile Manipulator](https://fetchrobotics.com/) arm, with a two-fingered parallel gripper attached to it. The main environment tasks are the following:

-* `FetchReach-v2`: Fetch has to move its end-effector to the desired goal position.
-* `FetchPush-v2`: Fetch has to move a box by pushing it until it reaches a desired goal position.
-* `FetchSlide-v2`: Fetch has to hit a puck across a long table such that it slides and comes to rest on the desired goal.
-* `FetchPickAndPlace-v2`: Fetch has to pick up a box from a table using its gripper and move it to a desired goal above the table.
+* `FetchReach-v3`: Fetch has to move its end-effector to the desired goal position.
+* `FetchPush-v3`: Fetch has to move a box by pushing it until it reaches a desired goal position.
+* `FetchSlide-v3`: Fetch has to hit a puck across a long table such that it slides and comes to rest on the desired goal.
+* `FetchPickAndPlace-v3`: Fetch has to pick up a box from a table using its gripper and move it to a desired goal above the table.

```{raw} html
:file: list.html
2 changes: 1 addition & 1 deletion docs/envs/shadow_dexterous_hand/index.md
@@ -7,7 +7,7 @@ lastpage:

These environments are based on the [Shadow Dexterous Hand](https://www.shadowrobot.com/), which is an anthropomorphic robotic hand with 24 degrees of freedom. Of those 24 joints, 20 can be controlled independently whereas the remaining ones are coupled joints.

-* `HandReach-v1`: ShadowHand has to reach with its thumb and a selected finger until they meet at a desired goal position above the palm.
+* `HandReach-v2`: ShadowHand has to reach with its thumb and a selected finger until they meet at a desired goal position above the palm.
* `HandManipulateBlock-v1`: ShadowHand has to manipulate a block until it achieves a desired goal position and rotation.
* `HandManipulateEgg-v1`: ShadowHand has to manipulate an egg until it achieves a desired goal position and rotation.
* `HandManipulatePen-v1`: ShadowHand has to manipulate a pen until it achieves a desired goal position and rotation.
2 changes: 1 addition & 1 deletion docs/index.md
@@ -56,7 +56,7 @@ import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make("FetchPickAndPlace-v2", render_mode="human")
+env = gym.make("FetchPickAndPlace-v3", render_mode="human")
observation, info = env.reset(seed=42)
for _ in range(1000):
action = policy(observation) # User-defined policy function
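The snippet above is cut off by the diff view; for completeness, a runnable variant with a random action standing in for the user-defined `policy` function (the stand-in is illustrative, not part of the PR):

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

env = gym.make("FetchPickAndPlace-v3", render_mode="human")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # random stand-in for a learned policy
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
```
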
10 changes: 5 additions & 5 deletions gymnasium_robotics/__init__.py
@@ -30,7 +30,7 @@ def _merge(a, b):
)

register(
-id=f"FetchSlide{suffix}-v2",
+id=f"FetchSlide{suffix}-v3",
entry_point="gymnasium_robotics.envs.fetch.slide:MujocoFetchSlideEnv",
kwargs=kwargs,
max_episode_steps=50,
@@ -44,7 +44,7 @@ def _merge(a, b):
)

register(
-id=f"FetchPickAndPlace{suffix}-v2",
+id=f"FetchPickAndPlace{suffix}-v3",
entry_point="gymnasium_robotics.envs.fetch.pick_and_place:MujocoFetchPickAndPlaceEnv",
kwargs=kwargs,
max_episode_steps=50,
@@ -58,7 +58,7 @@ def _merge(a, b):
)

register(
-id=f"FetchReach{suffix}-v2",
+id=f"FetchReach{suffix}-v3",
entry_point="gymnasium_robotics.envs.fetch.reach:MujocoFetchReachEnv",
kwargs=kwargs,
max_episode_steps=50,
@@ -72,7 +72,7 @@ def _merge(a, b):
)

register(
-id=f"FetchPush{suffix}-v2",
+id=f"FetchPush{suffix}-v3",
entry_point="gymnasium_robotics.envs.fetch.push:MujocoFetchPushEnv",
kwargs=kwargs,
max_episode_steps=50,
@@ -87,7 +87,7 @@ def _merge(a, b):
)

register(
-id=f"HandReach{suffix}-v1",
+id=f"HandReach{suffix}-v2",
entry_point="gymnasium_robotics.envs.shadow_dexterous_hand.reach:MujocoHandReachEnv",
kwargs=kwargs,
max_episode_steps=50,
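Since the registrations above replace the old version suffixes rather than keeping them, code pinned to the v2 ids fails after this change; a short sketch of the effect (assuming no other module re-registers the old ids):

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

env = gym.make("FetchPush-v3")             # new id registered by this PR
dense_env = gym.make("FetchPushDense-v3")  # dense-reward variant

# The old id is no longer registered, so gym.make raises an error.
try:
    gym.make("FetchPush-v2")
except gym.error.Error as exc:
    print(f"FetchPush-v2 unavailable: {exc}")
```
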
7 changes: 2 additions & 5 deletions gymnasium_robotics/envs/fetch/fetch_env.py
@@ -373,11 +373,8 @@ def _render_callback(self):
self._mujoco.mj_forward(self.model, self.data)

def _reset_sim(self):
-self.data.time = self.initial_time
-self.data.qpos[:] = np.copy(self.initial_qpos)
-self.data.qvel[:] = np.copy(self.initial_qvel)
-if self.model.na != 0:
-self.data.act[:] = None
+# Reset buffers for joint states, actuators, warm-start, control buffers etc.
+self._mujoco.mj_resetData(self.model, self.data)

# Randomize start position of object.
if self.has_object:
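The change above swaps the manual restore of `time`/`qpos`/`qvel` for `mujoco.mj_resetData`, which also clears state the old code left behind (actuator activations, warm-start accelerations, control buffers). A standalone sketch of the difference on a trivial hypothetical model, not taken from the PR:

```python
import mujoco
import numpy as np

XML = """
<mujoco>
  <worldbody>
    <body>
      <joint name="slider" type="slide"/>
      <geom size="0.1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Perturb the state, including a buffer the old manual reset never touched.
data.qpos[:] = 1.0
data.qvel[:] = 2.0
data.qacc_warmstart[:] = 3.0  # stale warm-start data breaks reproducibility

# mj_resetData restores every field of mjData to its defaults in one call.
mujoco.mj_resetData(model, data)
assert np.all(data.qpos == 0.0) and np.all(data.qacc_warmstart == 0.0)
```
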
7 changes: 4 additions & 3 deletions gymnasium_robotics/envs/fetch/pick_and_place.py
@@ -88,15 +88,15 @@ class MujocoFetchPickAndPlaceEnv(MujocoFetchEnv, EzPickle):
- *sparse*: the returned reward can have two values: `-1` if the block hasn't reached its final target position, and `0` if the block is in the final target position (the block is considered to have reached the goal if the Euclidean distance between both is lower than 0.05 m).
- *dense*: the returned reward is the negative Euclidean distance between the achieved goal position and the desired goal.

-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchPickAndPlace-v2`. However, for `dense` reward the id must be modified to `FetchPickAndPlaceDense-v2` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchPickAndPlace-v3`. However, for `dense` reward the id must be modified to `FetchPickAndPlaceDense-v3` and initialized as follows:

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchPickAndPlaceDense-v2')
+env = gym.make('FetchPickAndPlaceDense-v3')
```

## Starting State
@@ -125,11 +125,12 @@ class MujocoFetchPickAndPlaceEnv(MujocoFetchEnv, EzPickle):

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchPickAndPlace-v2', max_episode_steps=100)
+env = gym.make('FetchPickAndPlace-v3', max_episode_steps=100)
```

## Version History

+* v3: Fixed bug: `env.reset()` not properly resetting the internal state. Fetch environments now properly reset their state (related [GitHub issue](https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/207)).
* v2: the environment depends on the newest [mujoco python bindings](https://mujoco.readthedocs.io/en/latest/python.html) maintained by the MuJoCo team in Deepmind.
* v1: the environment depends on `mujoco_py` which is no longer maintained.
"""
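The new v3 version-history entry references the reproducibility fix; below is a sketch of the kind of rollout-determinism check the PR adds tests for (the actual test code in the repository may differ):

```python
import gymnasium as gym
import gymnasium_robotics
import numpy as np

gym.register_envs(gymnasium_robotics)

def rollout(seed: int, n_steps: int = 20) -> np.ndarray:
    """Collect a short trajectory with seeded reset and seeded actions."""
    env = gym.make("FetchPickAndPlace-v3")
    obs, _ = env.reset(seed=seed)
    env.action_space.seed(seed)
    states = [obs["observation"]]
    for _ in range(n_steps):
        obs, *_ = env.step(env.action_space.sample())
        states.append(obs["observation"])
    env.close()
    return np.stack(states)

# With the _reset_sim fix, identically seeded rollouts are identical.
assert np.array_equal(rollout(seed=42), rollout(seed=42))
```
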
8 changes: 4 additions & 4 deletions gymnasium_robotics/envs/fetch/push.py
@@ -116,15 +116,15 @@ class MujocoFetchPushEnv(MujocoFetchEnv, EzPickle):
- *sparse*: the returned reward can have two values: `-1` if the block hasn't reached its final target position, and `0` if the block is in the final target position (the block is considered to have reached the goal if the Euclidean distance between both is lower than 0.05 m).
- *dense*: the returned reward is the negative Euclidean distance between the achieved goal position and the desired goal.

-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchPush-v2`. However, for `dense` reward the id must be modified to `FetchPush-v2` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchPush-v3`. However, for `dense` reward the id must be modified to `FetchPushDense-v3` and initialized as follows:

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchPushDense-v2')
+env = gym.make('FetchPushDense-v3')
```

## Starting State
@@ -153,11 +153,11 @@ class MujocoFetchPushEnv(MujocoFetchEnv, EzPickle):

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchPush-v2', max_episode_steps=100)
+env = gym.make('FetchPush-v3', max_episode_steps=100)
```

## Version History

+* v3: Fixed bug: `env.reset()` not properly resetting the internal state. Fetch environments now properly reset their state (related [GitHub issue](https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/207)).
* v2: the environment depends on the newest [mujoco python bindings](https://mujoco.readthedocs.io/en/latest/python.html) maintained by the MuJoCo team in Deepmind.
* v1: the environment depends on `mujoco_py` which is no longer maintained.
"""
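The sparse/dense reward descriptions repeated in these docstrings reduce to a thresholded versus raw goal distance; an illustrative reimplementation follows, not the library's actual code path:

```python
import numpy as np

def fetch_reward(achieved_goal, desired_goal, reward_type="sparse", threshold=0.05):
    """Goal-distance reward as described in the Fetch docstrings."""
    distance = np.linalg.norm(achieved_goal - desired_goal, axis=-1)
    if reward_type == "sparse":
        # 0 when within 5 cm of the goal, -1 otherwise.
        return np.where(distance < threshold, 0.0, -1.0)
    # Dense variant: negative Euclidean distance to the goal.
    return -distance

goal = np.array([1.3, 0.7, 0.5])
achieved = np.array([1.3, 0.7, 0.54])
print(fetch_reward(achieved, goal))           # 0.0 (within 0.05 m)
print(fetch_reward(achieved, goal, "dense"))  # -0.04 (negative distance)
```
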
10 changes: 5 additions & 5 deletions gymnasium_robotics/envs/fetch/reach.py
@@ -77,16 +77,16 @@ class MujocoFetchReachEnv(MujocoFetchEnv, EzPickle):
the end effector and the goal is lower than 0.05 m).
- *dense*: the returned reward is the negative Euclidean distance between the achieved goal position and the desired goal.

-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchReach-v2`. However, for `dense`
-reward the id must be modified to `FetchReachDense-v2` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchReach-v3`. However, for `dense`
+reward the id must be modified to `FetchReachDense-v3` and initialized as follows:

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchReachDense-v2')
+env = gym.make('FetchReachDense-v3')
```

## Starting State
@@ -111,11 +111,11 @@ class MujocoFetchReachEnv(MujocoFetchEnv, EzPickle):

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchReach-v2', max_episode_steps=100)
+env = gym.make('FetchReach-v3', max_episode_steps=100)
```

## Version History

+* v3: Fixed bug: `env.reset()` not properly resetting the internal state. Fetch environments now properly reset their state (related [GitHub issue](https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/207)).
* v2: the environment depends on the newest [mujoco python bindings](https://mujoco.readthedocs.io/en/latest/python.html) maintained by the MuJoCo team in Deepmind.
* v1: the environment depends on `mujoco_py` which is no longer maintained.
"""
8 changes: 4 additions & 4 deletions gymnasium_robotics/envs/fetch/slide.py
@@ -116,15 +116,15 @@ class MujocoFetchSlideEnv(MujocoFetchEnv, EzPickle):
- *sparse*: the returned reward can have two values: `-1` if the puck hasn't reached its final target position, and `0` if the puck is in the final target position (the puck is considered to have reached the goal if the Euclidean distance between both is lower than 0.05 m).
- *dense*: the returned reward is the negative Euclidean distance between the achieved goal position and the desired goal.

-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchSlide-v2`. However, for `dense` reward the id must be modified to `FetchSlideDense-v2` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `FetchSlide-v3`. However, for `dense` reward the id must be modified to `FetchSlideDense-v3` and initialized as follows:

```python
import gymnasium as gym
import gymnasium_robotics

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchSlideDense-v2')
+env = gym.make('FetchSlideDense-v3')
```

## Starting State
@@ -152,11 +152,11 @@ class MujocoFetchSlideEnv(MujocoFetchEnv, EzPickle):

gym.register_envs(gymnasium_robotics)

-env = gym.make('FetchSlide-v2', max_episode_steps=100)
+env = gym.make('FetchSlide-v3', max_episode_steps=100)
```

## Version History

+* v3: Fixed bug: `env.reset()` not properly resetting the internal state. Fetch environments now properly reset their state (related [GitHub issue](https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/207)).
* v2: the environment depends on the newest [mujoco python bindings](https://mujoco.readthedocs.io/en/latest/python.html) maintained by the MuJoCo team in Deepmind.
* v1: the environment depends on `mujoco_py` which is no longer maintained.
"""
11 changes: 3 additions & 8 deletions gymnasium_robotics/envs/robot_env.py
@@ -190,7 +190,7 @@ def reset(
def _mujoco_step(self, action):
"""Advance the mujoco simulation.

-Override depending on the python binginds, either mujoco or mujoco_py
+Override depending on the python bindings, either mujoco or mujoco_py
"""
raise NotImplementedError

@@ -299,13 +299,8 @@ def _initialize_simulation(self):
self.initial_qvel = np.copy(self.data.qvel)

def _reset_sim(self):
-self.data.time = self.initial_time
-self.data.qpos[:] = np.copy(self.initial_qpos)
-self.data.qvel[:] = np.copy(self.initial_qvel)
-if self.model.na != 0:
-self.data.act[:] = None
-
-mujoco.mj_forward(self.model, self.data)
+# Reset buffers for joint states, warm-start, control buffers etc.
+mujoco.mj_resetData(self.model, self.data)
return super()._reset_sim()

def render(self):
10 changes: 5 additions & 5 deletions gymnasium_robotics/envs/shadow_dexterous_hand/reach.py
@@ -306,13 +306,13 @@ class MujocoHandReachEnv(get_base_hand_reanch_env(MujocoHandEnv)):
the achieved goal vector and the desired goal vector is lower than 0.01).
- *dense*: the returned reward is the negative 2-norm distance between the achieved goal vector and the desired goal vector.

-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `HandReach-v1`.
-However, for `dense` reward the id must be modified to `HandReachDense-v1` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `HandReach-v2`.
+However, for `dense` reward the id must be modified to `HandReachDense-v2` and initialized as follows:

```
import gymnasium as gym

-env = gym.make('HandReachDense-v1')
+env = gym.make('HandReachDense-v2')
```

## Starting State
@@ -383,11 +383,11 @@ class MujocoHandReachEnv(get_base_hand_reanch_env(MujocoHandEnv)):
```
import gymnasium as gym

-env = gym.make('HandReach-v1', max_episode_steps=100)
+env = gym.make('HandReach-v2', max_episode_steps=100)
```

## Version History

+* v2: Fixed bug: `env.reset()` not properly resetting the internal state. Fetch environments now properly reset their state (related [GitHub issue](https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/207)).
* v1: the environment depends on the newest [mujoco python bindings](https://mujoco.readthedocs.io/en/latest/python.html) maintained by the MuJoCo team in Deepmind.
* v0: the environment depends on `mujoco_py` which is no longer maintained.
