Implement Predator-Prey Flock Environment #259
base: main
Conversation
* Initial prototype
* feat: Add environment tests
* fix: Update esquilax version to fix type issues
* docs: Add docstrings
* docs: Add docstrings
* test: Test multiple reward types
* test: Add smoke tests and add max-steps check
* feat: Implement pred-prey environment viewer
* refactor: Pull out common viewer functionality
* test: Add reward and view tests
* test: Add rendering tests and add test docstrings
* docs: Add predator-prey environment documentation page
* docs: Cleanup docstrings
* docs: Cleanup docstrings
Here you go @sash-a, this is correct now. Will take a look at the contributor license and CI failure now.

I think the CI issue is that I've got Esquilax pinned to a Python version.

The Python version PR is merged now, so hopefully it will pass 😄 Should have time during the week to review this, really appreciate the contribution!
An initial review with some high-level comments about jumanji conventions. Will go through it more in depth once these are addressed. In general it's looking really nice and well documented!

Not quite sure on the new swarms package, but also not sure where else we would put it. Not sure on it especially if we only have one env and no new ones planned.

One thing I don't quite understand is the benefit of amap over vmap, specifically in the case of this env?

Please @ me when it's ready for another review or if you have any questions.
import chex


@dataclass
I assume these are fixed attributes for each agent? If so, can we be explicit that it is frozen:

-@dataclass
+@dataclass(frozen=True)
Correct yeah these remain fixed after creation, will add.
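To illustrate the suggestion above, here is a minimal sketch of what a frozen parameter dataclass buys you. This uses the standard-library `dataclasses` decorator (the `chex.dataclass` decorator accepts the same `frozen=True` flag); the field names besides `min_speed`/`max_speed` are hypothetical, not the PR's actual definition:

```python
from dataclasses import dataclass, FrozenInstanceError

# frozen=True makes attribute assignment raise, so fixed per-agent
# parameters cannot be mutated after creation.
@dataclass(frozen=True)
class AgentParams:
    max_rotate: float  # hypothetical field: max heading change per step
    min_speed: float
    max_speed: float

params = AgentParams(max_rotate=0.1, min_speed=0.01, max_speed=0.05)

try:
    params.max_speed = 1.0  # rejected: the dataclass is frozen
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Attempting to assign raises `FrozenInstanceError`, so `mutated` ends up `False` and the original value is preserved.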
return new_heading, new_speeds


@esquilax.transforms.amap
For this function why not just vmap? Since you don't use the params or key.
Yes in this case this is overkill, I'll use vmap inside the update function.
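For reference, a plain `jax.vmap` version of a per-agent update might look like the sketch below. The update rule and constants are illustrative stand-ins, not the PR's actual steering logic:

```python
import jax
import jax.numpy as jnp

def update_agent(heading, speed, action):
    # Hypothetical steering update: action[0] turns, action[1] accelerates.
    new_heading = (heading + 0.1 * action[0]) % (2.0 * jnp.pi)
    new_speed = jnp.clip(speed + 0.01 * action[1], 0.01, 0.05)
    return new_heading, new_speed

# vmap maps over the leading axis of every argument, with no need to
# thread params or a random key through the transform.
update_agents = jax.vmap(update_agent)

headings = jnp.zeros((4,))
speeds = jnp.full((4,), 0.02)
actions = jnp.ones((4, 2))
new_headings, new_speeds = update_agents(headings, speeds, actions)
```

Since the function is deterministic per agent, `vmap` alone is enough here; `esquilax.transforms.amap` would only be needed if shared parameters or per-agent keys had to be broadcast in.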
from . import types
from .types import AgentParams
By convention we don't do relative imports in jumanji. Also, I see you're using both types.AgentParams and just AgentParams; I think I prefer just using AgentParams, so:

-from . import types
-from .types import AgentParams
+from jumanji.environments.swarms.common.types import AgentParams
I'll switch out relative imports (there's a few around). In this case I also have types.AgentState, so will import the types module.
def init_state(
    n: int, params: types.AgentParams, key: chex.PRNGKey
) -> types.AgentState:
    """
    Randomly initialise state of a group of agents

    Args:
        n: Number of agents to initialise.
        params: Agent parameters.
        key: JAX random key.

    Returns:
        AgentState: Random agent states (i.e. position, headings, and speeds)
    """
    k1, k2, k3 = jax.random.split(key, 3)

    positions = jax.random.uniform(k1, (n, 2))
    speeds = jax.random.uniform(
        k2, (n,), minval=params.min_speed, maxval=params.max_speed
    )
    headings = jax.random.uniform(k3, (n,), minval=0.0, maxval=2.0 * jax.numpy.pi)

    return types.AgentState(
        pos=positions,
        speed=speeds,
        heading=headings,
    )
I think it would be nice to turn this into a generator, it's a convention for making it easy to switch the initial state distribution. See cleaner for a good example of how we do generators
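As a rough illustration of the generator pattern being suggested, here is a self-contained sketch. The class names and the tuple return type are placeholders (jumanji's actual generators return environment state types and live in a `generator.py` module), but the shape of the abstraction is the point: the initial-state distribution becomes swappable:

```python
import abc
import jax
import jax.numpy as jnp

class Generator(abc.ABC):
    """Interface for initial-state distributions (sketch of the pattern)."""

    def __init__(self, num_agents: int):
        self.num_agents = num_agents

    @abc.abstractmethod
    def __call__(self, key):
        """Generate initial positions, speeds, and headings from a PRNG key."""

class RandomGenerator(Generator):
    """Uniformly random initial states, mirroring the init_state logic above."""

    def __call__(self, key):
        k1, k2, k3 = jax.random.split(key, 3)
        pos = jax.random.uniform(k1, (self.num_agents, 2))
        speed = jax.random.uniform(k2, (self.num_agents,), minval=0.01, maxval=0.05)
        heading = jax.random.uniform(
            k3, (self.num_agents,), minval=0.0, maxval=2.0 * jnp.pi
        )
        return pos, speed, heading

pos, speed, heading = RandomGenerator(8)(jax.random.PRNGKey(0))
```

The environment would then take a `Generator` instance in its constructor, so a clustered or adversarial start distribution is just another subclass.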
        AgentState: Updated state of the agents after applying steering
            actions and updating positions.
    """
    actions = jax.numpy.clip(actions, min=-1.0, max=1.0)
Convention in jumanji is to import jax.numpy as jnp and then use jnp everywhere instead of jax.numpy.
Can we rename this to reward.py to keep with jumanji convention, and can you follow this convention for how we write our reward functions 🙏
Oh nice, yeah. I meant to ask about generic reward functions; reward tuning can be a large part of these multi-agent environments.
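A minimal sketch of pluggable reward functions, in the spirit of passing a reward object into the environment constructor. The interface, reward rules, and constants here are all illustrative assumptions, not the PR's actual code:

```python
import abc
import jax.numpy as jnp

class RewardFn(abc.ABC):
    """Maps agent positions to per-predator rewards; swapping the
    instance swaps the reward shaping without touching the env."""

    @abc.abstractmethod
    def __call__(self, predator_pos, prey_pos):
        ...

class DistanceReward(RewardFn):
    """Dense reward: negative distance to the closest prey."""

    def __call__(self, predator_pos, prey_pos):
        # Pairwise distances, shape (n_predators, n_prey).
        d = jnp.linalg.norm(predator_pos[:, None] - prey_pos[None, :], axis=-1)
        return -jnp.min(d, axis=1)

class SparseReward(RewardFn):
    """Sparse reward: 1.0 only when a prey is within capture range."""

    def __init__(self, capture_radius: float = 0.05):
        self.capture_radius = capture_radius

    def __call__(self, predator_pos, prey_pos):
        d = jnp.linalg.norm(predator_pos[:, None] - prey_pos[None, :], axis=-1)
        return (jnp.min(d, axis=1) < self.capture_radius).astype(jnp.float32)

pred = jnp.array([[0.0, 0.0], [0.5, 0.5]])
prey = jnp.array([[0.0, 0.01]])
dense = DistanceReward()(pred, prey)
sparse = SparseReward()(pred, prey)
```

With this shape, the `sparse_rewards` flag discussed later in the review becomes a choice of which `RewardFn` instance to construct the environment with.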
@dataclass
class Observation:
    """
    predators: Local view of predator agents.
    prey: Local view of prey agents.
    """

    predators: chex.Array
    prey: chex.Array
By convention our observations are NamedTuples and also need to be very well documented, see here.
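A sketch of what the NamedTuple form could look like. The field docs and shape hints are illustrative assumptions (`Any` stands in for `chex.Array` so the snippet is self-contained):

```python
from typing import Any, NamedTuple

class Observation(NamedTuple):
    """Observation for one team of agents.

    predators: local view of predator agents,
        e.g. shape (num_agents, num_vision)  # hypothetical shape
    prey: local view of prey agents,
        e.g. shape (num_agents, num_vision)  # hypothetical shape
    """

    predators: Any
    prey: Any

obs = Observation(predators=[0.1, 0.2], prey=[0.3])
```

One practical reason for the convention: NamedTuples are immutable and are handled as pytrees by JAX out of the box, with no extra registration needed, so they pass cleanly through `jit`/`vmap`.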
What's the reason for this (for my own knowledge)?
@dataclass
class Actions:
    """
    predators: Array of actions for predator agents.
    prey: Array of actions for prey agents.
    """

    predators: chex.Array
    prey: chex.Array


@dataclass
class Rewards:
    """
    predators: Array of individual rewards for predator agents.
    prey: Array of individual rewards for prey agents.
    """

    predators: chex.Array
    prey: chex.Array
I see this is repeated multiple times, maybe a PredatorPrey type would be best? Although not sure about this.
Yeah I did this in the prototype to indicate something that just had the two fields. I guess you could say for readability and in the strict typing sense these should be different things (was my thinking here)? But also appreciate the repetition is a bit ugly.
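One way to get both readable names and a single definition is one container plus aliases; a rough sketch (names and the alias approach are just one option, not a settled design):

```python
from typing import Any, NamedTuple

class PredatorPrey(NamedTuple):
    """Generic per-team container: one value for predators, one for prey."""

    predators: Any
    prey: Any

# Aliases keep signatures self-describing while sharing one definition.
# (Note these are the *same* type, so this trades away strict typing.)
Rewards = PredatorPrey
Actions = PredatorPrey

r = Rewards(predators=[1.0], prey=[-1.0])
```

If the strict-typing distinction matters more than the deduplication, keeping separate classes (as the PR does) is the defensible alternative.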
pos: chex.Array
heading: chex.Array
speed: chex.Array
For all types we add shape comments so it's easy to understand what we're expecting when debugging, e.g. here.
Oh yes, I'll add them in.
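For concreteness, a sketch of the fields above with shape comments in the suggested style. The shapes follow from the `init_state` code earlier in the review (positions drawn uniformly in the unit square, headings in [0, 2π), speeds between the parameter bounds); the plain `@dataclass` stands in for the chex one:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class AgentState:
    """State of a group of agents, with jumanji-style shape comments."""

    pos: Any      # (num_agents, 2) positions in [0, 1)
    heading: Any  # (num_agents,) headings in [0, 2*pi)
    speed: Any    # (num_agents,) speeds in [min_speed, max_speed]

state = AgentState(pos=[[0.5, 0.5]], heading=[0.0], speed=[0.02])
```

The comments carry the expected array shapes so a debugger or a failing test immediately tells you which axis went wrong.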
if self.sparse_rewards:
    rewards = self._state_to_sparse_rewards(state)
else:
    rewards = self._state_to_distance_rewards(state)
Can you change this to how we set up different reward functions in jumanji, see here
As for your questions in the description:

* Nope, just the environment is fine.
* Please do add animation, it's a great help.
* We do want defaults; I think we can discuss what makes sense.
* It's generated with mkdocs; we need an entry in the docs config.

One big thing I've realized this is missing after my review is training code. We like to validate that the env works. I'm not 100% sure if this is possible because the env has two teams, so which reward do you optimize? Maybe train against a simple heuristic, e.g. you are the predator and the prey moves randomly? For examples see the
Add a predator-prey flock environment where two sets of agents attempt to catch/evade each other.

Changes

* Added a new swarm environment group/type (was not sure the new environment fit into an existing group, but happy to move if you think it would better fit somewhere else)

Todo

Questions

* The environment is forwarded in jumanji.environments; do types also need forwarding somewhere?
* Didn't add an animate method to the environment, but saw that some others do? Easy enough to add.