Refactor Conditional GFlowNets #431
Conversation
younik left a comment
Just a few comments; good to go for me, but I would wait for @josephdviviano as he understands this code better
src/gfn/containers/trajectories.py (Outdated)

```python
# Concatenate conditions of the trajectories.
if self.conditions is not None and other.conditions is not None:
    self.conditions = torch.cat((self.conditions, other.conditions), dim=0)
else:
    self.conditions = None
```
can we maybe add a test for extending with conditions, and then try common ops like get_item to check the output is as expected?
> can we maybe add a test for extending with conditions

I will add one.

> and then try common ops like get_item to check the output is as expected?

I have no idea what this means. Could you elaborate more?
I mean in the test, after calling extend, check if the extend operation gave the expected result.
Like here:
torchgfn/testing/test_states.py, lines 432 to 454 in c3f3096:

```python
pre_extend_shape = state2.batch_shape
state1.extend(state2)
assert state2.batch_shape == pre_extend_shape
# Check final shape should be (max_len=3, B=4)
assert state1.batch_shape == (3, 4)
# The actual count might be higher due to padding with sink states
assert state1.tensor.x.size(0) == expected_nodes
assert state1.tensor.num_edges == expected_edges
# Check if states are extended as expected
assert (state1[0, 0].tensor.x == datas[0].x).all()
assert (state1[0, 1].tensor.x == datas[1].x).all()
assert (state1[0, 2].tensor.x == datas[4].x).all()
assert (state1[0, 3].tensor.x == datas[5].x).all()
assert (state1[1, 0].tensor.x == datas[2].x).all()
assert (state1[1, 1].tensor.x == datas[3].x).all()
assert (state1[1, 2].tensor.x == datas[6].x).all()
assert (state1[1, 3].tensor.x == datas[7].x).all()
assert (state1[2, 0].tensor.x == MyGraphStates.sf.x).all()
assert (state1[2, 1].tensor.x == MyGraphStates.sf.x).all()
assert (state1[2, 2].tensor.x == datas[8].x).all()
assert (state1[2, 3].tensor.x == datas[9].x).all()
```
I see. I will add a test soon!
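A sketch of what such a test might look like; `make_trajectories` is a hypothetical helper standing in for the library's fixtures, and the exact indexing semantics of `Trajectories.__getitem__` are assumptions:

```python
import torch

def test_extend_with_conditions():
    # Two small batches of trajectories, each carrying a (batch_size, condition_vector_dim)
    # conditions tensor. `make_trajectories` is hypothetical.
    traj1 = make_trajectories(batch_size=4, conditions=torch.randn(4, 2))
    traj2 = make_trajectories(batch_size=3, conditions=torch.randn(3, 2))
    cond1, cond2 = traj1.conditions.clone(), traj2.conditions.clone()

    traj1.extend(traj2)

    # Conditions should be concatenated along the batch dimension.
    assert traj1.conditions.shape == (7, 2)
    assert torch.equal(traj1.conditions[:4], cond1)
    assert torch.equal(traj1.conditions[4:], cond2)

    # Common ops like __getitem__ should index conditions consistently.
    sub = traj1[[0, 5]]
    assert torch.equal(sub.conditions[0], cond1[0])
    assert torch.equal(sub.conditions[1], cond2[1])
```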
src/gfn/env.py (Outdated)

```python
def reward(self, states: States, conditions: torch.Tensor) -> torch.Tensor:
    """Compute rewards for the conditional environment.

    Args:
        states: The states to compute rewards for.
            states.tensor.shape should be (batch_size, *state_shape)
        conditions: The conditions to compute rewards for.
            conditions.shape should be (batch_size, condition_vector_dim)

    Returns:
        A tensor of shape (batch_size,) containing the rewards.
    """
    raise NotImplementedError
```
aha, this is not a real subclass of Env, as conditions are mandatory (i.e., you can't call this function treating the object as an Env when it is actually a ConditionalEnv).
Would it make sense to have a default condition?
If not, this probably shouldn't inherit from Env.
> Would it make sense to have a default condition?

How could having a default condition solve the problem?

> If not, this shouldn't inherit from Env probably.

Maybe, but we still need a parent class that defines the default methods for Envs, like reward, step, etc.
> How could having a default condition solve the problem?

If we have a function like this:

```python
def get_reward(env: Env, states: States) -> torch.Tensor:
    return env.reward(states)
```

This should work with any Env object, given the interface of Env.
However, currently, if I pass a ConditionalEnv (which is an Env), this will fail because you need to specify the conditioning. If you have a default value for the conditioning, the get_reward function will work properly (indeed, with a default, the reward interface of ConditionalEnv becomes a subtype of the one of Env).
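A minimal sketch of that idea; the `default_condition` attribute and the `conditional_reward` hook are assumptions for illustration, not part of the PR (import paths follow the repo layout seen in the diffs):

```python
import torch

from gfn.env import Env
from gfn.states import States

# Sketch only: with a default, ConditionalEnv.reward becomes call-compatible
# with Env.reward, so generic code like get_reward(env, states) keeps working.
class ConditionalEnv(Env):
    default_condition: torch.Tensor  # hypothetical, shape (condition_vector_dim,)

    def reward(self, states: States, conditions: torch.Tensor | None = None) -> torch.Tensor:
        if conditions is None:
            # Broadcast the default condition to (batch_size, condition_vector_dim).
            conditions = self.default_condition.expand(len(states), -1)
        return self.conditional_reward(states, conditions)

    def conditional_reward(self, states: States, conditions: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError  # hypothetical hook for subclasses
```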
An alternative approach would be to have the conditions live inside the states themselves (states could have a conditioning field that is None unless conditioning is required, and then anything that accepts States follows a different path when conditioning is present).
The env itself would only be conditional or not depending on the logic the user defines in the reward and step functions. No actual ConditionalEnv class would be required.
The estimators would also optionally use the conditioning information, if it's present, just like how it's done currently.
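A sketch of that alternative; the field name, constructor signature, and the concatenation-based conditional path are illustrative assumptions:

```python
import torch

class States:
    def __init__(self, tensor: torch.Tensor, conditioning: torch.Tensor | None = None):
        self.tensor = tensor
        # None unless conditioning is required; (batch_size, condition_vector_dim).
        self.conditioning = conditioning

def estimator_forward(module: torch.nn.Module, states: States) -> torch.Tensor:
    # Anything that accepts States takes a different path when conditioning is present.
    if states.conditioning is not None:
        return module(torch.cat([states.tensor, states.conditioning], dim=-1))
    return module(states.tensor)
```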
Now I'm seeing something I never noticed before: this class makes the hot path for computing from the conditions tensor-based, which may or may not be more torch.compile-friendly than keeping conditions in the States class.
The use of a ConditionalEnv is growing on me. I don't mind the changing API, but I would prefer if this logic somehow lived in Env directly. I keep changing my mind on the best design; I suppose it depends on whether we think putting the conditions in States is ultimately a good design.
josephdviviano left a comment
Overall a really nice PR, but I have a few questions about changes that seem unrelated to the goal (in particular, I think we remove a few checks that might have side effects not captured in our test suites), and I wonder if it would be cleaner for the conditioning to live directly within the States class, which would help avoid a lot of added complexity. We can discuss in the standup. Great work!
```diff
  self.conditions = conditions
  assert self.conditions is None or (
-     self.conditions.shape[: len(batch_shape)] == batch_shape
+     len(self.conditions.shape) == 2
```
right, because we assume the conditioning would not change through the trajectory?
```diff
- self._log_rewards[self.is_terminating] = self.env.log_reward(
+ if isinstance(self.env, ConditionalEnv):
+     assert self.conditions is not None
+     log_reward_fn = partial(
```
nice!
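The diff above is truncated; a plausible completion of the pattern, where the keyword name, the mask-based indexing, and `terminating_states` are guesses from the visible lines rather than confirmed code:

```python
from functools import partial

if isinstance(self.env, ConditionalEnv):
    assert self.conditions is not None
    # Bind the per-trajectory conditions so log_reward_fn has the plain
    # States -> Tensor signature used below.
    log_reward_fn = partial(
        self.env.log_reward, conditions=self.conditions[self.is_terminating]
    )
else:
    log_reward_fn = self.env.log_reward
self._log_rewards[self.is_terminating] = log_reward_fn(self.terminating_states)
```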
```diff
  # Assign rewards to valid terminating states.
- terminating_mask = is_terminating & (
-     valid_batch_indices == (self.terminating_idx[valid_traj_indices] - 1)
+ log_rewards[self.terminating_idx - 1, torch.arange(len(self))] = (
```
really nice cleanup here!
```diff
  from gfn.containers import StatesContainer, Trajectories
- from gfn.env import DiscreteEnv
+ from gfn.env import Env
```
This is technically wrong because FlowMatching won't work for continuous environments.
```python
)

self._all_states_tensor = all_states_tensor
if self.store_all_states:
```
Nice, thanks for this addition :)
```python
valid_states = trajectories.states[state_mask]
valid_actions = trajectories.actions[action_mask]

if valid_states.batch_shape != valid_actions.batch_shape:
```
Why are you removing this stuff? I thought this was a useful check.
I disagree with removing this assert
```python
# Build distribution for active rows and compute step log-probs
# TODO: masking ctx with step_mask outside of compute_dist and log_probs,
# i.e., implement __getitem__ for ctx. (maybe we should contain only the
# tensors, and not additional metadata like the batch size, device, etc.)
```
Masking of ctx should already be handled. Or are you suggesting it should be handled in this logic here (i.e., generic)?
```python
    valid_step_actions.tensor, dist, ctx, step_mask, vectorized=False
)

# Pad back to full batch size.
```
Why did you remove this? It's important.
josephdviviano left a comment
For now, I'll leave comments; we can settle the other PR before deciding what to do with this one.
But I must say there's a lot of good work here. Thank you, I'm sure much of this will be a good improvement to the library!
src/gfn/containers/trajectories.py (Outdated)

```diff
  new_max_length = terminating_idx.max().item() if len(terminating_idx) > 0 else 0
  states = self.states[:, index]
- conditions = self.conditions[:, index] if self.conditions is not None else None
+ conditions = self.conditions[index] if self.conditions is not None else None
```
so this is indexing the batch dimension? since the condition is static for the whole trajectory?
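For reference, the shapes involved (illustrative sketch):

```python
# self.conditions: (n_trajectories, condition_vector_dim) -- one static vector
# per trajectory, so indexing the batch dimension alone is enough.
conditions = self.conditions[index]  # (len(index), condition_vector_dim)
```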
src/gfn/containers/trajectories.py (Outdated)

```diff
- # We need to index the conditions tensor to match the actions
- # The actions exclude the last step, so we need to exclude the last step from conditions
- conditions = self.conditions[:-1][~self.actions.is_dummy]
+ # The conditions tensor has shape (n_trajectories, condition_vector_dim)
```
is n_trajectories the batch_dim? That naming is a bit confusing because there's also trajectory_length.
src/gfn/containers/trajectories.py (Outdated)

```python
# The conditions tensor has shape (n_trajectories, condition_vector_dim)
# The actions have batch shape (max_length, n_trajectories)
# We need to repeat the condition vector tensor to match the actions
conditions = self.conditions.repeat(self.actions.batch_shape[0], 1, 1)
```
can you add inline batch dim notation here e.g., # (T, B, C) for trajectory_length, batch_dim, conditioning_dim.
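The requested style might look like this (sketch):

```python
conditions = self.conditions.repeat(self.actions.batch_shape[0], 1, 1)  # (T, B, C)
```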
src/gfn/containers/trajectories.py (Outdated)

```python
# The conditions tensor has shape (n_trajectories, condition_vector_dim)
# The states have batch shape (max_length, n_trajectories)
# We need to repeat the conditions to match the batch shape of the states.
conditions = self.conditions.repeat(self.states.batch_shape[0], 1, 1)
```
ditto as above
```python
if not env.is_discrete:
    raise NotImplementedError(
        "Flow Matching GFlowNet only supports discrete environments for now."
    )
```
I see, it's handled here.
Place conditions within States
Description
Major refactorings for conditional GFlowNets.
- `ConditionalEnv` as a new abstract class for an environment with a conditional reward
- `Trajectories.conditions` have a shape of `(n_trajectories, condition_vector_dim)`, simplifying many shape-related logics
- `train_conditional.py` example (before, `true_dist` for the validation was wrong)

TODO (maybe in another PR?)
- `ConditionalEnv` support conditional transitions
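As a closing illustration, a minimal sketch of a user-defined environment under the new abstract class, assuming the `reward` interface shown in the diffs above; the environment name, its reward rule, a 1-D state shape, and the omission of step/reset machinery are all illustrative assumptions:

```python
import torch

from gfn.env import ConditionalEnv  # added in this PR
from gfn.states import States

class ScaledRewardEnv(ConditionalEnv):
    """Toy illustration: the condition vector scales a base reward."""

    def reward(self, states: States, conditions: torch.Tensor) -> torch.Tensor:
        # states.tensor: (batch_size, state_dim); conditions: (batch_size, 1).
        base_reward = states.tensor.float().sum(dim=-1)  # (batch_size,)
        return base_reward * conditions.squeeze(-1)
```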