@naoyam naoyam commented Dec 19, 2025

This PR introduces the combine operation as discussed in the RaggedIterDomain design doc.

One design decision I changed from the original design doc concerns detecting and validating component iter domains. Previously, I was thinking of using the exact graph to find the component iter domain that corresponds to a given ragged iter domain (e.g., #5550 (comment)). However, that doesn't work when a fusion is segmented and a segment does not contain the Partition expr for a RaggedIterDomain. For example, when a tensor is used as an input to asNested and is followed by other operations, and the fusion is segmented after some of those operations, the later segment can't see the asNested and Partition operations because they don't exist in that segment. This could be alleviated by providing an exact graph for the complete fusion, but more fundamentally, if a fusion takes a nested tensor as an input, there doesn't seem to be any reasonable way to attach a Partition expr.
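To make the segmentation problem concrete, here is a minimal sketch that uses a toy model rather than nvFuser's actual classes. `ToyExpr`, `hasPartitionFor`, `completeFusion`, and `laterSegment` are hypothetical names introduced only for illustration, and the op list and ids are assumptions, not taken from the PR.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for a fusion expression; NOT nvFuser's Expr class.
struct ToyExpr {
  std::string op;  // e.g. "Partition", "asNested", "add"
  int output_id;   // id of the value this expression produces
};

// Returns true if `exprs` contains a Partition that produces `ragged_id`.
bool hasPartitionFor(const std::vector<ToyExpr>& exprs, int ragged_id) {
  return std::any_of(exprs.begin(), exprs.end(), [&](const ToyExpr& e) {
    return e.op == "Partition" && e.output_id == ragged_id;
  });
}

// Hypothetical complete fusion: ragged iter domain 1 comes from a Partition.
std::vector<ToyExpr> completeFusion() {
  return {ToyExpr{"Partition", 1}, ToyExpr{"asNested", 2},
          ToyExpr{"add", 3}, ToyExpr{"mul", 4}};
}

// Hypothetical later segment, cut after "add": only "mul" is visible here.
std::vector<ToyExpr> laterSegment() {
  return {ToyExpr{"mul", 4}};
}
```

In this sketch, a lookup for ragged id 1 succeeds on `completeFusion()` but fails on `laterSegment()`, which mirrors why a per-segment search for the Partition expr cannot be relied on.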

See doc/dev/ragged_iter_domain_combine_design_doc.md for a detailed discussion. For now, I've decided not to worry too much about validation and to assume that correctness is guaranteed by the user.

Note that partitioning is still limited to 1D extents. Multi-dim offsets will be the next step in this series of PRs.

Update: Tracking which iter domains correspond to which extent iter domains actually seems to be necessary for supporting combine with ragged iter domains produced by multi-dim extent tensors. I'll revisit this as part of the multi-dim combine work, but my current plan is to take Option 4 as described in the design doc.

@greptile-apps greptile-apps bot left a comment

No files reviewed, no comments

naoyam commented Jan 13, 2026

!test

@wujingyue

> for example, when a fusion is segmented and a segment does not have the corresponding Partition expr for a RaggedIterDomain

We should still be able to validate using the exact graph on the complete fusion. Correct?

@wujingyue wujingyue left a comment

LGTM otherwise


// The combined extent is the sum of all extents in the ragged dimension
// For a 1D extents tensor [e0, e1, ..., en-1], the total is sum(extents)
TensorView* extents_tv = ragged->extents();

Suggested change
- TensorView* extents_tv = ragged->extents();
+ TensorView* extents = ragged->extents();

The type already says it. Also, in the context of RaggedIterDomain, extents has to be a TensorView.
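The code comment in the snippet under review states that, for a 1D extents tensor, the combined extent is the sum of all component extents. As a minimal host-side sketch of that arithmetic (plain `std::vector` instead of a `TensorView`; `combinedExtent` and `componentOffsets` are hypothetical helper names, not nvFuser APIs):

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Total extent of the combined dimension: e0 + e1 + ... + e(n-1).
int64_t combinedExtent(const std::vector<int64_t>& extents) {
  return std::accumulate(extents.begin(), extents.end(), int64_t{0});
}

// Start offset of each component within the combined dimension
// (exclusive prefix sum of the extents).
std::vector<int64_t> componentOffsets(const std::vector<int64_t>& extents) {
  std::vector<int64_t> offsets;
  offsets.reserve(extents.size());
  int64_t running = 0;
  for (int64_t e : extents) {
    assert(e >= 0);  // an extent cannot be negative
    offsets.push_back(running);
    running += e;
  }
  return offsets;
}
```

For example, extents `{2, 3, 1}` give a combined extent of 6 and component offsets `{0, 2, 5}`.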

}

std::string Combine::toInlineString(int indent_size) const {
NVF_CHECK(false, "Combine can not be printed inline");

Why not? toString seems to be one line.

Collaborator Author

I'm actually not quite sure why, but our convention seems to be that inline printing is only for scalar values. For example, Split::toInlineString isn't supported either. It isn't just a matter of whether the expression can be printed on a single line; it's more about whether the printing can be called recursively.
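One way to read the "recursively called" point is sketched below with a toy scalar type. This is an illustration under assumptions, not nvFuser's printing code: `ToyScalar`, `leaf`, and `binary` are hypothetical names, and the output format is invented for the example.

```cpp
#include <memory>
#include <string>

// Toy scalar value: either a named leaf or a binary op over two scalars.
struct ToyScalar {
  std::string name;                     // used when this is a leaf
  std::string op;                       // e.g. "+"; empty for a leaf
  std::shared_ptr<ToyScalar> lhs, rhs;  // operands when op is non-empty

  // The inline form of an expression is built from the inline forms of its
  // operands, so the call nests to arbitrary depth.
  std::string toInlineString() const {
    if (op.empty()) {
      return name;
    }
    return "( " + lhs->toInlineString() + " " + op + " " +
        rhs->toInlineString() + " )";
  }
};

std::shared_ptr<ToyScalar> leaf(const std::string& name) {
  return std::make_shared<ToyScalar>(ToyScalar{name, "", nullptr, nullptr});
}

std::shared_ptr<ToyScalar> binary(
    const std::string& op,
    std::shared_ptr<ToyScalar> lhs,
    std::shared_ptr<ToyScalar> rhs) {
  return std::make_shared<ToyScalar>(ToyScalar{"", op, lhs, rhs});
}
```

Here `binary("*", binary("+", leaf("i0"), leaf("i1")), leaf("i2"))->toInlineString()` yields `( ( i0 + i1 ) * i2 )`: each scalar has a single defining expression that can be substituted in place, which is the property the sketch assumes iter-domain expressions such as Split or Combine lack.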

Base automatically changed from raggediterdomain_clone to main January 16, 2026 05:49

naoyam commented Jan 16, 2026

> > for example, when a fusion is segmented and a segment does not have the corresponding Partition expr for a RaggedIterDomain
>
> We should still be able to validate using the exact graph on the complete fusion. Correct?

Yes.

Actually, I'm considering changing this design to support multi-dim combine. I realized we would indeed need to know which iter domains correspond to which extent iter domains for partial combines, such as the shuffle pattern in expert parallelism.

Please consider the validation part a TODO. I'll likely address it in a later PR.

naoyam commented Jan 16, 2026

!test

@naoyam naoyam added the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 16, 2026
@wujingyue

> realized we would indeed need to know which iter domains correspond to which extent iter domains

👍

@naoyam naoyam merged commit 352dcbf into main Jan 16, 2026
64 of 66 checks passed
@naoyam naoyam deleted the ragged_combine branch January 16, 2026 17:20
@github-actions github-actions bot removed the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 16, 2026