Combine for RaggedIterDomain #5716

naoyam · 2025-12-19T03:07:28Z

This PR introduces the combine operation as discussed in the RaggedIterDomain design doc.

One design decision that I changed from the original design doc concerns detecting and validating component iter domains. Previously, I was planning to use the exact graph to find the corresponding component iter domain for a given ragged iter domain (e.g., #5550 (comment)). However, it won't work, for example, when a fusion is segmented and a segment does not have the corresponding Partition expr for a RaggedIterDomain. Suppose a tensor is used as an input to asNested, followed by some other operations. If the fusion is segmented after some of those operations, the latter segment can't see the asNested and Partition operations, as they don't exist in that segment. This could be alleviated by providing an exact graph for the complete fusion, but more fundamentally, if a fusion has a nested tensor as an input, there doesn't seem to be any reasonable way to attach a Partition expr.

See doc/dev/ragged_iter_domain_combine_design_doc.md for detailed discussions. For now, I decided not to worry too much about validation and to assume that correctness is guaranteed by the user.
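
The segmentation problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyExpr` and `findPartition` are hypothetical names, not nvFuser data structures or APIs, and a fusion is reduced to a flat list of exprs tagged with the ragged iter domain they touch.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy model, not nvFuser: an expr is an op name plus the name of the
// ragged iter domain that its output carries.
struct ToyExpr {
  std::string op;
  std::string ragged_id;
};

// Hypothetical lookup: position of the Partition expr that defines
// `ragged_id` within `exprs`, or std::nullopt if none is visible.
inline std::optional<std::size_t> findPartition(
    const std::vector<ToyExpr>& exprs,
    const std::string& ragged_id) {
  for (std::size_t i = 0; i < exprs.size(); ++i) {
    if (exprs[i].op == "Partition" && exprs[i].ragged_id == ragged_id) {
      return i;
    }
  }
  return std::nullopt;
}
```

With a complete fusion `{Partition, Add, Mul}`, the lookup succeeds. With a second segment that holds only `{Add, Mul}`, it returns `std::nullopt` even though the ragged iter domain is still in use. A fusion that takes an already nested tensor as an input has no Partition expr at all, so looking at the complete fusion does not help in that case.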

Note that partitioning is still limited to 1D extents. Multi-dim offsets will be the next step of this series of PRs.

Update

Tracking which iter domains correspond to which extent iter domains seems to actually be necessary for supporting combine with ragged iter domains produced by multi-dim extent tensors. I'll revisit this as part of the multi-dim combine work, but my current plan is to take Option 4 as described in the design doc.

greptile-apps

No files reviewed, no comments

naoyam · 2026-01-13T18:39:23Z

!test

wujingyue · 2026-01-16T02:02:25Z

for example, when a fusion is segmented and a segment does not have the corresponding Partition expr for a RaggedIterDomain

We should still be able to validate using the exact graph on the complete fusion. Correct?

wujingyue

LGTM otherwise

wujingyue · 2026-01-16T01:57:04Z

csrc/ir/internal_base_nodes.cpp

+
+  // The combined extent is the sum of all extents in the ragged dimension
+  // For a 1D extents tensor [e0, e1, ..., en-1], the total is sum(extents)
+  TensorView* extents_tv = ragged->extents();


Suggested change

TensorView* extents_tv = ragged->extents();

TensorView* extents = ragged->extents();

The type already says it. Also, in the context of RaggedIterDomain, extents has to be a TensorView.
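
For readers following the diff comment above ("the combined extent is the sum of all extents"), here is a minimal sketch of that arithmetic in plain C++. It is an illustration only: `combinedExtent` and `combineIndex` are hypothetical helpers operating on a `std::vector`, not the nvFuser implementation, which works on a `TensorView` of extents.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not nvFuser API: extent of the combined (flat)
// dimension for a 1D extents list [e0, e1, ..., en-1].
inline int64_t combinedExtent(const std::vector<int64_t>& extents) {
  return std::accumulate(extents.begin(), extents.end(), int64_t{0});
}

// Hypothetical helper: map (component, position within that component)
// to an index in the combined dimension. The offset of a component is
// the sum of the extents of all components before it.
inline int64_t combineIndex(
    const std::vector<int64_t>& extents,
    std::size_t component,
    int64_t pos) {
  if (component >= extents.size() || pos < 0 || pos >= extents[component]) {
    throw std::out_of_range("invalid (component, pos) for these extents");
  }
  const int64_t offset = std::accumulate(
      extents.begin(),
      extents.begin() + static_cast<std::ptrdiff_t>(component),
      int64_t{0});
  return offset + pos;
}
```

For extents `{2, 3, 1}`, the combined extent is 6, and position 0 of component 1 maps to combined index 2.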

wujingyue · 2026-01-16T02:04:25Z

csrc/ir/internal_nodes.cpp

+}
+
+std::string Combine::toInlineString(int indent_size) const {
+  NVF_CHECK(false, "Combine can not be printed inline");


Why not? toString seems to be one line.

I'm actually not quite sure why, but our convention seems to be that inline printing is only for scalar values. For example, Split::toInlineString isn't supported either. It isn't just a matter of whether something can be printed in a single line. It's more about whether it can be called recursively.
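
The convention described in this reply can be sketched with a toy IR. Everything below is hypothetical (`Node`, `Scalar`, `Add`, `ToyCombine` are not nvFuser classes). It only illustrates the idea that scalar expressions print inline by recursing into their operands, while an iter-domain expr refuses inline printing even though its regular string form fits on one line.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Toy IR, not nvFuser classes.
struct Node {
  virtual ~Node() = default;
  virtual std::string toString() const = 0;
  virtual std::string toInlineString() const = 0;
};

// A named scalar prints the same way in both modes.
struct Scalar : Node {
  std::string name;
  explicit Scalar(std::string n) : name(std::move(n)) {}
  std::string toString() const override { return name; }
  std::string toInlineString() const override { return name; }
};

// A scalar expression: inline printing recurses into the operands.
struct Add : Node {
  std::shared_ptr<Node> lhs;
  std::shared_ptr<Node> rhs;
  Add(std::shared_ptr<Node> l, std::shared_ptr<Node> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::string toString() const override {
    return lhs->toString() + " + " + rhs->toString();
  }
  std::string toInlineString() const override {
    return "( " + lhs->toInlineString() + " + " + rhs->toInlineString() + " )";
  }
};

// An iter-domain expr: one-line toString, but no inline form.
struct ToyCombine : Node {
  std::string toString() const override {
    return "Combine: component and ragged -> out";
  }
  std::string toInlineString() const override {
    throw std::logic_error("Combine can not be printed inline");
  }
};
```

In this model, an `Add` of `a` and a nested `Add` of `b` and `c` prints inline as `( a + ( b + c ) )`, whereas calling `toInlineString` on `ToyCombine` throws.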

csrc/ir/internal_nodes.h

naoyam · 2026-01-16T05:55:54Z

for example, when a fusion is segmented and a segment does not have the corresponding Partition expr for a RaggedIterDomain

We should still be able to validate using the exact graph on the complete fusion. Correct?

Yes.

Actually, I'm considering changing this design to support multi-dim combine. I realized we would indeed need to know which iter domains correspond to which extent iter domains for partial combine, like the shuffle pattern in expert parallelism.

Please consider the validation part a TODO. I'll likely address it in a later PR.

naoyam · 2026-01-16T06:15:26Z

!test

wujingyue · 2026-01-16T06:31:30Z

realized we would indeed need to know which iter domains correspond to which extent iter domains

👍

csrc/ir/internal_nodes.cpp

naoyam added 30 commits December 12, 2025 10:30

Initial introduction of RaggedIterDomain

d87e6d7

Merge remote-tracking branch 'origin/main' into raggediterdomain_initial_type_def

77c6a07

cleanup

f16fc4d

fix

23d55f1

fix

8392332

unit test

787dfec

cleanup

a0b40a3

Fix IterVisitor

dbdd917

cleanup

cdbd81e

WIP: partition

d4c8d7f

Partition expr

9575a13

TensorView::partition

a054ae0

cleanup

69dbe0f

Merge remote-tracking branch 'origin/main' into raggediterdomain_partition

db3b359

cleanup

2348dde

WIP: asNested

7090b9c

cleanup

b07e285

asNested

a2c504b

warpdim

b1d8cf4

Make sure RaggedIterDomain is propagated to output tensors

201c148

Extend ops to be aware with RaggediterDomain

9e0b161

RaggedIterDomain and reduction

60a2dd5

WIP

566d63d

WIP

144b206

cleanup

e2efe75

cleanup

0b68d6b

cleanup

8a73bb2

Merge branch 'raggediterdomain_partition' into raggediterdomain-asnested

550e0c5

Merge remote-tracking branch 'origin/main' into raggediterdomain-asnested

82bd85e

Use extents as a parameter

f215f07

naoyam and others added 10 commits January 7, 2026 08:59

Merge branch 'main' into raggediterdomain-asnested

f75ecb6

feedback

8aa854e

fix

72ae14f

Merge branch 'raggediterdomain-asnested' into raggediterdomain_clone

85d48df

Merge branch 'main' into raggediterdomain_clone

5f86d9c

Merge remote-tracking branch 'origin/main' into raggediterdomain_clone

bf5b627

Merge remote-tracking branch 'origin/main' into raggediterdomain_clone

bec4c09

cleanup

4d8acab

cleanup

3b082ba

Merge branch 'raggediterdomain_clone' into ragged_combine

72dbc41

greptile-apps bot reviewed Jan 13, 2026

naoyam added 2 commits January 13, 2026 10:14

expand doc

5002407

cleanup

be0e2ea

naoyam requested a review from wujingyue January 13, 2026 18:39

naoyam mentioned this pull request Jan 15, 2026

RaggedIterDomain partitioning with multi-dim extents #5823

Open

wujingyue reviewed Jan 16, 2026

Base automatically changed from raggediterdomain_clone to main January 16, 2026 05:49

naoyam added 2 commits January 15, 2026 22:06

Merge remote-tracking branch 'origin/main' into ragged_combine

05a6201

format

d2b5384

naoyam force-pushed the ragged_combine branch from 7f34288 to d2b5384 Compare January 16, 2026 06:07

naoyam added the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 16, 2026

wujingyue approved these changes Jan 16, 2026

naoyam merged commit 352dcbf into main Jan 16, 2026
64 of 66 checks passed

naoyam deleted the ragged_combine branch January 16, 2026 17:20

github-actions bot removed the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 16, 2026
