
Conversation

@jjsjann123 jjsjann123 (Collaborator) commented Sep 18, 2025

Stacked PRs

Breaking original PR #5170 into three:
#5186 Fix allocation logic: non-divisible split <- this one
#5185 Fix allocation logic: unconnected alloc/logical
#5184 Allow non-device split on allocation domain

This PR

Fix the projection from allocation to logical domain for non-divisible splits.

This PR changes the projection so that the backward projection across a non-divisible split maintains the correct extent, instead of including the padded sizes from the two split IDs.

E.g. with ID i0 (extent 5), if we split by 4, we get two new IDs:

    i0 (extent 5)
   /  \
 i0/4  4

Because i0/4 is a ceilDiv, it ends up with extent 2.
When we project the extents from i0/4 and 4 back, the combined extent is 8 instead of 5.

When we allocate the buffer, we want it to use the allocation sizes (buffer size should be 8). But the tensor we produce should have sizes faithful to its logical domain (tensor size should be 5). Otherwise, later consumers of the tensor would assert on mismatched tensor sizes.

TODO

  • Naoya requested more unit tests in a review comment


github-actions bot commented Sep 18, 2025

Review updated until commit 89b2f1c

Description

  • Fix extent projection for non-divisible splits

  • Ensure logical tensor sizes remain correct

  • Add test validation for logical sizes

  • Fix typo in test case name


Changes walkthrough 📝

Relevant files

Bug fix
csrc/runtime/allocations.cpp: Fix tensor shape and stride computation (+12/-1)

  • Moved prod *= tensor_new_shape[i] after the stride assignment to
    compute the correct cumulative product
  • Added comment explaining ceilDiv padding in split operations
  • Sliced tensor to the input extent when an extent mismatch occurs
    during backward projection
  • Updated shape in as_strided call to use the correct input extent for
    non-divisible splits

Tests
tests/cpp/test_layout_op.cpp: Improve logical size correctness test (+4/-7)

  • Reenabled test asserting output tensor logical size
  • Added equality check between input and output tensors
  • Removed outdated comment about slicing workaround
  • Verified strides reflect padding via outer dimension

Miscellaneous
tests/cpp/test_low_precision_recipe.cpp: Fix test name typo (+1/-1)

  • Fixed typo in test name: 'Ouput' → 'Output'

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

    Possible Issue

    The order of operations in the stride computation loop was changed, which may affect the correctness of the computed strides. The original code updated the product after setting the stride, while the new code updates it before. This could lead to incorrect stride values.

for (int i = static_cast<int>(tensor_new_shape.size()) - 1; i >= 0; --i) {
  tensor_new_strides[i] = prod;
  prod *= tensor_new_shape[i];
}
    Logic Gap

    The slicing logic for non-divisible splits is applied in both the contiguous and non-contiguous cases, but the extent used for slicing in the non-contiguous case comes from the evaluated input extent. It should be verified that this value is correctly propagated and used in all scenarios, especially when the tensor has complex layout transformations.

    int64_t in_extent = ee_.evaluate(in->extent()).as<int64_t>();
    if (areDimsToBeMergedContiguous(tensor_, new_shape)) {
      tensor_ = tensor_.view(new_shape);
      // slice for non-divisible split
      if (in_extent != tensor_.size(left)) {
        tensor_ = tensor_.slice(left, 0, in_extent);
      }
    } else {
      auto [tensor_new_shape, tensor_new_strides] =
          getShapeAndStrideAfterDimMerged(tensor_, new_shape);
      // slice for non-divisible split
      tensor_new_shape[left] = in_extent;

@jjsjann123 jjsjann123 changed the title Fixing non-divisible split handling in allocation logic Fix non-divisible split handling in allocation logic Sep 18, 2025
@jjsjann123 jjsjann123 changed the title Fix non-divisible split handling in allocation logic Fix allocation logic: non-divisible split Sep 18, 2025
@jjsjann123 jjsjann123 force-pushed the jj/allocation_for_layout_op_PR_1 branch from c64d299 to 33d0ce3 Compare September 18, 2025 21:40
@jjsjann123 jjsjann123 force-pushed the jj/allocation_non_divisible_split_PR_2 branch from 8fe76bd to 82fdd9b Compare September 18, 2025 21:40
@jjsjann123 jjsjann123 (Collaborator, Author) commented:

I'm keeping this as a draft for now until I add more tests per @naoyam's request.

@jjsjann123 jjsjann123 force-pushed the jj/allocation_non_divisible_split_PR_2 branch from 82fdd9b to beab9a3 Compare September 19, 2025 22:54
@jjsjann123 jjsjann123 force-pushed the jj/allocation_for_layout_op_PR_1 branch from 33d0ce3 to 17df15a Compare September 19, 2025 22:55
@jjsjann123 jjsjann123 force-pushed the jj/allocation_for_layout_op_PR_1 branch from 17df15a to f9acfc3 Compare September 22, 2025 21:50
@jjsjann123 jjsjann123 force-pushed the jj/allocation_non_divisible_split_PR_2 branch from beab9a3 to 30c1640 Compare September 22, 2025 21:52
1. Refactor buffer allocation to use the allocation domain instead of the logical domain.
2. Fix the projection from allocation to logical on the special path where projection is not possible:
   we now compute the correct extent instead of returning the allocation buffer as-is. This allows the layout op to return a tensor with the correct logical size, while still allocating a buffer large enough to accommodate the padding requirement.
@jjsjann123 jjsjann123 force-pushed the jj/allocation_for_layout_op_PR_1 branch from f9acfc3 to 87afb60 Compare September 23, 2025 17:39
@jjsjann123 jjsjann123 force-pushed the jj/allocation_non_divisible_split_PR_2 branch from 30c1640 to 89b2f1c Compare September 23, 2025 17:40
@jjsjann123 jjsjann123 force-pushed the jj/allocation_for_layout_op_PR_1 branch 2 times, most recently from c64d299 to ea3cd68 Compare October 1, 2025 20:17
Base automatically changed from jj/allocation_for_layout_op_PR_1 to main October 7, 2025 23:41
