Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Generic][AIE2] Combiner for shufflevectors that use build vector #129

Open
wants to merge 13 commits into
base: vvandebe.vshuffle.impl
Choose a base branch
from

Conversation

ValentijnvdBeek
Copy link
Collaborator

Transforms a shufflevector that uses a build vector or undefined into just a build vector. This can be done is because a shuffle vector lowering is an unmerge and then merge. Since build is a merge, the merge and unmerge cancel each other out and we can just merge the vector directly.

Example:

    %1:_(s32) = COPY $r0
    %3:_(<8 x s32>) = G_IMPLICIT_DEF
    %5:_(s32) = G_IMPLICIT_DEF
    %2:_(<8 x s32>) = G_BUILD_VECTOR %1(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32)
    %0:_(<8 x s32>) = G_SHUFFLE_VECTOR %2(<8 x s32>), %3, shufflemask(0, 0, 0, 0, 0, 0, 0, 0)
    ===>
    %2:_(<8 x s32>) = G_BUILD_VECTOR %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32)

@ValentijnvdBeek ValentijnvdBeek added llvm:globalisel Code that modifies the Global Intruction Selection backend:aie Code that modifies AIE code vectorization Support for vector instructions llvm:instcombine Code that modifies the combiner llvm:core Modifies non-AIE specific code backend:aie2 labels Jul 19, 2024
@ValentijnvdBeek ValentijnvdBeek self-assigned this Jul 19, 2024
Copy link
Collaborator

@konstantinschwarz konstantinschwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, combiner looks good to me

return true;
default:
return false;
}
}
};

/// Represents a G_SHUFFLE_VECTOR
class GShuffleVector : public GMergeLikeInstr {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

G_SHUFFLE_VECTOR takes the shufflemask as the last operand, which shouldn't be part of the getNumSources I think?
To me that's what makes G_SHUFFLE_VECTOR different compared to the other "merge like" instructions.
I'd rather not make it part of that group, as this can have surprising effects to other existing combines

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that argument. I was looking at it from the perspective of what the result of the function would be. I'll separate them, thanks.

// Our inputs need to be either be build vectors or undefined, register inputs
// break this optimization. You could maybe do something clever were you
// concatenate vectors to save half a build vector.
if ((SrcInstr1 == 0 && IsUndef1 == 0) || (SrcInstr2 == 0 && IsUndef2 == 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are pointers, just use (!SrcInstr1 && !IsUndef1) || ...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do. I personally prefer explicit NULL pointer checking for clarity (which, to be fair, this isn't yet), but doing it like that is fine with me.

; CHECK: [[DEF:%[0-9]+]]:_(<16 x s32>) = G_IMPLICIT_DEF
; CHECK-NEXT: $x0 = COPY [[DEF]](<16 x s32>)
; CHECK-NEXT: PseudoRET implicit $lr, implicit $x0
%0:_(s32) = G_CONSTANT i32 28
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: unused G_CONSTANTs here and in the tests below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, thanks Konstantin.

%5:_(s32) = G_IMPLICIT_DEF
%2:_(<8 x s32>) = G_BUILD_VECTOR %1(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32)
%0:_(<8 x s32>) = G_SHUFFLE_VECTOR %2(<8 x s32>), %3, shufflemask(0, 0, 0, 0, 0, 0, 0, 0)
PseudoRET implicit $lr, implicit %0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is missing the closing ... at the end, and then I think this would fail, because this is further combined into G_AIE_BROADCAST?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shufflevector branch has become really divergent in the past month or so, so that hasn't landed yet. Let's try to get all of that merged as quickly as we can, I'll manually merge them to make sure that spectests are updated before enter the main branch.

Missing the closing ... is deliberate, trailing ellipses have caused errors with the unit tests scripts for me before.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran into it again just now. If you have a trailing ellipsis, you need a trailing newline, or otherwise the update mir script will error out. If you skip it, then it doesn't matter whether you have a newline or not. That is why I tend to commit mine without the trailing ellipsis.

Transforms a shufflevector that uses a build vector or undefined
into just a build vector. This can be done is because a shuffle
vector lowering is an unmerge and then merge. Since build is a
merge, the merge and unmerge cancel each other out and we can just
merge the vector directly.

Example:
```
    %1:_(s32) = COPY $r0
    %3:_(<8 x s32>) = G_IMPLICIT_DEF
    %5:_(s32) = G_IMPLICIT_DEF
    %2:_(<8 x s32>) = G_BUILD_VECTOR %1(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32), %5(s32)
    %0:_(<8 x s32>) = G_SHUFFLE_VECTOR %2(<8 x s32>), %3, shufflemask(0, 0, 0, 0, 0, 0, 0, 0)
    ===>
    %2:_(<8 x s32>) = G_BUILD_VECTOR %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32), %1(s32)
```
Comment on lines 297 to 299
select_to_minmax, or_to_bsp, combine_concat_vector,
commute_constant_to_rhs]> {
commute_constant_to_rhs, shufflevector_merge]> {
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed since some ARM64 tests relies on the legalizer changing the inputs of the shufflevector and if you don't run them afterwards you get worse code.

; CHECK-NEXT: mov.h v0[1], v1[0]
; CHECK-NEXT: mov w8, #0 ; =0x0
; CHECK-NEXT: fmov s0, w8
; CHECK-NEXT: mov.16b v1, v0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add copyright header!

@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.vshuffle.impl branch 3 times, most recently from aec1600 to d1d0a3a Compare August 15, 2024 13:35
def shufflevector_merge_matchinfo : GIDefMatchData<"SmallVector<Register, 8>">;
def shufflevector_merge : GICombineRule<
(defs root:$d, shufflevector_merge_matchinfo:$info),
(match (wip_match_opcode G_SHUFFLE_VECTOR): $d,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A recent move in LLVM is that you shouldn't use wip_match_opcode anymore since it slows down compilation.

https://llvm.org/docs/GlobalISel/MIRPatterns.html#gallery

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:aie Code that modifies AIE code backend:aie2 llvm:core Modifies non-AIE specific code llvm:globalisel Code that modifies the Global Intruction Selection llvm:instcombine Code that modifies the combiner vectorization Support for vector instructions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants