[CIR][ABI][AArch64][Lowering] Fix calls for struct types > 128 bits #1335

Merged: 1 commit, Feb 12, 2025

@bruteforceboy bruteforceboy commented Feb 11, 2025

In PR#1074 we introduced calls for struct types > 128 bits, but there is an issue with that change.

The call currently passes the alloca directly, whereas it should pass a memcpy of the alloca, just like in the OG. The original PR was meant to use a memcpy first and handle the cases where the memcpy is not needed later.

For example, running the following code snippet tmp.c using bin/clang tmp.c -o tmp -Xclang -fclangir -Xclang -fclangir-call-conv-lowering --target=aarch64-none-linux-gnu:

#include <stdio.h>

typedef struct {
  int a, b, c, d, e;
} S;

void change(S s) { s.a = 10; }

void foo(void) {
  S s;
  s.a = 9;
  change(s);
  printf("%d\n", s.a);
}

int main(void) {
  foo();
  return 0;
}

prints 10 instead of 9, because we pass a pointer to s itself instead of a pointer to a copy.

Relevant part of the OG LLVM output:

@foo()
  %s = alloca %struct.S, align 4
  %byval-temp = alloca %struct.S, align 4
  %a = getelementptr inbounds nuw %struct.S, ptr %s, i32 0, i32 0
  store i32 9, ptr %a, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %byval-temp, ptr align 4 %s, i64 20, i1 false)
  call void @change(ptr noundef %byval-temp)

Current LLVM output through CIR:

@foo()
  %1 = alloca %struct.S, i64 1, align 4
  %2 = getelementptr %struct.S, ptr %1, i32 0, i32 0
  store i32 9, ptr %2, align 4
  %3 = load %struct.S, ptr %1, align 4
  call void @change(ptr %1)

So, there should be a memcpy.
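The difference between the two outputs can be sketched at the C level. This is only an illustration written for this description, not code from the PR: `change_lowered`, `foo_without_copy`, and `foo_with_copy` are hypothetical names, and the pointer parameter stands in for the indirect passing used for structs larger than 128 bits.

```c
#include <string.h>

typedef struct {
  int a, b, c, d, e;
} S;

/* Callee as it looks after lowering: S does not fit in registers here,
   so the callee receives a pointer to the argument's storage. */
static void change_lowered(S *s) { s->a = 10; }

/* Mirrors the current CIR output: the caller's own alloca is passed,
   so the callee's write is visible in the caller. */
static int foo_without_copy(void) {
  S s = {0};
  s.a = 9;
  change_lowered(&s);
  return s.a;
}

/* Mirrors the OG output: the caller copies s into a temporary
   (the %byval-temp alloca) and passes the temporary instead. */
static int foo_with_copy(void) {
  S s = {0};
  S byval_temp;
  s.a = 9;
  memcpy(&byval_temp, &s, sizeof(S));
  change_lowered(&byval_temp);
  return s.a;
}
```

Calling `foo_without_copy()` returns 10 and `foo_with_copy()` returns 9, which matches the behaviour described for the CIR and OG outputs respectively.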

This PR fixes that and adds a TODO note for future work on detecting the cases where the copy is not needed. I have also updated the old test for structs larger than 128 bits.

@bruteforceboy bruteforceboy marked this pull request as ready for review February 11, 2025 08:57
@@ -1166,7 +1166,13 @@ mlir::Value LowerFunction::rewriteCallOp(const LowerFunctionInfo &CallInfo,
if (::cir::MissingFeatures::undef())
cir_cconv_unreachable("NYI");

IRCallArgs[FirstIRArg] = alloca;
// TODO(cir): add check for cases where we don't need the memcpy
Member

I'm a bit worried about how we're going to remember to tackle this later. Is there any major issue that prevents it from being handled right away?

Contributor Author

Unfortunately, there isn't enough information in CIR currently to determine when we don't need the copy. I plan to add these checks incrementally. Also, if it makes it any better, the cases where we don't need the copy are much rarer than the cases where we do :-)

Member

Incremental is fine, I'm mostly curious about the C source that leads to the case where we don't need a copy (I'm assuming that if you made that comment you are coming from somewhere?).

Contributor Author

ah, the reason for the comment is here
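One kind of C source where the copy might be redundant, offered here as an assumption rather than something stated in this thread, is a call whose argument is an unnamed temporary. Nothing else can observe the temporary after the call, so passing its storage directly would not change program behaviour. `make_s` and `call_with_temporary` below are hypothetical names used only for this sketch.

```c
typedef struct {
  int a, b, c, d, e;
} S;

/* Returns a fresh S by value; the result is a temporary at the call site. */
static S make_s(void) {
  S s = {9, 0, 0, 0, 0};
  return s;
}

/* Same shape as change() in the PR description, but returns the field so
   the effect of the write is observable. */
static int change(S s) {
  s.a = 10;
  return s.a;
}

static int call_with_temporary(void) {
  /* The argument has no name in the caller, so no later read can tell
     whether the callee modified the original storage or a copy of it. */
  return change(make_s());
}
```

By contrast, in the `foo` example from the description the argument is a named local that is read after the call, so the copy is required there.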

@bcardosolopes bcardosolopes merged commit 03e6535 into llvm:main Feb 12, 2025
9 checks passed