[CIR][CIRGen][TBAA] Add support for struct types #1338
base: main
    lowerMod->getContext().getCodeGenOpts(),
    lowerMod->getContext().getLangOpts());
auto baseType =
    structLower.lowerStructType(ast.getRawDecl()->getTypeForDecl());
We usually don't look at the raw decl. Either (a) provide the AST interface methods to answer the queries you need, or (b) lower that information in CIRGen and attach more data to the TBAA attribute so you don't need to do the work here. Before you make any changes, I'd like to hear why you decided not to do (b) (given that perhaps you didn't know about (a)).
> (b) lower that information in CIRGen and attach more data to the TBAA attribute so you don't need to do the work here.
For (b), my first implementation encoded enough information into the TBAA attribute within CIRGen. You can find that initial implementation in #1076. Several key suggestions emerged from its code review.
The overall idea is to infer as much as possible during lowering, while only encoding hard-to-determine information in CIRGen.
A general question for this PR: why do we need to re-encode information that CIR already possesses? The goal is to lower to identical LLVM TBAA information without replicating it all in CIR. How can we better reuse existing data?
Refer to the original comments here:
- [CIR][CIRGen][TBAA] Add support for TBAA #1076 (comment)
- [CIR][CIRGen][TBAA] Add support for TBAA #1076 (comment)
In this patch, I lower `TBAAStructTypeAttr` based on the Clang type during lowering.
> (a) provide the AST interface methods to solve the queries you need
Regarding (a), how do we get the type name of each field in a struct from `CIR_StructType`? To keep the final TBAA info consistent with the original, we need the Clang type name. If using raw declarations isn't possible, we should at least encode the type name.
I have some thoughts to share; please feel free to express your insights.
- What do we need in lowering `cir::TBAAStructTypeAttr`?
  We need the type name, size, and offset of each field in the struct. This information helps us construct `llvm::TBAATypeDescriptorAttr` with `llvm::TBAAMemberAttr`. The `llvm::TBAAMemberAttr` carries field details, including type name, size, and offset.
- How do we obtain type size and field offset?
  Sample code for obtaining this information:

      // cir::LowerModule *lowerMod
      for (int i = 0; i < structType.getNumElements(); i++) {
        auto fieldType = structType.getMembers()[i];
        auto size = lowerMod->getDataLayout().layout.getTypeSize(fieldType);
        auto offset = lowerMod->getDataLayout()
                          .getStructLayout(structType)
                          ->getMemberOffsets()[i];
        llvm::errs() << "idx: " << i << ", size: " << size
                     << ", offset: " << offset << "\n";
      }

- How do we obtain the type name?
  It seems there's no direct way to get the type name of each field from `CIR_StructType` (the same issue arises when we handle scalar types). Perhaps we should encode the type name for each field in the struct. Following this idea, we can construct `cir::TBAAMemberAttr` with only an ID. The size and offset can be inferred from `CIR_StructType`.
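To make the last point concrete, here is a minimal standalone sketch of pairing per-field IDs with a size and offset derived from the layout. It does not use any ClangIR or MLIR API: `FieldLayout`, `MemberDesc`, and `buildMembers` are hypothetical names invented for illustration, and the offset computation assumes a plain, non-packed struct whose fields are placed at their natural alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not ClangIR types.
struct FieldLayout {
  std::uint64_t size;  // size of the field type in bytes
  std::uint64_t align; // alignment of the field type in bytes (must be non-zero)
};

struct MemberDesc {
  std::string typeName; // the ID that CIRGen would encode
  std::uint64_t size;   // inferred from the layout
  std::uint64_t offset; // inferred from the layout
};

// Pairs the i-th ID with the i-th field, computing offsets under the
// assumption of natural alignment (no packing, no bit-fields).
std::vector<MemberDesc> buildMembers(const std::vector<std::string> &ids,
                                     const std::vector<FieldLayout> &fields) {
  // The scheme only works if there is exactly one ID per field.
  assert(ids.size() == fields.size() && "one ID per struct field expected");

  std::vector<MemberDesc> members;
  std::uint64_t offset = 0;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    const FieldLayout &f = fields[i];
    // Round the running offset up to the field's alignment.
    offset = (offset + f.align - 1) / f.align * f.align;
    members.push_back({ids[i], f.size, offset});
    offset += f.size;
  }
  return members;
}
```

The size check at the top is the only guard this sketch has against the IDs and the inferred layout drifting apart; a real lowering would presumably need something stronger.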
A potential risk is that the IDs generated by CIRGen do not match the size and offset inferred from `CIR_StructType`.
For example, given a struct A
struct A {
  int a;
  long b;
  long long c;
};
the CIR `TBAAStructAttr` is `#tbaa_struct{#tbaa_member{id = int}, #tbaa_member{id = long}, #tbaa_member{id = long long}}`. When lowering, we get the size and offset using the method from "How do we obtain type size and field offset?" above. I am not sure whether the IDs and the size/offset here always match precisely.
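For reference, the host compiler's own view of `struct A` can be tabulated with `sizeof` and `offsetof`, which gives the values the IDs would have to line up with. This is only a sketch in plain C++, independent of ClangIR; `FieldInfo`, `kFieldsOfA`, `checkFieldsOfA`, and `dumpFieldsOfA` are hypothetical names, and the concrete numbers depend on the target ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct A {
  int a;
  long b;
  long long c;
};

// Hypothetical record pairing an ID with the layout facts for one field.
struct FieldInfo {
  const char *id;
  std::size_t size;
  std::size_t offset;
};

// One entry per field of A, in declaration order.
constexpr FieldInfo kFieldsOfA[] = {
    {"int", sizeof(int), offsetof(A, a)},
    {"long", sizeof(long), offsetof(A, b)},
    {"long long", sizeof(long long), offsetof(A, c)},
};

// Checks that the fields are laid out in order without overlapping
// and that the last field fits inside the struct.
void checkFieldsOfA() {
  constexpr std::size_t n = sizeof(kFieldsOfA) / sizeof(kFieldsOfA[0]);
  for (std::size_t i = 0; i + 1 < n; ++i)
    assert(kFieldsOfA[i].offset + kFieldsOfA[i].size <= kFieldsOfA[i + 1].offset);
  assert(kFieldsOfA[n - 1].offset + kFieldsOfA[n - 1].size <= sizeof(A));
}

// Prints one line per field: id, size, and offset.
void dumpFieldsOfA() {
  for (const FieldInfo &f : kFieldsOfA)
    std::printf("id: %s, size: %zu, offset: %zu\n", f.id, f.size, f.offset);
}
```

If the lowering derives sizes and offsets positionally, as in the loop shown earlier, then the i-th ID is implicitly tied to the i-th entry of a table like this one, so the open question is whether CIRGen always emits the IDs in the same order and count as the members of `CIR_StructType`.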
This patch introduces support for TBAA with struct types. The converted `CIR_StructType` is stored in `TBAAStructAttr`, which is then lowered using `CIRToLLVMTBAAStructAttrLowering`.