[CIR][CIRGen][TBAA] Add support for struct types #1338

Open
wants to merge 1 commit into
base: main

PikachuHyA
Collaborator

This patch introduces support for TBAA with struct types. The converted CIR_StructType is stored in TBAAStructAttr, which is then lowered using CIRToLLVMTBAAStructAttrLowering.

lowerMod->getContext().getCodeGenOpts(),
lowerMod->getContext().getLangOpts());
auto baseType =
structLower.lowerStructType(ast.getRawDecl()->getTypeForDecl());
Member

We usually don't look at the raw decl. Either (a) provide the AST interface methods to answer the queries you need, or (b) lower that information in CIRGen and attach more data to the TBAA attribute so you don't need to do the work here. Before you make any changes, I'd like to hear why you decided not to do (b) (given that perhaps you didn't know about (a)).

Collaborator Author

> (b) lower that information in CIRGen and attach more data to the TBAA attribute so you don't need to do the work here.

For (b), my first implementation encoded enough information into the TBAA attribute within CIRGen. You can find that initial implementation here: #1076. Several key suggestions emerged from the code review there.

The overall idea is to infer as much as possible during lowering, while only encoding hard-to-determine information in CIRGen.

A general question for this PR: why do we need to re-encode information that CIR already possesses? The goal is to lower to identical LLVM TBAA information without replicating it all in CIR. How can we better reuse existing data?

Refer to the original comments here:

In this patch, I therefore lower TBAAStructTypeAttr based on the Clang type during lowering.

> (a) provide the AST interface methods to solve the queries you need

Regarding (a), how do we get the type name of each field in a struct from CIR_StructType? To keep the final TBAA info consistent with the original, we need the Clang type name. If using raw declarations isn't possible, we should at least encode the type name.
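
One reason the name cannot simply be recovered from a lowered type is that distinct C types can share a size and signedness. The sketch below is plain C++, not CIR code. It assumes, purely for illustration, that a lowered integer type keeps only its bit width and signedness; `LoweredInt`, `lowerInt`, and `nameIsLost` are hypothetical names.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Hypothetical stand-in for a lowered integer type: it records only the
// bit width and the signedness, not the source-level type name.
struct LoweredInt {
  unsigned width;
  bool isSigned;
  constexpr bool operator==(const LoweredInt &other) const {
    return width == other.width && isSigned == other.isSigned;
  }
};

// Maps a source integer type to its (hypothetical) lowered form.
template <typename T> constexpr LoweredInt lowerInt() {
  static_assert(std::is_integral<T>::value, "integer types only");
  return {static_cast<unsigned>(sizeof(T) * CHAR_BIT),
          std::is_signed<T>::value};
}

// True when two distinct source types collapse to the same lowered type,
// i.e. when the lowered type alone cannot tell us which name to emit.
template <typename T, typename U> constexpr bool nameIsLost() {
  return !std::is_same<T, U>::value && lowerInt<T>() == lowerInt<U>();
}
```

On targets where long and long long have the same width, `nameIsLost<long, long long>()` is true, so a distinct name per type would have to come from somewhere other than the lowered type.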


I have some thoughts to share; please feel free to share yours as well.

  1. What do we need when lowering cir::TBAAStructTypeAttr?
    We need the type name, size, and offset of each field in the struct. This information helps us construct llvm::TBAATypeDescriptorAttr with llvm::TBAAMemberAttr. The llvm::TBAAMemberAttr carries field details, including type name, size, and offset.

  2. How do we obtain type size and field offset?
    Sample code for obtaining this information:

    // Assumes lowerMod is a cir::LowerModule * and structType is the
    // cir::StructType being lowered.
    auto members = structType.getMembers();
    // Query the struct layout once instead of on every iteration.
    auto offsets = lowerMod->getDataLayout()
                       .getStructLayout(structType)
                       ->getMemberOffsets();
    for (unsigned i = 0, e = members.size(); i != e; ++i) {
      auto size = lowerMod->getDataLayout().layout.getTypeSize(members[i]);
      llvm::errs() << "idx: " << i << ", size: " << size
                   << ", offset: " << offsets[i] << "\n";
    }
  3. How do we obtain the type name?
    There seems to be no direct way to get the type name of each field from CIR_StructType (we hit the same issue when handling scalar types). Perhaps we should encode the type name for each field in the struct. Following this idea, we can construct cir::TBAAMemberAttr with only an ID; the size and offset can be inferred from CIR_StructType.
    A potential risk is that the IDs generated by CIRGen do not line up with the sizes and offsets inferred from CIR_StructType.
    For example, consider a struct A

struct A {
  int a;
  long b;
  long long c;
};

and the CIRTBAAStructAttr is #tbaa_struct{#tbaa_member{id = int}, #tbaa_member{id = long}, #tbaa_member{id = long long}}. When lowering, we get the size and offset using the method from item 2 (How do we obtain type size and field offset?). I am not sure whether the IDs and the inferred sizes and offsets are guaranteed to match precisely.
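
To make the matching concern concrete, here is a minimal sketch in plain C++ (not CIR code). It assumes a hypothetical `MemberId` record that carries only the type name, as proposed above, and pairs entry i with the size and offset of field i. `sizeof` and `offsetof` stand in for the data-layout queries; `MemberId`, `FieldLayout`, `MemberInfo`, `pairIdsWithLayout`, `layoutOfA`, and `describeA` are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct A {
  int a;
  long b;
  long long c;
};

// Hypothetical: what CIRGen would encode per field (the type name only).
struct MemberId {
  std::string id;
};

// Hypothetical: what the lowering would infer per field from the layout.
struct FieldLayout {
  std::uint64_t size;
  std::uint64_t offset;
};

// The combined record the lowering needs for each member.
struct MemberInfo {
  std::string id;
  std::uint64_t size;
  std::uint64_t offset;
};

// Pairs the i-th ID with the i-th layout entry. This is only sound if both
// lists hold exactly one entry per source field, in declaration order.
std::vector<MemberInfo> pairIdsWithLayout(const std::vector<MemberId> &ids,
                                          const std::vector<FieldLayout> &layout) {
  assert(ids.size() == layout.size() && "ID list and struct layout disagree");
  std::vector<MemberInfo> out;
  out.reserve(ids.size());
  for (std::size_t i = 0; i < ids.size(); ++i)
    out.push_back({ids[i].id, layout[i].size, layout[i].offset});
  return out;
}

// Stand-in for the data-layout queries, using the host ABI for struct A.
std::vector<FieldLayout> layoutOfA() {
  return {FieldLayout{sizeof(int), offsetof(A, a)},
          FieldLayout{sizeof(long), offsetof(A, b)},
          FieldLayout{sizeof(long long), offsetof(A, c)}};
}

// Builds the member list for struct A from name-only IDs plus the layout.
std::vector<MemberInfo> describeA() {
  std::vector<MemberId> ids = {MemberId{"int"}, MemberId{"long"},
                               MemberId{"long long"}};
  return pairIdsWithLayout(ids, layoutOfA());
}
```

On a typical LP64 target the three members land at offsets 0, 8, and 16. The size check in `pairIdsWithLayout` only catches a count mismatch; whether the CIR struct type always has one member per source field in declaration order (for instance with bit-fields or inserted padding) is exactly the open question above.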
