forked from checkedc/checkedc-clang
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi-decl overhaul (including inline struct fixes) #657
Merged
Merged
Changes from 50 commits
Commits
Show all changes
66 commits
Select commit
Hold shift + click to select a range
eeab8d1
Remove `static` when separating an inline struct definition.
mattmccutchen-cci b7a119a
Use rewriteSourceRange instead of R.RemoveText for its extra
mattmccutchen-cci dcf9da0
Assign record names and use them in extractBaseType.
mattmccutchen-cci 4bf6ff9
Insert assigned names into unnamed struct definitions. (Draft)
mattmccutchen-cci 8184556
Remove all the old workarounds for unnamed structs.
mattmccutchen-cci c2ea286
XFAIL inline_anon_structs.c until I get around to updating it.
mattmccutchen-cci 17de215
Get the correct form of the base type for multi-decl vars with unchanged
mattmccutchen-cci 9dc4968
Stop mkString from producing the extra space in `int * x[10]`.
mattmccutchen-cci fb819ef
Minimal fix to detect inline structs nested in other structs so
mattmccutchen-cci 20b23b8
Use the `EndLoc` variable like I meant to.
mattmccutchen-cci 9af2fe9
When a struct is split from a multi-decl, de-nest it too.
mattmccutchen-cci 500d870
Multi-decl rewriting should not try to split a struct forward
mattmccutchen-cci 52db941
New multi-decl detector + typedef multi-decl support.
mattmccutchen-cci a4c4ad1
Update _existing_ tests in inline_anon_structs.c.
mattmccutchen-cci 843cad0
Fix de-nesting of unnamed TagDecls and improve some comments.
mattmccutchen-cci 755f651
Add some tests of the new features.
mattmccutchen-cci faf5006
Actually lift the constraint of typedef multi-decls to wild: oops.
mattmccutchen-cci d3a48d6
Fix the check for a typedef that exactly matches the inline struct type.
mattmccutchen-cci 8839ad1
Add tests of typedef multi-decl rewriting without inline structs.
mattmccutchen-cci 6bbf9ca
Remove remaining DeclStmt code (WIP)
mattmccutchen-cci 51e777a
Exclude ParmVarDecl from multi-decl detection (needs more thought)
mattmccutchen-cci 5f69ae8
More WIP on removing obsolete code
mattmccutchen-cci c91ba7b
Comment about K&R parameters
mattmccutchen-cci ff1ae35
Comment about outside references to unnamed TagDecls.
mattmccutchen-cci ccdc611
Make basic_checks.c pass for now.
mattmccutchen-cci 4de64f7
Attempt to fix StructInit condition. It still isn't working in some
mattmccutchen-cci 6032aaf
Scribble
mattmccutchen-cci 6fb03df
My change has (correctly) removed initializers from static structs in
mattmccutchen-cci d926bd6
macro_end_of_decl.c: Change the expected output for g1 back to match the
mattmccutchen-cci 56d7303
Straightening out multi-decl source range issues (snapshot)
mattmccutchen-cci 3666d95
doDeclRewrite tweak preserves a macro for an equals sign.
mattmccutchen-cci cd6cccb
Fix null dereference in buildItypeDecl
mattmccutchen-cci e8ad64c
Work around compiler bug affecting partial_checked.c test.
mattmccutchen-cci 8d1735a
Comment updates.
mattmccutchen-cci ae9e314
Don't generate a multi-decl of all decls with invalid source locations.
mattmccutchen-cci 740390e
Comment on the two changes I decided not to make:
mattmccutchen-cci e6a0d4d
Document BaseTypeRenamed.
mattmccutchen-cci e89ece7
More documentation improvements.
mattmccutchen-cci d75748f
Remove the mkString fixes from this PR.
mattmccutchen-cci 77145eb
Comment about -alltypes affecting _Ptr in inline_anon_structs.c test.
mattmccutchen-cci d3220b2
Add some tests to itypes_for_extern.c.
mattmccutchen-cci 86b1acc
More tests.
mattmccutchen-cci 74685c6
Even more tests.
mattmccutchen-cci 2073f33
Merge branch 'main' of github.com:correctcomputation/checkedc-clang i…
mattmccutchen-cci 13c0d77
Fix #include sorting in MultiDecls.h per clang-tidy.
mattmccutchen-cci 0d578bc
Merge branch 'main' of github.com:correctcomputation/checkedc-clang i…
mattmccutchen-cci e594faa
Address some review comments from Kyle.
mattmccutchen-cci c9888f2
Address review comments from John on 2021-09-29.
mattmccutchen-cci 220234c
Fix bugs with automatic naming of structs in shared header files.
mattmccutchen-cci 0c2b38c
Fix TagDecl de-nesting when the outer TagDecl is preceded by qualifiers.
mattmccutchen-cci d3de950
Use PrintingPolicy to exclude any initializer when printing a VarDecl.
mattmccutchen-cci f7ae158
Minor refactor.
mattmccutchen-cci f18cc23
Merge branch 'main' of github.com:correctcomputation/checkedc-clang i…
mattmccutchen-cci d437bad
Update regression tests whose output was improved by the merge of #714.
mattmccutchen-cci 4c85580
Cite newly filed issue #739 for Rewriter::InsertText* wrappers.
mattmccutchen-cci 5ed5ad3
Add regression test for TagDecl de-nesting with outer qualifier.
mattmccutchen-cci eea9b94
Add test of automatic struct naming with multiple translation units.
mattmccutchen-cci 36e262a
Merge branch 'main' of github.com:correctcomputation/checkedc-clang i…
mattmccutchen-cci ab2d1a5
Some cleanup of TODO and REVIEW comments.
mattmccutchen-cci 0bbfe7f
Avoid duplicate code in multivardecls.c and itypes_for_extern.c.
mattmccutchen-cci a83daa6
Cite new issue #741 re unnecessary initializers for global variables.
mattmccutchen-cci bc9f3bc
Cite PR #723 that currently tracks the _Ptr source location bug.
mattmccutchen-cci 2a58731
Convert some TODOs and assertions to placeholder non-fatal assertions.
mattmccutchen-cci 31b82a6
Update some more comments to cite issues.
mattmccutchen-cci aff9f80
clang-format code changed in the PR.
mattmccutchen-cci f7ff59f
More comment fixes.
mattmccutchen-cci File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,218 @@ | ||
//=--MultiDecls.h-------------------------------------------------*- C++-*-===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
// Code to deal with "multi-decls": constructs in which one or more identifiers | ||
// are declared in a comma-separated list based on a single type "on the left". | ||
// A simple example: | ||
// | ||
// struct my_struct x, *p; | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef LLVM_CLANG_3C_MULTIDECLS_H | ||
#define LLVM_CLANG_3C_MULTIDECLS_H | ||
|
||
#include "clang/3C/PersistentSourceLoc.h" | ||
#include "clang/AST/Decl.h" | ||
#include "clang/AST/RecursiveASTVisitor.h" | ||
#include "clang/AST/Type.h" | ||
#include "llvm/ADT/Optional.h" | ||
|
||
using namespace clang; | ||
|
||
// Some more information about multi-decls in the context of 3C: | ||
// | ||
// The "members" of a given multi-decl may be ordinary variables (VarDecls), | ||
// struct/union fields (FieldDecls), or typedefs (TypedefDecls), but all members | ||
// of a given multi-decl are of the same kind. | ||
// | ||
// If the "left type" of a multi-decl is a TagDecl, it may have an inline | ||
// definition; if it does, then the TagDecl may be unnamed. Examples: | ||
// | ||
// struct my_struct { int *y; } x, *p; | ||
// struct { int *y; } x, *p; | ||
// | ||
// Multi-decls (especially those with inline TagDecls) have historically been | ||
// tricky for 3C to rewrite. If the type of one member becomes a _Ptr (or | ||
// similar), then the left type of the members is no longer the same, so the | ||
// multi-decl must be broken up, for example: | ||
// | ||
// struct my_struct x; | ||
// _Ptr<struct my_struct> p; | ||
// | ||
// To keep the logic simpler, if 3C needs to change the type of at least one | ||
// member of a multi-decl, it breaks up all members of the multi-decl into | ||
// separate declarations. To preserve SourceLocations as much as possible and | ||
// avoid interfering with rewrites to any other constructs in the multi-decl | ||
// (e.g., within existing initializer expressions), this breakup is performed by | ||
// replacing the commas with semicolons in place and inserting additional | ||
// occurrences of the left type and any common qualifiers as needed. | ||
// | ||
// If there is an inline TagDecl, it is separated too and moved out of any | ||
// containing RecordDecl to avoid a compiler warning, and if the TagDecl is | ||
// unnamed, it is given an automatically generated name so that it can be | ||
// referenced by the new, separate declarations of the multi-decl members. | ||
// Example: | ||
// | ||
// static struct { int *y; } x, *p: | ||
// | ||
// -> | ||
// | ||
// struct x_struct_1 { _Ptr<int> y; }; | ||
// static struct x_struct_1 x; | ||
// static _Ptr<struct x_struct_1> p; | ||
// | ||
// Exception: In a typedef multi-decl, if the _first_ member refers to the | ||
// TagDecl itself (not a pointer to it, etc.), then 3C uses that name for the | ||
// TagDecl rather than generating a new one. This produces nicer output for the | ||
// idiom: | ||
// | ||
// typedef struct { int *y; } FOO, *PFOO; | ||
// | ||
// -> | ||
// | ||
// typedef struct { _Ptr<int> y; } FOO; | ||
// typedef _Ptr<FOO> PFOO; | ||
// | ||
// The multi-decl code is used even for "multi-decls" of VarDecls, FieldDecls, | ||
// or TypedefDecls that have only a single member to avoid having to maintain a | ||
// separate code path for them. But a multi-decl always has at least one member; | ||
// a pure TagDecl such as `struct my_struct { int *y; };` is _not_ considered a | ||
// multi-decl. ParmVarDecls are handled differently. In fact, ParmVarDecls with | ||
// inline TagDecls are known to be handled poorly, but that's a rare and poor | ||
// practice and it's not easy to handle them better. | ||
|
||
// Currently, we automatically generate a name for every unnamed TagDecl defined | ||
// in a multi-decl and use the name in ConstraintVariables, but we only insert | ||
// the name into the definition if the multi-decl gets rewritten for some other | ||
// reason. This solves the common case of allowing the types of all the | ||
// multi-decl members to refer to the TagDecl, but it doesn't address cases in | ||
// which 3C might need to insert a reference to the unnamed TagDecl elsewhere | ||
// even if the multi-decl isn't being rewritten. In these cases, 3C typically | ||
// uses the generated name even though it is not defined, causing a compile | ||
// error that the user has to correct manually. The problematic cases include: | ||
// | ||
// - Type argument insertion. TypeVariableEntry has a check for | ||
// `isTypeAnonymous`, but it has at least one bug (it misses double pointers). | ||
// | ||
// - Cast insertion, potentially. I was unable to find an example, but that | ||
// doesn't mean it will never happen, especially with future changes to the | ||
// code. | ||
// | ||
// - Typedef itype insertion. | ||
// | ||
// One approach to try to rule out all of these bugs at once would be to | ||
// preemptively rewrite all multi-decls containing unnamed TagDecls, but those | ||
// changes might be undesirable or could even cause errors in the presence of | ||
// macros, etc. Or we could try to add the necessary code so that insertion of a | ||
// reference to an unnamed TagDecl would trigger insertion of the name into the | ||
// definition. For now, we don't deal with the problem. | ||
|
||
// Implementation note: The Clang AST does not represent multi-decls explicitly | ||
// (except in functions, where they are represented by DeclStmts). In other | ||
// contexts, we detect them based on the property that the beginning | ||
// SourceLocation of all the members is the same. And as long as we are making | ||
// this assumption, we use it in functions too rather than having a separate | ||
// code path that looks for DeclStmts. | ||
|
||
// NamedDecl is the nearest common superclass of all Decl subtypes that can be | ||
// multi-decl members. There is no enforcement that a MultiDeclMemberDecl is | ||
// actually one of the allowed subtypes, so use of the MultiDeclMemberDecl | ||
// typedef serves as documentation only. (If we wanted to enforce it, we'd need | ||
// a wrapper object of some kind, which currently seems to be more trouble than | ||
// it's worth.) | ||
typedef NamedDecl MultiDeclMemberDecl; | ||
|
||
// Returns D if it can be a multi-decl member, otherwise null. | ||
MultiDeclMemberDecl *getAsMultiDeclMember(Decl *D); | ||
|
||
// Helpers to cope with the different APIs to do corresponding things with a | ||
// TypedefDecl or DeclaratorDecl. | ||
QualType getTypeOfMultiDeclMember(MultiDeclMemberDecl *MMD); | ||
TypeSourceInfo *getTypeSourceInfoOfMultiDeclMember(MultiDeclMemberDecl *MMD); | ||
|
||
struct MultiDeclInfo { | ||
// The TagDecl that is defined inline in the multi-decl and needs to be split | ||
// from it during rewriting, if any, otherwise null. In a case like | ||
// `typedef struct { ... } T`, there is an inline tag definition but we don't | ||
// need to split it out, so this will be null. | ||
TagDecl *TagDefToSplit = nullptr; | ||
|
||
// True if the base type was an unnamed TagDecl defined inline for which we | ||
// are using a new name. Note that TagDefToSplit can be nonnull and | ||
// BaseTypeRenamed can be false if the inline TagDecl was named, and the | ||
// reverse can occur in the `typedef struct { ... } T` case. | ||
bool BaseTypeRenamed = false; | ||
|
||
// The members of the multi-decl in their original order. | ||
std::vector<MultiDeclMemberDecl *> Members; | ||
|
||
// Set by DeclRewriter::rewriteMultiDecl after it rewrites the entire | ||
// multi-decl to ensure that it doesn't try to do so more than once if | ||
// multiple members needed changes. | ||
// REVIEW: A design argument could be made that this flag doesn't belong here | ||
// and the rewriter should instead keep a visited set or something like that. | ||
bool AlreadyRewritten = false; | ||
}; | ||
|
||
struct TUMultiDeclsInfo { | ||
// All multi-decls, keyed by the common beginning source location of their | ||
// members. Note that the beginning source location of TagDefToSplit may be | ||
// later if there is a keyword such as `static` or `typedef` in between. | ||
std::map<SourceLocation, MultiDeclInfo> MultiDeclsByBeginLoc; | ||
|
||
// Map from a tag definition to its containing multi-decl (if it is part of | ||
// one). Note that the TagDefToSplit of the MultiDeclInfo is not guaranteed to | ||
// equal the TagDecl: it may be null in the `typedef struct { ... } T` case. | ||
// | ||
// Note that the MultiDeclInfo pointers remain valid for as long as the | ||
// MultiDeclInfo objects remain in MultiDeclsByBeginLoc: see | ||
// https://en.cppreference.com/w/cpp/container#Iterator_invalidation. | ||
std::map<TagDecl *, MultiDeclInfo *> ContainingMultiDeclOfTagDecl; | ||
}; | ||
|
||
class ProgramMultiDeclsInfo { | ||
private: | ||
// Set of TagDecl names already used at least once in the program, so we can | ||
// avoid colliding with them. | ||
std::set<std::string> UsedTagNames; | ||
|
||
// Information about an originally unnamed tag definition in a multi-decl for | ||
// which we're using a new name. | ||
struct RenamedTagDefInfo { | ||
// The new string that should be used to refer to the type of the TagDecl. | ||
// Unlike UsedTagNames, this includes the tag kind keyword (such as | ||
// `struct`), except when we use an existing typedef (which doesn't require | ||
// a tag keyword). | ||
std::string AssignedTypeStr; | ||
// Whether the TagDecl should be split from the multi-decl. True except when | ||
// we use an existing typedef. | ||
bool ShouldSplit; | ||
}; | ||
|
||
// Map from PSL of a TagDecl to its RenamedTagDefInfo, to ensure that we | ||
// handle the TagDecl consistently when 3C naively rewrites the same header | ||
// file multiple times as part of different translation units (see | ||
// https://github.com/correctcomputation/checkedc-clang/issues/374#issuecomment-804283984). | ||
std::map<PersistentSourceLoc, RenamedTagDefInfo> RenamedTagDefs; | ||
|
||
std::map<ASTContext *, TUMultiDeclsInfo> TUInfos; | ||
|
||
// Recursive helpers. | ||
void findUsedTagNames(DeclContext *DC); | ||
void findMultiDecls(DeclContext *DC, ASTContext &Context); | ||
|
||
public: | ||
void findUsedTagNames(ASTContext &Context); | ||
void findMultiDecls(ASTContext &Context); | ||
llvm::Optional<std::string> getTypeStrOverride(const Type *Ty, | ||
const ASTContext &C); | ||
MultiDeclInfo *findContainingMultiDecl(MultiDeclMemberDecl *MMD); | ||
MultiDeclInfo *findContainingMultiDecl(TagDecl *TD); | ||
bool wasBaseTypeRenamed(Decl *D); | ||
}; | ||
|
||
#endif // LLVM_CLANG_3C_MULTIDECLS_H |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC
VisitedMultiDeclMembers
in theDeclRewriter
class does approximately this. I don't have a strong opinion on if the information should be tracked there or here, but do you have a reason for adding this field whenVisitedMultiDeclMembers
already existed?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VisitedMultiDeclMembers
is removed in this PR. So shall I leaveAlreadyRewritten
here or remove it in favor of aVisitedMultiDecls
analogous to the previousVisitedMultiDeclMembers
?