Reapply "[C++20][Modules] Implement P1857R3 Modules Dependency Discovery" #173130
…ery" This reverts commit 2b8b305.
…hit the end of TokenLexer Signed-off-by: Wang, Yihan <[email protected]>
Signed-off-by: Wang, Yihan <[email protected]>
@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-clang

Author: None (yronglin)

Changes

This PR reapplies #107168.

Patch is 141.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/173130.diff

44 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 994ac444d4aa1..abc6dab2f0614 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,7 @@ C++20 Feature Support
- Clang now normalizes constraints before checking whether they are satisfied, as mandated by the standard.
As a result, Clang no longer incorrectly diagnoses substitution failures in template arguments only
used in concept-ids, and produces better diagnostics for satisfaction failure. (#GH61811) (#GH135190)
+- Clang now supports `P1857R3 <https://wg21.link/p1857r3>`_ Modules Dependency Discovery. (#GH54047)
C++17 Feature Support
^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/docs/StandardCPlusPlusModules.rst b/clang/docs/StandardCPlusPlusModules.rst
index 71988d0fced98..f6ab17ede46fa 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -1384,33 +1384,6 @@ declarations which use it. Thus, the preferred name will not be displayed in
the debugger as expected. This is tracked by
`#56490 <https://github.com/llvm/llvm-project/issues/56490>`_.
-Don't emit macros about module declaration
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This is covered by `P1857R3 <https://wg21.link/P1857R3>`_. It is mentioned here
-because we want users to be aware that we don't yet implement it.
-
-A direct approach to write code that can be compiled by both modules and
-non-module builds may look like:
-
-.. code-block:: c++
-
- MODULE
- IMPORT header_name
- EXPORT_MODULE MODULE_NAME;
- IMPORT header_name
- EXPORT ...
-
-The intent of this is that this file can be compiled like a module unit or a
-non-module unit depending on the definition of some macros. However, this usage
-is forbidden by P1857R3 which is not yet implemented in Clang. This means that
-is possible to write invalid modules which will no longer be accepted once
-P1857R3 is implemented. This is tracked by
-`#54047 <https://github.com/llvm/llvm-project/issues/54047>`_.
-
-Until then, it is recommended not to mix macros with module declarations.
-
-
Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index a72d3f37b1b72..77feea9f869e9 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -503,8 +503,8 @@ def warn_cxx98_compat_variadic_macro : Warning<
InGroup<CXX98CompatPedantic>, DefaultIgnore;
def ext_named_variadic_macro : Extension<
"named variadic macros are a GNU extension">, InGroup<VariadicMacros>;
-def err_embedded_directive : Error<
- "embedding a #%0 directive within macro arguments is not supported">;
+def err_embedded_directive : Error<"embedding a %select{#|C++ }0%1 directive "
+ "within macro arguments is not supported">;
def ext_embedded_directive : Extension<
"embedding a directive within macro arguments has undefined behavior">,
InGroup<DiagGroup<"embedded-directive">>;
@@ -998,6 +998,21 @@ def warn_module_conflict : Warning<
InGroup<ModuleConflict>;
// C++20 modules
+def err_pp_module_name_is_macro : Error<
+ "%select{module|partition}0 name component %1 cannot be an object-like macro">;
+def err_pp_module_expected_ident : Error<
+ "expected %select{identifier after '.' in |}0module name">;
+def err_pp_unexpected_tok_after_module_name : Error<
+ "unexpected preprocessing token '%0' after module name, "
+ "only ';' and '[' (start of attribute specifier sequence) are allowed">;
+def warn_pp_extra_tokens_at_module_directive_eol
+ : Warning<"extra tokens after semicolon in '%0' directive">,
+ InGroup<ExtraTokens>;
+def err_pp_module_decl_in_header
+ : Error<"module declaration must not come from an #include directive">;
+def err_pp_cond_span_module_decl
+ : Error<"module directive lines are not allowed on lines controlled "
+ "by preprocessor conditionals">;
def err_header_import_semi_in_macro : Error<
"semicolon terminating header import declaration cannot be produced "
"by a macro">;
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 662fe16d965b6..83d4ce3ca278c 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1792,10 +1792,8 @@ def ext_bit_int : Extension<
} // end of Parse Issue category.
let CategoryName = "Modules Issue" in {
-def err_unexpected_module_decl : Error<
- "module declaration can only appear at the top level">;
-def err_module_expected_ident : Error<
- "expected a module name after '%select{module|import}0'">;
+def err_unexpected_module_or_import_decl : Error<
+ "%select{module|import}0 declaration can only appear at the top level">;
def err_attribute_not_module_attr : Error<
"%0 attribute cannot be applied to a module">;
def err_keyword_not_module_attr : Error<
@@ -1806,6 +1804,10 @@ def err_keyword_not_import_attr : Error<
"%0 cannot be applied to a module import">;
def err_module_expected_semi : Error<
"expected ';' after module name">;
+def err_expected_semi_after_module_or_import
+ : Error<"%0 directive must end with a ';'">;
+def note_module_declared_here : Note<
+ "%select{module|import}0 directive defined here">;
def err_global_module_introducer_not_at_start : Error<
"'module;' introducing a global module fragment can appear only "
"at the start of the translation unit">;
diff --git a/clang/include/clang/Basic/IdentifierTable.h b/clang/include/clang/Basic/IdentifierTable.h
index 043c184323876..1131727ed23ee 100644
--- a/clang/include/clang/Basic/IdentifierTable.h
+++ b/clang/include/clang/Basic/IdentifierTable.h
@@ -231,6 +231,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
LLVM_PREFERRED_TYPE(bool)
unsigned IsModulesImport : 1;
+ // True if this is the 'module' contextual keyword.
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned IsModulesDecl : 1;
+
// True if this is a mangled OpenMP variant name.
LLVM_PREFERRED_TYPE(bool)
unsigned IsMangledOpenMPVariantName : 1;
@@ -267,8 +271,9 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
IsCPPOperatorKeyword(false), NeedsHandleIdentifier(false),
IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
- IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
- IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}
+ IsModulesDecl(false), IsMangledOpenMPVariantName(false),
+ IsDeprecatedMacro(false), IsRestrictExpansion(false), IsFinal(false),
+ IsKeywordInCpp(false) {}
public:
IdentifierInfo(const IdentifierInfo &) = delete;
@@ -569,12 +574,24 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
}
/// Determine whether this is the contextual keyword \c import.
- bool isModulesImport() const { return IsModulesImport; }
+ bool isImportKeyword() const { return IsModulesImport; }
/// Set whether this identifier is the contextual keyword \c import.
- void setModulesImport(bool I) {
- IsModulesImport = I;
- if (I)
+ void setKeywordImport(bool Val) {
+ IsModulesImport = Val;
+ if (Val)
+ NeedsHandleIdentifier = true;
+ else
+ RecomputeNeedsHandleIdentifier();
+ }
+
+ /// Determine whether this is the contextual keyword \c module.
+ bool isModuleKeyword() const { return IsModulesDecl; }
+
+ /// Set whether this identifier is the contextual keyword \c module.
+ void setModuleKeyword(bool Val) {
+ IsModulesDecl = Val;
+ if (Val)
NeedsHandleIdentifier = true;
else
RecomputeNeedsHandleIdentifier();
@@ -629,7 +646,7 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
void RecomputeNeedsHandleIdentifier() {
NeedsHandleIdentifier = isPoisoned() || hasMacroDefinition() ||
isExtensionToken() || isFutureCompatKeyword() ||
- isOutOfDate() || isModulesImport();
+ isOutOfDate() || isImportKeyword();
}
};
@@ -797,10 +814,11 @@ class IdentifierTable {
// contents.
II->Entry = &Entry;
- // If this is the 'import' contextual keyword, mark it as such.
+ // If this is the 'import' or 'module' contextual keyword, mark it as such.
if (Name == "import")
- II->setModulesImport(true);
-
+ II->setKeywordImport(true);
+ else if (Name == "module")
+ II->setModuleKeyword(true);
return *II;
}
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 3d955095b07a8..a3d286fdb81a7 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -133,6 +133,11 @@ PPKEYWORD(pragma)
// C23 & C++26 #embed
PPKEYWORD(embed)
+// C++20 Module Directive
+PPKEYWORD(module)
+PPKEYWORD(__preprocessed_module)
+PPKEYWORD(__preprocessed_import)
+
// GNU Extensions.
PPKEYWORD(import)
PPKEYWORD(include_next)
@@ -1030,6 +1035,9 @@ ANNOTATION(module_include)
ANNOTATION(module_begin)
ANNOTATION(module_end)
+// Annotations for C++, Clang and Objective-C named modules.
+ANNOTATION(module_name)
+
// Annotation for a header_name token that has been looked up and transformed
// into the name of a header unit.
ANNOTATION(header_unit)
diff --git a/clang/include/clang/Basic/TokenKinds.h b/clang/include/clang/Basic/TokenKinds.h
index a801113c57715..c0316257d9d97 100644
--- a/clang/include/clang/Basic/TokenKinds.h
+++ b/clang/include/clang/Basic/TokenKinds.h
@@ -76,6 +76,10 @@ const char *getPunctuatorSpelling(TokenKind Kind) LLVM_READNONE;
/// tokens like 'int' and 'dynamic_cast'. Returns NULL for other token kinds.
const char *getKeywordSpelling(TokenKind Kind) LLVM_READNONE;
+/// Determines the spelling of simple Objective-C keyword tokens like '@import'.
+/// Returns NULL for other token kinds.
+const char *getObjCKeywordSpelling(ObjCKeywordKind Kind) LLVM_READNONE;
+
/// Returns the spelling of preprocessor keywords, such as "else".
const char *getPPKeywordSpelling(PPKeywordKind Kind) LLVM_READNONE;
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index a8e8461b9b5a9..42ef3ea7b355a 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -893,7 +893,7 @@ class CompilerInstance : public ModuleLoader {
/// load it.
ModuleLoadResult findOrCompileModuleAndReadAST(StringRef ModuleName,
SourceLocation ImportLoc,
- SourceLocation ModuleNameLoc,
+ SourceRange ModuleNameRange,
bool IsInclusionDirective);
/// Creates a \c CompilerInstance for compiling a module.
diff --git a/clang/include/clang/Lex/CodeCompletionHandler.h b/clang/include/clang/Lex/CodeCompletionHandler.h
index bd3e05a36bb33..2ef29743415ae 100644
--- a/clang/include/clang/Lex/CodeCompletionHandler.h
+++ b/clang/include/clang/Lex/CodeCompletionHandler.h
@@ -13,12 +13,15 @@
#ifndef LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
#define LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
+#include "clang/Basic/IdentifierTable.h"
+#include "clang/Basic/SourceLocation.h"
#include "llvm/ADT/StringRef.h"
namespace clang {
class IdentifierInfo;
class MacroInfo;
+using ModuleIdPath = ArrayRef<IdentifierLoc>;
/// Callback handler that receives notifications when performing code
/// completion within the preprocessor.
@@ -70,6 +73,11 @@ class CodeCompletionHandler {
/// file where we expect natural language, e.g., a comment, string, or
/// \#error directive.
virtual void CodeCompleteNaturalLanguage() { }
+
+ /// Callback invoked when performing code completion inside the module name
+ /// part of an import directive.
+ virtual void CodeCompleteModuleImport(SourceLocation ImportLoc,
+ ModuleIdPath Path) {}
};
}
diff --git a/clang/include/clang/Lex/DependencyDirectivesScanner.h b/clang/include/clang/Lex/DependencyDirectivesScanner.h
index f9fec3998ca53..b21da166a96e5 100644
--- a/clang/include/clang/Lex/DependencyDirectivesScanner.h
+++ b/clang/include/clang/Lex/DependencyDirectivesScanner.h
@@ -135,6 +135,22 @@ void printDependencyDirectivesAsSource(
ArrayRef<dependency_directives_scan::Directive> Directives,
llvm::raw_ostream &OS);
+/// Scan an input source buffer for C++20 named module usage.
+///
+/// \param Source The input source buffer.
+///
+/// \returns true if any C++20 named modules related directive was found.
+bool scanInputForCXX20ModulesUsage(StringRef Source);
+
+/// Scan an input source buffer, and check whether the input source is a
+/// preprocessed output.
+///
+/// \param Source The input source buffer.
+///
+/// \returns true if any '__preprocessed_module' or '__preprocessed_import'
+/// directive was found.
+bool isPreprocessedModuleFile(StringRef Source);
+
/// Functor that returns the dependency directives for a given file.
class DependencyDirectivesGetter {
public:
diff --git a/clang/include/clang/Lex/ModuleLoader.h b/clang/include/clang/Lex/ModuleLoader.h
index a58407200c41c..042a5ab1f4a57 100644
--- a/clang/include/clang/Lex/ModuleLoader.h
+++ b/clang/include/clang/Lex/ModuleLoader.h
@@ -159,6 +159,7 @@ class ModuleLoader {
/// \returns Returns true if any modules with that symbol found.
virtual bool lookupMissingImports(StringRef Name,
SourceLocation TriggerLoc) = 0;
+ static std::string getFlatNameFromPath(ModuleIdPath Path);
bool HadFatalFailure = false;
};
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index b1c648e647f41..c8356b1dd45e4 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -48,6 +48,7 @@
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/Registry.h"
+#include "llvm/Support/TrailingObjects.h"
#include <cassert>
#include <cstddef>
#include <cstdint>
@@ -136,6 +137,64 @@ struct CXXStandardLibraryVersionInfo {
std::uint64_t Version;
};
+/// Record the previous 'export' keyword info.
+///
+/// Since P1857R3, the standard introduced several rules to determine whether
+/// the 'module', 'export module', 'import', 'export import' is a valid
+/// directive introducer. This class is used to record the previous 'export'
+/// keyword token, and then handle 'export module' and 'export import'.
+class ExportContextualKeywordInfo {
+ Token ExportTok;
+ bool AtPhysicalStartOfLine = false;
+
+public:
+ ExportContextualKeywordInfo() = default;
+ ExportContextualKeywordInfo(const Token &Tok, bool AtPhysicalStartOfLine)
+ : ExportTok(Tok), AtPhysicalStartOfLine(AtPhysicalStartOfLine) {}
+
+ bool isValid() const { return ExportTok.is(tok::kw_export); }
+ bool isAtPhysicalStartOfLine() const { return AtPhysicalStartOfLine; }
+ Token getExportTok() const { return ExportTok; }
+ void reset() {
+ ExportTok.startToken();
+ AtPhysicalStartOfLine = false;
+ }
+};
+
+class ModuleNameLoc final
+ : llvm::TrailingObjects<ModuleNameLoc, IdentifierLoc> {
+ friend TrailingObjects;
+ unsigned NumIdentifierLocs;
+ unsigned numTrailingObjects(OverloadToken<IdentifierLoc>) const {
+ return getNumIdentifierLocs();
+ }
+
+ ModuleNameLoc(ModuleIdPath Path) : NumIdentifierLocs(Path.size()) {
+ (void)llvm::copy(Path, getTrailingObjectsNonStrict<IdentifierLoc>());
+ }
+
+public:
+ static ModuleNameLoc *Create(Preprocessor &PP, ModuleIdPath Path);
+ unsigned getNumIdentifierLocs() const { return NumIdentifierLocs; }
+ ModuleIdPath getModuleIdPath() const {
+ return {getTrailingObjectsNonStrict<IdentifierLoc>(),
+ getNumIdentifierLocs()};
+ }
+
+ SourceLocation getBeginLoc() const {
+ return getModuleIdPath().front().getLoc();
+ }
+ SourceLocation getEndLoc() const {
+ auto &Last = getModuleIdPath().back();
+ return Last.getLoc().getLocWithOffset(
+ Last.getIdentifierInfo()->getLength());
+ }
+ SourceRange getRange() const { return {getBeginLoc(), getEndLoc()}; }
+ std::string str() const {
+ return ModuleLoader::getFlatNameFromPath(getModuleIdPath());
+ }
+};
+
/// Engages in a tight little dance with the lexer to efficiently
/// preprocess tokens.
///
@@ -339,8 +398,9 @@ class Preprocessor {
/// lexed, if any.
SourceLocation ModuleImportLoc;
- /// The import path for named module that we're currently processing.
- SmallVector<IdentifierLoc, 2> NamedModuleImportPath;
+ /// The source location of the \c module contextual keyword we just
+ /// lexed, if any.
+ SourceLocation ModuleDeclLoc;
llvm::DenseMap<FileID, SmallVector<const char *>> CheckPoints;
unsigned CheckPointCounter = 0;
@@ -351,6 +411,12 @@ class Preprocessor {
/// Whether the last token we lexed was an '@'.
bool LastTokenWasAt = false;
+ /// Whether we're importing a standard C++20 named module.
+ bool ImportingCXXNamedModules = false;
+
+ /// Whether the last token we lexed was an 'export' keyword.
+ ExportContextualKeywordInfo LastTokenWasExportKeyword;
+
/// First pp-token source location in current translation unit.
SourceLocation FirstPPTokenLoc;
@@ -562,9 +628,9 @@ class Preprocessor {
reset();
}
- void handleIdentifier(IdentifierInfo *Identifier) {
- if (isModuleCandidate() && Identifier)
- Name += Identifier->getName().str();
+ void handleModuleName(ModuleNameLoc *NameLoc) {
+ if (isModuleCandidate() && NameLoc)
+ Name += NameLoc->str();
else if (!isNamedModule())
reset();
}
@@ -576,13 +642,6 @@ class Preprocessor {
reset();
}
- void handlePeriod() {
- if (isModuleCandidate())
- Name += ".";
- else if (!isNamedModule())
- reset();
- }
-
void handleSemi() {
if (!Name.empty() && isModuleCandidate()) {
if (State == InterfaceCandidate)
@@ -639,10 +698,6 @@ class Preprocessor {
ModuleDeclSeq ModuleDeclState;
- /// Whether the module import expects an identifier next. Otherwise,
- /// it expects a '.' or ';'.
- bool ModuleImportExpectsIdentifier = false;
-
/// The identifier and source location of the currently-active
/// \#pragma clang arc_cf_code_audited begin.
IdentifierLoc PragmaARCCFCodeAuditedInfo;
@@ -1125,6 +1180,9 @@ class Preprocessor {
/// Whether tokens are being skipped until the through header is seen.
bool SkippingUntilPCHThroughHeader = false;
+ /// Whether the main file is a preprocessed module file.
+ bool MainFileIsPreprocessedModuleFile = false;
+
/// \{
/// Cache of macro expanders to reduce malloc traffic.
enum { TokenLexerCacheSize = 8 };
@@ -1778,6 +1836,36 @@ class Preprocessor {
std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
bool ForHasEmbed);
+ /// Whether the main file is a preprocessed module file.
+ bool isPreprocessedModuleFile() const {
+ return MainFileIsPreprocessedModuleFile;
+ }
+
+ /// Mark the main file as a preprocessed module file, then the 'module' and
+ /// 'import' directive recognition will be suppressed. Only
+ /// '__preprocessed_module' and '__preprocessed_import' are allowed.
+ void markMainFileAsPreprocessedModuleFile() {
+ MainFileIsPreprocessedModuleFile = true;
+ }
+
+ bool LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
+ SmallVectorImpl<Token> &Suffix,
+ SmallVectorImpl<IdentifierLoc> &Path,
+ bool AllowMacroExpansion = true,
+ bool IsPartition = false);
+ void EnterModuleSuffixTokenStream(ArrayRef<Token> Toks);
+ void HandleCXXImportDirective(Token Import);
+ void HandleCXXModuleDirective(Token Module);
+
+ /// Callback invoked when the lexer sees one of export, import or module token
+ /// at the start of a line.
+ ///
+ /// This consumes the...
[truncated]
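For readers following the `ModuleNameLoc` class in the `Preprocessor.h` hunk: it stores a module name as a sequence of identifier/location pairs and derives both a flat string (`str()`) and an end location (`getEndLoc()`) from it. The body of `ModuleLoader::getFlatNameFromPath` is not visible in the truncated diff, so the following is only a minimal standalone sketch of the idea. It assumes that components are joined with `.`, and it models source locations as plain byte offsets. `Component`, `flatName`, and `endOffset` are hypothetical stand-ins for illustration, not Clang APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang's IdentifierLoc: a spelling plus the
// byte offset at which that spelling starts in the source buffer.
struct Component {
  std::string Name;
  std::size_t Offset;
};

// Join the components with '.' (an assumed separator),
// e.g. {"std", "compat"} becomes "std.compat".
std::string flatName(const std::vector<Component> &Path) {
  std::string Result;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (I != 0)
      Result += '.';
    Result += Path[I].Name;
  }
  return Result;
}

// One-past-the-end offset of the whole name: the start of the last
// component plus its length. This mirrors the arithmetic shown in
// ModuleNameLoc::getEndLoc(). The path must not be empty.
std::size_t endOffset(const std::vector<Component> &Path) {
  assert(!Path.empty() && "a module name needs at least one component");
  return Path.back().Offset + Path.back().Name.size();
}
```

As a usage example, for the line `import std.compat;` the component `std` starts at offset 7 and `compat` at offset 11, so `flatName` yields `std.compat` and `endOffset` yields 17, the position of the terminating `;`.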
@ilovepi Could you help verify whether this branch can resolve this crash issue? Many thanks!
Please see my request (not a demand) to hold off on this until NY: #107168 (comment)
@aemerson are you fine with landing this if @yronglin watches over the bots? @yronglin Did you run msan stage 2 builds locally?