Correct 2 x grammar rules for compilation unit name in #line #1120
base: draft-v8
Probably need to expand the changes here, or spin off a new issue/PR to fix more.
@@ -1488,12 +1488,12 @@ fragment PP_Line_Indicator
    ;

fragment PP_Compilation_Unit_Name
-    : '"' PP_Compilation_Unit_Name_Character+ '"'
+    : '"' PP_Compilation_Unit_Name_Character* '"'
We should check this is by design before making this change. Regardless of the answer there is probably some work to do:
- §14.2 Compilation units defines a compilation unit, the definition does not include them having names… or being files… However the example (non-normative) uses two files, A.cs and B.cs, and refers to them as two compilation units.
- §22.5.6.3 The CallerFilePath attribute, which provides the file path (which is implementation-dependent), states “The file path may be affected by #line directives (§6.5.8)”.
- Here in §6.5.8 the #line directive allows the setting of the “compilation unit name”.
So:
- §6.5.8 states compilation units have names, which is omitted in §14.2; and
- §22.5.6.3 tells us that the name is the file path, but leaves what that is implementation-dependent
What already exists isn’t overly clear, and this change seeks to allow the compilation unit name to be the empty string, which is probably not a valid implementation-dependent path on any implementation… So if this observed compiler behaviour is by design then it surely needs to have a defined meaning in the Standard.
If this change is to be made, and even if not, this all needs to be tidied up, either in this PR or spun off into a new one.
I might ask what the intended use of an empty file path/compilation unit name is, but I might know: it was requested by the NSA so that the file names in NSA-distributed software are not leaked, which would endanger National Security! I’m only partially joking here, but that’s a story for another time ;-)
@BillWagner told me that allowing an empty string was indeed a conscious decision, so I propose keeping that edit.
§6.1 Programs states:
A C# program consists of one or more source files, known formally as compilation units (§14.2). Although a compilation unit might have a one-to-one correspondence with a file in a file system, such correspondence is not required.
I propose appending to this, the following:
… As such, the accepted spelling of a compilation unit name, and its mapping, if any, to a filename is outside the scope of this specification.
I'm deliberately avoiding using any of the following terms:
- behavior, implementation-defined – unspecified behavior where each implementation documents how the choice is made
- behavior, undefined – behavior, upon use of a non-portable or erroneous construct or of erroneous data, for which this specification imposes no requirements
- behavior, unspecified – behavior where this specification provides two or more possibilities and imposes no further requirements on which is chosen in any instance
I'd be fine with that extra text - Rex, do you think it's worth adding that to this PR, so we can merge it all in one go?
    ;

fragment PP_Compilation_Unit_Name_Character
    // Any Input_Character except "
-    : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029' | '#')
+    : ~('\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029' | '"')
The # was a typo so should be fixed…
However, §22.5.6.3 defines the format of the compilation unit name/file path as “implementation-dependent”. So this section might need a semantic rule saying this arbitrary string must conform to the same implementation-dependent rules as §22.5.6.3, or that it does not need to (i.e. it need not be valid as a file path).
Hmmm... I'm not sure. It feels like it would be okay to allow values which aren't valid filenames, when specifying the compilation unit name directly in code, even if the CallerFilePathAttribute could never automatically generate such a name. (Indeed, I can see some cases where that would even be useful!)
I think this is speaking in favor of having a semantic rule saying "that it does not need to".
Anecdotally, I observe that Roslyn is okay with this:
#line 100 ":invalid:"
and even:
#line 100 ".."
(Interestingly, for the latter, it reports any subsequent error as belonging to the parent directory of the directory containing the file - it doesn't report it as ".." verbatim...)
I've only just tried the former (":invalid:") with code containing errors after it: Roslyn crashes:
error MSB6006: "csc.dll" exited with code 1.
(But if there are no warnings/errors that need reporting, it's fine...)
closing and reopening to rerun the CI builds.
6.5.8 Line directives, states
Note the presence of "reported": there is no requirement that such a reported name map to anything in a file system, so why not allow an empty string name?
From the 2024-09-04 TG2 call: There appear to be 2 issues:
After a short discussion, Rex agreed to take another look.
Having looked at this again, I don't have more to add, and I've re-added the "Meeting discuss" label.
Good catch, @logeshkumars0604!
This PR addresses Issue #1118.
The " vs. # error was introduced in V6 when we converted to the ANTLR grammar notation.
The ability to have an empty filename was never tested, but as you point out, it does work, so I have made the name contents zero-or-more characters instead of one-or-more. I tested this using
#line 300 ""
along with CallerFilePathAttribute.
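As a rough illustration of what the two grammar fixes change, each version of the fragment can be modelled as a regular expression. This is only a sketch written for this discussion, not part of the PR: the names OLD_NAME, NEW_NAME and accepts are hypothetical, and the regexes approximate the ANTLR fragments rather than reproduce the lexer.

```python
import re

# Line terminators excluded from the name: CR, LF, NEL, LS, PS
# (the same five code points listed in the grammar fragment).
NEWLINES = "\r\n\u0085\u2028\u2029"

# Approximation of the rule before the PR: one-or-more characters,
# with '#' excluded (the typo) and '"' therefore allowed inside the name.
OLD_NAME = re.compile('"[^%s#]+"' % NEWLINES)

# Approximation of the rule after the PR: zero-or-more characters,
# with '"' excluded, matching the comment "Any Input_Character except "".
NEW_NAME = re.compile('"[^%s"]*"' % NEWLINES)


def accepts(pattern, text):
    """Return True if the whole of text matches the given name pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ['""', '"a#b.cs"', '"a"b"', '":invalid:"']:
        print(sample, accepts(OLD_NAME, sample), accepts(NEW_NAME, sample))
```

Under this model the old rule rejects the empty name and any name containing #, while letting a stray " through; the new rule does the reverse. It says nothing about how a compiler treats the resulting name, which is the open question in the review comments.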