Show String outputs illegal escape sequence \& in string literals #258

Open
stengerh opened this issue Feb 5, 2021 · 7 comments
Labels
purs-0.15 A reminder to address this issue or merge this PR before we release PureScript v0.15.0 type: breaking change A change that requires a major version bump. type: bug Something that should function correctly isn't.

Comments

@stengerh

stengerh commented Feb 5, 2021

Description

The REPL outputs string literals containing the escape sequence \& but this escape sequence is not accepted as input.

To Reproduce

In the REPL:

> "\x0000001"
"\0\&1"

> "\0\&1"
Illegal character escape code at line 1, column 3

Expected behavior

The REPL outputs valid string literals.

Additional context

Haskell supports \& to delimit a numeric escape sequence from the characters that follow it; see The zero-width escape sequence. This is exactly how the PureScript REPL uses \& to produce canonicalized string literals. However, the compiler does not actually accept this escape sequence. If it did, this would provide a solution to the problem in purescript/purescript#3750.
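
To illustrate why a delimiter is needed when decimal escapes are read greedily, here is a minimal JavaScript sketch of a decoder for just two Haskell-style escapes: decimal escapes and \&. The function name decodeDecimalEscapes is hypothetical, and the sketch ignores every other escape and any range checks, so it is an illustration under those assumptions rather than how Haskell or PureScript actually lex strings.

```javascript
// Hypothetical decoder for a tiny subset of Haskell-style escapes:
// decimal escapes (e.g. \65) and the empty escape (\&).
// ASSUMPTION: digits after a backslash are consumed greedily.
function decodeDecimalEscapes(literal) {
  return literal.replace(/\\(\d+|&)/g, function (_, body) {
    // "\&" expands to nothing; it only terminates the preceding escape.
    if (body === "&") return "";
    return String.fromCharCode(parseInt(body, 10));
  });
}

// With the delimiter: a NUL character followed by the digit "1".
var withDelimiter = decodeDecimalEscapes("\\0\\&1");
// Without it: the digits merge into one escape, giving the single character U+0001.
var withoutDelimiter = decodeDecimalEscapes("\\01");
```

Under these assumptions the first literal decodes to two characters and the second to one, which is the ambiguity that \& resolves.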

PureScript version

0.13.8

@stengerh
Author

stengerh commented Feb 5, 2021

I discovered this while experimenting with the PureScript plugin for IntelliJ and comparing its syntax highlighting with the behavior of the PureScript compiler/REPL. Unfortunately, I could not find any documentation on the escape sequences that PureScript supports, at least not in the sections on Syntax and Differences from Haskell. The only source of truth here is the source code of the PureScript compiler itself.

This lack of documentation seems to have caused some confusion in PureScript itself and even more so in the IntelliJ plugin. I can prepare a pull request for the documentation, so others can benefit from what I learned.

@MonoidMusician

I can confirm this still affects 0.14. Interestingly, it doesn't affect type-level symbols (at least in 0.14):

> "\x0000001"
"\0\&1"

> "\x000001"
"\1"

> data SProxy (s :: Symbol) = SProxy
> :t SProxy :: SProxy "\x0000001"
SProxy "\x0000001"

> :t SProxy :: SProxy "\x000001"
SProxy "\x000001"

> :t SProxy :: SProxy "\x00001"
SProxy "\x000001"

I'm not familiar enough with the pretty printer to diagnose/fix this.

@MonoidMusician

Wait. I just realized that this is an issue with the Show String instance in the prelude, not the compiler. See

exports.showStringImpl = function (s) {
  var l = s.length;
  return "\"" + s.replace(
    /[\0-\x1F\x7F"\\]/g, // eslint-disable-line no-control-regex
    function (c, i) {
      switch (c) {
        case "\"":
        case "\\":
          return "\\" + c;
        case "\x07": return "\\a";
        case "\b": return "\\b";
        case "\f": return "\\f";
        case "\n": return "\\n";
        case "\r": return "\\r";
        case "\t": return "\\t";
        case "\v": return "\\v";
      }
      var k = i + 1;
      var empty = k < l && s[k] >= "0" && s[k] <= "9" ? "\\&" : "";
      return "\\" + c.charCodeAt(0).toString(10) + empty;
    }
  ) + "\"";
};

We should probably update it to match the compiler's output, though.
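
One possible direction, as a rough sketch only: emit fixed-width hexadecimal escapes in the style the REPL prints for type-level symbols above (\x followed by six hex digits). The name showStringHex is hypothetical and this is not the prelude's code. It assumes the lexer reads at most six hex digits after \x (as the transcripts in this thread suggest) and that \n, \r, \t, \" and \\ are accepted as input.

```javascript
// Hypothetical variant of showStringImpl (not the prelude's implementation).
// ASSUMPTION: PureScript reads at most six hex digits after "\x", so a
// zero-padded six-digit escape never needs a "\&" delimiter.
// ASSUMPTION: \n, \r, \t, \" and \\ are accepted by the lexer.
function showStringHex(s) {
  return "\"" + s.replace(
    /[\0-\x1F\x7F"\\]/g, // eslint-disable-line no-control-regex
    function (c) {
      switch (c) {
        case "\"":
        case "\\":
          return "\\" + c;
        case "\n": return "\\n";
        case "\r": return "\\r";
        case "\t": return "\\t";
      }
      // Every other control character becomes \xHHHHHH, zero-padded to six digits.
      var hex = c.charCodeAt(0).toString(16);
      return "\\x" + "000000".slice(hex.length) + hex;
    }
  ) + "\"";
}

// NUL followed by the digit "1", the case from the original report.
var shownNulOne = showStringHex("\0" + "1");
```

Because the escape always has the maximum width, a following digit cannot be absorbed into it, so the output for the reported case has the same shape as the literal that was typed into the REPL.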

@MonoidMusician MonoidMusician transferred this issue from purescript/purescript Feb 5, 2021
@MonoidMusician MonoidMusician reopened this Feb 5, 2021
@MonoidMusician MonoidMusician changed the title REPL outputs illegal escape sequence \& in string literals Show String outputs illegal escape sequence \& in string literals Feb 5, 2021
@MonoidMusician

Okay, I transferred it to the prelude from the compiler repo.

@stengerh
Author

stengerh commented Feb 5, 2021

Thanks for the quick response!

I looked into this more closely in the meantime and realized the scope of the bug report was probably too narrow. The problem is not only \& but also decimal escape sequences such as \1. Char literals are affected as well.

I cannot tell which other code would be affected by this change to the prelude. I would however prefer that the REPL pretty-printed string and char literals using PureScript escape sequences.
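
To gauge how much output is affected, a small checker along the following lines could scan shown literals for escapes outside an allowed set. This is only a sketch: findUnsupportedEscapes and ASSUMED_ESCAPE are hypothetical names, and the allowed set (\t, \r, \n, \", \', \\ and \x with one to six hex digits) is an assumption about the lexer that should be verified against the compiler source.

```javascript
// ASSUMPTION: the escapes taken to be accepted by PureScript's lexer.
// Verify this set against the compiler before relying on it.
var ASSUMED_ESCAPE = /^\\(?:[trn"'\\]|x[0-9a-fA-F]{1,6})/;

// Hypothetical helper: returns each backslash escape in `shown`
// (a literal as printed by show) that is not in the assumed set.
function findUnsupportedEscapes(shown) {
  var bad = [];
  for (var i = 0; i < shown.length; i++) {
    if (shown[i] !== "\\") continue;
    var m = ASSUMED_ESCAPE.exec(shown.slice(i));
    if (m) {
      // Skip the whole recognized escape.
      i += m[0].length - 1;
    } else {
      // Record the backslash and the character after it, then skip both.
      bad.push(shown.slice(i, i + 2));
      i += 1;
    }
  }
  return bad;
}

// The output from the original report, as the REPL printed it.
var reported = findUnsupportedEscapes("\"\\0\\&1\"");
```

Run over the reported output, this flags both the decimal escape and the \& delimiter, while a literal using only six-digit \x escapes passes.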

@JordanMartinez JordanMartinez added type: breaking change A change that requires a major version bump. type: bug Something that should function correctly isn't. purs-0.15 A reminder to address this issue or merge this PR before we release PureScript v0.15.0 labels Dec 1, 2021
@JordanMartinez
Contributor

Correct me if I'm wrong, but is prettyPrintStringJS where the compiler prints a string?

@JordanMartinez
Contributor

Here's another place where a breaking change can be done now. How do we fix this?

3 participants