Encode ^ in pathname #607

Open
TimothyGu opened this issue May 21, 2021 · 10 comments · May be fixed by #846
Labels
addition/proposal (New features or enhancements), needs tests (Moving the issue forward requires someone to write tests), topic: parser

Comments

@TimothyGu
Member

u = new URL('http://abc.com/a^b');
console.log(u.pathname);

This gives "/a%5Eb" in Chrome and Firefox, as well as in Go and Node.js. Ruby's URI fails to parse a URL containing ^, but is fine with %5E. However, the spec and Safari don't escape ^ at the moment.

Shall we escape ^ in paths? This would move U+005E (^) from the userinfo percent-encode set into the path percent-encode set.
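To make the proposal concrete, here is a minimal sketch of what moving ^ into the path percent-encode set would change. The set contents are transcribed by hand from the URL Standard as it stood when this issue was filed, and `percentEncode`, `PATH_SET`, and `PROPOSED_PATH_SET` are hypothetical names for illustration, not anything the Standard or a browser defines.

```javascript
// Sketch only: sets transcribed by hand; verify against the URL Standard.
const QUERY_SET = ' "#<>';
const PATH_SET = QUERY_SET + '?`{}';
const PROPOSED_PATH_SET = PATH_SET + '^'; // the change proposed in this issue

// Hypothetical helper: percent-encode `input` as UTF-8, escaping C0 controls,
// every byte above 0x7E, and every character listed in `set`.
function percentEncode(input, set) {
  let out = '';
  for (const byte of new TextEncoder().encode(input)) {
    const ch = String.fromCharCode(byte);
    if (byte <= 0x1f || byte > 0x7e || set.includes(ch)) {
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    } else {
      out += ch;
    }
  }
  return out;
}

const current = percentEncode('a^b', PATH_SET);           // "a^b"
const proposed = percentEncode('a^b', PROPOSED_PATH_SET); // "a%5Eb"
console.log(current, proposed);
```

The helper deliberately avoids `new URL(...)`, since the result of that call is exactly what differs between implementations here.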

@annevk
Member

annevk commented May 21, 2021

It seems to depend on "is special" in Chrome and Firefox, which isn't ideal.

@TimothyGu
Member Author

I mean, Chrome and Firefox don't even escape spaces (or anything not in the C0 controls set) in non-special paths…

@mnot
Member

mnot commented May 22, 2021

RFC 3986 defines path segments to contain these characters:

pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

That doesn't include ^, so it needs to be percent-encoded.
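As a rough illustration of that grammar, the sketch below checks single characters against the literal (non-pct-encoded) alternatives of `pchar`. The names `UNESCAPED_PCHAR` and `isUnescapedPchar` are made up for this example, and the regular expression is a hand transcription of the ABNF above.

```javascript
// unreserved / sub-delims / ":" / "@", transcribed from the ABNF above.
// The pct-encoded alternative is left out because it spans three characters.
const UNESCAPED_PCHAR = /^[A-Za-z0-9\-._~!$&'()*+,;=:@]$/;

// Hypothetical helper: true if `ch` may appear literally in a path segment.
function isUnescapedPchar(ch) {
  return UNESCAPED_PCHAR.test(ch);
}

// Characters from this sample that RFC 3986 would require to be percent-encoded.
const needsEncoding = ['^', '~', ':', ' '].filter((ch) => !isUnescapedPchar(ch));
console.log(needsEncoding); // [ '^', ' ' ]
```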

@TimothyGu
Member Author

@achristensen07 Are you okay with aligning Safari on this?

@achristensen07
Collaborator

I think so. This is a case where Chrome and Firefox have the same behavior, so aligning with them would likely increase compatibility. It looks like the most compatible solution depends on "is special".
Like I said in issue 608, we really need complete tests with each ASCII code point in each part of a URL, with and without a special scheme.
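A minimal sketch of what such a test matrix could look like, restricted to the path for brevity. The scheme choices, the `host` hostname, and the `buildCases`/`run` helpers are assumptions made for illustration; a real suite would cover every URL component and record expectations per engine.

```javascript
// One special and one non-special scheme (illustrative choices).
const SCHEMES = ['https', 'test'];

// Build one input per ASCII code point per scheme, placing the code point
// in the middle of the path.
function buildCases() {
  const cases = [];
  for (const scheme of SCHEMES) {
    for (let cp = 0; cp <= 0x7f; cp++) {
      const ch = String.fromCharCode(cp);
      cases.push({ scheme, codePoint: cp, input: `${scheme}://host/a${ch}b` });
    }
  }
  return cases;
}

// Parse each input with the runtime's URL implementation and record the
// resulting pathname, or null if parsing fails.
function run(cases) {
  return cases.map((c) => {
    try {
      return { ...c, pathname: new URL(c.input).pathname };
    } catch {
      return { ...c, pathname: null };
    }
  });
}

const results = run(buildCases());
console.log(results.filter((r) => r.codePoint === 0x5e));
```

Running this in different engines and diffing `results` would show where each one encodes a given code point; no expected output is listed because that is the very thing that varies.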

@alwinb
Contributor

alwinb commented May 23, 2021

FYI. The discussion around issue #379 has a good overview of the percent encode sets.

@karwa
Contributor

karwa commented Nov 12, 2021

FWIW, I've been looking at interoperability between this standard and the URL type in Apple's Foundation framework (which I assume would also be of interest to WebKit). It is documented as conforming to RFC-1738.

The biggest difficulty in getting Foundation to parse the serialised output of this standard is the difference in percent-encode sets. This makes it harder for applications to transition to a web-compatible URL model, as converting to a Foundation URL means adding percent-encoding, so the serialised URL string changes. Anything which minimises that would be appreciated, and if it is actually a better description of how browsers behave, it seems like a no-brainer.

That said, if Safari currently does not encode it, and Chrome/Firefox conditionalise it, and neither of them "broke the web", it seems reasonable to conclude that few if any sites actually care whether it is encoded or not. In that case, the better choice IMO would be to unconditionally encode it and align with RFC-3986 as a bonus. Conditional percent-encode sets are awful.

@annevk added the topic: parser, needs tests, and addition/proposal labels Mar 6, 2023
@annevk
Member

annevk commented Dec 2, 2024

test://test/^: https://jsdom.github.io/whatwg-url/#url=dGVzdDovL3Rlc3QvXg==&base=YWJvdXQ6Ymxhbms=
test:^: https://jsdom.github.io/whatwg-url/#url=dGVzdDpe&base=YWJvdXQ6Ymxhbms=
https://test/^: https://jsdom.github.io/whatwg-url/#url=aHR0cHM6Ly90ZXN0L14=&base=YWJvdXQ6Ymxhbms=

It seems Gecko branches on "is special", but Chromium doesn't. Chromium branches on "opaque path", which seems more reasonable?
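To see where the two conditions diverge, the sketch below classifies the three inputs above by "is special" and by whether the path is opaque. `classify` is a hypothetical, heavily simplified stand-in for the parser: it only handles absolute inputs like these three and assumes a non-special URL has an opaque path exactly when the part after the scheme does not start with "/".

```javascript
// The special schemes listed in the URL Standard.
const SPECIAL_SCHEMES = new Set(['ftp', 'file', 'http', 'https', 'ws', 'wss']);

// Simplified classification; not a substitute for the real parsing algorithm.
function classify(input) {
  const match = /^([A-Za-z][A-Za-z0-9+.-]*):(.*)$/.exec(input);
  if (!match) throw new TypeError('not an absolute URL: ' + input);
  const isSpecial = SPECIAL_SCHEMES.has(match[1].toLowerCase());
  const opaquePath = !isSpecial && !match[2].startsWith('/');
  return { isSpecial, opaquePath };
}

const table = ['test://test/^', 'test:^', 'https://test/^'].map((input) => ({
  input,
  ...classify(input),
}));
console.log(table);
```

Only `test://test/^` is both non-special and non-opaque, so it is the input where branching on "is special" and branching on "opaque path" can produce different results.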

I would be willing to align WebKit and the standard with Chromium here (assuming that @achristensen07 is still on board). @hayatoito @valenting what do you think?

@valenting
Collaborator

I am strongly in favor of doing this. Aligning the standard with Chromium seems like a good path forward.

@hayatoito
Member

Sounds good to me.

I checked Chromium's implementation. As you noticed, the current implementation doesn't branch on "is special" when percent-encoding the path. This wasn't intentional; I didn't know until now that this behavior isn't compliant with the Standard.

annevk added a commit that referenced this issue Dec 3, 2024
@annevk annevk linked a pull request Dec 3, 2024 that will close this issue
5 tasks