A '@' character in the host part of file URLs #805

Open
hayatoito opened this issue Dec 5, 2023 · 5 comments
Labels
topic: file Aren't file: URLs the best? topic: parser

Comments

@hayatoito
Member

hayatoito commented Dec 5, 2023

(Reported in https://crbug.com/1502849)

It appears that Windows uses file URLs with '@' (U+0040) characters in their host parts, such as file://webdavserver.net@ssl/a.pdf.

However, according to my understanding, file://webdavserver.net@ssl/a.pdf is an invalid URL in the URL Standard because '@' is considered a forbidden host code point.

To ensure compatibility with Windows file URLs, should we consider allowing the '@' character in the host part of file URLs?

I'd appreciate hearing opinions of the URL Standard folks on this matter.

@annevk annevk added topic: parser topic: file Aren't file: URLs the best? labels Dec 5, 2023
@annevk
Member

annevk commented Dec 5, 2023

It seems reasonable to allow, but I wonder if it would be possible for Chromium to determine the complete set of changes needed for it to not have platform-divergent behavior. At least I suspect that making them all at once would allow for an easier rollout.

@karwa
Contributor

karwa commented Dec 5, 2023

A single @ in the authority section (username, password, hostname, port) generally delimits the credentials from the hostname.

Let's take any other URL scheme, e.g. HTTP: http://webdavserver.net@ssl/

  • "webdavserver.net" would be the username
  • "ssl" would be the hostname

Which is clearly not what the reporter wants to happen.
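As a rough illustration of that split, here is a minimal sketch using Python's `urllib.parse`. Note the assumption: this is a generic RFC-style parser, not an implementation of the URL Standard, so it only shows how a typical non-browser parser treats the `@`.

```python
from urllib.parse import urlsplit

# Generic RFC-style parsing: everything before the last "@" in the
# authority is treated as userinfo, everything after it as the host.
http_parts = urlsplit("http://webdavserver.net@ssl/")
print(http_parts.username)  # webdavserver.net
print(http_parts.hostname)  # ssl

# This parser applies the same split regardless of scheme, so the
# file URL from the report is read the same way.
file_parts = urlsplit("file://webdavserver.net@ssl/a.pdf")
print(file_parts.username)  # webdavserver.net
print(file_parts.hostname)  # ssl
```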

What's more, this has been the accepted interpretation for at least the last 30 years (going back to RFC 1738). I doubt many URL parsers are going to interpret file://webdavserver.net@ssl/ as having a hostname containing an @ sign, so the output of the URL parser must keep the @ escaped in order to properly encode its understanding of the URL components. file://webdavserver.net%40ssl/ is semantically correct.
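A minimal sketch of the escaped form, again assuming Python's `urllib.parse` as a stand-in for a generic parser (it is not a URL Standard implementation, and the variable names are only illustrative). With the `@` percent-encoded, the parser no longer sees any credentials, and the original server name can be recovered by decoding the host.

```python
from urllib.parse import quote, unquote, urlsplit

raw_host = "webdavserver.net@ssl"

# safe="" makes quote() escape every reserved character, including "@".
encoded_host = quote(raw_host, safe="")
url = f"file://{encoded_host}/a.pdf"
print(url)  # file://webdavserver.net%40ssl/a.pdf

parts = urlsplit(url)
print(parts.username)  # None: no "@" delimiter is left in the authority
print(parts.hostname)  # webdavserver.net%40ssl

# Decoding the host recovers the original server name.
decoded_host = unquote(parts.hostname)
print(decoded_host)  # webdavserver.net@ssl
```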

I think the actual problem is that hostnames in file URLs are not able to contain percent-encoding. I looked into this in depth a while back, and found that:

  • UNC server names can contain spaces, which obviously must be escaped. Chromium even has test cases for this (see linked issue below), so you'll probably hit this sooner or later.
  • Windows allows pre-canonicalised paths, which are expressed using UNC syntax with the hostname "?" (e.g. \\?\C\SomePath). These cannot be expressed using file URLs because the hostname would have to be %3F.
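To make the two cases above concrete, here is a hedged sketch of what a UNC-to-file-URL conversion could look like if percent-encoded hostnames were permitted. The helper `unc_to_file_url` is hypothetical and is not taken from Chromium, Windows, or the URL Standard; the server name "my server" is likewise an invented example.

```python
from urllib.parse import quote


def unc_to_file_url(unc_path: str) -> str:
    """Hypothetical helper: map a UNC path to a file URL, escaping the host."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    # Split "\\server\rest\of\path" into the server name and the remainder.
    server, _, rest = unc_path[2:].partition("\\")
    # Escape every reserved character in the server name (space, "?", "@").
    host = quote(server, safe="")
    # Escape each path segment separately and join them with "/".
    path = "/".join(quote(segment, safe="") for segment in rest.split("\\"))
    return f"file://{host}/{path}"


print(unc_to_file_url(r"\\my server\share\a.pdf"))  # file://my%20server/share/a.pdf
print(unc_to_file_url(r"\\?\C\SomePath"))           # file://%3F/C/SomePath
```

Both outputs rely on percent-encoding in the host, which is exactly what the comment says file URLs cannot currently express.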

See #599

@whatwg whatwg deleted a comment from Ur100 Feb 5, 2024
@catmanjan

catmanjan commented Sep 30, 2024

Please remove @ from the forbidden host code point list!

Given RFC 1738, it was probably a mistake that it was there in the first place.

Just on this:

> I doubt many URL parsers are going to interpret file://webdavserver.net@ssl/ as having a hostname containing an @ sign

Firefox, Safari, Windows Explorer, and Linux terminals all handle the URL fine. In fact, it's only Chromium-based browsers that have the issue, because they want to use the URL Standard as their only authority rather than use multiple path standards...

@valenting
Collaborator

>> I doubt many URL parsers are going to interpret file://webdavserver.net@ssl/ as having a hostname containing an @ sign

> Firefox, Safari, windows explorer, linux terminals all handle the URL fine

Safari also rejects this URL, and the only reason it works in Firefox is that we currently ignore everything in the hostname part of a file URL (tracked in 1507354 - URL parser discards host for file URLs).

Allowing @ only in the authority section of file URLs seems like a weird exception to make.
I'm in favor of keeping hostname parsing as close to the HTTP URL parser as possible, and here the @ sign should probably be percent-encoded.

@catmanjan

@valenting yes, I think the problem is calling them file URLs. They are URL-like, but ultimately the OP (file://webdavserver.net@ssl/a.pdf) is a UNC file path, and currently it's just a coincidence that Chromium works for most of them...


5 participants