Cannot capture default port in a named group #234
Comments
The spec itself seems to be correct; at least, the port is passed through without any default-port handling in the match algorithm: https://urlpattern.spec.whatwg.org/#url-pattern-match. I'd say it's more of an implementation problem. I filed a Chromium bug, crbug.com/363027641. I confirmed this happens in Chromium and also in the polyfill library. The default port is removed when processing |
Sorry, I found a problematic part in the spec. In step 12.2.2.1 of match, |
Removing default ports from URLs is something that the URL API always does: new URL('http://example.com:80').port
// '' |
So I think this is just a consequence of segment wildcards not matching empty strings. On the other hand, this pattern does work and captures an empty string for the port:
new URLPattern('http://example.com::port(.*)')
These are also options:
new URLPattern('http://example.com::port?')
new URLPattern('http://example.com::port*')
Though if you truly want everything, you can just use |
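The point about segment wildcards can be illustrated without URLPattern itself. The following is only a rough sketch: it assumes that a bare `:port` group behaves like a "one or more characters" regular expression, while `(.*)`, `?` and `*` tolerate an empty port. The three regular expressions are stand-ins written for this illustration, not the expressions an implementation actually generates.

```javascript
// The URL API drops an explicit default port, so the port component is ''.
const port = new URL('http://example.com:80').port;

// Illustrative stand-ins (assumptions, not the generated regexps):
// a bare named group needs at least one character...
const bareGroup = /^(?<port>.+?)$/;
// ...whereas a custom (.*) group accepts zero characters.
const customGroup = /^(?<port>.*)$/;
// An optional group (:port? or :port*) may be skipped entirely.
const optionalGroup = /^(?<port>.+?)?$/;

const bareMatch = bareGroup.exec(port);         // no match: nothing to capture
const customMatch = customGroup.exec(port);     // match with an empty capture
const optionalMatch = optionalGroup.exec(port); // match with the group unset

console.log(bareMatch, customMatch.groups.port, optionalMatch.groups.port);
```

If this approximation holds, the failure comes from the empty port string meeting a group that requires at least one character, which is why the three workaround patterns succeed.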
Yes, there are workarounds. It’s probably better to not canonicalize away the default port because a “null port” in the URL spec does not semantically mean “no port” for special schemes. |
Another option would be to forbid name tokens in the protocol and port. This would make parsing considerably simpler without losing much in the way of expressiveness. |
Which special schemes, specifically? For all but " new URL('http://example.com:80/').port
// ''
new URL('http://example.com:80/').toString()
// 'http://example.com/'
What do we gain by doing so (assuming, for the sake of argument, that it would be web-compatible to)? For example, this seems potentially useful and consistent with how other components work:
new URLPattern({protocol: 'web\\+:webprotocol'}).exec('web+foo://foo').protocol.groups
// {webprotocol: 'foo'} |
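To make the default-port normalization discussed in this comment concrete, here is a small sketch using only the URL API. The sample URLs and the `web+foo` scheme are chosen for illustration; the default ports are the ones the URL Standard lists for special schemes.

```javascript
// Explicit default ports for special schemes.
const defaultPortSamples = [
  'http://example.com:80/',
  'https://example.com:443/',
  'ws://example.com:80/',
  'wss://example.com:443/',
  'ftp://example.com:21/',
];

// Each of these parses to an empty port string.
const normalizedPorts = defaultPortSamples.map((href) => new URL(href).port);

// A non-default port, or a scheme with no default port, is kept as written.
const keptPort = new URL('http://example.com:8080/').port;
const nonSpecialPort = new URL('web+foo://example.com:80/').port;

console.log(normalizedPorts, keptPort, nonSpecialPort);
```

After parsing, an explicit default port and an omitted port are indistinguishable, so anything that matches against the parsed port only ever sees the empty string in both cases.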
By a "null port" I do mean an unspecified port. As per the spec:
Removing the port makes sense when you're canonicalizing a concrete URL and you prefer brevity. It does not make sense for matching. I'd expect the meaning of
I originally thought your example would not work since The main advantage is that you can infer that the initial colon is the protocol delimiter. e.g. The reason I don't think it makes sense to put capture groups in protocol or port is that these have atomic meaning. That is, the scheme is |
Re. port canonicalization: there are fundamentally a few options here, right:
On the web I would expect this distinction to come up less, since I suspect almost all URLs are http/https/ws/wss URLs anyway, especially the case of Re. segment identifier in port and protocol, I agree that it would be unusual to do so, but there is some value in keeping the different components as consistent as possible, if only for comprehension. I'm a little bit more amenable to the idea of making it a difference only in the shortest syntax for expressing it in the string shorthand (e.g., maybe you have to write
I'm not sure I'll have the time soon to make such an exploration, so in practice someone else would have to dig into it if you want to make a change in that direction. I may be able to help gather field data from Chrome if we get something that seems likely to be web-compatible and want to confirm. |
The more I look at this, the more I think that the issue is that we can't base any logic on That is, the step compute protocol matches a special scheme flag is conceptually impossible because that depends on the input to
This would mean that
This causes the behavior to change based on whether the pattern has a special protocol.
I think I'm fully behind this one. It seems like it has the fewest corner cases.
This seems impossibly strange. For instance, Part of this issue is also, as mentioned, that "just a consequence of segment wildcards not matching empty strings". Was this intentional? If so, why? |
What is the issue with the URL Pattern Standard?
Using a :<name> token to capture the port of a URL fails if the URL uses the default port for that protocol (either explicitly or implicitly). Tested in Chrome.
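For readers who only need the port number rather than a pattern capture, one possible workaround is to recover it from the parsed URL. This is a minimal sketch, not something proposed in the thread: `DEFAULT_PORTS` and `effectivePort` are hypothetical names, and the table assumes the default ports the URL Standard assigns to special schemes.

```javascript
// Default ports for special schemes ("file" has none).
const DEFAULT_PORTS = {
  'http:': '80',
  'https:': '443',
  'ws:': '80',
  'wss:': '443',
  'ftp:': '21',
};

// Hypothetical helper: return the port a URL effectively uses, falling back
// to the scheme's default when the URL API reports an empty port string.
function effectivePort(input) {
  const url = new URL(input);
  return url.port || DEFAULT_PORTS[url.protocol] || '';
}

console.log(effectivePort('http://example.com:80'));
console.log(effectivePort('https://example.com:8443/'));
```

This treats explicit and implicit default ports the same way, which matches how the URL API reports them.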