Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match pattern fragment clarification #36069

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

rebloor
Copy link
Contributor

@rebloor rebloor commented Sep 27, 2024

Description

Clarifies how fragments are handled in response to #14018

Related issues and pull requests

Fixes #14018

@rebloor rebloor added the Content:WebExt WebExtensions docs label Sep 27, 2024
@rebloor rebloor requested a review from Rob--W September 27, 2024 04:25
@rebloor rebloor self-assigned this Sep 27, 2024
@rebloor rebloor requested a review from a team as a code owner September 27, 2024 04:25
@github-actions github-actions bot added the size/s [PR only] 6-50 LoC changed label Sep 27, 2024
Copy link
Contributor

github-actions bot commented Sep 27, 2024

Preview URLs

External URLs (1)

URL: /en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns
Title: Match patterns

(comment last updated: 2024-11-22 02:47:25)

@rebloor rebloor requested a review from dotproto October 10, 2024 22:41
@rebloor
Copy link
Contributor Author

rebloor commented Nov 15, 2024

@Rob--W @dotproto reminder that I'm looking for a review on this one

@@ -342,6 +342,10 @@ The special value `<all_urls>` matches all URLs under any of the supported schem
<td><code>https://mozilla.org</code></td>
<td>No path.</td>
</tr>
<tr>
<td><code>https://mozilla.org/#section1</code></td>
<td>Contains an unsupported match character.</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a syntactically invalid match pattern, just one that never matches.

The other examples here are examples of match patterns that are rejected as a value for "match pattern" types, but this one is accepted.

On the other hand, the example below with a host name is also syntactically valid but listed as invalid. In Chrome it is valid but in Firefox it is ignored.

@@ -342,6 +342,10 @@ The special value `<all_urls>` matches all URLs under any of the supported schem
<td><code>https://mozilla.org</code></td>
<td>No path.</td>
</tr>
<tr>
<td><code>https://mozilla.org/#section1</code></td>
<td>Contains an unsupported match character.</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<td>Contains an unsupported match character.</td>
<td>Contains a reference fragment, which are never included in the URLs to match against.</td>

Can probably be phrased better than this. It is not that the character is unsupported technically, but that the URL that the pattern is matched against never includes the reference fragment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rob--W made some changes to address, let me know what you think

@rebloor rebloor requested a review from Rob--W November 17, 2024 22:51
<tr>
<td><code>https://mozilla.*.org/</code></td>
<td>"*" in host must be at the start.</td>
<td>Unmatched</td>
<td>Is syntactically valid and matches in Chrome but is ignored in Firefox.</td>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern is not valid in Chrome or Firefox. I just double checked this in a test extension and Chrome logged the following warning at instillation time:

Permission 'https://mozilla.*.org/' is unknown or URL pattern is malformed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @Rob--W does this mean I misunderstood your comment? ("On the other hand, the example below with a host name is also syntactically valid but listed as invalid. In Chrome it is valid but in Firefox it is ignored.")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment at #36069 (comment) was poorly phrased. I was referring to the one with the port:

https://mozilla.org:80/
Host must not include a port number.

This is syntactically valid, but the port is ignored in Firefox due to https://bugzilla.mozilla.org/show_bug.cgi?id=1362809 and https://bugzilla.mozilla.org/show_bug.cgi?id=1468162

I'll look into fixing that... This is a long standing issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dotproto also commented on the port number example here, pointing out that there was a mismatch between the protocol and port number. So, to clarify, the statement "host must not include a port number." is incorrect? Therefore, we should:

  • add a port to the details of how the host can be specified with a note about the issues in Firefox
  • add an example to the examples with a note that the port number must match the protocol
  • remove the port example from the invalid list

Comment on lines 349 to 351
<td><code>https://mozilla.org/#section1</code></td>
<td>Unmatched</td>
<td>Contains a reference fragment: the URL that the pattern is matched against has any reference fragment removed before matching.</td>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The origin of this URL makes a bit hard understand how browsers handle this case. If your extension uses this host permission, it won't appear to work because navigating to that URL will trigger a 301 redirect to https://www.mozilla.org/.

If you replace https://mozilla.org/#section1 with https://www.mozilla.org/#section1, the behavior becomes more clear. Both Chrome and Firefox will auto-correct the requested host to https://www.mozilla.org/* and neither logs an error/warning about the fragment. The same behavior occurs observed you omit the fragment in the host permission declaration (e.g. https://www.mozilla.org/).

<td>Empty path: this should be "<code>*://*/*</code>".</td>
</tr>
<tr>
<td><code>file://*</code></td>
<td>Invalid</td>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file://* is invalid when declared in host_permissions in Firefox but works in Chrome. Chrome auto-corrects it to file:///*.

file:///* works in both Firefox and Chrome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added note re-Chrome behavior

<td>"*" in scheme must be the only character.</td>
</tr>
<tr>
<td><code>https://mozilla.org:80/</code></td>
<td>Invalid</td>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, we're in weird territory now. This pattern (https://mozilla.org:80/) doesn't work for two reasons: the aforementioned origin redirect and a mismatch between the declared port and port required by the scheme. HTTPS runs on port 443, HTTP runs on port 80.

Nether Chrome or Firefox will log an install time error/warning if an extension requests https://mozilla.org:80/. This permission is "valid" (but meaningless). You can see this in action if you use await chrome.permissions.getAll() to check the extension's current permission; the response object will look something like this: { permissions: [], origins: ["https://mozilla.org:80/*"]}.

If you modify the host permission to request a port that matches the scheme (e.g. https://www.mozilla.org:443/ or http://example.com:80/), both Chrome and Firefox will grant the host permission and the extension will be able to access data for that host.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dotproto so should we use another port, such as 8080. I assume from your comment that this was just a poorly chosen example given that 80 has a binding, or is.the statement "host must not include a port number." Incorrect

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Content:WebExt WebExtensions docs size/s [PR only] 6-50 LoC changed
Projects
None yet
3 participants