-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Script Enforcement Mechanism to use flags #533
Conversation
spec/index.bs
Outdated
|
||
Modify the [=The text insertion mode=] algorithm as follows: | ||
1. If <var ignore=''>parserChange</var> is false, set [=this=]'s [=HTMLScriptElement/is trusted=] to false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parserChange
is a placeholder for what we end up speccing in whatwg/dom#1288
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which issues, besides the one mentioned in #533 (comment), is this PR intended to fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When "parserChange" is false
and changed by trusted sink
is true
, couldn't still malicious code have been injected? E.g. if a trusted sink called only someScript.innerText = someScript.innerText
that'd make the untrusted code trusted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. if a trusted sink called only
someScript.innerText = someScript.innerText
that'd make the untrusted code trusted.
That would only work if a default policy had sanctioned that value. Else the assignment would fail before the "changed by trusted sink" Boolean is set
0c7d33e
to
f22111f
Compare
<dt id="scriptEndTag">An end tag whose tag name is "script"</dt> | ||
<dd> | ||
<p>...</p> | ||
1. If [=this=]'s [=HTMLScriptElement/changed by trusted sink=] is true, set [=this=]'s [=HTMLScriptElement/is trusted=] to true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag is used to say hey this is an API change but it's a trusted one. We unset the flag once used.
element</span> <var>script</var>. This might cause some script to execute, which might cause | ||
<span data-x="dom-document-write">new characters to be inserted into the tokenizer</span>, and | ||
might cause the tokenizer to output more tokens, resulting in a [=reentrant invocation of the parser=].</p> | ||
Issue: Need to double check how [part of script element's spec](https://html.spec.whatwg.org/#prepare-the-script-element:~:text=When%20a%20script%20element%20el%20that%20is%20not%20parser%2Dinserted%20experiences) fits into this. These steps need to happen before prepare the script is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to change the html spec when upstreaming to run the prepare the script (under relevant conditions) at the end of the children changed steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly it would rely on having a bit more information in the children changed steps algorithm if we want to inline it. Because it needs to know what type of change it is (insertion specifically in this case).
I suspect this is why some Chrome and WebKit's childrenChanged functions include more than the dom spec's algorithm. (And is why Firefox implements it in a way that also gives them this more granular informaion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@domfarolino should probably look at this.
Also would that create issues with re-entrant invocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would seem that whatwg/html#10188 already changes that part of the HTML spec to be defined in terms of the children changed steps so I think we'd just need to put our new steps first and then run the post-insertion steps and it'll fix the concerns I had here.
Also add SVGScriptElement to spec
f22111f
to
ac68dd7
Compare
element</span> <var>script</var>. This might cause some script to execute, which might cause | ||
<span data-x="dom-document-write">new characters to be inserted into the tokenizer</span>, and | ||
might cause the tokenizer to output more tokens, resulting in a [=reentrant invocation of the parser=].</p> | ||
Issue: Need to double check how [part of script element's spec](https://html.spec.whatwg.org/#prepare-the-script-element:~:text=When%20a%20script%20element%20el%20that%20is%20not%20parser%2Dinserted%20experiences) fits into this. These steps need to happen before prepare the script is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@domfarolino should probably look at this.
Also would that create issues with re-entrant invocations?
If children changed steps are never called by the parser then I think I can just remove the new boolean and this PR is basically ready to go. I was under the impression they would be called by the parser but I don't think they are. @domfarolino do you know off the top of your head if that's true? I know you've used them for the post connection steps work youve been doing |
https://bugs.webkit.org/show_bug.cgi?id=276426 Reviewed by NOBODY (OOPS!). Scripts now have two boolean flags rather than storing a duplicate of their contents. Spec PR: w3c/trusted-types#533 * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-innerText-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-innerText.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-appendChild-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-appendChild.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-innerHTML-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-innerHTML.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-nodeValue-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-midparse-nodeValue.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-src-expected.txt: Renamed from LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-internal-slot-expected.txt. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-src.html: Renamed from LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-internal-slot.html. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-text-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-text.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-textContent-expected.txt: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/HTMLScriptElement-textContent.html: Added. * LayoutTests/imported/w3c/web-platform-tests/trusted-types/block-Node-multiple-arguments-expected.txt: * LayoutTests/imported/w3c/web-platform-tests/trusted-types/block-text-node-insertion-into-script-element-expected.txt: * Source/WebCore/dom/CharacterData.cpp: (WebCore::canUseSetDataOptimization): * Source/WebCore/dom/ScriptElement.cpp: (WebCore::ScriptElement::childrenChanged): (WebCore::ScriptElement::prepareScript): (WebCore::ScriptElement::finishParsingChildren): Deleted. (WebCore::ScriptElement::setTrustedScriptText): Deleted. * Source/WebCore/dom/ScriptElement.h: (WebCore::ScriptElement::setChangedByTrustedSink): * Source/WebCore/html/HTMLScriptElement.cpp: (WebCore::HTMLScriptElement::setTextContent): (WebCore::HTMLScriptElement::setInnerText): (WebCore::HTMLScriptElement::finishParsingChildren): Deleted. * Source/WebCore/html/HTMLScriptElement.h: * Source/WebCore/svg/SVGScriptElement.cpp: (WebCore::SVGScriptElement::finishParsingChildren): Deleted. * Source/WebCore/svg/SVGScriptElement.h:
I've updated this PR to remove the parserChange boolean because it appears it's not needed. Spec wise the children changed steps seem to never actually get called by the parser. So parserChange is always false (aka it's always an API). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lukewarlow I guess you are still waiting for a review? I checked the PR and it looks good to me, but I really think someone more familiar with the spec/code should review.
SHA: a71f29c Reason: push, by koto Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this has already landed but I took a very belated look and have at least one question
element</span> <var>script</var>. This might cause some script to execute, which might cause | ||
<span data-x="dom-document-write">new characters to be inserted into the tokenizer</span>, and | ||
might cause the tokenizer to output more tokens, resulting in a [=reentrant invocation of the parser=].</p> | ||
Note: This relies on the children changed steps never being called by the parser. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we assume this? I think these steps are called indirectly by the parser, in the usual insertion path, no? And it seems that browsers expect this too: https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/html/html_script_element.cc;l=82;drc=d81683e439e0b80cac98e017d94449d59f48fa0d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the slow response I've been OoO I didn't neccesarily expect this PR to be merged until someone with more knowledge of the status quo had reviewed, apologies for not making that clearer.
So from all I could work out, the specification children changed steps are only called from APIs and NOT the parser.
If that's not true then this will need reverting.
The ChildrenChanged method implemented in Chromium and WebKit are called by the parser but that's because they aren't a 100% match for the spec.
Fwiw I did a quick prototype in Chromium and it seemed to work as expected (it was a few months back at this point so can't fully remember). I've also got a draft PR to update WebKit to use this new model too WebKit/WebKit#30649.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the slow response I've been OoO
Hah, not a problem, my delay since Anne pinged be about this was far worse.
I didn't neccesarily expect this PR to be merged until someone with more knowledge of the status quo had reviewed, apologies for not making that clearer.
Do you think we should revert it?
So from all I could work out, the specification children changed steps are only called from APIs and NOT the parser.
If that's not true then this will need reverting.
Hmm, spec-wise, I think the children changed steps are indeed called from the parser. At the very least, the HTML's adoption agency algorithm creates an element and then immediately appends it to the DOM, which goes through the https://dom.spec.whatwg.org/#concept-node-insert path, which itself calls the children changed steps.
I am really not familiar with the HTML parser, so I can't find any other paths from parser -> DOM insertion hook (which calls children changed steps), but I'm pretty sure they exist. For example, when an iframe initializes its document, it does so only as a result of its HTML insertion steps, and the only thing that calls these steps is the DOM insertion algorithm—the same thing that calls the children changed steps. So the DOM insertion algorithm must be invoked by the parser, or else I think parser-inserted iframes would never get initialized.
Based on the above @koto could we revert this? Cc @fred-wang |
If we do revert this then we still have spec work required to actually finish this which I personally don't have capacity to continue but I'm happy to help others who wish to do that. |
Yeah, we can revert this. |
Could we do to the spec, the same thing that you did with Blink and WebKit to make it work with an implementation that calls the children changed steps from the parser? I guess it seems like the work required here might actually be easier than what was originally proposed, if you already got implementations (that match the spec) working properly? |
So yeah this comes back to us needing new data passed down the children changed steps to say whether it's a parser or API change. Idk how much work that is to do, and I don't know if it's compatible with a Mozilla implementation (as WebKit and chrome are quite similar in this respect) |
Also add SVGScriptElement to spec
Fixes #483, #517
Preview | Diff