Defer non-critical JS for slightly faster page load under CPU+Network simultaneous bottleneck #4798
Conversation
Interesting! I mean, under good conditions you won't notice a difference. But there doesn't seem to be any risk in adding it, and ChatGPT's risk assessment sounds promising:
Ready to merge? Or are you still testing?
@falkoschindler I believe this is ready to merge. My brain was just stuck in the thinking mode of #4756 for no reason 😅, thinking that I need to init Socket.IO ASAP. But before that PR gets merged, Socket.IO isn't initiated until the main script runs anyway. This PR has a very, very tiny risk of breaking existing code, if the user does something like

```python
ui.add_body_html('''
<script>
console.log(Vue); // breaks after this PR
</script>
''')
```

But I don't think we should be held back by that very niche use case.
Before I forget: If you really need to use Vue in `ui.add_body_html`, wrap your code in a `DOMContentLoaded` listener:

```js
document.addEventListener('DOMContentLoaded', function() {
    // Your code here
});
```
…ad (CPU-bound) (#4801)

### Motivation

First, I discovered the `nomodule` attribute of `script` completely by accident in #4761. I pitched that it could speed up page load, but never got around to testing it: I was going on a trip back then and promptly forgot about it. 😅 Until #4798, where we discovered the importance of applying a CPU throttle to make speed issues show themselves. That reminded me of the topic, so I implemented and benchmarked it, and found a significant speed increase.

### Implementation

Simply set `nomodule` on the ES Module Shims script.

### Progress

- [x] I chose a meaningful title that completes the sentence: "If applied, this PR will..."
- [x] The implementation is complete.
- [x] Pytests have been added (or are not necessary).
- [x] Documentation has been added (or is not necessary).

### Results showcase

Low-tier mobile CPU throttling (8.5x on my machine), without disabling cache and without network throttling (we want to show **only** JS execution time).

Before: (screenshot)

After: (screenshot)

- Finish: 7.71s -> 6.97s (difference: -0.74s, -9.60%)
- DOMContentLoaded: 6.48s -> 5.85s (difference: -0.63s, -9.72%)
- Load: 7.57s -> 6.82s (difference: -0.75s, -9.91%)

### Potential impacts

We can no longer use the functionality that ES Module Shims offers us:

- `importShim.getImportMap()`
- `importShim.addImportMap(importMap)`
- Shim imports: `importShim('/path/to/module.js').then(x => console.log(x));`
- Dynamically injecting an import map:

```js
document.head.appendChild(Object.assign(document.createElement('script'), {
  type: 'importmap',
  innerHTML: JSON.stringify({ imports: { x: './y.js' } }),
}));
```

I doubt they are useful, since if your browser doesn't support ES modules, ES Module Shims does **not** magically make NiceGUI run with better compatibility. Notably, `ui.markdown` with a Mermaid diagram always breaks.
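For context, the `nomodule` change amounts to marking the shim script so that module-capable browsers skip downloading and executing it entirely; only legacy browsers without ES module support run it. A minimal sketch, with illustrative file paths (not NiceGUI's actual asset URLs):

```html
<!-- Browsers that understand ES modules ignore nomodule scripts,
     so the shim costs nothing on modern browsers. -->
<script nomodule src="/static/es-module-shims.js"></script>

<!-- The real entry point still loads as a module. -->
<script type="module" src="/static/main.js"></script>
```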
Motivation
Still in the mood of speed improvement.
So I was reading https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/script again...
Basically, loading-behaviour-wise, there are three valid options: normal (parser-blocking), `defer`, and `async`.
(Don't think about combining `defer` and `async`. MDN mentions that if `async` is set, "If the attribute is specified with the defer attribute, the element will act as if only the async attribute is specified", meaning async + defer => async.)
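The three loading modes above can be sketched as follows (file names are illustrative):

```html
<!-- Normal: the parser stops, fetches the script, and executes it
     before continuing. -->
<script src="a.js"></script>

<!-- defer: fetched in parallel with parsing, executed in document
     order after parsing finishes, right before DOMContentLoaded. -->
<script defer src="b.js"></script>

<!-- async: fetched in parallel, executed as soon as it arrives,
     in no guaranteed order. -->
<script async src="c.js"></script>
```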
I then took a look at the script situation in NiceGUI:
The main script is a `<script type="module">`, which is "defer by default". Then I thought: can I have the rest of NiceGUI's scripts be `defer` as well, since they only need to run before the main script, and not any earlier? We don't need to block the parser while those scripts load.

And yes, indeed, there is a performance uplift.
Implementation
Add `defer` to the remaining scripts. Something more aggressive might make it even faster, or it might break things; I'll see when I have time.

Progress
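A minimal sketch of the change, assuming illustrative file names rather than NiceGUI's actual asset paths:

```html
<head>
  <!-- Before: a plain script blocks HTML parsing while it downloads
       and runs, e.g. <script src="/static/socket.io.min.js"></script> -->

  <!-- After: deferred scripts download in parallel with parsing and
       execute in document order once parsing finishes - still before
       the module script (deferred by default) needs them. -->
  <script defer src="/static/socket.io.min.js"></script>
  <script defer src="/static/vue.global.js"></script>
  <script type="module" src="/static/main.js"></script>
</head>
```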
-> No need for pytests or documentation. As long as existing tests don't break, I'd be happy.
Results showcase
Using the Fast 4G profile with 4x CPU slowdown.
Before:
After:
- Finish: 6.47s -> 5.99s
- DOMContentLoaded: 4.69s -> 4.38s
- Load: 5.69s -> 5.18s
About 0.5s saved.
Reference: 4G network latency is about 165ms. I am not saying that we are saving 3x the network latency, but it puts things into perspective.
Worthy of note:
May want to test with more CPU slowdown and the Slow 4G profile later; 3G is too much.