Add ability to prime URL metrics #1850

Open
wants to merge 41 commits into
base: trunk
from

Conversation

b1ink0
Contributor

@b1ink0 b1ink0 commented Feb 5, 2025

Summary

Fixes #1311

Relevant technical choices

This PR introduces a new mechanism for priming URL metrics across the site. It uses a newly added submenu page in the Tools menu and automatically primes URL metrics when a post is saved in the block editor.

TODOS:

  • Add tests

Demos

Settings page:

UI.mp4

Saving post in Block Editor:

Block.Editor.mp4

b1ink0 added 23 commits January 23, 2025 23:58

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 3.41880% with 339 lines in your changes missing coverage. Please review.

Project coverage is 67.95%. Comparing base (26442b9) to head (17e24c4).
Report is 2 commits behind head on trunk.

Files with missing lines | Patch % | Lines
plugins/optimization-detective/helper.php | 0.00% | 206 Missing ⚠️
...age/class-od-rest-url-metrics-priming-endpoint.php | 0.00% | 74 Missing ⚠️
plugins/optimization-detective/settings.php | 0.00% | 39 Missing ⚠️
plugins/optimization-detective/detection.php | 0.00% | 16 Missing ⚠️
plugins/optimization-detective/load.php | 0.00% | 3 Missing ⚠️
...orage/class-od-rest-url-metrics-store-endpoint.php | 92.30% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            trunk    #1850      +/-   ##
==========================================
- Coverage   71.21%   67.95%   -3.26%     
==========================================
  Files          86       88       +2     
  Lines        7000     7346     +346     
==========================================
+ Hits         4985     4992       +7     
- Misses       2015     2354     +339     
Flag | Coverage Δ
multisite | 67.95% <3.41%> (-3.26%) ⬇️
single | 38.79% <0.00%> (-1.92%) ⬇️


'prime_url_metrics_verification_token',
odPrimeUrlMetricsVerificationToken
);
}
Contributor Author

Authentication for REST API

  • WP Nonce Limitation: The default WordPress (WP) nonce does not function correctly when generated for the parent page and then passed to an iframe for REST API requests.

  • Custom Token Authentication: To address this, I have added a custom token-based authentication mechanism. This generates a time-limited token used to authenticate REST API requests made via the iframe.

In PR #1835, WP nonces are introduced for REST API requests made by logged-in users. This may allow us to eliminate the custom token authentication if URL metrics are collected exclusively from logged-in users.

};

// Load the iframe
iframe.src = task.url;
Contributor Author

Currently, if the iframe shares the same origin as the parent, it can access the parent's session. This ensures that the user session in the page loaded within the iframe (which is a frontend page) matches the logged-in user of the WordPress dashboard.

But if the WordPress admin dashboard and the frontend have different origins, WP nonces won’t work for REST API authentication, because a cross-origin iframe cannot access the parent's session and so will not recognize the logged-in user. For context, I am talking about the REST nonce introduced in #1835.
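The distinction above can be sketched as a small check. This is only an illustration that follows the reasoning in this comment, which treats origin equality as the deciding factor; the function names and the strategy labels are hypothetical and are not part of the PR.

```javascript
// Hypothetical helper: true when both URLs have the same scheme, host, and port.
function sharesOrigin(adminUrl, frontendUrl) {
	return new URL(adminUrl).origin === new URL(frontendUrl).origin;
}

// Hypothetical helper: pick an authentication approach for the iframe request.
// 'rest-nonce' and 'custom-token' are labels invented for this sketch.
function chooseAuthStrategy(adminUrl, frontendUrl) {
	return sharesOrigin(adminUrl, frontendUrl) ? 'rest-nonce' : 'custom-token';
}
```

For example, `chooseAuthStrategy( 'https://admin.example.com/wp-admin/', 'https://example.com/' )` yields `'custom-token'`, since the hosts differ.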

iframe.style.transform = 'scale(0.05)';
iframe.style.transformOrigin = '0 0';
iframe.style.pointerEvents = 'none';
iframe.style.opacity = '0.000001';
Contributor Author

@b1ink0 b1ink0 Feb 6, 2025

detect.js requires the iframe to be visible in the viewport in order to resolve the onLCP promise. Traditional hiding methods, such as moving the iframe off-screen with translate or setting visibility: hidden or opacity: 0, cause the promise to hang.

// Obtain at least one LCP candidate. More may be reported before the page finishes loading.
await new Promise( ( resolve ) => {
	onLCP(
		( /** @type LCPMetric */ metric ) => {
			lcpMetricCandidates.push( metric );
			resolve();
		},
		{

As a workaround, I am using the following CSS to keep the iframe minimally visible and functional:

  position: fixed;
  top: 0px;
  left: 0px;
  transform: scale(0.05);
  transform-origin: 0px 0px;
  pointer-events: none;
  opacity: 1e-6;
  z-index: -9999;
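A minimal sketch of applying these declarations from script in one step. The constant and function names are hypothetical; the values are the ones listed above, written as camelCase style properties.

```javascript
// The declarations from the list above as camelCase style properties.
const MINIMALLY_VISIBLE_STYLES = {
	position: 'fixed',
	top: '0px',
	left: '0px',
	transform: 'scale(0.05)',
	transformOrigin: '0px 0px',
	pointerEvents: 'none',
	opacity: '0.000001',
	zIndex: '-9999',
};

// Hypothetical helper: copies the styles onto any element-like object
// that exposes a `style` property, then returns it.
function applyMinimallyVisibleStyles( iframe ) {
	Object.assign( iframe.style, MINIMALLY_VISIBLE_STYLES );
	return iframe;
}
```

In a browser this would be called as `applyMinimallyVisibleStyles( document.createElement( 'iframe' ) )` before the iframe is appended to the page.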

'OD_PRIME_URL_METRICS_REQUEST_SUCCESS',
'*'
);
resolve();
Contributor Author

Parent and IFRAME communication is handled via postMessage. A message is sent to the parent, and the promise resolves immediately.

If the promise isn't resolved immediately, navigating to a new URL causes the code following the promise to never execute. This is because changing the iframe.src does not trigger events like pagehide, pageswap, or visibilitychange.

// Wait for the page to be hidden.
await new Promise( ( resolve ) => {
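For illustration, the parent side of this exchange might look roughly like the sketch below. It assumes the message payload is the bare string shown in the diff context above; the helper names are hypothetical. Because the iframe posts with a target origin of '*', the sketch checks `event.source` so that messages from other windows are ignored.

```javascript
const SUCCESS_MESSAGE = 'OD_PRIME_URL_METRICS_REQUEST_SUCCESS';

// True only for the success message coming from the expected iframe window.
function isPrimingSuccess( event, iframeWindow ) {
	return event.source === iframeWindow && event.data === SUCCESS_MESSAGE;
}

// Registers a listener on `target` (normally `window`) and calls `onDone`
// the first time the iframe reports success, then unregisters itself.
function onPrimingSuccess( target, iframeWindow, onDone ) {
	const listener = ( event ) => {
		if ( ! isPrimingSuccess( event, iframeWindow ) ) {
			return;
		}
		target.removeEventListener( 'message', listener );
		onDone();
	};
	target.addEventListener( 'message', listener );
}
```

The parent could then call `onPrimingSuccess( window, iframe.contentWindow, loadNextTask )`, where `loadNextTask` stands for whatever advances to the next URL or breakpoint.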

Member

I wonder: do we even need to post a message here? As soon as the iframe is destroyed, won't the URL Metric be sent automatically?

Contributor Author

@b1ink0 b1ink0 Feb 7, 2025

The problem is that we need to signal to the parent that it can move on to the next URL or breakpoint, and we have to use postMessage for this because the load event can't be used. See #1850 (comment) for a detailed explanation.

Would it make sense to send the postMessage after the navigator.sendBeacon call, then?

Member

Yes, I think it makes sense to send the message after the beacon is sent, definitely.
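A rough sketch of that ordering, assuming the message is the bare string used elsewhere in this PR. The function name and the failure message are hypothetical; `nav` and `parentWindow` stand in for `navigator` and `window.parent`.

```javascript
// Hypothetical helper: queue the URL Metric first, then signal the parent.
function sendMetricThenNotify( nav, parentWindow, endpointUrl, body ) {
	// sendBeacon() returns true when the browser accepted the request for delivery.
	const queued = nav.sendBeacon( endpointUrl, body );

	// The parent is only told to move on after the beacon call has been made.
	// The failure message name is invented for this sketch.
	parentWindow.postMessage(
		queued
			? 'OD_PRIME_URL_METRICS_REQUEST_SUCCESS'
			: 'OD_PRIME_URL_METRICS_REQUEST_FAILURE',
		'*'
	);
	return queued;
}
```

Note that a true return value from sendBeacon only means the request was queued, not that the server has stored the URL Metric.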

@westonruter
Member

Just to not leave you hanging: I'm probably not going to be able to review this in depth and provide feedback for a couple more weeks. I'm preparing for travel to WordCamp Asia this weekend and I'll be giving a talk about Optimization Detective, so I need to focus on preparing for that.

@b1ink0 b1ink0 added this to the optimization-detective 1.0.0 milestone Feb 14, 2025
@b1ink0 b1ink0 marked this pull request as ready for review February 14, 2025 11:43
@b1ink0 b1ink0 requested a review from felixarntz as a code owner February 14, 2025 11:43

github-actions bot commented Feb 14, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: b1ink0 <[email protected]>
Co-authored-by: westonruter <[email protected]>
Co-authored-by: felixarntz <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@felixarntz
Member

@b1ink0 Thanks for the PR, this looks really promising, and I think it's going to be a super powerful feature. I'll take a closer look next week.

One early thought: I wonder whether we should make this feature conditionally available based on a check on how much content the site has. For most sites, this feature is probably great, but we may want to use caution for sites that have e.g. 1000s of posts. We could make the condition filterable so that sites can still opt in, but I think a safeguard like this would be useful.

Regarding the milestone, I think we should target this for a future release of Optimization Detective, such as 1.1.0. We're already in 1.0.0-beta, so it wouldn't be right to add an entirely new feature to 1.0.0 now.

@b1ink0 b1ink0 requested a review from westonruter March 4, 2025 14:27
@westonruter
Member

@b1ink0 Sorry for the delay with reviewing. It's probably going to be another week before I'll get a chance. In the meantime, I have another area you can explore here related to this. Right now there are two methods for priming URL Metrics:

  1. Priming URLs by going to the Optimization Detective > Tools admin screen.
  2. Publishing a post in the block editor.

There is another use case which I mentioned in #1311 (comment):

This would be cool to implement as a CLI tool as well, using Puppeteer.

In order to facilitate this, I think there should be a new REST API endpoint in the optimization-detective namespace (but beware code being touched in #1865) which exposes information such as:

  1. The viewport group min/max widths, i.e. od_get_breakpoint_max_widths().
  2. The sample size for each URL Metric group.
  3. A list of the URLs captured in od_url_metrics posts which have URL Metrics with timestamp older than od_get_url_metric_freshness_ttl().
  4. A list of URLs pulled from wp-sitemap.xml which don't have any URL Metrics collected for them.

This endpoint need not be public. It could require authentication.

With this information in hand, it should be possible to construct a Puppeteer script which loops over all URLs needing URL Metrics at each of the viewport sizes, keeping URL Metrics fresh across an entire site without requiring any anonymous writes from the frontend. This could run as part of a daily system cron job on the server (not wp-cron). As an alternative to the REST API endpoint, this data could instead be exposed via a new WP-CLI command for piping into a Puppeteer script. As with the iframe on the admin screen for priming URL Metrics, the Puppeteer browser session would probably need to be logged in as a user who has the od_store_url_metric_now capability (though we may want to revisit how the capabilities work here).
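To make the loop concrete, here is a minimal sketch of how such a script might expand the endpoint data into a work queue. It assumes that each viewport group starts one pixel above the previous breakpoint and that the last group is unbounded; the function names and the task shape are hypothetical.

```javascript
// Turn breakpoint max widths (e.g. the values from od_get_breakpoint_max_widths())
// into viewport groups. Assumption: group N starts at the previous max width + 1,
// and a final unbounded group follows the largest breakpoint.
function toViewportGroups( breakpointMaxWidths ) {
	const sorted = [ ...breakpointMaxWidths ].sort( ( a, b ) => a - b );
	const groups = [];
	let minWidth = 0;
	for ( const maxWidth of sorted ) {
		groups.push( { minWidth, maxWidth } );
		minWidth = maxWidth + 1;
	}
	groups.push( { minWidth, maxWidth: Infinity } );
	return groups;
}

// One task per URL and viewport group. `width` is a representative viewport
// width inside the group: its max width, or its min width when unbounded.
function buildPrimingTasks( urls, breakpointMaxWidths ) {
	const groups = toViewportGroups( breakpointMaxWidths );
	return urls.flatMap( ( url ) =>
		groups.map( ( group ) => ( {
			url,
			width: Number.isFinite( group.maxWidth ) ? group.maxWidth : group.minWidth,
		} ) )
	);
}
```

A Puppeteer script could then iterate over the returned tasks, setting the viewport width and loading the URL for each one.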

With this in place, we could eliminate the short-circuiting currently being done when the REST API is not available, introduced in #1762. If the user is an administrator, the REST API must be available. So od_is_rest_api_unavailable() can be changed to always return false if is_user_logged_in() && current_user_can( 'od_store_url_metric_now' ). There's probably a better mechanism for checking whether the REST API is available, like dispatching an OPTIONS request to /url-metrics:store via rest_get_server()->dispatch(), which will cause the rest_authentication_errors filter to apply. (Granted, the REST API can still be disabled via the web server, in which case we still need to do our loopback requests, but we can do so as an authenticated user and not anonymously.)

Conversely, if URL Metrics have been collected for the current response, the user is not authenticated, and the REST API is not available, then we can go ahead and optimize the page while preventing any of the detection logic from being served.


The ability to prime URL Metrics as you're exploring here will likely be critical to making Optimization Detective feasible for deployment on high-traffic and high-profile sites.

@westonruter
Member

Oh, and the Puppeteer browser will need to have the same ability as the iframe'ed page to determine whether the page is "done" loading (i.e. URL Metric has been constructed and has been forcibly submitted). When this signal is received, the Puppeteer browser can move on to load the next URL or the same URL in the next viewport.

@westonruter
Member

Something else that comes to mind: the URL Priming process could be sped up by skipping URLs for viewports which already have a fresh URL Metric in the group.

This could be something else which is facilitated by #1919. In particular, the integration could hook into the od_start_template_optimization action to insert a SCRIPT in the page which sends a message to the parent window (or Puppeteer) to indicate the status of all the URL Metric groups. So let's say the process by default accesses all URLs with the mobile viewport. When that mobile viewport loads, we can learn at that point that the other viewport groups are populated with fresh URL Metrics and then the process can skip to the next URL.
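As a sketch of the skipping logic: suppose the injected SCRIPT reports one status object per URL Metric group. The shape used here (`minWidth`, `maxWidth` with `null` for the unbounded group, and a `complete` flag meaning the group is fully populated with fresh URL Metrics) is an assumption for illustration, as is the function name.

```javascript
// Hypothetical helper: given the reported group statuses for a URL, return
// the viewport widths that still need a visit. Groups already populated
// with fresh URL Metrics are skipped.
function widthsStillNeeded( groupStatuses ) {
	return groupStatuses
		.filter( ( group ) => ! group.complete )
		// Use the group's max width, or its min width for the unbounded group.
		.map( ( group ) => group.maxWidth ?? group.minWidth );
}
```

If the returned array is empty after the first (mobile) visit, the process can move straight on to the next URL.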

@b1ink0
Contributor Author

b1ink0 commented Mar 14, 2025

@westonruter
I was experimenting with creating a WP CLI command to act as an orchestrator. Its purpose would be to check the system environment for node, which is required for Puppeteer, install Puppeteer if necessary, and then execute the priming logic. However, this approach relies on shell_exec, which might be disabled for security reasons.

As an alternative, I considered including a Puppeteer script project within the Optimization Detective plugin’s directory, along with its package.json. However, this would require the admin to manually navigate to the directory, run npm install, and then execute npm run start or node index.js.

Do you have any other ideas for this?

@westonruter
Member

@b1ink0

As an alternative, I considered including a Puppeteer script project within the Optimization Detective plugin’s directory, along with its package.json. However, this would require the admin to manually navigate to the directory, run npm install, and then execute npm run start or node index.js.

Yes, I think this is the way to go. We can include a subdirectory in the plugin which contains the Puppeteer script. Instead of a site owner doing npm install from that directory on the production site, it could also make sense to publish an npm package so that there's no node_modules directory in the document root (which could result in security vulnerabilities, depending on what assets are in there). But for development purposes, being able to install the dependencies from the plugin itself is useful. So if it were available as an npm package, then it could be installed globally and then call WP-CLI to obtain the required data to do the gathering of the URL Metrics:

npx @wordpress/performance-od-url-metric-collector

So yeah, I think the user would invoke node which would then do system calls to WP-CLI rather than calling WP-CLI which would do system calls to node. This is because WP-CLI is a dependency here for the Puppeteer script, not the other way around, I think.
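A sketch of that direction of control, with node calling out to WP-CLI. The subcommand `od priming-data`, its `--format=json` flag, and the `urls` field are hypothetical, since no such command exists yet; `--path` is WP-CLI's global parameter for the WordPress directory. The command runner is injected so that the parsing logic can be exercised without WP-CLI installed.

```javascript
// `runCommand( file, args )` must return the command's stdout as a string.
// In a real script it could be implemented with child_process.execFileSync, e.g.
//   ( file, args ) => require( 'node:child_process' ).execFileSync( file, args, { encoding: 'utf8' } )
function fetchPrimingData( runCommand, wordpressPath ) {
	// Hypothetical subcommand and flag; only `--path` is an existing WP-CLI parameter.
	const stdout = runCommand( 'wp', [
		'od',
		'priming-data',
		'--format=json',
		`--path=${ wordpressPath }`,
	] );

	const data = JSON.parse( stdout );
	if ( ! Array.isArray( data.urls ) ) {
		throw new Error( 'Unexpected WP-CLI output: missing "urls" array.' );
	}
	return data;
}
```

Passing arguments as an array avoids shell interpolation of the path.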

@b1ink0
Contributor Author

b1ink0 commented Mar 19, 2025

@westonruter
I have created a POC for the Puppeteer CLI in b1ink0@dd274ef (note: the POC branch was created from the latest commit of #1850). I would like to know if I am heading in the right direction.

Puppeteer_CLI_DEMO.mp4

If this approach is correct, should I add these changes to current PR #1850?

Another important thing to note is that I had to bypass the following check in detect.js, as it was taking a long time to resolve and, in some cases, never resolved. I’m still investigating the cause when using Puppeteer CLI.

// Wait yet further until idle.
if ( typeof requestIdleCallback === 'function' ) {
	await new Promise( ( resolve ) => {
		requestIdleCallback( resolve );
	} );
}

@westonruter
Member

@b1ink0 yes, that is looking good!

In regards to the idle callback, have you looked at using waitUntil: 'networkidle0' as opposed to waitUntil: 'load'?

Nevertheless, that may not be right either. Seems like this is a known issue with Puppeteer and requestIdleCallback: puppeteer/puppeteer#10350 (comment). But maybe you'll find more info on this.

Aside: We may want to have Optimization Detective fire a custom event when it has gotten to the point where it is waiting for pagehide. This could be used by a script injected by Puppeteer as an alternative to the message that is passed to the parent window as in the case of the iframe.
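A minimal sketch of such an event, assuming a hypothetical event name and helper names; nothing here exists in Optimization Detective today. `target` would be `document` or `window` in the page.

```javascript
// Hypothetical event name, invented for this sketch.
const READY_EVENT = 'od-awaiting-pagehide';

// What detect.js could call once it starts waiting for pagehide.
function announceReady( target ) {
	target.dispatchEvent( new Event( READY_EVENT ) );
}

// What an injected script could use to react the first time the event fires.
function notifyWhenReady( target, onReady ) {
	target.addEventListener( READY_EVENT, () => onReady(), { once: true } );
}
```

With Puppeteer, the listening half could be injected with `page.evaluateOnNewDocument()` so that it is registered before detect.js runs.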

@b1ink0
Contributor Author

b1ink0 commented Mar 20, 2025

@westonruter

In regards to the idle callback, have you looked at using waitUntil: 'networkidle0' as opposed to waitUntil: 'load'?

I tested both approaches: load with requestIdleCallback promise disabled and networkidle0 with requestIdleCallback promise enabled. The networkidle0 approach slows down URL metrics collection compared to the load event, as it waits for the network to become idle. However, if we prioritize accuracy, networkidle0 is the safer choice since the delay likely results in more accurate URL metrics.

  • load:
    Screenshot 2025-03-20 at 6 15 29 PM

  • networkidle0:
    Screenshot 2025-03-20 at 6 15 09 PM

Regarding other methods I explored, we could replace requestIdleCallback with a custom implementation using setTimeout, but that wouldn't be a true solution.
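For reference, the setTimeout-based fallback mentioned above could look roughly like this sketch. It does not address why requestIdleCallback stalls under Puppeteer; it only bounds the wait. The function name and the default timeout are assumptions; `scope` would be `window` in the page.

```javascript
// Calls `callback` once, either when the scope reports idle or when the
// timeout elapses, whichever happens first.
function whenIdleOrTimeout( scope, callback, timeoutMs = 1000 ) {
	let done = false;
	const finish = () => {
		// Guard so the callback cannot run twice if both paths fire.
		if ( done ) {
			return;
		}
		done = true;
		callback();
	};

	if ( typeof scope.requestIdleCallback === 'function' ) {
		scope.requestIdleCallback( finish );
	}
	// Fallback in case the idle callback never fires.
	scope.setTimeout( finish, timeoutMs );
}
```

In detect.js this could be wrapped as `await new Promise( ( resolve ) => whenIdleOrTimeout( window, resolve ) )`.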


Additionally, should the Puppeteer CLI code be added to the current PR, or should it be part of a separate PR after this one is merged into trunk?

@westonruter
Member

Additionally, should the Puppeteer CLI code be added to the current PR, or should it be part of a separate PR after this one is merged into trunk?

@b1ink0 If it makes it easier for you to prototype this, I think it's fine to add it to this PR. We can always remove parts of the PR prior to review, and then immediately revert the removals in subsequent PRs.

Labels
[Plugin] Optimization Detective: Issues for the Optimization Detective plugin
[Type] Enhancement: A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ability to prime URL metrics across a site upon installation of Optimization Detective
3 participants