Conversation

@LiamSarsfield
Contributor

@LiamSarsfield LiamSarsfield commented Dec 12, 2025

Addresses HOG-438: Create Jetpack Performance Tooling for LCP

Proposed changes:

  • Add performance testing infrastructure under tools/performance/ to measure wp-admin dashboard LCP (Largest
    Contentful Paint) with Jetpack connected
  • Uses Docker for an isolated WordPress environment with a simulated WordPress.com connection (fake tokens + mocked API with 200ms latency)
  • Includes CPU throttling calibration for consistent results across different machines
  • Posts metrics to CodeVitals for tracking over time

Other information:

  • Have you written new tests for your changes, if applicable?
  • Have you checked the E2E test CI results, and verified that your changes do not break them?
  • Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with
    a script to run)?

Jetpack product discussion

pc9hqz-3Rb-p2

Does this pull request change what data or activity we track or use?

No

Testing instructions:

Prerequisites: Docker running, Node 18+

cd tools/performance
pnpm install
pnpm exec playwright install chromium
pnpm calibrate
pnpm test -- --skip-codevitals

The test suite will automatically clone the pre-built plugin from jetpack-production on first run.

Expected output: LCP measurement for wp-admin dashboard with Jetpack connected (simulated)

Introduces automated LCP (Largest Contentful Paint) measurement for the
wp-admin dashboard with simulated Jetpack WordPress.com connection.

Key components:
- Docker environment with WordPress + simulated Jetpack connection
- CPU throttling calibration for consistent results across CI agents
- Playwright-based LCP measurement (see the sketch below)
- CodeVitals integration for metric tracking

Metric posted: wp-admin-dashboard-connection-sim-largestContentfulPaint
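
For reference, the Playwright-based measurement boils down to something like this minimal sketch (the function and variable names here are illustrative, not the PR's actual code):

import { chromium } from 'playwright';

// Measure LCP by observing largest-contentful-paint entries in the page.
async function measureLcp( url ) {
	const browser = await chromium.launch( { headless: true } );
	const page = await browser.newPage();

	// Register the observer before any document scripts run so early LCP
	// candidates are not missed; `buffered: true` replays entries already seen.
	await page.addInitScript( () => {
		window.__lcp = 0;
		new PerformanceObserver( list => {
			for ( const entry of list.getEntries() ) {
				// The last entry reported is the final LCP candidate.
				window.__lcp = entry.startTime;
			}
		} ).observe( { type: 'largest-contentful-paint', buffered: true } );
	} );

	await page.goto( url, { waitUntil: 'networkidle' } );
	const lcp = await page.evaluate( () => window.__lcp );
	await browser.close();
	return lcp;
}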
@github-actions
Contributor

github-actions bot commented Dec 12, 2025

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add a "[Type]" label (Bug, Enhancement, Janitorial, Task).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!

@github-actions github-actions bot added the [Status] Needs Author Reply label Dec 12, 2025
- Remove outdated 4-scenario comment (only 1 scenario exists)
- Remove dead baseline comparison code that never executed
- Trim JSDoc to single-line descriptions across all scripts
- Update eslint config to allow minimal JSDoc
@LiamSarsfield LiamSarsfield changed the title Add Jetpack performance testing CI infrastructure [Do not merge] Add Jetpack performance testing CI infrastructure Dec 12, 2025
@LiamSarsfield LiamSarsfield added the DO NOT MERGE label Dec 12, 2025
Changed the Docker startup sequence to prevent race conditions where
WordPress containers interfere with WP-CLI's database operations:

1. Start only the db container
2. Wait for MySQL to be ready
3. Run WP-CLI setup (WordPress container NOT running)
4. Start WordPress containers

This ensures WP-CLI has exclusive database access during setup,
eliminating "table doesn't exist" errors caused by concurrent access.

Changes:
- run-performance-tests.js: Sequential container startup (sketched below)
- setup-wordpress.sh: Simplified (removed HTTP wait logic)
- docker-compose.yml: Removed wpcli depends_on wordpress
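
In code, that sequencing looks roughly like the following (a sketch assuming execa for shelling out; the setup script path inside the wpcli container and the waitForMysql helper are assumptions):

import { execa } from 'execa';

// Hypothetical helper: poll mysqladmin inside the db container until it answers.
async function waitForMysql() {
	for ( let attempt = 0; attempt < 30; attempt++ ) {
		try {
			await execa( 'docker', [ 'compose', 'exec', '-T', 'db', 'mysqladmin', 'ping', '--silent' ] );
			return;
		} catch {
			await new Promise( resolve => setTimeout( resolve, 2000 ) );
		}
	}
	throw new Error( 'MySQL did not become ready in time' );
}

await execa( 'docker', [ 'compose', 'up', '-d', 'db' ] ); // 1. db only
await waitForMysql(); // 2. wait for MySQL
await execa( 'docker', [ 'compose', 'run', '--rm', 'wpcli', '/setup-wordpress.sh' ] ); // 3. setup with WordPress down
await execa( 'docker', [ 'compose', 'up', '-d' ] ); // 4. start the remaining containers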
The import plugin is not available in this context, and the base config
handles import resolution. Also ensure jsdoc rules are disabled for
these utility scripts.
Dependencies (playwright, dotenv) are in tools/performance/package.json,
not the monorepo root, so the import resolver can't find them.
- Log calibration file path and existence
- Show throttle rate, target score, calibration time, and samples
- Confirm throttling is applied via CDP on first iteration
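
For context, applying the calibrated throttle through the Chrome DevTools Protocol in Playwright looks roughly like this (a sketch; the calibration file name and shape are assumptions):

import { readFile } from 'node:fs/promises';
import { chromium } from 'playwright';

// Load the throttle rate produced by `pnpm calibrate` (file shape assumed).
const { throttleRate } = JSON.parse( await readFile( './calibration.json', 'utf8' ) );

const browser = await chromium.launch( { headless: true } );
const page = await browser.newPage();

// CPU throttling is a CDP command, so it needs a raw CDP session (Chromium only).
const session = await page.context().newCDPSession( page );
await session.send( 'Emulation.setCPUThrottlingRate', { rate: throttleRate } );
console.log( `CPU throttling applied via CDP: ${ throttleRate }x` );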
@github-actions
Contributor

github-actions bot commented Dec 14, 2025

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (Jetpack or WordPress.com Site Helper), and enable the add/perf-testing-ci-mvp branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack add/perf-testing-ci-mvp
bin/jetpack-downloader test jetpack-mu-wpcom-plugin add/perf-testing-ci-mvp

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

@jp-launch-control

jp-launch-control bot commented Dec 14, 2025

Code Coverage Summary

This PR did not change code coverage!

That could be good or bad, depending on the situation. Everything covered before, and still is? Great! Nothing was covered before? Not so great. 🤷

Full summary · PHP report · JS report

…wordpress-jetpack-connected service, ensuring it starts only after the service is initiated, which helps prevent race conditions during container startup.
- Updated README.md to include new environment variables: CODEVITALS_URL, GIT_BRANCH, WP_ADMIN_USER, and WP_ADMIN_PASS.
- Modified post-to-codevitals.js to streamline metric extraction by removing unused baseMetrics.
- Improved run-performance-tests.js to prioritize the GIT_COMMIT environment variable for git hash retrieval, ensuring accurate tracking during CI runs (sketched below).
- Introduced an empty baseMetrics object in the payload to clarify that baseline normalization is not utilized in the performance metrics submission.
- Simplified browser launch to always use headless mode for consistency in performance calibration.
- Removed conditional logic for headful mode, ensuring a streamlined execution in both local and CI environments.
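
Put together, the hash resolution and payload described above amount to something like this (a sketch; the payload field names are assumptions, see post-to-codevitals.js for the real shape):

import { execa } from 'execa';

// Prefer the CI-provided commit SHA; fall back to the local checkout.
async function getGitHash() {
	if ( process.env.GIT_COMMIT ) {
		return process.env.GIT_COMMIT;
	}
	const { stdout } = await execa( 'git', [ 'rev-parse', 'HEAD' ] );
	return stdout.trim();
}

const payload = {
	branch: process.env.GIT_BRANCH ?? 'trunk',
	hash: await getGitHash(),
	metrics: { largestContentfulPaint: lcp }, // `lcp` comes from the measurement step
	baseMetrics: {}, // intentionally empty: baseline normalization is not used
};
await fetch( process.env.CODEVITALS_URL, {
	method: 'POST',
	headers: { 'Content-Type': 'application/json' },
	body: JSON.stringify( payload ),
} );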
@LiamSarsfield LiamSarsfield added [Type] Infrastructure and removed [Status] Needs Author Reply labels Dec 15, 2025
@LiamSarsfield
Contributor Author

Hey @anomiex 👋 would you mind taking a look at this when you get a chance?

This is the performance testing infrastructure I built during Hack Week (more details here: pc9hqz-3Rb-p2). It measures wp-admin dashboard LCP for Jetpack and posts the results to CodeVitals.
It's a big PR, sorry about that. Most of it breaks down into:

  • scripts/ - Node.js orchestration and Playwright measurement code
  • docker/ - Docker Compose setup for WordPress + Jetpack environment
  • docker/mu-plugins/simulate-wpcom-connection.php - mu-plugin that fakes Jetpack connection and mocks WP.com API
    responses

The mu-plugin is probably the most relevant bit to review from a Jetpack perspective, as it intercepts pre_http_request to return mock responses for various WP.com endpoints.

Happy to walk through any of it if that's easier.

@LiamSarsfield LiamSarsfield marked this pull request as ready for review December 15, 2025 13:46
@LiamSarsfield LiamSarsfield removed the DO NOT MERGE label Dec 15, 2025
@LiamSarsfield LiamSarsfield changed the title [Do not merge] Add Jetpack performance testing CI infrastructure Add Jetpack performance testing CI infrastructure Dec 15, 2025
@LiamSarsfield LiamSarsfield added [Status] Needs Review This PR is ready for review. and removed [Status] In Progress labels Dec 15, 2025
Contributor

@anomiex anomiex left a comment


Seems ok from a monorepo perspective. I didn't look too closely at the code.

If you're wanting a review of the faked-connection stuff in the mu-plugin, @Automattic/jetpack-vulcan would be the team to ask.


## CI Usage

The test suite is designed to run in TeamCity. See build configuration for setup.
Contributor


I wonder if it'd make more sense to run it in Actions rather than TeamCity, on each commit to trunk instead of backfilling weekly.

If you're running it in TeamCity, are you looking at the monorepo, or at https://github.com/Automattic/jetpack-production which has the already-built plugin?

Contributor


One of the exploration intents is to see about reusing work. With our GHE instance, TeamCity is the runner we have to use, so we're seeing the pros/cons of TC here.

Contributor Author

@LiamSarsfield LiamSarsfield Dec 18, 2025


I wonder if it'd make more sense to run it in Actions rather than TeamCity, on each commit to trunk instead of backfilling weekly.

GitHub Actions runners can have unpredictable performance variability due to shared infrastructure, and that noise can drown out the small regressions (5-10%) we want to detect. The TeamCity build runs on a dedicated machine with no other agents interfering.

If you're running it in TeamCity, are you looking at the monorepo, or at Automattic/jetpack-production which has the already-built plugin?

The monorepo is used instead of the already-built plugin mainly so we can leverage commit-level tracking, giving us the ability to bisect when issues appear (where needed). It may not be needed for now, but in CodeVitals you can click a point on the graph and see the commit details via the commit SHA.

[Screenshot: CodeVitals commit details, 2025-12-18]

Contributor


Note that every commit that changes anything in Jetpack gets mirrored to Automattic/jetpack-production, absent rare cases where something goes wrong with the mirroring. Each commit there also includes a footer like Upstream-Ref: Automattic/jetpack@d5b54134bb471f3a54d04b12e128e6e0e2d77bde to make it easy to find the corresponding monorepo commit.

It's up to you, but it may save having to build each monorepo commit before you can test it.

It'll also either save you testing commits that don't affect Jetpack-the-plugin at all or save you having to maintain code to try to identify which ones do. I see your P2 post mentions

that touched PHP files in Jetpack or packages

That would unnecessarily test changes to packages like packages/jetpack-mu-wpcom, unless you have a list of packages to ignore. It could also miss JS-only changes that might still affect the LCP timing.

Contributor Author


Note that every commit that changes anything in Jetpack gets mirrored to Automattic/jetpack-production, absent rare cases where something goes wrong with the mirroring. Each commit there also includes a footer like Upstream-Ref: d5b5413 to make it easy to find the corresponding monorepo commit.

Actually yes! I should have checked this beforehand; I didn't realise Automattic/jetpack-production has bisectable commits like this. It also addresses the issue you mentioned about testing commits that don't affect Jetpack-the-plugin. Thanks, I'll update the TeamCity build to use that instead.

Contributor Author


Updated to use jetpack-production, the implementation now:

  1. Clones jetpack-production instead of building from the monorepo
  2. Parses Upstream-Ref from each mirror commit to track the original monorepo SHA in CodeVitals (see the sketch below)
  3. Removes file filtering from the TC scheduler

The monorepo VCS root is still used for the tools/performance/ scripts, but the plugin itself comes directly from the pre-built mirror.
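
The footer parsing in step 2 can be as small as this (a sketch; the plugin/ checkout path comes from the commit notes above):

import { execa } from 'execa';

// Read the full commit message of the checked-out mirror commit.
const { stdout: message } = await execa( 'git', [ 'log', '-1', '--format=%B' ], {
	cwd: './plugin', // the cloned jetpack-production checkout
} );

// Mirror commits carry a footer like:
//   Upstream-Ref: Automattic/jetpack@d5b54134bb471f3a54d04b12e128e6e0e2d77bde
const match = message.match( /^Upstream-Ref: Automattic\/jetpack@([0-9a-f]{40})$/m );
const monorepoSha = match ? match[ 1 ] : null;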

Comment on lines 20 to 22
// Dependencies are in tools/performance/package.json, not monorepo root
// so the import resolver can't find them. Disable this rule.
'import/no-unresolved': 'off',
Contributor


That doesn't matter; eslint will look in tools/performance/node_modules for files under here.

Also, when I remove this and run eslint, it doesn't seem to complain about anything.

Contributor Author


Nice catch! Removed the import/no-unresolved rule and its comment. Also simplified the config to match tools/cli/eslint.config.mjs.

kraftbj
kraftbj previously approved these changes Dec 16, 2025
Contributor

@kraftbj kraftbj left a comment


I'm not sure where it came from; I was given a notice that macOS's rsync doesn't handle symlinks well, stating I should brew install rsync. It let me proceed, but it failed.

Installing rsync via brew install rsync and trying again worked. I don't think it's a blocker but wanted to note it.

@anomiex
Contributor

anomiex commented Dec 16, 2025

I'm not sure where it came from; I was given a notice that macOS's rsync doesn't handle symlinks well, stating I should brew install rsync. It let me proceed, but it failed.

Probably from

// Apple ships a special fork of openrsync, which has various quirks.
// In particular, it doesn't handle symlink recursion well, which breaks
// in macOS 15.4 and copies child node_modules dirs in 15.5.
//
// See also:
// * p1742486518169009-slack-CDLH4C1UZ
// * https://github.com/apple-oss-distributions/rsync/tree/main/openrsync
if ( os.platform() === 'darwin' ) {
	const { stdout: rsyncVersion } = await execa( 'rsync', [ '--version' ] );
	isOpenrsync = rsyncVersion.indexOf( 'openrsync' ) >= 0;
	if ( isOpenrsync ) {
		const { stdout: macOS_version } = await execa( 'sw_vers', [ '--productVersion' ] );
		if ( macOS_version === '15.4' ) {
			console.error(
				chalk.red(
					'The implementation of rsync in macOS 15.4 is unable to properly sync symlinks.'
				)
			);
			console.error( chalk.red( 'Please install standard rsync (e.g. `brew install rsync`).' ) );
			process.exit( 1 );
		} else {
			console.error(
				chalk.yellow(
					'The implementation of rsync in macOS is unable to properly sync symlinks.'
				)
			);
			console.error(
				chalk.yellow( 'Installing standard rsync (e.g. `brew install rsync`) is recommended.' )
			);
			if ( argv.nonInteractive ) {
				process.exit( 1 );
			}
			console.error();
			await enquirer
				.prompt( {
					type: 'confirm',
					name: 'proceedWithOpenrsync',
					message:
						'Continuing will not break anything, but will copy many unneeded files.\nProceed to sync files?',
					initial: false,
				} )
				.then( answer => {
					if ( ! answer.proceedWithOpenrsync ) {
						process.exit( 0 );
					}
				} );
		}
	}
}

@kraftbj
Contributor

kraftbj commented Dec 16, 2025

Probably from

Yup! Thanks! I scanned the PR but didn't check the existing commands. I wouldn't consider it a blocker for the PR, but it is required to use the fully-leaded version of rsync, not the Mac variant.

- Consolidated rules into a single object for improved readability and maintainability.
- Removed unnecessary comments and streamlined the configuration structure.
@LiamSarsfield
Contributor Author

LiamSarsfield commented Dec 18, 2025

Yup! Thanks! I scanned the PR but didn't check the existing commands. I wouldn't consider it a blocker for the PR, but it is required to use the fully-leaded version of rsync, not the Mac variant.

Ah, nice catch. I must have already had rsync installed, hence why I missed it. I've updated the testing instructions accordingly.

Actually, this is no longer relevant: I've switched to using jetpack-production (the pre-built mirror) instead of building from the monorepo, so rsync is no longer needed at all. Updated the testing instructions to reflect the simpler prerequisites (just Docker and Node 18+).

- Added 'plugin/' directory to .gitignore to exclude cloned plugin files.
- Updated README.md with detailed setup instructions and clarified usage of the pre-built Jetpack plugin.
- Modified docker-compose.yml to mount the plugin from the new directory structure.
- Refactored run-performance-tests.js to clone the Jetpack plugin from the production mirror instead of using rsync, ensuring a more straightforward setup process.
@LiamSarsfield
Contributor Author

Hey @Automattic/jetpack-vulcan 👋

I've been working on performance testing infrastructure for Jetpack (measuring wp-admin LCP with Jetpack connected).
Part of this involves an mu-plugin that simulates a WordPress.com connection without actually connecting to WP.com.
Could someone from the team review the mock implementation at tools/performance/docker/mu-plugins/simulate-wpcom-connection.php?

It currently:

  • Sets up fake blog/user tokens via Jetpack_Options
  • Intercepts HTTP requests to *.wordpress.com and *.wp.com
  • Returns mock responses for common endpoints (token health, site info, stats, sync, etc.)
  • Adds configurable latency (default 200ms) to simulate real-world conditions

Specifically looking for feedback on:

  1. Are the fake connection tokens set up correctly?
  2. Are we missing any critical API endpoints that get called on wp-admin load?
  3. Any concerns with this approach for performance testing?

@fgiannar
Contributor

Hi @LiamSarsfield ,

Thanks for the ping and working on this!

Are the fake connection tokens set up correctly?

Yes 👍

Are we missing any critical API endpoints that get called on wp-admin load?

This would depend on the Jetpack plugin module configuration. I can confirm the endpoints that are called by Connection and Sync packages, but we can't know how each consumer of the Jetpack Connection behaves.
I noticed that you only enable modules that don't require a JP Connection, but I wonder if we should enable all of them to get a realistic worst-case scenario.
One way to check the remote calls to WPCOM on every page load would be to sandbox your environment and monitor debug.log, as we log all sandboxed requests to WPCOM there.

That said, it might make sense to add some logging that would answer this question within the performance testing logic itself? This could be helpful in case more endpoints are added in the future that we don't handle within the testing infrastructure.

Any concerns with this approach for performance testing?

This is not a blocker, but you could consider refactoring get_mock_response to avoid the if/else logic and use e.g. a factory for setting up the fake endpoints.
One additional idea I had is around the latency. At the moment, my understanding is that we assume a generic latency for every endpoint. If we extracted each fake endpoint definition into its own class, we could potentially set the latency per endpoint using actual real-world data we have on WPCOM.
As an example, the jetpack-sync-actions endpoint has a ~470ms median response time and ~3s for the 95th percentile.
We could take it even one step further and define a median and p95 latency per endpoint and repeat our tests for both cases to simulate a normal scenario and a site under stress.

- Introduced a mechanism to log unhandled endpoints, aiding in the identification of missing mock responses.
- Added a flag to track if a specific endpoint handler was matched, improving response handling clarity.
- Updated comments for legacy endpoints to indicate early returns without logging.
@LiamSarsfield
Contributor Author

@fgiannar Thanks for the thorough review!

I noticed that you only enable modules that don't require a JP Connection, but I wonder if we should enable all of them to get a realistic worst-case scenario.

Good point. I initially took the conservative approach to avoid errors from modules expecting real WP.com responses, but enabling all modules would give us a more realistic measurement. I'll look into expanding the module list and adding mock responses for any additional endpoints they require.

It might make sense to add some logging that would answer this question within the performance testing logic itself?

Great idea! I've added logging for any intercepted requests that hit the fallback response. That way it'll catch unhandled endpoints as they appear instead of us discovering them later.

You could consider refactoring get_mock_response to avoid the if/else logic and use e.g. a factory for setting up the fake endpoints.

Agreed that the current implementation is a bit unwieldy. I'll refactor to a registry/factory pattern, which will also make it easier to add per-endpoint configuration.

We could take it even one step further and define a median and p95 latency per endpoint

Love this idea. Using actual latency data would make the measurements much more representative. I created a follow-up issue to:

  1. Extract endpoint definitions to a registry with configurable latency
  2. Add real-world latency values from WP.com metrics
  3. Consider test modes for median vs p95 scenarios

For now I'll focus on the logging improvement as a quick win, thanks again for the detailed feedback! 🙏
