WordPress and WooCommerce at Scale with 500,000 Users

WordPress generally works fairly well on small-to-medium sites; on larger sites, it can run into performance issues because of the size of the database.

A current project I’m working on for LuminFire has a WooCommerce store with potentially 270,000+ customers, and that’s causing some issues with site performance. For sake of development, our dev store has 500,000+ users.

Contents:

Generating Dummy Users

First, I generated a CSV file with 50,000 fake users using mockaroo.com and imported them using this plugin (I had to bump up my PHP memory limit to 4096MB and execution time to 240 and it still timed out a couple of times, but I deleted the users who had already been imported and then ran the import again).

Update: I have since learned about WC Smooth Generator, and it may have worked just as well or better.

I figured 50K unique users were plenty and duplicating them via MySQL queries was more efficient, so wrote these queries to do the job.

The users queries ran in seconds each, copying a batch of 50K rows at a time. The usermeta queries, not so much…they took about 11 minutes each since there were about 1.71 million rows to clone each time.

In case it’s helpful to you, here are files with the dummy users:

  • CSV with 50K users
  • MySQL dump of wp_users table (30.4MB)
    • It does include the ID field since it needs to match the usermeta table.
    • The first user ID is 510024; if you try to import and have user IDs above 510024, you’ll have errors.
    • 500,000 rows
  • MySQL dump of wp_usermeta table (148.5MB)
    • It does not include the meta_id field (so no errors trying to overwrite existing IDs).
    • 17,000,000+ rows

WordPress User Dashboard

At the top of the user dashboard, WordPress typically displays a list of the user roles on your site, as well as the number of users in each role.

user roles

The number of each users is generated by the count_users function, which uses a resource-intensive SQL query. In our dev site, it takes 15+ seconds just to run the query.

This is a known issue and should be resolved in WordPress 5.0 (currently scheduled for release in late 2018), but we need this working much sooner.

WordPress Multisite drops the number of users per role and instead shows just a list of roles; that’s how WordPress.com and other large multisite networks avoid this performance hit.

The proposed patch on the ticket modifies the count_users function to behave similarly to WP Multisite: if there are more than 10,000 users (modifiable with the new wp_is_large_user_count filter), it doesn’t show the number of users per role.

Since the patch is for the development version of the WP code, it doesn’t apply cleanly to a production site, so I manually patched WP core. Here is a patch file with the changes for WordPress 4.9.6 through 4.9.8; I probably will add subsequent patch files for 4.9.9, etc. as we upgrade, so if you’re using it, make sure to use the appropriate patch for your version:

Core Patch File

You can apply this patch file by downloading it, opening your WordPress installation in a terminal, and running git apply <path-to-downloaded-patch-file>.

WooCommerce User Queries

The biggest performance hit I found was searching for customers when editing an order; it took 10–15 seconds for search results to be returned.

WooCommerce user search

  • If searching by customer ID, the backend would respond pretty quickly.
  • Otherwise, it runs a full text search for the search term in the first and last name, which takes a while; see the code for full details.

Since orders aren’t manually assigned/reassigned too often, we decided this behavior was acceptable for now.

WooCommerce Customer Reports

The WooCommerce customer reports were the other major performance hit. For now, we simply disabled the customers reports since this particular customer already has a third-party system where they manage the customer data, so they’re not likely to use the WooCommerce reports.

Customers vs. Guests

Timing details:

  • Around 20–40 sec to load the page
  • Around 7 sec to get administrator users
  • Around 7 sec to get shop_manager users
  • Around 6 sec to get all other users

Here are the actual actual MySQL queries. In my staging environment, PHP ran out of memory; in my local environment, it took 305MB(!) to load the report for 1 week (the default view).

As noted above, we gave up on optimizing this report since the customer doesn’t really need it.

Customers List

This was completely unusable; it took 300 seconds to load the page.

Here are the actual MySQL queries. Two JOINs times in each query with full-text search on two columns × three queries was just too much on such a large table.

As noted above, we gave up on optimizing this report since the customer doesn’t really need it.

WordPress Posts Dashboard

Update: found another place where there’s a 15+ second wait. When bulk-editing posts (or any other post type that supports authors), an author dropdown causes a full search of the wp_usermeta table. I’ve updated the patch file in the gist above to include a fix for this.

WordPress Posts Dashboard bulk edit author dropdown

Summary

In summary, having 500,000 users on a WordPress and WooCommerce site doesn’t hurt overall performance too much, as long as you include an upcoming change in WP core and can get by without the WooCommerce customer reports.

We may explore further options to improve performance particularly when getting user roles from wp_usermeta, and I will update this post or add a new post if we find other enhancements.

Merge + Minify + Refresh Force Clear Caches

This plugin clears other page caches/proxies when the Merge + Minify + Refresh cache is regenerated so users don’t end up missing static resources (CSS/JS files) due to a cached page trying to load old static resources.

Get the code at the GitHub repository. Continue reading “Merge + Minify + Refresh Force Clear Caches”

WordPress Theme Stylesheet Auto-Cachebusting

If you’re like me and you set up server-side caching, CloudFlare (or any other proxy cache), and/or long Expires headers on your theme stylesheets, you know the hassles that go into invalidating those caches and forcing browsers to load an updated version.

One method to force browsers to pull the new version is to add/change the ver parameter in the query string (WordPress’ wp_register_script, wp_register_style, wp_enqueue_script, and wp_enqueue_style have a built-in way of doing this).

Query Strings

Defining a Constant

Here’s how I used to do it: define a constant and then manually update that, as well as the Version header in the style.css file.

The main drawback is that I had to manually update the version number in two places: functions.php and style.css.

File Modification Time

Another method I’ve used at times is to add the style.css file modification time as the query string.

This has the advantage that the cache will be invalidated every time the stylesheet file changes. For some reason, I really never cared for it that much, though. It does add a tiny bit of performance hit, as the filemtime has to go get the file and figure out the timestamp.

WordPress Theme Version

I just read about this today and I think it’s going to be my go-to method in the future. It uses one of WordPress’ built-in theme functions to get the version number from the stylesheet.

This has a bit of overhead as well due to the function call, but I like the fact that it uses the theme version number so you can easily do Semantic Versioning.

Filename Modification

However, most performance testing tools will recommend you remove query strings altogether to improve caching especially by proxy servers. Here are two automatic methods of doing that:

File Modification Time

WordPress Theme Version

Conclusion

No matter which method you choose, setting up caching can greatly improve your site speed. It’s worth a little bit of hassle to ensure your stylesheets and scripts are cached, knowing that with any of these methods, you can easily invalidate those assets.

WordPress and Responsive Images: srcset and sizes Attributes

I’ve been puzzling over a performance issue on one of my sites lately and finally got it figured out today.

A custom shortcode uses the_post_thumbnail() to include a small image (~250px square); I have a bunch of custom image sizes set up, so the srcset attribute included 16 different image sizes. (Overkill? Maybe….) However, instead of using the image closest to 250px wide, Chrome was pulling in the largest available image—sometimes as much as 1500px! Obviously, this is terrible for performance.

In addition, certain pages had dozens of these images, multiplying the effect. My test-case page was 10+MB total (way too heavy…).

Here’s a sample <img> tag for one of my images:

I finally realized that the root cause was that the sizes attribute was set to “100vw,” basically suggesting to the browser that the image might at some point be rendered at 100% width of the viewport, so my browser was happily pulling down the largest possible image so it could display it in all of its high-resolution, hundred-KB glory instead of the scaled dozen-KB file.

Since I know this particular shortcode will only show images at a max width of 250px, I needed to set the sizes attribute to “250px,” hinting to the browser that anything slightly larger than that would be perfectly adequate.

Here’s how I handled it:

  1. Added a filter to wp_get_attachment_image_attributes and checked for a named image size or array
  2. If it was an array, I set the sizes attribute to that pixel size
  3. If it was a named image size, I set the sizes attribute to that pixel size

Sample code:

This approach admittedly is a bit heavy-handed and I’m still testing out the behavior on other pages on the site, but this fixed the problem with that specific shortcode, and my test page went from 10+MB to ~5MB—it cut my page weight in half!

The moral of the story is don’t “set it and forget it,” especially when adding responsive images using custom code.

Thanks to VIA Studio for their blog post that finally made the solution click for me.

WPForms: Force Async Scripts

This little plugin has one job: make WPForms’ Google Recaptcha script load asynchronously.

Get the plugin on GitHub.

Installation Instructions

  1. Copy the wpforms-recaptcha-async.php file to your wp-content/plugins/ directory (or compress to a zip file and upload in the WordPress backend “Plugins > Add New”
  2. Activate
  3. That’s it. All the merged scripts will be loaded with the async and defer attributes.

Merge + Minify + Refresh: Force Async Scripts

This little plugin has one job: make all merged scripts produced by the Merge + Minify + Refresh plugin load asynchronously.

Get the plugin on GitHub.

Installation Instructions

  1. Copy the merge-minify-refresh-async.php file to your wp-content/plugins/ directory (or compress to a zip file and upload in the WordPress backend “Plugins > Add New”
  2. Activate
  3. That’s it. All the merged scripts will be loaded with the async attribute.

Potential Issues

Note that if you have plugins or other code that adds inline Javascript, it probably will break if it’s depending on jQuery (for example) to be loaded and active already.

One workaround is to exclude /wp-includes/js/jquery/jquery.js from the Merge + Minify + Refresh cache, but you may have to exclude other scripts as well.