MySQL Table Size

Ever wondered which database or tables are taking up disk space on a MySQL/MariaDB server?

This query will provide the size of each table:

SELECT TABLE_SCHEMA AS `Database`,
       TABLE_NAME AS `Table`,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS `Size (MB)`,
       ROUND(DATA_FREE / 1024 / 1024, 2) AS `Reclaimable Size (MB)`
FROM information_schema.TABLES
-- WHERE TABLE_SCHEMA = 'database_name' -- uncomment to limit to one database
ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC;
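
To see totals per database rather than per table, the same information_schema columns can simply be aggregated:

SELECT TABLE_SCHEMA AS `Database`,
       ROUND(SUM(DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS `Size (MB)`,
       ROUND(SUM(DATA_FREE) / 1024 / 1024, 2) AS `Reclaimable Size (MB)`
FROM information_schema.TABLES
GROUP BY TABLE_SCHEMA
ORDER BY SUM(DATA_LENGTH + INDEX_LENGTH) DESC;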

Database Platform Comparisons for Laravel Feature Tests

TL;DR: MySQL significantly outperforms MariaDB in my automated test suite.

The Problem

This Twitter thread prompted me to do a bit of research on database platforms for Laravel automated tests.

I’ve recently been building an ecommerce app based on Laravel. Partway through development, we added geometry fields to a couple of tables in order to determine distances. I’ve been using this spatial package, so SQLite was not an option for my test suite.

As soon as I switched the testing database driver from SQLite to MariaDB, my tests immediately took an extra 12–13 seconds to run, regardless of whether I ran the entire test suite, a single file, or just one test.

This significantly lengthened the feedback loop when making changes to code and re-running tests.

So when I saw Jack Ellis mention that he uses MySQL for his test suite, it made me curious if he had the same issue.

He said that one of his test files runs 39 tests in < 2 seconds, so apparently it’s not been a problem for him.

Context

  • I’m using the LazilyRefreshDatabase trait added in Laravel 8.62.0 on my entire test suite
  • I’m using squashed migrations
  • Many of my tables have constrained foreign keys referencing other tables

Comparisons

I decided to do some digging; here are comparisons using four different platforms for the same test in my application.

MariaDB

I’ve been using MariaDB as the main database platform on my development machine for years. Currently I’m on version 10.6.4.

In-Memory SQLite Database

I temporarily disabled the geometry features and tried the in-memory SQLite database (the sqlite driver with DB_DATABASE=:memory:); it performed much better for the same tests.
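
For reference, the relevant settings (in .env, or as env variables in phpunit.xml) look like this:

DB_CONNECTION=sqlite
DB_DATABASE=:memory: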

SQLite File Database

I then tried an SQLite file database (DB_CONNECTION=sqlite with DB_DATABASE pointing at a file), and it performed about the same.

MySQL 8

I have an installation of MySQL 8 set up for one app that uses some MySQL 8–specific features, and I figured why not give that a try too.


Summary

For some reason, MariaDB takes approximately 12–13 seconds to tear down and recreate the database before starting to run tests, but MySQL is much faster.

While testing MariaDB, I watched the raw data directory for the database and noticed files being removed and recreated in chunks, so perhaps the foreign key constraints are (part of) the culprit here.

I do have 77 databases with ~3800 tables in my MariaDB installation, built up from various projects over the years. It seems unlikely, but theoretically possible, that the sheer size of the installation is part of the problem too.

I think I’ll experiment with switching back to MySQL as my development platform of choice.

Have you run into this same issue? Have any tips or tricks? Let me know in the comments.

Counting Distinct Values in a Single Field

A quick MySQL snippet to count how many times a value appears in a single field—much easier to grok than multiple JOINs.

SELECT
COUNT(CASE WHEN meta_value = 'value1' THEN 1 END) AS value1,
COUNT(CASE WHEN meta_value = 'value2' THEN 1 END) AS value2
FROM wp_postmeta;

/** Results

| value1 | value2 |
|--------|--------|
| 75     | 56     |
*/
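
If you’d rather not hard-code a column per value, a GROUP BY gives the same counts as rows instead:

SELECT meta_value, COUNT(*) AS occurrences
FROM wp_postmeta
WHERE meta_value IN ('value1', 'value2')
GROUP BY meta_value;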

WordPress and WooCommerce at Scale with 500,000 Users

WordPress generally works fairly well on small-to-medium sites; on larger sites, it can run into performance issues because of the size of the database.

A current project I’m working on for LuminFire has a WooCommerce store with potentially 270,000+ customers, and that’s causing some issues with site performance. For sake of development, our dev store has 500,000+ users.

Generating Dummy Users

First, I generated a CSV file with 50,000 fake users using mockaroo.com and imported them using this plugin. (I had to bump my PHP memory limit to 4096MB and my max execution time to 240 seconds, and it still timed out a couple of times; each time, I deleted the users that had already been imported and ran the import again.)

Update: I have since learned about WC Smooth Generator, and it may have worked just as well or better.

I figured 50K unique users was plenty and that duplicating them via MySQL queries would be more efficient, so I wrote these queries to do the job.

The users queries ran in seconds each, copying a batch of 50K rows at a time. The usermeta queries, not so much: they took about 11 minutes each, since there were about 1.71 million rows to clone each time.
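
The approach, roughly (a sketch, not my exact queries; the ID offset and the suffixing scheme here are illustrative):

-- Clone the original 50K users into a new ID range, keeping logins/emails unique
INSERT INTO wp_users
    (ID, user_login, user_pass, user_nicename, user_email,
     user_registered, user_status, display_name)
SELECT ID + 50000, CONCAT(user_login, '-2'), user_pass,
       CONCAT(user_nicename, '-2'), CONCAT('2-', user_email),
       user_registered, user_status, display_name
FROM wp_users
WHERE ID BETWEEN 1 AND 50000;

-- Clone the matching usermeta rows using the same ID offset
INSERT INTO wp_usermeta (user_id, meta_key, meta_value)
SELECT user_id + 50000, meta_key, meta_value
FROM wp_usermeta
WHERE user_id BETWEEN 1 AND 50000;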

In case it’s helpful to you, here are files with the dummy users:

  • CSV with 50K users
  • MySQL dump of wp_users table (30.4MB)
    • It does include the ID field since it needs to match the usermeta table.
    • The first user ID is 510024; if your site already has user IDs at or above 510024, the import will fail with duplicate-ID errors.
    • 500,000 rows
  • MySQL dump of wp_usermeta table (148.5MB)
    • It does not include the meta_id field (so no errors trying to overwrite existing IDs).
    • 17,000,000+ rows

WordPress User Dashboard

At the top of the user dashboard, WordPress typically displays a list of the user roles on your site, as well as the number of users in each role.

user roles

The number of users in each role is generated by the count_users function, which uses a resource-intensive SQL query. On our dev site, it takes 15+ seconds just to run the query.

This is a known issue and should be resolved in WordPress 5.0 (currently scheduled for release in late 2018), but we need this working much sooner.

WordPress Multisite drops the number of users per role and instead shows just a list of roles; that’s how WordPress.com and other large multisite networks avoid this performance hit.

The proposed patch on the ticket modifies the count_users function to behave similarly to WP Multisite: if there are more than 10,000 users (modifiable with the new wp_is_large_user_count filter), it doesn’t show the number of users per role.
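
Once patched, that behavior can be toggled with the filter; a minimal sketch, assuming the filter simply passes a boolean as in the patch:

// Force the "large site" behavior (hide per-role counts) regardless of user count.
add_filter( 'wp_is_large_user_count', '__return_true' );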

Since the patch is for the development version of the WP code, it doesn’t apply cleanly to a production site, so I manually patched WP core. Here are several patch files for different versions of WordPress; make sure to use the appropriate patch for your version:

Core Patch File

You can apply this patch file by downloading it, opening your WordPress installation in a terminal, and running git apply <path-to-downloaded-patch-file>.

WooCommerce User Queries

The biggest performance hit I found was searching for customers when editing an order; it took 10–15 seconds for search results to be returned.

WooCommerce user search

  • If searching by customer ID, the backend would respond pretty quickly.
  • Otherwise, it runs a full text search for the search term in the first and last name, which takes a while; see the code for full details.

Since orders aren’t manually assigned/reassigned too often, we decided this behavior was acceptable for now.

WooCommerce Customer Reports

The WooCommerce customer reports were the other major performance hit. For now, we simply disabled the customer reports: this client already has a third-party system for managing customer data, so they’re unlikely to use the WooCommerce reports anyway.

Customers vs. Guests

Timing details:

  • Around 20–40 sec to load the page
  • Around 7 sec to get administrator users
  • Around 7 sec to get shop_manager users
  • Around 6 sec to get all other users

Here are the actual MySQL queries. In my staging environment, PHP ran out of memory; in my local environment, it took 305MB(!) of memory to load the report for 1 week (the default view).

As noted above, we gave up on optimizing this report since the customer doesn’t really need it.

Customers List

This was completely unusable; it took 300 seconds to load the page.

Here are the actual MySQL queries. Two JOINs per query, each with a full-text search on two columns, multiplied across three queries, was just too much on such a large table.

As noted above, we gave up on optimizing this report since the customer doesn’t really need it.

WordPress Posts Dashboard

Update: I found another place with a 15+ second wait. When bulk-editing posts (or any other post type that supports authors), the author dropdown triggers a full search of the wp_usermeta table. I’ve updated the patch file in the gist above to include a fix for this.

WordPress Posts Dashboard bulk edit author dropdown

Summary

In summary, having 500,000 users on a WordPress and WooCommerce site doesn’t hurt overall performance too much, as long as you apply the upcoming WP core change and can get by without the WooCommerce customer reports.

We may explore further options to improve performance, particularly when getting user roles from wp_usermeta, and I’ll update this post or add a new one if we find other enhancements.

Merge + Minify + Refresh Force Clear Caches

This plugin clears other page caches/proxies whenever the Merge + Minify + Refresh cache is regenerated, so users don’t end up with a cached page trying to load old, missing static resources (CSS/JS files).

Get the code at the GitHub repository.

WordPress Theme Stylesheet Auto-Cachebusting

If you’re like me and you set up server-side caching, CloudFlare (or any other proxy cache), and/or long Expires headers on your theme stylesheets, you know the hassles that go into invalidating those caches and forcing browsers to load an updated version.

One method to force browsers to pull the new version is to add/change the ver parameter in the query string (WordPress’ wp_register_script, wp_register_style, wp_enqueue_script, and wp_enqueue_style have a built-in way of doing this).

Query Strings

Defining a Constant

Here’s how I used to do it: define a constant and then manually update that, as well as the Version header in the style.css file.

The main drawback is that I had to manually update the version number in two places: functions.php and style.css.
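
A minimal sketch of that approach (the constant and handle names are illustrative):

// functions.php: bump this (and the Version: header in style.css) on each release.
define( 'MY_THEME_VERSION', '1.2.3' );

add_action( 'wp_enqueue_scripts', function () {
    wp_enqueue_style( 'my-theme-style', get_stylesheet_uri(), array(), MY_THEME_VERSION );
} );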

File Modification Time

Another method I’ve used at times is to add the style.css file modification time as the query string.

This has the advantage that the cache is invalidated every time the stylesheet file changes. For some reason, I never really cared for it that much, though. It does add a tiny performance hit, since filemtime() has to stat the file to read the timestamp.
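
Something like this (again, the handle name is illustrative):

add_action( 'wp_enqueue_scripts', function () {
    // Use the stylesheet's modification time as the cache-busting version.
    $version = filemtime( get_stylesheet_directory() . '/style.css' );
    wp_enqueue_style( 'my-theme-style', get_stylesheet_uri(), array(), $version );
} );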

WordPress Theme Version

I just read about this today and I think it’s going to be my go-to method in the future. It uses one of WordPress’ built-in theme functions to get the version number from the stylesheet.

This has a bit of overhead as well due to the function call, but I like the fact that it uses the theme version number so you can easily do Semantic Versioning.
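
In code, it looks roughly like this:

add_action( 'wp_enqueue_scripts', function () {
    // Pull the Version: header straight from style.css.
    wp_enqueue_style( 'my-theme-style', get_stylesheet_uri(), array(), wp_get_theme()->get( 'Version' ) );
} );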

Filename Modification

However, most performance testing tools will recommend removing query strings altogether to improve caching, especially by proxy servers. Here are two automatic methods of doing that:

File Modification Time
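
One way to do this (a sketch, not my exact code): rewrite the enqueued URL so the modification time lives in the filename, then map it back to the real file with a server rewrite rule.

add_filter( 'style_loader_src', function ( $src, $handle ) {
    if ( 'my-theme-style' !== $handle ) {
        return $src;
    }
    // style.css?ver=1.2.3 -> style.1712345678.css
    $mtime = filemtime( get_stylesheet_directory() . '/style.css' );
    $src   = remove_query_arg( 'ver', $src );
    return preg_replace( '/\.css$/', ".{$mtime}.css", $src );
}, 10, 2 );

This needs a matching server rewrite (e.g., in .htaccess: RewriteRule ^(.+)\.\d+\.css$ $1.css [L]) so the versioned filename serves the real file.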

WordPress Theme Version
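
This variant is identical to the filter above except that the version string comes from wp_get_theme()->get( 'Version' ) instead of filemtime(); the same rewrite rule applies.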

Conclusion

No matter which method you choose, setting up caching can greatly improve your site speed. It’s worth a little bit of hassle to ensure your stylesheets and scripts are cached, knowing that with any of these methods, you can easily invalidate those assets.

WordPress and Responsive Images: srcset and sizes Attributes

I’ve been puzzling over a performance issue on one of my sites lately and finally got it figured out today.

A custom shortcode uses the_post_thumbnail() to include a small image (~250px square); I have a bunch of custom image sizes set up, so the srcset attribute included 16 different image sizes. (Overkill? Maybe….) However, instead of using the image closest to 250px wide, Chrome was pulling in the largest available image—sometimes as much as 1500px! Obviously, this is terrible for performance.

In addition, certain pages had dozens of these images, multiplying the effect. My test-case page was 10+MB total (way too heavy…).

Here’s a sample <img> tag for one of my images:
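
It looked something like this (URLs and widths are illustrative, and I’ve trimmed the 16 srcset candidates down to three):

<img src="https://example.com/uploads/photo-250x250.jpg"
     srcset="https://example.com/uploads/photo-250x250.jpg 250w,
             https://example.com/uploads/photo-600x600.jpg 600w,
             https://example.com/uploads/photo-1500x1500.jpg 1500w"
     sizes="100vw"
     alt="">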

I finally realized that the root cause was that the sizes attribute was set to “100vw,” basically suggesting to the browser that the image might at some point be rendered at 100% width of the viewport, so my browser was happily pulling down the largest possible image so it could display it in all of its high-resolution, hundred-KB glory instead of the scaled dozen-KB file.

Since I know this particular shortcode will only show images at a max width of 250px, I needed to set the sizes attribute to “250px,” hinting to the browser that anything slightly larger than that would be perfectly adequate.

Here’s how I handled it:

  1. Added a filter on wp_get_attachment_image_attributes and checked whether the requested size was a named image size or a width/height array
  2. If it was an array, I set the sizes attribute to the array’s width in pixels
  3. If it was a named image size, I looked up that size’s registered width and used it for the sizes attribute

Sample code:
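
Something along these lines (a minimal sketch; wp_get_registered_image_subsizes() assumes WP 5.3+):

add_filter( 'wp_get_attachment_image_attributes', function ( $attr, $attachment, $size ) {
    if ( is_array( $size ) ) {
        // Explicit [width, height] request: hint the browser with the exact width.
        $attr['sizes'] = $size[0] . 'px';
    } else {
        // Named size: look up its registered width.
        $registered = wp_get_registered_image_subsizes();
        if ( isset( $registered[ $size ]['width'] ) ) {
            $attr['sizes'] = $registered[ $size ]['width'] . 'px';
        }
    }
    return $attr;
}, 10, 3 );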

This approach admittedly is a bit heavy-handed and I’m still testing out the behavior on other pages on the site, but this fixed the problem with that specific shortcode, and my test page went from 10+MB to ~5MB—it cut my page weight in half!

The moral of the story is don’t “set it and forget it,” especially when adding responsive images using custom code.

Thanks to VIA Studio for their blog post that finally made the solution click for me.

WPForms: Force Async Scripts

This little plugin has one job: make WPForms’ Google reCAPTCHA script load asynchronously.

Get the plugin on GitHub.
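
Under the hood, the approach boils down to filtering the script tag. A minimal sketch (the handle name is an assumption; check which handle WPForms actually registers for reCAPTCHA):

add_filter( 'script_loader_tag', function ( $tag, $handle ) {
    // Add async/defer to the reCAPTCHA script tag only.
    if ( 'wpforms-recaptcha' === $handle ) {
        $tag = str_replace( ' src=', ' async defer src=', $tag );
    }
    return $tag;
}, 10, 2 );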

Installation Instructions

  1. Copy the wpforms-recaptcha-async.php file to your wp-content/plugins/ directory (or compress it to a zip file and upload it in the WordPress backend under “Plugins > Add New”)
  2. Activate
  3. That’s it. The reCAPTCHA script will be loaded with the async and defer attributes.

Merge + Minify + Refresh: Force Async Scripts

This little plugin has one job: make all merged scripts produced by the Merge + Minify + Refresh plugin load asynchronously.

Get the plugin on GitHub.

Installation Instructions

  1. Copy the merge-minify-refresh-async.php file to your wp-content/plugins/ directory (or compress it to a zip file and upload it in the WordPress backend under “Plugins > Add New”)
  2. Activate
  3. That’s it. All the merged scripts will be loaded with the async attribute.

Potential Issues

Note that if you have plugins or other code that adds inline JavaScript, it will probably break if it depends on jQuery (for example) already being loaded.

One workaround is to exclude /wp-includes/js/jquery/jquery.js from the Merge + Minify + Refresh cache, but you may have to exclude other scripts as well.