Behold WordPress, Destroyer of CPUs

Lately I've been delving into the WordPress ecosystem, as it seems to be the most popular blogging platform around at the moment. I've set up two blogs with it so far. In the process, I've gotten quite comfortable with the setup, interface, and overall operation of WordPress.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2008/04/behold-wordpress-destroyer-of-cpus.html

Is the database on the same server? Seems like that may be another culprit if so…otherwise, if the DB is on another server, then I’d have to ask what the heck WordPress is doing that requires all of that CPU?

I don’t know or care about WordPress and do not doubt for one second that the insides are something like a steaming pile of burned lentil soup - however, this post reminds me of a fellow at our LUG who was complaining about how slow rsync was with lots of files when running under cygwin.

Perhaps I should try running a large DotNetNuke site with Mono on Debian unstable inside a Windows 98 Virtual PC and write a blog complaining about how bad it is.

WordPress + WP-Super-Cache + server running LiteSpeed = super-fast, solid site. Fireball-proof. :wink:

Also, there’s a lot you can do to remove redundant DB queries in a custom theme vs the standard theme files.

I agree caching should be rolled into the core, though.

Hey Drupal Fanboys,

Drupal is not the answer and is sometimes even more egregious, especially when the PathAuto (a must for SEO) module is deployed. And if you use Views (which most installations also require), a simple block with it can throw your queries through the roof (as many as 80 more per page). So Drupal is not the answer to performance by any stretch of the imagination, and its blogging tools are atrocious, lacking even fundamental features bloggers need. Not trying to start a holy war, but having used both at the enterprise level (30 - 80 million uniques a month on some of our sites), I kind of have a handle on what it means to scale either one of these open source apps.

As far as the whole thread goes, Windows/IIS is a poor choice for PHP in general, much less for high-traffic PHP apps. As you noticed, a similarly equipped Linux server can chew through it even more. Furthermore, ‘CGI’ PHP is slower overall than ‘Module’ PHP, and is another thing skewing the results.

“But I cannot accept that a default, bare-bones WordPress install hasn’t the first clue how to cache and avoid expensive, redundant trips to the database”

Actually, WordPress 2.5 does have just that sort of thing built right into it. It’s called the object cache:
http://neosmart.net/blog/2008/wordpress-25-and-the-object-cache/

Here’s the key though… Where do you store your cached data?

The caching solutions you’re talking about are “whole page” solutions, which cache the resulting HTML output of page generation and serve it up instead of regenerating the page. WP-Super-Cache is particularly effective at this, but does indeed rely on running Apache with mod_rewrite (the current most popular webserver combination).
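As a rough illustration of the "whole page" idea described above, here is a minimal Python sketch. It is not how WP-Super-Cache is implemented (that plugin is PHP and works with Apache rewrite rules); `PageCache`, `render_page`, and the 60-second lifetime are hypothetical names and values chosen only for this example.

```python
import time


class PageCache:
    """Toy whole-page cache: stores finished HTML keyed by URL path."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (expires_at, html)

    def get_or_render(self, path, render):
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the page is not regenerated at all
        html = render(path)  # miss: pay the full page-generation cost once
        self.store[path] = (now + self.ttl, html)
        return html


render_calls = []


def render_page(path):
    """Hypothetical stand-in for full page generation (queries plus templating)."""
    render_calls.append(path)
    return f"<html><body>Post at {path}</body></html>"


cache = PageCache(ttl_seconds=60.0)
first = cache.get_or_render("/2008/04/hello", render_page)
second = cache.get_or_render("/2008/04/hello", render_page)
```

The second request is served from the stored HTML, so `render_page` runs only once. A real page cache also has to invalidate entries when a post or comment changes, which this sketch leaves out.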

The built in caching in WordPress is an object cache. Data retrieved from the database is stored and saved somewhere else. But, by default, WordPress does not attempt to tell the user where they should store that data. It’s a framework, with several possible plugins for just that.

See, back in previous versions, WordPress had a built in cache that defaulted to storing the data in a local cache of files. This helps somewhat, but for the most part, it’s not helpful. Storing data in files and retrieving them from files incurs a lot of disk I/O. And what is a database server but something that does much the same thing? So that was ripped out of 2.5 and the generic platform for object caching was put in instead. It can cache to a number of different things, including persistent and fast memory (using memcached). A plugin does exist to cache objects to files though:
http://neosmart.net/blog/2008/file-based-extension-to-the-wordpress-object-cache/
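The "framework with pluggable storage" idea can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the WordPress object cache API: `ObjectCache`, `MemoryBackend`, `FileBackend`, and `load_post` are hypothetical names, and a memcached backend would simply be a third class with the same `get`/`set` methods.

```python
import json
import os
import tempfile


class MemoryBackend:
    """Keeps cached objects in a dict; fast, but gone when the process exits."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class FileBackend:
    """Keeps cached objects as JSON files; persistent, but costs disk I/O."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, key):
        # Hex-encode the key so it is always a safe file name.
        return os.path.join(self.directory, key.encode("utf-8").hex() + ".json")

    def get(self, key):
        try:
            with open(self._path(key), "r", encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return None

    def set(self, key, value):
        with open(self._path(key), "w", encoding="utf-8") as f:
            json.dump(value, f)


class ObjectCache:
    """Front end that does not care where the backend stores the data."""

    def __init__(self, backend):
        self.backend = backend

    def fetch(self, key, load):
        value = self.backend.get(key)
        if value is None:  # note: a stored None is treated as a miss
            value = load()
            self.backend.set(key, value)
        return value


db_queries = []


def load_post():
    """Hypothetical stand-in for a database query."""
    db_queries.append("post:42")
    return {"id": 42, "title": "Hello"}


memory_cache = ObjectCache(MemoryBackend())
memory_cache.fetch("post:42", load_post)
memory_cache.fetch("post:42", load_post)  # served from memory

file_cache = ObjectCache(FileBackend(tempfile.mkdtemp()))
file_cache.fetch("post:42", load_post)
file_cache.fetch("post:42", load_post)  # served from disk
```

Each cache hits the "database" once, so `db_queries` ends up with two entries. The file backend trades a query for a disk read, which is the trade-off the comment above questions.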

But on the whole, this is not going to reduce your CPU time, because you’re still generating the page. Database access is not the CPU limitation on most hosts, as the database is usually on another server entirely. Thus the need for whole page caches.

Incidentally, WordPress provides the necessary hooks specifically for whole-page caching. This was designed in, not some sort of “hack” that was done by these plugin programmers.

WordPress is designed, from the ground up, to be a base of systems, not to be a complete and total system in and of itself. WordPress does not do everything, it just does one thing as well as it can. You don’t use can openers on luggage, do you?

WordPress’ coding framework… isn’t a framework. A bunch of files randomly sewn and included together that somehow manages to produce a page. UGH! … If I had to deploy a blog though, I’d use Joomla.

$irony++;

I used to blog and got out of it after using both WordPress and Movable Type. I agree with your comments about WP, and, in fact, they can be extended to cover any blogging engine that goes to the database every time a page is refreshed.

The WP approach makes life easier on the blogger because it eliminates the need to wait on static page rebuilds every time new content is posted. But, as you point out, the burden is then passed to the blog readers, who are made to wait while WP makes those redundant database calls.

Certainly, the host I used sold lots and lots of shared space to WP users, but told anyone who asked that those sites would go belly up as soon as they began to get a lot of incoming traffic. While they didn’t push MT, they acknowledged that a large MT blog running on a fast non-Apache server would very likely survive its own popularity.

Jeff, how do you explain WordPress blogs surviving multiple social bookmarking front pages (digg, stumble and delicious) while on shared hosting?

I did it at least 10 times, and it never crashed.

By the way, I did that with no cache plugins whatsoever.

Umm, database accesses are just that, no matter if it’s MySQL, SQL Server, using PHP, .NET, Apache, IIS…

Unneeded db accesses are just bad design.

First of all, I don’t see the big deal with installing WP-Cache separately after installing WordPress. After all, it IS a UNIX program. In UNIX, everything is an option - nothing is forced on you, unlike Windows. I’m sure there are cases where WP-Cache is not wanted, or, more likely, people want to be able to pick which caching plugin they want to use. Because of this, it’s better that it not be included by default.

Besides, why are you using Windows Server? I mean, Windows is a mediocre OS for the desktop, at best. But Windows SERVER? Those two words just don’t go together in the same sentence. I’m sure you’d get much better results from KVM+JEOS, or just a straight up Debian, SUSE, Slackware, RHEL, or Gentoo install, if you don’t want virtualization.

20 lookups isn’t good, but I wonder how some of the plugin appearance widgets will work without some lookup overhead.

Beyond that, we have a couple of WordPress MU (multi-user) sites set up and the processors aren’t even breaking a sweat. They’re not big sites either, but one of them is serving health lecture podcasts and is incredibly popular (I am assuming the RSS generator is dynamic as well and should be putting a load on the system) and another RSS feed is in the default display of our portal system that gets thousands of hits a day.

We’re using unix/apache.

Is there something special in that mix that separates them from Jeff’s situation?

WordPress optimization is horrible (or shall I say there’s no optimization?). Earlier, I used to worry about even the size of images in the theme design, but have since forgotten such things because optimization (or the lack thereof) of WP is a bigger matter.

For what it’s worth, I might add that the Super Cache plugin does an incredible job (it creates actual HTML files, à la MT).

I survived two medium-sized diggs on Dreamhost (ha!) with Super Cache. So, I am sure that Super Cache can handle half a million hits or whatever in a day on a better host.

As for Daniel Scocco’s comment above ^, I personally know that his host is Doreo (and I am on it too, now). They are a small but reliable company, and servers are blazing fast.

Last week, I survived 2200+ diggs on a post without any caching plugins. Of course, lesser hosts would’ve crumbled (for example, Dreamhost), but it shows that the host is to be blamed too (for cramming hundreds of sites per server and overselling).

Just wanted to say that the issue is not actually in WordPress itself but the environment it is used on, but there are plenty of such comments…

I’m also curious to know why you never said anything about MT’s vices in optimization (there are at least some).

How about, um, rebuilding the cache?

Vladas: Because running it on Linux would somehow decrease the number of queries? How?

“Personally, I think it’s absolutely irresponsible that WP-Cache like functionality isn’t already built into WordPress. I would not even consider deploying WordPress anywhere without it. And yet, according to a recent podcast, Matt Mullenweg dismisses it out of hand and hand-wavingly alludes to vague TechCrunch server reconfigurations.”

That’s one opinion. I know where he’s coming from – a lot of stuff we can’t control from within WordPress. We can’t install a PHP opcode cache or set up MySQL’s query caching (two huge legs up for any PHP/MySQL application). But the simple fact is that HTML output caching (a la WP-Cache, WP-Super-Cache) makes a huge difference in the performance of a WordPress site, whether the server is correctly configured or not. We’re talking almost two orders of magnitude. I agree with you – I think that it’s high time we had an HTML output caching system baked into core. This issue is going to be tackled by a Google Summer of Code applicant, under guidance from senior WordPress developers.

Jeff, as far as I can tell (by using Alexa), this “trickle” is hundreds of thousands of page views PER DAY.

I see typical “just use Linux” comments without backing. Well, here’s some backing: varnishd. It’s a smart front-side cache that relies on a smart virtual memory system. Now, WordPress still needs to set cache-control headers properly, but that’s a much easier solution than implementing fine-grained caching of database resultsets.
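To make the "set cache-control headers properly" point concrete, here is a small Python sketch of the decision an application could make per response. It is an assumption-laden illustration, not WordPress or Varnish code: `cache_headers`, the logged-in check, and the 300-second lifetime are hypothetical, and how a given front-side cache treats these headers depends on its own configuration.

```python
def cache_headers(is_logged_in, max_age_seconds=300):
    """Return response headers telling a front-side cache what it may store."""
    if is_logged_in:
        # Personalized pages should not be shared between visitors.
        return {"Cache-Control": "private, no-store"}
    # Anonymous page views are identical, so a shared cache may reuse them.
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


anonymous = cache_headers(is_logged_in=False)
member = cache_headers(is_logged_in=True)
```

With headers like these, the application only has to classify each response; storing, expiring, and serving the cached copies is left to the cache in front of it.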