Quad Core Desktops and Diminishing Returns

Dual core CPUs were a desktop novelty in the first half of 2005. Now, with the introduction of the Mac Pro (see one unboxed), dual core is officially passé. Quad core, at least in the form of two dual-core CPUs, is where it's at for desktop systems.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/08/quad-core-desktops-and-diminishing-returns.html

Doesn’t Windows use other CPUs to run different applications?
I mean, you wouldn’t get much benefit in a single application (unless the application takes advantage of it), but the whole machine would, since it’s running different applications at the same time.

Hmmm… one of the things that having multiple CPUs really helps with for me, as a developer on Linux, is speeding up compiles. “make -j [n]”, where [n] is the number of source files you want make to compile at a time, is a real help. “scons”, which I actually use more often, also has a “-j [n]” option that does the same thing.

In fact, this often helps even on single CPU systems: “-j 2” can cause slight speedups because one process can be actually compiling while the other is waiting on disk I/O.

Just wondering: can Visual Studio do parallel compiles on multi-CPU systems? Having a 4-CPU system with the VS equivalent of “scons -j 8” running would be great for those big compiles (it would probably be multi-threaded rather than multi-process because of NT’s poor process creation speed, but the synchronisation problems wouldn’t be that different).

For performance increases in individual applications, you will need advances in thread management. For multiple applications, you should see a much bigger benefit sooner.

An example of where I think this could shine is a web server running Apache with pre-forked processes.

Time to start delving more into functional languages.

a web server running Apache with pre-forked processes.

That’s great, but remember this is specifically about DESKTOPS.

On the server, it’s a totally different story, just like 64-bit:
http://www.codinghorror.com/blog/archives/000435.html

My thoughts exactly: as the number of processing units increase, we might have to solve more problems using functional languages to benefit from parallelism.

Adam: There are several powerful build solutions for VS that even go beyond single workstation builds, for instance IncrediBuild (http://www.xoreax.com/).

There is of course the issue of how much of a bottleneck the CPU is in the first place.

Disk I/O, RAM I/O, and various latencies are more and more the real bottleneck. Especially for desktop systems, where we often have 1 process running and 30 others sleeping, without much chance of parallelizing work.

And in come the natural inefficiencies of multiple CPUs: the need to lock the bus, maintain cache coherency, manage affinity, handle TLB misses and flushes, and so on.

For example, Apache load tests can be dominated by network I/O, and throwing more CPUs at the problem is not really the way to go. Which is why the Linux networking people are looking, for example, at Jacobson’s net channels.

It’s not necessarily as hard as you make it out to be. Look at the results of a single “#pragma omp parallel for” on an application: http://www.knowing.net/PermaLink,guid,59d8fdd5-54fb-4ddf-8858-c784ac6209d6.aspx

Much (most?) media processing will involve at least some hotspots without loop-carried dependencies that can be sped up similarly. With dual cores, the benefit of doing some multithreading is perhaps a 70% speedup, which is a debatable gain. With quad cores, it may reach 3x, and with 8 cores perhaps 5-6x. I can’t see any processor-intensive niche (media, gaming, database) ignoring that.
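To make that concrete, here is a minimal sketch of the kind of hotspot such a pragma targets. The brightness-adjustment kernel and its names are invented for illustration (not taken from the linked post); the point is simply that each iteration touches only its own pixel, so there are no loop-carried dependencies and the iterations can be split across cores.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical media hotspot: brighten every pixel in a frame.
// Each iteration is independent, so OpenMP can hand a slice of the
// index range to each core.
void brighten(std::vector<std::uint8_t>& pixels, int delta)
{
    #pragma omp parallel for
    for (int i = 0; i < static_cast<int>(pixels.size()); ++i) {
        int v = pixels[i] + delta;
        pixels[i] = static_cast<std::uint8_t>(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}
```

With OpenMP enabled at compile time (/openmp in Visual C++ 2005, -fopenmp in GCC), that single pragma is all it takes to spread the loop across the available cores.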

“I mean, you wouldn’t get much benefit in a single application (unless the application takes advantage of it), but the whole machine would, since it’s running different applications at the same time”

I think Eber has hit the nail on the head here.

I agree that there is a point where the returns on optimising an application for more cores won’t be worth the effort expended. But then you can look at different ways of working.

Let’s have a hypothetical situation where you’re a video editor: you’ve just finished editing one part of a show, so you click encode and away go two of your CPUs/cores.
On a dual core system you can’t do much during the encode without affecting performance.
On a quad core system, you could potentially pull up another project and finish some editing on that.

With a rewrite of the video editing software, it might be possible to have it encode multiple parts of the show at once, e.g. each core gets 1/4 of the show to render (see the sketch below).
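As a rough sketch of that kind of split (in modern C++ for brevity; encodeRange and the chunking scheme are hypothetical, not any particular editor's API), each core gets one contiguous range of frames:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Stand-in for the real per-segment encoder.
void encodeRange(int firstFrame, int lastFrame)
{
    // ... encode frames [firstFrame, lastFrame) ...
}

// Give each core one contiguous chunk of the show and encode the chunks in parallel.
void encodeShow(int totalFrames, int cores)
{
    std::vector<std::thread> workers;
    int chunk = (totalFrames + cores - 1) / cores;  // round up
    for (int c = 0; c < cores; ++c) {
        int first = c * chunk;
        int last = std::min(totalFrames, first + chunk);
        if (first >= last) break;
        workers.emplace_back(encodeRange, first, last);
    }
    for (auto& w : workers) w.join();
}
```

Called with cores set to the number of physical cores, this keeps all of them busy for the length of the encode; on a dual core machine the same code simply runs two larger chunks instead of four.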

For software developers, quad core might not provide any immediate benefits on “regular” applications.
You could, however, implement continuous unit testing, or perhaps automated stress testing, on your desktop.

Heck, for those developing software that works with database-intensive stuff, you can do it all on your dev machine, rather than waiting for a shared resource (e.g. the dev server) to become available.

Instead of just looking at how one application can benefit from many cores, start looking at how running many applications can provide a benefit.

A regular end user probably won’t see much improvement at first (except that their spyw*re runs faster), even for games. But then again, I don’t think these are being marketed at regular end users.

Great post.

Spot the programs that only use two threads… :slight_smile:

As long as you don’t lose CPU bandwidth to bus contention, and the operating system sets thread affinity sensibly (or even just randomly), more cores should result in a better user experience when you are multitasking applications.
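For anyone unfamiliar with the term, “thread affinity” is just the set of cores a thread is allowed to run on. A minimal Win32 sketch of pinning the current thread to one core (the helper name and the idea of pinning by hand are purely illustrative; normally you leave this to the scheduler):

```cpp
#include <windows.h>

// Restrict the calling thread to a single core. The mask has one bit per
// logical processor; SetThreadAffinityMask returns the previous mask, or
// zero on failure.
bool pinCurrentThreadToCore(int core)
{
    DWORD_PTR mask = static_cast<DWORD_PTR>(1) << core;
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
}
```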

What would be nice is if you added some tests where you ran some of those applications simultaneously on dual and quad core systems.

l8r!

I have not seen any price breakdowns of the quad cores yet, but you know for sure it will be a selling point. The first use will be simultaneous apps, though I do agree about apps being disk-bound, bus contention, and so on. I do love the dual cores; my favorite part is being able to kill a process that has pegged one of the CPUs.

Couldn’t agree more: quad cores are going to be a lot tougher to keep busy. I bet the folding/seti/mersenne/whatever guys are rejoicing, though.

I wonder if quad cores will bring more benefits for virtualized operating systems. I would like to say yes, but I don’t know for sure.

If you use a lot of virtual machines you win (developers take note), and if you regularly run lots of apps you win again. For me, I suspect that my main time-critical tasks would speed up with more cores because the apps I use are usually multithreaded. When I process digital photos with www.bibblelabs.com, it runs a queue in the background as edits to each photo are committed, and I not uncommonly get 30 or more entries in the queue. More cores, shorter queues :slight_smile:

RAM does become more of an issue, and disk speed too, but I’m willing to boost those to get faster throughput (I already have: RAID 5 plus a mirrored boot disk).

What I do want is multithreaded input: I’d love it if Windows could give me one cursor per input device, so that my mouse and tablet could be used independently, ideally with some way to grab the keyboard with either.

Will and James are right. Sure, single apps won’t see much of a boost, but multiple apps will. With quad cores we could each become our own continuous integration server, continuously running tests in the background; there is already a framework for this in Ruby. My desktop basically is a server. It actually does more than a server: I listen to music while writing code against a web server on my machine that hits a database, also on my machine. I hope things would be speedier with 4 CPUs.

I am eager to see benchmarks along the lines of what James suggests. Benchmark a simulation of 50 users hitting an Apache server that hits a PHP application which pokes a MySQL database, all while playing music. At what point does it become a test of how well the OS handles multiple CPUs?

Or even just run all the benchmarks at the same time.

The C++ compiler in VS2005 has some pragmas to automagically use multiple CPUs in loops, and some of my coworkers have gotten fairly hefty speedups using them, but it’s pretty specialized stuff. It’s also apparently fairly .NET-hostile and can’t be used anywhere near managed code.
It would be nice to run a benchmark of a game that was multiprocessor aware. IIRC, early versions of Quake 3 took advantage of multiple CPUs, but that feature got broken with an early update. It would be interesting to run the original game on a quad core system and see the results.

Right now, I have the following apps all running:

TextMate (a text editor)
NetNewsWire (a news reader)
Firefox
Safari
Adium (a chat client)
Apple Mail
Transmit (an FTP client)
Photoshop
Parallels (virtual machine)
Finder (file explorer)

In addition, I have the following servers up and running:

MySQL
Apache
Lighttpd (web server)
PHP

Now, most of the time, they are idling. But there are periods when I have TextMate open on one screen and Parallels on another. Inside the Parallels VM I am running IE, which is calling Lighttpd, which is calling PHP, which is calling MySQL.

At the same time, I am often pushing the last milestone out to the server via ssh or Transmit. And, there’s usually iTunes playing in the background.

1 core for Parallels
1 core for the web server, php and mysql (since they are serving one client and can be scheduled synchronously)
1 core for iTunes and Transmit/ssh
1 core for Mail (checking every 30 minutes), the OS services, and the idling apps

Running all that on a single core machine is painful. So painful that even with enough RAM, I would still break it up into discrete groups of apps and only run one group at a time. On a dual core system, it runs along nicely, but does get slow as the level of concurrency goes up. A quad core system would scale even higher.

It’s not single application performance that counts, it’s the ability to scale when you need it.

On a dual core system, it runs along nicely, but does get slow as the level of concurrency goes up. A quad core system would scale even higher.

Unlikely, because moving to dual core removes 90% of the CPU and scheduling bottlenecks you saw on a single core.

You’ll get no argument from me on the superiority of dual core for everyone. It’s a huge improvement! But throwing 2 more cores in there is a negligible perf gain except in highly specialized scenarios.

My desktop basically is a server. It actually does more than a server.

Really? 100+ users are hitting your desktop simultaneously? C’mon. This is a fantasy. 64-bit and quad-core are effective on a server because the load scenario is extreme. A single user, no matter how 1337 he or she may be, won’t even come close to the kind of load you see on a server.

First of all, I find myself greatly appreciating hyperthreading (my box predates the dual-core processors), as it allows a processor-intensive task to run without turning the rest of the system into a dog. There’s more room to go in that direction, though; I would definitely appreciate a quad core chip.

Second, as such chips become more common, I expect we will see more programs written to take advantage of them.

As for games, yes, they are generally video bound, but that doesn’t mean there isn’t room to improve things with multithreading. As such chips become common enough, I expect we will see additional threads used to run AIs. There’s always room for smarter enemies, and if you can move the enemy logic onto a separate processor, you won’t bog down the system when the enemy hits something that takes too much thinking about.