Should All Developers Have Manycore CPUs?

Yeah, I made the jump from one core to four, and so far I think it was mostly a waste. And I do spend a lot of time in Photoshop, but there still isn’t that much of a difference.

So in the sense that mainstream developers are modelling server workloads on their desktops, I agree: they probably do need as many cores as they can get.

Are you saying that you understand that web developers need a “many core” setup? Or was that sarcasm?

In the event that it’s sarcasm, consider this: corporate web developers today are trying to mimic a complex environment on their desktop. There may be a database server, a web server, and one or more web services all running at the same time as the dev is debugging. By definition, they are an edge case in multitasking.

I guess if you just work on small, simple websites that only interact with a small database backend, then 2 cores and 2GB of RAM is enough. However, if you’re working on anything larger, the more the merrier.

Most developers aren’t writing desktop applications today. They’re writing web applications.

That would be my guess as well, but do you have any actual numbers on this? And not only desktop vs web application developers, but all the different branches of our field, including the far too often forgotten/ignored embedded software developers (we may be a relatively small group, but we write software that powers something like 98% of all CPUs).

The only thing I do (as a CS student) that really scales well on a multi-core CPU is compiling VHDL in Quartus II. A student project takes up to 20 minutes to compile with that damn compiler on an Athlon XP 2000. A single flip-flop I wrote took almost 30 seconds to compile on my 1.6 GHz Intel dual-core; on the laboratory machines (the Athlon XP ones) it took almost a minute. Anyway, that is still better than wiring up discrete logic gates, I guess.

The one issue I have with the idea that developers need the most kick-ass boxes to develop on is that I think developers often lose sight of the more constrained environments people are going to use their software in. Developers need to get in the habit of having at least one constrained box sitting around.

That said, since my development system is a Linux box with XP running constantly in a VM, I’m sure I could get more out of four cores.

Someone else mentioned it already, but VMware is becoming a huge developer tool. In particular, for Ruby or PHP developers working on Windows, having a VMware machine to deploy to (or program live on) is extremely beneficial, as many subtle OS and permission bugs tend to creep in otherwise (not to mention the whole case-sensitivity issue).

'If you’re a C++ developer, you need a quad-core CPU yesterday.'
I completely agree with you on this point. I compiled KDE recently on my system with two dual-core HyperThreaded CPUs. ‘make -j9’ is much, much faster than a plain ‘make’, which makes it possible to compile the whole desktop suite in less than an hour.

Some facts:
On my Mac Pro with 8 cores, C++ build times are almost 4 times quicker than on a dual-core (using Xcode / GCC).

On the same hardware, with Visual C++ 2008, build times are faster than on a dual-core, but unfortunately absolutely not in the same proportion as GCC on the Mac (using both the option to build multiple projects in parallel and the little-known /MP compilation flag). I haven’t tried IncrediBuild on this hardware (has anyone?).

I have not tried Visual Studio 2005 on this hardware, but I got a 30% boost on another machine building with VS 2005 using both CPUs, thanks to this great freeware: http://www.todobits.es/mpcl.html
That’s the same speed gain as using IncrediBuild on that machine.

In all my tests, I have curiously noted no speed advantage from a Raptor 10K drive over a good 7.2K drive (caching, I guess).

Therefore, multi-CPU machines are an obvious choice for programmers who need to be productive.

A last note: there are plenty of reviews here and there about hardware and speed, with popular programs being benchmarked, but I do regret that we almost never see programmer tools included in these benchmarks.

As a couple of others have mentioned, virtual machines can really benefit from multi-core systems. Some hypervisors even allow you to specify how many cores you want your virtual machine to take advantage of.

“task manager wasn’t showing much use of even my measly little two cores”

Jeff, take a look at xperf for a more accurate way of measuring system performance:

http://www.microsoft.com/whdc/system/sysperf/perftools.mspx

It’s free…

-Rick

Testing multi-tier code using VMs for the tiers makes sense here. Often the alternative is multiple boxes because the OSs or their configurations must be different, or the “smart” middleware runs in-process with different rules if all tiers are the same “machine.”

Web development? That’s so 2001. We’re headed back toward the future, doing more slim-client development today. Productivity is higher along with user acceptance. Ajax is a dead horse in my world.

Virtual machines are another area that can benefit from multi-core. While the VM only runs on 1 CPU in most cases, if you have 2 or 3 VMs chugging away, the more cores the better.

That said, I suspect that in the case of multitasking, the benchmarks reflect a fundamental and fatal flaw in the commonly tested platform ((cough) WINDOWS (cough)): it seems to love bouncing apps between cores just for the heck of it. Even if an app is consuming most of the CPU resources, Windows almost seems up for a game of pong as it tosses the app betwixt processors, hoping for a better fit that it can never seem to find. It is definitely time to reconsider how threads are allocated betwixt multiple cores.

Once you hit 4 cores or more, it starts to make sense to give the OS one core all to itself and then pin threads to the other cores based on how they are behaving in general. Maybe one core for low-usage threads, and everything else gets sprinkled over the remaining cores, only moving once it proves a true trend. If I have 4 low-usage apps that each use 2% of the CPU, why waste time finding the best fit? Leave them on one core, let the rest go idle. Bring in an app that uses 25%? Leave it with the other apps - there is still plenty of idle space on that core. An app maxes out the core and stays there for a bit? Now, maybe it’s time to consider moving it to an idle core and pinning it there.

Heck, maybe the OS could even keep usage stats on apps, recognize which ones are CPU hogs, and use that to figure out where they should start. (Some apps are ALWAYS low usage, some are almost always high usage - maybe it’s time to recognize that and be preemptive instead of reactive about it!)
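You can experiment with this kind of pinning from user code today. A minimal sketch, assuming Windows and the Win32 affinity API; the 75% threshold and the choice of core 1 are made up for illustration:

```cpp
// Minimal sketch: pin the current thread to one core once it has
// proven itself a sustained CPU hog. The threshold is arbitrary.
#include <windows.h>
#include <iostream>

int main() {
    // Pretend we have measured this thread's recent CPU usage somehow.
    double recentCpuUsage = 0.90; // placeholder measurement

    if (recentCpuUsage > 0.75) {
        DWORD_PTR mask = 1 << 1; // one bit per core; this picks core 1
        if (SetThreadAffinityMask(GetCurrentThread(), mask) == 0)
            std::cerr << "SetThreadAffinityMask failed\n";
        else
            std::cout << "Thread pinned to core 1\n";
    }
    // ... continue the CPU-heavy work on the pinned core ...
}
```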

I’m a developer who skipped over dual-core and went straight to quad.

If you’re doing work with a big potential for deadlocks and race conditions, 4 is a magic number. A lot of code will have 1 master thread and multiple worker threads… and on a dual core system, you probably only end up with 1 worker thread. You will not notice any race conditions between multiple worker threads because you just don’t have them.

WRT multithreaded programming being tricky and difficult, yes and no. The implication is that there’s just one way to make code parallel, which is almost always not the case. Most developers I’ve seen tend to err by having too many threads and having them do tasks that are too fine-grained. There’s just too much effort spent starting, stopping, and synchronizing threads. With a lot of applications, it is not too hard (and pretty reliable) to divide up some work, have some threads do it in parallel with almost no communication, and write the results to some shared memory.
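A minimal sketch of that divide-the-work pattern (the array size and the squaring are stand-ins for real work): each worker writes only to its own slice of the output, so the only synchronization is the final join.

```cpp
// Split the input into disjoint slices, let each worker write only to
// its own slice, and join at the end - no locks needed.
#include <cstddef>
#include <thread>
#include <vector>

int main() {
    const std::size_t n = 1000000;
    std::vector<double> input(n, 1.5), output(n);
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 4; // fallback if the count is unknown

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Each thread owns a contiguous, non-overlapping range.
            std::size_t begin = n * w / workers;
            std::size_t end   = n * (w + 1) / workers;
            for (std::size_t i = begin; i < end; ++i)
                output[i] = input[i] * input[i]; // stand-in for real work
        });
    }
    for (auto& t : pool) t.join(); // the only synchronization point
}
```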

You have to ask which Windows you’re developing under: XP or a server version. This will make a big difference. I assume the server version will use more CPUs if it finds them, probably by turning on a switch.

But then, I was using Visual Studio 2003 on a Windows 2003 machine and it was damn slow. Server versions are not optimized for GUI applications. That’s why I prefer to use Windows XP for development.

However, I am interested in benchmarks for:

developing under XP with a single CPU
vs.
developing under Windows 2003 with multicore

with amount of RAM and CPU speed held constant.

I want to know if the multicore will compensate for Windows 2003’s sluggishness in GUI applications.

And then throw Vista in there as a third option.

And finally throw in a 64-bit OS, and then someone can tell me which setup has the best performance under VS 2008.

Where’s the sweet spot, per dollar, among all the different combinations and permutations, taking into consideration number of CPUs, amount of RAM, type of Windows, and CPU speed?

Just talking about the number of CPUs is not good enough; it’s only part of the whole solution.

While I can see where many people will benefit from many-core systems, I do mainly Java web development and see no benefit. I have the database servers on actual servers, and most of my time is spent copying files from one place to another during a build. My personal favorite is a .war file with 42K+ files in it; I am bound neither by CPU nor by disk, but rather by concurrency of FAT table usage.

Aha, glad to see the good Jeff we know is back :slight_smile:

It’s true that the Paint.NET benchmark tests a lot of filters – the types of code that are, as they say, “embarrassingly parallel.” I made sure to include a lot of non-filter benchmarks as well, but they’re still of the obviously or embarrassingly parallel category. The point I was trying to make was that people are using Paint.NET quite a bit every day, and that running these filters in multiple threads is thus saving a lot of people a lot of time – even if it’s only 5 seconds per casual user per day (to make up another statistic). I would encourage you to try running some of your own normal Paint.NET usage on a quad-core system, get a feel for it, and then set the process thread affinity to just 1 or 2 cores. Then compare how it runs and report on it! It’s definitely not 4x faster, but it is faster.
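If you want to run that comparison programmatically instead of through Task Manager’s “Set Affinity” menu, here is a minimal sketch using the Win32 process affinity call (the two-core mask is just an example):

```cpp
// Minimal sketch: restrict the current process to cores 0 and 1,
// roughly what Task Manager's "Set Affinity" dialog does.
#include <windows.h>
#include <iostream>

int main() {
    DWORD_PTR mask = 3; // binary 11: cores 0 and 1 only
    if (!SetProcessAffinityMask(GetCurrentProcess(), mask))
        std::cerr << "SetProcessAffinityMask failed: " << GetLastError() << '\n';
    else
        std::cout << "Process limited to two cores\n";
    // ... run the workload you want to benchmark here ...
}
```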

Some of the work I’m doing now is in building a system that will let me rework more than just the easily parallelizable parts of Paint.NET to greatly improve response time. There are places that aren’t parallelizable but that should at least be isolatable as “background tasks” – writing undo data, for instance, currently blocks some other operations simply because they have write access to the data I’m spewing to disk. I think having better development tools/patterns will allow us to first and foremost write safe multithreaded code, after which we can optimize it for many-core scenarios. Even though Paint.NET is “multithreaded” I still consider it to have a “single threaded workflow”. And it’s failing to scale over time. Hey, I could write a blog post of my own on that :slight_smile:
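For what it’s worth, a minimal sketch of that “isolate it as a background task” idea - all names here are invented for illustration, not Paint.NET’s actual code:

```cpp
// Snapshot the undo data, then write it out on a background thread so
// the main workflow is not blocked on disk I/O.
#include <fstream>
#include <future>
#include <vector>

std::future<void> saveUndoAsync(std::vector<char> snapshot) {
    // The snapshot is moved into the task, so the caller is free to
    // keep mutating the live document immediately.
    return std::async(std::launch::async, [data = std::move(snapshot)] {
        std::ofstream out("undo.dat", std::ios::binary);
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    });
}

int main() {
    std::vector<char> undoData(1 << 20, 0); // pretend layer data
    auto pending = saveUndoAsync(std::move(undoData));
    // ... the main thread continues editing here ...
    pending.wait(); // only block when we truly must (e.g. on exit)
}
```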

Building Paint.NET in VS 2008 on a quad core is great. To do a Debug build with no multithreading takes 19 seconds on my QX6700 (yeah I bought the extreme edition right when it came out :)). Bumping that up to 4 threads brings the time down to 10 seconds (I also tried 8 threads with roughly the same result). At 2 threads it takes 14 seconds. That’s 4 - 9 seconds I’m saving every single time I do my edit-build-run loop. It rocks.

Towards the end of the year Intel is set to release Nehalem with up to 8 cores per processor package, and 2 threads per core (HyperThreading). I am seriously considering grabbing a Dual Xeon version of that – 32 cores. Then I can really find ways to make code fly and make sure Intel and AMD aren’t just caught in the same “But 2N is more than N! You need it!” hype cycle. But of course I’d never recommend that kind of system for … well … almost anyone else.

“If nothing else, dual-core CPUs protect you from badly written software; if a crashed program consumes all possible CPU time, all it can get is 50% of your CPU.”

Jeff, Jeff, that’s not a real argument for multi-cores. I can write badly written multi-threaded programs which consume 100% CPU time on all available cores, even if there are 4096 of them. :wink:
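For instance, a deliberately bad minimal sketch that will happily saturate every core it is given:

```cpp
// Deliberately bad: spawn one busy-spinning thread per hardware thread.
// This pegs 100% of every available core, however many there are.
#include <thread>
#include <vector>

int main() {
    unsigned n = std::thread::hardware_concurrency();
    std::vector<std::thread> burners;
    for (unsigned i = 0; i < n; ++i)
        burners.emplace_back([] {
            volatile unsigned long long x = 0;
            for (;;) ++x; // spin forever, burning cycles
        });
    for (auto& t : burners) t.join(); // never returns
}
```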

But recently I ordered a multi-core CPU especially for scalability tests (alas, it hasn’t reached my desk yet).
Of course, we actually do RIP (as in raster image processing) for a living. So for me it’s not compilation speed (which is negligible for the compiler I use), it’s in fact execution times. So for a dual-core, I’d really expect a performance gain somewhere in the 90% range, or my code is wrong.

just FYI:
Most video codecs are not multithreaded and don’t scale past two cores. You are better off with the fastest dual-core you can find than with a quad-core. There are a few software packages that can use multiple cores well.

I used to work for Kulabyte, who sell a product that does do multi-core transcoding/live video, but they are one of a small number of people currently able to do it. These tools all work by slicing up the video and running the existing codecs on each piece.
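A minimal sketch of that slicing approach, with a hypothetical single-threaded encodeSegment() standing in for a real codec:

```cpp
// Cut the frame sequence into segments and hand each one to its own
// single-threaded encoder instance running on its own thread.
#include <cstddef>
#include <thread>
#include <vector>

struct Frame { /* pixel data would live here */ };

// Hypothetical stand-in for an existing single-threaded codec.
std::vector<unsigned char> encodeSegment(const std::vector<Frame>& frames) {
    return std::vector<unsigned char>(frames.size()); // placeholder
}

std::vector<std::vector<unsigned char>> encodeParallel(
        const std::vector<Frame>& video, unsigned segments) {
    std::vector<std::vector<unsigned char>> out(segments);
    std::vector<std::thread> pool;
    for (unsigned s = 0; s < segments; ++s) {
        pool.emplace_back([&, s] {
            std::size_t begin = video.size() * s / segments;
            std::size_t end   = video.size() * (s + 1) / segments;
            std::vector<Frame> slice(video.begin() + begin,
                                     video.begin() + end);
            out[s] = encodeSegment(slice); // one segment per core
        });
    }
    for (auto& t : pool) t.join();
    return out; // the segments get stitched back together afterwards
}

int main() {
    std::vector<Frame> video(240); // pretend: ten seconds at 24 fps
    unsigned cores = std::thread::hardware_concurrency();
    auto chunks = encodeParallel(video, cores ? cores : 4);
    (void)chunks;
}
```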

Jeff, thanks for the link, now I wish my site was actually up.

I guess the real question is “Where do we go from here?” It is generally agreed that we are heading towards a future of more cores (Rich mentions the upcoming Nehalem from Intel, and that is just one example) rather than faster cores, so what are we (as developers) going to do to punish these new CPUs properly?

Just as MS has used its OEM distributors to push Vista, Intel will do the same thing to push quad-core (and larger) CPUs on the general user. You can’t even buy a (new platform) single-core CPU anymore, with the E1200 (dual-core Celeron) being released at the same time as your E8500 and my E8400.

Is it the responsibility of individual developers to push concurrent programming to the next level (through education and usage), or is the onus on the platform designers (whether that be .NET, the JVM, or even RoR) to provide a robust set of tools that leverage concurrent programming?

Obviously the PC platform still has bottlenecks to be overcome, including RAM latency and I/O performance, but being able to intelligently use multi-core systems for everyday applications is something that needs to be solved.

Kevin

I wonder if your SharpDevelop compile performance test is being restricted by your hard drive. I’d be interested in seeing the same test run from an SSD or even a RAM disk.