There is, of course, a point at which current OSes and apps give diminishing returns as more CPUs are added. There are also some studies saying that beyond a certain point, processing actually slows down for some tasks.
However, having four or eight cores is not a waste for many consumer desktops. I have an eight-core Mac Pro with 10 GB of RAM, running OSX (and often Windows and Linux in VMware). OSX has a nice little dev tool where you can view the CPU load graphically and disable CPUs as you wish - down to one CPU.
To make a point this weekend to someone considering whether to get a dual core or a quad core machine, I ran only an email client, a browser and iTunes - a typical consumer app load. In iTunes I was using a "spectrum analyzer" visualizer which does load down the CPUs a bit, but it is not unusual for consumers to do something like this (or to be watching a DVD or streaming video).
At 8 cores not all 8 cores were used at the same time - it was switching between them. At four cores it was using all four cores simultaneously at about 25% average load each. At two cores it was using about 50% average load each. At one core the CPU was often pegged.
The point is that it doesn't take much to load up current CPUs - even a top-of-the-line machine like mine with an efficient OS can easily make efficient use of more than 2 cores with typical consumer use. Whether it is worth the extra money for a given consumer depends on their budget, their use and what they are buying. I have seen Dell quad core desktops on sale for under $500, so I don't hesitate to tell someone to get a quad core machine if the price is right. It won't be wasted, and in the near future apps and OSes will take even more advantage of more cores.
I remember when the '386 first came out. A pundit in a review article said it would never be useful for the desktop - it would only be useful for "file servers". I also remember a couple of years after the Mac came out - another pundit asserted that Macs would never run Windows.
Guy: And remember, recovery time is just as important as backup time.
Why would you claim this? It seems that, generally speaking, you should be doing the backup well over 100x as many times as you do the recovery (other than spot checks for backup reliability, but that's not a time-sensitive task unless you insert it as a blocker). If I have a catastrophic failure and need to take 2 hours to decompress the backup, I'm willing to charge that to the unfortunate but uncommon catastrophic failure event; if I'm taking 2 hours to compress every day so that recovery is a 10 minute operation, I think I'm wasting a heck of a lot more time for little real payoff!
Or, are you implying that every backup should be uncompressed and verified immediately before closing the backup window?
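Back-of-the-envelope, with numbers invented purely for illustration (a cheap quarter-hour backup versus the 2-hour / 10-minute figures above), the totals over a hundred backup cycles look like this:

    # Illustrative arithmetic only: total hours over 100 backup cycles,
    # assuming one catastrophic restore somewhere in that window.
    BACKUPS = 100  # backups per restore event ("well over 100x")

    # Strategy A: cheap daily backup, expensive restore
    fast_backup_hours, slow_restore_hours = 0.25, 2.0
    total_a = BACKUPS * fast_backup_hours + slow_restore_hours

    # Strategy B: expensive daily backup tuned for a fast restore
    slow_backup_hours, fast_restore_hours = 2.0, 10 / 60
    total_b = BACKUPS * slow_backup_hours + fast_restore_hours

    print(f"cheap backups, slow restore: {total_a:.1f} hours")  # 27.0
    print(f"slow backups, fast restore : {total_b:.1f} hours")  # ~200.2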
@ Tom Dibble: Recovery time is as important as backup time if you want to have a DR site ready to go when a failure happens. In that kind of situation, you'll want to restore your latest backup to your DR server as fast as possible.
Unfortunately compression algorithms are like symmetric block ciphers. If you want fast, then it won't compress well / it will leak information.
If you want it to compress well / not leak information, the task becomes highly serialized.
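You can see the shape of that trade-off with whatever codec is handy; here is a quick, unscientific sketch using zlib's fastest and best levels (the file name is just a placeholder for any large, redundant file):

    import time
    import zlib

    data = open("backup.sql", "rb").read()  # placeholder: any large, redundant file

    for level in (1, 9):  # 1 = fastest, 9 = best ratio
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        print(f"level {level}: {len(packed) / len(data):6.1%} of original "
              f"in {elapsed:.2f}s")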
If we could only standardize on a compression algorithm and then add specialized circuits to run it, it would be fast. It's kinda like that test where the Via Nano trounced a C2Q on AES encryption.
In 7-Zip 4.19 (IIRC) the bzip2 implementation was rewritten to achieve higher compression ratios on redundant data (such as source tarballs) at the highest modes while remaining compatible with bzip2. Unfortunately, that rewrite also made the normal mode slower in total CPU time than the standard bzip2 implementation, without the higher compression ratio.
@Jeff: Even the simplest compression algorithms (Shannon-Fano and Huffman) already eliminate 80-90% of the redundancy in a file.
With so much redundancy removed, trying to compress the file again usually yields less than 10% further size reduction (or even negative compression), so it's far more efficient to use a better algorithm from the very beginning (I bet SQL 2008 is simply using ZIP's deflate compression).
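That is easy to check with an off-the-shelf codec; a small sketch using Python's zlib as a stand-in (the file name is a placeholder):

    import zlib

    data = open("backup.sql", "rb").read()  # placeholder: any redundant file

    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)  # compress the already-compressed output

    print(f"first pass : {len(once) / len(data):.1%} of original")
    print(f"second pass: {len(twice) / len(once):.1%} of the first pass")
    # The second pass typically saves a few percent at best (or even grows
    # the file) because the first pass already removed most of the redundancy.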
Now, what I have never understood is: why has nobody ever created a compression card, just as we have specialized graphics, sound and network cards? As a DBA I have to do backups continuously, copy them over the network, burn them, etc. (and yes, I do full backups only on Sundays, differential backups at midnight the other days and transaction backups twice during working hours - I hope the same's done on Stack Overflow's database).
I think there's a market for a card whose circuitry is dedicated to compression.
Just imagine it: it'd copy a stream of bytes from memory (using DMA, of course), compress it without bothering the main CPU(s), then copy the result back to main memory. For both database and web server scenarios this would be a huge win!!!
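No such card exists on my desk, but the effect can be roughly faked in software by handing compression to a dedicated worker process so the main process only shuffles bytes. A sketch only, with a made-up file name and chunk size:

    import bz2
    from concurrent.futures import ProcessPoolExecutor

    def compress_chunk(chunk: bytes) -> bytes:
        """Runs in a worker process, so the main process stays responsive."""
        return bz2.compress(chunk, 9)

    def background_compress(path: str, chunk_size: int = 8 * 1024 * 1024):
        # The main process streams the file and queues chunks; one dedicated
        # worker does the compressing - a crude stand-in for the card + DMA idea.
        with open(path, "rb") as src, ProcessPoolExecutor(max_workers=1) as pool:
            futures = []
            while chunk := src.read(chunk_size):
                futures.append(pool.submit(compress_chunk, chunk))
            return [f.result() for f in futures]

    if __name__ == "__main__":
        parts = background_compress("backup.sql")  # placeholder file name
        print(sum(len(p) for p in parts), "compressed bytes")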
@Developer Dude:
You must be kidding. Seriously. Are you really a developer?
Watching a video is the only CPU-intensive application in the bunch you mention. Fetching email is mostly bandwidth-constrained rather than processor-bound - no matter what Intel tells you, you won't browse or retrieve your mail any faster by using a more powerful CPU (or more cores).
Writing to the HD is thousands of times slower than writing to memory - your backup task is constrained by your HD's speed, so you're not gaining much by sending it to its own core.
Placing data on a CD/DVD is actually a slow operation and requires very little CPU; the big CPU usage when burning comes from copying big chunks of data from the HD to memory, and from there to the burner. The process is also highly intolerant of delays, so unless you have a RAID or a darn fast HD you should not be playing a video while burning a disc anyway.
Why not simply try splitting the original file into four equally sized parts and compressing them all individually with 7zip?
I'd be really curious what the total size would be. If it's not significantly larger than the total size when compressing the original file in one piece, then doing this in parallel would be a piece of cake. Once you've established a maximum size of input file at which point making the input file larger doesn't give better compression, you can easily split up the input file. It would require a slightly different 7zip file format though.
If the combined size is much larger than the single size, it might be much harder to parallelize 7zip though...
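A quick way to get a ballpark answer without touching the 7zip container format at all: Python's lzma module uses the same LZMA algorithm that 7zip defaults to, so compressing the pieces separately versus the whole file gives a first approximation (the file name is a placeholder):

    import lzma

    data = open("backup.sql", "rb").read()  # placeholder for the original file
    PARTS = 4

    whole = len(lzma.compress(data))

    chunk = (len(data) + PARTS - 1) // PARTS  # ceiling division into PARTS pieces
    split = sum(len(lzma.compress(data[i:i + chunk]))
                for i in range(0, len(data), chunk))

    print(f"compressed whole       : {whole:>12,} bytes")
    print(f"compressed in {PARTS} pieces: {split:>12,} bytes ({split / whole - 1:+.1%})")

If the gap turns out to be small, compressing the pieces on separate cores is the easy win.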
Did you notice that Apple has remained resolutely in favor of Core Duos in the latest refresh of their iMac consumer desktop? I'm in complete agreement with you: desktop PC performance is primarily driven by CPU clock speed, L1/L2 cache size and memory bus bandwidth rather than the number of cores.
As an aside, did you measure file I/O during this test? I wonder how much of the bzip time was just spent reading/writing to the disk. I/O latency is definitely something that multiple cores can do very little about, and they may actually make things worse (by randomizing the read/write operations).
Even if additional CPUs past 2 cannot be used efficiently, I would think that a system with 3+ cores would feel more responsive in interactive use, since there is a greater chance that CPU resources would be available immediately to respond to the user.
Basic queuing theory might indicate that 3 cores/CPUs may offer a practical advantage over just 2 even in light-load settings (under ~33% utilization), so a quad scenario might be overkill on the desktop.
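For what it's worth, here is the kind of number I mean: an M/M/c Erlang C sketch of the chance that a newly arriving task finds every core busy at roughly 33% per-core utilization (real desktop load is far burstier than this assumes, so treat it as a rough indicator only):

    from math import factorial

    def erlang_c(servers: int, offered_load: float) -> float:
        """Probability that an arriving job must wait in an M/M/c queue."""
        a = offered_load  # in Erlangs; per-core utilization = a / servers
        top = (a ** servers / factorial(servers)) * (servers / (servers - a))
        bottom = sum(a ** k / factorial(k) for k in range(servers)) + top
        return top / bottom

    for cores in (2, 3, 4):
        load = 0.33 * cores  # hold per-core utilization at ~33%
        print(f"{cores} cores: P(wait) = {erlang_c(cores, load):.1%}")

The drop from 2 to 3 cores is noticeable; the extra gain from a fourth core is smaller, which is the "quad might be overkill" intuition.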
Yep, as a compromise between limiting to dual-core and going all the way to quad-core, AMD offers tri-core (X3) versions of the Phenom and Phenom II. Sadly, Intel doesn't offer tri-cores.
I should mention, however, that in similar scenarios I have often found no improvement, or even poorer performance, when parallelizing encodings in this manner. You won't know until you try, though. It all depends on where your bottlenecks are, and what kind of resource contention might develop between the processes.
Another alternative you can consider for remote backups is rdiff-backup. It uses delta compression like rsync does to only send the difference. It also keeps reverse diffs on the server side so that you can go back to a previous version of the data at any time. It works kind of like Time Machine on the Mac. You can see what everything looked like X days ago and extract files from the backup.
This means you can keep 60 days worth of old backups at a fraction of the cost of disk space. It's very convenient to be able to go back to old data in case you find a bug in your code that's been making your data go away.
We back up 25 GB in less than an hour over a 70 kbytes/sec connection.
It can be tricky to get this going. Let me know if you want to try it and I'll give you some pointers.
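To give a flavor of the workflow (host, paths and the retention window are placeholders, and option syntax varies a bit between rdiff-backup versions, so double-check against rdiff-backup --help):

    import subprocess

    SRC = "/var/data"                                # placeholder source
    DEST = "backupuser@backuphost::/backups/data"    # placeholder destination

    # Nightly backup: only deltas go over the wire, rsync-style, and reverse
    # diffs are kept on the server so older versions stay reachable.
    subprocess.run(["rdiff-backup", SRC, DEST], check=True)

    # Cap server-side disk usage by dropping increments older than 60 days.
    subprocess.run(["rdiff-backup", "--remove-older-than", "60D", "--force", DEST],
                   check=True)

    # Pull back a single file as it looked 10 days ago.
    subprocess.run(["rdiff-backup", "--restore-as-of", "10D",
                    DEST + "/projects/schema.sql", "/tmp/schema.sql"],
                   check=True)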
JeffA wrote:
No actual published benchmarks of typical computer use support your statement. I can point to dozens of articles backed by data on AnandTech, TechReport, Tom's Hardware, etc, that all show the same thing - there is a massive point of diminishing return beyond 2 cores.
If you aren't in one of the narrow niches that can exploit parallelism, an ultra-fast dual core is all you need.
If you are running benchmarks that look at the performance of a single app, then sure, many apps are not very parallel, and many of the ones that are don't scale beyond 2 CPUs.
However, if you look at the bigger picture, where someone is using more than one app at a time, then you can start exploiting the ability of most modern OSes to spread work across CPUs and give each application at least one CPU of its own.
It is not a "narrow" niche to be listening to tunes or watching a video while something else is happening on your computer. Maybe your email client is fetching email/RSS feeds/etc., and/or you are burning a DVD, and/or you have an automatic backup process running, and/or you are downloading something. These are all typical uses where you - the user - will find, if you multitask, that you can easily put more than 2 CPUs to good use.
I bought my Mac Pro because it supports OSX (so I can run OSX, Linux and Windows, including 64-bit versions, all at the same time - I am a cross-platform developer) and because it supports up to 32 GB of RAM (those VMs, DBMS servers and web app back ends take up a lot of memory) - not so much because it had 8 Xeon CPUs. But I did get the 8 CPUs instead of the 4 because I knew that from time to time I could use them, and that in the near future more and more apps would take advantage of them.
Is there some reason not to compress (and maybe encrypt) the db in situ? Would insert/compress or extract/decompress on the fly on (presumably) mostly small text files impact response time? If not, your backup problem becomes a simple copy or ftp. Sarel's point is a good one - when the backup medium of choice was 9-track tapes, backing up only the changes was the only feasible method.
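As a rough sketch of what compress-on-insert / decompress-on-read costs at the application layer (zlib purely for illustration; the sample value is made up):

    import zlib

    def pack(text: str) -> bytes:
        """Compress a text value before it is stored."""
        return zlib.compress(text.encode("utf-8"), 6)

    def unpack(blob: bytes) -> str:
        """Decompress a stored value on the way back out."""
        return zlib.decompress(blob).decode("utf-8")

    row = "question body text " * 150   # made-up, mostly-text record
    stored = pack(row)
    assert unpack(stored) == row

    print(f"{len(row.encode())} bytes -> {len(stored)} bytes stored")
    # Note: values much smaller than ~100 bytes often do not shrink at all,
    # so whether this pays off depends on typical row size and the per-row
    # CPU hit you can tolerate at response time.

If the rows are already compressed on disk, the nightly backup does largely collapse into a plain copy or ftp, as suggested above.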