There Ain't No Such Thing as the Fastest Code

Nothing beats really knowing the implications of the problem for making an optimized routine.

And, like David L said above, it bears testing on a “real” processor with real data before declaring it faster. I can’t tell you how many times I have written “fast” functions that failed to run quickly because parallelism was faulty, or memory was fragmented, or other real-world potholes got in the way.

After all, the fastest way to get 10,000 prime numbers would be to fork() out 10,000 checkers and collect the results as they came in. But it would never work unless you had 10,001 processors and zero fork time. Sounds fast, but isn’t.

Way back in the early 80s, I was attending a programming class in school. I was about 15 at the time. We were using Commodore VIC-20s. Our teacher set us the task of writing a program (in BASIC) that would print out the prime numbers from 1 to 100 as fast as possible.

I painstakingly came up with a naive version of the Sieve of Eratosthenes, as did most of the others. Of all those versions, mine was by far the fastest. One guy, though, came up with a program that was, literally, an order of magnitude faster than that, and everybody was stunned.

That is, until we discovered his secret: He had precomputed the numbers, and was simply printing them out, one after the other.

I was outraged.

Moral: caching!

This book is excellent, although very little applies today. Back in my early days as a games programmer it was invaluable (our games were 100% assembly) - so much so that the optimisation stuff wasn’t really the fun part anymore; we used to optimise for size instead :wink:
We ran small competitions in the company for 256-byte games - in the days when Windows was still a twinkle in Bill’s eye. I managed to dig out some of the old 256-byte relics I did here

http://runtothehills.org/rob/archives/71

there was also a post there looking at your “600 lines is a luxury” post :wink:
Cheers,
Rob

One hat doesn’t fit all:

Fast is a context tradeoff.

One god (J. Carmack) wrote:

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                     // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );             // what the fuck?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

#ifndef Q3_VM
#ifdef __linux__
    assert( !isnan(y) ); // bk010122 - FPE?
#endif
#endif
    return y;
}

This is a fast inverse square root (it computes 1/√x, not √x).
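Worth noting: the `*(long *)&y` punning above is undefined behavior under modern C strict-aliasing rules, and `long` is 64 bits on most 64-bit Unix targets. A well-defined sketch of the same magic-constant trick (the function name is mine) uses `memcpy` and a fixed-width integer type:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Same trick as Q_rsqrt, but with memcpy instead of pointer
   punning, and uint32_t so the integer is exactly 32 bits. */
static float rsqrt_approx(float number)
{
    float x2 = number * 0.5f;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);    /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);   /* initial guess via the magic constant */
    memcpy(&y, &i, sizeof y);
    y = y * (1.5f - x2 * y * y); /* one Newton-Raphson refinement step */
    return y;
}
```

One Newton step keeps the relative error under roughly 0.2%, so `rsqrt_approx(4.0f)` lands very close to `1.0f / sqrtf(4.0f)` = 0.5f; compilers typically turn the `memcpy` calls into plain register moves.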

But if I have a CPU designed for math, the library sqrtf(float) might be faster.

And if I’m doing high-precision calculation, I might want the Numerical Recipes square-root implementation.

And if my web server is already congested, it may be faster to let the database do the square root as part of the SQL request, given that IEEE floating-point routines are often better implemented in databases than in the standard C library.

So my point is that “faster” is a context issue, not a coding issue, and as Steve McConnell pointed out, aiming for code readability often gives good results in maintainability and in performance.

I guess that is why we developers who focus on code are the problem, because coding is not about programming, but understanding.

Speed will always matter; the thing is, what took 10 ms on last year’s hardware takes 1 ms today. So let’s say that 30 ms is good enough speed for this imaginary task - why bother optimizing more? It will only get faster with more modern hardware. Well, that’s the user’s point of view; from the server/mainframe point of view, better-optimized code translates directly into money saved (power, hardware, refrigeration, etc.).

This was a post of personal interest to me. I used to belong to Bix all those years ago and I remember when the challenge went out.

There were a lot of historical conversations shaping the future of PC programming on Bix that are pretty much lost forever.

Rich Shealer

Somebody fetch Scott Hanselman.

Dan,

Well, as a nit-picker I guess I am going to have to say that there IS such a thing as fastest code. On a single-core processor, it is code that requires only 1 machine instruction. You can’t improve on it (other than speeding up the clock).

Other than that… the guy is just flat-out right. Arrogance is the bane of productivity and efficiency.

-Doug

[joke]
…it’s the compiler, not the programming language, you stupid, pointy-haired, suit-wearing minion!!!
[/joke]

When Abrash states “From the user’s perspective, performance is fundamental”, I tend to disagree. IMO, from the user’s perspective the correctness of the application is even more important - it’s no use having the fastest word-count functionality available if it’s off by 5% in some cases.
Or: I would rather wait some seconds or minutes for the byte.com page to load than look at the “java.lang.Error: *** Cannot replace DB pool!” message that comes up instead…

The thing about “Zen” is that it could’ve been half the size if Abrash hadn’t repeated himself so much. It seemed that every other page he was adjuring us to measure performance (instead of just guessing). Valuable the first time, but after a while, it felt a little patronizing.

That didn’t stop me from having a great conversation with him at a conference, which almost led to us (Sierra On-Line) luring him out to join us. Complete lack of interest by upper management scotched that idea, but I always regret not having had a chance to work with a truly gifted person.

///ark

When Abrash states “From the user’s perspective, performance is fundamental”, I tend to disagree

Really? Because I distinctly remember switching to Google because it was good and crazy fast compared to AltaVista.

Speed still matters.
http://www.codinghorror.com/blog/archives/000722.html

I had this book too, and it is a classic. It introduced me to pure assembly computing when I was still a student.

Nice one. After the reminiscing induced by this post, an idea came up: why shouldn’t I try coding in bytecodes or MSIL?

There are still developers out there who take assembly seriously. Take the Naughty Dog job listing, for instance:

Assembler Programmer
Assembly programming alas is all too often considered a dying art form; however, this is definitely not the case at Naughty Dog. We take assembly programming VERY seriously and use assembly extensively in our games. We’re looking for someone who really enjoys getting down to the metal and writing highly optimized assembler code. This person should have a very solid grasp on caching issues, processor pipelining, and latencies. Strong 3D math skills are a big plus, and good fundamental 3D math skills are required. Past experience writing 3D renderers is a big plus. We’re not looking for the occasional down coder, we’re looking for someone passionate about assembly, and only people with extensive past assembly experience will be considered.

What happened to the site? I find it hard to believe that Byte has been down for days!

If you’re convinced that optimization is useless today, have your imagination refreshed.

jeff - your title, does it refer to Steel Magnolias? If so, wow, I am even MORE impressed by you!

“There ain’t no such thing as natural beauty”

OK “mere Visual Basic programmer” (which, of course, you no longer are)
I never even got beyond BASIC, having done my original training in machine code on an Elliott 405 valve machine in the early 60s. I don’t need to do any programming these days BUT I still find the logic of REAL programming fascinating and will proceed to try to find this book in a library.

I was given a handout on Bumper-Sticker Computer Science from Jon Bentley many years ago that has several rules related to this subject, but two stories in the article have always stood out to me.

In the 1960s, Vic Vyssotsky spent a week modifying a Fortran compiler to make an already-correct routine very fast. The bug he introduced did not surface for two years, because the code had never been called in over 100,000 compilations. His premature optimization was worse than wasted: it made a good program bad.

Bill Wulf of Tartan Laboratories took only a brief conversation to convince me that “if a program doesn’t work, it doesn’t matter how fast it runs” isn’t quite as true as I once thought. He used the case of a document production system that we both used. Although it was faster than its predecessor, at times it seemed excruciatingly slow: it took several hours to compile a book. Wulf’s clincher went like this:
“That program, like any other large system, today has 10 known, but minor, bugs. Next month, it will have 10 different known bugs. If you could choose between removing the 10 current bugs or making the program run 10 times faster, which would you pick?”

haha