The Sad Tragedy of Micro-Optimization Theater

I always try to use StringBuilder for consistency and the sake of correctness now. However, += is something I’ve used in the past and I’m not ashamed of it. Like you said, it’s good enough. Replace just looks ugly to me, not something I would ever have considered using like that. Interesting. :)

Doesn’t seem to me that Replace has any advantage over Format, with respect to readability.

Hi Jeff,

The time may not vary by much, but how much memory is used by each, and how many objects are created?

I know RAM is cheap, and a couple of extra objects won’t matter much, but what about the GC time spent collecting all the leftover objects? All those extra objects on a site the size of yours really have to add up, right?

Nick

In a scenario like the above I’d probably use StringBuilder.AppendFormat() but now I’ll think twice about it. That’s one of the reasons I advocate the introduction of String Interpolation in C#. Let the compiler decide what is the best implementation. I just want my code to be more readable.

Very good article. This is the kind of stuff that causes me to keep following CodingHorror.

I always use StringBuilder, as I read years ago that it was the fastest way to combine strings. I took over maintenance of an application written by a consultant that used method #1 above. It took 4 hours to build output files. I switched to StringBuilder and it took 20 minutes.
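A minimal sketch of the difference described above, written as a C# top-level program. The helper names `JoinWithConcat` and `JoinWithBuilder` are hypothetical; the consultant's actual code is not shown in this thread. The first version copies everything built so far on every pass, which is where the time goes once the loop gets long.

```csharp
using System;
using System.Text;

// Repeated += allocates a new string on every pass and copies
// everything built so far into it.
static string JoinWithConcat(string[] lines)
{
    string result = "";
    foreach (string line in lines)
    {
        result += line + "\n";
    }
    return result;
}

// StringBuilder appends into a growable buffer and creates the
// final string once, in ToString().
static string JoinWithBuilder(string[] lines)
{
    var sb = new StringBuilder();
    foreach (string line in lines)
    {
        sb.Append(line).Append('\n');
    }
    return sb.ToString();
}

string[] sample = { "alpha", "beta", "gamma" };

// Both approaches produce the same text; only the cost of getting there differs.
Console.WriteLine(JoinWithConcat(sample) == JoinWithBuilder(sample));
```

For three short strings the two are indistinguishable in practice; the gap only shows up when the loop runs many thousands of times over a growing result.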

Jeff -

Since the html/xml/etc explosion I’ve been waiting to see the emergence of a string-based language, such as a modernized Snobol. I’m still waiting. Perl, and other scripting languages seem to be some sort of an answer, but why interpret when I can compile? And the cloud/grid/VM juggernaut seems to be removing any advantage C ever had. Why iterate over a string when you could use Snobol’s built-in break operator? Plus built-in overloading to attract the C++ crowd: there’s not much work in redefining the + sign.

Go Snobol5 or 6 or whatever’s next.

  • Lepto

I do a lot of JavaScript, and have recently fallen in love with regular expressions. Strings really do work well as long as you have the knowledge to mold them as needed.

The Java team at the last place I worked created their own mutable String class, because GCs were eating their lunch, as in they were spending 30+% of their time in GC.
My personal preference is to micro-optimize where it’s easy. That way I get into the habit, and in cases where it would matter (such as concatenating strings in a loop), I automatically do the right thing.
Most of the complaints that I’ve seen against premature optimization are about it wasting time and hurting readability. A lot of the little stuff does neither, so why not?

These tiny string operations are small enough to fit into the processor’s L1 cache, so I would expect to see very little difference between them. If most of the string ops on your site are like that, you’re probably right in your conclusions. But if you sometimes build pages or large DIVs that way, you might well see different results. StringBuilder exists for a reason.

Urm, it isn’t that you aren’t supposed to use concatenation inside the loop; it’s that you aren’t supposed to use the loop to build the concatenated string. Your profiles are all just building a tiny string: you have a few bytes of redundant copies, and then it gets wiped from the stack when the function returns.

Overall a pretty good article and one that I agree with. However, I take issue with one thing:

“In my experience, techniques that abuse memory also tend to take a lot of clock time.”

This is sometimes true, but also sometimes false. There is often a trade-off between CPU and memory usage. As a simple example, consider things like lookup tables. There are lots of algorithms and tricks that use more memory to save clock time. But these techniques shouldn’t be used for micro-optimization, of course. ;)

Agreed, it usually doesn’t matter.

One interesting tidbit is that for string concats outside of a loop, the compiler will rewrite things like ‘string s = "<h1>" + title + "</h1>"’ to ‘string s = String.Concat("<h1>", title, "</h1>")’.

It’s part of the C# spec that this should be the case. I only found this out after I thought I was awesome doing a micro-optimization by using String.Concat instead of the + syntax.

It’s just part of a broader lesson that the compiler and runtime can often make the naive way automatically faster without any additional work while bloated code tends to stay the same speed forever.
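To make the two spellings concrete, here is a small sketch, assuming the example in the comment is an h1 heading wrapped around a title. It only checks that both forms yield the same string; what the compiler actually emits for the + version is best confirmed by inspecting the IL for your own compiler version.

```csharp
using System;

string title = "Hello";

// Written with the + operator...
string viaPlus = "<h1>" + title + "</h1>";

// ...and with an explicit String.Concat call.
string viaConcat = String.Concat("<h1>", title, "</h1>");

// The two strings have identical contents.
Console.WriteLine(viaPlus == viaConcat);
```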

If you think micro optimizations don’t matter you need to read Michael Abrash’s Graphics Programming Black Book. Also, while web applications make up the majority of the programming jobs I see posted, the majority of processors in the wild are low-powered and provide very limited resources. (Full disclosure: I’m an embedded C/C++ programmer)

IMHO, String.Format() should be used in most circumstances simply because it is by far the most readable of the lot.

If you trade 500ms of CPU time per day against 20 seconds of programmer time when something has to be changed, and repeat this process enough, focusing on anything but readability will seriously burn you.

What about not concatenating the strings at all? Write an input stream that takes as input all the strings forming a page, and then during page rendering read the data directly from the stream. Wouldn’t this be O(1)? (As in, characters per second stays constant as the number of strings increases, while the rendering time is linear in the total character count of all the strings.)

I don’t agree with your conclusion in this case, Jeff. The difference between String.Format – IMO the most readable – and String.Concat is 78ms, or 13%. That is a decent performance bump when you consider that, as you noted, string concatenation is at the core of building web pages.

Personally, I think the best option would be if .NET included a string formatter class, so that you could pass a string to the constructor and it would compile it to use StringBuilder or String.Concat behind the scenes, and avoid the overhead of repeatedly parsing your format string. It would combine the readability of String.Format with the performance of String.Concat.
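A rough sketch of what such a formatter might look like. `CompileFormat` is a hypothetical helper, not part of .NET, and it is written as a function returning a delegate rather than as a class to keep the example short. It assumes only plain {0}, {1}, ... placeholders: no alignment, format specifiers, or escaped braces. The format string is parsed once; each later call just walks the parsed pieces.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: parse the format string once and return a
// delegate that fills in the argument slots on each call.
static Func<object[], string> CompileFormat(string format)
{
    // Each part is a literal (Index == -1) or an argument slot (Index >= 0).
    var parts = new List<(string Literal, int Index)>();
    int pos = 0;
    while (pos < format.Length)
    {
        int open = format.IndexOf('{', pos);
        if (open < 0)
        {
            parts.Add((format.Substring(pos), -1));   // trailing literal
            break;
        }
        int close = format.IndexOf('}', open);
        if (close < 0) throw new FormatException("Unclosed placeholder.");
        if (open > pos) parts.Add((format.Substring(pos, open - pos), -1));
        parts.Add(("", int.Parse(format.Substring(open + 1, close - open - 1))));
        pos = close + 1;
    }

    // No parsing happens here, only appends.
    return args =>
    {
        var sb = new StringBuilder();
        foreach (var (literal, index) in parts)
        {
            if (index < 0) sb.Append(literal);
            else sb.Append(args[index]);
        }
        return sb.ToString();
    };
}

var heading = CompileFormat("<h1>{0}</h1><p>{1}</p>");
Console.WriteLine(heading(new object[] { "Title", 42 }));
```

Whether this actually beats String.Format for short strings would need measuring; the sketch only shows that the parsing step can be moved out of the hot path.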

I’m finding it a bit difficult to trust your tests because of the way you worded your question. (You’re one of the guys who taught me to test for myself rather than accept a wild blog rant!)

Reading through your code examples and then extrapolating to a 10K-iteration run, I would expect the StringBuilder case to be the worst performer.

Move the new out of the loop and use StringBuilder’s Remove method to reset the ‘string’. Granted, you may have done this in your test, but that’s not how you worded the question!
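The suggestion above could be sketched like this. `BuildRows` and the div markup are hypothetical stand-ins, since the benchmark code itself is not reproduced in this thread. One builder is allocated before the loop and emptied with Remove at the start of each pass.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static List<string> BuildRows(string[] names)
{
    var rows = new List<string>(names.Length);
    var sb = new StringBuilder();        // allocated once, outside the loop

    foreach (string name in names)
    {
        sb.Remove(0, sb.Length);         // empty the buffer instead of creating a new builder
        sb.Append("<div>").Append(name).Append("</div>");
        rows.Add(sb.ToString());
    }
    return rows;
}

foreach (string row in BuildRows(new[] { "Ann", "Bob" }))
{
    Console.WriteLine(row);
}
```

On .NET 4 and later, `sb.Clear()` does the same reset and reads a little more clearly.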

Dennis - you’re spot on. Jeff is in left field here. “Just buy more hardware” doesn’t WORK. Even if I had the ability to use a managed language, I wouldn’t! Control over every single bit is, well, ideal for a lot of applications. Makes you a better programmer too… I’m with Joel more times than not in the podcasts… now I’m starting to see why.

The correct answer is that if you’re concatenating HTML, you’re doing it wrong in the first place. Use an HTML templating language. The people maintaining your code after you will thank you (currently, you risk anything from open mockery to significant property damage).