Micro-Optimization and Meatballs

In my previous entry on the real cost of performance, there were some complaints that my code's slow and it sucks. If I had a nickel for every time someone told me that, I could have retired years ago. Let's take a look at the specific complaint that the s <> "" comparison is inefficient, using low-level Windows API timing in the Stopwatch class:
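A minimal C# sketch of this kind of measurement, not the original benchmark: it uses the framework's own System.Diagnostics.Stopwatch (added in .NET 2.0) rather than the custom Stopwatch class the post refers to, and compares s != "" with s.Length != 0. The class name, method names, and iteration count are assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; the names here are illustrative only.
public static class EmptyCheckTiming
{
    // Counts how often the equality-based check reports "not empty".
    public static int CountByEquality(string s, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (s != "") hits++;
        }
        return hits;
    }

    // Same loop using the Length property (s must not be null here).
    public static int CountByLength(string s, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (s.Length != 0) hits++;
        }
        return hits;
    }

    // Runs the given work once and returns the elapsed milliseconds.
    public static long TimeMs(Action work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

A call such as `EmptyCheckTiming.TimeMs(() => EmptyCheckTiming.CountByEquality("meatballs", 1000000))` gives one rough number per approach. Results will vary with the runtime, the JIT, and the machine, so treat any single run as indicative at best.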


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2005/01/micro-optimization-and-meatballs.html

Yeah, I can’t help thinking that the compiler should be smart enough to optimize

s = ""

into what you posted, Christian.

That’s the other problem with micro-optimizations. You gotta ask yourself, why am I doing the compiler’s work? Something’s wrong here.

Something to be aware of: in C#, the condition in the following code evaluates to false:

string s = null;
if (s == "")
{
    System.Console.WriteLine("String is null");
}
else
{
    System.Console.WriteLine("String is NOT null");
}

whereas in VB the following code evaluates to true:

Dim s As String = Nothing
If s = "" Then
    System.Console.WriteLine("String is null!")
Else
    System.Console.WriteLine("String is NOT null!")
End If

C# has determined that “empty string” is not the same as null. This, to me, seems like the correct behavior.

In C++, an empty C-style string is a pointer to an allocated block of memory that holds only the terminator: 1 byte (or 2 if Unicode). Comparing an empty string with null in C++ compares the pointer addresses; I’m not sure whether C# does the same without cracking open the IL. So the condition in the following code always evaluates to false.

const char *p = "";
if (p == NULL)
{
    // never reached: p points at a valid, empty string
}

I agree that micro-optimization can be a real headache when performance experts (so-called) break code. However, in this instance I would say that comparing a string to empty string when the string is null is bad programming, for the reason stated above. Also, it’s not clear from the code whether the intention is to check for null, or to ignore it and really look for an empty string. I much prefer the following:

C#

if (s == null || s.Length == 0)

VB

If s Is Nothing OrElse s.Length = 0 Then
'OrElse short-circuits, so s.Length is never evaluated when s is Nothing.

The best of both worlds…

Public Class NullString
    Private Sub New()
    End Sub
    Public Shared Function Equals(ByVal tested As String) As Boolean
        If tested Is Nothing OrElse 0 = tested.Length Then
            Return True
        End If
        Return False
    End Function
End Class

If NullString.Equals(myString) Then
    'Do something
End If

Indeed, anyone who runs around adding incorrect micro-optimizations needs to be tasked with trimming my lawn with nail clippers until they’re broken of the habit. :)

FYI, the next version of the framework will have a static String.IsNullOrEmpty(s) method. (I haven’t run it through a profiler to see what its comparative cost is yet.)
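For reference, a small sketch of how the built-in method lines up with the hand-written check suggested earlier in the thread. string.IsNullOrEmpty is the real framework method (.NET 2.0 and later); the NullOrEmptyDemo class and its members are hypothetical names used only for this comparison.

```csharp
using System;

// Hypothetical demo class comparing the two styles of check.
public static class NullOrEmptyDemo
{
    // Hand-written check: test for null first so Length is never read on null.
    public static bool ManualCheck(string s)
    {
        return s == null || s.Length == 0;
    }

    // Returns true when the built-in method and the manual check agree for s.
    public static bool Agrees(string s)
    {
        return string.IsNullOrEmpty(s) == ManualCheck(s);
    }
}
```

Both forms treat null and "" the same way, and both treat a string containing only whitespace as non-empty. The built-in method mainly buys clarity of intent at the call site.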

However, in this instance I would say that comparing a string to empty string when the string is null is bad programming, for the reason stated above.

Shrug. That’s how it always worked in VB.

I see where you’re coming from, but the null/empty distinction isn’t a very meaningful one. We do this all the time for database fields-- wrapper functions that substitute Null with default values of 0 (integer), “” (string), and False (bool).

Probably the best compromise is the Whidbey String.IsNullOrEmpty() function Mr. Lippert pointed out; that makes it pretty clear what is happening.

I’m not familiar enough with Whidbey yet, but another question comes to mind immediately: I seem to remember reading about nullable value types like int, using a syntax like int? myValue = null; (atrocious). What are we going to do then? Integer.IsNullOrZero()? Double.IsNullOrZero(), ad nauseam? I suppose we could use generics, but this is one ugly pattern.
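One way the generics idea could look, as a sketch only. Nullable&lt;T&gt;.GetValueOrDefault() is a real framework method that returns default(T) when the value is null; IsNullOrZero and IsNullOrDefault are hypothetical helper names and are not part of the framework.

```csharp
using System;

// Hypothetical helpers; the framework does not ship methods with these names.
public static class NullableDemo
{
    // True when value is null or zero: GetValueOrDefault() yields 0 for a null int?.
    public static bool IsNullOrZero(int? value)
    {
        return value.GetValueOrDefault() == 0;
    }

    // Generic variant: true when value is null or equals default(T).
    public static bool IsNullOrDefault<T>(T? value) where T : struct
    {
        return value.GetValueOrDefault().Equals(default(T));
    }
}
```

With the generic version, a single method covers int?, double?, bool? and so on, for example `NullableDemo.IsNullOrDefault<double>(myValue)`. Whether collapsing null and the default value is appropriate still depends on what null means in the data at hand.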

You might consider the Marshal.PrelinkAll method for your Stopwatch class to be sure to move the cost of resolving extern methods outside of the timing.

Great tip, I didn’t know that was possible!
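A rough sketch of what the PrelinkAll suggestion could look like. It assumes the timer wraps the Win32 QueryPerformanceCounter and QueryPerformanceFrequency functions through P/Invoke; HiResCounter is a hypothetical stand-in, not the Stopwatch class from the post, and it only works on Windows. Marshal.PrelinkAll(Type) is the real framework call.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a P/Invoke-based timer; Windows only.
public static class HiResCounter
{
    [DllImport("kernel32.dll")]
    private static extern bool QueryPerformanceCounter(out long count);

    [DllImport("kernel32.dll")]
    private static extern bool QueryPerformanceFrequency(out long frequency);

    // Resolves the extern methods up front, so the first timed call does not
    // also pay for locating the native entry points.
    public static void Prelink()
    {
        Marshal.PrelinkAll(typeof(HiResCounter));
    }

    // Current value of the high-resolution counter.
    public static long Ticks()
    {
        long count;
        QueryPerformanceCounter(out count);
        return count;
    }

    // Number of counter ticks per second.
    public static long TicksPerSecond()
    {
        long frequency;
        QueryPerformanceFrequency(out frequency);
        return frequency;
    }
}
```

Calling `HiResCounter.Prelink()` once at startup, before the first measurement, keeps the one-time setup cost out of the numbers being compared.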

In VB there isn’t a distinction between null and empty, but let me assure you that in C# there is, and it’s a huge one. For example, you can’t evaluate s.Length == 0 if s is null, or you get a NullReferenceException. Null means ‘exists nowhere in memory’, which is not the same as ‘empty, but at this place in memory waiting to hold a value’. The methods and properties of a reference type (as strings are in C#) go with the instances unless they are static. Length is most definitely instance-specific, because it has no meaning in a static context: we don’t care what the length of String is, we care what the length of this instance of string is. So if you go to ‘nowhere in memory’ and try to execute a method on it, even something like .get_Length(), which is what Length is internally, then you’re going to blow up.

Remember: null = nowhere, non-existent, while empty = somewhere, not holding data but waiting to, with enough space for it.
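The behavior described in this comment is easy to see in a few lines. This is a sketch for illustration: NullVersusEmpty and LengthOrMinusOne are hypothetical names, and catching NullReferenceException is done here only to make the failure visible, not as a recommended pattern.

```csharp
using System;

// Hypothetical demo of reading Length on a null versus an empty string.
public static class NullVersusEmpty
{
    // Returns the length of s, or -1 if reading Length throws because s is null.
    public static int LengthOrMinusOne(string s)
    {
        try
        {
            return s.Length;
        }
        catch (NullReferenceException)
        {
            return -1;
        }
    }
}
```

Passing "" returns 0 because there is a real string instance to ask, while passing null ends up in the catch block. In normal code the usual approach is to test for null before touching any instance member.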

Off topic
Hey Jeff, your comment boxes (background: black font:darkgray) are a ‘Reading Horror’! I have to select your text just so I can read it, like in the olden days when people put invisible text in their webpages as easter eggs!

On topic
When you’re writing something in a language like VB, Python, or Ruby, these little performance tricks are nearly meaningless. They are also meaningless when making a common GUI program, a rarely used automation script, or even a little web app. Totally agree with you here.
However, when you’re working on a high-performance system like a database server or a game, performance makes a big difference, because all of a sudden your world is measured by the millisecond, no longer by the second that most end users need.
30 ms for 1,000,000 operations can be a bottleneck.
In the end it’s all about appropriate programming: like a good carpenter, use the right tools for the right job. And sometimes optimization is the right tool (however, if you need performance at that scale, don’t use VB or a scripting language). :)


Just continuing the link rot prevention crusade: the Stopwatch class link is dead. I’m guessing that the right one is this one: http://blog.codinghorror.com/a-stopwatch-class-for-net-11/

Sorry for the revive.