Hungarian Wars

I've found a number of blog posts about the pros and cons of Simonyi's Hungarian Notation, most notably, this blog post commenting on the extreme polarity of the reprinted MSDN article rating:


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2004/06/hungarian-wars.html

Well, this may strike some as excessive typing, but I just add some descriptive text to the end:

MainForm
SettingsForm
AddressNotFoundException

etc…

I do tend to be a tad anal about self-documenting code, though I don’t go overboard (IMO).

Also, I use the prepended-underscore notation to denote private variables, a habit I picked up in the beautiful Python language.

I’m definitely open to suggestion w/r/t naming of general objects. objEverything is out of the question.

Ditto on the underscore, it’s an incredibly effective and very simple convention. The best kind!

The other thing nobody does any more: declare constants in UPPERCASE. Remember that?
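For anyone who hasn't run into them, here's roughly what those two conventions look like in Python. This is only a sketch: `MAX_RETRIES`, `SettingsForm` and its members are made-up names, and the underscore is a convention the language does not enforce.

```python
# Hypothetical names throughout; only the naming conventions matter here.
MAX_RETRIES = 3  # constant, declared in UPPERCASE


class SettingsForm:
    def __init__(self, title):
        self.title = title    # public attribute
        self._dirty = False   # leading underscore: private by convention only

    def mark_changed(self):
        self._dirty = True

    def needs_save(self):
        return self._dirty
```

Nothing stops outside code from touching `_dirty`; the underscore just tells the reader it isn't meant to.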

I have an alternate approach to the one that you showed. I agree wholeheartedly that dsCustomers is overkill. You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round. Don’t trim to “ds”, trim to “customers”. Yes, the type becomes hidden if you use Notepad and scroll, but it’s generally more obvious what the type is by context than it is what the data is. At least, that’s my experience.

In the truly trivial case I would agree - a tight loop on an Iterator (yeah, I’m a Java guy) would use the name “iter” for the iterator - but once you get into nested loops or anything else, having them be “customers” and “addresses” is much nicer than “iter” and “iter2” or, more aggravatingly, some bizarre hybrid like “iter” and “addr” added by someone trying to change as few lines of code as possible (more applicable to large shops).

You chose to emphasize the type, but for everything other than the most trivial code (and I would suggest that even trivial code such as the example tends to become less trivial over time), I would go the other way 'round

Definitely, if there’s more than 10-12 lines of code. I tend to write fairly small, focused functions 80 percent of the time. In the 20 percent where I can’t, I definitely deviate from the “very short variable name based on type” rule.

Nested loops would be another valid reason, but for some reason, I rarely need to nest loops… I think because I tend to break that into two functions: a plural one for operating on “a bunch of” and a singular one for operating on “one of”. This way the plural function calls the singular one, and nested loops aren’t present.
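A minimal sketch of that plural/singular split, written in Python with invented function names and a made-up dictionary layout for customers and addresses:

```python
# Hypothetical data shape: a customer is a dict with "name" and "addresses".

def format_address(address):
    """Singular: operate on one address."""
    return f"{address['street']}, {address['city']}"


def format_addresses(addresses):
    """Plural: loop once and delegate each item to the singular function."""
    return [format_address(address) for address in addresses]


def format_customer(customer):
    """Singular: one customer; its addresses go to the plural helper."""
    return {
        "name": customer["name"],
        "addresses": format_addresses(customer["addresses"]),
    }


def format_customers(customers):
    """Plural: no nested loop appears here, only a call per customer."""
    return [format_customer(customer) for customer in customers]
```

Each function contains at most one loop, so short local names like `address` and `customer` stay unambiguous.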

Anyway, as with all guidelines, YMMV. I think the golden rule is to always try to keep simplicity as your overarching goal, whatever you’re writing.

I use almost the exact same notation, but I only use simplified variable names (int i, DataSet ds etc…) if the variable is confined to a loop. Otherwise, I do think it’s important to use a descriptor as part of the variable name even if it’s just generic (cmdCommand). This gives another level of differentiation IMHO and keeps your functions from being full of “ads = dbr.Property;” which makes things difficult sometimes.

Either way, as long as SOME sort of standard is adhered to, it makes code re-use and refactoring much easier.

My approach has changed a bit since I wrote this. I use the “add the type to the end of the variable” style most often now:

CancelButton
ClickEvent

I think this is a lot more .NET friendly than the “classic” 3-character prefix, e.g.

btnCancel
evtClick

I’ve also stopped trying to distinguish strings and integers. In the above example,

_strCustomerName becomes _CustomerName
intCustomerID becomes CustomerID

Pretty much every single article on MSDN shows the same rating polarity that the Hungarian notation article has. I figure there is some bot that visits MSDN and marks every article with a 1. I don’t think there is any inference to be drawn from the polarity of opinions on MSDN: it’s the same for every article.

Traipsing through old posts on a slow Friday afternoon…

The idea of using extremely short variable names for tightly-scoped variables triggered some really old synaptic paths in my brain. When I was at university (circa 1986), the world was just coming out of the era where the languages themselves restricted the length of variable names (Basic only “saw” 2 significant characters, FORTRAN 6 .OR. 8, etc.). In most of my classes, instructors were encouraging longer variable names as a good practice.

Then I took a course based on Modula-2, a language designed by Niklaus Wirth, of Pascal fame. Every single code sample in the reference manual (written by Wirth) contained only single-character variable names. I found that strangely discordant with what I was being taught in class, given Wirth’s reputation.

But the code examples were all less than ten lines or so, making the scope/lifetime of those variables very short. I remember finding it very easy to follow, because the variable names didn’t get in the way of focusing on the language feature being described.

Using today’s OO languages and feature-rich IDEs, I have dispensed with Hungarian prefixes and the numerous bastardizations thereof. I name a variable what it means in the application’s domain, like “checkAmount” for a check amount, for example. I let the compiler catch typos (or have IntelliSense prevent them in the first place), and use the “Jump to definition/Jump back” keystrokes if I ever find myself questioning a variable’s type. It’s such a quick trip there and back that it’s not worth junking up the variable names just to avoid the quick F12 and Ctrl-dash. (VS2008 mappings)

Probably not quite worth $0.02, but since nobody’s paying me anyway…

I agree, I’ve switched to extremely short name / short scope local variables, too.

Ouch … Tons of people have COMPLETELY missed the point of Hungarian notation. Types are checked by the compiler. YOU DON’T NEED TO CHECK VISUALLY, THE COMPILER DOES IT FOR YOU. What Hungarian is for is to embed SEMANTIC information in a name, not SYNTACTIC. In the original, i was a prefix indicating an index into an array, r was a prefix indicating a row, c indicating a column … things a compiler can’t check for you. It is unfortunate that some idiot mistook the original intentions and started using i to mean an integer, c to mean char and so on. It really is a neat system … in its original context.

Check out http://www.joelonsoftware.com/articles/Wrong.html for a great discussion on this. Also the More Reading section has some great stuff.
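To make the semantic-vs-syntactic distinction concrete, here's a small Python sketch. The `row_` and `col_` prefixes and the `cell_at` helper are my own illustration, not the prefixes from the original notation; the point is only that both values have the same type, so the name is what tells you which kind of value you're holding.

```python
def cell_at(grid, row_index, col_index):
    """Return the cell at the given row and column of a list-of-lists grid."""
    return grid[row_index][col_index]


grid = [
    ["a", "b", "c"],
    ["d", "e", "f"],
]

# Both values are plain ints; only the prefix says what kind of int.
row_last = len(grid) - 1
col_last = len(grid[0]) - 1

value = cell_at(grid, row_last, col_last)

# A swapped call such as cell_at(grid, col_last, row_last) passes every
# type check, but the prefixes make the mistake visible when reading it.
```

No type checker would complain about the swapped call, since both arguments are integers. The prefixes are the only thing that makes it look wrong.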

Dear Jeff,
Please write an update to this article, with your current variable naming scheme. I’d be very interested to learn the opinions of your fan club.

Cheers