The Case For Case Insensitivity

One of the most pernicious problems with C-based languages is that they're case-sensitive. While this decision may have made sense in 1972 when the language was created, one wonders why the sins of Kernighan and Ritchie have been blindly perpetuated for the last thirty-three years.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2005/12/the-case-for-case-insensitivity.html

The two languages mentioned are both dynamic languages. Case sensitivity has no business in a dynamic language, in my opinion.

I think it’s a mistake to draw an analogy from standard written English, i.e., the KEANU REEVES example, because in written English case does make a difference: if not always in meaning (though sometimes it does: compare “us” to “US”), then in function, where sentence-initial words are capitalized, proper nouns are capitalized, etc.

Anyway, I agree wholeheartedly that case sensitivity in entity names invites problems; the only sensible way to avoid those is to code with well-defined conventions for capitalization (e.g. Pascal casing). Has anyone come up with a good example of where using the same name with different capitalization for different entities is a good idea?

“but they are scripting language that do not resolve identifiers at parse-time”

This is a problem with scripting languages in general. Case sensitivity just happens to be one of the many ways that programmers can generate runtime errors. And this is exactly why I hate scripting languages. I seriously don’t have any problem with case sensitivity when working with compiled languages. So are we really talking about case sensitivity or the shortcomings of scripting languages here?

Use a good compiled language instead… :wink:

I agree, case sensitivity has got to go. I’ve just started with Python (or is it python?) and have had the same problems with capitalization errors.
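For anyone who hasn’t been bitten yet, here is a minimal sketch of how this tends to show up in Python. The function name `total_price` and the discount rule are made up for illustration; the point is only that the miscapitalized name is not noticed until the line that uses it actually runs.

```python
def total_price(prices):
    """Sum the prices, applying a 10% discount to totals over 100."""
    total = 0
    for price in prices:
        total += price
    if total > 100:
        # Typo: 'Total' differs from 'total' only by case. Python looks the
        # name up when this line executes, not when the file is loaded.
        return Total * 0.9
    return total


# The common path works, so the typo can go unnoticed for a long time.
print(total_price([10, 20]))

# The rarely taken branch is where the error finally surfaces.
try:
    total_price([200])
except NameError as error:
    print("runtime failure:", error)
```

A linter or an IDE with name checking would usually flag `Total` before the code runs, which is roughly the safety net the compiled-language crowd is describing.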

I’d go even further and say that the existence of two identifiers in the same scope differentiated only by capitalization is a code smell. Something’s wrong if you can’t come up with identifiers different enough to not rely on case sensitivity. Either you’re not choosing very meaningful names, or there’s a flaw in the design. I prefer a language that doesn’t let me do it at all, because it forces me to address that smell much sooner.
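In a case-sensitive language you can still check for this smell yourself. Below is a rough sketch of such a check, assuming you already have a list of the identifiers in a scope; the helper name `case_collisions` is hypothetical and not part of any real linter.

```python
from collections import defaultdict


def case_collisions(names):
    """Group identifiers that differ only by capitalization."""
    groups = defaultdict(set)
    for name in names:
        # casefold() gives a case-neutral key for grouping.
        groups[name.casefold()].add(name)
    # Keep only the keys shared by more than one distinct spelling.
    return {key: sorted(spellings)
            for key, spellings in groups.items()
            if len(spellings) > 1}


collisions = case_collisions(["horse", "Horse", "frog", "HtmlTag", "HTMLTag"])
print(collisions)
```

Anything this reports is a candidate for renaming, whether or not the language would have let it through.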

Why did the decision make more sense in 1972? (Except for the fact, as everyone pointed out, that the problem is even worse with interpreted languages.)
Were they worried about the performance penalty on the compiler?
Pascal came earlier and is not case-sensitive.

Remember that requiring exact case matching is much easier to support. Either the two names are exact matches or they aren’t. You can end up with canonicalization issues in compilers when trying to determine whether two strings are actually the “same” (differing only by case). Case-insensitive comparisons are definitely much harder to do when you are dealing with more than just the standard English character set. There are now a whole bunch of extra string comparison functions in .NET to handle this because it is so difficult. So why make your compiler work that much harder just so that programmers can be sloppy coders? It just doesn’t make any sense to me…
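To make the “more than just the standard English character set” point concrete, here is a small sketch in Python rather than .NET. It only uses the built-in `str.lower()` and `str.casefold()` methods; the sample strings are my own choice.

```python
# German sharp s: the lowercase and uppercase spellings of the same word.
german_lower = "stra\u00dfe"   # "straße"
german_upper = "STRASSE"

# Naive approach: lowercase both sides and compare.
naive_equal = german_lower.lower() == german_upper.lower()

# Unicode case folding, which is designed for caseless matching.
folded_equal = german_lower.casefold() == german_upper.casefold()

# Capital I with dot above (used in Turkish) lowercases to two code points
# in Python: "i" followed by a combining dot above.
dotted_i_lower = "\u0130".lower()

print(naive_equal, folded_equal, len(dotted_i_lower))
```

The naive comparison and the folded comparison disagree on the same pair of strings, and lowercasing can even change a string’s length. That is the kind of corner a case-insensitive compiler would have to pick a rule for.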

Why not just code in assembly? Makes the compiler work much easier, and you don’t have to deal with sloppy coders who can’t manage register-allocation.

Making things easy for humans so that they can be sloppy is what computers are all about.

“So why make your compiler work that much harder just so that programmers can be sloppy coders”

From the “I jist don’t git that whole compyooter thang” department…

I think so, although I’ve never seen this stated definitively about the original 1972 C language. It was always positioned as a “mid-level language”, i.e., one step up from assembly but sacrificing very little speed.

And I think that’s because Pascal was a true “high level language”, whereas C was sort of intended as a type of portable cross-machine assembly language.

I can’t cite any sources to support these positions, though… does anyone have any?

It’s not about “sloppy coding”. It’s about the half-hour you spent tearing your hair out because you didn’t notice that Object != object.

It’s ironic that any developer would actually argue that case sensitivity helps us write code, because in my experience (and in many other developers’ experience, as cited in my post) it’s a giant productivity time sink: a net negative!

Philosophical and religious differences aside, it has to be about results. Can anyone cite hard data supporting the idea that case-sensitivity lets us program faster?

I can’t give you any official sources, but it’s been said to me a quadrillion times, so it’s at least within the realm of computing urban legend.

I suspect the problem with Python and PHP the original writer has is that it’s not entirely case (in)sensitive. Pick one or the other and stick with it.

Wait a minute… what about the whole VB option strict nonsense? I can point to many ways that forcing case sensitivity helps us be more efficient.

Everyone here who is complaining is also probably using scripting languages that CAN’T catch casing errors as you code. I just don’t consider this to be a casing issue. It’s a scripting issue. Call it what it is and stop blaming your “poor productivity” on casing. You are just as likely to type in a completely wrong identifier as you are to type one that simply differs by case, so get over it.

As I said before, I don’t have any problems with casing. I use C# and Java. If you choose to use a scripting language to “boost productivity” then that is your fault.

If I came across code in source control that used inconsistent casing in a language that allowed it, I’d fix it.

Yup. In a case insensitive language, you’d end up bonking the previous value. In a case sensitive language, you’d get a compiler error.
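A toy model of the “bonking” half of that, for what it’s worth. This is not how any real case-insensitive language is implemented; `CaseInsensitiveNamespace` is a hypothetical dict subclass that just normalizes names before storing them.

```python
class CaseInsensitiveNamespace(dict):
    """Toy symbol table that treats names differing only by case as one."""

    def __setitem__(self, name, value):
        super().__setitem__(name.casefold(), value)

    def __getitem__(self, name):
        return super().__getitem__(name.casefold())


insensitive = CaseInsensitiveNamespace()
insensitive["horse"] = "green"
insensitive["Horse"] = "blue"    # same slot, so the old value is replaced

sensitive = {}
sensitive["horse"] = "green"
sensitive["Horse"] = "blue"      # a second, separate entry

print(len(insensitive), insensitive["horse"])
print(len(sensitive), sensitive["horse"])
```

In the case-sensitive table the two spellings stay distinct, which is what gives a compiler with declared variables the chance to reject the undeclared one.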

Not even close. Human beings remember words because they can make associations with the meaning of the words. Casing is an arbitrary distinction; it is disassociated from any meaning in the thing it refers to. The need to distinguish identifiers by casing means that you’ve failed to choose identifiers that capture the full meaning of the thing you are using them to represent.

It is objectively easier to remember to use “frog” when you are dealing with a frog and “horse” when you are dealing with a horse than it is to remember that “Horse” refers to the blue horse and “horse” refers to the green horse. Call them “blueHorse” and “greenHorse”, and you won’t need case sensitivity.

What he is saying is that it’s just as easy to mistype “frog” as “forg” as it is to mistype it as “Frog”.

Unnecessary mixing is nasty too.

The canonical example is the one Scott Hanselman cites; URLs are NOT case sensitive, but the underlying file system CAN be… which means you end up with things like /filename.txt producing a 404 for the file “Filename.txt” because you copied your website from a Windows box to a UNIX box.
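A rough sketch of that mismatch, and of the kind of fallback a server could apply. One caveat: strictly speaking, the URI spec only makes the scheme and host case-insensitive, and how the path is matched is left to the server. The function `resolve_path` and the file list are hypothetical, and no real web server API is being modeled.

```python
def resolve_path(requested, available):
    """Return the stored filename for a request, or None (think: 404)."""
    if requested in available:
        return requested
    # Fallback: accept a match that differs only by case, but only when
    # it is unambiguous.
    wanted = requested.casefold()
    matches = [name for name in available if name.casefold() == wanted]
    return matches[0] if len(matches) == 1 else None


files_on_unix_box = ["Filename.txt", "index.html"]

# Exact, case-sensitive lookup: what a case-sensitive file system does.
strict_hit = "filename.txt" in files_on_unix_box

# Lenient lookup: what the site relied on while it lived on Windows.
lenient_hit = resolve_path("filename.txt", files_on_unix_box)

print(strict_hit, lenient_hit)
```

The ambiguity check matters on the UNIX side, because there both “Filename.txt” and “filename.txt” can exist in the same directory.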

IMO, Unix webservers should honor the DNS insensitive convention since they’re sitting under it. Mixing it up just makes things worse.

And you know what else is nasty? The de-facto default of all lowercase filenames in UNIX to avoid the constant specter of mismatches based on case. Another symptom of the case sensitivity disease.

That’s a completely different issue, but it’s evil for the very same reason: it causes developers to spend extra time debugging. This can be measured empirically; it isn’t a matter of opinion.

So I’m as likely to type in “Cucumber” instead of “HtmlTag” vs. “HTMLTag”? This is news to me…

Now you are really grasping at straws. Of course, calling one thing Horse and another horse to distinguish blue from green would just be plain stupid. That’s not even the issue here. People want to be able to type something like horse and Horse interchangeably without worrying about potentially generating a runtime error. But I still say that even if this were possible for your given scripting language, you would be just as likely to type hrse instead of either horse or Horse. So at what point do you think it’s OK for a language to forgive you for your typing mistakes?!?!

As a side note, I rarely get bitten by typing mistakes during coding. If I code all day long the compiler might catch one or two typing mistakes (if that many). Intellisense and related technologies go a long way toward fixing these types of “problems”. So again, the issue is with scripting languages, not casing. You are spending all day long trying to find these issues because you are using a scripting language.

As a final thought for all of you scripting fanatics, even if “horse” is the same as “Horse”… what about “ho\u0052se”?
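For reference, \u0052 is the escape for a capital “R”, so that spelling is really “hoRse”. A quick sketch using a Python string literal as a stand-in for the identifier (Java and C# accept such escapes inside identifiers themselves):

```python
escaped = "ho\u0052se"

# The escape is resolved before any comparison happens.
same_as_mixed_case = escaped == "hoRse"

# Exact, case-sensitive match against the all-lowercase spelling.
same_as_lower_exact = escaped == "horse"

# Case-insensitive match, done by folding both sides.
same_as_lower_folded = escaped.casefold() == "horse".casefold()

print(escaped, same_as_mixed_case, same_as_lower_exact, same_as_lower_folded)
```

So a case-insensitive language would need to decode escapes first and fold case second, which is one more rule to get right.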