The Case For Case Insensitivity

RiX0R · December 6, 2005, 12:00am

Identifiers that are different things shouldn’t look too much like one another. It’s confusing, and a potential source of bugs.

Example: cur1 and curl are two different identifiers, but you’d easily glance right over the difference, independently of whether they’re used in a case-sensitive language or not.

With that established:

The only point of case insensitivity is being allowed to write the same identifier in different ways. So you could write HTMLTag at the top of your source file, HtmlTag in the middle, and at the end, when you are too tired of typing to hit the shift key, you are conveniently allowed to type htmltag.

Would you do this?

Even if the language I’m programming in is case insensitive, I’ll still case the identifiers the same each time. Doing otherwise just looks messy. And I’m willing to bet a lot of other developers do it the same way.

So where’s the issue? Even if your language allows case insensitivity, you’re not going to use it. So what is it for? To give lazy developers a way to write messy code?

It’s just like another topic of heated debate: Python’s indentation-based scoping. Love it or hate it, the language requires blocks to have consistent indentation. Again, this helps you write better-looking code, and since it’s built into the language there’s no way around it for lazy/messy developers.

I guess my post has come off a bit as pro-case sensitivity, but I really don’t care either way.

VBMan · December 6, 2005, 12:00am

Jeff -

From the sounds of your rant, looks like you have NOT been using VB.NET lately huh?

Ever think of coming back to the dark side?

codinghorror · December 6, 2005, 12:00am

the IDE “corrects” your code after you press the ENTER key so that it appears as this:

The original article I cited is titled “The Case for Case-Preserving, Case-Insensitivity”, so case preservation has always been on the table.

Can anyone even come up with an anecdote where case-sensitivity was helpful while programming? Come on, even a friend-of-a-friend.

I’m still waiting… but in all seriousness, this is my point. If a feature is painful, people complain about it. A lot. That’s not opinion, that’s empirical data.

You can find similar horror stories about “OPTION EXPLICIT” all over the place. And nobody seems to dispute the idea that having the compiler auto-declare variables for you is a bad one.

Here’s another:

http://dev.mysql.com/doc/refman/5.0/en/name-case-sensitivity.html

Be very careful with [the Identifier Case Sensitivity] option. I have had a difficult time migrating from 3.23.x on Mac OSX to 4.0.x due to this option being turned on by default. After restoring from a mysqldump backup, I would get inconsistent results. Since none of my loading batch scripts or processing scripts changed (all used our standard naming convention with upper case in table names), the inconsistency must have come from a change in 3.23.x to 4.0.x. A simple change of this flag to ‘0’ fixed the problem, but only after hours of verifying our data and backups.

mike199 · December 6, 2005, 12:00am

I’m curious where VB.NET fits into the definition of “scripting languages.”

matt107 · December 7, 2005, 12:00am

Thanks for the clarification on camel vs. pascal casing. I never get that right. You are also right about naming conventions of private members. I actually use the following, but only because I saw it once and liked it. Note that it does NOT jibe with what the VS forms designer does for the controls that are added to your form.

private string _lastName;
public string LastName
{
    get { return _lastName; }
    set { _lastName = value; }
}

The main reason I do this is because it is more clear when trying to come up with similar names for method parameters (which end up being publicly visible to coders).

Thus the following makes more sense to people using my classes:

public void initializeNames(string lastName, string firstName)
{
    _lastName = lastName;
    _firstName = firstName;
}

Hoakie · December 7, 2005, 12:00am

“So why make your compiler work that much harder just so that programmers can be sloppy coders? It just doesn’t make any sense to me…”

“On the other hand, case insensitivity leaves the door open for sloppy coding style that makes others second guess your intent, and drives me crazy when I look at it myself.”

How clear is the intent in this example interface? (Yes, I actually ran into this in someone’s production code.)

public int myProperty;
public int MyProperty
{
    get { return myProperty; }
    set { myProperty = value; }
}

Which version of the “myproperty” do you call? Is this not sloppy code allowed because C# is case sensitive?

(An interesting point: the code above compiles fine, and you could call it from VB.NET 2003; however, you can’t call into it from VB.NET 2005, because it detects the ambiguous name.)

You can’t write an interface like this in VB.NET because this code:

Public myProperty As Integer
Public Property MyProperty() As Integer
    Get
        Return myProperty
    End Get
    Set(ByVal value As Integer)
        myProperty = value
    End Set
End Property

will give you a compile error stating:
‘MyProperty’ is already declared as ‘Public myProperty As Integer’ in this class.

matt108 · December 7, 2005, 12:00am

“Which version of the “myproperty” do you call? Is this not sloppy code allowed because C# is case sensitive?”

No, it’s just sloppy coding, period. It would have been just as sloppy and confusing if they had done this:

public int my_Property;
public int MyProperty
{
    get { return my_Property; }
    set { my_Property = value; }
}

You still wouldn’t know which one to call and the casing issue has been removed. Sloppy coding is sloppy coding. And not enforcing case sensitivity allows coders to be even more sloppy.

Armen · December 7, 2005, 12:00am

“You still wouldn’t know which one to call and the casing issue has been removed. Sloppy coding is sloppy coding. And not enforcing case sensitivity allows coders to be even more sloppy.”

This is the fundamental difference between the two camps of programmers. There are those that think “You should have the freedom to shoot yourself in the foot”, and the other camp that advocates gun control.

matt109 · December 7, 2005, 12:00am

Good point. And I’m obviously one who advocates “gun control”.

Armen · December 7, 2005, 12:00am

Heh, that “gun control” comment might be the wittiest thing I’ve ever said!

Crap code is crap code. Shit heads who can’t grok camel-case (or whatever) and use it consistently NEVER WILL. They just don’t f’ing care.

The worst possible situation is a case-insensitive language that doesn’t require declarations. OMFG, you’re lucky if you survive with your pinky toe when dealing with these lazy coders.

Case-sensitive w/ declarations languages are free from these dangers, so there’s no real need to discuss them.

What really drives me nuts are case-sensitive completion engines and file-systems.

codinghorror · December 7, 2005, 12:00am

“Good point. And I’m obviously one who advocates ‘gun control’.”

Well, I have to go back to the “it’s a religious issue” position.

That depends on your definition of “gun control”. For me, “gun control” is a language that DOESN’T let you declare “myProperty” and “MyProperty” in the same scope. For you, it’s a language that does. Which, from my perspective, is some darn poor gun control.

robtwister · December 7, 2005, 12:00am

As with anything, the use of scripting languages has its advantages and disadvantages. What you lose in the time spent tracking down the case sensitive variable bugs, you gain in faster development throughput in not having to compile the code over and over again. Over time, you become more careful with your variable naming, become less sloppy, more disciplined in your use of conventions and this makes you a better programmer.

Plus, if you test all new code as you write it and develop a unit test suite (recommended by Bram Cohen himself for Python development), then you learn to identify these case/misspelling bugs quickly and end up minimizing them in the future. I for one never had these casing-related problems because I’m naturally meticulous with my code. But I can understand why other developers can get nasty bugs with casing. To each his own, I would say. Just don’t blame the scripting languages, please.

Joe · December 7, 2005, 12:00am

Let me preface this by stating that I am a very anal retentive person. I pay great attention to detail and love digging way down into code. If I can dig into an abstraction and trace it down to its real representation in numbers, it makes sense to me; otherwise I am nervous that whoever wrote it didn’t pay enough attention (read: as much attention as I ‘think’ I do). I am a big fan of minimal abstraction in all cases. So to me case sensitivity does make sense. I have never encountered a problem with capitalization in my code that has not been obviously and squarely my own fault for not paying attention to what the heck I was typing. Quite simply, ‘a’ and ‘A’ are 2 different numerical values as characters, so ‘and’ and ‘And’ of course are different entities. For high-level scripting languages, in which the numerical value of ‘a’ and ‘A’ need never be calculated, sure, case sensitivity can be abstracted away. But for more machine-powerful languages, such as C/C++, you simply cannot do away with that distinction. Anyway, just felt the desire to toss my 2 cents in the hat. Happy trails!

hu1 · December 7, 2005, 12:00am

Scott scott = new Scott()

Scott myScott = new Scott()
Scott AScott = new Scott()
Scott aScott = new Scott()
Scott theScott = new Scott()
Scott person = new Scott()

matt110 · December 7, 2005, 12:00am

“Quick, non-scientific poll of the people arguing in favor of case-sensitivity. How often do you have two or more variables whose name differs only in case?”

All the time… but they are always related. The .NET naming conventions would tell me to have a private member that is pascal cased and an associated property that is camel cased (did I get that right?).

So I have the following…

private string lastName;

public string LastName
{
    get { return lastName; }
    set { lastName = value; }
}

Yes… they differ only in the casing of the first letter.

Joost · December 7, 2005, 12:00am

I’ve never programmed in C# yet, but don’t most people do this in C#? -

private string _LastName;
public string LastName
{
    get {…}
    set {…}
}

btw I think lastName and LastName are both camelCased, only the first is in lowerCamelCase (or just camelCased) and the latter UpperCamelCase (which is the same as PascalCase)

http://en.wikipedia.org/wiki/CamelCase

Scott · December 7, 2005, 12:00am

Joost,

That’s normally how I name things. My private member vars are prefixed with “_”.

Matt, I think the official guidelines only specify naming conventions for protected members and public fields.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpconcapitalizationstyles.asp

"Protected instance field Camel redValue

Note   Rarely used. A property is preferable to using a protected instance field.

Public instance field Pascal RedValue

Note   Rarely used. A property is preferable to using a public instance field."

They do have a section dedicated to helping devs navigate the complex field of case-sensitivity.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpconcasesensitivity.asp

In fact, it tells developers to avoid the naming convention you are using.

“Do not use names that require case sensitivity. Components must be fully usable from both case-sensitive and case-insensitive languages. Case-insensitive languages cannot distinguish between two names within the same context that differ only by case. Therefore, you must avoid this situation in the components or classes that you create.”

Damien_Katz · December 10, 2005, 12:00am

I know I’m really late to this and few will read it, but as a guy who has already written one language runtime (installed on 50 million computers, BTW) and is currently creating another compiler and runtime, I thought I’d weigh in.

Case insensitivity in scripting languages is a performance drag: when comparing identifiers, the runtime can’t just compare the raw bytes; special logic must be used to compare each character.

When I was performance profiling my first runtime, I found a large amount of time was spent lowercasing identifiers so that variables could be stored and retrieved from the runtime hashtable in a case-insensitive manner. I had to do a lot of optimization work to reduce that penalty, and while I figured out some good optimizations and made it pretty much a non-issue, it will never be as fast as a fully case-sensitive solution.
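A minimal sketch of the kind of lookup described above, written in Python purely for illustration. The class and method names are hypothetical and this is not the actual runtime's code; it only shows where the extra normalization step sits when a variable table is keyed on a lowercased identifier.

```python
class CaseInsensitiveScope:
    """Hypothetical variable table keyed on a lowercased identifier."""

    def __init__(self):
        # lowercased name -> (spelling used at first assignment, value)
        self._slots = {}

    @staticmethod
    def _fold(name):
        # This normalization runs on every store and every lookup;
        # a case-sensitive table would hash the raw name directly.
        return name.lower()

    def set(self, name, value):
        key = self._fold(name)
        # Keep the first spelling so the table stays case-preserving.
        first_spelling, _ = self._slots.get(key, (name, None))
        self._slots[key] = (first_spelling, value)

    def get(self, name):
        return self._slots[self._fold(name)][1]

    def declared_name(self, name):
        return self._slots[self._fold(name)][0]


scope = CaseInsensitiveScope()
scope.set("HtmlTag", "<p>")
scope.set("htmltag", "<div>")  # same variable, different casing
```

Here both assignments land in the same slot, so `scope.get("HTMLTAG")` retrieves the most recent value, at the cost of one `lower()` call per access.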

My gut feeling is that most creators of dynamic languages choose case sensitivity simply for the performance benefit (statically compiled languages have no such excuse, but since most require you to declare identifiers, the compiler will alert the developer before it’s a problem). It’s understandable: how can you not be concerned about performance when writing a programming language (even if it’s just a small performance difference)?

I am definitely in the camp that dynamic languages should be case-insensitive. But I don’t think that you should just write code with random case just because the computer doesn’t mind. Code should be written with consistent case (as if the language were case-sensitive) for the benefit of others who have to read the code. That way the code looks consistent, but if there is a case problem it won’t fall over when it hits that rarely called chunk of code (and probably at the worst time).

Bill206 · December 11, 2005, 12:00am

Old IBM terminals had a setting on them that capitalized everything when you hit enter. I always coded in all caps, which seemed normal to me. The only drawback I ever experienced was if you were displaying a lower or mixed case literal, you needed to be careful when editing that line of code.

PaulC · December 13, 2005, 12:00am

The one that really gets me is the programs that don’t work for some things because they expect Windows to be case-sensitive for filenames (apparently this is Windows’ fault, not the programmers’). The argument for it is ensuring the code-base is “cross-platform compatible” (i.e., does not work on the Windows platform).

But the clincher for me yesterday was this: how hard it is to do basic string operations in a language (which will remain nameless, as to be fair it is still in beta and this may change) where the string functions are case-sensitive and there is no parameter to switch this behaviour off.

In any other case, one line of code would find the location of a substring within a string, or replace it. But with case-sensitivity forced upon you, dealing with real world strings is impossible unless you are only interested in pre-defined tokens. About the best compromise is to write the code three times for upper-, lower- and sentence-case, but then you still miss the sloppy data entries where it is random.
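For comparison, here is a rough sketch of the one-line find and replace described above, written in Python rather than the unnamed language. The helper names are hypothetical; the only library features assumed are the standard `re` module's `re.escape` and `re.IGNORECASE`.

```python
import re


def find_ignore_case(haystack, needle):
    """Return the index of the first case-insensitive match, or -1."""
    # re.escape treats the needle as literal text, not as a pattern.
    match = re.search(re.escape(needle), haystack, flags=re.IGNORECASE)
    return match.start() if match else -1


def replace_ignore_case(text, old, new):
    """Replace every occurrence of `old`, whatever its casing."""
    # The lambda keeps `new` literal (no backslash or group expansion).
    return re.sub(re.escape(old), lambda _m: new, text, flags=re.IGNORECASE)
```

With these, `find_ignore_case("Contact: John SMITH", "smith")` locates the surname regardless of how it was typed, and `replace_ignore_case("Foo foo FOO", "foo", "bar")` catches all three casings in one pass, including the random-case entries that a three-variant approach would miss.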