Pretty Code, Ugly Code

Armen · June 20, 2006, 12:00am

static TARGETS *
make0sort( TARGETS *chain )
This is my biggest pet hate of some (not many) C programmers programming style!
WHY do you hit return after the data type in a function declaration?? There seems to be no reason.

Why? It’s obvious. Here’s an example

UserInformation* myLongFunction(const char *username, const char *group, const Date since, const LongStruct db, const vectorstring skipList, int ignoreCase);

void myShortFunction();
int myShorterFunction();

When you put the return type on a seperate line, the reader only need look at column 1 to find the name of the function.

jamesK1 · June 20, 2006, 12:00am

If you put the name of the function on a new line,
you can search for the function definition (and
nothing else) using a regular expression. For
example, ^make.

Only if the function name is in the first column. If it’s indented within a class block, or has classname:: before it, or superclassname:: before it, it becomes a lot harder to find than scanning a screenful of ‘grep -n make *’ and looking for the thing that looks like a definition or declaration.

James_Green · June 20, 2006, 12:00am

Ugly or not, Props to Christopher for putting it out there. I feel newer coders know about what comments should be but all they have seen is text book code which often sacrafices security or performance in order to illustrate one point. Real-world code is a good learning space for n00bies to find style.

On the topic of C, don’t forget the famous unix comment (attributed to dmr IIRC):

/* You are not expected to understand this */

It is sometimes misunderstood as being elitist, but in fact is a helpful comment. It means: “Don’t spend your time on this bit, it’s odd but take it on faith.”

ChrisB · June 20, 2006, 12:00am

“Reduce, reduce, reduce. Remove anything that will distract the reader.”

Use variable names like i and j.
Surely thats heresey.

I would suggest even the most locally scoped variable name should provide some information about its reason for being.

Darrin · June 20, 2006, 12:00am

About this:

Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (like distant or OS interfaces). The tighter the scope, the shorter the name. Long, descriptive names may help the first time reader but hinder him every time thereafter.

I’m sorry, but that’s ridiculous. The human languages we use simply don’t permit that on any real sort of level. Sure you can name locally scoped variables that have obvious uses i and j. But beyond that, you either start using abbreviations that are just as hard to understand at a glance as long names are to read, or you use descriptive names regardless of length, and future maintainers can understand it.

I’m just curious - why is there this obsession with keeping typing to a minimum? With all the debugging you have to do to get code working correctly, the amount of typing required for a slightly longer variable name seems negligible to me. Plus, I find that it can often help to slow down when programming. I come up with the best solutions when I walk away from the computer. I just don’t buy the argument presented in that point.

wfefdwar4rf2Q · June 20, 2006, 12:00am

The fact that you have to define a small variable name means you’re likely using a crappy abstraction. Perhaps it makes sense in C, which is an abstracted assembly language, but in any other language you shouldn’t need to do this.

This of it as a code smell.

johnb3 · June 20, 2006, 12:00am

Thanks for the post - it’s had me thinking all morning about coding standards in my more front-end-heavy environment. I’m so thankful my first boss INSISTED we read and outline “Code Complete” (http://www.amazon.com/gp/product/0735619670/ )!

My thoughts: http://www.smartminion.com/blog/?bmonth=6/1/2006#entry20060620133417

barry9 · June 20, 2006, 12:00am

Nobody has mentioned that “i” and “j” are hard to tell apart when a) read quickly b) appearing on a printed page where the toner was running out or c) appearing small on a screen with only modest resolution. Why introduce even the possibility of such easily preventable errors? Similarly, “1” and “l” are often hard to tell apart and should be avoided. Text should be easy and clear so that we can concentrate on the logical problem at hand.

Shawn_Oster · June 20, 2006, 12:00am

I do like his focus on maintainable code and his oft repeated manta that others will have an easier time editing your code if you follow a set of rules. With applications growing so large these days the ability to quickly find your way around a large body of code is critical.

I disagree about shorter variable names, I’d rather have descriptive vs. ISecManCon and with most modern IDE’s you really only have to type your variable name once.

All in all a good list. I’d go as far with #1 to say you shouldn’t deviate from the languages standard. It’s distracting to see C# code capped like old C code or Delphi code capped like Java, etc.

JeroenR · June 20, 2006, 12:00am

I couldn’t agree more, especially point #1. Endless reformatting of ‘{’ indentation styles waste a lot of time and makes doing ‘diffs’ on different versions of a file a nightmare.

Swallow your pride for that one file you are editing, stick with the same style, and apply your own or company’s superior style on all other files.

I am absolutely religious about formatting and coding guidelines. So much even that I publish my own guidelines document for various languages including C#, VB, JavaScript, HTML etc.

See http://dotnetjunkies.com/WebLog/jritmeijer/archive/2005/09/08/132429.aspx

codinghorror · June 20, 2006, 12:00am

Nobody has mentioned that “i” and “j” are hard to tell apart

I don’t know if the comparison between i - j and 1 - l is a fair one.

I’m just curious - why is there this obsession with keeping typing to a minimum?

It’s an obsession with keeping READING to a minimum. Half of what we write will go unread anyway… all other things being equal, shorter is better. Think of it as Strunk and White for your code.

I disagree about shorter variable names

The specific guidance he gives is this:

The more important a variable is, the better its name should be.

I agree completely. Short variable names mean “this isn’t very important, so don’t spend a lot of time trying to process it.” So when you see variable names “i”, “ds”, or “ts” you can assume they’re not important-- they’re probably used for ephemeral, temporary tasks.

Richard · June 21, 2006, 12:00am

Perhaps I’m just an oddball, but that make.c example does look, to my eyes, rather beautiful.
Also, don’t ever use “chtulhu” as a variable name, you’ll misspell it every time you write it.

minektur · June 21, 2006, 12:00am

“Only if the function name is in the first column. If it’s indented within a class block, or has classname:: before it, or superclassname:: before it, it becomes a lot harder to find than scanning a”

Uh, we are talking C here not C++ or something else? While I dislike putting function names in column 0 in C code myself, I code with friends who love it and I can tolerate it - the justificaiton is that it is “easy to find declaration in code with 50 source files” etc.

Tim_Lesher · June 21, 2006, 12:00am

“Switching to a better language” is an option, unless you happen to be writing a utility (like Jam) that needs to compile and run on dozens of platforms with minimal external requirements.

Ubiquity trumps all in this case, and ubiquity currently means vanilla, ANSI-standard C.

WaterBreath · June 21, 2006, 12:00am

As for the pillar that states that code that does identical things should look the same –
well, that’s true if a single function should look like itself.

Not always possible. Especially not in C, or other strongly typed languages without generics/templates. When your functions all do the same thing, but with different data types, then it’s often unavoidable to have a bunch of functions that look almost identical.

My recent work has, unfortunately, been in PL/SQL, an “object enhanced” scripting language for Oracle databases. I can’t tell you how many times I have thought to myself “Good gosh, this is going to take hours, and it’s going to be huge and repetitive. If only I had some sort of generics or dynamic typing available, it would be a breeze.”

At least in C, you can sometimes use macros, if the code will essentially be exactly the same. PL/SQL doesn’t even have that. Of course, there’s always Pro*C, or Java, and such things. But the company I’m contracted to doesn’t allow us to use them.

BuggyF · June 22, 2006, 12:00am

what I’ve found amusing for the last 20 years, almost, is this insistence, by people who should know better, on 80 column (or less) line limits. That was invented by COBOL (which only has ~66 characters available after you subtract the line number from 73) around 1960. It’s just silly. I remember how wonderful it was when the VT-220 came along and I could use 132 character lines. If nothing else, formatting a report both in code and as output was a piece of cake.

Beyond the character world, we have wide screen taking over, and folks still think that 80 columns is King. There’s more horizontal real estate, and always was. Use it well.

And as to scanning narrow: not everybody does; I’ve had the Great Books set and can’t read them because those narrow double columns drive my eyes over the edge.

Wider is better (with props to Pontiac?).

Alyosha12 · June 22, 2006, 12:00am

“Not so. Breaking code up with whitespace relies on programmer judgement to keep the pieces working independantly.”

All coding requires programmer judgment; there’s no avoiding it. You rely on programmer judgment to tell when a function is too long, I rely on programmer judgment to tell when a function has too many long-scoped variables.

“When you break the code into subroutines the compiler enforces this.”

You can restrict variable scope just as easily by declaring them in another level of scope (aka an anonymous block). I do this occassionally.

“It also forces you to declare the parameters that a block of code requires.”

And the benefit is …? A “clever” programmer is just going to declare functions which take a dozen parameters. And you’re back to programmer judgment again.

“It makes reuse easier if it should become necessary and it simplifies changing the order of execution”.

YAGNI, and at any rate, these transformations take what, 5 seconds to do with NOTEPAD? OTOH, the clutter of reading code filled with single-use functions impairs readability: the ability to find important methods, functions which are general-purpose – code which I really might want to reuse.

“Most importantly, if it were broken into subroutines, it would be possible to tell at a glance what steps were executed when that function was called instead of forcing the reader to scroll down through several screens of code culling out the meaningful comments.”

Really? So you’re saying:

#1: in the example code, you found it difficult to skim for the comments, and
#2: you’d rather see a function called addDependentsIncludesToDirectDependencies(target) than a comment such as /* Step 3c: add dependents’ includes to our direct dependencies */.

Can’t say I agree with you on either count. The scrolling wasn’t onerous at all. I could grep for comments pretty easily (even without syntax coloring, it wouldn’t be hard). And I don’t enjoy squeezing comments into 32 alphanumericunderscore characters. I also don’t like (as mentioned before) having to jump around if I need to drill down a level of detail.

Ian_Johns · June 22, 2006, 12:00am

When this topic first appeared, I figured it would be wise for a C developer like myself to stay out of any possible ensuing religious wars on software conventions. However, after a few days I realized I didn’t have to join in the religious wars – I discovered that Alyoshapresented many arguments I agree with. Obviously, Alyosha I share many similar beliefs on coding convention organization, including pragmatic function size :

"And though I may be in the minority, I both agree disagree with your above statements concepts.

As I stated above, "my rule is that a function should be as short as atomically reasonable – even if that means 2-5 lines for some functions or 10s/100s lines for others.

Each function should do what it requires with few if zero side effects. Whether a function requires two lines of code or 1000, functions should NOT be arbitrarily short JUST for the sake of a cliche."
– http://discuss.joelonsoftware.com/default.asp?design.4.202810.13#discussTopic203616

“Steve, meet #if 0. #if 0, meet Steve.”

Although I have certainly used #if 0, the advantage of this:
/*
…some code to comment out
*/

versus this:
#if 0
…some code to comment out
#endif

is in terms of visualization. Most of the editors I have used can easily color code the comment block.
Not so much with the #if 0 … #endif block.
Makes it far easier to clean up after you’re done, though …
– Steve Bush

I personally was trained to use the “#if 0/#endif” method for software maintenance reasons. So naturally I found Alyosha`s “meet Mr. #if 0” comment very funny.

However, Steve Bush correctly points out the primary advantage for using the “/* */” method is for IDE comment-coloring visualization. He also correctly points out one important advantage for using the “#if 0/#endif” method – it is “far easier to clean up”, i.e. enable/disable desired code blocks from compilation.

However, he fails to address the parallel problem using the “/* /" method. That is, it is often difficult to enable/disable code using "/ /" if the code block(s) themselves contain "/ */” comments. This is because ANSI C does NOT allow nested comment blocks. Even a careful developer could easily intend to comment out a code block but not realize that only some lines of the code block were commented out while some trailing lines are NOT commented out because of a nested comment block.

In fact, MISRA (Motor Industry Software Reliability Association) similar FAA FDA standards prohibit enabling/disabling code blocks using “/* */” for this very reason :

“Where it is required for sections of source code not to be compiled then this should be achieved by use of conditional compilation (e.g. #if or #ifdef constructs with a comment). Using start and end comment markers for this purpose is dangerous because C does not support nested comments, and any comments already existing in the section of code would change the effect.”
– MISRA-C 2004 Rule 2.4

Robert · June 22, 2006, 12:00am

“All coding requires programmer judgment; there’s no avoiding it.”

Sure, but if you can get the compiler to check something for you, you should take advantage of that.

“And the benefit is …? A “clever” programmer is just going to declare functions which take a dozen parameters.”

The benefit is that it says “this block needs to have this information”. It also shows the types of the variables near where they are used, which makes it easier for a reader to understand. The fact that it can be done badly does not negate these benefits.

“the clutter of reading code filled with single-use functions impairs readability: the ability to find important methods, functions which are general-purpose – code which I really might want to reuse.”

The implication there is that you know when you write the code what code you will want to reuse. But in my experience its common to discover new uses for a block of code that once seemed pretty specific. Having the code already residing in a function with an appropriate name not only makes it more convenient to resuse, but it allows people who are unfamiliar with the code to find reusable pieces easier.

“I could grep for comments pretty easily (even without syntax coloring, it wouldn’t be hard). And I don’t enjoy squeezing comments into 32 alphanumericunderscore characters. I also don’t like (as mentioned before) having to jump around if I need to drill down a level of detail.”

On my screen that function was over 6 pages long. Thats quiet a bit of jumping just to get a high level overview of it. Modern editors make drilling into functions as trivial as scrolling down a page, so I actually find it easier to navigate if its broken into seperate routines - I can jump quickly to the piece I’m interested in. Its like the table of contents in a book. Its not hidding details from you, its helping you find the right place to look for the details you want.

Alyosha13 · June 22, 2006, 12:00am

“The implication there is that you know when you write the code what code you will want to reuse.”

Oftentimes I do, but sometimes I don’t. Still, I tend to abstract out common code the second or third time I need to use it … not immediately on the hunch I “might need it later”. After all, it’s not any easier to find code that’s been buried in a mountain of single-purpose functions, as opposed to a snippet from a major function – and it might be harder, in the case that I prematurely factored it out wrong and that bit of “common code” that I really need involves stitching together two or three and half of these single-use functions.

“Thats quiet a bit of jumping just to get a high level overview of it.”

Did you do a lot of jumping around? I didn’t. There weren’t any big loops or ifs, the variable types and what they were meant to represent were pretty clear from immediate context … I mostly scrolled, which I agree with you is pretty trivial.

I think on these points we’re going round and round here … all I can say is, I didn’t have a problem reading and getting a high-level understanding of what the “big long function” does. The things that you found difficult, I didn’t. I don’t know what to chalk that up to. I’m pretty sure I’m not an easy person to please … I can tell you I have seen a lot of functions in my experience that were this long, that were MUCH harder to read, and that I DID think had serious problems – and I’m very vocal about them when I see them.