When Understanding means Rewriting

Mihai · September 22, 2006, 12:00am

Submitting the form deleted my initial quote:

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.”

But it’s OK, I am patient: in 4000 years all the blogging software will be perfect.

MorganS · September 22, 2006, 12:00am

Greetings,
Understanding does not require rewriting. It requires the ABILITY to rewrite.

One of the most valuable traits I have found in myself as a software developer is the ability to look at a product, a service, or an API and think, ‘How would I have built this?’, and then apply a few small tests to see if my ‘design intuition’ is basically right. If everything looks right, I can then design my interactions with that software under the presumption that it’ll do ‘the right thing’, and generally be right.

The worst job experience I ever had was with a company where NOTHING they built was designed the way I would have built it, and the design decisions (or even the forces that created the design decisions) weren’t documented at all. Every time I tried to make a change, or build on top of their existing framework, something broke, often someone’s pet optimization.

In fact, the only developers who could adapt the existing systems were the ‘old school’ devs who had been there for 3+ years. After a year I got out of there, suffering from anxiety, near depression, and questioning my ability as a developer. Within days at my new job, I was a productive, confident developer again, as the frameworks I was building on top of (and the code my coworkers were writing) had no undocumented ‘magic tricks’ or convoluted optimizations, and most of our design decisions were sensible. We didn’t need to read every line of the source to avoid unpleasant surprises, we just had to think how it SHOULD be implemented, and we were generally right.

A professional developer should be able to, at a glance (or with a few moments of uffish thought), say how they would have designed a system, and not be very far off. This requires the ABILITY to rewrite it, but not the necessity of actually rewriting it. At a certain point, in fact, you can make the call that, ‘based on the developer’s experience (or inexperience), they probably wrote it like this:…’. Similarly, when you see a misbehavior, you can go, ‘Ah. Based on what I know, and my experience, someone probably made this design decision, and it’s mistaken in this circumstance. Look here for the issue.’

It may also help that I spent many years doing reverse engineering work professionally, so I see many things in this context.

I don’t know any phrase for it other than ‘design intuition’, and it’s incredibly helpful in everything from over-the-phone-debugging to bug reporting in packages you just use, to building good designs up front.

As for the concept of software development versus engineering branches, I pretty strongly disagree with the idea that software development is an ‘engineering’ field. It is more a creative field. There are certainly uncreative developers, and projects which require no creativity, but those are not the areas to be working in if you’re a professional programmer, as those are the areas that will be marginalized.

This also isn’t to argue that there isn’t creativity in engineering, but that the constraints of most engineering are physical, universal, and well defined, whereas the constraints of software development are more mental, often project-specific, and cannot be very clearly defined, often because they involve people, and people just aren’t well defined.

As for:
“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.”

I always saw this as indicating the idea that programmers re-use, and build atop those who came before. I never saw it as a negative, really.

If programmers wrote programs the way builders build buildings, all software would come with its own Operating System.

AndrewC · September 22, 2006, 12:00am

It is true that re-writing large applications is a bad idea, but re-writing small to medium sized applications is a whole different kettle of fish.

I, for one, have re-written some small projects and ended up further along, with cleaner code and more tests, than I would have by trying to hack the old code.

mjh · September 22, 2006, 12:00am

Feynman didn’t say “what I do not create, I do not understand.” What he said was “cannot” - invert the clauses, and you’ll see his meaning.

PeterH · September 22, 2006, 12:00am

A factor in favour of rewriting is that the use that the system was originally coded to deal with may not in fact be how it is used.

Say you designed a system that could store its data in an XML database, an RDBMS, flat files or quantum singularities for reasons of flexibility. After five years you find that you only use it with an RDBMS. Well, there is a lot of code you can take an axe to, resulting in a smaller code base and easier testing and maintenance, and it will be easier to add new features to the leaner system.

Now that you know that you are dealing with an RDBMS you can get closer to the metal, so to speak, and start to get better performance by working with the SQL rather than some abstracted query language that needed to be translated into XQuery, file system calls or quantum mechanics.
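
A rough Python sketch of that trade-off, using the standard `sqlite3` module as a stand-in for whatever RDBMS is actually in use. The `orders` table, the `store.find` interface and both function names are made up for illustration; nothing here comes from a real system.

```python
import sqlite3


def find_abstract(store, table, filters):
    # Before: a hypothetical backend-neutral query. Every backend has to
    # translate `filters` into its own terms (SQL, XQuery, file reads, ...).
    return store.find(table, filters)


def find_recent_orders(conn: sqlite3.Connection, customer_id: int, limit: int):
    # After: only an RDBMS is supported, so the query is plain SQL and the
    # database itself does the filtering, ordering and limiting.
    return conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? "
        "ORDER BY id DESC LIMIT ?",
        (customer_id, limit),
    ).fetchall()


def demo_direct_sql():
    # In-memory database with an assumed schema, just to exercise the query.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(1, 10.0), (2, 5.0), (1, 7.5)],
    )
    return find_recent_orders(conn, 1, 2)
```

The second function is only possible once the other backends are gone, which is the point: the code that translated a generic filter into four different storage dialects no longer has to exist.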

T_E_D175 · September 22, 2006, 12:00am

Ack, no. That was a horrible rephrasing of Weinberg’s Second Law. It properly goes:

"If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization."

Bud_Pass · September 22, 2006, 12:00am

I have had experience creating, maintaining, and converting large complex business applications that use relational databases and have some reactions to the other bloggers.

Documentation and comments, when they exist at all, must be considered as unreliable, either because they were never done well initially or because they were not updated as changes to the design, database, or application were made.

All the comments about difficulty of reading existing source code are true, regardless of language. Even if the code is your own, and it is commented and documented, it can require significant effort to understand enough to make changes or to fix problems.

Refactoring/rewriting parts of source code to fix problems is tempting, although it must be done carefully to avoid unintended consequences. Normally, attempting to make minimum changes to resolve the current task is most appropriate.

The only reliable bases for understanding are the source code and the database schemas, although reviewing inputs and outputs is sometimes useful also, given difficulty with other means.

Reviewing contents of tables and extracting smaller modules from the source code and running them in isolation are very useful techniques.

Attempting to reverse engineer major program logic and business rules by running the application and looking at the output will fail to produce reliable results in real-world cases, although it may be very useful in limited cases.

There are several reasons for this, but the primary ones are the degree of complexity of some of the rules, the difficulty of determining the interaction between the program and the database, and the fact that the same external input may produce different outputs, because of the history stored in the database.
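
To make the last point concrete, here is a small Python sketch using the standard `sqlite3` module. The discount rule, the `orders` table and the function names are invented for illustration; the only claim is that a rule of this shape makes identical inputs produce different outputs.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount_cents INTEGER)")
    return conn


def place_order(conn: sqlite3.Connection, customer: str, amount_cents: int) -> int:
    # Hypothetical business rule: 10% off once the customer already has
    # two earlier orders on record.
    (earlier,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()
    price = amount_cents * 9 // 10 if earlier >= 2 else amount_cents
    # Recording the order changes what the next call will see.
    conn.execute("INSERT INTO orders VALUES (?, ?)", (customer, amount_cents))
    return price


def demo_history() -> list:
    conn = make_db()
    # The same external input, three times in a row.
    return [place_order(conn, "alice", 10000) for _ in range(3)]
```

Someone watching only inputs and outputs sees the third identical call return a different price, and has no way to tell from the outside which stored rows caused it.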

Jimbo · September 22, 2006, 12:00am

Feynman was bang on. What we do not understand, we cannot create (except by chance rearrangement of everything in all possible combinations).

We can create engines because we understand how to rearrange raw materials, and we understand some of how raw materials came to be, but eventually our understanding ends and we are left with something we cannot create, but can only gather from the universe around us. We’re down around the nucleon level at the moment, though, which is pretty far.

JeffS · September 22, 2006, 12:00am

Rewriting code is INSANE.
Programmers are INSANE.

Rule 1:
Only rewrite code IF the architecture is out of alignment with the requirements. Reason: it costs a whole lot more to patch a misfit architecture than to fix it. Notice I said misfit, not ‘elegant’. (Note: modularize; it minimizes the scope of rewrites.)

Rule 2:
Read rule #1!

keith_ray · September 22, 2006, 12:00am

Most programmers don’t know how to write “intention-revealing” code.

Code that answers WHY the function is doing what it’s doing.

Much head-scratching comes from figuring out what the goal of a piece of code is, and (unfortunately) fixing it so it actually achieves that goal.

“Rewriting” is a vague term. I rarely rewrite, but often refactor. For example, if I change a method’s name and argument list from “frpt(int x)” to “printFooReport(FileDescriptor anOpenFile)”, that can make the bug in code that did “close(x); frpt(x);” easier to spot.
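
A minimal Python sketch of that rename, assuming the report is written to a file-like object. The names `frpt` and `print_foo_report` mirror the example above; the report text, `buggy_caller`, `report_error` and the use of `io.StringIO` are made up for illustration.

```python
import io
from typing import TextIO


def frpt(x):
    # Before: neither the name nor the parameter says x must be an open file.
    x.write("foo report\n")


def print_foo_report(an_open_file: TextIO) -> None:
    # After: same behavior, but the signature states the precondition.
    an_open_file.write("foo report\n")


def buggy_caller(out: TextIO) -> None:
    out.close()
    # Passing a just-closed file to a parameter named an_open_file stands
    # out in review far more than frpt(out) would.
    print_foo_report(out)


def report_error() -> str:
    try:
        buggy_caller(io.StringIO())
    except ValueError as exc:  # writing to a closed file raises ValueError
        return str(exc)
    return "no error"
```

Nothing about the behavior changed between the two functions; the bug is just as present either way. What changed is that the call site now contradicts the parameter name in plain sight.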

There are lots of code smells, but the most common are probably those that violate the SRP - http://c2.com/cgi/wiki?SingleResponsibilityPrinciple - a method that does 5 very different things (and should be 5 methods), or a class that has 15 responsibilities (and should be split into 15 classes) can be VERY hard to read and understand.
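
As a small illustration (in Python, with invented function names and an assumed "name,amount" input format), here is one function doing three jobs next to the same logic split so that each piece has a single responsibility:

```python
from typing import Iterable, List


def report_all_in_one(lines: Iterable[str]) -> str:
    # One function with three jobs: parsing, totalling, and formatting.
    total = 0
    for line in lines:
        _name, amount = line.split(",")
        total += int(amount)
    return f"TOTAL: {total}"


def parse_amounts(lines: Iterable[str]) -> List[int]:
    # Responsibility 1: turn "name,amount" lines into integers.
    return [int(line.split(",")[1]) for line in lines]


def sum_amounts(amounts: Iterable[int]) -> int:
    # Responsibility 2: the arithmetic.
    return sum(amounts)


def format_total(total: int) -> str:
    # Responsibility 3: presentation.
    return f"TOTAL: {total}"


def report(lines: Iterable[str]) -> str:
    # The top-level function now reads as a statement of intent.
    return format_total(sum_amounts(parse_amounts(lines)))
```

At this size the split looks like overkill; the payoff shows up when the all-in-one version has grown to a few hundred lines and a bug could be hiding in any of the three concerns.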

Ko1 · September 22, 2006, 12:00am

That’s why I prefer reading and writing Python code! I find my old code (older than a few months) very understandable. Other people’s code is also quite understandable. I have tried other languages, but I didn’t find old code as easy to pick up in them as in Python.

If you object about significant white space, spend an hour writing Python code and you will probably not notice it anymore.

Colin_Wyers · September 22, 2006, 12:00am

What I’m surprised no one has mentioned is how bad Joel’s example of Netscape Navigator looks in hindsight for his position. Since then, the rewritten code base has, through the Firefox project, become the second most popular web browser out there, and probably the biggest competitor to Microsoft in that space.

Maybe you could make the case that a Firefox based on legacy Navigator code would have been just as successful, but that steals a few bases in my mind without something to back it up.

There are other examples where rewrites from the ground up have not been spectacular failures. Take Windows – NT was a completely different entity from Windows 3.x and 9x.

Or take the multitude of COBOL applications that were simply replaced rather than ported from mainframes to commodity PC hardware.

Tim_Dudra · September 23, 2006, 12:00am

“As for the concept of software development versus engineering branches, I pretty strongly disagree with the idea that software development is an ‘engineering’ field. It is more a creative field.”

Software development is not art; it never was nor ever should be intended as a creative field. You want to be creative, go into freakin’ marketing or sales. With software you are supposed to be building a system that, given a predictable set of inputs, performs a specific task and generates a predictable set of outputs. This is engineering; not art.

My dad was a civil engineer and there was an element of intuition to that job too but the engineer doesn’t have their intuitive flash and immediately sneak off into a corner all by their lonesome and dig a hole for a piling or throw up a main support beam. No, an engineer presents the inspiration, epiphany or boneheaded idea to their peers where as a group they consider its merits/demerits and decide if it will be a valuable addition to the plan. The idea is honed to as near perfection as they can take it before they try to realize it.

Many developers (if not most), on the other hand, won’t present the idea to others because a) they are afraid it is boneheaded and others will criticize their fragile ego - better to have the computer break the news to them in private, or b) they are afraid somebody will point out that it is a great idea but it really isn’t necessary to rewrite that stable piece of code in order to employ the idea, or c) “My precious, my precious, don’t touch my precious”.

Software is art, not because it should be or necessarily must be, it is art because, like those modern artists who throw paint into a jet blast or weld cruddy pieces of metal onto other cruddy pieces of metal, it is built in an ad hoc manner, without any plan, without any direction and without any concern for the future.

An artist will paint a purple background, slap a white oval onto it and run four green stripes across it. It looks like rancid bacon on fried EggBeaters being eaten off of a 90-year-old lady’s purple sweater, but what it really is is a juxtaposition of the emerging environmental movement against the discord of global warming.

orcmid11 · September 23, 2006, 12:00am

This strikes a deep chord with me. I notice that I often end up rebuilding/writing something, even sample programs.

One purpose of my rebuilding is to get confirmation in small chunks. Another purpose is to demonstrate the function of the program. I want a form that provides a demonstration and explanation that I can comprehend without too much difficulty. This usually means that I will have ended up refactoring the internal structure of the program. I may also end up re-engineering the program from an abstracted understanding of its essential purpose (as well as repairing places where it is underspecified, as well as I can tell).

I don’t think about this very much, although I have wondered whether it is some sort of character flaw that has me need to restate most programs in order to understand them.

But I think Keith Ray has identified what it is I am responding to. It is the need to understand the intention of the program and have that reflected in the way I express it. Whether that helps someone else reuse the code or not, I cannot be sure about.

David · September 23, 2006, 12:00am

That pie graph looks like it’s made in Microsoft Excel 2007.

bago · September 23, 2006, 12:00am

You know, good documentation of the design philosophy behind a class would solve this.

Also, with so much data for WoW, the best way to determine optimized behaviours would be to run analysis on the patterns of the top ranked players. You’ve got the users, exploit them.

Stephen · September 23, 2006, 12:00am

I’ve always felt that a programmer whose only reason for a rewrite of a functioning, stable application was that the code wasn’t understandable is an idiot. The fact is, if the program works, then the code is understandable and it may contain reasons, unknown to you, that something was done a certain way. The “rewrite” mantra is the mantra of the weak and inexperienced.

In the case of obviously buggy software you may just decide to rewrite something instead of inheriting their bugs.

In those cases where you know you wrote something with “prototype” in mind: REWRITE

Ulic · September 23, 2006, 12:00am

It is with great interest that I read this article, however, I would like to submit a few corrections to the “where-developers-spend-their-time” graph…
here is a more accurate version, based on my own personal experience:

http://www.img2u.com/index.php?id=233

TechGuyDave · September 23, 2006, 12:00am

These posts are very enlightening. I have been writing my own code for 20 years and have finally decided to go to college for programming. The notion that I will spend most of my time understanding code is perplexing in that there is NOT ONE class dedicated to the methodology of ‘understanding’ someone else’s code. If there is, I haven’t found it. There has to be some sort of universal rubric that can, at the very least, be the foundation upon which new programmers (or those new to a company) can build. If a programmer really spends 70+% of their time doing this, shouldn’t there be more in-house methodology devoted to this? If there is, where is it?

Tim_Dudra · September 23, 2006, 12:00am

TechGuyDave,

My opinion would be that there is no emphasis on reading other people’s code in university and college because “that’s not fun”. The last U course I taught, in January 2006, gave the students the option to build their own code or read and modify a substantial amount of existing code. In general, there were three categories of response:

1) a great deal of whining “there’s too much code - how can we understand it”, usually espoused after they had destroyed the natural organization of the code by eliminating the directories that housed the code for various components (so they could build it in Visual Studio when I had suggested they use Cygwin and g++ because the build scripts were all done inside the Cygwin environment),
2) a whole lot of “we weren’t able to understand it so we wrote our own” coupled with “there was way too much homework assigned”

and finally
3) about 5-10% of the students read, modified and submitted the code that was provided and then sat back and relaxed and watched the others suffer since the workload was quite manageable if you just used Cygwin, the existing build scripts and the existing code.

It was a very interesting exercise in student psychology that, I suspect, is indicative of developer psychology as well. Rather than tackle an area outside their comfort zone (install and learn the Cygwin environment and g++) they decided to stick with what they were comfortable in (Visual Studio) and suffer lots of downstream pain to avoid near term pain.

One economist, writing about the idea of property and the rationale for private property, wrote something to the effect that economic models tend to be based on the “efficient optimizer” model of human behaviour, namely that humans will act in their best interests and will make efficient use of their own resources. He stated that in reality humans are “efficient short-term optimizers”. We naturally look for short-term benefit gladly ignoring or deferring pain downstream (i.e., down “time”). We must fight our nature to make a rational decision to suffer the pain early to avoid the pain later. In the software industry, because corporations tolerate if not encourage the hacker, software developers almost always dive into code (because it is fun - near term benefit) and produce complex, frankensteinian nightmares of code which result in significant long term pain.

Alternatively, if we learned to fight our nature and stop, think and design up front (the pain of having to think hard) in order to make the long term experience “profitable” (in terms of ease and fun “coding, not bug fixing and patching”), we might approach engineering status and build a piece of code that doesn’t require 2000 “test runs and crashes” before it is stable.