When Understanding means Rewriting

If you ask a software developer what they spend their time doing, they'll tell you that they spend most of their time writing code.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/09/when-understanding-means-rewriting.html

I think also a lot of developers feel that they can create cleaner code by rewriting. When you refactor code it still smells of the old code for a long time, and it takes a lot of work to clean that smell out.

I think Joel is right, it’s stupid to re-write the entire thing. But then there are different ways and mentalities of rewriting.

I’ve “rewritten” several medium sized applications by starting a new project, copying and pasting chunks of code from the old application, reordering and rewriting bits. It could be argued I was rewriting it, but in fact it was just a method of refactoring. A method of refactoring that leaves less of the old smell.

I was still re-learning what the code did. Sometimes by reading the code carefully, and sometimes by observing the functionality. But I think I would have to agree with Joel, I’ve never “rewritten” a mature program, something with lots of bug fixes from lots of people. You have to really understand the code and what the strange bug fixes are in order to think about how you can rework the code to include those fixes logically, rather than attached on the end. And if you can’t work that out, then I guess it’s probably best to leave it.

I find that well written unit tests help to understand code. What is the code doing? Well what does the programmer expect it to do? A test will tell you. Of course, you then need to understand the test.

This is where test infection pays off. Want to learn the rules of Monopoly? At a minimum the tests should cover the basic rules, which makes it a lot easier to understand the outward-facing aspects and ignore the internal details for the most part. Have a case that isn’t covered in a test? Write a new one; if it doesn’t pass, then you need to dig into internals.
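To make the Monopoly point concrete, here is a minimal sketch of tests doing the explaining. The `Player` class, its `move` method, and the constant names are hypothetical stand-ins for whatever code you are trying to understand; only the rules themselves (40 squares, $1500 starting cash, $200 for passing Go) come from the standard game.

```python
BOARD_SIZE = 40   # squares on a standard Monopoly board
GO_SALARY = 200   # collected each time a player passes or lands on Go


class Player:
    """Hypothetical stand-in for the code under study."""

    def __init__(self, cash=1500, position=0):
        self.cash = cash
        self.position = position

    def move(self, steps):
        new_position = self.position + steps
        # Wrapping past the last square means the player passed (or hit) Go.
        if new_position >= BOARD_SIZE:
            self.cash += GO_SALARY
        self.position = new_position % BOARD_SIZE


# Each test name states one rule; a reader can skim these before the internals.
def test_moving_advances_position():
    p = Player(position=5)
    p.move(7)
    assert p.position == 12
    assert p.cash == 1500


def test_passing_go_pays_salary():
    p = Player(position=38)
    p.move(4)
    assert p.position == 2
    assert p.cash == 1700


def test_landing_exactly_on_go_pays_salary():
    p = Player(position=35)
    p.move(5)
    assert p.position == 0
    assert p.cash == 1700
```

If a rule you care about has no test (say, three doubles in a row), writing that test is the cheapest way to find out whether the code handles it.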

I definitely agree that tests should play a big part in this. But like Simon, I agree that tests aren’t a silver bullet for understanding. Since most code has zero tests, this may be a moot point. Starting by adding some tests is a capital idea.

One of my bosses/mentors once told me that the best programmers are maintenance programmers; not because it’s glamorous but because it’s hard.

Sure, I’ve talked about this before…

http://www.codinghorror.com/blog/archives/000610.html

Joel doesn’t say you should never rewrite. He just says don’t rewrite from scratch. There’s a continuum of combinations of reading and writing code:

a) throwing it all away and rewriting from scratch

b) global refactoring

c) writing small throwaway code chunks to help understand a large codebase

d) understanding by reading, say using a debugger

e) nodding off staring at the code, trying to understand by osmosis

Joel says only that a) is a bad idea.

Rewrites are usually a guise for “I want to add programming language X to my resume, so let’s rewrite this component in language X,” although in some cases technology has evolved in a way that dictates a major change.

Back when I was mostly a VB.NET programmer, I had to manually convert any C# code that I wanted to integrate into my code base. It wasn’t a meaningful conversion, but doing the C# - VB.NET conversion forced me to look very closely at the code I was adding. I’d often add features or refactor the code as I went; it was rarely a straight port by the time I was through.

Now that I’m officially language agnostic, I miss that a little bit.

yeah …great post.

I thought you had a great point when you spoke about WoW… Similarly, if you can get to pair with somebody who wrote the code (I understand that’s not always possible) on some of the features/bugs you understand, that for me is the fastest way of understanding the app.

an old canard: if civil engineers built bridges the way software engineers build applications, they (bridges) would all fall down in a month.

the point, of course, is that software engineering, after 50 years, still has little in common with real engineering (sorry about your feelings). we insist on doing it differently just because we want to. and because platforms change along the way. and there are few rules to guide us. code is hard to understand because it doesn’t obey a few basic laws; there is no Newton in this milieu.

OTOH, if one is a mainframe COBOL coder, nothing much has changed in 40 years; they’re still writing another General Ledger just like the first one.

there are really few innovations in the 5 decades: disc drives, RS-232, relational databases, VVVVLSI, GUI. fact is, java/vb/smalltalk/foobar coders view data much as COBOL-ers did and do. there are still a lot of batch programmers out here; most don’t know it, since they see objects as different from file records. would that it were so.

one of my favorite quotes:
It’s easier to understand 600 tables than 100,000 lines of code.
– PaulC[on comp.databases.theory]/2005

such is the path out of darkness. take it.

I think we need some better UML-style tools so we can get more of a “bird’s-eye view” of code. A big part of my time is spent rewriting code that was poorly designed in the first place (by me). That’s because it’s hard to design it correctly up front; rather, you take a stab at it and improve it bit by bit.

I don’t rewrite source code, I edit my additions into it for probing.

“What I cannot create, I do not understand.”

By all accounts Feynman was brilliant, but perhaps in his last days he went a bit… ga-ga. Not quite the whole up quark, if you know what I mean. Even for an academic it’s unusual to completely disregard resourcing constraints in the way the above quote does. Since Jeff Atwood seems to have made the same mistake as Feynman, he can at least take comfort that great minds omit alike. :slight_smile:

And resources, or lack thereof, are the pivotal player in this analyse/hypothesise dilemma.
As a developer you want to get to the solution with least cost and greatest quality that you can.
When you’re faced with a significant learning curve in understanding old code (or a new technology) versus the time it takes to hypothesise a behaviour and reinvent your own wheel to test it, it’s an economic decision. Analysing existing code takes time, but it offers greater accuracy and quality. Building your own in a quick and dirty manner saves time, but it doesn’t tell you any of the details of the original.

There is no magic bullet universal answer to this. You can’t reliably fix bugs without understanding the source because it may happen at a level of abstraction invisible to the program’s clients. Maybe you don’t have the source.
Usually you won’t have the luxury of choice in this issue. If you do, it’s a matter of judgement to decide case-by-case as to which alternative is best - a decision influenced by effort estimates, skill, deadlines, availability of test environments, and a bunch of other resource constraints.

Understanding is the fundamental problem of code maintenance, both for the developer and any other hapless soul who must follow in his footsteps.

For anyone who must maintain, or even rewrite a component, an invaluable resource is what I call a POPS (principles of operation) document, generally a text file a couple of pages long which describes briefly how the component works, and items such as what it connects to, what it persists, its states, and especially debugging tips. There are no rules except to try to be helpful to anyone reading unfamiliar code, especially someone debugging in that area.

I have found that time spent writing these is well repaid.

Understanding code is definitely the hardest job for a coder, I agree. We might write our own code, then 1000 lines later, realize that we need to go back and change what we previously wrote, and forget how we implemented something! Or, even worse, we’re told to update code written by somebody else, who (more than likely) has a different understanding and style of coding than we do.

Of course, this is why encapsulation is a beautiful thing. Designate a class or function to do one particular thing; that way, when you go to the function to understand it (hopefully with comments to help), you know that it’s supposed to perform one task. What probably makes understanding a huge amount of code so hard, in my opinion, is going through hundreds or even thousands of lines of code and trying to get all the puzzle pieces put together in your head without losing any of them.

If you’re writing new code, yeah, at least 50% of it you will probably have to go back and change later because of something you didn’t foresee. But this is why they harp so much on comments in school. I remember one of my co-workers trying to understand the code for a map-image generator written by another company and the only comment we found in the code was “here the iteration begins.” That comment was useless! The more documentation there is, the less daunting the task of understanding code becomes.

There are exceptions:
Say you have a wizard where the pages you see depend on which options you choose. It’s a lot quicker to check out a switch statement in the code than it is to try and work out all the possible combinations from the interface.
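As a rough illustration of why the code wins here, a sketch of what such a page-selection routine might look like. Python has no classic switch statement, so an if-chain stands in for it; the page names and the `install_type` option are made up for the example.

```python
def next_page(current, options):
    """Return the page that follows `current`, or None after the last page.

    Hypothetical wizard: page names and option keys are invented.
    """
    if current == "welcome":
        return "install_type"
    if current == "install_type":
        # Only a custom install gets the component picker.
        if options.get("install_type") == "custom":
            return "components"
        return "destination"
    if current == "components":
        return "destination"
    if current == "destination":
        return "summary"
    return None


def page_sequence(options):
    """Walk the wizard from the first page and list every page shown."""
    pages = []
    page = "welcome"
    while page is not None:
        pages.append(page)
        page = next_page(page, options)
    return pages
```

Reading `next_page` shows every branch at once; clicking through the interface would mean one full pass through the wizard per combination of options.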

Looks about right to me. I spend more time re-reading (and re-reading) my code than anything else. Seems to annoy the boss, but it gets the job done.

I would expect that kind of ratio on anything beyond a simple “Hello, World”

“What I cannot create, I do not understand.”

This is different than, “What I do not create, I do not understand.”

I agree that understanding code is very important. But the implication in this article seems to be that if you don’t understand some code, you should rewrite it. I don’t agree with that at all. Perhaps if you don’t have access to the person who wrote it, or you can’t figure it out after stepping through it, it might need rewriting.

I was just asked by the wife last night how I learned best. She asked because I was bitching about getting some of my roles changed at work and I was learning a homegrown system from another developer - a case where I can’t look at the code.

I explained that I probably learned best by first needing to understand the basic concept, and then, to actually understand how and why it works, I take it apart and rebuild it myself in a way I know how.

Usually along the way I learn new techniques and commands and so forth, but I totally agree that rebuilding is where the value lies.

While 95% of the time rebuilding is not better than fixing, many times I feel that rebuilding is the only way I can ensure that I can continue to support the application - even though I might have built it in the first place!

There is a reason there are very few people driving 1978 Cadillacs today. At some point everyone has to ditch the old to make way for the new. But in the code world the turnaround happens regularly - more like a lease instead of a buy.

yup, rewriting is better than reading.

there is this thing that frequently happens to me. someone says “where is such n such” and i say “oh it’s up in parkfield” (or some other suburb). they ask “do you know how to get there?” and i say, "well sort of. but, no, not really. see, i’ve been driven there before, but i’ve never driven myself there. so, actually, no i don’t know the way."
and that’s like everything. if you’ve sort of looked at it… well that’s nothing compared to putting yourself in the driver’s seat.

I am currently in the position where I had to take over somebody’s code, and I can tell you nothing is more time-consuming and mentally draining than taking another person’s thinking around an issue and trying to make it your own for continued use.

I had to make a call on whether to rewrite a six-month project or make use of what had been done already. The latter was the preferred option for my boss, so I did just that. I can honestly say that understanding what bits of code do, in a system that has a primary objective and whose developer had his own secondary objective in how he would solve a problem, is a huge mountain to climb.

People here seem to be confusing rewriting in order to understand with rewriting in order to replace. Rewriting a function in a separate file is a good way to understand what a function does. After it has served that purpose, you can just throw your code away.
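A small sketch of what that can look like in practice, with invented names throughout: `legacy_checksum` stands in for the function you are studying, `my_checksum` is the throwaway rewrite, and `first_disagreement` is a helper that checks the two against each other on sample inputs.

```python
def legacy_checksum(data):
    # Stand-in for the original function (hypothetical); imagine it is
    # copied verbatim from the codebase you are trying to understand.
    t = 0
    for i in range(len(data)):
        t = (t + (i + 1) * data[i]) % 255
    return t


def my_checksum(data):
    """Throwaway rewrite: a position-weighted sum of the values, modulo 255."""
    return sum(weight * value for weight, value in enumerate(data, start=1)) % 255


def first_disagreement(original, rewrite, samples):
    """Return the first sample where the two functions differ, else None."""
    for sample in samples:
        if original(sample) != rewrite(sample):
            return sample
    return None


SAMPLES = [[], [0], [1, 2, 3], [255, 255], list(range(50))]
```

If `first_disagreement(legacy_checksum, my_checksum, SAMPLES)` comes back `None`, your mental model matches the original on those inputs, and the rewrite can go in the bin. If it returns a sample, that input is exactly where your understanding is off.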

http://www.objectmentor.com/resources/listArticles?key=topictopic=Craftsman
Craftsman #4 is an interesting related read.