The First Rule of Programming: It's Always Your Fault

True. Unless you’re coding in Ruby. In which case it’s a bug in Ruby :slight_smile:

On one project where we were dealing with some bugs, I suggested we tell the customer that “we didn’t see xyz issue coming and need some time to rearchitect things to handle it in a graceful manner.” My manager wouldn’t hear of it. This was a fairly major issue, not something like a spelling error in a dialog box. It was swept under the rug and glossed over.

I see this as a manifestation of the egocentric culture of engineering: “It can’t POSSIBLY be my fault, for when you find a flaw in what I created, it is like you found a flaw in ME!” All the bravado and machismo is usually a cover for insecurity.

Good post. I agree that you should always point the finger at your code first. Only after you have empirically proven through exhaustive tests that it’s not your code should you go on to look at the OS, outside libraries, etc.

Here’s a fun tale for you Windows ecosystem devs:

Short version: Don’t pass a null reference to a COM object on a Win2000 machine!!

This was incredibly hard to track down because my development machine is running XP, which has a newer version of COM than Win2000 machines do, so this bug only showed up on the production server. XP’s version of COM quietly handles this and goes along its merry way, but 2000 barfs all over the place.
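For anyone hitting something similar: the cheap fix is to guard the argument on your side of the interop boundary, so the failure mode is identical on every OS version. A minimal sketch in C#, assuming a hypothetical COM interface (IDocWriter, its GUID, and Write are illustrative stand-ins, not the actual component from this story):

using System;
using System.Runtime.InteropServices;

// Hypothetical COM interface -- the GUID and method are stand-ins.
[ComImport, Guid("00000000-0000-0000-0000-000000000001")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IDocWriter
{
    void Write(string path);
}

static class DocWriterSafe
{
    public static void Write(IDocWriter writer, string path)
    {
        // XP's newer COM plumbing shrugged off a null here; Windows
        // 2000's crashed. Checking on our side makes the behavior
        // the same on both.
        if (writer == null) throw new ArgumentNullException(nameof(writer));
        if (path == null) throw new ArgumentNullException(nameof(path));

        writer.Write(path);
    }
}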

Jeff, I love your blog and I agree with many of your points. The only thing I have to say about this one is that you have to be careful in accepting responsibility as well. I personally always accept responsibility when my code fails, sometimes to the point where I don’t feel like anyone else should have to fix it except me. Taken that far, accepting responsibility for your own bugs is a pitfall leading to madness, just the same as blaming the OS or the tool library.

Like Gordon Ramsay points out to chefs on the British version of Ramsay’s Kitchen Nightmares, the more work you put in, the more it hurts when it doesn’t turn out right or someone doesn’t like it. Good programmers are much the same: the more effort put into a project, the more it hurts when bugs are found, or worse, when people don’t like the finished product (not one person, mind you, but the majority; you can’t always please everyone).

You have to balance the acceptance of responsibility with a little bit of detachment from your own code. Once the bug or problem is found, it’s good to say, “Yeah, it’s my fault, I’ll get to the bottom of it and fix it,” but it’s also good to approach getting to the bottom of it and fixing it as if it weren’t your code. Part of the problem in both cases is thinking you did something right and not bothering to look at it again; if it weren’t your code, you would look there.

When I started out in the field I had to deal with this a lot. I would write code I thought pristine, and when a bug was found I’d jump in to fix it. The problem was that I never looked at the part I thought pristine, just the parts I thought didn’t look as good. In almost every case the bug was some minor, insignificant bit in the code I wasn’t looking at, all because I wasn’t looking at it objectively. Just a little food for thought.

Well, almost true. I did once have the pleasure of finding and documenting a bona fide driver bug in NVIDIA’s display drivers. Of course, finding that out is only half the solution; you still need to come up with a workaround.

This reminds me of the Happy Days episode where Fonzie had a really hard time saying, “I’m wwwrrrroooonnnnggggg.”

When I was learning how to program, it felt like a loss of innocence the day I realized the compiler was a program just like any other, and subject to bugs and quirks like any other program. That’s just been reinforced by a few significant compiler bugs since then (crashing the VC6 compiler using C++ templates being a favorite hair-pulling-out moment).

I agree that as a rule of thumb, bugs almost always ultimately turn out to be a problem with your code. However, even in those cases there are a number of ways third party software can share in the blame:

  1. Missing, incomplete, or misleading documentation, so the semantics and assumptions of the third-party software are ambiguous.

  2. Missing, incomplete, or misleading error reporting, so when you use the third-party software incorrectly, it fails silently or reports something useless like “Generic Error -1: An error occurred.” (A mitigation sketch follows this list.)

  3. Not having the source code to the third-party software, making it much more difficult to diagnose the interaction between your code and the library. Even if you accept that your code is probably wrong, because of (1) and (2) it may still be nearly impossible to debug without stepping into the third-party code to figure out what it is expecting to happen. It is amazing what five minutes with the debugger can discover compared to hours of poking at a black box.

  4. Having the rug pulled out from under your feet. You wrote some code, it worked perfectly fine, and then a library or OS upgrade changes some subtle aspect of the behavior (or fixes the very bug you had worked around), and previously flawless code falls over.
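Point (2) is the one you can mitigate cheaply from the calling side: wrap the opaque call and attach the context you do have, so a “Generic Error -1” at least says what you were doing when it happened. A sketch, with a hypothetical black-box ImageCodec standing in for the third-party library:

using System;

namespace ThirdParty
{
    // Stand-in for the opaque library; it fails exactly the way
    // point (2) above describes.
    static class ImageCodec
    {
        public static byte[] Decode(byte[] input) =>
            throw new Exception("Generic Error -1: An error occurred.");
    }
}

static class CodecBoundary
{
    public static byte[] Decode(byte[] input, string sourceName)
    {
        try
        {
            return ThirdParty.ImageCodec.Decode(input);
        }
        catch (Exception ex)
        {
            // Rethrow with the context the library withheld, keeping the
            // original as InnerException so no information is lost.
            throw new InvalidOperationException(
                $"ImageCodec.Decode failed on '{sourceName}' ({input.Length} bytes)", ex);
        }
    }
}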

While this advice is certainly true, it’s a little unfair. Things like OSes, compilers, libraries, et cetera, define the rules. When creating these backbone-type systems, those programmers can simply change the rules to fit what they’ve coded. Think of the applications that declare what look like bugs to be features: “I know that looks odd, but it’s supposed to look that way!”

So I would amend Code Complete a little bit… perhaps only 90% are your fault, and the other 5% of its 95% are bugs that have had the documentation changed to fit them.

=D

I don’t see any defensiveness, just an amusing list of times when it WASN’T the developer. I can recall quite a few cases I have found. (A vsprintf bug in MSVCRT.DLL is my favorite; I ran it down to the one line of assembly that was in error.) The fact that 99% of the bugs are mine or another developer’s on the team is a good thing. Tracking down a system bug is tedious, time-consuming, and just plain hard, and coming up with a workaround is just about as bad. Then you have to keep track of the workaround, because it depends on the buggy behavior: the bug could get fixed and break the workaround with no warning. An example of this is a bug in the Oracle 7.3 libs. The bug was known and documented, and we had a workaround. The bug was fixed in a 9i build, which broke the workaround. It was pretty obvious when it happened because I knew about the workaround, but it could have slipped by someone who didn’t.
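One cheap way to keep track of such a workaround is to pin it to the dependency version it was verified against, so an upgrade fails loudly instead of silently breaking it. A sketch; the version strings and the way you obtain the current version are hypothetical:

using System;

static class WorkaroundGuard
{
    // The library version the workaround was actually tested against
    // (a stand-in for the Oracle 7.3 libs in the story above).
    const string VerifiedAgainst = "7.3.4";

    public static void AssertStillApplies(string currentVersion)
    {
        if (currentVersion != VerifiedAgainst)
        {
            // The vendor may have fixed the bug, which can turn the
            // workaround itself into the new bug. Stop and re-verify.
            throw new InvalidOperationException(
                $"Workaround verified against {VerifiedAgainst} but running " +
                $"against {currentVersion}; re-test before trusting it.");
        }
    }
}

Call it once at startup, right next to the workaround it protects; when the 9i-style upgrade lands, you get one clear exception instead of a silent behavior change.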

It’s only programming by coincidence if you tweak things without knowing why. If you’re working with a proprietary OS like Solaris 15 years ago, maybe that’s how you fix things.

These days, all our code runs on Linux (or maybe OpenSolaris). If select() isn’t working right, we open up select.c and check. If your whole stack is open-source, there’s no need to program by coincidence.

Of the work I’ve done the past 5 years, when it looks like a language or library bug, it almost always is. I think the First Rule should simply be “Don’t Program By Coincidence”. Passing blame is just a corollary, at best.

While I’m a big fan of Code Complete, and I would generally advise junior devs to always assume the bug is theirs, I think things have changed a lot since the original advice was given: the frameworks we use (ASP.NET in my case) are huge, and therefore much less well tested and exercised than Solaris’s ‘select’. I’ve certainly seen several instances in the past few years of things broken in ASP.NET (e.g., CheckBox loses its ViewState if sufficiently deeply nested, to quote the most recent example).

I had to test a guy’s code… let’s call him “Frank”… and interestingly, every time I came to him with a defect found while inspecting his code, he’d consistently try to direct me elsewhere. Once, he even advised me that I was the defect. Trust me, he was the only developer I ever ran across that was so sure he was not the cause of a defect that he stuck in my memory.

When I read the article today, I was reminded of Frank. I didn’t have respect for him as a developer only because he never took on responsibility for his portion of the overall objective. The developers I have the greatest respect for are those who will open up their code and step through it while a peer sits with them in amazement. Those developers do take responsibility. When it’s not their code, or piece of an integral system, they will identify what the real issue is - and smile. Frank rarely smiled.

I get your point, but there really does come a time when you’ve found a bug in someone else’s stuff. It does happen.

OK, let’s say you’re coding something that’s going “onto the metal” for some embedded device, and so, to ease testing, you find an emulator. This is a fine emulator, and everything’s good until one day it all falls apart. You’re doing some testing, and something really unexpected pops up: the emulator is expecting some set of registers that you’ve not set up.

“OK,” you think, “I did something wrong,” so you check the manuals for the machine you’re aiming it at. All seems good: the manuals mention requiring the entity in question, but only on a totally different code path from what you’re following. So you check it out properly: burn it onto the ROM and fire off a proper test. And it works perfectly.

So you go to the makers of the emulator, proffer a patch, telling them you’d tested it on the hardware, and they come back to you saying that they’re not going to put the patch in because [insert excuse here].

Is it STILL your fault?

Well, yeah, you should assume it is always your bug…but not to the extent that you miss system bugs. All the nastiest bugs I remember from my career were system bugs. They were nasty precisely because everyone assumed the system was bug-free.

A couple months ago, we spent a few days of confusion on a proprietary platform until we finally pared it down to this code, which caused a crash:

try {
    // A thrown-and-caught int should be a harmless no-op; on that
    // platform's C++ runtime it crashed the process instead.
    throw 5;
}
catch (int)
{
}

This used to be more true in the ’70s, the ’80s, and early ’90s. My experience tells me it’s not as true today.

I recall one time where I claimed the problem was in a system component after analysis. In particular, System.IO.MemoryStream.GetBuffer() was returning an array that was too long. My response was to write a drop-in replacement for the system component. Problem gone. Whether the system component was actually at fault or just wasn’t doing what we expected, I’ll never know (vague documentation…), but the problem was fixed.
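For what it’s worth, GetBuffer() returns the stream’s entire internal buffer, unused capacity included, so its length can legitimately exceed the number of bytes written; ToArray() (or honoring Length) gives you exactly the written bytes. A quick sketch of the difference:

using System;
using System.IO;

class GetBufferDemo
{
    static void Main()
    {
        var ms = new MemoryStream();
        ms.Write(new byte[] { 1, 2, 3 }, 0, 3);

        // GetBuffer() hands back the whole internal buffer, including
        // capacity that has never been written to.
        byte[] raw = ms.GetBuffer();

        // ToArray() copies only the bytes actually written (0..Length).
        byte[] exact = ms.ToArray();

        Console.WriteLine($"Length={ms.Length}, GetBuffer={raw.Length}, ToArray={exact.Length}");
        // Typical output: Length=3, GetBuffer=256, ToArray=3
    }
}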

Well… if you want to put it into perspective, ultimately everything is our fault. Or maybe a better phrase is “it’s your job.”
Like… why did we use this piece of code? Why didn’t we use another piece of code?
Heck… why did we even use this language in the first place? Why didn’t we use another language?
It was our choice; that’s why it’s our fault.
It’s like when Morpheus offered Neo the blue pill or the red pill.
Yeah… we all took the red pill and followed the rabbit down the hole.
Ultimately it is our fault… “if only I hadn’t drunk that much beer before Morpheus asked me that stupid question.”
It is my fault… isn’t it???
To fault or not to fault (OOT).
I think there are going to be some pretty depressed people whose working mindset is always “it’s always my fault.” Maybe in the future, maybe, just maybe, programmers will be in the top 5 deadliest jobs, right under Alaskan king crab fishermen and miners, because of the high rate of suicide from depression (maybe a tad too much exaggeration :stuck_out_tongue: but I’m not trying to be funny).

Coding is not for the faint-hearted. It’s tough when you’re confronting thousands or millions of lines of code, hundreds of bugs, and a tight deadline. You’ve got to be pretty special to rise above the challenge and conquer it all. I agree with the saying, “If you want to be great, you’re responsible for making yourself great.”

I pray for the day when programmers can say, “IT’S NOT MY FAULT.”

Most bugs I come across these days are compiler bugs… but then again, I work on that compiler =)

I have learned this the hard way, but the problem is that it makes you very reluctant to accept it when you do find a bug in a system library.

I spent a whole day trying to prove I HADN’T found a bug in .NET’s XmlSerializer, but I had.

I don’t think an ‘it’s my fault’ mind-set is good advice; instead, logically eliminating causes one by one seems a more sensible mantra for bug fixing.
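That day of proving it usually comes down to a minimal, isolated round-trip test, which settles whether the fault is in your types or in the serializer. A sketch; the Order class here is a hypothetical stand-in, not the type from my story:

using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, pared down to the smallest shape that still
// reproduces the suspect behavior.
public class Order
{
    public int Id;
    public string Customer;
}

class Repro
{
    static void Main()
    {
        var before = new Order { Id = 42, Customer = "ACME" };
        var serializer = new XmlSerializer(typeof(Order));

        // Serialize to a string, then deserialize it straight back.
        var sw = new StringWriter();
        serializer.Serialize(sw, before);
        var after = (Order)serializer.Deserialize(new StringReader(sw.ToString()));

        Console.WriteLine(after.Id == before.Id && after.Customer == before.Customer
            ? "Round-trip OK -- look at your own code first."
            : "Round-trip FAILED -- now you have a minimal bug report.");
    }
}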

I’ve had a slightly different experience: most of the time it was the framework’s fault. And that seems reasonable to me; most frameworks and apps released these days have a “beta” tag attached, or assume one. With the ability to update software over the internet, you don’t feel much responsibility for the product you release: the release-early, release-often principle in the wild. ;-) So, despite the popularization of TDD, the quality of software and frameworks (as more complex beasts), taken at any particular moment in time, is generally going downhill, IMHO. But software is evolving faster, fixing old bugs and introducing new ones in equal proportion. More complex software is more susceptible to bugs, so it’s quite reasonable to suspect bugs in frameworks these days.

Thanks.