Exception-Driven Development

This post is very similar to what Joel Spolsky wrote on beta testing. There is one point he mentioned there that I think adds value to this blog post. He said that if you ship your beta software too early, you will see two negative results:

  1. You will be deluged with more bug reports than you can deal with, and you will end up being forced to ignore most of them.

  2. You will alienate your users because your product is so buggy and unusable.

This is why I think TDD is so valuable. As Iain mentioned, tests aid you in writing better designed code and enable you to refactor with confidence, but they also provide a sanity check for you to let you know that your code works in the basic cases. For the most part I don’t think anyone writes tests that are extremely thorough; they mainly write them to test that code handles the common cases as well as a few potential edge cases. When you have done enough internal testing that you know your product works in the ideal cases, then it is time to let your users take a stab at using it.

@jeff
I get automatic Firefox (+plugins) updates all the time for issues I haven’t run into yet, personally, as a user. This method of shipping software as fast as possible should be seamless and automatic, and it is … in that case!

This sort of thing only works when a user is connected to the internet. It’s an assumption a lot of developers make, which can in many cases be wrong.

At a previous location, we had a mature enterprise product where we took this approach to stabilizing the app. For a mere 30 users, we had 400+ distinct issues come in through FogBugz in the first 3 months (600+ came in and most had several occurrences). All we did was put an exception logging call in all the event handlers.

When I stopped supporting that app after about 2 years, we were down to ~0.8 issues per day following this same approach. We did have the benefit of a ClickOnce deployment so we could pump out several releases per day.
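As a rough illustration of "an exception logging call in all the event handlers," here is a minimal Python sketch. The original app was presumably .NET (ClickOnce), so this is an analogy rather than the code that was used; the decorator name `log_exceptions`, the logger name, and the `on_save_clicked` handler are all hypothetical.

```python
import functools
import logging

# Hypothetical logger name; a real setup would attach a handler that
# forwards records to the bug tracker or an error-reporting service.
logger = logging.getLogger("app.errors")


def log_exceptions(handler):
    """Wrap an event handler so any uncaught exception is logged with its traceback."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Unhandled error in handler %s", handler.__name__)
            raise  # re-raise so the caller's behavior is unchanged
    return wrapper


@log_exceptions
def on_save_clicked(record):
    # Hypothetical handler: fails when quantity is zero.
    return 100 / record["quantity"]
```

Wrapping every handler this way means each distinct failure shows up once per occurrence in the log, which is what makes counting "issues per day" possible.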

That’s assuming your bona fide error causes an exception in the software. What about, say, a captcha system that produces completely illegible squiggles? A search system that turns up useless results for the first 5 keywords I tried when looking for a specific item (and then I go to Google and get it, the first hit on my first search)?

StackOverflow.com, for example, suffers from both these problems and more. It’s one of those programs/websites that doesn’t throw an exception very often (maybe once every week or two for me), and yet many features are not in a state that any person would call working.

I think I prefer sites that are more usable, even if they throw an exception now and then. Of course we aim for both, but putting all of your eggs in the “never throw an exception” basket seems to lead to a lack of manpower to deal with usability.

I don’t believe that exception driven development requires the use of built-in exceptions.

Very nice and interesting post, and a nice, interesting read about WER. We usually sit with users and observe how they interact with the application. If there are any errors, we fix them then and there and make the changes in the code. This works for us because the tools and applications we develop are usually in-house and small. Any bugs found are corrected. Our process is not systematic; we tried to make it so, but it didn’t work for various reasons. One reason is that the process here is more result-oriented than concerned with whether standards are maintained, whether the design is modular enough, or whether there is scope for further improvement. Unfortunately, these tools are developed just in time, and they are used a few times and then forgotten until they are needed again (probably never). They are more like one-time-use tools, or are rarely used (once in three months, or once in two years), so there is really no interest in putting in that little extra effort, because we know the tool will be used for a while and then forgotten. There are a few applications and tools that I have worked on where I tested well, put in that extra effort, and made the design modular enough to handle any changes; so far so good.

That’s “home in on.” Why would you write “hone”? It doesn’t even make sense…

People have been saying “hone” instead of “home” since like 1968. Stop living in the past.

Amen. I also set up error reporting as the first thing on any web-deployed project. It makes a huge difference.

One quibble, though. Test-Driven Development (or whatever you want to call the different variations of “developers write code to exercise code”) shouldn’t be justified in terms of bug fixing or bug finding. It’s about bug prevention. Surgeons don’t sterilize instruments to cure existing infections; they do it to prevent new ones.

I can sympathize with this post; however, there are two important issues worth mentioning.

First, unless you are extremely liberal with your use of exceptions (which most aren’t, for good reason), you will miss a potentially large class of bugs caused by calculation or workflow errors. Think about cancellation issues in financial calculations that look A-OK but really reflect a bug in the system. Good TDD is great at rooting out these issues.
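One possible reading of "cancellation issues" is floating-point rounding, where large offsetting entries swallow a small one. The sketch below assumes that reading; the ledger values and the function names `naive_balance` and `exact_balance` are made up for illustration. The point is that the float version raises no exception, so exception logging never sees it, while a simple test does.

```python
from decimal import Decimal

# Hypothetical ledger: two huge offsetting entries and one small real one.
LEDGER = ["10000000000000000.00", "1.00", "-10000000000000000.00"]


def naive_balance(amounts):
    """Sum amounts as binary floats; small entries can be lost silently."""
    total = 0.0
    for amount in amounts:
        total += float(amount)
    return total


def exact_balance(amounts):
    """Sum amounts as decimals, which keeps every cent at this magnitude."""
    total = Decimal("0")
    for amount in amounts:
        total += Decimal(amount)
    return total


def test_balance_keeps_small_entry():
    # This assertion would fail for naive_balance, with no exception in production.
    assert exact_balance(LEDGER) == Decimal("1.00")
```

Here `naive_balance(LEDGER)` returns `0.0`, because doubles near 1e16 are spaced 2 apart and the 1.00 entry is rounded away before the large entries cancel.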

Second, users do funny things. I noticed on one project (that implemented a similar exception logging system) that users would give up on parts of the application that were consistently buggy or, worse yet, figure out complex workarounds. This will cause the number of exceptions caused by a given code path to drop over time and divert the developers’ attention to more frequent exceptions.

I agree partially. What you say is true, but a balance must be found so that you don’t ship software so buggy that it scares away your clients or possible future clients (in demo versions, for example).

In summary, shipping too late is bad and shipping too early is also bad.

“When a user informs me about a bona fide error they’ve experienced with my software, I am deeply embarrassed. And more than a little ashamed.”

Actually, I think this is possibly the most powerful statement in the whole article. Regardless of technical procedures used, if more people actually felt this way, they would probably do a better job. I work with a lot of people who have no pride in their work, and fixing bugs is just something they take in stride. They feel nothing about the existence of the bug in the first place, and do nothing to try to eliminate them unless someone complains.

100% agree that this is very important - I also try to fix exception handling first - but crashes != errors.

Perhaps a feature is too broken to work at all, or your parser rejects uncommon but valid input, or your application got slower, or its screens no longer match the help pictures - none of these generate exceptions.

Don’t only log exceptions, log all types of errors and quite a lot of other information too, and regularly spend a few minutes sitting with users.
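A minimal sketch of logging problems that never raise an exception, such as a slow operation or rejected input. Everything here is an assumption made for illustration: the logger name, the `log_if_slow` helper, the `parse_quantity` function, and the choice of WARNING level.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("app.diagnostics")  # hypothetical logger name


@contextmanager
def log_if_slow(operation, budget_seconds):
    """Log a warning when the wrapped block takes longer than its time budget."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed > budget_seconds:
            logger.warning("%s took %.3fs (budget %.3fs)",
                           operation, elapsed, budget_seconds)


def parse_quantity(text):
    """Return an int, or None when the input is rejected (and log the rejection)."""
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdecimal():
        # No exception is raised here, so only this log line records the failure.
        logger.warning("rejected quantity input: %r", text)
        return None
    return int(cleaned)
```

Reviewing the rejected inputs periodically is one way to spot "uncommon but valid" values that the parser should have accepted.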

I hate to pile on, but Jeff, your counter-TDD example is bogus.

If you find a bug with your test that a user would never encounter then why does that code even exist?

To me, TDD enforces the contract that the production code has with its clients. There’s a school of thought that says the test code should be written before the production code. You’re done when the production code satisfies the test.

I do this a lot with Paint.NET. Most of my 0.01 minor updates are pushed out solely to fix bugs that were identified from the stream of crash logs that were e-mailed to me. Occasionally I’m able to implement a conservative fix for a bug that I can’t even repro myself. For example, I compile without “Check for underflow/overflow” enabled for major performance reasons. Some people have software that installs mouse or keyboard or other hooks, and these start executing in my process and flip the x87 floating point exception handling bits and then don’t un-flip them. This causes my code to throw overflow exceptions when it shouldn’t (or rather, it happens in code that I’ve written and tested and determined that even if it does overflow it’s ok – certain types of pixel manipulation code, for instance). So, based on exception logs that identify a few hot spots, I can put in a try {…} catch (OverflowException) { provide default value of zero or whatever }. It’s then reasonably proven that the crash is fixed because I stop getting logs with that callstack and exception type.
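The "catch the overflow and provide a default" pattern can be sketched in Python as below. This is only an analogy: Paint.NET is .NET code, and Python's `OverflowError` arises under different conditions than a .NET `OverflowException`. The function name `safe_exp` and the default of zero are hypothetical.

```python
import math


def safe_exp(x, default=0.0):
    """Return e**x, or a default when the result does not fit in a float."""
    try:
        return math.exp(x)
    except OverflowError:
        # Conservative fix at a known hot spot: swallow only this one
        # exception type and substitute a harmless value.
        return default
```

Catching only the specific exception type keeps the fix narrow, so unrelated failures at the same call site still surface in the crash logs.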

Wrong wrong wrong!

EDD rebuttal: 1. You will have no beta users: if your software is too buggy, no one will keep using it. 2. EDD encourages behavior such as Run-It-And-See-If-It-Crashes…

@Jeff Atwood
I enjoyed the article, although I get the feeling you don’t have a good grasp of test-driven development, based on the following statement:
“Although I remain a fan of test driven development, the speculative nature of the time investment is one problem I’ve always had with it. If you fix a bug that no actual user will ever encounter, what have you actually fixed?”
This completely misses the point of Test-Driven Development. I think you would benefit from reading the article “Test-Driven Development Isn’t Testing” by Jeff Patton
(http://www.stickyminds.com/sitewide.asp?Function=edetailObjectType=COLObjectId=8497)

And constantly being annoyed by emails is a great incentive to fix bugs quickly.
Or just ignore them all.

I prefer to not write bugs to begin with.

@Bill

Good luck with that.

To the people who said “…why does that code even exist?”, I think the correct question is “why does that test exist?” Presumably the code does something useful but the test scenario can’t happen in real life, and that’s how I understood the comment about TDD being speculative.

Being an embedded systems guy, I can only envy web developers who can rush something out and let users do the testing. To a much lesser degree we do that too, but it results in devices being shipped back to us and lots of warranty cost.