Error Codes Must Die

A recent Scott Hanselman post described a problem he had with Windows Defender:


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/02/error-codes-must-die.html

Oh god yes, I hate error codes with a passion. As I deal with mainframes all day at work (go go os/400) I see a TON of them… and end up just googling the number to find out what it means.

Every developer should be REQUIRED to read About Face 2.0 if they have anything to do with UI and especially if they have to create the output that comes with error messages.

Jeff, I totally agree that such error code-based messages are not great to say the least (who wouldn’t BTW!).

OTOH, I think that “Don’t ever use error codes” is too easy a conclusion.

Because you assume:
a) we (the devs) can always know the cause of the error.
And/or:
b) there is always a meaningful message associated with the error code.

Boy! If a) were true, our apps could in most cases avoid or work around the error.
As for b), while an error message is usually way better than a code, it’s not always available. And it’s often not self-explanatory (e.g. “An argument is invalid”). In the case of b), I often find myself appending the error code to the message, because if users have to get back to you, there is not one single chance they’ll report the error message correctly (unless they send you a screenshot).

The problem is with errors we were not able to anticipate. In most such cases, expecting to be able to give the user a useful diagnostic is over-optimistic at best.

Cheers,

I agree with the last comment. I always log errors in a human-readable form, but most of the time the errors come from third-party software (including Windows).

You cannot give a meaning to those errors (most of the time they are unexpected errors that were caught). COM errors like the one shown in your blog are a real headache, both as a user and as a developer.

To solve the error code problem, we must start fixing it at the very lowest level (the OS) and work up through the software chain.

Hope that my bad English is not a problem.

Nice blog.

I always do both: give a helpful error message and the error code that produced the problem.

The first helps users find the solution to the problem themselves; the error code helps the support people track down the problem.

Not all helpful error messages fit into a popup window, and an error code often helps to find solutions using Google.

BTW: Error codes are still very common, especially when different systems have to work together without user interaction.

Regards, Lothar

http://www.electrosonics.net/technotes/xperror0x8024402C.htm

Googled in 5 seconds. I’m totally happy with error codes :slight_smile: especially if the program that produces the problem is multilanguage capable. Or have you ever seen multilanguage error codes? gg

Error codes and those totally useless messages that just frustrate - like ‘Error connecting to database’, ‘failed to read file’, ‘error reading’

Put the name of the file, say what you were trying to do, put the OS error message. Give the poor user some sort of a fair chance at using your program.
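A minimal sketch of that advice in Python. The helper names `describe_file_error` and `read_settings` are hypothetical, made up here for illustration; the point is only that the message names the file, says what was being attempted, and carries the OS error text and number.

```python
def describe_file_error(action: str, path: str, err: OSError) -> str:
    """Build a message naming the file, the attempted action, and the OS error."""
    # strerror and errno are None when the OSError was raised without them.
    os_text = err.strerror or "unknown error"
    code = f" (OS error {err.errno})" if err.errno is not None else ""
    return f"Could not {action} '{path}': {os_text}{code}."


def read_settings(path: str) -> str:
    """Read a text file, turning any OS failure into a descriptive error."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as err:
        # Keep the original exception chained so developers still see the cause.
        message = describe_file_error("read settings from", path, err)
        raise RuntimeError(message) from err
```

The exact wording is a matter of taste; what matters is that “File open error” never appears without the file name next to it.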

Worse, libraries that display their own messages without the programmer being able to trap them. I came across a wonderful one recently - ‘See the 5185 error code documentation for details. axServerConnect AdsConnect’ - now that’s a useful error message.

Ian

I agree that messageboxes that contain just error codes are utterly useless.

However, messageboxes that contain textual descriptions of the error can be equally useless. Consider the case of a file open error. The lazy programmer thinks the file open will never fail and doesn’t check that case. The not-so-lazy programmer catches it and displays a messagebox with the ever helpful “File open error” message. Sigh. What file didn’t open?

How about something really useful like:

So, who is General Failure and why is he reading my hard disk?

Error codes aren’t for users, they are for developers and support staff.

If you let users think they have a clue what they’re doing, then they try to fix things, and NOTHING good ever happens then.

Frankly, I think that the real problem is the USERS, not the codes. If you could just keep people away from computers, they would be so much more stable (both the computers and the users).

I say ban computer users today. What say we start an online petition?

Unfortunately, as long as my software relies upon other systems that still use error codes, there will always be cases that are very difficult to deal with.

Here’s what I would do in this situation

  1. Have the software “phone home” (with user’s consent) when it encounters an error.

  2. Log all errors.

  3. For the top 5 errors, implement help that follows Cooper’s guidelines

  4. Ignore the rest, since the top 5 probably account for 80-90 percent of errors users encounter.

The critical piece here is getting a list of the ACTUAL errors users are encountering, rather than playing a guessing game.
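The tallying step can be sketched roughly like this in Python. It assumes the logged (or phoned-home) errors have already been collected as a flat list of code strings; the function name `top_errors` is hypothetical.

```python
from collections import Counter


def top_errors(logged_codes, limit=5):
    """Return the most frequent error codes and the share of all errors they cover."""
    counts = Counter(logged_codes)
    total = sum(counts.values())
    top = counts.most_common(limit)  # list of (code, count), most frequent first
    covered = sum(count for _, count in top)
    share = covered / total if total else 0.0
    return top, share
```

If the returned share is nowhere near 80-90 percent, that is a sign the top 5 are not enough and the guideline needs adjusting for that product.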

OTOH, I think that “Don’t ever use error codes” is too easy a conclusion

Maybe I should clarify this: don’t ever present error codes alone to the user. You can parse them internally if you need to, but presenting them as-is to a user is just irresponsible.

Having a Cooper-compliant error dialog and an error code doesn’t hurt, of course. Users will just ignore the code.

Sigh. I wish as many people commented on my site as on yours. :slight_smile:

I agree with the commenter who noted that an error code (as distinct from just a message) is more international.

If everything is a number, then it’s a snap to internationalize everything. You might as well build your whole UI with numbers to save some cash. Of course nobody will be able to use your app, but hey, we saved some bucks.

I don’t subscribe to this “we must do things to make our apps easier to translate” theory, which also came up in the icon/text post. It’s just not true-- you’re only shuffling the costs around, not saving anything.

I can google “0x8024402c”

Isn’t the whole point of computers to take on this kind of mind-numbing work? Like, say, manually looking up error codes?

But I definitely agree that having an additional error code in the (polite, illuminating, helpful) error message somewhere is fine. Preferably behind a “More” button.

Maybe we as developers are afraid that the user will see that we did something wrong or that we did some crappy implementation.

Maybe error 0x32324234 0x123123123123 0x213123 sounds more advanced than:

“Your machine ran out of memory, we thought that 640k should be enough for anyone. Please reboot your machine and try to recreate whatever data you lost. In the future, please try saving your data as often as you can.”

Anyway, I was wondering: what about error reporting software? Does it ever work?

Scott: In all fairness, I thought about it, but being late to the party made me want to comment somewhere closer to the time when I could get to it.

Jeff:
“Here’s what I would do in this situation”

Funny thing is, Windows Error Reporting does practically all 4 of those. There’s an option to contact support. Common error messages retrieve a description, if there’s a resolution, all within the same dialog. I’ve still received lame “Your video card stinks” or similar messages, but at least it was something. If you use Event Viewer and follow the help link, it finds the error code and gets a list of knowledge base articles and other resources. I’ve fixed 98% of all errors in the event log by following something from this list, never having to stray even one level away. Some messages get no response beyond “Sorry, figure it out yourself” (which feels more like a middle finger to me), but luckily searching for the descriptive error message covers the bulk of what is missed.

I’m honestly surprised this wasn’t a product that followed the existing error reporting. It’s an ONLINE anti-virus application. It may function, but it is no real “Defender” if its definitions are out of date. Using the existing error reporting service, or at least something with similar functionality, seems like the minimum. You want users figuring this out quickly if they can, reducing your support load. You would most certainly get more support calls if users couldn’t find the right information to fix the problem quickly.

I agree that both a helpful message and an error code should be presented to the user.
However, a more pressing issue, I think, is the ability to copy and paste errors. Too many times am I presented with a message box whose text I can’t select and copy, and find myself manually typing the error message (or the shortest possible uniquely identifiable substring of it) into Google.
The ideal solution (other than no errors) is a short, well-designed error dialog that presents key information that the user may be able to use to fix the problem, a textbox with a more detailed description and an error code that can be selected and copied, and an option to look it up in a database of common problems on your website, so you can post fixes and hot patches as you find them.
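As a rough sketch of the data behind such a dialog, in Python: the `ErrorReport` class and its field and method names are hypothetical, and no particular UI toolkit is assumed. The dialog would show `summary` up front, put `as_copyable_text()` in the selectable textbox, and use `search_query()` for the lookup option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorReport:
    summary: str  # short, plain-language text shown first
    details: str  # longer description for the selectable textbox
    code: str     # error code for support staff and searches

    def as_copyable_text(self) -> str:
        """Everything a user might paste into a support request."""
        return f"{self.summary}\n\nDetails: {self.details}\nError code: {self.code}"

    def search_query(self, product: str) -> str:
        """A short string a novice can paste into a search engine."""
        return f"{product} error {self.code}"
```

For example, a report built with the code from this thread, `ErrorReport("Updates could not be downloaded.", "The update server did not respond.", "0x8024402C")`, would yield the search string `Windows Defender error 0x8024402C` when `search_query("Windows Defender")` is called. The summary and details text there are invented for the example.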

is a short, well designed error dialog that presents key information that the user may be able to use to fix the problem, a textbox with a more detailed description and an errorcode, that can be selected and copied,

I agree. Take a look at the screenshots in these articles:

http://www.codeproject.com/dotnet/ExceptionHandling.asp

http://www.codeproject.com/aspnet/ASPNETExceptionHandling.asp

Doesn’t that look like a COM error? Notoriously unhelpful, those.

I agree with the commenter who noted that an error code (as distinct from just a message) is more international, assuming you don’t have the wherewithal to get all the errors translated. As a corollary, this jibes with the folks who note that an error code is for other software to read. It is bad practice, I believe, to have to parse an error message string to determine in software what error occurred, since, for example, the string might have been translated.

But that doesn’t mean that users have to see them, unless the problem is a) catastrophic and b) the point is to be able to read that error code to a help technician. That would not be the most common case, no.

Put the name of the file, say what you were
trying to do, put the OS error message.

There can be security reasons not to be broadcasting the names of things to random users. Certainly in a Web app.
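One possible way to square this with the “name the file” advice, sketched in Python: log the full detail where only developers can see it, and show the user a generic message plus a reference that support can match against the log. The function name `public_error_message` and the logger name are assumptions made for this example.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def public_error_message(action: str, err: Exception) -> str:
    """Log full details privately; return a message that is safe to show anyone."""
    reference = uuid.uuid4().hex[:8]
    # The log entry keeps paths and OS details; the user-facing text does not.
    logger.error("ref=%s action=%s error=%r", reference, action, err)
    return (
        f"We could not {action}. "
        f"Please quote reference {reference} if you contact support."
    )
```

The user still gets something they can read aloud or paste into an email, and the file names stay in the server log.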

You might as well build your whole UI with numbers to save some cash.

Uh, what I SAID was that users should not have to see those numbers, but they are suitable as machine-readable data.

I don’t subscribe to this “we must do things to make our apps easier to translate” theory.

Well, we do. Windows is translated into, what, 36 languages and counting? You can bet that those folks organize and manage their errors by number, not by text. They localize the text for those errors, but no piece of code anywhere in Windows should ever rely on an error message string.

I didn’t say it was cheaper to translate. The benefit is that it becomes practical to translate thousands of error messages into dozens of languages if you can keep track of what the errors are. And the practical way to do that is to give them a unique, globally neutral identifier, i.e., a number.
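To illustrate the idea (this is a sketch, not how Windows actually organizes its messages): code branches on a numeric identifier, and the text is looked up per language only when it has to be shown. The `ErrorCode` values, the `MESSAGES` catalog, and `localized_message` are all hypothetical.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    FILE_NOT_FOUND = 1001
    CONNECTION_FAILED = 1002


# Per-language catalogs keyed by code; translators never touch program logic.
MESSAGES = {
    "en": {
        ErrorCode.FILE_NOT_FOUND: "The file '{path}' could not be found.",
        ErrorCode.CONNECTION_FAILED: "Could not connect to '{host}'.",
    },
    "de": {
        ErrorCode.FILE_NOT_FOUND: "Die Datei '{path}' wurde nicht gefunden.",
    },
}


def localized_message(code: ErrorCode, locale: str, **params) -> str:
    """Look up the text for a code, falling back to English, then to the bare code."""
    template = MESSAGES.get(locale, {}).get(code) or MESSAGES["en"].get(code)
    if template is None:
        return f"Unknown error (code {int(code)})."
    return template.format(**params)
```

Software compares against `ErrorCode.FILE_NOT_FOUND`, never against the message string, so translating or rewording the text cannot break any logic.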

"I agree. Take a look at the screenshots in these articles"
It’s very good progress over the usual errors. One problem, and the only one I can really think of, is that the copy-and-pasteable part of the message is too long in itself. You really need a short bit that novice users can copy and paste into Google that is most likely to come up with a result, and is obviously so. But I like it, I may have to start writing my own custom error dialogs based on this.