Design Away Error Handling?

[Image: Evil Monkey Pointing]

Writing is an interesting process. Some posts spring to life; ignited by some inspiration, they swiftly flow from fingertips to (virtual) page. Other posts simmer. An idea is half-conceived, then languishes incomplete. It sits in the corner staring at you balefully, a reproach for your lack of commitment. In the case of this one, it sat for the better part of a year because I wasn’t quite sure which side I wanted to come down on.

It started with a fairly uncontroversial tweet from Michael Feathers: “Spend more time designing away errors so that you don’t have to handle them.” On its face, this is reasonable; eliminating error vectors should lead to a more robust product. Some warning flags appear, however, when you read the stream leading up to that tweet:

Francis Fish’s point about context (“…medical equipment and (say) android app have totally diff needs…”) certainly applies, as does Feathers’ reply. Whenever I see the word “best” devoid of context, my credibility detector bottoms out. It’s the response to Brian Knapp (“Yeah, they are better not used at all. :)”) that is worrisome. Under some circumstances, throwing an exception when an error condition occurs is the right answer.

Having default values for parameters is one technique for designing away errors. Checking for problem conditions such as disk space or network connectivity prior to use can be used as well. The key thing to remember is that these techniques assume that the problem is an expected one and that something can be done about it. Checking for space or connectivity is useless if you don’t have an alternate location to write to or if you lack the ability to restore the connection. Likewise, use of a default value is only appropriate when there is a meaningful default.
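To make those two techniques concrete, here is a minimal Python sketch. The names (`pick_write_location`, `free_bytes`, the `probe` parameter) are hypothetical, invented for illustration only. It shows a parameter with a meaningful default and a disk-space check that only earns its keep because there is an alternate location to fall back on.

```python
import shutil
from typing import Callable, Optional, Sequence


def free_bytes(path: str) -> int:
    """Return the free space, in bytes, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def pick_write_location(
    candidates: Sequence[str],
    needed_bytes: int,
    probe: Callable[[str], int] = free_bytes,  # meaningful default: ask the real disk
) -> Optional[str]:
    """Return the first candidate with enough room, or None if none qualifies.

    `probe` is a hypothetical hook so the check can be exercised without
    touching a real filesystem.
    """
    for path in candidates:
        try:
            if probe(path) >= needed_bytes:
                return path
        except OSError:
            # The location itself is unusable (missing, unmounted); try the next.
            continue
    return None
```

If the function returns `None`, the check has told you something you cannot act on, which is exactly the case where the caller still needs to report an error. The check is also only a snapshot: space can run out between the check and the write, so the write itself can still fail.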

The thing to remember is that avoiding an exception is not the goal; correct execution and valid state are. If you’re transferring money between accounts, you want to be able to trust that either the transaction completed and the balances are adjusted or that you know something went wrong. Silent failures are much more of a problem than noisy errors. As Jef Claes noted in “Tests as part of your code”, silent failures can put you in the newspaper (and not in a good way).
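A rough Python sketch of the money transfer case, under stated assumptions: `Account`, `InsufficientFunds`, and `transfer` are hypothetical names, and a real system would also need persistence and a real transaction around the two updates. The point is only that the function either completes fully or fails loudly before changing anything.

```python
from dataclasses import dataclass


class InsufficientFunds(Exception):
    """Raised when the source account cannot cover the transfer."""


@dataclass
class Account:
    name: str
    balance_cents: int  # integer cents, to sidestep floating-point rounding


def transfer(source: Account, target: Account, amount_cents: int) -> None:
    """Move money between accounts, or raise without touching either balance."""
    if amount_cents <= 0:
        raise ValueError("transfer amount must be positive")
    if source.balance_cents < amount_cents:
        # Noisy failure: the caller cannot mistake this for success.
        raise InsufficientFunds(
            f"{source.name} has {source.balance_cents}, needs {amount_cents}"
        )
    # All validation happens before the first mutation, so a rejected
    # transfer leaves both balances exactly as they were.
    source.balance_cents -= amount_cents
    target.balance_cents += amount_cents
```

The silent alternative would be to return quietly, or to clamp the amount to whatever is available. Either avoids the exception, and either leaves the caller believing in a transfer that never happened as requested.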

A more recent Twitter exchange involving Feathers returned to this same issue:

The short answer is yes, it’s a bug. Otherwise things found in code reviews would not count as defects because they had not happened “live”. The last tweet in that stream summed it up nicely:

We cannot rely on design alone to eliminate error conditions because we cannot foresee all potential issues. Testing shares this same dilemma. I believe Arlo Belshee strikes the right balance in “Treat bugs as fires”. Fire departments concentrate foremost on preventing fires while still extinguishing those that fall through the cracks. Where one occurs, it’s treated as a learning experience. So too should we treat error conditions. Dan Cresswell put it nicely:

Handling exceptions is tedious, but critical. Where we can remove risks, or at least reduce them via design, so much the better. We cannot, however, rely on our ability to foresee every circumstance. Chaos Monkey chooses you.

[Hat tip to Lorrie MacVittie for tweeting the evil monkey image above]

