A recent post on The Daily WTF highlighted a system that “…throws the fewest errors of any of our code, so it should be very stable”. The punchline, of course, was that the system threw so few errors because it was catching and suppressing almost all the errors that were occurring. Once the “no news is good news” code was removed, the dysfunctional nature of the system was revealed.
On one level, it’s funny to think of a system being considered “very stable” on the basis of it destroying the evidence of its failures. Anyone who has been in software development for any length of time probably has a war story about a colleague who couldn’t tell the difference between getting rid of error messages and correcting the error condition. However, if the system in question is critical to the user’s personal or financial well-being, then it’s not so amusing. Imagine thinking you had health insurance because the site where you enrolled said you did, and finding out later that you really didn’t.
Developing software that accomplishes something isn’t trivial, but then again, it isn’t rocket science either. Performing a task when all is correct is the easy part. We earn our money by how we handle the other cases. This is not only a matter of technical professionalism, but also a business issue. End users are likely to be annoyed if our applications leave them stranded out of town or jumping through hoops only to be frustrated at the end of the process.
Better an obvious failure than a mystery that leaves the user wondering if the system did what it said it did. Mystery impairs trust, which is a key ingredient in the customer relationship.
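To make the contrast concrete, here is a minimal sketch of the two behaviors. Everything in it is hypothetical and for illustration only (`EnrollmentError`, `save_enrollment`, and the two `enroll_*` functions are invented names, not code from the system in the Daily WTF post). The first version swallows the failure and reports success anyway; the second lets the failure surface.

```python
class EnrollmentError(Exception):
    """Raised when an enrollment could not be recorded (hypothetical)."""


def save_enrollment(store: dict, member_id: str, plan: str) -> None:
    # Hypothetical persistence step: an empty plan name is rejected.
    if not plan:
        raise EnrollmentError(f"no plan given for member {member_id}")
    store[member_id] = plan


def enroll_quietly(store: dict, member_id: str, plan: str) -> str:
    # "No news is good news": the failure is swallowed and the user
    # is told everything worked, whether or not anything was saved.
    try:
        save_enrollment(store, member_id, plan)
    except Exception:
        pass
    return "You are enrolled."


def enroll_loudly(store: dict, member_id: str, plan: str) -> str:
    # No blanket catch: a failed save propagates to the caller, so the
    # success message is only ever returned after a real success.
    save_enrollment(store, member_id, plan)
    return "You are enrolled."
```

With an empty plan, `enroll_quietly` returns the success message while storing nothing, and `enroll_loudly` raises `EnrollmentError`. The second is the more annoying outcome in the moment and the more trustworthy one afterward.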
All of the above was written with the assumption of incompetence rather than malice. However, a comment from Charlie Alfred made during a Twitter discussion about technical debt raised another possibility:
Wonder if such a thing as “Technical Powerball”? Poor design, unreadable code, no doc, but hits jackpot anyway 🙂
Charlie’s question doesn’t assume bad intent, but it occurred to me that if “jackpot” is defined as “it just has to hold together ’til I clear the door and cash the check”, then perhaps a case of technical debt is really a case of “Technical Powerball”. Geek and Poke put it well:
I’ve been saying for some time now that not only should exceptions be big and obvious, but that I usually don’t let lower application layers (e.g. data layer) even catch them. Throw a huge exception and let the consuming layers figure out what to do with it.
Indeed…like most issues, the first consideration should always be “why?” If you’re not catching a specific, expected exception that you intend to deal with in a certain way, then it should be left to bubble up to the top of your call stack to be reported and logged.
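A rough sketch of that rule of thumb, in Python and under assumed names (`load_setting`, `parse_port`, `run`, and `main` are all hypothetical): the lower layer catches only the one exception it expects and knows how to handle, and anything else travels up to a single top-level handler that logs it.

```python
import logging

logger = logging.getLogger("app")  # hypothetical logger name


def load_setting(settings: dict, key: str, default: str) -> str:
    # A missing key is expected and has an obvious remedy,
    # so KeyError (and only KeyError) is caught here.
    try:
        return settings[key]
    except KeyError:
        return default


def parse_port(raw: str) -> int:
    # No try/except: a malformed value raises ValueError,
    # which is left to bubble up the call stack.
    return int(raw)


def run(settings: dict) -> int:
    return parse_port(load_setting(settings, "port", "8080"))


def main(settings: dict) -> int:
    # Top of the call stack: the one place where the unexpected
    # is reported and logged, with its traceback.
    try:
        port = run(settings)
    except Exception:
        logger.exception("unhandled error while starting up")
        return 1
    logger.info("listening on port %d", port)
    return 0
```

Here a missing `port` falls back to the default, while a value such as `"abc"` is not papered over: the `ValueError` reaches `main`, which logs it and returns a non-zero status.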
In addition to the flagrant exception that illuminates the user’s screen, all successes and failures could/should be written to an application-specific event log. This provides additional auditing capabilities for the results and has proven more than once to be worth the time and effort it takes to implement.
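One way this might look, sketched in Python with the standard `logging` module: the `audited` decorator, the `app.audit` logger name, and the `transfer` function are assumptions made up for the example. Each call records its outcome, and a failure is logged and then re-raised so the log never takes the place of the visible error.

```python
import functools
import logging

audit_log = logging.getLogger("app.audit")  # hypothetical logger name


def audited(operation):
    """Record the outcome of every call, then let any failure propagate."""
    @functools.wraps(operation)
    def wrapper(*args, **kwargs):
        try:
            result = operation(*args, **kwargs)
        except Exception:
            # Log the failure with its traceback, then re-raise it:
            # the audit entry supplements the exception, it does not replace it.
            audit_log.exception("FAILURE %s", operation.__name__)
            raise
        audit_log.info("SUCCESS %s", operation.__name__)
        return result
    return wrapper


@audited
def transfer(balance: int, amount: int) -> int:
    # Hypothetical business operation used only to exercise the decorator.
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount
```

Where the entries end up (a file, the operating system's event log, a central service) depends on the handlers attached to the logger, which this sketch leaves unconfigured.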
Absolutely…logging is critical not only for dealing with specific situations, but also to look for trends. Lots of ostensibly user-related errors may indicate a lack of intuitiveness in a particular area of an application.
Pingback: Everything is Hunky Dori, Always, No Matter What | Becoming agile
Conway’s Law suggests that the architecture of the product is a reflection of the architecture of the organization that builds it, and vice-versa.
More thoughts in relation to this post are elaborated here: http://wp.me/p1TTTL-cw
Great observation, Ilan. Organizational culture could definitely be a source of this type of behavior.