Ignorance Isn’t Bliss, Just Good Tactics

Donkey

There’s an old saying about what happens when you assume.

The fast lane to asininity seems to run through the land of hubris. Anshu Sharma’s TechCrunch article, “Why Big Companies Keep Failing: The Stack Fallacy”, illustrates this:

Stack fallacy has caused many companies to attempt to capture new markets and fail spectacularly. When you see a database company thinking apps are easy, or a VM company thinking big data is easy — they are suffering from stack fallacy.

Stack fallacy is the mistaken belief that it is trivial to build the layer above yours.

Why do people fall prey to this fallacy?

The stack fallacy is a result of human nature — we (over) value what we know. In real terms, imagine you work for a large database company and the CEO asks, “Can we compete with Intel or SAP?” Very few people will imagine they can build a computer chip just because they can build relational database software, but because of our familiarity with building blocks of the layer up, it is easy to believe you can build the ERP app. After all, we know tables and workflows.

The bottleneck for success often is not knowledge of the tools, but lack of understanding of the customer needs. Database engineers know almost nothing about what supply chain software customers want or need.

This kind of assumption can cost an ISV a significant amount of money and a lot of good will on the part of the customer(s) they attempt to disrupt. Assumptions about the needs of the customer (rather than the customer’s customer) can be even more expensive. The smaller your pool of customers, the more damage that’s likely to result. Absent a captive-customer situation, incorrect assumptions in the world of bespoke software can be particularly costly (even if only in terms of good will). Even comprehensive requirements are of little benefit without the knowledge necessary to interpret them.

But, that being said, deep familiarity with a domain carries its own dangers.

This would seem to pose a dichotomy: domain knowledge as both something vital and an impediment. In reality, there’s no contradiction. As the old saying goes, “a little knowledge is a dangerous thing”. When we couple that with another cliche, “familiarity breeds contempt”, we wind up with Sharma’s stack fallacy, or as xkcd expressed it:

'Purity' on xkcd.com

To create and evolve effective systems, we obviously need domain knowledge. We also need to understand that what we possess is not domain knowledge per se, but domain knowledge filtered through (and likely adulterated by) our own experiences and biases. Without that understanding, we risk what Richard Martin described in “The myopia of expertise”:

In the world of hyperspecialism, there is always a danger that we get stuck in the furrows we have ploughed. Digging ever deeper, we fail to pause to scan the skies or peer over the ridge of the trench. We lose context, forgetting the overall geography of the field in which we stand. Our connection to the surrounding region therefore breaks down. We construct our own localised, closed system. Until entropy inevitably has its way. Our system then fails, our specialism suddenly rendered redundant. The expertise we valued so highly has served to narrow and shorten our vision. It has blinded us to potential and opportunity.

The Clean Room pattern on CivicPatterns.org puts it this way:

Most people hate dealing with bureaucracies. You have to jump through lots of seemingly pointless hoops, just for the sake of the system. But the more you’re exposed to it, the more sense it starts to make, and the harder it is to see things through a beginner’s eyes.

So, how do we get those beginner’s eyes? Or, at least, how do we get closer to having a beginner’s eyes?

The first step is to let go of the notion that we already understand the problem space. Lacking innate understanding, we must then do the hard work of determining what the architecture of the problem (our context) actually is. As Paul Preiss noted, this doesn’t happen at a desk:

Architecture happens in the field, the operating room, the sales floor. Architecture is business technology innovation turned to strategy and then executed in reality. Architecture is reducing the time it takes to produce a barrel of oil, decreasing mortality rates in the hospital, increasing product margin.

Being willing to ask “dumb” questions is just as important. Perception without validation may be just an assumption. Seeing isn’t believing. Seeing and validating what you’ve seen is believing.

It’s equally important to understand that validating our assumptions goes beyond just asking for requirements. Stakeholders can be subject to biases and myopic viewpoints as well. While it’s true that Henry Ford’s customers would probably have asked for faster horses, it’s also true that, in a way, that’s exactly what he delivered.

We earn our money best when we learn what’s needed and synthesize those needs into an effective solution. That learning is dependent on communication unimpeded by our pride or prejudice.

Design Follies – ‘Why can’t I do that?’

Man in handcuffs

It’s ironic that the traits we think of as making a good developer are also ones that can get in the way of design and testing, but it’s true. Think of how many times you’ve heard (or perhaps said) “no one would ever do that”. Yet, given the event-driven, non-linear nature of modern systems, if a given execution path can occur, it will occur. Our cognitive biases can blind us to potential issues that arise when our product is used in ways we did not intend. As Thomas Wendt observed in “The Broken Worldview of Experience Design”:

To a certain extent, the designer’s intent is irrelevant once the product launches. That is, intent can drive the design process, but that’s not the interesting part; the ways in which users adopt the product to their own needs is where the most insight comes from. Designer intent is a theoretical, speculative formulation even when based on the most rigorous research methods and valid interpretations. That is not to say intention and strategic positioning is not important, but simply that we need to consider more than idealized outcomes.

Abhi Rele, in “APIs and Data: Journey to the Center of the Customer Experience”, put it in more concrete terms:

If you think you’re in full control of your customers’ experience, you’re wrong.

Customers increasingly have taken charge—they know what they want, when they want it, and how they want it. They are using their mobile phones more often for an ever-growing list of tasks—be it searching for information, looking up directions, or buying products. According to Google, 34% of consumers turn to the device that’s closest to them. More often than not, they’re switching from one channel or device mid-transaction; Google found that 67% of consumers do just that. They might start their product research on the web, but complete the purchase on a smartphone.

Switch device in mid-transaction? No one would ever do that! Oops.

We could, of course, decide to block those paths that we don’t consider “reasonable” (as opposed to stopping actual error conditions). The problem with that approach is that our definition of “reasonable” may conflict with the customer’s definition. When “conflict” and “customer” are in the same sentence, there’s generally a problem.
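As a hedged illustration (hypothetical names and flow, not anyone’s actual code), the difference between stopping real error conditions and blocking merely “unreasonable” paths might look something like this:

```python
# Hypothetical sketch only: a checkout handler that rejects genuine error
# conditions but leaves unusual-yet-valid paths (like finishing the purchase
# on a different device) open. Cart and resume_checkout are invented names.
from dataclasses import dataclass


@dataclass
class Cart:
    customer_id: str
    items: list
    device_id: str  # device that started the session


def resume_checkout(cart: Cart, current_device: str) -> None:
    # An actual error condition: block it.
    if not cart.items:
        raise ValueError("cannot check out an empty cart")

    # Merely "unreasonable" to us: the customer switched devices mid-transaction.
    # Blocking it would enforce our assumption rather than prevent their error,
    # so we allow it and simply note the change.
    if current_device != cart.device_id:
        print(f"resuming cart for {cart.customer_id} on a new device")

    # ... proceed with payment ...
```

The point isn’t the particulars; it’s that every guard clause encodes someone’s definition of “reasonable”, and each one deserves the same validation we’d demand of any other assumption.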

These conflicts, in the right domain, can even have deadly results. While investigating the Asiana Airlines crash of July 2013, the National Transportation Safety Board (NTSB) found that the crew’s beliefs about what the autopilot system would do did not coincide with what it actually did:

The NTSB found that the pilots had “misconceptions” about the plane’s autopilot systems, specifically what the autothrottle would do in the event that the plane’s airspeed got too low.

In the setting that the autopilot was in at the time of the accident, the autothrottles that are used to maintain specific airspeeds, much like cruise control in a car, were not programmed to wake up and intervene by adding power if the plane got too slow. The pilots believed otherwise, in part because in other autopilot modes on the Boeing 777, the autothrottles would in fact do this.

“NTSB Blames Pilots in July 2013 Asiana Airlines Crash” on Mashable.com

Even if it doesn’t contribute to a tragedy, a poor user experience (inconsistent, unstable, or overly restrictive) can lead to unintended consequences, customer dissatisfaction, or both. Basing that user experience on assumptions instead of research and/or testing increases the risk. As I’ve stated previously, risky assumptions are an assumption of risk.

Surfing the Plan

Hang loose

In a previous post, I used the Eisenhower quote “…plans are useless but planning is indispensable”. The Agile Manifesto expresses a preference for “Responding to change over following a plan”. A tweet I saw recently illustrates both of those points and touches on why so many seem to have problems with estimates:

Programming IRL:
“ETA for an apple pie?”
“2h”
8h later:
“Where is it?”
“You didn’t tell me the dishes were dirty and you lacked an oven.”

At first glance, it’s the age-old story of being given inadequate requirements and then being held to an estimate long after it’s proven unreasonable. However, it should also be clear that the estimate was given without adequate initial planning or a “plan B”, and that when the issues were discovered, there was no communication of the need to revise the estimate by an additional 300%.

Before the torches and pitchforks come out, I’m not assigning blame. There are no villains in the scenario, just two victims. While I’ve seen my share of dysfunctional situations where the mutual distrust between IT and the business was the result of bad actors, I’ve also seen plenty that were the result of good people trapped inside bad processes. If the situation can be salvaged, communication and collaboration are going to be critical to doing so.

People deal with uncertainty every day. Construction projects face delays due to weather. Watch any home improvement show and chances are you’ll see a renovation project that has to change scope or cost due to an unforeseen situation. Even surgeons find themselves changing course due to circumstances they weren’t aware of until the patient was on the table. What the parties need to be aware of is that the critical matter is not whether or not an issue appears, but how it’s handled.

The first aspect of handling issues is not to stick to a plan that is past its “sell by” date. A plan is only valid within its context and when the context changes, sticking to the plan is delusional. If your GPS tells you to go straight and your eyes tell you the bridge is out, which should you believe?

Sometimes the expiration of a plan is strategic: the goal itself is no longer feasible, and continuing will only waste time, money, and effort. Other times, the goal remains, but the original tactical approach is no longer valid. There are multiple methods appropriate to tactical decision-making. Two prominent ones are Deming’s Plan-Do-Check-Act and Boyd’s Observe-Orient-Decide-Act. Each has its place, but both have a looping nature in common. Static plans work for neither business leaders nor fighter pilots.
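To make that looping nature concrete, here’s a minimal sketch (illustrative only, and deliberately agnostic between Deming’s and Boyd’s formulations) of planning as a cycle that folds new information back in, rather than a one-shot plan:

```python
# Illustrative sketch: plan, do, check, then act on what was learned -- and loop.
# goal, plan, execute, and assess are stand-ins for real project activities.
def replan_loop(goal, plan, execute, assess, max_cycles=10):
    result = None
    for _ in range(max_cycles):
        current_plan = plan(goal, result)   # Plan (or re-plan with what was learned)
        result = execute(current_plan)      # Do
        if assess(goal, result):            # Check: close enough to the goal?
            break                           # the goal is met; stop looping
        # Act: otherwise go around again, revising the plan with new information
    return result
```

The static-plan failure mode is the degenerate case where plan() ignores its second argument: no matter what execution reveals, the next cycle proceeds as if nothing had been learned.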

The second aspect of handling issues is communication. It can be easy for IT to lose sight of the fact that the plan they’re executing is a facet of the overarching plan that their customer is executing. Whether in-house IT or contractor, the relationship with the business is a symbiotic one. In my experience, success follows those who recognize that, and breakdowns occur when it is ignored. Constant communication and involvement with that customer avoid the trust-killing green-green-green-RED!!! project management theater.

In his post “Setting Expectations”, George Dinwiddie nailed the whole issue with plans and estimates:

What if we were able to set expectations beyond a simple number? What if we could say what we know and what we don’t know? What if we could give our best estimate now, and give a better one next week when we know more? Would that help?

The thing is, these questions are not about the estimates. These questions are about the relationship between the person estimating and the person using the estimate. How can we improve that relationship?

Risky Assumptions and the Assumption of Risk

Long ago, in a land far away, a newly minted developer received a lesson in the danger of untested assumptions. There was an export job that extracted data about county jail inmates and transmitted it to the state each month for reimbursement. Having developed this job, our hero was called upon to diagnose and correct an error reported by the state’s IT staff: the county had claimed reimbursement for an inmate held in a nearby city. This was “proven” by the fact that the Social Security Number submitted for the county’s inmate was identical to that submitted for the city’s inmate.

After much investigation, the plucky newbie determined that the identity of the county inmate was correct (fingerprints, and all that). The person the city had submitted was actually a former roommate of the person in question; he had been admitted to the city jail under the borrowed identity of his old pal. It was truly shocking to realize that someone with a lengthy criminal record would stoop to using another’s identity (although he did deserve kudos for being a pioneer – this was the mid-’90s, when identity theft wasn’t yet in vogue).

What was even more shocking was the fix to be used: the county should re-submit the record with a “999-99-9999” value and the state would generate a fake SSN to be used for the remainder of the inmate’s incarceration. Since the city’s submission was first in, it would have to be considered “correct”.

Really???

The truly wonderful thing about that story is that it illustrates so many potential issues that can result from an assumption being allowed to roam unchallenged. Needless error conditions were created that delayed the business process of getting reimbursed. The validity of data in the system was compromised: across multiple incarcerations the same inmate could have multiple identities and multiple inmates could share the same identity over the life of the system.

Just as you can have technical debt, you can have cognitive debt by failing to adequately think through your design. One bad assumption can snowball into others (e.g., the “first in equals correct” rule could only be a poor reaction to the belated realization that the identity logic was unable to guarantee uniqueness). Just as collaboration can help avoid design issues, so too can adequate analysis of the problem space. Making a quick-and-dirty assumption and running with it leaves you at risk of wasting a lot of time and money.
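As a hedged sketch (invented names and data, not the actual system), here’s roughly how the untested “SSN is unique and truthful” assumption snowballs once it’s baked into the data model:

```python
# Hypothetical illustration: keying inmate records on SSN assumes every submitted
# SSN is unique and honest. The "first in equals correct" rule then compounds the
# problem by rejecting the legitimate record instead of the fraudulent one.
inmates_by_ssn = {}


def register_booking(ssn: str, name: str, facility: str) -> None:
    if ssn in inmates_by_ssn:
        existing = inmates_by_ssn[ssn]["name"]
        # The snowballed assumption: whoever arrived first is deemed "correct",
        # so the later (genuine) booking is the one pushed onto a fake SSN.
        print(f"conflict: {ssn} already belongs to {existing}; "
              f"resubmit {name} as 999-99-9999")
        return
    inmates_by_ssn[ssn] = {"name": name, "facility": facility}


# The city books the fraudster under a borrowed identity first...
register_booking("123-45-6789", "Former roommate (borrowed identity)", "city jail")
# ...so the county's genuine record is the one that gets rejected.
register_booking("123-45-6789", "Rightful owner of the SSN", "county jail")
```

A natural key that can be forged is no key at all; had that assumption been challenged up front, the system could have been designed around an identifier it actually controlled.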

[Photograph by Alex E. Proimos via Wikimedia Commons.]