Abuse Cases – What Could Go Wrong?

Trainwreck

Last week, in a post titled “The Flaw in All Things”, John Vincent discussed the problem of seeing “the flaw in all things”:

It’s overwhelming. It’s paralyzing.

I can’t finish a project because I keep finding things that could cause problems. I even mentioned this to our CTO and CEO at one point when we were trying to size some private deploys of our stack.

I couldn’t see anything but the largest configuration because all I could see was places where there was a risk. There were corners I wasn’t willing to cut (not bad corners like risking availability but more like “use a smaller instance here”) because I could see and feel and taste the pain that would come from having to grow the environment under duress.

I’m frustrated with putting everything in Docker containers because all I see is having to take down EVERYTHING running on one node because there’s may be a critical Docker upgrade. I see Elasticsearch rebalancing because of it. I see Kafka elections. Mind you the system is designed for this to happen but why add something that makes it a regular occurance?

I can certainly sympathize. For what it’s worth, it sounds like those making the trade-offs he’s worried about could stand to be a bit more inclusive. That doesn’t necessarily mean the decisions would change, but at least being heard and knowing the answers to his questions might reduce some of the stress (not to mention perhaps helping out those responsible for the decisions).

When making design decisions, having (or, at least, having access to) this level of knowledge and experience has a great deal of value. As I noted in “NPM, Tay, and the Need for Design”, you have to consider both the use cases and the abuse cases for a given system (whether software or social).

It’s not possible to foresee every potential flaw and it probably won’t be feasible to eliminate every risk that’s discovered. That being said, it doesn’t mean time spent in risk evaluation is wasted. Dealing with foreseeable issues before they become problems (where “dealing” is defined as either mitigating it outright or at least planning for a response should it occur) will work better than figuring it out on the fly when the problem occurs.

Understanding why “Should I?” is a more important question than “Can I?” is something I’ve touched on before. Snapchat is finding this out by way of a lawsuit over their filter that allows users to record their speed. Who would have thought someone might cause an accident using that?

Trial and error/experimenting is one method of learning, but it’s not the only method and is frequently not the best method. Fear of failure can hold back learning, but a cavalier attitude toward risk can make experiments just as, if not more costly. It’s the difference between testing a 9 volt battery by touching it to your tongue and using the same technique on a 240 volt circuit.

The Ignorance of Management – Deep and Wide

Iceberg

While on LinkedIn a couple of weeks ago, an interesting graphic caught my eye. Titled “The Iceberg of Ignorance”, it referred to a 1989 study in which:

…consultant Sidney Yoshida concluded: “Only 4% of an organization’s front line problems are known by top management, 9% are known by middle management, 74% by supervisors and 100% by employees…”

The metaphor of the iceberg is simple to understand. The implications of these numbers, however, require some unpacking so as to understand the full nature of the problem. Once the problem is better defined, effective solutions can then be proposed.

A naive reading of this would be that the line-level employees know everything and that the higher you travel up the hierarchy, the more out of touch you get. That reading, however, fails on two counts. Firstly, it ignores the qualification of “front line”, which will not apply to all problems faced by the enterprise. Secondly, it fails to account for the fact that while 100% of front line problems will be known by front line employees, that’s not the same as saying that each front line employee will know 100% of front line problems.

It’s a question of cognitive capacity, both depth and width. As organizations grow, the idea that any one person could be aware of each and every detail of the operation becomes laughable, even assuming perfect communication (which is an extreme assumption). This difficulty is compounded by cultural conditioning to expect those in charge to know what’s going on:

Unfortunately, I suspect the vast majority of leaders and managers believe they should have all the answers — even though they couldn’t possibly know everything that’s going on at all levels and in all departments within their organization. And even though the world is changing so quickly that what we know right this second … may not be true and accurate anymore … in this second.

But because we’ve been entrained to have all the right answers, all the time, many of us put on a brave face and pretend we know — particularly when our boss asks us a question, or when a direct-report does. After all, we want to look good. We want to seem “on top of things.”

Pretending to have all the answers is stressful. It’s lonely. It’s draining.

And what if, when we are pretending to know, we give an answer that we later discover is wrong? Yikes! Now what?

In this situation, many people feel forced to “stick to their guns,” even in the face of conflicting evidence. So they wind up suffering from stress, anxiety and fear that they’ll be found out.
They may even hide the “correct answer” to save face, which certainly doesn’t do their conscience — or their company — any good.

Can you see how this need to have all the answers, all the time, can contribute to a culture of assumptions, half-truths and even outright lies?

In this sort of environment, do you think people are connecting deeply and sharing freely? Of course not. They’re competing with one another and hoarding information, because they believe the person with the right answer wins!

This culture of denial, delusion, and deception is how organizations arrive at the extreme situation I discussed in my last post, “Volkswagen and the Cost of Culture”. Casual dishonesty, towards others and yourself, leads to habitual dishonesty. Corruption breeds corruption. Often, it’s not even coldly calculated evil designed to profit, but ad hoc “going along to get along” to avoid consequences – impostor syndrome on an epic scale.

Command and control (the term of art from military science, not the pejorative description for micromanagement) is a subject with a long history. It’s frequently abbreviated as C3, since communication is an integral component of the discipline. A pair of techniques with a pedigree going back to the 18th century have a track record of success.

In his post “Auftragstaktik and fingerspitzengefühl”, Tom Graves describes these techniques:

The terms originate from the German military, from around the early-19thC and mid-20thC respectively. They would translate approximately as:

The crucial point here is to understand that they work as a dynamic pair, to provide a self-updating bridge between strategy, tactics and operations, or, more generally, between plan and action.

In the post, Tom describes these techniques through the example of the air defense system used by Britain in WWII. In this system, information flowed into the command centers from observers, and radar installations. The information was combined and contextualized, then sent back out to airfields and anti-aircraft batteries where it was used to repel air attacks. Tom noted that this is often depicted as a linear flow:

Yet describing the Dowding System in this way kinda misses the point – not least that there’s alot of information coming back from each of the front-line units at the end of that supposed one-way flow. Instead, the key here is that auftragstaktik and fingerspitzengefühl provide afeedback-loop that – unlike classic top-down command-and-control – is able to respond to fast-paced change right down to local level.

The linear-flow description also misses the point that it depends on more than information alone – there are key human elements without which the auftragstaktik / fingerspitzengefühl loop risk fading away into nothingness. For example, auftragstaktik is deeply dependent on trust, which in turn depends on a sense of personal connection and personal, mutual commitment, whilst fingerspitzengefühl depends on a more emotive form of sensing, of feeling, of an often-literal sense of ‘being in touch‘ with what’s going on out there in the real-world

The Dowding system worked by combining effective communication and realistic command & control methods. Bi-directional communication improved situational awareness both up and down the hierarchy, providing detail to the upper layers and context to the lower ones. Realistic command and control can be summarized like this: since no one can possibly have full breadth and depth of knowledge about the situation, provide direction appropriate to your level (in accordance with the directions you received) and delegate to your subordinates the decisions appropriate to their level. In theory, as well as largely in practice, this resulted in decisions being made by those with the best knowledge to do so (again, guided by general direction from above).

A system that works with reality, rather than against it? What a concept.

Volkswagen and the Cost of Culture

Hand holding a wad of cash

Thanks to Volkswagen, we now have an idea of the cost of failing to maintain an ethical culture, roughly $18 billion US (emphasis added in the quoted text below by me):

Volkswagen’s financial disclosure on Friday, in a preliminary earnings report, came a day after the company agreed on the outlines of a plan to settle some legal claims in the United States, which would include giving owners of about 500,000 affected vehicles the option of selling the cars back to the company or having them repaired.

Volkswagen is still negotiating the size of the fines it will pay to the United States government for violations of clean-air laws, as well as how much additional compensation it will provide to owners. The money set aside by the company on Friday provides an indication of what Volkswagen expects the total global costs of the scandal to be, although the figure could rise further.

Since the scandal broke in September, 2015, the news has steadily worsened. Last December, Volkswagen’s chairman admitted that the cheating found was not an isolated lapse:

…the decision by employees to cheat on emissions tests was made more than a decade ago, after they realized they could not meet United States clean air standards legally.

Hans-Dieter Pötsch, the chairman of Volkswagen’s supervisory board, said the cheating took place in a climate of lax ethical standards.

“There was a tolerance for breaking the rules,” Mr. Pötsch said here on Thursday during his first lengthy news conference since the company admitted in September that 11 million cars with diesel engines were rigged to fool emissions tests.

Volkswagen’s executive leadership explanation at the time:

Mr. Müller and Mr. Pötsch conceded that the deception reflected organizational shortcomings.

For example, the people who developed the software were the same ones who approved it for use in vehicles. At other companies, it is standard practice for one team to develop components and another to check them for quality. Volkswagen said it would correct those procedures.

Mr. Müller also said he wanted to change the company’s culture so that there was better communication among employees and more willingness to discuss problems. His predecessor, Martin Winterkorn, who resigned after the scandal, was criticized for creating a climate of fear that made managers afraid to admit mistakes.

“We don’t need yes men,” Mr. Müller said, “but managers and engineers who make good arguments.”

I would argue that what’s needed more than “good arguments” is a corporate culture where it’s understood that refusing to break the law is not only allowed, but expected. Given that the size of the loss reserve has more than doubled since then, perhaps they’ve realized that now as well.

What is not needed, however, is the traditional response to high-profile issues, layering on additional ad hoc rules and regulations with an eye toward making sure this “never again happens”. For one thing, there’s no indication that anyone was not aware of the fact that this behavior was wrong. Additional compliance theater is unlikely to improve anything in that respect, and may actually cause new problems in addition to exacerbating the root problem, VW’s culture.

A recent study (reported on in The Atlantic) by Simon Gächter and Jonathan Schulz, University of Nottingham, reports that corrupt cultures breeds corruption. In this study, they:

…asked volunteers from 23 countries to play the same simple game. The duo found that participants were more likely to bend the game’s rules for personal gain if they lived in more corrupt societies. “Corruption and fraud are things going on in the social environment all the time, and it’s plausible that it shapes people’s psychology, what they can get away with,” says Gächter. “It’s okay! Everybody does it around here.”

This study also has implications for Volkswagen’s ability to fix the problem:

Causality could eventually flow in the other direction. “If people are dishonest or think it’s okay to violate rules, it would also be harder to fight corruption and install institutions that work,” says Gächter. “In the long run, these things move together. But to show that, you’d need a 20 year project measuring this on an annual basis.”

Volkswagen, however, probably does not have twenty years to fix their problem. In the US alone, VW will have to fix or buy back (at the owner’s option) over 500,000 vehicles. I suspect VW’s reputation is severely impaired with those that opt to have their vehicles repaired and I would be willing to bet that the majority of those who sell their cars back won’t be returning as customers. This loss for Volkwagen is only the beginning of their financial problems, and it could all have been avoided.

Back in November, Matt Balantine floated an interesting (and very plausible) theory:

That may well turn out to be the case, but I also have to agree with what Grady Booch tweeted when the scandal first broke:

There’s plenty of blame to go around, but ultimately I believe only a systemic fix, top to bottom, will have any chance of correcting the problem (not that I’d be willing to give VW very good odds on remaining in business long enough for that to take effect). Their value going forward may be to serve as empiric confirmation of Gächter and Schulz’s work. Their bad example may serve as a wake up call for others to pay attention to the culture they’ve fostered (and are fostering) before their employees, innocent and guilty alike, pay the price.

Abstract Dangers – When ‘And’ Meets ‘Or’

There’s an old saying that if you put one foot in a bucket of ice and the other in a bucket of boiling water, on average you’re comfortable. Sometimes analyzing information in the aggregate obscures rather than enlightens.

A statistician named Francis Anscombe pointed out this same principle in a more visual (though less colorful) manner more than forty years ago:

It’s an idea that I’ve been meaning to write about for a while, but was brought back to mind last week while reading an article on the Austrian school of economic theory posted on a site about medical practice and health care in the U.S. (diversity of interests and a very broad reading list is something I find useful, but that’s a topic for another day). The relevant passage:

When Ludwig von Mises began to establish a systematic theory of economics, he insisted on what he called the principle of methodological dualism: the scientific methods of the hard sciences are great to study rocks, stars, atoms, and molecules, but they should not be applied to the study of human beings. In stating this principle, he was voicing opposition to the introduction into economics of concepts such as “market equilibrium,” which were largely inspired by the physical sciences, and were perhaps motivated by a desire on the part of some economists to establish their field as a science on par with physics.

Mises remarked that human beings distinguish themselves from other natural things by making intentional (and usually rational) choices when they act, which is not the case for stones falling to the ground or animals acting on instinct. The sciences of human affairs therefore deserve their own methods and should not be tempted to apply the tools of the physical sciences willy-nilly. In that respect, Mises agreed with Aristotle’s famous dictum that ” It is the mark of an educated man to look for precision in each class of things just so far as the nature of the subject admits.”

I find myself agreeing and disagreeing with this at the same time. Human behavior is far from being as predictable as gravity and I agree with this for exactly the reason I disagree (at least in part) with the second paragraph. It is a mistake to characterize human action as intentional and rational. That’s not to say that all our choices are irrational and reactionary, but that there is a blend. Not only will different people respond in different ways to the same circumstance for different reasons, but the same person may react differently with different motivations on another occasion. Human nature isn’t rigidly deterministic and we consider it so at our peril.

Tom Graves post “Control, complex, chaotic” makes the same observation:

Attempting to ‘control’ complexity just doesn’t work: we need to treat the complex as complex, not as a ‘controllable problem for which we don’t quite know all the rules (but will know them all Real Soon Now, honest…)’.

Yet I’m also noticing another deeper problem: misguided attempts to apply complexity-theory to things that are neither rule-bound control nor pattern-based complexity, but are inherently ‘chaotic’ – a ‘market-of-one‘. Although we can identify definite patterns in health and health-care – that’s the whole basis of epidemiology, for example – neither rules nor statistics can help us deal with the blunt fact the everyone is different. The kind of patterns that we’d use in a complexity-model – probabilities, Bell-curve distributions, outliers, all that kind of thing – can all too easily mask the real underlying fact of uniqueness, from which that supposed ‘pattern’ will actually arise: somewhat like the barely-visible deep-randomness that underlies the visible patterns of Brownian-motion.

Trying to force something into a mold which it doesn’t fit is unlikely to work well.

Abstraction can be useful in understanding the contexts that influence the architecture of the problem. Designing an effective solution, however, will involve not just integrating the concerns of those contexts, but also dealing with any emergent challenges. The variability of human nature (in other words, sometimes the members of those contexts will not all think and act alike) can be one such emergent challenge.

Tom Grave’s again, this time from his “On mass-uniqueness”:

In practice, the scope of every system will comprise a mix of sameness and uniqueness – of predictable and unpredictable, certain and uncertain. If we design only on an assumption of sameness – as IT-systems often are – we set ourselves up for guaranteed failure. The same applies if – as is all too common – we say that our IT-system will handle all of the ‘sameness’ part of the context, and that the ‘not-sameness’ will Somebody Else’s Problem – without giving any means for that supposed ‘somebody else’ to be able to address the rest of the problem, or to link it up with the parts of the context that our system does handle.

The first requirement to make something that works in the real-world is to design for uniqueness, not against it.

In other words, a solution based on a poorly understood problem is unlikely to be a good fit. Abstraction is one tool to understand the problem, but doesn’t provide the whole picture. Shades of gray (black and white) is more likely than black or white.

Nest and Revolv – Smart Devices, Not so Smart Moves

I’ve made another guest appearance on Architecture Corner. In episode 39, “New and Obsolete”, Greger Wikstrand, Casimir Artmann and I discuss product lifetimes and the Internet of Things.

How could Nest have better handled the end of life of the Revolv device?

Innovation – What’s Old can be New Again

Roman Road Ruins

There’s an old rhyme about what a bride should wear for luck on her wedding day: “Something old, something new, something borrowed, something blue…”. While reading an article on the origins of the US highway system, I thought about this rhyme in relation to the concept of innovation. Part of that article related the US Army’s interest in highways as a means of moving troops, etc.:

When the war ended in 1919, the Army sent an expedition from Camp Meade, Maryland, to San Francisco to study the feasibility of moving troops and equipment by truck, and it was a disaster. At an average speed of seven miles per hour, the trip took two months, and many of the vehicles broke down along the way. The rough journey convinced the officer in charge—a young Dwight D. Eisenhower—that impassable roads were a risk to national security and that building good roads should be the highest priority for the nation.

During World War II, Eisenhower’s views were confirmed by the Allies’ use of the autobahns in the invasion of Germany (ironically, the German military had been much less enthusiastic about their military potential, preferring the older, more established railroad system). Eisenhower would go on to spearhead the construction of the Interstate Highway System when elected President after the war.

While the construction and use of high-quality roads for military purposes was an innovation, it certainly wasn’t a new development. The Romans had “been there, done that” long before. It became innovative again due to the fact that the social and technological context had shifted, making it more effective than the previous transport innovation, railroads.

There’s an old saying that “history does not repeat itself, but it rhymes”. Recognizing when something old has regained its relevance can be a path to innovation. In my post “Innovation on Tap”, I talked about Amazon’s plan to open physical bookstores and embrace the concept of “showrooming”, something that traditional retailers have been struggling with for years. Other companies are jumping aboard. What I find interesting is that this concept was the business model of a company headquartered in my home town. They operated from 1957 to 1997, when they went bankrupt. Now, less than twenty years later, that model is seen as innovative. Once again, it’s the shift in social and technological contexts that has changed it into an effective solution. It’s the value that makes it innovative.

This is part 17 of an ongoing conversation with Greger Wikstrand on the topic of innovation.

[Roman Road image by Phr61 via Wikimedia Commons.]

Form Follows Function on SPaMCast 389

SPaMCAST logo

This week’s episode of Tom Cagley’s Software Process and Measurement (SPaMCast) podcast, number 389, features Tom’s essay on Agile acceptance testing, Kim Pries talking about soft skills, and a Form Follows Function installment on sense-making and decision-making in the practice of software architecture.

Tom and I discuss my post “OODA vs PDCA – What’s the Difference?”. We talk about what differentiates John Boyd’s Observe-Orient-Decide-Act loop from the Plan-Do-Check-Act cycle made famous by Deming.

You can find all my SPaMCast episodes using under the SPAMCast Appearances category on this blog. Enjoy!