Emergence: Babies and Bathwater, Plans and Planning

blueprints

 

“Emergent” is a word that I run into from time to time. When I do run into it, I’m reminded of an exchange from the movie Gallipoli:

Archy Hamilton: I’ll see you when I see you.
Frank Dunne: Yeah. Not if I see you first.

The reason for my ambivalent relationship with the word is that it’s frequently used in a sense that doesn’t actually fit its definition. Dictionary.com defines it like this:

adjective

1. coming into view or notice; issuing.
2. emerging; rising from a liquid or other surrounding medium.
3. coming into existence, especially with political independence: the emergent nations of Africa.
4. arising casually or unexpectedly.
5. calling for immediate action; urgent.
6. Evolution. displaying emergence.

noun

7. Ecology. an aquatic plant having its stem, leaves, etc., extending above the surface of the water.

Most of the adjective definitions apply to planning and design (which I consider to be a specialized form of planning). Number 3 is somewhat tenuous for that sense and and 5 only applies sometimes, but 6 is dead on.

My problem, however, starts when it’s used as a euphemism for a directionless. The idea that a cohesive, coherent result will “emerge” from responding tactically (whether in software development or in managing a business) is, in my opinion, a dangerous one. I’ve never heard an explanation of how strategic success emerges from uncoordinated tactical excellence that doesn’t eventually come down to faith. It’s why I started tagging posts on the subject “Intentional vs Accidental Architecture”. Success that arises from lack of coordination is accidental rather than by design (not to mention ironic when the lack of intentional coordination or planning/design is intentional itself):

If you don’t know where you are going, any road will get you there.

 

The problem, of course, is do you want to be at the “there” you wind up at? There’s also the issue of cost associated with a meandering path when a more direct route was available.

None of this, however, should be taken as a rejection of emergence. In fact, a dogmatic attachment to a plan in the face of emergent facts is as problematic as pursuing an accidental approach. Placing your faith in a plan that has been invalidated by circumstances is as blinkered an approach as refusing to plan at all. Neither extreme makes much sense.

We lack the ability to foresee everything that can occur, but that limit does not mean that we should ignore what we can foresee. A purely tactical focus can lead us down obvious blind alleys that will be more costly to back out of in the long run. Experience is an excellent teacher, but the tuition is expensive. In other words, learning from our mistakes is good, but learning from other’s mistakes is better.

Darwinian evolution can produce lead to some amazing things provided you can spare millions of years and lots of failed attempts. An intentional approach allows you to tip the scales in your favor.


Many thanks to Andrew Campbell and Adrian Campbell for the multi-day twitter conversation that spawned this post. Normally, I unplug from almost all social media on the weekends, but I enjoyed the discussion so much I bent the rules. Cheers gentlemen!

Organizations as Systems and Innovation

Portrait of Gustavus Adolphus of Sweden

Over the last year or so, the concept of looking at organizations as systems has been a major theme for me. Enterprises, organizations and their ecosystems (context) are social systems composed of a fractal set of social and software systems. As such, enterprises have an architecture.

Another long-term theme for this site has been my conversation with Greger Wikstrand regarding innovation. This post is the thirty-fifth entry in that series.

So where do these two intersect? And why is there a picture of a Swedish king from four-hundred years ago up there?

Innovation, by its very nature (“…significant positive change”), does not happen in a vacuum. Greger’s last post, “Innovation arenas and outsourcing”, illustrates one aspect of this. Shepherding ideas into innovations is a deliberate activity requiring structural support. Being intentional doesn’t turn bad ideas into innovations, but lack of a system can cause an otherwise good idea to wither on the vine.

Another intersection, the one I’m focusing on here, can be found in the nature of innovation itself. It’s common to think of technological innovation, but innovation can also be found in changes to organizational structure and processes (e.g. Henry Ford and the assembly line). Organization, process, and technology are not only areas for innovation, but when coupled with people, form the primary elements of an enterprise architecture. It should be clear that the more these elements are intentionally coordinated towards a specific goal, the more cohesive the effort should be.

This brings us to Gustavus Adolphus of Sweden. In his twenty years on the throne, he converted Sweden into a major power in Europe. Militarily, he upended the European status quo in a very short time (after intervening in the Thirty Years’ War in 1630, he was killed in battle in 1632) by marshaling organizational, procedural, technological innovations:

The Swedish army stood apart from its’ contemporaries through five characteristics. Its’ soldiers wore uniform and had a nucleus of native Swedes, raised from a surprisingly diplomatic system of conscription, at its’ core. The Swedish regiments were small in comparison to their opponents and were lightly equipped for speed. Each regiment had its’ own light and mobile field artillery guns called ‘leathern guns’ that were easy to handle and could be easily manoeuvred to meet sudden changes on the battlefield. The muskets carried by these soldiers were of a type superior to that in general use and allowed for much faster rates of fire. Swedish cavalry, instead of galloping up to the enemy, discharging their pistols and then turning around and galloping back to reload, ruthlessly charged with close quarter weapons once their initial shot had been expended. By analysing this paradigm it becomes apparent that the army under Gustavus emphasized speed and manoeuvrability above all – this greatly set him apart from his opponents.

By themselves, none of the innovations were original to Gustavus. Combining them together, however, was and European military practice was irrevocably changed. Inflection points can be dependent on multiple technologies catching up with one another (since the future is “…not very evenly distributed”), but in this case the pieces were all in place. The catalyst was someone with the vision to combine them, not random chance.

Emergence will be a factor in any complex system. That being said, the inevitability of those emergent events does not invalidate intentional design and planning. If anything, design and planning is more necessary to deal with the mundane, foreseeable things in order to leave more cognitive capacity to deal with that which can’t be foreseen.

Back to the OODA – Making Design Decisions

OODA Loop Diagram

A few weeks back, in my post “Enterprise Architecture and the Business of IT”, I mentioned that I was finding myself drawn more and more toward Enterprise Architecture (EA) as a discipline, given its impact on my work as a software architect. Rather than a top-down approach, seeking to design an enterprise as a whole, I find myself backing into it from the bottom-up, seeking to ensure that the systems I design fit the context in which they will live. This involves understanding not only the technology, but also how it interacts, or not, with multiple social systems in order to (ideally) carry out the purpose of the enterprise.

Tom Graves is currently in the middle of a series on whole-enterprise architecture on his Tetradian blog. The third post of the series, “Towards a whole-enterprise architecture standard – 3: Method”, focuses on the need for a flexible design method:

But as we move towards whole-enterprise architecture, we need architecture-methods that can self-adapt for any scope, any scale, any level, any domains, any forms of implementation – and that’s a whole new ball-game…

He further states:

the methods we need for the kind of architecture-work we’re facing now and, even more, into the future, will need not only to work well with any scope, any scale and so on, but must have deep support for non-linearity built right into the core – yet in a way that fully supports formal rigour and discipline as well.

To begin answering the question “But where do we start?”, Tom looks at the Plan-Do-Check-Act (PDCA) cycle, which he finds wanting because:

… the reality is that PDCA doesn’t go anything like far enough for what we need. In particular, it doesn’t really touch all that much on the human aspects of change

This is a weakness that I mentioned in “OODA vs PDCA – What’s the Difference?”. PDCA, by starting with “Plan” and without any reference to context, is flawed in my opinion. Even if one argues that assessing the context is implied, leaving it to an implication fails to give it the prominence it deserves. In his post, Tom refers to ‘the squiggle’, a visualization of the design process:

Damien Newman's squiggle diagram

In an environment of uncertainty (which pretty much includes anything humans are even peripherally involved with), exploration of the context space will be required to understand the gross architecture of the problem. In reconciling the multiple contexts involved, challenges will emerge and will need to be integrated into the architecture of the solution as well. This fractal and iterative exploratory process is well represented by ‘the squiggle’ and, in my opinion, well described by the OODA (Observe, Orient, Decide, and Act) loop.

In “Architecture and OODA Loops – Fast is not Enough”, I discussed how the OODA loops maps to this kind of messy, multi-level process of sense-making and decision-making. The “and” is extremely important in that while decision-making with little or no sense-making may be quick, it’s less likely to effective due to the disconnect from reality. On the other hand, filtering and prioritizing (parts of the Orient component of the loop) is also needed to prevent analysis paralysis.

In my opinion, its recognition and handling of the tension between informed decision-making and quick decision-making makes OODA an excellent candidate for a design meta-method. It is heavily subjective, relying on context to guide appropriate response. That being said, an objective method that’s divorced from context imposes a false sense of simplicity with no real benefit (and very real risks).

Reality is messy; our design methodology should work with, not against that.

[OODA Loop diagram by Patrick Edwin Moran via Wikimedia Commons]

Abuse Cases – What Could Go Wrong?

Trainwreck

Last week, in a post titled “The Flaw in All Things”, John Vincent discussed the problem of seeing “the flaw in all things”:

It’s overwhelming. It’s paralyzing.

I can’t finish a project because I keep finding things that could cause problems. I even mentioned this to our CTO and CEO at one point when we were trying to size some private deploys of our stack.

I couldn’t see anything but the largest configuration because all I could see was places where there was a risk. There were corners I wasn’t willing to cut (not bad corners like risking availability but more like “use a smaller instance here”) because I could see and feel and taste the pain that would come from having to grow the environment under duress.

I’m frustrated with putting everything in Docker containers because all I see is having to take down EVERYTHING running on one node because there’s may be a critical Docker upgrade. I see Elasticsearch rebalancing because of it. I see Kafka elections. Mind you the system is designed for this to happen but why add something that makes it a regular occurance?

I can certainly sympathize. For what it’s worth, it sounds like those making the trade-offs he’s worried about could stand to be a bit more inclusive. That doesn’t necessarily mean the decisions would change, but at least being heard and knowing the answers to his questions might reduce some of the stress (not to mention perhaps helping out those responsible for the decisions).

When making design decisions, having (or, at least, having access to) this level of knowledge and experience has a great deal of value. As I noted in “NPM, Tay, and the Need for Design”, you have to consider both the use cases and the abuse cases for a given system (whether software or social).

It’s not possible to foresee every potential flaw and it probably won’t be feasible to eliminate every risk that’s discovered. That being said, it doesn’t mean time spent in risk evaluation is wasted. Dealing with foreseeable issues before they become problems (where “dealing” is defined as either mitigating it outright or at least planning for a response should it occur) will work better than figuring it out on the fly when the problem occurs.

Understanding why “Should I?” is a more important question than “Can I?” is something I’ve touched on before. Snapchat is finding this out by way of a lawsuit over their filter that allows users to record their speed. Who would have thought someone might cause an accident using that?

Trial and error/experimenting is one method of learning, but it’s not the only method and is frequently not the best method. Fear of failure can hold back learning, but a cavalier attitude toward risk can make experiments just as, if not more costly. It’s the difference between testing a 9 volt battery by touching it to your tongue and using the same technique on a 240 volt circuit.

Abstract Dangers – When ‘And’ Meets ‘Or’

There’s an old saying that if you put one foot in a bucket of ice and the other in a bucket of boiling water, on average you’re comfortable. Sometimes analyzing information in the aggregate obscures rather than enlightens.

A statistician named Francis Anscombe pointed out this same principle in a more visual (though less colorful) manner more than forty years ago:

It’s an idea that I’ve been meaning to write about for a while, but was brought back to mind last week while reading an article on the Austrian school of economic theory posted on a site about medical practice and health care in the U.S. (diversity of interests and a very broad reading list is something I find useful, but that’s a topic for another day). The relevant passage:

When Ludwig von Mises began to establish a systematic theory of economics, he insisted on what he called the principle of methodological dualism: the scientific methods of the hard sciences are great to study rocks, stars, atoms, and molecules, but they should not be applied to the study of human beings. In stating this principle, he was voicing opposition to the introduction into economics of concepts such as “market equilibrium,” which were largely inspired by the physical sciences, and were perhaps motivated by a desire on the part of some economists to establish their field as a science on par with physics.

Mises remarked that human beings distinguish themselves from other natural things by making intentional (and usually rational) choices when they act, which is not the case for stones falling to the ground or animals acting on instinct. The sciences of human affairs therefore deserve their own methods and should not be tempted to apply the tools of the physical sciences willy-nilly. In that respect, Mises agreed with Aristotle’s famous dictum that ” It is the mark of an educated man to look for precision in each class of things just so far as the nature of the subject admits.”

I find myself agreeing and disagreeing with this at the same time. Human behavior is far from being as predictable as gravity and I agree with this for exactly the reason I disagree (at least in part) with the second paragraph. It is a mistake to characterize human action as intentional and rational. That’s not to say that all our choices are irrational and reactionary, but that there is a blend. Not only will different people respond in different ways to the same circumstance for different reasons, but the same person may react differently with different motivations on another occasion. Human nature isn’t rigidly deterministic and we consider it so at our peril.

Tom Graves post “Control, complex, chaotic” makes the same observation:

Attempting to ‘control’ complexity just doesn’t work: we need to treat the complex as complex, not as a ‘controllable problem for which we don’t quite know all the rules (but will know them all Real Soon Now, honest…)’.

Yet I’m also noticing another deeper problem: misguided attempts to apply complexity-theory to things that are neither rule-bound control nor pattern-based complexity, but are inherently ‘chaotic’ – a ‘market-of-one‘. Although we can identify definite patterns in health and health-care – that’s the whole basis of epidemiology, for example – neither rules nor statistics can help us deal with the blunt fact the everyone is different. The kind of patterns that we’d use in a complexity-model – probabilities, Bell-curve distributions, outliers, all that kind of thing – can all too easily mask the real underlying fact of uniqueness, from which that supposed ‘pattern’ will actually arise: somewhat like the barely-visible deep-randomness that underlies the visible patterns of Brownian-motion.

Trying to force something into a mold which it doesn’t fit is unlikely to work well.

Abstraction can be useful in understanding the contexts that influence the architecture of the problem. Designing an effective solution, however, will involve not just integrating the concerns of those contexts, but also dealing with any emergent challenges. The variability of human nature (in other words, sometimes the members of those contexts will not all think and act alike) can be one such emergent challenge.

Tom Grave’s again, this time from his “On mass-uniqueness”:

In practice, the scope of every system will comprise a mix of sameness and uniqueness – of predictable and unpredictable, certain and uncertain. If we design only on an assumption of sameness – as IT-systems often are – we set ourselves up for guaranteed failure. The same applies if – as is all too common – we say that our IT-system will handle all of the ‘sameness’ part of the context, and that the ‘not-sameness’ will Somebody Else’s Problem – without giving any means for that supposed ‘somebody else’ to be able to address the rest of the problem, or to link it up with the parts of the context that our system does handle.

The first requirement to make something that works in the real-world is to design for uniqueness, not against it.

In other words, a solution based on a poorly understood problem is unlikely to be a good fit. Abstraction is one tool to understand the problem, but doesn’t provide the whole picture. Shades of gray (black and white) is more likely than black or white.

Form Follows Function on SPaMCast 389

SPaMCAST logo

This week’s episode of Tom Cagley’s Software Process and Measurement (SPaMCast) podcast, number 389, features Tom’s essay on Agile acceptance testing, Kim Pries talking about soft skills, and a Form Follows Function installment on sense-making and decision-making in the practice of software architecture.

Tom and I discuss my post “OODA vs PDCA – What’s the Difference?”. We talk about what differentiates John Boyd’s Observe-Orient-Decide-Act loop from the Plan-Do-Check-Act cycle made famous by Deming.

You can find all my SPaMCast episodes using under the SPAMCast Appearances category on this blog. Enjoy!

NPM, Tay, and the Need for Design

Take a couple of seconds and watch the clip in the tweet below:

While it would be incredibly difficult to predict that exact outcome, it is also incredibly easy to foresee that it’s a possibility. As the saying goes, “forewarned is forearmed”.

Being forewarned and forearmed is an important part of what an architect does. An architect is supposed to focus on the architecturally significant aspects of a system. I like to use Ruth Malan‘s definition of architectural significance due to its flexibility:

Decisions (both those that were made and those that were left unmade) that end up taking systems offline and causing very public embarrassment are, in my opinion, architecturally significant.

Last week, two very public, very foreseeable failures took place: first was the chaos caused by a developer removing his modules from NPM, which was followed by Microsoft having to pull the plug on its Tay chatbot when it was “trained” to spew offensive comments in less than 24 hours. In my opinion, these both represented design failures resulting from a lack of due consideration of the context in which these systems would operate.

After all, can anyone really claim that no one would expect that people on the internet would try to “corrupt” a chatbot? According to Azeem Azhar as quoted in Business Insider, not really:

“Of course, Twitter users were going to tinker with Tay and push it to extremes. That’s what users do — any product manager knows that.

“This is an extension of the Boaty McBoatface saga, and runs all the way back to the Hank the Angry Drunken Dwarf write in during Time magazine’s Internet vote for Most Beautiful Person. There is nearly a two-decade history of these sort of things being pushed to the limit.”

The current claim, as reported in CIO.com, is that Tay was developed with filtering built-in, but there was a “critical oversight” for a specific kind of attack. According to the article, it’s believed that the attack vector involved asking Tay to “repeat after me”.

Or, as Matt Ballantine put it:

Likewise, who could imagine issues with a centralized repository of cascading dependencies? Failing to consider what would happen if someone suddenly pulled one of the bottom blocks out led to a huge inconvenience to anyone depending on that module or any downstream module. There’s plenty of blame to go around: the developer who took his toys and went home, those responsible for NPM’s design, and those who depended on it without understanding its weaknesses.

“The Iron Law of Tools” is “that which does for you will also do to you”. Understanding the trade-offs allows you to plan for risk mitigation in advance. Ignoring them merely ensures that they will have to be dealt with in crisis mode. This is something I covered in a previous post, “Dependency Management is Risk Management”.

Effective design involves not only the internals of a system but its externals as well. The conditions under which the system will be used, it’s context, is highly significant. That means considering not only the system’s use cases, but also its abuse cases. A post written almost a year ago by Brandon Harris, “Designing for Evil”, conveys this well:

When all is said and done, when you’ve set your ideas to paper, you have to sit down and ask yourself a very specific question:

How could this feature be exploited to harm someone?

Now, replace the word “could” with the word “will.”

How will this feature be exploited to harm someone?

You have to ask that question. You have to be unflinching about the answers, too.

Because if you don’t, someone else will.


When I began working on this post, the portion above was what I had in mind to say. In essence, I planned a longer-form version of what I’d tweeted about the Tay fiasco:

However, before I had finished writing the post, Greger Wikstrand posted “The fail fast fallacy”. Greger and I have been carrying on a conversation about innovation over the last few months. While I had initially intended to approach this as a general issue of architectural practice rather than innovation, the points he makes are just too apropos to leave out.

In the post, Greger points out that the focus seems to have shifted from learning to failure. Learning from experience can be the best way to test an idea. However, it’s not the only way:

Evolution and nature has shown us that there are two, equally valid, approaches to winning the gene game. The first approach is to get as much offspring as possible and “hope” many of them survive (r-selection). The second approach is to have few offspring but raise them and nurture them carefully (K-selection). Biologists tell us that the first strategy works best in a harsh, unpredictable environment where the effort of creating offspring is low. The second strategy works better in an environment where there is less change and offspring are more expensive to produce. Some of the factors that favour r-selection seems to be large uncompeted resources. K-selection is more favourable in resource scarce, low predator areas.

The phrase “…where the effort of creating offspring is low” is critical here. The higher the “cost” of the experiment, the more risk is involved in failure. This makes it advisable to tilt the playing field by supporting and nurturing the “offspring”.

In response to Greger’s post, Casimir Artmann posted two excellent articles that further elaborated on this. In “Fail Fast During Adventures”, he noted that “There is a fine line between fail fast and Darwin Awards in IRL.” His point, preparation beforehand and being willing to abort during an experiment before failure is equivalent to suffering a fatality can be effective learning strategies. Lessons that you don’t live to apply aren’t worth much.

Casimir followed with “Fail is not an Option”, in which he stated:

I want the project to succeed, but I plan for things going wrong so that the consequences wouldn’t be to huge. Some risk are manageable, as walking alone, but not alone and off-trail. That’s to risky. If you doing outdoor adventures, you are probably more prepared and skilled than a ordinarie project member, and thats a huge benefit.

I guess the best advice, when doing completely new things with IT, is to start really small so that the majority of your business is not impacted if there is a failure. When something goes wrong, be sure that you could go back to safe place. Point of no return is like being on a sailing boot in the middle of the Atlantic where you can’t go back.

That’s excellent advice. “Fail Fast” has the advantage of being able to fit on a bumper sticker, but the longer, more nuanced version is more likely to serve you well.