Form Follows Function on SPaMCast 450

SPaMCAST logo

It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

This week’s episode, number 450, features Tom’s excellent essay on roadmaps and a Form Follows Function installment based on my post “Holistic Architecture – Keeping the Gears Turning”.

Our conversation in this episode continues with the “organizations as systems” concept, this time from the standpoint of how the social system impacts (often negatively) the software systems it relies on. Specifically, we talk about how an organization that fails to manage itself as a system can end up with an architecture, of both the enterprise and its IT, that resembles “spare parts flying in formation”. It’s not a good situation, no matter how well made those spare parts are!

You can find all my SPaMCast episodes under the SPaMCast Appearances category on this blog. Enjoy!

Holistic Architecture – Keeping the Gears Turning

Gears Turning Animation

In last week’s post, “Trash or Treasure – What’s Your Legacy?”, I talked about how to define “legacy systems”. Essentially, the more the needs of social systems diverge from the fitness for purpose of the software systems that enable them, the more likely it is that those software systems can be considered “legacy”. The post attracted a few comments.

I love comments.

It’s nearly impossible to have writer’s block when you’ve got smart people commenting on your work and giving you more to think about. I got just that courtesy of theslowdiyer. The comment captured a critical point:

Agree that ALM is important, and actually also for a different reason – a financial one:

First of all, the cost of operating the system though the full Application Life Cycle (up to and including decommissioning) needs to be incorporated in the investment calculation. Some organisations will invariably get this wrong – by accident or by (poor) design (of processes).

But secondly (and this is where I have seen things go really wrong): If you invest a capability in the form of a new system then once that system is no longer viable to maintain, you probably still need the capability. Which means that if you are adding new capabilities to your system landscape, some form of accruals to sustain the capability ad infinitum will probably be required.

The most important thing is the capability, not the software system.

The capability is an organizational/enterprise concern. It belongs to the social systems that comprise the organization and the over-arching enterprise. This is not to say that software systems are not important – lack of automation or systems that have slipped into the legacy category can certainly impede the enterprise. However, without the enterprise, there is no purpose for the software system. Accordingly, we need to keep our focus centered on the key concern, the capability. So long as the capability is important to the enterprise, all the components, both social and technical, need to be working in harmony. In short, there’s a need for cohesion.

Last fall, Grady Booch tweeted:

Ruth Malan replied with a great illustration of it from her “Design Visualization: Smoke and Mirrors” slide deck:

Obviously, no one would want to fly on a plane in that state (which illustrates the enterprise IT architecture of too many organizations). The more important thing, however, is that even if the plane (the technical architecture of the enterprise) is perfectly cohesive, if the social system maintaining and operating it is similarly fractured, it’s still unsafe. If I thought that pilots, mechanics, and air traffic controllers were all operating at cross purposes (or at least without any thought of common cause), I’d become a fan of travel by train.

Unfortunately, for too many organizations, accidental architecture is the most charitable way to describe the enterprise. Both social and technical systems have been built up on an ad hoc basis and allowed to evolve without reference to any unifying plan. Technical systems tend to be built (and worse, maintained) according to a project-oriented mindset (aka “done and run”), leading to an expensive cycle of decay, then fix. The social systems can become self-perpetuating fiefs. The level of cohesion between the two, to the extent that it existed, breaks down even more.

A post from Matt Balantine, “Garbage In”, illustrates the cohesion issue across both social and technical systems. Describing an attempt to analyze spending data across a large organization composed of federated subsidiaries:

The theory was that if we could find the classifications that existed across each of the organisations, we could then map them, Rosetta Stone-like, to a standard schema. As we spoke to each of the organisations we started to realise that there may be a problem.

The classification systems that were in use weren’t being managed to achieve integrity of data, but instead to deliver short-term operational needs. In most cases the classification was a drop-down list in the Finance system. It hadn’t been modelled – it just evolved over time, with new codes being added as necessary (and old ones not being removed because of previous use). Moreover, the classifications weren’t consistent. In the same field information would be encapsulated in various ways.

Even in more homogeneous organizations, I would expect to find something similar. It’s extremely common for aspects of one capability to bear on others. What is the primary concern for one business unit may be one of many subsidiary concerns for another (see “Making and Taming Monoliths” for an illustrated example). Because of the disconnected way capabilities (and their supporting systems) are traditionally developed, however, there tends to be a lot of redundant data. This isn’t necessarily a bad thing (e.g. a cache is redundant data maintained for performance purposes). What is a bad thing is when the disconnects cause disagreements and no governance exists to mediate the disputes. Not having an authoritative source is arguably worse than having no data at all since you don’t know what to trust.
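As a rough illustration of that last point (a sketch, not anything taken from Matt’s post), the snippet below compares copies of the same value held by several systems and shows how a designated authoritative source settles a disagreement. The system names, the customer keys, and the choice of `crm` as the authority are all hypothetical.

```python
from collections import defaultdict

# Hypothetical data: the same customer attribute held by three systems.
records = {
    "crm": {"cust-1": "ACME Corp", "cust-2": "Globex"},
    "billing": {"cust-1": "Acme Corporation", "cust-2": "Globex"},
    "support": {"cust-1": "ACME Corp"},
}

# Assumption: governance has named one system the authoritative source.
AUTHORITATIVE = "crm"


def find_disagreements(records):
    """Return {key: {system: value}} for keys whose copies do not all match."""
    by_key = defaultdict(dict)
    for system, rows in records.items():
        for key, value in rows.items():
            by_key[key][system] = value
    return {
        key: copies
        for key, copies in by_key.items()
        if len(set(copies.values())) > 1
    }


def resolve(copies, authoritative=AUTHORITATIVE):
    """Pick the authoritative copy, or None when no trusted copy exists."""
    return copies.get(authoritative)


conflicts = find_disagreements(records)
for key, copies in conflicts.items():
    print(key, copies, "->", resolve(copies))
```

Redundant copies that agree (`cust-2` here) are harmless. A disagreement only becomes a real problem when `resolve` has nothing to return, which is the “no authoritative source” case.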

Having an idea of what pieces exist, how they fit together, and how they will evolve while remaining aligned is, in my opinion, critical for any system. When it’s a complex socio-technical system, this awareness needs to span the whole enterprise stack (social and technical). Time and effort spent maintaining coherence across the enterprise, rather than detracting from the primary concerns, will actually enhance them.

Are you confident that the plane will stay in the air or just hoping that the wing doesn’t fall off?

Form Follows Function on SPaMCast 446

SPaMCAST logo

It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

This week’s episode, number 446, features Tom’s essay on questions, a powerful tool for coaches and facilitators. A Form Follows Function installment based on my post “Go-to People Considered Harmful” comes next and Kim Pries rounds out the podcast with a Software Sensei column on servant leadership.

Our conversation in this episode continues with the “organizations as systems” concept and how concentrating institutional knowledge in go-to people creates a dependency management nightmare. Social systems run on relationships and when we allow knowledge and skill bottlenecks to form, we set our organization up for failure. Specialists with deep knowledge are great, but if they don’t spread that knowledge around, we risk avoidable disasters when they’re unavailable. Redundancy aids resilience.

You can find all my SPaMCast episodes under the SPaMCast Appearances category on this blog. Enjoy!

Go-to People Considered Harmful

Neck of Codd bottle

Okay, so the title’s a little derivative, but it’s both accurate and in keeping with the “organizations as systems” theme of recent posts. Dependency management is just as critical for social systems as it is for software systems. Failures anywhere along the chain of execution can potentially bring the whole system to a halt if resilience isn’t considered in the design (and evolution) of the system.

Dependency issues in social systems can take a variety of forms. One that comes easily to mind is what is referred to as the “bus factor” – how badly the team is affected if a person is lost (e.g. hit by a bus). Roy Osherove’s post from today, “A Critical Chain of Bus Factors”, expands on this. Interlocking chains of dependencies can multiply the bus factor:

A chain of bus factors happens when you have bus factors depending on bus factors:

Your one developer who knows how to configure the pipeline can’t test the changes because the agent is down. The one guy in IT who has access to the agent needs to reboot it, but does not have access. The one person who has access to reboot it (in the Infra team) is sick, so now there are three people waiting, and there is nothing in this earth that can help that situation.

The “bus factor”, either individually or as a cascading chain, is only part of the problem, however. A column on CIO.com, “The hazards of go-to people”, identifies the potential negative impacts on the go-to person:

They may:

  • Resent that they shoulder so much of the burden for the entire group.
  • Feel underpaid.
  • Burn out from the stress of being on the never-ending-crisis treadmill.
  • Feel trapped and unable to progress in their careers since they are so important in the role that they are in.
  • Become arrogant and condescending to their peers, drunk with the glory of being important.

The same column also lists potential problems for those who are not the go-to person:

When they realize that they are not one of the go-to people they might:

  • Feel underappreciated and untrusted.
  • Lose the desire to work hard since they don’t feel that their work will be recognized or rewarded.
  • Miss out on the opportunities to work on exciting or important things, since they are not considered dedicated and capable.

A particularly nasty effect of relying on go-to people is that it’s self-reinforcing if not recognized and actively worked against. People get used to relying on the specialist (which is, admittedly, very effective right up until the bus arrives) and neglect learning to do for themselves. Osherove suggests several methods to mitigate these problems: pairing, teaching, rotating positions, etc. The key idea is spreading the knowledge around.

Having individuals with deep knowledge can be a good thing if they’re a reservoir supplying others and not a pipeline constraining the flow. Intentional management of dependencies is just as important in social systems as in software systems.
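To make the dependency view a little more concrete, here is a minimal sketch of how a team might check its own “bus factor”. It is only an illustration: the task names and people are hypothetical, loosely modeled on the pipeline example quoted above, and the “who can do what” map is an assumption about how such knowledge might be recorded.

```python
# Hypothetical map of tasks to the people able to perform them.
who_can = {
    "configure pipeline": {"dev_a"},
    "access build agent": {"it_b"},
    "reboot build agent": {"infra_c", "infra_d"},
}


def bus_factor(who_can):
    """Smallest number of people whose loss leaves some task uncovered."""
    return min(len(people) for people in who_can.values())


def single_points_of_failure(who_can):
    """Tasks that only one person can perform."""
    return sorted(task for task, people in who_can.items() if len(people) == 1)


def blocked_tasks(who_can, unavailable):
    """Tasks nobody can perform while the given people are unavailable."""
    return sorted(
        task for task, people in who_can.items() if not (people - unavailable)
    )


print(bus_factor(who_can))
print(single_points_of_failure(who_can))
print(blocked_tasks(who_can, {"infra_c"}))  # a second person covers the reboot
print(blocked_tasks(who_can, {"it_b"}))
```

If a piece of work needs every task in the map, then any non-empty result from `blocked_tasks` stalls the whole chain. Pairing, teaching, and rotation all amount to adding names to the smallest sets.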

Capability Now, Capability Later

Mock tank, British Army in Italy, WWII

In my post “Strategic Tunnel Vision”, I touched on the concept of capability. I discussed how focusing on new capabilities can crowd out existing capabilities and the detrimental effects of that when those existing capabilities are still necessary. I also spoke to how choices about strategic capabilities can trickle down to affect tactical capabilities.

What I failed to do, however, was define what was meant by the term “capability”. That’s a pretty big oversight on my part, because, in my opinion, understanding the concept is critical across all levels of architectural concerns.

Tom Graves, in his “Definitions on capability”, defines the term (along with some related concepts):

— Capability: the ability to do something.

— Capability-based planning: planning to do something, based on capabilities that already exist, and/or that will be added to the existing suite of capabilities; also, identifying the capabilities that would be needed to implement and execute a plan.

— Capability increment: an extension to an existing capability; also, a plan to extend a capability.

— Capability map: a visual and/or textual description of (usually) an organisation’s capabilities.

Yes, I do know that those definitions are terribly bland and generic – and they need to be that way. That’s the whole point: they need to be generic enough to be valid and usable at every possible level and in every possible context – otherwise we’ll introduce yet more confusion to something that’s often way too confused already.

That last paragraph is critical. The concept of “capability” is a high-level one that is useful across multiple levels of architectural concern (i.e., application, solution, enterprise IT, and the enterprise itself). Quoting Tom again:

Note what else is intentionally not in that definition of ‘capability’:

  • there’s no actual doing – it’s just an ability to do something, not the usage of that ability
  • there’s no ‘how’ – we don’t assume anything about how that capability works, or what it’s made up of
  • there’s no ‘why’ – we don’t assume any particular purpose
  • there’s no ‘who’ – we don’t assume anything about who’s responsible for this capability, or where it sits in an organisational hierarchy or suchlike

We do need all of those items, of course, as we start to flesh out the details of how the capabilities would be implemented and used in real-world practice. But in the core-definition itself, we very carefully don’t – they must not be included in the definition itself.

The reason why we have to be so careful and pedantic about this is because the relationship between service, capability, function and the rest is inherently recursive and fractal: each of them contains all of the others, which in turn each contain all of the others, and so on almost to infinity. If we don’t use deliberately-generic definitions for all of those items, we get ourselves into a tangle very quickly indeed – as can be seen all too easily in the endless definitional-battles about the relationships between ‘business-function’ versus ‘business-process’ versus ‘business-service’ versus ‘business-capability’ and so on.

In short, it’s a crucial building block in our designs and plans (which is redundant, since design is a form of planning). If we don’t have and can’t get the ability to do something, it’s game over. However, as Tom noted, we need to move beyond the raw ability in order to make effective use of capabilities. We need to think about timing and personnel (which will probably largely drive timing anyway). A capability later may well not be as valuable as the same capability right now.

This was brought to mind while skimming a book review on a military strategy site (emphasis added by me):

In March 2015, then-Chief of Staff of the U.S. Army General Raymond T. Odierno admitted to the British newspaper The Telegraph that the so-called special relationship between the United States and Great Britain isn’t what it used to be. “In the past we would have a British Army division working alongside an American army division,” he said, but he feared that in the future British battalions and brigades would have to operate “inside” American units. “What has changed,” Odierno declared, “is the level of capability.”

Later that week, I asked a senior British general about Odierno’s remarks. He replied, deadpan, that although Odierno’s candor was appreciated, his statement was factually incorrect. “We can still field a division,” the general insisted. “It is just a question of how long it takes us to field one.” Potential tanks, he seemed to think, were just as relevant as an actual ones.

The highlighted portion of the quote illustrates my point. Having the capability to do something immediately and the capability to do that same thing at some point in the future are not equivalent (just to be fair to the British Army, the US Army was in this same position during Operation Desert Shield – the initial ground forces that could be deployed were extremely thin). Treating them as equivalent potentially risks disaster.

It should be noted, however, that the level of concern will color the perception of the value of a future capability versus a current one. At the tactical level, in business as well as in war, “…first with the most…” is likely a winning move. At the strategic level, however, where resources must be budgeted across multiple initiatives, priorities should dictate which capabilities get preference. Tactical leaders may have to be satisfied with “on time with just enough”.

Regardless of level, a clear assessment of capabilities, what’s available when, is key to making effective decisions.
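One way to picture “what’s available when” is a capability inventory that records a lead time alongside each capability. The sketch below is purely illustrative: the `Capability` type, the capability names, and the lead times are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    lead_time_days: int  # 0 means usable right now


# Hypothetical inventory; names and lead times are illustrative only.
inventory = [
    Capability("order fulfilment", 0),
    Capability("fraud screening", 30),
    Capability("fulfilment in a new region", 180),
]


def available_within(inventory, days):
    """Names of capabilities that can actually be used within `days`."""
    return [c.name for c in inventory if c.lead_time_days <= days]


print(available_within(inventory, 0))
print(available_within(inventory, 90))
```

All three entries count as capabilities the organization “has”, but a plan that needs them this quarter can rely on only two of them.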

Strategic Tunnel Vision

Mouth of a Tunnel

 

Change and innovation are topics that have been prominent on this blog over the last year. In fact, Greger Wikstrand and I have traded a total of twenty-six posts (twenty-seven counting this one) on the subject.

Greger’s last post, “Successful digitization requires focus on the entire customer experience – not just a neat app” (it’s in Swedish, but it translates well to English), discussed the critical nature of customer experience to digital innovation. According to Greger, without taking customer experience into account:

One can make the world’s best app without getting more, more satisfied and profitable customers. It’s like trying to make a boring game more exciting by spraying gold paint on the playing pieces.

Change and innovation are not the same thing. Change is inevitable, innovation is not (with a h/t to Tom Cagley for that quote). As Greger pointed out in his latest article, to get improved customer experience, you need depth. Sprinkling digital fairy dust over something is not likely to result in innovation. New and different can be really great, but new and different solely for the sake of new and different doesn’t win the prize. Context is critical.

If you’ve read more than a couple of my posts, you’ve probably realized that among my rather varied interests, history is a major one. I lean heavily on military history in particular when discussing innovation. This post won’t break with that tradition.

The blog Defense in Depth, operated by the Defence Studies Department, King’s College London, has published two posts this week dealing with the Suez Crisis of 1956, primarily in terms of the Anglo-French forces. One deals with the land operations and the other with naval operations. They struck a chord because they both illustrated how an overreaction to change can have drastic consequences from the strategic level down to the tactical.

Buying into a fad can be extremely expensive.

The advent of the nuclear age at the end of World War II dramatically transformed military and political thought. The atomic bomb was the ultimate game-changer in that respect. In the time-honored tradition, the response was over-reaction. “Atomic” was the “digital” of the late 40s into the 60s. They even developed a recoilless gun that could launch a 50-pound nuclear warhead 1.25 to 2.5 miles. “Move fast and break things” was serious business back in the day.

This extreme focus on what had changed, however, led to a rather common problem, tunnel vision. Nuclear capability became such an overarching consideration that other capabilities were neglected. Due to this neglect of more conventional capabilities, the UK’s forces were seriously hampered in their ability to perform their mission effectively. Misguided thinking at the strategic level affected operations all the way down to the lowest tactical formations.

It’s easy to imagine present-day IT scenarios that fall prey to the same issues. A cloud or digital initiative given top priority without regard to maintaining necessary capabilities could easily wind up failing in a costly manner and impairing the existing capability. It’s important to understand that time, money, and attention are finite resources. Adding capability requires increasing the resources available for it, either through adding new resources or freeing up existing ones by reducing the commitment to less important capabilities. If there is no real appreciation of what capabilities exist and what the relative value of each is, making this decision becomes a shot in the dark.

Situational awareness across all levels is required. To be effective, that awareness must integrate changes to the context while not losing sight of what already was. Otherwise, to use a metaphor from my high school football days, you risk acting like a “blind dog in a meat-packing plant”.