Form Follows Function on SPaMCast 450

SPaMCAST logo

It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

This week’s episode, number 450, features Tom’s excellent essay on roadmaps and a Form Follows Function installment based on my post “Holistic Architecture – Keeping the Gears Turning”.

Our conversation in this episode continues with the organizations as system concept, this time from the standpoint of how the social system impacts (often negatively) the software systems the social systems rely on. Specifically, we talk about how an organization that fails to manage itself as a system can lead to an architecture of both the enterprise and its IT that resembles “spare parts flying in formation”. It’s not a good situation, no matter how well made those spare parts are!

You can find all my SPaMCast episodes using under the SPaMCast Appearances category on this blog. Enjoy!

Advertisement

Holistic Architecture – Keeping the Gears Turning

Gears Turning Animation

In last week’s post, “Trash or Treasure – What’s Your Legacy?”, I talked about how to define “legacy systems”. Essentially, as the divergence grows between the needs of social systems and the fitness for purpose of the software systems that enable them, the more likely that those software systems can considered “legacy”. The post attracted a few comments.

I love comments.

It’s nearly impossible to have writers’ block when you’ve got smart people commenting on your work and giving you more to think about. I got just that courtesy of theslowdiyer. The comment captured a critical point:

Agree that ALM is important, and actually also for a different reason – a financial one:

First of all, the cost of operating the system though the full Application Life Cycle (up to and including decommissioning) needs to be incorporated in the investment calculation. Some organisations will invariably get this wrong – by accident or by (poor) design (of processes).

But secondly (and this is where I have seen things go really wrong): If you invest a capability in the form of a new system then once that system is no longer viable to maintain, you probably still need the capability. Which means that if you are adding new capabilities to your system landscape, some form of accruals to sustain the capability ad infinitum will probably be required.

The most important thing is the capability, not the software system.

The capability is an organizational/enterprise concern. It belongs to the social systems that comprise the organization and the over-arching enterprise. This is not to say that software systems are not important – lack of automation or systems that have slipped into the legacy category can certainly impede the enterprise. However, without the enterprise, there is no purpose for the software system. Accordingly, we need to keep our focus centered on the key concern, the capability. So long as the capability is important to enterprise, then all the components, both social and technical, need to be working in harmony. In short, there’s a need for cohesion.

Last fall, Grady Booch tweeted:

Ruth Malan replied with a great illustration of it from her “Design Visualization: Smoke and Mirrors” slide deck:

Obviously, no one would want to fly on a plane in that state (which illustrates the enterprise IT architecture of too many organizations). The more important thing, however, is that even if the plane (the technical architecture of the enterprise) is perfectly cohesive, if the social system maintaining and operating it is similarly fractured, it’s still unsafe. If I thought that pilots, mechanics, and air traffic controllers were all operating at cross purposes (or at least without any thought of common cause), I’d become a fan of travel by train.

Unfortunately, for too many organizations, accidental architecture is the most charitable way to describe the enterprise. Both social and technical systems have been built up on an ad hoc basis and allowed to evolve without reference to any unifying plan. Technical systems tend to be built (and worse, maintained) according to project-oriented mindset (aka “done and run”) leading to an expensive cycle of decay, then fix. The social systems can become self-perpetuating fiefs. The level of cohesion between the two, to the extent that it existed, breaks down even more.

A post from Matt Balantine, “Garbage In” illustrates the cohesion issue across both social and technical systems. Describing an attempt to analyze spending data across a large organization composed of federated subsidiaries:

The theory was that if we could find the classifications that existed across each of the organisations, we could then map them, Rosetta Stone-like, to a standard schema. As we spoke to each of the organisations we started to realise that there may be a problem.

The classification systems that were in use weren’t being managed to achieve integrity of data, but instead to deliver short-term operational needs. In most cases the classification was a drop-down list in the Finance system. It hadn’t been modelled – it just evolved over time, with new codes being added as necessary (and old ones not being removed because of previous use). Moreover, the classifications weren’t consistent. In the same field information would be encapsulated in various ways.

Even in more homogeneous organizations, I would expect to find something similar. It’s extremely common for aspects of one capability to bear on others. What is the primary concern for one business unit may be one of many subsidiary concerns for another (see “Making and Taming Monoliths” for an illustrated example). Because of the disconnected way capabilities (and their supporting systems) are traditionally developed, however, there tends to be a lot of redundant data. This isn’t necessarily a bad thing (e.g. a cache is redundant data maintained for performance purposes). What is a bad thing is when the disconnects cause disagreements and no governance exists to mediate the disputes. Not having an authoritative source is arguably worse than having no data at all since you don’t know what to trust.

Having an idea of what pieces exist, how they fit together, and how they will evolve while remaining aligned is, in my opinion, critical for any system. When it’s a complex socio-technical system, this awareness needs to span the whole enterprise stack (social and technical). Time and effort spent maintaining coherence across the enterprise, rather than detracting from the primary concerns will actually enhance them.

Are you confident that the plane will stay in the air or just hoping that the wing doesn’t fall off?

Form Follows Function on SPaMCast 442

SPaMCAST logo

A new month brings a new appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

This week’s episode, number 442, features Tom’s excellent essay on capability teams (highly recommended!), followed by a Form Follows Function installment based on my post “Systems of Social Systems and the Software Systems They Create”. Kim Pries bats cleanup with a Software Sensei column, “Software Quality and the Art of Skateboard Maintenance”.

In this episode, Tom and I continue our discussion on the organizations as system concept and how systems must fit into their context and ecosystem. In my previous posts on the subject, I took more of a top down approach. With this post, I flipped things around to a bottom up view. Understanding how the social and software systems interact (including the social system involved in creating/maintaining the software system) is critical to avoid throwing sand in the gears.

You can find all my SPaMCast episodes using under the SPaMCast Appearances category on this blog. Enjoy!

Systems of Social Systems and the Software Systems They Create

I’ve mentioned before that the idea of looking at organizations as systems is one that I’ve been focusing on for quite a while now. From a top-down perspective, this makes sense – an organization is a system that works better when it’s component parts (both machine and human) intentionally work together.

It also works from the bottom up. For example, from a purely technical perspective, we have a system:

Generic System

However, without considering those who use the system, we have limited picture of the context the system operates within. The better we understand that context, the better we can shape the system to fit the context, otherwise we risk the square peg in a round hole situation:

Generic System with Users

Of course, the users who own the system are also only a part of the context. We have to consider the customers as well:

Generic System with Users and customers

Likewise, we need to consider that the customers of some systems can be internal to the organization while others are external. Some of the “customers” may not even be human. For that matter, sometimes the customer’s interface might be a human (user) rather than software. Things get complicated when we begin adding in the social systems:

Generic EITA with Users and customers

The situation is even more complicated than what’s seen above. We need to account for the team developing and operating the automated system:

Generic System with Users, customers, and IT team

And if that team is not a unified whole, then the picture gets a whole lot more interesting:

Generic System with Users, customers, and IT teams

Zoomed out to the enterprise level, that’s a lot of social systems. When multiplied by the number of automated systems involved, the number easily becomes staggering. What’s even more sobering is reflecting on whether those interactions have been intentionally structured or have grown organically over time. The interrelationship of social and software systems is under-appreciated. A series of tweets from Gregory Brown last week makes the same case:

A number of questions come to mind:

  • Is anyone aware of all the systems (social and software) in play?
  • Is anyone aware of all the interactions between these systems?
  • Are the relationships and interactions a result of intentional design or have they “just happened”?
  • Are you comfortable with the answers to the first three questions above?

Strategic Tunnel Vision

Mouth of a Tunnel

 

Change and innovation are topics that have been prominent on this blog over the last year. In fact, Greger Wikstrand and I have traded a total of twenty-six posts (twenty-seven counting this one) on the subject.

Greger’s last post, “Successful digitization requires focus on the entire customer experience – not just a neat app” (it’s in Swedish, but it translates well to English), discussed the critical nature of customer experience to digital innovation. According to Greger, without taking customer experience into account:

One can make the world’s best app without getting more, more satisfied and profitable customers. It’s like trying to make a boring games more exciting by spraying gold paint on the playing pieces.

Change and innovation are not the same thing. Change is inevitable, innovation is not (with a h/t to Tom Cagley for that quote). As Greger pointed out in his latest article, to get improved customer experience, you need depth. Sprinkling digital fairy dust over something is not likely result in innovation. New and different can be really great, but new and different solely for the sake of new and different doesn’t win the prize. Context is critical.

If you’ve read more than a couple of my posts, you’ve probably realized that among my rather varied interests, history is a major one. I lean heavily on military history in particular when discussing innovation. This post won’t break with that tradition.

The blog Defense in Depth, operated by the Defence Studies Department, King’s College London, has published two posts this week dealing with the Suez Crisis of 1956, primarily in terms of the Anglo-French forces. One deals with the land operations and the other with naval operations. They struck a chord because they both illustrated how an overreaction to change can have drastic consequences from the strategic level down to the tactical.

Buying into a fad can be extremely expensive.

The advent of the nuclear age at the end of World War II dramatically transformed military and political thought. The atomic bomb was the ultimate game-changer in that respect. In the time-honored tradition, the response was over-reaction. “Atomic” was the “digital” of the late 40s into the 60s. They even developed a recoilless gun that could launch a 50 pound nuclear warhead 1.25-2.5 miles. “Move fast and break things” was serious business back in the day.

This extreme focus on what had changed, however, led to a rather common problem, tunnel vision. Nuclear capability became such an overarching consideration that other capabilities were neglected. Due to this neglect of more conventional capabilities, the UK’s forces were seriously hampered in their ability to perform their mission effectively. Misguided thinking at the strategic level affected operations all the way down to the lowest tactical formations.

It’s easy to imagine present-day IT scenarios that fall prey to the same issues. A cloud or digital initiative given top priority without regard to maintaining necessary capabilities could easily wind up failing in a costly manner and impairing the existing capability. It’s important to understand that time, money, and attention are finite resources. Adding capability requires increasing the resources available for it, either through adding new resources or freeing up existing ones by reducing the commitment to less important capabilities. If there is no real appreciation of what capabilities exist and what the relative value of each is, making this decision becomes a shot in the dark.

Situational awareness across all levels is required. To be effective, that awareness must integrate changes to the context while not losing sight of what already was. Otherwise, to use a metaphor from my high school football days, you risk acting like a “blind dog in a meat-packing plant”.

Monolithic Applications and Enterprise Gravel

Pebbles

It’s been almost a year since I’ve written anything about microservices, and while a lot has been said on that subject, it’s one I still monitor to see what new pops up. The opening of a blog post that I read last week caught my attention:

Coined by Melvin Conway in 1968, Conway’s Law states: “Any organization that designs a system will produce a design whose structure is a copy of the organization’s communication structure.” In software development terms, Conway’s Law suggests that a given team will build apps that mirror the team’s organizational structure. Siloed functional teams produce siloed application architectures.

The result is a monolith: A massive application whose functionality is crammed into a few crowded parts. Scaling a simple pattern to the enterprise level often results in a monolith.

None of this is wrong, per se, but in reading it, one could come to a wrong conclusion. Siloed functional teams (particularly where the culture of the organization encourages siloed business units) produce siloed application architectures that are most likely monoliths. From an enterprise IT architecture aspect, though, the result is not monolithic. Googling the definition of “monolithic”, we get this:

mon·o·lith·ic
ˌmänəˈliTHik/
adjective
  1. formed of a single large block of stone.
  2. (of an organization or system) large, powerful, and intractably indivisible and uniform.
    “rejecting any move toward a monolithic European superstate”
    synonyms: inflexible, rigid, unbending, unchanging, fossilized
    “a monolithic organization”

Rather than “a single large block of stone”, we get gravel. The architecture of the enterprise’s IT isn’t “large, powerful, and intractably indivisible and uniform”. It may well be large, but its power in relation to its size will be lacking. Too much effort is wasted reinventing wheels and maintaining redundant data (most likely with no real sense of which set of data is authoritative). Likewise, while “intractably indivisible” isn’t a virtue, being intractable while also lacking cohesion is worse. Such an IT architecture is a foundation built on shifting sand. Lastly, whether the EITA is uniform or not (and I would give good odds that it’s not), is irrelevant given the other negative aspects. Under the circumstances, worrying about uniformity would be like worrying about whether the superstructure of the Titanic had a fresh paint job.

Does this mean that microservices are the answer to having an effective EITA? Hardly.

There are prerequisites for being able to support a microservice architecture; table stakes, if you will. However, the service-oriented mindset can be of value whether it’s applied as far down as the intra-application level (i.e. microservices – it is an application architecture pattern) or inter-application (the more traditional SOA). Where the line is drawn depends on the context of the application(s) and their ecosystem. What can be afforded and supported are critical aspects of the equation at all levels.

What is necessary for an effective EITA is a full-stack approach. Governance and data architecture in particular are important aspects to consider. The goal is consistent, intentional alignment across all levels (enterprise, EITA, solution, and application), promoting a cohesive architecture throughout, not a top-down dictatorship.

Large edifices that last are built from smaller pieces that fit together on purpose.

Building a Legacy

Greek Trireme image from Deutsches Museum, Munich, Germany

 

Over the last few weeks, I’ve run across a flurry of articles dealing with the issue of legacy systems used by the U.S. government.

An Associated Press story on the findings from the Government Accountability Office (GAO) issued in May reported that roughly three-fourths of the $80 billion IT budget was used to maintain legacy systems, some more than fifty years old and without an end of life date in sight. An article on CIO.com about the same GAO report detailed seven of the oldest systems. Two were over 56 years old, two 53, one 51, one 35, and one 31. Four of the seven have plans to be replaced, but the two oldest have no replacement yet planned.

Cost was not the only issue, reliability is a problem as well. An article on Timeline.com noted:

Then there’s the fact that, up until 2010, the Secret Service’s computer systems were only operational about 60% of the time, thanks to a highly outdated 1980s mainframe. When Senator Joe Lieberman spoke out on the issue back in 2010, he claimed that, in comparison, “industry and government standards are around 98 percent generally.” It’s alright though, protecting the president and vice president is a job that’s really only important about 60 percent of the time, right?

It would be easy to write this off as just another example of public-sector inefficiency, but you can find these same issues in the private sector as well. Inertia can, and does, affect systems belonging to government agencies and business alike. Even a perfectly designed implemented system (we’ve all got those, right?) is subject to platform rot if ignored. Ironically, our organizations seem designed to do just that by being project-centric.

In philosophy, there’s a paradox called the Ship of Theseus, that explores the question of identity. The question arises, if we maintain something by replacing its constituent parts, does it remain the same thing? While many hours could be spent debating this, to those whose opinion should matter most, those who use the system, the answer is yes. To them, the identity of the system is bound up in what they do with it, such that it ceases to be the same thing, not when we maintain it but when its function is degraded through neglect.

Common practice, however, separates ownership and interest. Those with the greatest interest in the system typically will not own the budget for work on it. Those owning the budget, will typically be biased towards projects which add value, not maintenance work that represents cost.

Speaking of cost, is 75% of the budget an unreasonable amount for maintenance? How well are the systems meeting the needs of their users? Is quality increasing, decreasing, or holding steady? Was more money spent because of deferred maintenance than would have been spent with earlier intervention? How much business risk is involved? Without this context, it’s extremely difficult to say. It’s understandable that someone outside an organization might lack this information, but even within it, would a centralized IT group have access to it all? Is the context as meaningful at a higher, central level as it is “at the pointy end of the spear”?

Maintaining systems bit by bit, replacing them gradually over time, is likely to be more successful and less expensive, than letting them rot and then having a big-bang re-write. In my opinion, having an effective architecture for the enterprise’s IT systems is dependent on having an effective architecture for the enterprise itself. If the various systems (social and software) are not operating in conjunction, drift and inertia will take care of building your legacy (system).

[Greek Trireme image from Deutsches Museum, Munich, Germany via Wikimedia Commons]

Full Stack Enterprises (Who Needs Architects?)

In my last post, “Locking Down the Prisoners: Control, Conflict and Compliance for Organizations”, I returned to a topic that I’ve been touching on periodically over the last year, organizations as systems, which overlaps significantly with the topic of enterprise architecture (not to be confused with enterprise IT architecture of which EA is a superset). While I do not pretend to be an expert on the subject, the fractal nature of the environment I work within as a software architect makes it difficult (perhaps even dangerous) to ignore.

Systems reside within ecosystems, which being systems themselves, reside within an ecosystem of their own. Both systems and ecosystems are mutually influencing and that influence must be understood to the extent possible and accounted for, both upstream and down. Where this fails to happen, coherence between information systems and social systems breaks down. Significantly, issues in systems at higher levels of granularity can negate positive aspects of the systems that comprise them.

This is probably a good place to point out that my use of terms like “social system” is not meant to remove the human aspect. On the contrary, social systems are intensely human in nature. Where those systems are dysfunctional, it is individuals that ultimately pay the price. The intent, is to point out the inter-relationships that make up our environment.

Consider the hypothetical systems outlined in “Making and Taming Monoliths”. Assuming all of the systems involved were modular enough and technically capable of interoperation, it would not be enough. Considerations of data architecture (starting with, which system is authoritative?) could spike the effort, or at least seriously delay it. Organizational structure (aka Conway’s Law), behavior (management, governance, process), and psychology/culture could all either impel or impede. A perfect example of this concept is the near requirement that DevOps be in place where microservices are used – it may be possible to develop and deploy microservices in other process models, but it will involve far more difficulty.

“Alignment” is a term that is often mentioned in terms of IT. However, even if an organization’s IT systems are perfectly aligned with the business units they support, it will do little good if those units are working at cross purposes. Just as the full stack of an application requires coherent, intentional design to work well, so too does an enterprise. This does not, however, imply that a rigid, top-down mode of operation is needed; it’s actually quite the opposite.

In “Auftragstaktik and fingerspitzengefühl”, Tom Graves described the UK’s World War II air defense system as an example of an effective enterprise. The two German words refer to the concepts of situational awareness (fingerspitzengefühl) and mission tactics (Auftragstaktik), which underlay the success of the system. In essence, situational awareness at each level was enabled by the flow of information up and down the hierarchy, while decisions appropriate to each level were made at that level informed by understanding of the situation and the commander’s intent. This is not just a matter of process. As Tom noted:

Another key element is organisational-culture – whether the culture invites or dissuades individual judgement within real-time action (auftragstaktik), elicitation and capture of real-world subtleties (for fingerspitzengefühl) and/or whistleblower-type algedonic responses.

An organization must not only be structured so as to achieve its aims, the drive to do so must be part of its DNA. Even when some components of the organization may have remits that conflict with others (think audit, accounting, InfoSec, etc.), the ultimate direction of all parts should harmonize. Structuring a system so as to resolve conflicts is an architectural practice, regardless of the type of system.

“Microservice Architectures aren’t for Everyone” on Iasa Global

Simon Brown says it nicely:

Architect Clippy is a bit more snarky:

In both cases, the message is the same: microservice architectures (MSAs), in and of themselves, are not necessarily “better”. There are aspects where moving to a distributed architecture from a monolithic one can improve a system (e.g. enabling selective load balancing and incremental deployment). However, if someone isn’t capable of building a modular monolith, distributing the pieces is unlikely to help matters.

See the full post on the Iasa Global site (a re-post, originally published here).

Microservices, Monoliths, and Conflicts to Resolve

Two tweets, opposites in position, and both have merit. Welcome to the wonderful world of architecture, where the only rule is that there is no rule that survives contact with reality.

Enhancing resilience via redundancy is a technique with a long pedigree. While microservices are a relatively recent and extreme example of this, they’re hardly groundbreaking in that respect. Caching, mirroring, load-balancing, etc. has been with us a long, long time. Redundancy is a path to high availability.

Centralization (as exemplified by monolithic systems) can be a useful technique for simplification, increasing data and behavioral integrity, and promoting cohesion. Like redundancy, it’s a system design technique older than automation. There was a reason that “all roads lead to Rome”. Centralization provides authoritativeness and, frequently, economies of scale.

The problem with both techniques is that neither comes without costs. Redundancy introduces complexity in order to support distributing changes between multiple points and reconciling conflicts. Centralization constrains access and can introduce a single point of failure. Getting the benefits without the incurring the costs remains a known issue.

The essence of architectural design is decision-making. Given that those decisions will involve costs as well as benefits, both must be taken into account to ensure that, on balance, the decision achieves its aims. Additionally, decisions must be evaluated in the greater context rather than in isolation. As Tom Graves is fond of saying “things work better when they work together, on purpose”.

This need for designs to not only be internally optimal, but also optimized for their ecosystem means that these, as well as other principles, transcend the boundaries between application architecture, enterprise IT architecture, and even enterprise architecture. The effectiveness of this fractal architecture of systems of systems (both automated and human) is a direct result of the appropriateness of the decisions made across the full range of the organization to the contexts in play.

Since there is no one context, no rule can suffice. The answer we’re looking for is neither “microservice” nor “monolith” (or any other one tactic or technique), but fit to purpose for our context.