Microservice Mistakes – Complexity as a Service

Michael Feathers’ tweet about technical empathy packs a lot of wisdom into 140 characters. Lack of technical empathy can lead to a system that is harder to both implement and maintain since no thought was given to simplifying things for the caller. Maintainability is one of those quality of service requirements that appears to be a purely technical consideration right up to the point that it begins significantly affecting development time. If this sounds suspiciously like technical debt to you, then move to the head of the class.

The issue of technical empathy is particularly on point given the popular interest in microservice architectures (MSAs). The granular nature of the MSA style brings many benefits, but also comes with the cost of increased complexity (among others). Michael Feathers referred to this in a post from last summer titled “Microservices Until Macro Complexity”:

It is going to be interesting to see how this approach scales. Some organizations have a relatively low number of microservices. Others are pushing higher, around the 600 mark. This is a bit beyond the point where people start seeking a bigger picture. If services are often bigger than classes in OO, then the next thing to look for is a level above microservices that provides a more abstract view of an architecture. People are struggling with that right now and it was foreseeable. Along with that concern, we have the general issue of dealing with asynchrony and communication patterns between services. I strongly believe that there is a law of conservation of complexity in software. When we break up big things into small pieces we invariably push the complexity to their interaction.

The last sentence bears repeating: “When we break up big things into small pieces we invariably push the complexity to their interaction”. Breaking a monolith into microservices simplifies each individual component. However, as Eran Hammer observed in “The Fallacy of Tiny Modules”, “…at some point, someone has to put it all together and then, all the complexity is going to surface…”. As Yamen Sader illustrated in his slide deck “Microservices 101 – The Big Why” (slide #26), the structure of the organization, domain, and system will diverge in an MSA. The implication of this is that for a given domain task, we need to know more about the details of that task (i.e. have less encapsulation of the internals) in a microservice environment.

The further removed the consumers of these services are from the providers, the harder and less convenient it will be to transfer that knowledge. To put this in perspective, consider two fast food restaurants: one operates in a traditional manner where money is exchanged for burgers, the other is a microservice style operation where you must obtain the beef, lettuce, pickles, onion, cheese, and buns separately, after which the cooking service combines them (provided the money is there also). The second operation will likely be in business for a much shorter period of time in spite of the incredible flexibility it offers. Additionally, that flexibility only truly exists when the contracts between two or more providers are swappable. As Ben Morris noted in “Do Microservices create extra challenges for distributed development?”:

Each team will develop its own interpretation of the system and view of the data model. Any service collaboration will have to involve some element of translation or mapping and will be vulnerable to subtle bugs that can arise from semantic differences.

Adopting a canonical model across the platform can help to address this problem by asserting a common vocabulary. Each service is expected to exchange information based on a shared definition of the main entities. However, this requires a level of cross-team governance which is very much at odds with the decentralised nature of microservices.

None of this is meant to say that the microservice architecture style is “bad” or “wrong”. It is my opinion, however, that MSA is first and foremost an architectural style of building applications rather than systems of systems. This isn’t an absolute; I could see multiple MSA applications within an organization sharing component microservices. Where those microservices transcend the organizational boundary, however, making use of them becomes more complex. Actions at a higher level of granularity, such as placing an order for some product or service, involve coordinating multiple services in the MSA realm rather than consuming one “chunkier” service. While the principles behind microservice architectures are relevant at higher levels of abstraction, a mismatch in granularity between the concept and implementation can be very troublesome. Maintainability issues are both technical debt and a source of customer dissatisfaction. External customers are far easier to lose than internal ones.
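
To make the granularity point concrete, here is a minimal Python sketch of a “chunkier” order service wrapping several fine-grained ones. Every name in it (`InventoryService`, `PricingService`, `PaymentService`, `ShippingService`, `OrderFacade`, the flat price) is hypothetical and stands in for a remote call; it illustrates the idea and is not a recommended design.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: str
    sku: str
    quantity: int


# Hypothetical fine-grained services; each method stands in for a remote call.
class InventoryService:
    def reserve(self, sku: str, quantity: int) -> str:
        return f"reservation:{sku}:{quantity}"


class PricingService:
    def price_cents(self, sku: str, quantity: int) -> int:
        return 500 * quantity  # flat illustrative price


class PaymentService:
    def charge(self, customer_id: str, amount_cents: int) -> str:
        return f"payment:{customer_id}:{amount_cents}"


class ShippingService:
    def schedule(self, reservation_id: str) -> str:
        return f"shipment-for:{reservation_id}"


class OrderFacade:
    """Coarser-grained entry point that hides the coordination steps."""

    def __init__(self) -> None:
        self._inventory = InventoryService()
        self._pricing = PricingService()
        self._payment = PaymentService()
        self._shipping = ShippingService()

    def place_order(self, order: Order) -> dict:
        # The sequencing knowledge lives here rather than in every caller.
        reservation = self._inventory.reserve(order.sku, order.quantity)
        amount = self._pricing.price_cents(order.sku, order.quantity)
        payment = self._payment.charge(order.customer_id, amount)
        shipment = self._shipping.schedule(reservation)
        return {"reservation": reservation, "payment": payment, "shipment": shipment}


receipt = OrderFacade().place_order(Order("c-42", "burger", 2))
```

Without the facade, each external consumer would have to know the four calls and their order, which is the knowledge transfer problem described above.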


Architecture in Context – Part 2

By Charlie Alfred and Gene Hughson

Up until this point, we’ve discussed what it means for Architecture to have context.  Contexts enable us to reason about the behavior of a group of stakeholders, whether they be buyers, end users, support staff, distributors, manufacturers or suppliers.

As we’ve pointed out, virtually all products or services are multi-context.  This means that the architecture of these products or services, if they expect to be effective, must also be multi-context.  A multi-context architecture is a lot like juggling.  You must balance your attention across an array of concerns and try to satisfy an array of stakeholders.  The following sidebar illustrates this point:

In the early 1960’s, the United States found itself engaged in a space race with the USSR.  Government officials in both countries  believed strongly that the first country to travel successfully in space and land men on the moon would have a significant military advantage.

President Kennedy collaborated with NASA to initiate a space program designed to put men on the moon’s surface and return them safely to earth, and achieve this by the end of the 1960’s.  On July 20, 1969, the Apollo 11 mission was successful at landing Neil Armstrong and Buzz Aldrin on the moon, and four days later, on July 24th, 1969, the crew safely splashed down in  the Pacific Ocean, where they were brought to safety by the USS Hornet.

The Apollo 11 mission is a high point of 20th Century US history for many.  It also does an excellent job of highlighting several concepts related to Architecture in Context.

  • Contexts:
    • U.S. Executive Branch,
    • U.S. Legislative Branch,
    • NASA astronauts
    • NASA scientists and engineers
    • U.S. Taxpayers, etc.
  • Value Expectations:
    • Land astronauts safely on the moon during 1960’s
    • Return the astronauts to earth safely
  • Pain Points:
    • Ceding space to the USSR would put the US Military at a strategic disadvantage
  • Priorities:
    • US President had more clout in this mission than Legislative Branch and NASA
    • Return the astronauts safely to earth is higher priority than landing on the moon
    • Landing on the moon is more important for NASA and US Executive Branch than for US Taxpayers as a whole
  • Challenges:
    • Difficulty of lunar landing complicated by moon’s craters and hills
    • Difficulty of space capsule reentry is driven by speed, atmospheric friction, and earth’s orbit (i.e. land in water, close enough to the recovery ship)
    • Size and design of the space capsule depends on the power of the booster rockets that are needed to launch the capsule beyond the earth’s gravitational pull

[http://en.wikipedia.org/wiki/List_of_Apollo_missions]

Due to space constraints (pun intended), this sidebar merely scratches the surface of the interesting aspects of the Apollo 11 mission.  Another critical point is that the Apollo and Gemini missions that preceded Apollo 11 demonstrated the feasibility of technical approaches to address important challenges.  For example, Apollo 7 was the first manned Apollo mission to orbit the earth and splash down safely, and Apollo 8 was the first manned mission to orbit the moon and return safely.  It is worth noting that NASA recognized how difficult and important the reentry challenge was, and made early mission decisions to overcome it.

Balancing importance, difficulty and centrality (cross-dependency among challenges) can be a daunting problem.  What best practices exist to help you solve it?

Best Practice 1:  Identify and understand your key contexts.

Understanding your contexts means identifying them on the basis of similar goals, priorities, external forces, and pain points.  When doing this, be sure to focus on  the “why” questions:

  • Why do stakeholder groups A and B have the same pain point?
  • Do they have the same priority for relieving this pain point, or is this a much higher priority for one group vs. the other?
  • Is the pain point caused by the same external forces?
  • Is the pain point obstructing the same goals, or different goals?

Questions like these will enable you to cleanse your context notions so that the behaviors and external forces in each context are as consistent as possible.

Best Practice 2:  Understand your key contexts as soon as possible

One tendency in multi-context system development is to leave future contexts as some other day’s problem.  The tendency is to focus on the stakeholders for the first release as the only ones who matter:

  • We won’t be marketing to those groups (countries) for years, why waste time thinking about them now?
  • Those stakeholders are so new, they won’t know what they need.
  • Why spend all the money and  effort interviewing or doing other forms of research when their needs are likely to change?

These objections are based on the perception that identifying and understanding a context requires a lot of effort.  In reality, it requires much less than it appears.  A context is a generalization of behavior and the quality of this generalization can vary with how imminently it is needed.

I realize this sounds like it violates the YAGNI (“you ain’t going to need it”) principle, but it doesn’t.  There is a big difference between anticipating a future context and doing a little defensive design to accommodate its variations, on the one hand, and building in full support for it, on the other.  The first case is one of defining good interfaces to encapsulate variation.  The other is an implementation and testing burden.  Additionally, it is better to be aware of potential conflicts between contexts before resolving them involves significant design and code changes.
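
As a rough illustration of “defining good interfaces to encapsulate variation”, the Python sketch below puts a seam where a future market context is expected to differ, while implementing only the first-release context. The names `MarketRules`, `UsMarketRules`, and `price_label` are invented for this example.

```python
from typing import Protocol


class MarketRules(Protocol):
    """Hypothetical seam for behavior expected to vary by context."""

    def format_price(self, amount_cents: int) -> str: ...


class UsMarketRules:
    """The only context implemented for the first release."""

    def format_price(self, amount_cents: int) -> str:
        return f"${amount_cents / 100:.2f}"


def price_label(amount_cents: int, rules: MarketRules) -> str:
    # Callers depend on the interface, not on any one market's rules.
    return rules.format_price(amount_cents)


label = price_label(1999, UsMarketRules())
```

Supporting another market later would mean adding one more class that satisfies `MarketRules`; nothing for that market is built or tested until it is actually needed.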

Best Practice 3:  Understand the challenges in satisfying each context’s pain points

Pain points and challenges are similar, and may be confused.  Challenges are concerns of the solution provider.  A challenge is one or more forces that must be overcome to provide value.  A pain point is similar; however, it is linked to stakeholders within a context.  This difference shows up in which goals, priorities and external factors are considered important.  For example, a pain point might be the need to keep trade secrets confidential from hackers, while the challenge might deal with specific types of cyber threats.

As the solution architect, we derive challenges from pain points in contexts.   Contexts are important here, as similar pain points may exist in related contexts, but with different goals, priorities, or even external challenges.  For example, within an investment management firm, response time needs differ for portfolio managers, traders and compliance officers.

Challenges are framed as problem statements: specific issues that the solution provider must overcome in order to provide value.  It is important to keep track of the relationships between contexts, pain points and challenges.  In general, challenges have many-to-many relationships to contexts and to pain points.  In other words, there could be several challenges for certain pain points.  Some challenges may be quite similar across several pain points, potentially spanning many contexts.
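
One lightweight way to keep track of these many-to-many relationships is to record (context, pain point, challenge) triples and index them in both directions. The Python sketch below does that; the entries are illustrative and only loosely based on the investment management example above.

```python
from collections import defaultdict

# (context, pain point, challenge) triples; all entries are illustrative.
links = [
    ("portfolio managers", "slow response time", "low-latency market data access"),
    ("traders", "slow response time", "low-latency market data access"),
    ("traders", "slow response time", "fast order routing"),
    ("compliance officers", "slow response time", "efficient audit queries"),
]

challenges_by_pain_point = defaultdict(set)
contexts_by_challenge = defaultdict(set)
for context, pain_point, challenge in links:
    challenges_by_pain_point[(context, pain_point)].add(challenge)
    contexts_by_challenge[challenge].add(context)

# One pain point can map to several challenges...
trader_challenges = challenges_by_pain_point[("traders", "slow response time")]
# ...and one challenge can span several contexts.
shared_contexts = contexts_by_challenge["low-latency market data access"]
```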

Best Practice 4:  Carefully consider the nature of challenges when combining over contexts

Challenges are like chemical elements.  Each has its own structure and properties.  However, in a system (especially a multi-context system), challenges combine and form molecules.  Molecules have their own chemical properties, aside from their constituent elements.  For example, Carbon and Oxygen independently are staples of life, while their combination, Carbon Monoxide, is an odorless, tasteless, and most notably, deadly gas.

As mentioned above, challenges take a common form:

  • How does the challenge create or detract from value expectations in a context?
  • Which external forces cause the challenge’s impact to be magnified or compounded?

These two properties make it relatively easy to look at a challenge as a chemical element, or combine it with other challenges to view it as a molecule.  Some useful dimensions for characterizing challenges and examining their impact are:

Compatibility (of two challenges)

  • Compatible – independent problems, no issues solving simultaneously
  • Friction – partially dependent problems, some tradeoffs and/or risks
  • Antagonistic – highly dependent, serious tradeoffs and risks

Breadth (of a challenge)

  • Pervasive – scope of challenges reaches throughout solution
  • Regional – scope of challenges pervades an area but encapsulated
  • Local – scope of challenges limited to a narrow area

Occurrence (of a challenge)

  • Persistent – challenge occurs all or most of the time
  • Intermittent – challenge occurs with a predictable frequency
  • Conditional – challenge occurs in certain situations or conditions

These dimensions and categories can be useful for determining how to manage or aggregate challenges:

  • Within a context, as in how the challenge interacts and combines with others
  • Across contexts, as in whether similar challenges in two or more contexts are sufficiently alike to merge, or whether they combine into a more significant one
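
The three dimensions above map naturally onto enumerations. The following Python sketch is one possible encoding; the `Challenge` class, the two sample challenges, and their classifications are invented for illustration, and compatibility is stored per pair because it describes two challenges rather than one.

```python
from dataclasses import dataclass
from enum import Enum


class Compatibility(Enum):  # applies to a pair of challenges
    COMPATIBLE = "independent problems"
    FRICTION = "partially dependent problems"
    ANTAGONISTIC = "highly dependent problems"


class Breadth(Enum):
    PERVASIVE = "throughout the solution"
    REGIONAL = "one encapsulated area"
    LOCAL = "a narrow area"


class Occurrence(Enum):
    PERSISTENT = "all or most of the time"
    INTERMITTENT = "predictable frequency"
    CONDITIONAL = "certain situations"


@dataclass(frozen=True)
class Challenge:
    name: str
    breadth: Breadth
    occurrence: Occurrence


# Illustrative challenges; the names and classifications are made up.
latency = Challenge("sub-second response", Breadth.PERVASIVE, Occurrence.PERSISTENT)
auditing = Challenge("full audit trail", Breadth.REGIONAL, Occurrence.PERSISTENT)

# Compatibility is recorded per unordered pair of challenge names.
pairwise = {frozenset({latency.name, auditing.name}): Compatibility.FRICTION}


def compatibility(a: Challenge, b: Challenge) -> Compatibility:
    # Assumption of this sketch: pairs nobody has assessed default to COMPATIBLE.
    return pairwise.get(frozenset({a.name, b.name}), Compatibility.COMPATIBLE)


result = compatibility(auditing, latency)
```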

Best Practice 5: Prioritize challenges to identify the most important ones to address
 
Software architecture, just like software engineering, is a discipline of deciding among alternative approaches.  While some decisions get made in small sets (e.g. this system will be 3-Tier, .NET, IIS with a SQL Server database), many, if not most, decisions are made independently.

One important thing to remember is that virtually every decision made reduces the degrees of freedom for the rest.  Sometimes this reduction is desirable, but many times it paints the subsequent decisions into a corner.

If the goal of a software architect is to make good decisions for the system, it makes sense that s/he should address the highest priority challenges first (while the degrees of freedom are more available).  But how should challenges be prioritized?  Three criteria need to be balanced:

  1. Importance:
    • How many contexts does the challenge affect?
    • How much of an  impact does the challenge make on each one?
    • What is the weighted average of this impact, given the relative importance of each context?
  2. Difficulty:
    • The more difficult a challenge, the more degrees of freedom it is likely to need
    • The more difficult a challenge, the more degrees of freedom it will consume
    • The more degrees of freedom a challenge will consume, the earlier it should be tackled.
  3. Centrality (also called Core):
    • Challenges are also dependent on the solutions to other challenges
    • Challenges that others depend on should be addressed before or concurrently with the challenges that depend on them
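
The three criteria can be combined in many ways; the Python sketch below is one hedged possibility, not a prescribed formula. It assumes made-up context weights and 1-to-5 impact and difficulty scores, computes importance as a weighted average across contexts, adds difficulty to get a score (an arbitrary blend chosen for this example), and uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) so that a challenge is never scheduled before the challenges it depends on.

```python
from graphlib import TopologicalSorter

# Relative importance of each context (illustrative values).
context_weights = {"traders": 3, "compliance": 1}

# Impact and difficulty use a made-up 1-5 scale.
challenges = {
    "latency": {"impact": {"traders": 5, "compliance": 1}, "difficulty": 4, "depends_on": []},
    "audit trail": {"impact": {"traders": 1, "compliance": 5}, "difficulty": 5, "depends_on": ["data model"]},
    "data model": {"impact": {"traders": 3, "compliance": 3}, "difficulty": 3, "depends_on": []},
}


def importance(name: str) -> float:
    """Weighted average of a challenge's impact across contexts."""
    impact = challenges[name]["impact"]
    total_weight = sum(context_weights[c] for c in impact)
    return sum(context_weights[c] * s for c, s in impact.items()) / total_weight


def score(name: str) -> float:
    # Arbitrary blend for this sketch: importance plus difficulty.
    return importance(name) + challenges[name]["difficulty"]


def prioritize() -> list:
    """Order challenges by score while honoring dependencies (centrality)."""
    sorter = TopologicalSorter({n: c["depends_on"] for n, c in challenges.items()})
    sorter.prepare()
    ready = list(sorter.get_ready())
    order = []
    while ready:
        ready.sort(key=score, reverse=True)
        current = ready.pop(0)
        order.append(current)
        sorter.done(current)
        ready.extend(sorter.get_ready())
    return order


plan = prioritize()
```

In this sample data “audit trail” scores higher than “data model”, yet it is scheduled later because it depends on the data model, which is the centrality criterion at work.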

Architectural design takes place in an environment of constraints.  Constraint should be understood as a neutral concept, because it both prevents and enables.  The same cup that constrains the flow of water also enables you to bring that water to your lips.  Managed appropriately, constraints provide the structure and form that yield the desired function(s).  Part of that management is understanding that design decisions are constraints.  Decisions made in isolation risk inappropriately constraining a design in terms of the whole.

Remediation of architectural constraints is, by its very nature, expensive.  Rewiring is more involved than repainting; replacing a foundation is far more extensive still.  Understanding and accounting for all of the contexts involved allows you to see the architecture of the problem as a whole.  The architecture of the problem then becomes the skeleton upon which the architecture of the solution can be built, incrementally, iteratively, and most important, effectively.

Architecture in Context – Part 1

By Charlie Alfred and Gene Hughson

It’s a common occurrence on online forums to see someone ask what architectural style is the right one. Likewise, it’s common to see a reply of “it depends” because “context is king”. Many will nod sagely at this; of course it is, context is king. But, what is context? Specifically, what do we mean by the term “context” in terms of software and solution architecture?

At a high level, context defines the problem environment for which a solution is intended to provide value. That definition, however, is far too superficial to suffice, because only the most trivial of systems will have a single context. Refining our definition, we can state that a context is a grouping of stakeholders sharing similar goals, priorities, and perceptions. In practice, unless you’re dealing with a system that you are developing yourself for your sole use, you will be dealing with multiple contexts, and they are likely to overlap like the Olympic logo. If you are puzzled by this statement, consider silverware. We need knives, forks and spoons and we have different types of each of them.

Identifying contexts and understanding how they interrelate gives definition to the architecture of the problem, which is a necessary prerequisite to designing the architecture of a solution (note the use of “a solution” instead of “the solution” – it’s intentional). This involves considerable work to identify the goals, priorities, and perceptions in that the natural tendency is for a stakeholder to ask for what they think will solve their problem rather than enumerate their problems (which is the point of the apocryphal “faster horses” quote frequently attributed to Henry Ford). Nonetheless, breaking out those raw needs is crucial to success.

Where you have multiple contexts, chances are the goals, priorities, and perceptions diverge rather than complement. Take, for example, a pickup truck. One context may be those who haul bulky and/or heavy goods daily. Another context may be made up of those who frequently tow trailers. A third context might be those who haul or tow from time to time, but mainly use the truck for transportation. The value of any given feature of the truck: miles per gallon of fuel, size of the bed, whether or not the bed is covered, quality of the sound system, torque, etc. will vary based on the context we’re considering. Additionally, external forces (such as the laws of physics) will come into play and complicate making design decisions. For example, long, open truck beds maximize hauling capacity, but will also harm fuel efficiency.

Another complicating factor is the presence of peripheral stakeholders, those whose contexts will affect the direct stakeholders. In the sense of our pickup truck example, we can consider mechanics as an example of this. Trucks that have more room in the engine compartment are easier (therefore cheaper) to service but will suffer in terms of fuel efficiency due to increased size and weight. In the IT realm, development and support teams would fall into this category. Although their contexts would not be primary drivers, ignoring them could well impose a cost on the direct stakeholders in terms of turnaround time for enhancements and fixes.

Examples of multi-context software systems can be found in many places. The more generic a system, the more contexts it will have. Commodity office software (word processors, spreadsheets, etc.) is a prime example. Consider Microsoft Excel, which is a simple row/column tabulator, a statistical analysis tool, a database client, and, with its macro language, an application development platform. Stretching to cover this many needs turns the Excel User Experience into a game of Twister. Platforms such as Android are another example. Highly configurable applications like Salesforce.com (which has arguably crossed the line between product and platform) span multiple contexts as well. In the enterprise IT space, first SOA and now microservice architectures are nothing if not an attempt to address the plethora of contexts in play via separation of concerns.

A holistic consideration of the contexts at hand is an important success factor when evolving a system iteratively and incrementally. Without that consideration, decisions can be made that are conducive to one context but antagonistic to another. Deferring the reconciliation of such contextual conflicts until later may result in considerable architectural refactoring. This refactoring will involve time and expense at a minimum, and if time is constrained, the likelihood of incurring technical debt is high. Each additional conflict that is blundered into due to the lack of up front consideration increases the risk to the system. This is not to endorse a detailed Big Design Up Front (BDUF), but a recognition that problem awareness and problem avoidance are cheaper than rework. If the context has been identified, then it’s YAGNI – not “You Ain’t Gonna Need It” but “You Are Gonna Need It”.

The next post in this series will address concrete practices to integrate multiple contexts into a unified architecture of the problem that can serve as the foundation for a coherent architecture of a solution.

Problem Space, Solution Space, and Tunnel Vision

Peter Kretzman’s tweet about Sir Roland and his lightweight mini-shield brought both a smile to my face and the idea for this post. That idea actually has little to do with #NoEstimates (which I’ve touched on previously) and everything to do with architectural design. The cartoon highlights a design dysfunction that frequently manifests in systems:

Sir Roland has a point in that his shield is far easier to carry and wield than the traditional non-agile shield. Unfortunately, due to his tunnel vision, he probably won’t discover that the aspects that he focused on were peripheral to the overall solution (i.e. keeping sharp implements out of Sir Roland’s innards) until it’s too late. Learning via experimentation is a powerful technique, but analysis has its place too, particularly when the value at risk is high.

Just as software is a system, so too are organizations (admittedly, systems that run on far less deterministic “hardware”, but systems nonetheless). Designing systems, particularly those that involve multiple stakeholders (i.e. nearly all that have more than one person involved with them), involves designing the solution space to best match the problem space. Note that I didn’t say “perfectly match the problem space”, as this conjunction is, in my opinion, unlikely to occur and should it occur, highly unlikely to persist. That being said, getting and keeping that match as close as possible to the theoretical perfect one is important. When a stakeholder’s influence on the solution is out of balance with their centrality to the problem, expect conflict.

IT’s traditional customer service is a notorious example of this type of imbalance at the organizational level. That imbalance also manifests in the technology realm in the forms of choosing solutions on the basis of justifying sunk costs, being apathetic toward user experience, chasing the tool/technique of the day, experimenting at the expense of the customer, and indulging in egotism.

Value for the owner of the system is a better tool to keep the proper balance. As the owner(s) should be the stakeholder(s) central to the problem space, where the solution is geared toward their needs it will most likely be well aligned to the problem. Where the concerns of peripheral stakeholders provide benefit to the central stakeholder(s) is where those concerns become important to the solution.

Unlike Sir Roland, failure to choose wisely may not be literally fatal, but it could be figuratively so.

#ShadowSocialMedia or Why Won’t People Use the Product the Way They’re Supposed to

Scott Berkun dislikes the way people are using images to bypass Twitter’s 140 character limit:

His point is very valid, but:

Which is the issue. Sometimes there’s a need to go beyond that limit. Sure, you can chunk your thoughts up across multiple tweets, but users find it burdensome to respect Twitter’s constraint on the amount of text per tweet. Constrained customers, assuming they stick with a product, tend to come up with “creative” solutions to that product’s shortcomings that reflect what they value. The customers’ values may well conflict with the developers’. When “conflict” and “customer” are in the same sentence, there’s generally a problem.

Berkun’s response to @honatwork’s rebuttal nearly captures the issue:

I say “nearly”, because Twitter was built long before 2015. The problem is that it’s 2015 and Twitter has not evolved to meet a need that clearly exists.

In the IT world, it’s common to hear terms like “Shadow IT” or “Rogue IT”. Both refer to users (i.e. customers) going beyond the pale of approved tools and techniques to meet a need. This poses a problem for IT in that the customer’s solution may not incorporate things that IT values and retrofitting those concerns later is far more difficult. Taking a “products, not projects” approach can minimize the need for customer “creativity”, for in-house IT and external providers.

Trying to hold back the tide just won’t work, because the purpose of the system is to meet the customers’ needs, not respect the designers’ intent.

Form Follows Function on SPaMCAST 323


I’m back with another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

SPaMCast 323 features Tom’s “Five Factors Leading to Failing With Agile”, my “Microservice Principles and Enterprise IT Architecture” and an installment of Jo Ann Sweeny’s column, “Explaining Communication”.

Enjoy!

Who Needs Architects? – When Tactics Do Not Add Up to Strategy

Death of Wladyslaw Szujski, 1914

Going into World War I, the French army had adopted a simple doctrine: attack. Attaque à outrance; attack to excess. Courage, particularly that inspired by the Poilu’s dashing red trousers, would carry the day. Unfortunately for several million French soldiers, the plan was flawed.

It’s easy to see why this is the case in the age of machine guns and modern artillery, but it would likely have been just as ineffective without that complication. The bigger problem was that, even if the attacks had succeeded, each additional success would carry the seed of a greater defeat in the future. Each success would carry the victorious unit farther forward, making re-supply and communications that much more difficult. Each success would spread the victor that much thinner, making it more brittle to counter-attack.

Without focus, lots of little tactical successes can breed strategic defeat.

Or, as Richard Sage put it:

Lack of clarity should always be a warning sign.

It’s important to understand that while there is a relationship between goals, strategies, and tactics, there are differences as well. Seth Godin’s “Goals, strategy and tactics for change” captures both:

The Goal: Who are you trying to change? What observable actions will let you know you’ve succeeded?
The Strategy: What are the emotions you can amplify, the connections you can make that will cause someone to do something they’ve hesitated to do in the past (change)? The strategy isn’t the point, it’s the lever that helps you cause the change you seek.
The Tactics: What are the actions you take that cause the strategy to work? What are the events and interactions that, when taken together, comprise your strategy?

While he was defining the terms in the context of implementing change, the same definitions work well for software development. More than just a bag of tactics, a strategy is a set of tactics focused on attaining a goal. Just as a book is made up of words and sentences, it is also more than that. The intentional structuring of those words and sentences is what makes it a book; that is what gives it meaning. Likewise, systems are more than the sum of their parts.

A system exists via the structure and interaction of its component parts. A well designed system is one where those parts are intentionally structured so that their interaction achieves a goal effectively and efficiently. A failure to do so can impact the technical architecture of a system (taken from Tony DaSilva’s “Fragmented, Half-baked, Thoughts”):

System in Context by Tony DaSilva

While both System A and System B have been made to fit their context, is there any question as to which is more likely to have quality issues? Extensive refactoring, particularly at the architectural level, can become expensive. This leaves the owners of System B with a dilemma – how long can they tolerate the technical debt caused by the mismatch between the system and its context before the costs justify the work?

Failure to intentionally design can also manifest in the user experience. As Stephen Anderson tweeted:

As I noted in “Who Needs Architects? – Navigating the Fractals”, systems exist in contexts which exist in ecosystems. When designing for a particular level of abstraction, you must consider your context(s) and design your internal architecture accordingly. Likewise, you must consider the ecosystem in which the system will exist and design the external architecture accordingly. Avoiding intentional design (which I will note is neither BDUF nor over-engineering) might seem agile but is, in reality, a form of technical debt affecting quality, cohesion, and maintainability.

Tactical excellence focused on a goal is strategic excellence. Without the focus, however, tactical excellence is wasted.

“Design? Security and Privacy? YAGNI” on Iasa Global

Two of my favorite “bumper sticker philosophies” (i.e. short, pithy, and incredibly simplistic sayings) are “the simplest thing that could possibly work” and YAGNI. Avoiding unnecessary complexity and unneeded features are good ideas at first glance. The problem is determining what is unnecessary and unneeded. Just meeting the functional requirements is unlikely to be sufficient.

Read “Design? Security and Privacy? YAGNI” on the Iasa Global site for a post about how it’s important to have someone responsible for Quality of Service requirements in general and security in particular.