Eberhard Wolff’s question set the stage.
If Conway's Law is so important - are #Microservices more an organizational approach than an architecture?—
Eberhard Wolff (@ewolff) February 18, 2015
Adrian Cockcroft’s reply tied everything together.
.@ewolff Agile, DevOps, Conways Law and Microservices all reinforce each other to speed up product development.—
adrian cockcroft (@adrianco) February 20, 2015
Conway’s Law is the common thread tying microservice architectures (MSAs) and DevOps together. Significantly, this common thread runs through the entire organization, not just the IT parts. As I noted in my previous post, paying attention to this principle allows you to work with, rather than against, the grain of an organization. Working with the grain of the organization is key, because DevOps, lovely as it is, is not an end, but a means. The desired end, as identified by Mike Kavis in “Is DevOps What Organizations Really Seek?” is to “become high-performing organizations”.
Of all the attributes of such an organization (as identified by Kavis): “Strong Leadership”, “Strong Culture”, “Sound Architecture”, etc., the most important is “High Customer Satisfaction”. For many organizations, high customer satisfaction is a problem area for IT. In a recent article on Business Insider, Red Hat’s CEO Jim Whitehurst noted an increased interest in DevOps on the part of CIOs to deal with what he terms IT’s “fight for its life”:
That’s because IT departments say they had better figure out how to be faster, cheaper, and better. If they don’t, the company’s employees will no longer depend on them. They bring their own PCs, tablets and phones to work and they buy whatever cloud services they want to do their jobs. And the CIO will find his budget increasingly shifted to other manager’s pockets.
IT has a history of cycles of neglect or rejection of new/disruptive technologies followed by catch-up crises: PCs in the 80s, Web in the 90s, BYOD/Cloud/IoT/etc. now. What’s different this time is the increasing level of technical knowledge and access to solutions outside of IT. As Krishnan Subramanian has noted, playing catch-up may become less and less viable:
@GeneHughson This time impact will be orders of magnitude greater. Catching up in next refresh cycle might be too late—
Krish (@krishnan) February 20, 2015
MSAs enable flexible applications via composing vertical slices of business functionality rather than horizontal layers of technical concerns. These same principles can apply to higher levels of abstraction up to the enterprise’s IT architecture. Likewise, DevOps incorporates the same viewpoint shift in terms of the IT organization. This architectural change (both technical and business) can allow for integration between IT and the business it enables. As Twila Day recently noted, this integration goes far beyond mere alignment into “active partnership between IT and your business units”:
Partnership means working together, side by side. It means that technology leaders are actively involved in strategic development at the highest levels of your organization. It means that all the way up and down your organization, any talented person can propose an innovative idea, tactic or strategy, regardless of where s/he works. The business might have an idea first, or IT might have it first. No matter. What’s important is that the two groups work side by side to accomplish the most important business objectives.
Transforming IT from an adversary into a partner is primarily a cultural shift that involves both parties if it’s to be successful. Organizational and technical architecture cannot be neglected, however, in that they can either help or hinder that transformation. DevOps can facilitate this via its focus on the product rather than any one project (which is a concern shared by the customers) and by having the flexibility to tailor its pace to that of the customer rather than forcing a one size fits all (aka one size fits none).
In my part of the world, it’s not uncommon for people to say that someone wouldn’t recognize something if it “bit them in the [rude rump reference]”. For many organizations, that seems to be the explanation for the state of their enterprise IT architecture. For while we might claim to understand terms like “design”, “encapsulation”, and “separation of concerns”, the facts on the ground fail to show it. Just as software systems can degenerate into a “Big Ball of Mud”, so too can the systems of systems comprising our enterprise IT architectures. If we look at organizations as the systems they are, it should be obvious that this same entropy can infect the organization as well.
One symptom of this entropy is when the dividing lines blur, weakening or even removing constraints. While the term “constraint” commonly has a negative connotation, constraints provide the structure and definition of a system. Separation of concerns, encapsulation, and DRY are all constraints that are intended to provide benefit. We accept limits on concerns addressed, accessibility of internals and/or instances of code or data in order to reduce complexity, not just check off a philosophical box. If we remove or even just relax these types of constraints too much, we incur risk.
This blurring of lines can occur at any level and on multiple levels. Additionally, architectural weakness at a higher level of abstraction can negate strengths at lower levels. A collection of well-designed systems will not ensure a coherent enterprise IT architecture if there is overlap and redundancy without a clear understanding of which ones are authoritative. Accidental architecture is no more likely to work at higher levels of abstraction than lower ones.
Architectural design, at each level of granularity, should be intentional and appropriate to that level. The ideal is not to over-regulate, but to strike a balance. Micromanaging internals wastes effort better spent on something beneficial; abdicating design responsibility practically guarantees chaos. An additional consideration is the fit between the human and technological aspects. Conway’s Law is more than just an observation; it can be used as a tool to align applications to a specific business concern as well as to align development teams to specific applications/application components.
Just as a carver takes note of the grain of the wood being shaped, so should an architect work with rather than against the grain of the organization.
Jessica Kerr’s post, “Microservices, Microbusinesses”, captures these concepts from the viewpoint of microservice architectures. Partitioning application concerns into microservices allows for internal flexibility at the cost of external conformance to necessary governance. As Kerr puts it, “…everybody is a responsible adult”:
That’s a lot of overhead and glue code. Every service has to do translation from input to its internal format, and then to whatever output format someone else requires. Error handling, caching or throttling, failover and load balancing and monitoring, contract testing, maintaining multiple interface versions, database interaction details. Most of the code is glue, layers of glue protecting a small core of business logic. These strong boundaries allow healthy relationships with other services, including new interactions that weren’t designed into the architecture from the beginning. For all this work, we get freedom on the inside.
Kerr also recognizes the applicability of this trade-off to the architecture of the organization:
Still, a team exists as a citizen inside a larger organization. There are interfaces to fulfill. Management really does need to know about progress. Outward collaboration is essential. We can do this the same way our code does: with glue. Glue made of people. One team member, taking the responsibility of translating cards on the wall into JIRA, can free the team to optimize communication while filling management’s strongest needs.
Management defines an API. Encapsulate the inner workings of the team, and expose an interface that makes sense outside. By all means, provide a reference implementation: “Other teams use Target Process to track work.” Have default recommendations: “We use Clojure here unless there’s some better solution. SQL Server is our first choice database for these reasons.” Give teams a strong process and technology stack to start from, and let them innovate from there.
“Good fences make good neighbors” not by keeping out, but by channeling traffic into commonly understood and commonly accepted directions. We recognize lines so as to influence those aspects we truly need to influence. More importantly, we recognize lines to prevent needless conflict and waste. The key is to draw the lines so that they work for us, not against us.
In “Microservice Mistakes – Complexity as a Service”, I argued that the fine-grained nature of microservices opened up the risk of pushing complexity out to the consumers of those services. Rather than encapsulating details, microservice architectures expose them, forcing clients to know more about the internals than is common in both object-oriented and SOA traditions. In the comments, it was suggested that granularity was irrelevant as multiple granular microservices could be composed to form a coarser-grained microservice that would provide a more appropriate level of abstraction. My response was that while this is theoretically true, aggregating service calls in that manner risks issues due to network latency. This drew a response quoting Donald Knuth: “Premature optimization is the root of all evil (or at least most of it) in programming.”
Okay, in my rebuttal I did say that Knuth was wrong about this when it came to distributed systems. A better response would have been to point out that Knuth’s quote did not apply. Far from being an optimization, taking latency (as well as other network characteristics) into consideration is just basic design. Meeting a certain time to complete for in-process calls affects quality of service, making efforts to reduce that time optimizations. Meeting a certain time to complete for remote calls affects function. Achieving a functional state is not an optimization.
Location-agnostic components, code that “just works” whether in-process, out of process, or over the wire, have been a Holy Grail since the days of DCOM and CORBA. The laws of physics, however, just won’t be denied. Services are not JARs and DLLs. Changing components that were designed to run in-process into ones capable of running remotely will almost certainly involve major re-work, not a little optimization.
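To put rough numbers on the difference between an optimization and a functional requirement, here is a back-of-the-envelope sketch in Python. The per-call timings and the 200 ms budget are illustrative assumptions, not measurements, and the function names are mine. It assumes the composed calls run one after another.

```python
# Illustrative figures only (assumptions, not measurements).
IN_PROCESS_CALL_MS = 0.001   # an in-process call: on the order of a microsecond
REMOTE_CALL_MS = 20.0        # a remote call: network round trip plus serialization


def total_latency_ms(calls: int, per_call_ms: float) -> float:
    """Latency of a composite operation that makes its calls sequentially."""
    return calls * per_call_ms


def meets_budget(calls: int, per_call_ms: float, budget_ms: float) -> bool:
    """True if the composite operation completes within the caller's time budget."""
    return total_latency_ms(calls, per_call_ms) <= budget_ms


if __name__ == "__main__":
    budget_ms = 200.0  # hypothetical time the consumer is willing to wait
    for calls in (1, 5, 25):
        print(
            calls,
            meets_budget(calls, IN_PROCESS_CALL_MS, budget_ms),
            meets_budget(calls, REMOTE_CALL_MS, budget_ms),
        )
```

Under these assumptions, composing 25 in-process calls never threatens the budget, while composing 25 remote calls blows through it. At that point the composite service isn't slow, it's broken.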
It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.
Technical Empathy - the ability to see the system from the point of view of the caller of your code, not just the point of view of your code—
Michael Feathers (@mfeathers) January 26, 2015
Michael Feathers’ tweet about technical empathy packs a lot of wisdom into 140 characters. Lack of technical empathy can lead to a system that is harder to both implement and maintain since no thought was given to simplifying things for the caller. Maintainability is one of those quality of service requirements that appears to be a purely technical consideration right up to the point that it begins significantly affecting development time. If this sounds suspiciously like technical debt to you, then move to the head of the class.
The issue of technical empathy is particularly on point given the popular interest in microservice architectures (MSAs). The granular nature of the MSA style brings many benefits, but also comes with the cost of increased complexity (among others). Michael Feathers referred to this in a post from last summer titled “Microservices Until Macro Complexity”:
It is going to be interesting to see how this approach scales. Some organizations have a relatively low number of microservices. Others are pushing higher, around the 600 mark. This is a bit beyond the point where people start seeking a bigger picture. If services are often bigger than classes in OO, then the next thing to look for is a level above microservices that provides a more abstract view of an architecture. People are struggling with that right now and it was foreseeable. Along with that concern, we have the general issue of dealing with asynchrony and communication patterns between services. I strongly believe that there is a law of conservation of complexity in software. When we break up big things into small pieces we invariably push the complexity to their interaction.
The last sentence bears repeating: “When we break up big things into small pieces we invariably push the complexity to their interaction”. Breaking a monolith into microservices simplifies each individual component; however, as Eran Hammer observed in “The Fallacy of Tiny Modules”, “…at some point, someone has to put it all together and then, all the complexity is going to surface…”. As Yamen Sader illustrated in his slide deck “Microservices 101 – The Big Why” (slide #26), the structure of the organization, domain, and system will diverge in an MSA. The implication of this is that for a given domain task, we need to know more about the details of that task (i.e. have less encapsulation of the internals) in a microservice environment.
The further removed the consumers of these services are from the providers, the harder and less convenient it will be to transfer that knowledge. To put this in perspective, consider two fast food restaurants: one operates in a traditional manner where money is exchanged for burgers, the other is a microservice style operation where you must obtain the beef, lettuce, pickles, onion, cheese, and buns separately, after which the cooking service combines them (provided the money is there also). The second operation will likely be in business for a much shorter period of time in spite of the incredible flexibility it offers. Additionally, that flexibility only truly exists when the contracts between two or more providers are swappable. As Ben Morris noted in “Do Microservices create extra challenges for distributed development?”:
Each team will develop its own interpretation of the system and view of the data model. Any service collaboration will have to involve some element of translation or mapping and will be vulnerable to subtle bugs that can arise from semantic differences.
Adopting a canonical model across the platform can help to address this problem by asserting a common vocabulary. Each service is expected to exchange information based on a shared definition of the main entities. However, this requires a level of cross-team governance which is very much at odds with the decentralised nature of microservices.
None of this is meant to say that the microservice architecture style is “bad” or “wrong”. It is my opinion, however, that MSA is first and foremost an architectural style of building applications rather than systems of systems. This isn’t an absolute; I could see multiple MSA applications within an organization sharing component microservices. Where those microservices transcend the organizational boundary, however, making use of them becomes more complex. Actions at a higher level of granularity, such as placing an order for some product or service, involve coordinating multiple services in the MSA realm rather than consuming one “chunkier” service. While the principles behind microservice architectures are relevant at higher levels of abstraction, a mismatch in granularity between the concept and implementation can be very troublesome. Maintainability issues are both technical debt and a source of customer dissatisfaction. External customers are far easier to lose than internal ones.
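The fast food comparison can be sketched in a few lines of Python. Everything here is hypothetical: the ingredient “services” are stand-in functions, and the names `cooking_service`, `order_fine_grained`, and `BurgerCounter` are mine. In a real MSA each lookup would be a remote call with its own contract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical fine-grained "ingredient services" (stand-ins for remote calls).
INGREDIENT_SERVICES: Dict[str, Callable[[], str]] = {
    name: (lambda name=name: name)
    for name in ("bun", "beef", "cheese", "lettuce", "pickles", "onion")
}

PRICE = 5.00  # illustrative price


@dataclass(frozen=True)
class Burger:
    ingredients: Tuple[str, ...]


def cooking_service(ingredients: List[str], money: float) -> Burger:
    """Combines the ingredients, provided the money is there also."""
    if money < PRICE:
        raise ValueError("insufficient payment")
    return Burger(tuple(ingredients))


def order_fine_grained(money: float) -> Burger:
    """Microservice-style: the consumer must know the recipe and call each service."""
    recipe = ["bun", "beef", "cheese", "lettuce", "pickles", "onion"]
    ingredients = [INGREDIENT_SERVICES[name]() for name in recipe]
    return cooking_service(ingredients, money)


class BurgerCounter:
    """Traditional style: money in, burger out; the recipe stays internal."""

    _RECIPE = ("bun", "beef", "cheese", "lettuce", "pickles", "onion")

    def order(self, money: float) -> Burger:
        ingredients = [INGREDIENT_SERVICES[name]() for name in self._RECIPE]
        return cooking_service(ingredients, money)
```

Both paths produce the same burger. The difference is where the knowledge of the recipe lives: in every consumer, or behind one “chunkier” interface.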
By Charlie Alfred and Gene Hughson
Up until this point, we’ve discussed what it means for Architecture to have context. Contexts enable us to reason about the behavior of a group of stakeholders, whether they be buyers, end users, support staff, distributors, manufacturers or suppliers.
As we’ve pointed out, virtually all products or services are multi-context. This means that the architecture of these products or services, if they expect to be effective, must also be multi-context. A multi-context architecture is a lot like juggling. You must balance your attention across an array of concerns and try to satisfy an array of stakeholders. The following sidebar illustrates this point:
In the early 1960’s, the United States found itself engaged in a space race with the USSR. Government officials in both countries believed strongly that the first country to travel successfully in space and land men on the moon would have a significant military advantage.
President Kennedy collaborated with NASA to initiate a space program designed to put men on the moon’s surface and return them safely to earth, and achieve this by the end of the 1960’s. On July 20, 1969, the Apollo 11 mission was successful at landing Neil Armstrong and Buzz Aldrin on the moon, and four days later, on July 24th, 1969, the crew safely splashed down in the Pacific Ocean, where they were brought to safety by the USS Hornet.
The Apollo 11 mission is a high point of 20th Century US history for many. It also does an excellent job of highlighting several concepts related to Architecture in Context.
- U.S. Executive Branch,
- U.S. Legislative Branch,
- NASA astronauts
- NASA scientists and engineers
- U.S. Taxpayers, etc.
- Value Expectations:
- Land astronauts safely on the moon during 1960’s
- Return the astronauts to earth safely
- Pain Points:
- Ceding space to the USSR would put the US Military at a strategic disadvantage
- US President had more clout in this mission than Legislative Branch and NASA
- Return the astronauts safely to earth is higher priority than landing on the moon
- Landing on the moon is more important for NASA and US Executive Branch than for US Taxpayers as a whole
- Difficulty of lunar landing complicated by moon’s craters and hills
- Difficulty of space capsule reentry is driven by speed, atmospheric friction, and earth’s orbit (i.e., land in water, close enough to the recovery ship)
- Size and design of the space capsule depends on the power of the booster rockets that are needed to launch the capsule beyond the earth’s gravitational pull
Due to space constraints (pun intended), this sidebar merely scratches the surface of the interesting aspects of the Apollo 11 mission. Another critical point is that the Apollo and Gemini missions that preceded Apollo 11 demonstrated feasibility of technical approaches to address important challenges. For example, Apollo 7 was the first manned Apollo mission to orbit the earth and splash down safely, and Apollo 8 was the first manned mission to orbit the moon and return safely. It is worth noting that NASA recognized how difficult and important the reentry challenge was, and that early mission decisions were made to overcome it.
Balancing importance, difficulty and centrality (cross-dependency among challenges) can be a daunting problem. What best practices exist to help you solve it?
Best Practice 1: Identify and understand your key contexts.
Understanding your contexts means identifying them on the basis of similar goals, priorities, external forces, and pain points. When doing this, be sure to focus on the “why” questions:
- Why do stakeholder groups A and B have the same pain point?
- Do they have the same priority for relieving this pain point, or is this a much higher priority for one group vs. the other?
- Is the pain point caused by the same external forces?
- Is the pain point obstructing the same goals, or different goals?
Questions like these will enable you to cleanse your context notions so that the behaviors and external forces in each context are as consistent as possible.
Best Practice 2: Understand your key contexts as soon as possible
One tendency in multi-context system development is to leave future contexts as some other day’s problem. The tendency is to focus on the stakeholders for the first release as the only ones who matter:
- We won’t be marketing to those groups (countries) for years, why waste time thinking about them now?
- Those stakeholders are so new, they won’t know what they need.
- Why spend all the money and effort interviewing or doing other forms of research when their needs are likely to change?
These objections are based on the perception that identifying and understanding a context requires a lot of effort. In reality, it requires much less than it appears. A context is a generalization of behavior and the quality of this generalization can vary with how imminently it is needed.
I realize this sounds like it violates the YAGNI (“You Ain’t Gonna Need It”) principle, but it doesn’t. There is a big difference between anticipating a future context, with a little defensive design to accommodate its variations, and building in full support for it. The first is a matter of defining good interfaces to encapsulate variation. The second is an implementation and testing burden. Additionally, it is better to be aware of potential conflicts between contexts before resolving them involves significant design and code changes.
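As a minimal sketch of what “a little defensive design” might look like, consider the objection about countries you won’t market to for years. The names below (`MarketRules`, `USRules`, `receipt_line`) are hypothetical, chosen only for illustration. The point is that the interface acknowledges the variation while only one implementation is actually built and tested.

```python
from dataclasses import dataclass
from typing import Protocol


class MarketRules(Protocol):
    """The seam: what is expected to vary from one market (context) to another."""

    def format_price(self, amount: float) -> str: ...


@dataclass(frozen=True)
class USRules:
    """The only implementation built and tested for the first release."""

    def format_price(self, amount: float) -> str:
        return f"${amount:,.2f}"


def receipt_line(item: str, amount: float, rules: MarketRules) -> str:
    """Core logic depends on the interface, not on any one market."""
    return f"{item}: {rules.format_price(amount)}"
```

Supporting a second market later means adding another class that satisfies `MarketRules`. Nothing for that market has been implemented or tested in the meantime, so the cost of anticipating it stays small.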
Best Practice 3: Understand the challenges in satisfying each context’s pain points
Pain points and challenges are similar, and may be confused. Challenges are concerns of the solution provider. A challenge is one or more forces that must be overcome to provide value.
A pain point is similar; however, it is linked to stakeholders within a context. This difference shows up in which goals, priorities and external factors are considered important. For example, a pain point might be the need to keep trade secrets confidential from hackers, while the challenge might deal with specific types of cyber threats.
As solution architects, we derive challenges from pain points in contexts. Contexts are important here, as similar pain points may exist in related contexts, but with different goals, priorities, or even external challenges. For example, within an investment management firm, response time needs differ for portfolio managers, traders and compliance officers.
Challenges are framed as problem statements: specific issues that the solution provider must overcome in order to provide value. It is important to keep track of the relationships between contexts, pain points and challenges. In general, challenges have many-to-many relationships to contexts and to pain points. In other words, there could be several challenges for certain pain points. Some challenges may be quite similar across several pain points, potentially spanning many contexts.
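One lightweight way to keep track of these relationships is a simple list of links that can be indexed in either direction. The sketch below is only an illustration: the pain points and challenges are hypothetical, loosely based on the investment management example, and the structure is one of many that would work.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Set, Tuple


@dataclass(frozen=True)
class PainPoint:
    context: str       # the stakeholder grouping that feels the pain
    description: str


# Hypothetical pain points for three contexts.
trader_latency = PainPoint("traders", "orders must be priced in near real time")
pm_latency = PainPoint("portfolio managers", "what-if analysis takes too long")
compliance_audit = PainPoint("compliance officers", "trade history is hard to audit")

# Each link ties one pain point to one challenge (a problem statement).
LINKS: List[Tuple[PainPoint, str]] = [
    (trader_latency, "keep pricing latency low under peak load"),
    (trader_latency, "keep market data current"),
    (pm_latency, "keep pricing latency low under peak load"),
    (compliance_audit, "retain a complete, queryable trade history"),
]

# Index the links both ways: pain point -> challenges, challenge -> contexts.
challenges_by_pain_point: DefaultDict[PainPoint, Set[str]] = defaultdict(set)
contexts_by_challenge: DefaultDict[str, Set[str]] = defaultdict(set)
for pain_point, challenge in LINKS:
    challenges_by_pain_point[pain_point].add(challenge)
    contexts_by_challenge[challenge].add(pain_point.context)
```

Here one pain point maps to two challenges, and one challenge spans two contexts, which is exactly the many-to-many shape described above.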
Best Practice 4: Carefully consider the nature of challenges when combining over contexts
Challenges are like chemical elements. Each has its own structure and properties. However, in a system (especially a multi-context system), challenges combine and form molecules. Molecules have their own chemical properties, aside from their constituent elements. For example, Carbon and Oxygen independently are staples of life, while their combination, Carbon Monoxide, is an odorless, tasteless, and most notably, deadly gas.
As mentioned above, challenges take a common form:
- How does the challenge create or detract from value expectations in a context?
- Which external forces cause the challenge’s impact to be magnified or compounded?
These two properties make it relatively easy to look at a challenge as a chemical element, or combine it with other challenges to view it as a molecule. Some useful dimensions for categorizing challenges, and for examining their impact, are:
Compatibility (of two challenges)
- Compatible – independent problems, no issues solving simultaneously
- Friction – partially dependent problems, some tradeoffs and/or risks
- Antagonistic – highly dependent, serious tradeoffs and risks
Breadth (of a challenge)
- Pervasive – scope of challenges reaches throughout solution
- Regional – scope of challenges pervades an area but encapsulated
- Local – scope of challenges limited to a narrow area
Occurrence (of a challenge)
- Persistent – challenge occurs all or most of the time
- Intermittent – challenge occurs with a predictable frequency
- Conditional – challenge occurs in certain situations or conditions
These dimensions and categories can be useful for determining how to manage or aggregate challenges:
- Within a context, as in how the challenge interacts and combines with others
- Across contexts, as in whether similar challenges in two or more contexts are sufficiently alike to merge, or whether they combine into a more significant one
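The dimensions above can be captured as a small vocabulary in code. This is a sketch under assumptions: the example challenges and their classifications are hypothetical, and treating unlisted pairs as compatible is a simplification of mine. Note that breadth and occurrence describe a single challenge, while compatibility describes a pair.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet


class Compatibility(Enum):
    COMPATIBLE = "independent problems, no issues solving simultaneously"
    FRICTION = "partially dependent problems, some tradeoffs and/or risks"
    ANTAGONISTIC = "highly dependent, serious tradeoffs and risks"


class Breadth(Enum):
    PERVASIVE = "reaches throughout the solution"
    REGIONAL = "pervades an area but is encapsulated"
    LOCAL = "limited to a narrow area"


class Occurrence(Enum):
    PERSISTENT = "occurs all or most of the time"
    INTERMITTENT = "occurs with a predictable frequency"
    CONDITIONAL = "occurs in certain situations or conditions"


@dataclass(frozen=True)
class Challenge:
    name: str
    breadth: Breadth
    occurrence: Occurrence


# Hypothetical challenges, classified for illustration.
latency = Challenge("keep response times low", Breadth.PERVASIVE, Occurrence.PERSISTENT)
encryption = Challenge("encrypt data in transit", Breadth.PERVASIVE, Occurrence.PERSISTENT)
month_end = Challenge("absorb month-end reporting load", Breadth.REGIONAL, Occurrence.INTERMITTENT)

# Compatibility belongs to a pair of challenges, so it is keyed by the pair.
PAIRS: Dict[FrozenSet[str], Compatibility] = {
    frozenset({latency.name, encryption.name}): Compatibility.FRICTION,
    frozenset({encryption.name, month_end.name}): Compatibility.COMPATIBLE,
}


def compatibility(a: Challenge, b: Challenge) -> Compatibility:
    """Look up a pair in either order; unlisted pairs are assumed compatible."""
    return PAIRS.get(frozenset({a.name, b.name}), Compatibility.COMPATIBLE)
```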
Best Practice 5: Prioritize challenges to identify the most important ones to address
Software architecture, like software engineering, is a discipline of choosing among alternative approaches. While some decisions get made in small sets (e.g. this system will be 3-Tier, .NET, IIS with a SQL Server database), many, if not most, decisions are made independently.
One important thing to remember is that virtually every decision made reduces the degrees of freedom for the rest. Sometimes this reduction is desirable, but many times it paints the subsequent decisions into a corner.
If the goal of a software architect is to make good decisions for the system, it makes sense that s/he should address the highest priority challenges first (while more degrees of freedom are available). But how should challenges be prioritized? Three criteria need to be balanced:
- Importance:
- How many contexts does the challenge affect?
- How much of an impact does the challenge make on each one?
- What is the weighted average of this impact, given the relative importance of each context?
- Difficulty:
- The more difficult a challenge, the more degrees of freedom it is likely to need
- The more difficult a challenge, the more degrees of freedom it will consume
- The more degrees of freedom a challenge will consume, the earlier it should be tackled.
- Centrality (also called Core):
- Challenges are also dependent on the solutions to other challenges
- The challenges that others depend on should be addressed before or concurrently with the challenges that depend on them
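To make the balancing act concrete, here is one possible sketch. It is not a prescribed formula: the context weights, the 0 to 10 scales, the sample challenges, and the choice to simply add the three criteria together are all assumptions made for illustration. The one rule taken directly from the criteria is that a challenge is never scheduled ahead of a challenge it depends on.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Relative importance of each context (hypothetical weights).
CONTEXT_WEIGHTS: Dict[str, float] = {
    "traders": 0.5,
    "portfolio managers": 0.3,
    "compliance officers": 0.2,
}


@dataclass
class Challenge:
    name: str
    impact: Dict[str, float]   # context -> impact on a 0-10 scale (assumed scale)
    difficulty: float          # 0-10, estimated (assumed scale)
    depends_on: List[str] = field(default_factory=list)


def importance(ch: Challenge) -> float:
    """Weighted average of impact over all contexts; unaffected contexts count as 0."""
    total_weight = sum(CONTEXT_WEIGHTS.values())
    weighted = sum(w * ch.impact.get(ctx, 0.0) for ctx, w in CONTEXT_WEIGHTS.items())
    return weighted / total_weight


def centrality(ch: Challenge, challenges: List[Challenge]) -> int:
    """Number of other challenges whose solutions depend on this one."""
    return sum(ch.name in other.depends_on for other in challenges)


def prioritize(challenges: List[Challenge]) -> List[str]:
    """Order by combined score, but never place a challenge before its dependencies."""
    by_name = {ch.name: ch for ch in challenges}

    def score(ch: Challenge) -> float:
        # Equal weighting of the three criteria is an arbitrary assumption.
        return importance(ch) + ch.difficulty + centrality(ch, challenges)

    ordered: List[str] = []

    def visit(ch: Challenge) -> None:
        if ch.name in ordered:
            return
        for dep in ch.depends_on:  # assumes the dependency graph has no cycles
            visit(by_name[dep])
        ordered.append(ch.name)

    for ch in sorted(challenges, key=score, reverse=True):
        visit(ch)
    return ordered


CHALLENGES = [
    Challenge("retain trade history", {"compliance officers": 9}, difficulty=3),
    Challenge("keep pricing latency low", {"traders": 9, "portfolio managers": 6},
              difficulty=7, depends_on=["keep market data current"]),
    Challenge("keep market data current", {"traders": 8}, difficulty=5),
]
```

With these made-up numbers, pricing latency has the highest score, but because it depends on current market data, `prioritize(CHALLENGES)` places the market data challenge first. The scores are a conversation starter; the dependency ordering is the part I would hold firm on.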
Architectural design takes place in an environment of constraints. Constraint should be understood as a neutral concept, because it both prevents and enables. The same cup that constrains the flow of water also enables you to bring that water to your lips. Managed appropriately, constraints provide the structure and form that yield the desired function(s). Part of that management is understanding that design decisions are constraints. Decisions made in isolation risk inappropriately constraining a design in terms of the whole.
Remediation of architectural constraints is, by its very nature, expensive. Rewiring is more involved than repainting; replacing a foundation is far more extensive still. Understanding and accounting for all of the contexts involved allows you to see the architecture of the problem as a whole. The architecture of the problem then becomes the skeleton upon which the architecture of the solution can be built, incrementally, iteratively, and most important, effectively.
By Charlie Alfred and Gene Hughson
It’s a common occurrence on online forums to see someone ask what architectural style is the right one. Likewise, it’s common to see a reply of “it depends” because “context is king”. Many will nod sagely at this; of course it is, context is king. But, what is context? Specifically, what do we mean by the term “context” in terms of software and solution architecture?
At a high level, context defines the problem environment for which a solution is intended to provide value. That definition, however, is far too superficial to suffice, because only the most trivial of systems will have a single context. Refining our definition, we can state that a context is a grouping of stakeholders sharing similar goals, priorities, and perceptions. In practice, unless you’re dealing with a system that you are developing yourself for your sole use, you will be dealing with multiple contexts, and they are likely to overlap like the Olympic logo. If you are puzzled by this statement, consider silverware. We need knives, forks and spoons and we have different types of each of them.
Identifying contexts and understanding how they interrelate gives definition to the architecture of the problem, which is a necessary prerequisite to designing the architecture of a solution (note the use of “a solution” instead of “the solution” – it’s intentional). This involves considerable work to identify the goals, priorities, and perceptions in that the natural tendency is for a stakeholder to ask for what they think will solve their problem rather than enumerate their problems (which is the point of the apocryphal “faster horses” quote frequently attributed to Henry Ford). Nonetheless, breaking out those raw needs is crucial to success.
Where you have multiple contexts, chances are the goals, priorities, and perceptions diverge rather than complement. Take, for example, a pickup truck. One context may be those who haul bulky and/or heavy goods daily. Another context may be made up of those who frequently tow trailers. A third context might be those who haul or tow from time to time, but mainly use the truck for transportation. The value of any given feature of the truck: miles per gallon of fuel, size of the bed, whether or not the bed is covered, quality of the sound system, torque, etc. will vary based on the context we’re considering. Additionally, external forces (such as the laws of physics) will come into play and complicate making design decisions. For example, long, open truck beds maximize hauling capacity, but will also harm fuel efficiency.
Another complicating factor is the presence of peripheral stakeholders, those whose contexts will affect the direct stakeholders. In the sense of our pickup truck example, we can consider mechanics as an example of this. Trucks that have more room in the engine compartment are easier (therefore cheaper) to service but will suffer in terms of fuel efficiency due to increased size and weight. In the IT realm, development and support teams would fall into this category. Although their contexts would not be primary drivers, ignoring them could well impose a cost on the direct stakeholders in terms of turnaround time for enhancements and fixes.
Examples of multi-context software systems can be found in many places. The more generic a system, the more contexts it will have. Commodity office software (word processors, spreadsheets, etc.) is a prime example. Consider Microsoft Excel, which is a simple row/column tabulator, a statistical analysis tool, a database client, and, with its macro language, an application development platform. Stretching to cover this many needs turns the Excel User Experience into a game of Twister. Platforms, such as Android, are another example. Highly configurable applications like Salesforce.com (which has arguably crossed the line between product and platform) span multiple contexts as well. In the enterprise IT space, first SOA and now microservice architectures are nothing if not an attempt to address the plethora of contexts in play via separation of concerns.
A holistic consideration of the contexts at hand is an important success factor when evolving a system iteratively and incrementally. Without that consideration, decisions can be made that are conducive to one context but antagonistic to another. Deferring the reconciliation of two conflicting contexts until later may result in considerable architectural refactoring. This refactoring will involve time and expense at a minimum, and if time is constrained, the likelihood of incurring technical debt is high. Each additional conflict that is blundered into due to the lack of up front consideration increases the risk to the system. This is not to endorse a detailed Big Design Up Front (BDUF), but to recognize that problem awareness and problem avoidance are cheaper than rework. If the context has been identified, then it’s YAGNI: not “You Ain’t Gonna Need It” but “You Are Gonna Need It”.
The next post in this series will address concrete practices to integrate multiple contexts into a unified architecture of the problem that can serve as the foundation for a coherent architecture of a solution.