Microservices or Monoliths – Fences and Neighbors

Photo of fence separating fields from a road


At the end of my last post, “What Makes a Monolith Monolithic?”, I stated that I didn’t consider the term “monolithic” to be inherently derogatory. It is, rather, a descriptive term relating to the style of organizing an application’s architecture. Depending on the context the system operates within, a monolithic architectural style could lie anywhere on the continuum between perfectly suited and perfectly disastrous. Placing it on that continuum requires a sense of what qualities are most needed or desired and which can be traded off in their stead. Everything comes with a cost, and attempting to ignore that fact merely sets us up for unpleasant future surprises.

After an initial period of unbridled enthusiasm, opinion seemed to gel around the idea that highly distributed application architectures (aka microservice architectures) were not suitable to all contexts. There are prerequisites for jumping into the microservices pool in terms of problem architecture, infrastructure, and organization. Attempting to shoehorn a microservice architecture into an environment that cannot support it will be overly expensive at best and a failure of apocalyptic proportions at worst.

There are many aspects of application design that are commonly recognized as beneficial: modularity, loose-coupling, high cohesion, and separation of concerns. It is critical to realize that these aspects can be found in systems with microservice architectures, monolithic systems, and everything in between. Distributed architectures are not necessary for modularity, nor any of those other aspects. In fact, one could easily create an application with a microservice architecture whose qualities are opposite to these desirable ones.
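To make that concrete, here is a minimal sketch (Python, with module and method names invented purely for illustration) of a modular monolith: one process and one deployment unit whose parts interact only through narrow, explicit interfaces.

```python
# Hypothetical modular monolith: a single process and a single deployable,
# but the ordering concern only sees the billing concern through a narrow port.
from dataclasses import dataclass
from typing import Protocol


class BillingPort(Protocol):
    """The only surface the ordering module is allowed to depend on."""
    def charge(self, customer_id: str, amount: float) -> bool: ...


class BillingModule:
    def charge(self, customer_id: str, amount: float) -> bool:
        # Internal details (ledgers, retries, etc.) stay hidden behind the port.
        print(f"charging {customer_id}: {amount:.2f}")
        return True


@dataclass
class OrderingModule:
    billing: BillingPort  # depends on the interface, not the implementation

    def place_order(self, customer_id: str, amount: float) -> str:
        if not self.billing.charge(customer_id, amount):
            return "payment declined"
        return "order placed"


if __name__ == "__main__":
    # One process, one deployment unit, yet the modules remain loosely
    # coupled, highly cohesive, and individually testable or replaceable.
    app = OrderingModule(billing=BillingModule())
    print(app.place_order("c-42", 19.99))
```

None of the modularity here depends on a network boundary; the same ports could later be fronted by remote calls if, and only if, that trade-off made sense.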

There are, however, situations where the benefits of a microservice architecture outweigh the costs and complexity. The ability to independently deploy and scale the various parts of an application is a major one, in my opinion. A well-designed microservice architecture can even allow the components of an application to be replaced on the fly. These features are not unique to microservice architectures, but they are arguably easier to achieve there than in other styles of application architecture.

Real design, balancing both costs and benefits, is required. Sticking a bit of network in between the components is insufficient to ensure success. Deliberate design, especially as the boundaries multiply, is critical for an effective system. Identifying and providing for those boundaries at the conceptual level (i.e. before they become physical) is key. Good fences can either make for good neighbors, or they can create a maze of barriers.

What Makes a Monolith Monolithic?

Photo of Stonehenge, 1877


It seems like everybody throws around the term “monolith”, but what do we mean by that?

Sam Newman started the ball rolling yesterday with this tweet:

My first response was a (semi) joke:

I say semi joke because, in truth, semantics (i.e. meaning) is critical. The English language has a horrible tendency to overload terms as it is, and in our line of work we tend to make it even worse. Lack of specificity obscures, rather than enlightens. The problem with the term “monolith” is that, while it’s a powerfully evocative term, it isn’t a simple one to define. My second response was closer to an actual definition:

The purpose of this post is to expand on that a bit.

The “mono” portion of the term is, in my opinion, the crucial part. I believe that quality of oneness is what defines a monolithic system. As I noted in the second tweet, it’s a matter of meta-coupling, whether that coupling exists in the form of deployment, data architecture, or execution style (Jeppe Cramon‘s post “Microservices: It’s not (only) the size that matters, it’s (also) how you use them – part 3” shows how temporal coupling can turn a distributed system into a runtime monolith). The following tweets between Anne Currie and Sam illustrate the amorphous nature of what is and isn’t a monolith:

Modules that can be deployed to run in a single process need not be considered monolithic, if they’re not tightly coupled. Likewise, running distributed isn’t a guarantee against being monolithic if the components are tightly coupled in any way. The emphasis on “in any way” is due to the fact that any of the types of coupling I mentioned above can be a deal killer. If all the “microservices” must be deployed simultaneously for the system to work, it’s a distributed monolith. If the communication is both synchronous and fault intolerant, it’s a distributed monolith. If there’s a single data store backing the entire system, it’s a distributed monolith. It’s not the modularity that defines it (you can have a modular monolith), but the inability to separate the parts without damaging the whole system.
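To illustrate the temporal-coupling case, here is a hedged sketch (the services and their behavior are invented, and plain function calls stand in for what would really be network requests) of a system that is distributed in form but monolithic in behavior: every order flows through a synchronous, fault-intolerant chain, so no part can fail without the whole failing with it.

```python
# Hypothetical "distributed monolith": separately deployed services, but one
# synchronous, fault-intolerant call chain ties them into a single unit.
class ServiceUnavailable(Exception):
    pass


def inventory_service(order: dict, up: bool = True) -> dict:
    if not up:
        raise ServiceUnavailable("inventory is down")
    return {**order, "reserved": True}


def billing_service(order: dict, up: bool = True) -> dict:
    if not up:
        raise ServiceUnavailable("billing is down")
    return {**order, "charged": True}


def order_service(order: dict, inventory_up: bool, billing_up: bool) -> dict:
    # No queueing, retries, or graceful degradation: if any downstream
    # "service" is unavailable, the whole operation fails outright.
    order = inventory_service(order, up=inventory_up)
    order = billing_service(order, up=billing_up)
    return {**order, "status": "placed"}


if __name__ == "__main__":
    print(order_service({"id": 1}, inventory_up=True, billing_up=True))
    try:
        order_service({"id": 2}, inventory_up=True, billing_up=False)
    except ServiceUnavailable as reason:
        # The network boundaries exist, but temporally this is still one system.
        print(f"order 2 failed: {reason}")
```

Queues, retries, or fallback behavior would loosen the temporal coupling; without something along those lines, the network boundary adds latency and failure modes while subtracting nothing from the "mono".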

I would also point out that I don’t consider “monolithic” to be derogatory, in and of itself. There is a trade-off involved in terms of coupling and complexity (and cost). While I generally prefer more flexibility, there is always the danger of over-engineering. If we’re hand-carving marble gargoyles to stick on a tool shed, chances are the customer won’t be pleased. The solution should bear at least a passing resemblance to the problem context it’s supposed to address.

Stopping Accidental Technical Debt

Buster Keaton looking at a poorly constructed house

In one of my earlier posts about technical debt, I differentiated between intentional debt (that taken on deliberately and purposefully) and accidental debt (that which just accrues over time without rhyme or reason or record). Dealing with technical debt (in the sense of evaluating, tracking, and resolving it) is obviously a consideration for someone in an application architect role. While someone in that role absolutely should be aware of the intentional debt, is there a way to be more attuned to the accidental debt as well?

Last summer, I published a post titled “Distance…is the one true enemy…”. The post started with a group of tweets from Gregory Brown talking about the corrosive effects of distance on software development (distance between compile and run, between failure and correction, between development and feedback, etc.). I then extended the concept to management, talking about how distance between sense-maker and decision-maker could negatively affect the quality of the decisions being made.

There’s also a distance that neither Greg nor I covered at the time: design distance, the distance between the design and the outcome. Reducing design distance makes it easier to keep a handle on the accidental debt as well as the intentional.

Distance between the architectural decisions and the implementation can introduce technical debt. This distance can come from remote decision-makers, architecture pigeons who swoop in, deposit their “wisdom”, and then fly away home. It can come from failing to communicate the design considerations effectively across the entire team. It can also come from failing to monitor the system as it evolves. The design and the implementation need to be in alignment. Even more so, the design and the implementation need to align with particular problems to be solved/jobs to be done. Otherwise, the result may look like this:

Distance between development of the system and keeping the system running can introduce technical debt as well. The platform a system runs on is a vital part of the system, as critical as the code it supports. As with the code, the design, implementation, and context all need to be kept in alignment.

Alignment of design, implementation, and context can only be maintained by on-going architectural assessment. Stefan Dreverman’s “Using Philosophy in IT architecture” identified four questions to be asked as part of an assessment:

  1. “What is my purpose?”
  2. “What am I composed of?”
  3. “What’s in my environment?”
  4. “What do I communicate?”

These questions are applicable not only to the beginning of a system, but throughout its life-cycle. Failing to re-evaluate the architecture as a whole as the system evolves can lead to inconsistencies as design distance grows. We can get so busy dealing with the present that we create a future of pain:

https://twitter.com/jetpack/status/854661368611471360

At first glance, this approach might seem to be expensive, but rewriting legacy systems is expensive as well (assuming the rewrite would be successful, which is a tenuous assumption). Building applications with a one-and-done mindset is effectively building a legacy system.

Microservices, Monoliths, and Modularity

Iceberg


There are very valid reasons for considering a microservice architecture (MSA) when building/evolving an application. In my opinion, however, forcing modularity isn’t one of those very valid reasons.

Just the other day, I saw a tweet from Simon Brown saying this same thing:

I still like his comment from two years back: “I’ll keep saying this … if people can’t build monoliths properly, microservices won’t help”. I believe that if you’re having problems building a monolith properly, trying to use a distributed architecture to force modularity may actually cause harm.

MSAs, like any distributed application architecture, involve increased complexity and costs; table stakes, if you will. Like an iceberg, there’s both a lot more to it than just what’s showing above the waterline and a fair amount of hazard for the unwary. If a development team cannot or will not comply with design guidelines (e.g. modularity requirements), injecting additional complexity is probably not the solution you need.

Distributing an application makes it harder to accidentally entangle different concerns, but it doesn’t make it impossible:

I’d argue that making it harder to accidentally break modularity addresses neither of the groups I mentioned earlier: those who cannot or will not comply. It’s ironic, but those who fail to understand the need for modularity can be very creative in their “solutions”, regardless of the obstacles. Likewise for those who refuse to comply.
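As a purely illustrative example of that kind of “creativity” (the services, table, and units below are all invented), consider one “service” bypassing another’s published interface and reading its private schema directly. The network boundary is still there; the modularity is not.

```python
# Hypothetical example of distribution without modularity: the order service
# reads the customer service's private table instead of calling its interface.
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for a shared database server
# The customer "service" owns this table and stores the limit in cents.
db.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, credit_limit_cents INTEGER)")
db.execute("INSERT INTO customer VALUES ('c-42', 50000)")


def customer_service_api(customer_id: str) -> float:
    """Sanctioned interface: hides the internal representation (cents)."""
    row = db.execute(
        "SELECT credit_limit_cents FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] / 100.0


def order_service_shortcut(customer_id: str) -> float:
    # The order "service" skips the interface and queries the table directly.
    # It now depends on the customer service's private schema and units;
    # the physical separation did nothing to prevent the entanglement.
    row = db.execute(
        "SELECT credit_limit_cents FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return float(row[0])  # silently wrong by a factor of 100


if __name__ == "__main__":
    print(customer_service_api("c-42"))    # 500.0 via the sanctioned interface
    print(order_service_shortcut("c-42"))  # 50000.0 via the hidden coupling
```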

In short, distribution as a means of “ensuring” modularity fails the fitness for purpose test.

The situation becomes worse when you factor in the additional complexity inherent in a distributed system. Likewise, there’s the cost of the table stakes (infrastructure, process, staffing, etc.) mentioned above. Of course, having abandoned the principle of cause and effect, one could attempt some “creative” workarounds to avoid having to pay the price (in other words, adding more and more complexity).

When you introduce significant additional complexity (with all its attendant risk) with little chance of the technique actually achieving its goal, you’ve caused harm.

These concerns are not solely limited to the application architecture. Distributing the data architecture has the same limitations in terms of ensuring modularity and introduces additional complexity. Adding boundaries adds the need for governance. A disciplined, monolithic team can maintain modularity in a monolithic data architecture. Multiple separate teams trying to share a monolithic data architecture will either experience a crippling level of governance overhead or a complete breakdown in modularity.

MSAs can be useful when you need independently scalable and replaceable components. They can also be appropriate when you have multiple teams working on one logical application. Using the technique when the cost outweighs the potential payoff, however, is a losing bet.

Pragmatic Application Architecture

I saw a tweet on Friday about a SlideShare deck that looked interesting, so I bookmarked it to read later. As I was reading it this morning, I found myself agreeing with the points being made. When I got to the next to the last slide, I found myself (or at least, this blog) listed alongside some very distinguished company under “Reading Material”.

Thanks, Bart Blommaerts and nice job!

Strategic Tunnel Vision

Mouth of a Tunnel


Change and innovation are topics that have been prominent on this blog over the last year. In fact, Greger Wikstrand and I have traded a total of twenty-six posts (twenty-seven counting this one) on the subject.

Greger’s last post, “Successful digitization requires focus on the entire customer experience – not just a neat app” (it’s in Swedish, but it translates well to English), discussed the critical nature of customer experience to digital innovation. According to Greger, without taking customer experience into account:

One can make the world’s best app without getting more, more satisfied, and more profitable customers. It’s like trying to make a boring game more exciting by spraying gold paint on the playing pieces.

Change and innovation are not the same thing. Change is inevitable, innovation is not (with a h/t to Tom Cagley for that quote). As Greger pointed out in his latest article, to get improved customer experience, you need depth. Sprinkling digital fairy dust over something is not likely to result in innovation. New and different can be really great, but new and different solely for the sake of new and different doesn’t win the prize. Context is critical.

If you’ve read more than a couple of my posts, you’ve probably realized that among my rather varied interests, history is a major one. I lean heavily on military history in particular when discussing innovation. This post won’t break with that tradition.

The blog Defense in Depth, operated by the Defence Studies Department, King’s College London, has published two posts this week dealing with the Suez Crisis of 1956, primarily in terms of the Anglo-French forces. One deals with the land operations and the other with naval operations. They struck a chord because they both illustrated how an overreaction to change can have drastic consequences from the strategic level down to the tactical.

Buying into a fad can be extremely expensive.

The advent of the nuclear age at the end of World War II dramatically transformed military and political thought. The atomic bomb was the ultimate game-changer in that respect. In the time-honored tradition, the response was overreaction. “Atomic” was the “digital” of the late 40s into the 60s. They even developed a recoilless gun that could launch a 50-pound nuclear warhead 1.25 to 2.5 miles. “Move fast and break things” was serious business back in the day.

This extreme focus on what had changed, however, led to a rather common problem: tunnel vision. Nuclear capability became such an overarching consideration that other capabilities were neglected. Due to this neglect of more conventional capabilities, the UK’s forces were seriously hampered in their ability to perform their mission effectively. Misguided thinking at the strategic level affected operations all the way down to the lowest tactical formations.

It’s easy to imagine present-day IT scenarios that fall prey to the same issues. A cloud or digital initiative given top priority without regard to maintaining necessary capabilities could easily wind up failing in a costly manner and impairing the existing capability. It’s important to understand that time, money, and attention are finite resources. Adding capability requires increasing the resources available for it, either through adding new resources or freeing up existing ones by reducing the commitment to less important capabilities. If there is no real appreciation of what capabilities exist and what the relative value of each is, making this decision becomes a shot in the dark.

Situational awareness across all levels is required. To be effective, that awareness must integrate changes to the context while not losing sight of what already was. Otherwise, to use a metaphor from my high school football days, you risk acting like a “blind dog in a meat-packing plant”.

Monolithic Applications and Enterprise Gravel

Pebbles

It’s been almost a year since I’ve written anything about microservices, and while a lot has been said on that subject, it’s one I still monitor to see what new ideas pop up. The opening of a blog post that I read last week caught my attention:

Coined by Melvin Conway in 1968, Conway’s Law states: “Any organization that designs a system will produce a design whose structure is a copy of the organization’s communication structure.” In software development terms, Conway’s Law suggests that a given team will build apps that mirror the team’s organizational structure. Siloed functional teams produce siloed application architectures.

The result is a monolith: A massive application whose functionality is crammed into a few crowded parts. Scaling a simple pattern to the enterprise level often results in a monolith.

None of this is wrong, per se, but in reading it, one could come to a wrong conclusion. Siloed functional teams (particularly where the culture of the organization encourages siloed business units) produce siloed application architectures that are most likely monoliths. From an enterprise IT architecture perspective, though, the result is not monolithic. Googling the definition of “monolithic”, we get this:

mon·o·lith·ic
/ˌmänəˈliTHik/
adjective
  1. formed of a single large block of stone.
  2. (of an organization or system) large, powerful, and intractably indivisible and uniform.
    “rejecting any move toward a monolithic European superstate”
    synonyms: inflexible, rigid, unbending, unchanging, fossilized
    “a monolithic organization”

Rather than “a single large block of stone”, we get gravel. The architecture of the enterprise’s IT isn’t “large, powerful, and intractably indivisible and uniform”. It may well be large, but its power in relation to its size will be lacking. Too much effort is wasted reinventing wheels and maintaining redundant data (most likely with no real sense of which set of data is authoritative). Likewise, while “intractably indivisible” isn’t a virtue, being intractable while also lacking cohesion is worse. Such an IT architecture is a foundation built on shifting sand. Lastly, whether the EITA is uniform or not (and I would give good odds that it’s not), is irrelevant given the other negative aspects. Under the circumstances, worrying about uniformity would be like worrying about whether the superstructure of the Titanic had a fresh paint job.

Does this mean that microservices are the answer to having an effective EITA? Hardly.

There are prerequisites for being able to support a microservice architecture; table stakes, if you will. However, the service-oriented mindset can be of value whether it’s applied as far down as the intra-application level (i.e. microservices – it is an application architecture pattern) or inter-application (the more traditional SOA). Where the line is drawn depends on the context of the application(s) and their ecosystem. What can be afforded and supported are critical aspects of the equation at all levels.

What is necessary for an effective EITA is a full-stack approach. Governance and data architecture in particular are important aspects to consider. The goal is consistent, intentional alignment across all levels (enterprise, EITA, solution, and application), promoting a cohesive architecture throughout, not a top-down dictatorship.

Large edifices that last are built from smaller pieces that fit together on purpose.

Dealing with Technical Debt Like We Mean It

What’s the biggest problem with technical debt?

https://twitter.com/jetpack/status/724637235459547136

In my opinion, the biggest problem is that it works. Just like the electrical outlet pictured above, systems with technical debt get the job done, even when there’s a hidden surprise or two waiting to make life interesting for us at some later date. If it flat-out failed, getting it fixed would be far easier. Making the argument to spend time (money) changing something that “works” can be difficult.

Failing to make the argument, however, is not the answer:

Brenda Michelson‘s observation is half the battle. The argument for paying down technical debt needs to be made in business-relevant terms (cost, risk, customer impact, etc.). We need more focus on the “debt” part and remember “technical” is just a qualifier:

https://twitter.com/jetpack/status/703327984107782144

The other half of the battle is communicating, in the same business-relevant manner, the costs and/or risks involved when we consider taking on technical debt:

Tracking what technical debt exists and managing the payoff (or write-off; removing failed experiments is one technique for reducing debt) is important. Likewise, managing the assumption of technical debt is critical to avoid being swamped by it.
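One way to make that tracking concrete (this is just a sketch of one possible shape, not a standard or anyone’s prescribed method) is a simple debt register that records each item in the business-relevant terms mentioned above: carrying cost, risk, customer impact, and whether the debt was taken on intentionally.

```python
# Hypothetical technical-debt register expressed in business-relevant terms.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DebtItem:
    description: str
    monthly_carrying_cost: float   # e.g. extra support and rework, in currency
    risk: int                      # 1 (low) to 5 (high) likelihood of an incident
    customer_impact: int           # 1 (none) to 5 (severe) if it does bite
    incurred_on: date = field(default_factory=date.today)
    intentional: bool = True       # deliberate trade-off vs. accidental accrual

    def priority(self) -> float:
        # One possible ranking: carrying cost weighted by risk and impact.
        return self.monthly_carrying_cost * self.risk * self.customer_impact


register = [
    DebtItem("hand-rolled auth in reporting module", 800.0, risk=4, customer_impact=3),
    DebtItem("abandoned A/B framework still in codebase", 200.0, risk=2,
             customer_impact=1, intentional=False),  # candidate for write-off
]

# Review the register regularly, highest priority first.
for item in sorted(register, key=lambda i: i.priority(), reverse=True):
    print(f"{item.priority():>8.0f}  {item.description}")
```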

Of course, one could take the approach that the only acceptable level of technical debt is zero. This is equivalent to saying “if we can’t have a perfect product, we won’t have a product”. That might be a difficult position to sell to those writing the checks.

Even if you could get agreement on that position, reality will conspire to frustrate you. Entropy emerges. Even if the code is perfected and then left unchanged, the system can rot as its platform ages and the needs of the business change. When a system is actively maintained over time without an eye to keeping its architecture coherent and intentional, the situation becomes worse. In his post “Enterprise Modernization – The Next Big Thing!”, David Sprott noted:

The problem with modernization is that it is widely perceived as slow, very expensive and high risk because the core business legacy systems are hugely complex as a result of decades of tactical change projects that inevitably compromise any original architecture. But modernization activity must not be limited to the old, core systems; I observe all enterprises old and new, traditional and internet based delivering what I call “instant legacy” [Note 1] generally as outcomes of Agile projects that prioritize speed of delivery over compliance with a well-defined reference architecture that enables ongoing agility and continuous modernization.

Kellan Elliott-McCrea, in “Towards an understanding of technical debt”, captured the problem:

All code is technical debt. All code is, to varying degrees, an incorrect bet on what the future will look like.

This means that assessing and managing technical debt should be an ongoing activity with a responsible owner rather than a one-off event that “somebody” will take care of. The alternative is a bit like using a credit card at every opportunity and ignoring the statements until the repo-man is at the door.

The Hidden Cost of Cheap – UX and Internal Applications

Sisyphus by Titian

Why would anyone worry about user experience for anything that’s not customer-facing?

This question was the premise of Maurice Roach’s post in the Zühlke blog, “Empathise with your users or you won’t solve their problems”:

Bring up the subject of user empathy with some engineers or product owners and you’ll probably hear comments that fall into one of the following categories:

  • Why do we need to empathise when the requirements tell us all we need to know about the problem at hand?
  • Is this really going to improve anything?
  • Sounds like an expensive waste of time
  • They’ll have to use whatever they’re given

These aren’t unexpected responses, it’s easy to put empathy into the “touchy feely”, “let’s all hug and get along” box of product management.

Roach’s answers:

Empathy does a number of things, but mainly it increases the likelihood that the delivery team will think of a user and their pain points when delivering a feature.

If an engineer, UX designer or product owner has sat with a user, watched them interact with their current software or device, they will have an understanding of their frustrations, concerns and impediments to success. The team will be focused on creating features with the things they have witnessed in mind, they’re thinking about how their software will affect a human being and no amount of requirement documentation will give them that emotional connection.

Empathy can also help to develop a shared trust in the application development process. The users see that the delivery team are interested in helping to solve their problems and the product delivery team see the real users behind the application.

All of these are valid reasons, but the list is incomplete; they all answer the question from a software development point of view. To his credit, Roach pushes past the purely technical aspects into the world of the user. This expanded exploration of the context is, in my opinion, absolutely essential. What’s presented above is an IT-centric viewpoint that needs to be married with a business-centric viewpoint in order to get a fuller picture.

Nick Shackleton-Jones, in his post “The Future Is… Organisational Usability!”, outlined the problem:

Here’s how your organisation works: you hire people who are increasingly used to a world where they can do pretty much anything via an app on their iPhone, and you subject them to a blizzard of process, policy, antiquated systems and outdated ways of working which pretty much stop them in their tracks, leaving them unproductive and demoralised. Frankly, it’s a miracle they manage to accomplish anything at all.

As he notes, enterprises are putting a lot of effort into digital initiatives aimed at making it easier for customers to engage with them. However:

…if we are going to be successful in future we need to make it much easier for our people to do their jobs: because they are going to be spending less time with us, and because we want engagement and retention, and because if we require high levels of capability (to work our complex systems) then our resourcing costs will go through the roof. We have to simplify ‘getting stuff done’. To put it another way: in an ideal world, any job in your organisation should be do-able by a 12-yr old.

While I disagree that “any job in your organisation should be do-able by a 12-yr old”, Shackleton-Jones’ point is well taken that it is in the interests of the business to make it easier for people to do their jobs. All aspects of the system, whether organizational, procedural, or technological, should be facilitating, not hindering, the mission. Self-inflicted, unnecessary impediments are morale-killers and degrade both effectiveness and efficiency. All three of these directly impact customer experience.

While this linkage between employee user experience and customer experience makes usability important for line-of-business systems (both technological and social), it has value for peripheral systems as well. Time people spend on ancillary tasks (filling out time sheets, requesting supplies, etc.) is time not spent on their primary duties. You may not be able to eliminate those tasks, but you can minimize their expense by making them quick and easy to complete. The further someone’s knowledge/skill/experience level gets beyond “do-able by a 12-yr old”, the bigger the savings from paying attention to this.

Rather than asking if you can afford to pay attention to user experience, you might want to ask whether you can afford not to.

The Seductive Myth of Greenfield Development

Greger Wikstrand‘s tweet from earlier this week packed a wealth of inspiration into one image:

The second statement particularly resonated with me: “The present is built on the past.”

How often do we, or those around us, long for a chance to do things “from scratch”? The idea is that, without the constraints of “legacy” code, we could do things “right”. While it’s a nice idea, it has no basis in reality.

Rewrites, of course, will involve dealing with existing data. I’ve yet to encounter a system where no one was interested in the data when it was replaced. I’ve shut down a few where there was no interest, but that’s a different story. The need for that existing data will serve as a potent influence on what can or cannot be done with the replacement system. The same goes for its structure. It’s not reasonable to assume that the data will be any less “legacy” than the code.

We might be tempted to believe that brand new systems escape this pitfall. In doing so, we fail to consider that new systems still must deal with the wants, needs, and attitudes of their stakeholders. People, processes, and organization form the ecosystem that new systems must fit into as surely as replacement systems must.

A crucial part of problem solving is having an adequate understanding of the problem. Everything has a backstory. Understanding the backstory is dependent on understanding the ecosystem the thing fits into. This is what Sullivan was talking about when he said “…form ever follows function”.

Nothing’s Ex Nihilo.