Microservices, Monoliths, and Modularity

Iceberg

 

There are very valid reasons for considering a microservice architecture (MSA) when building/evolving an application. In my opinion, however, forcing modularity isn’t one of those very valid reasons.

Just the other day, I saw tweet from Simon Brown saying this same thing:

I still like his comment from two years back: “I’ll keep saying this … if people can’t build monoliths properly, microservices won’t help”. I believe that if you’re having problems building a monolith properly, trying to use a distributed architecture to force modularity may actually cause harm.

MSAs, like any distributed application architecture, involve increased complexity and costs; table stakes, if you will. Like an iceberg, there’s both a lot more to it than just what’s showing above the waterline and a fair amount of hazard for the unwary. If a development team cannot or will not comply with design guidelines (e.g. modularity requirements), injecting additional complexity is probably not the solution you need.

Distributing an application makes it harder to accidentally entangle different concerns, but it doesn’t make it impossible:

I’d argue that making it harder to accidentally break modularity addresses neither of the groups I mentioned earlier: those that cannot or will not comply. It’s ironic, but those who fail to understand the need for modularity can be very creative in their “solutions”, regardless of the obstacles. Likewise, those who refuse to comply.

In short, distribution as a means of “ensuring” modularity fails the fitness for purpose test.

The situation becomes worse when you factor in the additional complexity inherent in a distributed system. Likewise, there’s the cost of the table stakes (infrastructure, process, staffing, etc.) mentioned above. Of course, having abandoned the principle of cause and effect, one could attempt some “creative” workarounds to avoid having to pay the price (in other words, adding more and more complexity).

When you introduce significant additional complexity (with all its attendant risk) with little chance of the technique actually achieving its goal, you’ve caused harm.

These concerns are not solely limited to the application architecture. Distributing the data architecture has the same limitations in terms of ensuring modularity and introduces additional complexity. Adding boundaries adds the need for governance. A disciplined, monolithic team can maintain modularity in a monolithic data architecture. Multiple separate teams trying to share a monolithic data architecture will either experience a crippling level of governance overhead or a complete breakdown in modularity.

MSAs can be useful when you need independently scalable and replaceable components. When you have multiple teams working on one logical application, they can also be appropriate as well. Using the technique when the cost outweighs the potential payoff, however, is a losing bet.

Microservices – Sharpening the Focus

Motion Blurred London Bus

While it was not the genesis of the architectural style known as microservices, the March 2014 post by James Lewis and Martin Fowler certainly put it on the software development community’s radar. Although the level of interest generated has been considerable, the article was far from an unqualified endorsement:

Despite these positive experiences, however, we aren’t arguing that we are certain that microservices are the future direction for software architectures. While our experiences so far are positive compared to monolithic applications, we’re conscious of the fact that not enough time has passed for us to make a full judgement.

One reasonable argument we’ve heard is that you shouldn’t start with a microservices architecture. Instead begin with a monolith, keep it modular, and split it into microservices once the monolith becomes a problem. (Although this advice isn’t ideal, since a good in-process interface is usually not a good service interface.)

So we write this with cautious optimism. So far, we’ve seen enough about the microservice style to feel that it can be a worthwhile road to tread. We can’t say for sure where we’ll end up, but one of the challenges of software development is that you can only make decisions based on the imperfect information that you currently have to hand.

In the course of roughly fourteen months, Fowler’s opinion has gelled around the “reasonable argument”:

So my primary guideline would be don’t even consider microservices unless you have a system that’s too complex to manage as a monolith. The majority of software systems should be built as a single monolithic application. Do pay attention to good modularity within that monolith, but don’t try to separate it into separate services.

This mirrors what Sam Newman stated in “Microservices For Greenfield?”:

I remain convinced that it is much easier to partition an existing, “brownfield” system than to do so up front with a new, greenfield system. You have more to work with. You have code you can examine, you can speak to people who use and maintain the system. You also know what ‘good’ looks like – you have a working system to change, making it easier for you to know when you may have got something wrong or been too aggressive in your decision making process.

You also have a system that is actually running. You understand how it operates, how it behaves in production. Decomposition into microservices can cause some nasty performance issues for example, but with a brownfield system you have a chance to establish a healthy baseline before making potentially performance-impacting changes.

I’m certainly not saying ‘never do microservices for greenfield’, but I am saying that the factors above lead me to conclude that you should be cautious. Only split around those boundaries that are very clear at the beginning, and keep the rest on the more monolithic side. This will also give you time to assess how how mature you are from an operational point of view – if you struggle to manage two services, managing 10 is going to be difficult.

In short, the application architectural style known as microservice architecture (MSA), is unlikely to be an appropriate choice for the early stages of an application. Rather it is a style that is most likely migrated to from a more monolithic beginning. Some subset of applications may benefit from that form of distributed componentization at some point, but distribution, at any degree of granularity, should be based on need. Separation of concerns and modularity does not imply a need for distribution. In fact, poorly planned distribution may actually increase complexity and coupling while destroying encapsulation. Dependencies must be managed whether local or remote.

This is probably a good point to note that there is a great deal of room between a purely monolithic approach and a full-blown MSA. Rather than a binary choice, there is a wide range of options between the two. The fractal nature of the environment we inhabit means that responsibilities can be described as singular and separate without their being required to share the same granularity. Monoliths can be carved up and the resulting component parts still be considered monolithic compared to an extremely fine-grained sub-application microservice and that’s okay. The granularity of the partitioning (and the associated complexity) can be tailored to the desired outcome (such as making components reusable across multiple applications or more easily replaceable).

The moral of the story, at least in my opinion, is that intentional design concentrating on separation of concerns, loose coupling, and high cohesion is beneficial from the very start. Vertical (functional) slices, perhaps combined with layers (what I call “dicing”), can be used to achieve these ends. Regardless of whether the components are to be distributed at first, designing them with that in mind from the start will ease any transition that comes in the future without ill effects for the present. Neglecting these issues, risks hampering, if not outright preventing, breaking them out at a later date without resorting to a re-write.

These same concerns apply higher levels of abstraction as well. Rather than blindly growing a monolith that is all things to all people, adding new features should be treated as an opportunity to evaluate whether that functionality coheres with the existing application or is better suited to being a service from an external provider. Just as the application architecture should aim for modularity, so too should the solution architecture.

A modular design is a flexible design. While we cannot know up front the extent of change an application will undergo over its lifetime, we can be sure that there will be change. Designing with flexibility in mind means that change, when it comes, is less likely to be an existential crisis. As Hayim Makabee noted in his write-up of Rotem Hermon’s talk, “Change Driven Design”: “Change should entail extending the system rather than refactoring.”

A full-blown MSA architecture is one possible outcome for an application. It is, however, not the most likely outcome for most applications. What is important is to avoid unnecessary constraints and retain sufficient flexibility to deal with the needs that arise.

[London Bus Image by E01 via Wikimedia Commons.]

Institutional Amnesia, Cargo Cults and Software Development

When George Santayana stated that “Those who cannot remember the past are condemned to repeat it.”, he wasn’t talking about technology. When Brenda Michelson and Ed Featherston said much the same thing recently, they were:

It’s a sad fact of life that today’s silver bullet is likely to be yesterday’s junk which was probably the day before yesterday’s silver bullet.

Poor design choices are made for a variety of reasons. Sometimes it’s a matter of ego. Sometimes inadequate analysis is the culprit. Focusing on technology rather than problem-solving can be another pitfall. Even attempts at post-hoc justification of a prior bad decision can drive new mistakes.

An uncritical acceptance of tradition is a significant source of problem designs. Eberhard Wolff recently took a swipe at one old standard:

The stock reason for a tiered/distributed design is scalability. However, it’s not a given that distributing X horizontal layers across Y machines (yielding X/Y instances) will yield better results than Y machines, each with all three layers deployed on the same machine. The context in which this sort of distribution makes sense is far from universal. Even when the costs of distribution are outweighed by the benefits, traditional monolithic horizontal layers will likely be less efficient than vertical slices. One of the purported benefits of microservices is the ability to independently scale according to business concerns (vertical slices organized around bounded contexts) rather technology concerns (horizontal layers).

The mention of microservices brings to mind the problem of jumping on bandwagons. How many applications currently under development are being designed using this architectural style because it’s the “next big thing” rather than because the style fits the problem? Sam Newman, author of O’Reilly’s Building Microservices, in “Microservices for Greenfield?”, even states that he considers the style to be more suitable for evolving an existing system rather than building from scratch:

I remain convinced that it is much easier to partition an existing, “brownfield” system than to do so up front with a new, greenfield system. You have more to work with. You have code you can examine, you can speak to people who use and maintain the system. You also know what ‘good’ looks like – you have a working system to change, making it easier for you to know when you may have got something wrong or been too aggressive in your decision making process.

You also have a system that is actually running. You understand how it operates, how it behaves in production. Decomposition into microservices can cause some nasty performance issues for example, but with a brownfield system you have a chance to establish a healthy baseline before making potentially performance-impacting changes.

I’m certainly not saying ‘never do microservices for greenfield’, but I am saying that the factors above lead me to conclude that you should be cautious. Only split around those boundaries that are very clear at the beginning, and keep the rest on the more monolithic side. This will also give you time to assess how how mature you are from an operational point of view – if you struggle to manage two services, managing 10 is going to be difficult.

This same over-eagerness is present in front-end development as much as back-end development. Stefan Tilkow recently tweeted regarding the trend of jumping straight into complex Javascript framework applications rather than evolving into them based on need:

In my opinion, the key to effective design is being able to give a good answer when asked “why”. Being able to articulate the reasons behind the choices made is critical to justifying them. By reasons, I mean a logical explanations of how the techniques chosen contribute to the desired ends. Neither “X recommends this” nor “This is what everybody’s doing” count. Designing, developing, and evolving software systems is not a game of following a recipe. In the words of Grady Booch:

Form Follows Function on SPaMCast 335

SPaMCAST logo

It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast. This time I’m taking on Knuth’s quote: “Premature optimization is the root of all evil (or at least most of it) in programming.”

SPaMCast 335 features Tom on the meaning of effectiveness, efficiency, frameworks and methodologies; a discussion of my “Wait, did I just say Knuth was wrong?” post and an installment of Jo Ann Sweeny’s column, “Explaining Communication”, talking about content and a framework to guide the development of content.

Microservices, SOA, Reuse and Replaceability

Unicorn

While it’s not as elusive as the unicorn, the concept of reuse tends to be talked about more often talked about than seen. Over the years, object-orientation, design patterns, and services have all held out the promise of reuse of either code or at least, design. Similar claims have been made regarding microservices.

Reuse is a creature of extremes. Very fine grained components (e.g. the classes that make up the standard libraries of Java and .Net) are highly reusable but require glue code to coordinate their interaction in order to yield something useful. This will often be the case with microservices, although not always; it is possible to have very small services with few or no dependencies on other services (it’s important to remember, unlike libraries, services generally share both behavior and data.). Coarse grained components, such as traditional SOA services, can be reused across an enterprise’s IT architecture to provide standard high-level interfaces into centralized systems for other applications.

The important thing to bear in mind, though, is that reuse is not an end in itself. It can be a means of achieving consistency and/or efficiency, but its benefits come from avoiding cost and duplication rather than from the extra usage. Just as other forms of reuse have had costs in addition to benefits, so it is with microservices as well.

Anything that is reused rather than duplicated becomes a dependency of its client application. This dependency relationship is a form of coupling, tying the two codebases together and constraining the ability of the dependency to change. Within the confines of an application, it is generally better for reuse to emerge. Inter-application reuse will require more coordination and tend to be more deliberately designed. As with most things, there is no free lunch. Context is required to determine whether the trade is a good one or not.

Replaceability is, in my opinion, just as important, if not more so, than reuse. Being able to switch from one dependency to another (or from one version of a dependency to another) because that dependency has its own independent lifecycle and is independently deployed enables a great deal of flexibility. That flexibility enables easier upgrades (rolling migration rather than a big bang). Reducing the friction inherent in migrations reduces the likelihood of technical debt due to inertia.

While a shared service may well find more constraints with each additional client, each client can determine how much replaceability is appropriate for itself.

Wait, did I just say Knuth was wrong?

Surprised Women

In “Microservice Mistakes – Complexity as a Service”, I argued that the fine-grained nature of microservices opened up the risk of pushing complexity out to the consumers of those services. Rather than encapsulating details, microservice architectures expose them, forcing clients to know more about the internals than is common in both object-oriented and SOA traditions. In the comments, it was suggested that granularity was irrelevant as multiple granular microservices could be composed to form a coarser-grained microservice that would provide a more appropriate level of abstraction. My response was that while this is theoretically true, aggregating service calls in that manner risks issues due to network latency. This drew a response quoting Donald Knuth: “Premature optimization is the root of all evil (or at least most of it) in programming.”

Okay, in my rebuttal I did say that Knuth was wrong about this when it came to distributed systems. A better response would have been to point out that Knuth’s quote did not apply. Far from being an optimization, taking latency (as well as other network characteristics) into consideration is just basic design. Meeting a certain time to complete for in-process calls affects quality of service, making efforts to reduce that time optimizations. Meeting a certain time to complete for remote calls affects function. Achieving a functional state is not an optimization.

Location agnostic components, code that “just works” whether in-process, out of process, or over the wire, has been a Holy Grail since the days of DCOM and CORBA. The laws of physics, however, just won’t be denied. Services are not JARs and DLLs. Changing components that were designed to run in-process into ones capable of running remotely will almost certainly involve major re-work, not a little optimization.

Quick Fixes That Last a Lifetime

Move Fast and Break Things on xkcd

“Move fast and break things.”

“Fail fast.”

“YAGNI.”

“Go with the simplest thing that can possibly work.”

I’ve written previously about my dislike for simplistic sound-bite slogans. Ideas that have real merit under the appropriate circumstances can be deadly when stripped of context and touted as universal truths. As Tom Graves noted in his recent post “Fail, to learn”, it’s not about failing, it’s about learning. We can’t laugh at cargo cultists building faux airports to lure the planes back while we latch on to naive formulas for success in complex undertakings without a clue as to how they’re supposed to work.

The concepts of emergent design and emergent architecture are cases in point. Some people contend that if you do the simplest thing that could possibly work, “The architecture follows immediately from that: the architecture is just the accumulation of these small steps”. It is trivially true that an architecture will emerge under those circumstances. What is unclear (and unexplained) is how a coherent architecture is supposed to emerge without any consideration for the higher levels of scope. Perhaps the intent is to replicate Darwinian evolution. If so, that would seem to ignore the fact that Darwinian evolution occurs over very long time periods and leaves a multitude of bodies in its wake. While the species (at least those that survive) ultimately benefit, individuals may find the process harsh. If the fittest (most adaptable, actually) survive, that leaves a bleaker future for those that are less so. Tipping the scales by designing for more than the moment seems prudent.

Distributed systems find it even more difficult to evolve. Within the boundary of a single application, moving fast and breaking things may not be fatal (systems dealing with health, safety, or finance are likely to be less tolerant than social networks and games). With enough agility, unfavorable mutations within an application can be responded to and remediated relatively quickly. Ill-considered design decisions that cross system boundaries can become permanent problems when cost and complexity outweigh the benefits of fixing them. There is a great deal of speculation that the naming of Windows 10 was driven by the number of potential issues that would be created by naming it Windows 9. Allegedly, Microsoft based its decision on not triggering issues caused by short-sighted decisions on the part of developers external to Microsoft. As John Cook noted:

Many think this is stupid. They say that Microsoft should call the next version Windows 9, and if somebody’s dumb code breaks, it’s their own fault.

People who think that way aren’t billionaires. Microsoft got where it is, in part, because they have enough business savvy to take responsibility for problems that are not their fault but that would be perceived as being their fault.

It is naive, particularly with distributed applications, to act as if there are no constraints. Refactoring is not free, and consumers of published interfaces create inertia. While it would be both expensive and ultimately futile to design for every circumstance, no matter how improbable, it is foolish to ignore foreseeable issues and allow a weakness to become a “standard”. There is a wide variance between over-engineering/gold-plating (e.g. planting land mines in my front yard just in case I get attacked by terrorists) and slavish adherence to a slogan (e.g. waiting to install locks on my front door until I’ve had something stolen because YAGNI).

I can move fast and break things by wearing a blindfold while driving, but that’s not going to get me anywhere, will it?

Microservices Under the Microscope

Microscope

Buzz and backlash seems to describe the technology circle of life. Something (language, process, platform, etc.; it doesn’t seem to matter) gets noticed, interest increases, then the reaction sets in. This was seen with Service-Oriented Architecture (SOA) in the early 2000s. More was promised than could ever be realistically attained and eventually the hype collapsed under its own weight (with a little help from the economic downturn). While the term SOA has died out, services, being useful, have remained.

2014 appears to be the year of microservices. While neither the term nor the architectural style itself are new, James Lewis and Martin Fowler’s post from earlier this year appears to have significantly raised the level of interest in it. In response to the enthusiasm, others have pointed out that the microservices architectural style, like any technique, involves trade offs. Michael Feathers pointed out in “Microservices Until Macro Complexity”:

If services are often bigger than classes in OO, then the next thing to look for is a level above microservices that provides a more abstract view of an architecture. People are struggling with that right now and it was foreseeable. Along with that concern, we have the general issue of dealing with asynchrony and communication patterns between services. I strongly believe that there is a law of conservation of complexity in software. When we break up big things into small pieces we invariably push the complexity to their interaction.

Robert, “Uncle Bob”, Martin has recently been a prominent voice questioning the silver bullet status of microservices. In “Microservices and Jars”, he pointed out that applications can achieve separation of concerns via componentization (using jars/Gems/DLLs depending on the platform) without incurring the overhead of over-the-wire communication. According to Uncle Bob, by using a plugin scheme, components can be as independently deployable as a microservice.

Giorgio Sironi responded with the post “Microservices are not Jars”. In it, Giorgio pointed out independent deployment is only part of the equation, independent scalability is possible via microservices but not via plugins. Giorgio questioned the safety of swapping out libraries, but I can vouch for the fact that plugins can be hot-swapped at runtime. One important point made was in regard to this quote from Uncle Bob’s post:

If I want two jars to get into a rapid chat with each other, I can. But I don’t dare do that with a MS because the communication time will kill me.

Sironi’s response:

Of course, chatty fine-grained interfaces are not a microservices trait. I prefer accept a Command, emit Events as an integration style. After all, microservices can become dangerous if integrated with purely synchronous calls so the kind of interfaces they expose to each other is necessarily different from the one of objects that work in the same process. This is a property of every distributed system, as we know from 1996.

Remember this for later.

Uncle Bob’s follow-up post, “Clean Micro-service Architecture”, concentrated on scalability. It made the point that microservices are not the only method for scaling an application (true); and stated that “the deployment model is a detail” and “details are never part of an architecture” (not true, at least in my opinion and that of others):

While Uncle Bob may consider the idea of designing for distribution to be “BDUF Baloney”, that’s wrong. That’s not only wrong, but he knows it’s wrong – see his quote above re: “a rapid chat”. In the paper that’s referenced in the Sironi quote above, Waldo et al. put it this way:

We argue that objects that interact in a distributed system need to be dealt with in ways that are intrinsically different from objects that interact in a single address space. These differences are required because distributed systems require that the programmer be aware of latency, have a different model of memory access, and take into account issues of concurrency and partial failure.

You can design a system with components that can run in the same process, across multiple processes, and across multiple machines. To do so, however, you must design them as if they were going to be distributed from the start. If you begin chatty, you will find yourself jumping through hoops to adapt to a coarse-grained interface later. If you start with the assumption of synchronous and/or reliable communications, you may well find a lot of obstacles when you need to change to a model that lacks one or both of those qualities. I’ve seen systems that work reasonably well on a single machine (excluding the database server) fall over when someone attempts to load balance them because of a failure to take scaling into account. Things like invalidating and refreshing caches as well as event publication become much more complex starting with node number two if a “simplest thing that can work” approach is taken.

Distributed applications in general and microservice architectures in particular are not universal solutions. There are costs as well as benefits to every architectural style and sometimes having everything in-process is the right answer for a given point in time. On the other hand, you can’t expect to scale easily if you haven’t taken scalability into consideration previously.

Carving it up – Microservices, Monoliths, & Conway’s Law

Felling a gumtree

In a previous post, I quoted a passage from Ruth Malan’s “Conway’s Law”:

The organizational divides are going to drive the true seams in the system.

The architecture of the system gets cemented in the forms of the teams that develop it. The system decomposition drives team ownership. Then the organizational lines of communication become reflected in the interfaces, with cleaner, better preserved interfaces along the lines where organizational distance increases. In small, co-located teams, short-cuts can be taken to optimize within the team. But each short-cut that introduces a dependency is like rebar in concrete–structurally efficient, but rigid. If the environment changes, demanding new lines to be drawn, the cost becomes clear. The architecture is hard to adapt.

Those eight sentences contain extremely important considerations for application architecture, particularly if you are considering a microservice architecture.

Microservice architectures, although they have gained currency due to a recent post by James Lewis and Martin Fowler, are not a new concept. Lewis and Fowler note a pedigree going back as far as 2011, and in the opinion of Steve Jones, “Microservices is SOA, for those who know what SOA is”. Roger Sessions published a 3 part series in the fall of 2012 detailing his concept of a “Snowman Architecture” (Part 1, Part 2, Part 3), which is fundamentally similar to the Lewis and Fowler model.

According to Lewis and Fowler, the characteristics of a microservice architecture are:

  • Componentization via Services: Services are independently deployable, encapsulated, out-of-process components that can be composed into a system of systems with their interactions defined via service contracts (APIs).
  • Organized around Business Capabilities: Services are vertical slices of technologies dedicated to particular business capability, ideally with coinciding service and team boundaries.
  • Products not Projects: The team should own the service over its entire lifecycle, both development and support. (*)
  • Smart endpoints and dumb pipes: Logic should reside within the services rather than the communications mechanisms that tie them together (such as an ESB).
  • Decentralized Governance: Because of the partitioning inherent in this style, teams are not forced into one set of standards for their internal implementations. (*)
  • Decentralized Data Management: Because of the partitioning inherent in this style, data will be distributed across multiple services with a tendency towards eventual consistency.
  • Infrastructure Automation: Lewis and Fowler recommend automated testing, automated deployment and automation of infrastructure. (*)
  • Design for failure: The distributed, partitioned nature of microservice architectures increases the need for system monitoring and resilience.
  • Evolutionary Design: Smaller, modular services enable agile, controllable change.

* Note: I consider these to more in the realm of process than architectural style. That being said, I also consider most of these to be good practice for teams using this particular architectural style.

Composing a system of independent, autonomous components, as opposed to a monolith, allows them to be partitioned by business capability as well as be developed and maintained by a permanent team. This transforms Conway’s Law from an observation about the nature of software systems into a principle that can be used to improve the structure of those systems (“The organizational divides are going to drive the true seams in the system.”). The social system architecture will influence the software system architecture in any event, the question is whether that influence will be intentional or not.

The microservice architectural style has been described as “applying many of the principles of SOLID at an architectural level”. As opposed to monolithic architectures, it enables a high degree of flexibility, cohesion and composability (when done well). It is not, however, a free lunch. Distributed applications are more complex than monolithic ones in terms of development, operations, and governance. Accessing components that are out-of-process, not to mention over the network, is not the same as dealing with in-process components. Network latency becomes more and more important as applications become more distributed. Asynchronous operations can ease some of that pain, but at the cost increasing the complexity of the system.

Overly granular services can be another pitfall with this architectural style. In “Services, Microservices, Nanoservices – oh my!”, Arnon Rotem-Gal-Oz discusses the nanoservice anti-pattern where “…overhead (communications, maintenance, and so on) outweighs its utility”. To expand on his “so on”, security is one of those overhead aspects that can become more burdensome as the application becomes more distributed. At the very minimum, authorization, if the application is entirely internal, is more complicated than it would be with a monolith. External users and/or dependencies increase the scope of that security overhead, potentially adding multiple methods of authentication, encryption, etc.

Data architecture complexity is another consideration. It should be borne in mind that each microservice is its own self-contained system with its own data store(s). This adds fragmentation and most likely, redundancy, to the enterprise’s data architecture. Redundancy is not bad per se (e.g. caching), but it does require more in terms of governance to identify what source is authoritative and what others are not. Likewise, fragmentation is not always poor design (think bounded contexts with shared concepts), but it may require more effort to consolidate all the data for a given entity for reporting purposes. One might be tempted to “cheat” and go the shared database route, but that would be a mistake.

In a post on application boundaries, Martin Fowler observed that “essentially applications are social constructions…drawn primarily by human inter-relationships and politics rather than technical and functional considerations”. Realizing that, we can use the organizational dynamics to shape the components of our systems of systems in a way that helps, rather than harms the technical and functional considerations. The critical things to keep in mind are balance and intent.

Dependency Management is Risk Management

It depends

How well-managed are your dependencies? Are you aware of all of them? Which ones can fail gracefully? Which ones allow the application to continue in a degraded state in the event of a failure? How many dependencies would cause your application to become unavailable in the event of a failure?

It’s instructive that the Latin root of the word “depend” means “to hang”. Although we may rely on them, dependencies also hang over our heads like the sword of Damocles, an ever-present threat to the well-being of our systems and our peace of mind. Since we cannot live without them, we must find a way to harness the usefulness while minimizing the risk.

It’s common to think of code when the subject of dependencies comes up. Issues around how to organize an application into packages, reuse of common components, use of third-party components, and even proper usage of framework classes are all dependency management issues. For these types of dependencies, the anti-patterns tend to be well known, as are techniques for managing them:

My Four Principles of Dependency Management have an order of precedence.

  1. Minimise Dependencies – the simpler our code, the less “things” we have referring to other “things”
  2. Localise Dependencies – for the code we have to write, as much as possible, “things” should be packaged – in units of code organisation – together with the “things” they depend on
  3. Stabilise Dependencies – of course, we can’t put our entire dependency network in the same function (that would be silly). For starters, it’s at odds with minimising our dependencies, since modularity is the mechanism for removing duplication, and modularisation inevitably requires some dependencies to cross the boundaries between modules (using the most general meaning of “module” to mean a unit of code reuse – which could be a function or could be an entire system in a network of systems). When dependencies have to cross those boundaries, they should point towards things that are less likely – e.g., harder – to change. This can help to localise the spread of changes across our network of dependencies, in much the same way that a run on the banks is less likely if banks only lend to other banks that are less likely to default.
  4. Abstract Dependencies – when we have to depend on something, but still need to accomodate change into system somehow, the easiest way to that is to make things that are depended upon easier to substitute. It’s for much the same reason that we favour modular computer hardware. We can evolve and improve our computer by swapping out components with newer ones. To make this possible, computer components need to communicate through standard interfaces. These industry abstractions make make it possible for me to swap out my memory with larger or faster memory, or my hard drive, or my graphics card. If ATI graphics cards had an ATI-specific interface, and NVidia cards had NVidia-specific interfaces, this would not be possible.

Jason Gorman, “Revisiting Unified Principles of Dependency Management (In Lieu Of 100 Tweets)”

Infrastructure dependencies are another common dependency management issue. Database and web servers, middleware, and distributed storage all fall into this category, as does network connectivity. While the focus around code dependencies was on complexity and API stability, the main concern for infrastructure dependencies will be availability. Monitoring, clustering and/or mirroring are common tactics for mitigating risks with these. Other tactics include retries and queuing requests until communications are restored. In some cases, optional functionality can be disabled when unavailable.

Services, particularly third-party services, combine the features of both code and infrastructure dependencies in that API stability and availability are equal concerns. While this presents extra challenges, it also means that both sets of mitigation strategies are available for use. For example, assuming that two providers host an equivalent service, an abstraction layer can be combined with queuing and retries to remain up even when one provider is out of service. Where appropriate, an enterprise service bus can be used to handle translation across multiple message formats, taking that complexity out of the client application.

Third-party services pose a special case of availability concerns – supplier continuity. You must be prepared to deal with the contingency that the provider will either go out of business or discontinue their offering, leaving you to find a permanent replacement. This applies to third-party infrastructure (aka “the Cloud”) as well.

Configuration data is a dependency that doesn’t come to mind as readily. However, as systems become more complex and more redundant (purposely, for availability), configuration issues can cripple. Jason Gorman’s first two principles (minimize and localize) can help. Additionally, automating changes and additions to configuration data will also help ensure that items aren’t forgotten or poorly formatted.

A side effect of increased integration is increased reliance on shared data values and mapping from one value to another. If the same item is a “Gadget” on the web site and a “Widget” in the inventory system, there is a potential for problems. By the same token, even when items are named identically, issues can arise when changes are made for business reasons. When the systems involved cross organizational boundaries, the potential for problems increases further. It is critical to identify and understand these dependencies so that you have a plan in place to manage them prior to their becoming an issue.

Understanding and managing dependencies contributes to both the reliability and maintainability of a system. Potential issues can be identified, making them both quicker and easier to debug when a problem occurs. This allows issues to be evaluated for what level of risk is acceptable and which measures are appropriate to mitigate that risk. Where appropriate, the system’s architecture can then be structured to handle them with minimal manual intervention. Failing to do the work up front can “leave you hanging” when things go wrong.