Service Versioning Illustrated

Hogarth painting the muse

My last post, “No Structure Services”, generated some discussion on LinkedIn regarding service versioning and how an application’s architecture can enable exposing services that are well-defined, stable, internally simple and DRY. I’ve discussed these topics in the past: “Strict Versioning for Services – Applying the Open/Closed Principle” detailed the versioning process I use to design and maintain services that can evolve while maintaining backwards compatibility, and “On the plane or in the plane?” covered how I decouple the service from the underlying implementation. Based on that discussion, I decided that some visuals would probably add clarity to the subject.

Note: The diagrams below are meant to simplify understanding of these two concepts (versioning and the structure to support it) and are not a 100% faithful representation of an application. If you look at them as a blueprint, rather than a conceptual outline, you’ll find a couple of SRP violations, etc. Please ignore the nits and focus on the main ideas.

Internal API Diagram

In the beginning, there was an application consisting of a class named Greeter, which had the job of constructing a greeting for some named person from another person. A user interface was created to collect the necessary information from the end-user, pass it to Greeter and display the results. The input to the Greet method is an object of type GreetRequest, which has members identifying the sender and recipient. Greeter.Greet() returns a GreetResponse, the sole member of which is a string containing the Greeting.
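
To make the picture concrete, a minimal sketch of that internal API might look like the following (only the class and member names come from the description above; the rest, including the greeting format, is illustrative):

    // Internal message types and the core class (an illustrative sketch,
    // not the actual implementation from the diagram).
    public class GreetRequest
    {
        public string Sender { get; set; }
        public string Recipient { get; set; }
    }

    public class GreetResponse
    {
        public string Greeting { get; set; }
    }

    public class Greeter
    {
        public GreetResponse Greet(GreetRequest request)
        {
            return new GreetResponse
            {
                Greeting = string.Format("Hello {0}, from your friend {1}",
                                         request.Recipient, request.Sender)
            };
        }
    }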

And it was good (actually, it was Hello World which until recently was just a cheesy little sample program but is now worth boatloads of cash – should you find yourself using these diagrams to pitch something to a VC, sending me a cut would probably be good karma 😉 ).

At some point, the decision was made to make the core functionality available to external applications (where external is defined as client applications built and deployed separately from the component in question, regardless of whether the team responsible for the client is internal or external to the organization). If the internal API were exposed directly, the ability to change Greeter, GreetRequest and GreetResponse would be severely constrained. Evolving that functionality could easily lead to non-DRY code if backwards compatibility is a concern.

Note: Backwards compatibility is always a concern unless you’re ditching the existing client. The only other option is synchronized development and deployment, which is slightly more painful than trimming your fingernails with a chainsaw – definitely not recommended.

The alternative is to create a facade/adapter class (GreetingsService) along with complementary message classes (GreetingRequest and GreetingResponse) that can serve as the published interface. The GreetingsService exists to receive the GreetingRequest, manage its transformation to a GreetRequest, delegate to Greeter, and manage the transformation of the GreetResponse into a GreetingResponse, which is returned to the caller (this is an example of the SRP problem I mentioned above; in actual practice, some of those tasks would be handled by other classes external to GreetingsService – an example can be found here).

Internal API with External Service Adapter Diagram
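
A sketch of that facade/adapter, assuming simple one-to-one property mappings, might look like this:

    // Published message types for external consumers.
    public class GreetingRequest
    {
        public string Sender { get; set; }
        public string Recipient { get; set; }
    }

    public class GreetingResponse
    {
        public string Greeting { get; set; }
    }

    // Facade/adapter: receives the published message, translates it to the
    // internal format, delegates to Greeter and translates the result back.
    public class GreetingsService
    {
        private readonly Greeter greeter = new Greeter();

        public GreetingResponse Greet(GreetingRequest request)
        {
            var internalRequest = new GreetRequest
            {
                Sender = request.Sender,
                Recipient = request.Recipient
            };

            GreetResponse internalResponse = greeter.Greet(internalRequest);

            return new GreetingResponse { Greeting = internalResponse.Greeting };
        }
    }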

Later, someone decided that the application should have multilingual capability. Wouldn’t it be cool if you could choose between “Hello William, from your friend Gene” and “Hola Guillermo, de su amigo Eugenio”? The question, however, is how to enable this without breaking clients using GreetingsService. The answer is to add a Language property to the GreetRequest (Language being of the GreetingLanguage enumeration type) and make its default value English. We can now create GreetingsServiceV1, which does everything GreetingsService does (substituting GreetingRequestV1 and GreetingResponseV1 for GreetingRequest and GreetingResponse) and adds the new language capability. The result looks like this:

Internal API with External Service Adapters (2 versions) Diagram
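
In code, the additions might look something like this (the enumeration, the Language property and the V1 names come from the description above; the bodies are illustrative and assume the internal GreetRequest now carries the Language property):

    // New enumeration limiting the language choices. English is the first
    // member, so it is the default value of the new Language property on
    // the internal GreetRequest, leaving existing callers unaffected.
    public enum GreetingLanguage
    {
        English,
        Spanish
    }

    // New published messages and adapter for version 1; the original
    // GreetingsService and its messages are untouched.
    public class GreetingRequestV1
    {
        public string Sender { get; set; }
        public string Recipient { get; set; }
        public GreetingLanguage Language { get; set; }
    }

    public class GreetingResponseV1
    {
        public string Greeting { get; set; }
    }

    public class GreetingsServiceV1
    {
        private readonly Greeter greeter = new Greeter();

        public GreetingResponseV1 Greet(GreetingRequestV1 request)
        {
            var internalRequest = new GreetRequest
            {
                Sender = request.Sender,
                Recipient = request.Recipient,
                Language = request.Language
            };

            GreetResponse internalResponse = greeter.Greet(internalRequest);

            return new GreetingResponseV1 { Greeting = internalResponse.Greeting };
        }
    }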

Because Language defaults to English, there’s no need to modify GreetingsService at all. It should continue to work as-is and its clients will continue to receive the same results. The same type of results can be obtained using a loose versioning scheme (additions, which should be ignored by existing clients, are okay; you only have to add a new version if the change is something that would break the interface, like a deletion). The “can” and “should” raise flags for me – I have control issues (which is incredibly useful when you support published services).

Control is the best reason for preferring a strict versioning scheme. If, for example, we wanted to change the default language to Spanish going forward while maintaining backwards compatibility, we could not do that under a loose regime without introducing a lot of kludgy complexity. With the strict scheme, it would be trivial (just change the default on GreetingRequestV1 to Spanish and you’re done). With the strict scheme I can even retire GreetingsService once GreetingsServiceV1 is operational and the old clients have had a chance to migrate to the new version.

Our last illustration is just to reinforce what’s been said above. This time a property has been added to control the number of times the greeting is generated. GreetingsServiceV2 and its messages support that and all prior functionality, while GreetingsService and GreetingsServiceV1 are unchanged.

Internal API with External Service Adapters (3 versions) Diagram
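
Following the same pattern, a sketch of the version 2 message might be (the repeat-count property name is illustrative):

    // Version 2 request: everything from V1 plus a count controlling how
    // many times the greeting is generated. GreetingsService and
    // GreetingsServiceV1 remain exactly as they were.
    public class GreetingRequestV2
    {
        public GreetingRequestV2()
        {
            RepeatCount = 1; // default: a single greeting
        }

        public string Sender { get; set; }
        public string Recipient { get; set; }
        public GreetingLanguage Language { get; set; }
        public int RepeatCount { get; set; } // illustrative name
    }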

As noted above, being well-defined, stable, internally simple and DRY are all positive attributes for published services. A strict versioning scheme provides those attributes and control over what versions are available.

No Structure Services

Amoeba sketch

Some people seem to think that flexibility is universally a virtue. Flexibility, in their opinion, is key to interoperability. Postel’s Principle, “…be conservative in what you do, be liberal in what you accept from others”, is often used to justify this belief. While this sounds wonderful in theory, in practice it’s problematic. As Tom Stuart pointed out in “Postel’s Principle is a Bad Idea”:

Postel’s Principle is wrong, or perhaps wrongly applied. The problem is that although implementations will handle well formed messages consistently, they all handle errors differently. If some data means two different things to different parts of your program or network, it can be exploited—Interoperability is achieved at the expense of security.

These problems exist in TCP, the poster child for Postel’s principle. It is possible to make different machines see different input, by building packets that one machine accepts and the other rejects. In Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection, the authors use features like IP fragmentation, corrupt packets, and other ambiguous bits of the standard, to smuggle attacks through firewalls and early warning systems.

In his defense, the environment in which Postel proposed this principle is far different from what we have now. Eric Allman, writing for the ACM Queue, noted in “The Robustness Principle Reconsidered”:

The Robustness Principle was formulated in an Internet of cooperators. The world has changed a lot since then. Everything, even services that you may think you control, is suspect.

Flexibility, often sold as extensibility, too often introduces ambiguity and uncertainty. Ambiguity and uncertainty are antithetical to APIs. This is why 2 of John Sonmez’s “3 Simple Techniques to Make APIs Easier to Use and Understand” are “Using enumerations to limit choices” and “Using default values to reduce required parameters”. Constraints provide structure and structure simplifies.

Taken to the extreme, I’ve seen flexibility used to justify “string in, string out” service method signatures. “Send us a string containing XML and we’ll send you one back”. There’s no need to worry about versioning, etc., because all the versions for all the clients are handled by a single endpoint. Of course, behind the scenes there’s a lot of conditional logic and “hope for the best” parsing. For the client, there’s no automated generation of messages, nor even a guarantee of structure. Validation of the structure can only occur at runtime.
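
For contrast, here is a minimal sketch of the two styles of signature (the interface names are illustrative; the message types are those sketched in “Service Versioning Illustrated”):

    // "Flexible" contract: the structure of the XML is a matter of
    // convention, discoverable only from documentation or code, and
    // validated only at runtime.
    public interface IDoEverythingService
    {
        string Execute(string xmlRequest);
    }

    // Defined contract: the structure is explicit, tooling can generate
    // client-side proxies and messages, and the shape is validated before
    // the request ever reaches the business logic.
    public interface IGreetingsService
    {
        GreetingResponse Greet(GreetingRequest request);
    }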

Does this really sound robust?

I often suspect the reluctance to tie endpoints to defined contracts is due to excessive coupling between the code exposing the service and the code performing the function of the service. If domain logic is intermingled with presentation logic (and a service interface is a form of presentation), then a strict versioning scheme, an application of the Open/Closed Principle to services, now violates Don’t Repeat Yourself (DRY). If, however, the two concerns are kept separate within the application, multiple endpoints can be handled without duplicating business logic. This provides flexibility for both divergent client needs and client migrations from one message format to another, with less complexity and ambiguity.

Stable interfaces don’t buy you much when they’re achieved by unsustainable complexity on the back end. The effect of ambiguity on ease of use doesn’t help either.

Strict Versioning for Services – Applying the Open/Closed Principle

In a previous post, I mentioned that I preferred a strict model of service versioning for the safety and control that it provides. In the strict model, any change results in a new contract. This is in contrast to the flexible model, which allows changes that do not break backwards compatibility, and the loose model, which supports both backwards and forwards compatibility (by eliminating any concept of contract).

The loose model generally comes in two flavors: string in/string out and generic XML. Both share numerous disadvantages:

… sending such arbitrary messages in a SOAP envelope often requires additional processing by the SOAP engine. The wire format of a message might not be very readable once it gets encoded. Moreover, you must write code manually to deal with the payload of a message. Since there is no clear definition of the message in WSDL, the web services tooling cannot generate this code, which can make such a solution more error prone. Validating messages cannot take place. If a message format changes, it might be easier to update the service interface and regenerate binding code than ensuring all consumers and providers properly handle the new format.

In the loose model, a slight advantage in terms of governance (not having to manage multiple endpoints) is far outweighed by the additional complexity and effort required to compensate for its weaknesses.

The flexible model initially seems to be a compromise. Adding an optional message element with a default value arguably allows you to make a backward compatible change without having a new endpoint. But what happens if the default value is not appropriate to all of your original consumers? Blank or null defaults may work, but only if blank or null is otherwise meaningless for the service. Additionally, changes which break backwards compatibility will require a new contract anyway. Lastly, because multiple versions share the same physical artifacts, it will be impossible to determine which versions are still in use by monitoring log files.

The strict model I prefer is essentially an application of Bertrand Meyer’s Open/Closed Principle. This principle states that “software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification”. In other words, new functionality should be implemented via new code (which may build on existing code) rather than by changing the existing code. In the words of Bob Martin:

When a single change to a program results in a cascade of changes to dependent modules, that program exhibits the undesirable attributes that we have come to associate with “bad” design. The program becomes fragile, rigid, unpredictable and unreusable. The open-closed principle attacks this in a very straightforward way. It says that you should design modules that never change. When requirements change, you extend the behavior of such modules by adding new code, not by changing old code that already works.

(“The Open-Closed Principle”, Robert C. Martin)

Applied to services, this means that all changes (with the exception of bug fixes that don’t affect the signature) result in a new service contract: endpoint, messages, entities. Assuming the service is a facade for components of the business layer, this principle can be applied (or not) to those underlying components based on whether the risk of change outweighs the redundancy introduced. This allows the impact to existing consumers of the service to be managed.
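
As an illustration, assuming a WCF implementation (purely for the sake of example; the names and namespaces are illustrative), each version gets its own contract, messages and namespace while prior versions remain untouched:

    using System.Runtime.Serialization;
    using System.ServiceModel;

    // Version 1 contract: its own endpoint, messages and namespace.
    [ServiceContract(Namespace = "http://example.com/greetings/v1")]
    public interface IGreetingsServiceV1
    {
        [OperationContract]
        GreetingResponseV1 Greet(GreetingRequestV1 request);
    }

    [DataContract(Namespace = "http://example.com/greetings/v1")]
    public class GreetingRequestV1
    {
        [DataMember] public string Sender { get; set; }
        [DataMember] public string Recipient { get; set; }
    }

    [DataContract(Namespace = "http://example.com/greetings/v1")]
    public class GreetingResponseV1
    {
        [DataMember] public string Greeting { get; set; }
    }

    // A later change would result in IGreetingsServiceV2, with its own
    // messages in a ".../greetings/v2" namespace, rather than a
    // modification of the contract above.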

Some general rules for governing service versions (from SOA World Magazine, “Design Strategies for Web Services Versioning”):

1. Determine how often versions are to be released. When considering frequency, you should consider how many versions of the Web service you want to support in parallel.

2. Understand the timeframe within which you expect consumers to move to a new version of a service. The Web services management platform may be able to provide guidance on service usage to determine the appropriate time to phase out older versions.

3. Consider releasing a pilot or an early release of a new version. Give consumers an opportunity to test compatibility and determine potential code impacts.

4. Approach Web services versioning the same way software packages might be released. Changes to your service, either as a result of bug fixes, partner requests, or specification upgrades, should follow a specific release cycle.

5. Clearly communicate your Web services versioning strategy to users of your Web service.

It should also be noted that chunkier services designed around business processes will be less likely to change frequently than fine-grained CRUD services. Additionally, such services will generally be more cohesive, making them easier to understand and use.

Like any rule, the Open/Closed Principle has its exceptions. Applying it universally to an application soon leads to a proliferation of duplicate classes and methods, some of which may no longer be used. However, when dealing with code that is directly consumed by external applications (i.e. a service you expose), then the Open/Closed Principle provides a way to avoid the pain you would otherwise incur.

Coping with change using the Canonical Data Model

According to the Greek philosopher Heraclitus of Ephesus, “Nothing endures but change”. It’s not just a common theme, but also a fundamental principle of architecture. Any design that does not allow for change is doomed from the start. By the same token, when exposing services to other applications, particularly applications that cross organizational boundaries, the signature of those services (methods, messages and data) becomes a contract that should not be broken. The Canonical Data Model provides a way to resolve this paradox.

When using a message bus, a common pattern is to use a message translator at the endpoints to transform messages to and from the canonical format. This allows the messaging system to maintain versioned services as receive locations and communicate with connected applications in their native format, while avoiding the n-squared problem (n × (n − 1) point-to-point translations versus 2n translations to and from the canonical format). Instead of requiring up to 12 translations for a four-endpoint integration, the maximum number needed would be 8. As the numbers grow, the savings quickly become more significant (for 6 endpoints, 12 instead of 30; for 8 endpoints, 16 instead of 56). Internal operations (orchestration, etc.) are simplified because only one format is dealt with.

The same pattern can be applied to service-enabled applications. As I noted in a previous post, messages and data entities will change from release to release according to the needs of the system (new features as well as changes and fixes to existing functionality). As long as all consumers of these classes are built and deployed simultaneously, all is good. Once that no longer applies, such as when another application is calling your service, then a versioning scheme becomes necessary.

While a full treatment of versioning is beyond the scope of this post, my preference is for a strict versioning scheme:

Strategy #1: The Strict Strategy (New Change, New Contract)

The simplest approach to Web service contract versioning is to require that a new version of a contract be issued whenever any kind of change is made to any part of the contract.

This is commonly implemented by changing the target namespace value of a WSDL definition (and possibly the XML Schema definition) every time a compatible or incompatible change is made to the WSDL, XML Schema, or WS-Policy content related to the contract. Namespaces are used for version identification instead of a version attribute because changing the namespace value automatically forces a change in all consumer programs that need to access the new version of the schema that defines the message types.

This “super-strict” approach is not really that practical, but it is the safest and sometimes warranted when there are legal implications to Web service contract modifications, such as when contracts are published for certain inter-organization data exchanges. Because both compatible and incompatible changes will result in a new contract version, this approach supports neither backwards nor forwards compatibility.

Pros and Cons

The benefit of this strategy is that you have full control over the evolution of the service contract, and because backwards and forwards compatibility are intentionally disregarded, you do not need to concern yourself with the impact of any change in particular (because all changes effectively break the contract).

On the downside, by forcing a new namespace upon the contract with each change, you are guaranteeing that all existing service consumers will no longer be compatible with any new version of the contract. Consumers will only be able to continue communicating with the Web service while the old contract remains available alongside the new version or until the consumers themselves are updated to conform to the new contract.

Therefore, this approach will increase the governance burden of individual services and will require careful transitioning strategies. Having two or more versions of the same service co-exist at the same time can become a common requirement for which the supporting service inventory infrastructure needs to be prepared.

In short, my reasons for preferring the strict model are summed up by the words “safety” and “control” above. Once you have exposed a service, your ability to control its evolution becomes severely limited. Changes to the signature, if meaningful, either break older clients or introduce the risk of semantic confusion. The only way around this is to have synchronized releases of both service and consumers. This is a painful process when the external consumer is developed in-house; it is doubly so when that consumer is developed by an entirely different organization. Using a strict approach decouples the service from the client. New functionality is added via new endpoints, and clients, shielded from the change until they are ready, upgrade on their own schedule (within reason).

Using the Canonical Data Model, strictly versioned services can be set up as facades over the business layer for use by external applications (whether in-house or third party). Incoming requests are translated to the canonical (internal-only) format and responses are translated from the canonical format to that required by the endpoint. Internal-only services (such as those to support a Smart Client) can use the canonical format directly.

This architecture allows for pre-processing or post-processing of external calls if needed. Individual versions of services, messages and data can be exposed, tracked, and ultimately retired in a controlled manner. The best part is that it provides these benefits while still supporting a unified business layer. This combination of flexibility and uniformity makes for a robust design.

Using extension methods for message transformation

There are many reasons to service-enable an application, from providing an integration to supporting new client types (such as mobile apps). If it’s a layered application with a message-based architecture, then the battle is half won. However, there are pitfalls to avoid, such as having the internal message format exposed to external consumers.

Message and data contracts will likely change with each release as new features are added and old ones tweaked to respond to evolving business needs. This works when all the components using those contracts are built and deployed simultaneously. Once that condition no longer applies, the schema defining messages and payloads for the external consumers must become invariant. The alternative is attempting to coordinate synchronized releases between two or more applications. This is difficult enough with two internal teams; crossing organizational boundaries can make it truly painful.

Having parallel sets of message and data contracts for internal and external consumers (internal and external being relative to the application, not the organization) allows for the internal schema to evolve while the external schema remains static. Another advantage is that the external schema can be tailored to just the functionality to be exposed. The rub is that now you have to take messages and payloads that come into the service in an external format and transform them to the internal format used by the business layer.

An extremely flexible way to handle message transformation is via serialization and XSLT. Unfortunately, to obtain the flexibility, a measure of performance must be traded. If the service operates asynchronously, then that trade-off will most likely be the optimal one. However, synchronous services, particularly those with high traffic, may find the overhead of this method to be too much. For those situations, a code-based translation approach may give the best performance (albeit sacrificing flexibility).

Having chosen a code-based approach to message transformation, the next question is how best to implement it. Common concerns will be avoiding duplication of code and dependency management. At first glance, it would appear to work if you take the internal messages and add constructors that take the external message as a parameter, as well as methods that output the external message. This centralizes the translation function to one or two methods that are co-located with the object to be translated to and from. It also introduces a new dependency into each and every assembly using the internal messages (assuming the external messages reside in their own assembly). Clearly, this is not the manner in which to implement code-based transformation.

One method to avoid both duplicate code and dependency proliferation is to create static methods on classes that reside in an assembly separate from both the internal and external schema. The translation assembly then need only be referenced by the service layer, which is the only one needing its services. While this can be done via a utility class (a la System.Convert), a more intuitive route is to set up extension methods. Extension methods are a bit of syntactic sugar added in C# 3.0 (.NET Framework 3.5) that allow you to create static methods that appear to be instance methods of the type they extend. This provides centralized code without propagating dependencies unnecessarily and has the advantage of a cleaner syntax.
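
A sketch of that approach, reusing the illustrative message names from the earlier posts (the translation class would live in its own assembly, referenced only by the service layer):

    // Translation assembly: referenced only by the service layer. The
    // extension methods make translations read like instance methods on
    // the messages being translated.
    public static class GreetingMessageTranslations
    {
        // External (published) format -> internal format
        public static GreetRequest ToGreetRequest(this GreetingRequest external)
        {
            return new GreetRequest
            {
                Sender = external.Sender,
                Recipient = external.Recipient
            };
        }

        // Internal format -> external (published) format
        public static GreetingResponse ToGreetingResponse(this GreetResponse response)
        {
            return new GreetingResponse { Greeting = response.Greeting };
        }
    }

    // Usage in the service layer facade:
    public class GreetingsService
    {
        private readonly Greeter greeter = new Greeter();

        public GreetingResponse Greet(GreetingRequest request)
        {
            return greeter.Greet(request.ToGreetRequest()).ToGreetingResponse();
        }
    }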