Form Follows Function on SPaMCast 335


It’s time for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast. This time I’m taking on Knuth’s quote: “Premature optimization is the root of all evil (or at least most of it) in programming.”

SPaMCast 335 features Tom on the meaning of effectiveness, efficiency, frameworks and methodologies; a discussion of my “Wait, did I just say Knuth was wrong?” post and an installment of Jo Ann Sweeny’s column, “Explaining Communication”, talking about content and a framework to guide the development of content.

Wait, did I just say Knuth was wrong?


In “Microservice Mistakes – Complexity as a Service”, I argued that the fine-grained nature of microservices opened up the risk of pushing complexity out to the consumers of those services. Rather than encapsulating details, microservice architectures expose them, forcing clients to know more about the internals than is common in both object-oriented and SOA traditions. In the comments, it was suggested that granularity was irrelevant as multiple granular microservices could be composed to form a coarser-grained microservice that would provide a more appropriate level of abstraction. My response was that while this is theoretically true, aggregating service calls in that manner risks issues due to network latency. This drew a response quoting Donald Knuth: “Premature optimization is the root of all evil (or at least most of it) in programming.”

Okay, in my rebuttal I did say that Knuth was wrong about this when it came to distributed systems. A better response would have been to point out that Knuth’s quote did not apply. Far from being an optimization, taking latency (as well as other network characteristics) into consideration is just basic design. For in-process calls, time to complete is a quality-of-service concern, so efforts to reduce it are optimizations. For remote calls, time to complete affects whether the system functions at all. Achieving a functional state is not an optimization.

Location-agnostic components, code that “just works” whether in-process, out of process, or over the wire, have been a Holy Grail since the days of DCOM and CORBA. The laws of physics, however, just won’t be denied. Services are not JARs and DLLs. Changing components that were designed to run in-process into ones capable of running remotely will almost certainly involve major re-work, not a little optimization.
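To see why latency is a design concern rather than a tuning knob, a back-of-the-envelope sketch helps. The call counts and timings below are illustrative assumptions, not measurements:

```python
# Rough comparison of composing a result from many fine-grained remote calls
# versus one coarse-grained call. The numbers are assumed, not measured.

IN_PROCESS_CALL = 0.000001    # ~1 microsecond per in-process method call (assumed)
NETWORK_ROUND_TRIP = 0.030    # ~30 ms per remote round trip, latency only (assumed)

def total_time(calls, cost_per_call):
    """Sequential cost of `calls` invocations at `cost_per_call` seconds each."""
    return calls * cost_per_call

# Fifty fine-grained calls, in-process vs. over the wire:
in_process = total_time(50, IN_PROCESS_CALL)       # negligible
remote_fine = total_time(50, NETWORK_ROUND_TRIP)   # ~1.5 seconds
remote_coarse = total_time(1, NETWORK_ROUND_TRIP)  # ~0.03 seconds

print(f"in-process: {in_process:.5f}s, fine-grained remote: {remote_fine:.2f}s, "
      f"coarse-grained remote: {remote_coarse:.2f}s")
```

The same fifty calls that are free in-process cost over a second when each crosses the wire, which is why aggregating granular services after the fact is re-design, not tuning.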

Fears for Tiers – Do You Need a Service Layer?


One of the perils of popularity (at least in technology) is that some people equate popularity with efficacy. Everyone’s using this everywhere, so it must be great for everything!

Actually, not so much.

No one, to my knowledge, has created the secret sauce that makes everything better with no trade-offs or side effects. Context still rules. Martin Fowler recently published “Microservices and the First Law of Distributed Objects”, which touches on this principle. In it, Fowler refers to what he calls the First Law of Distributed Object Design: “don’t distribute your objects”. To make a long story short, distributed objects are a very bad idea because object interfaces tend to be fine-grained and fine-grained interfaces perform poorly in a distributed environment. “Chunkier” interfaces are more appropriate to distributed scenarios due to the difference in context between in-process and out of process communication (particularly when out of process also crosses machine or even network boundaries). Additionally, distributed architectures introduce complexity, adding another contextual element to be accounted for when evaluating whether to use them.

In his post, under “Further Readings”, Fowler references an older post of his on Dr. Dobb’s titled “Errant Architectures” (a re-packaging of Chapter 7 of his book Patterns of Enterprise Application Architecture). In that, he discusses encountering what was a common architectural anti-pattern of the time (early 2000s):

There’s a recurring presentation I used to see two or three times a year during design reviews. Proudly, the system architect of a new OO system lays out his plan for a new distributed object system—let’s pretend it’s some kind of ordering system. He shows me a design that looks rather like “Architect’s Dream, Developer’s Nightmare” with separate remote objects for customers, orders, products and deliveries. Each one is a separate component that can be placed in a separate processing node.
I ask, “Why do you do this?”

“Performance, of course,” the architect replies, looking at me a little oddly. “We can run each component on a separate box. If one component gets too busy, we add extra boxes for it so we can load-balance our application.” The look is now curious, as if he wonders if I really know anything about real distributed object stuff at all.

The quintessential example of this style was the web site presentation layer with its business and data layer behind a set of web services. Particularly in the .Net world, web services were the cutting edge and everyone needed them. Of course, the system architect in the story was confusing scalability with performance (as were most who jumped onto the web service layer bandwagon). In order to increase the former, a hit is taken to the latter. In most cases, this hit is entirely unnecessary. A web application with an in-process back-end can be load-balanced much more simply than separate UI and service sites, without suffering the performance penalty of remote communication.

There are valid use cases for a service layer, such as when an application both provides an interactive user interface and is service-enabled. When there are multiple user interfaces, some front ends will benefit from physical separation from the back-end (native mobile apps, desktop clients, SharePoint web parts, etc.). Applications can be structured to accommodate the same back-end either in-process or out of process. This structure can be particularly important for applications that have both internal (where internal is defined as built and deployed simultaneously with the back-end) and external service clients, as these have different needs in terms of versioning. Wrapping the back-end with a service layer rather than allowing it to invade a particular service implementation can allow one back-end to serve internal and external customers either directly or via services (RESTful and/or SOAP) without violating the DRY principle.

Many of the same caveats that apply to composite applications formed from microservices (network latency, serialization costs, etc.) apply to these types of applications. It makes little sense to take on these downsides in the absence of a clear demonstrable need. Doing so looks more like falling prey to a fad than creating a well thought out design.

[Wedding Cake Image by shine oa via Wikimedia Commons.]

Eyeballing Performance


What does slow code look like?

Tony DaSilva recently tweeted:

“Jeez, this code sure looks slow” is hardly helpful and just not quantitative enough for effective decision-making.

Tony’s tweet reminded me of a time when I had to explain to a coder why the data access classes of a particular performance-sensitive application used a DataReader to fill POCO data transfer objects (DTOs). After all, we could have just used one line of code to fill a DataSet; that would be much faster. Patient soul that I am (or pedantic, it depends on who you ask), I took the time to demonstrate how one line of code that we write may involve many lines of code within the library we’re calling. In fact, filling a DataSet involves using a DataReader, thus filling DTOs from a DataSet involves iterating the results of a query twice. The size difference between the DTOs and the DataSet when serialized was a bonus lesson.
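The same trade-off can be sketched in Python with sqlite3: streaming rows straight into DTOs in one pass, versus materializing the full result set first and then copying it into DTOs, which walks the data twice and holds an intermediate structure. This is a rough analog of the DataReader/DataSet distinction, not the .NET APIs themselves, and the table and column names are made up for illustration:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class OrderDto:      # a plain DTO, analogous to a POCO
    id: int
    total: float

def load_streaming(conn):
    """One pass: fill DTOs directly as rows stream from the cursor
    (analogous to filling POCOs from a DataReader)."""
    cur = conn.execute("SELECT id, total FROM orders")
    return [OrderDto(row[0], row[1]) for row in cur]

def load_materialized(conn):
    """Two passes: materialize every row first (the 'DataSet' analog),
    then iterate the intermediate structure again to build the DTOs."""
    rows = conn.execute("SELECT id, total FROM orders").fetchall()  # pass 1
    return [OrderDto(r[0], r[1]) for r in rows]                     # pass 2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.99), (2, 24.50)])

# Identical results; the difference is the extra iteration and the
# intermediate structure, neither of which is visible in "one line of code".
assert load_streaming(conn) == load_materialized(conn)
```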

Some performance issues, notably those involving redundant work, might be detected by inspection, assuming that the redundant work is visible. In the example above, it wasn’t. Many performance issues will only become visible via profiling. More importantly, without profiling data, the relative significance of the issue can’t be determined. Saving a few microseconds in a particular section of code isn’t going to be much help if several seconds are being lost to network or database issues. This type of ad hoc response is symptomatic of more than one performance analysis anti-pattern. Performance profiling and tuning requires a holistic approach to be effective.
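Measuring instead of eyeballing can be as simple as the stdlib timeit module. A minimal sketch, using a classic string-building comparison purely as a stand-in workload:

```python
import timeit

def concat_plus(items):
    """Build a string by repeated concatenation."""
    out = ""
    for s in items:
        out += s
    return out

def concat_join(items):
    """Build the same string with a single join."""
    return "".join(items)

items = ["x"] * 10_000

# Both produce identical output; only timing data tells you whether the
# difference matters, and whether it matters relative to everything else
# (a few microseconds here is noise next to seconds lost in the database).
t_plus = timeit.timeit(lambda: concat_plus(items), number=20)
t_join = timeit.timeit(lambda: concat_join(items), number=20)
print(f"+= : {t_plus:.4f}s   join: {t_join:.4f}s")
```

The numbers, not the appearance of the code, determine whether this section is worth touching at all.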

It’s not just about better performance, it’s about better performance in the areas that make the most difference.

Why does software development have to be so hard?


A series of 8 tweets by Dan Creswell paints a familiar, if depressing, picture of the state of software development:

(1) Developers growing up with modern machinery have no sense of constrained resource.

(2) Thus these developers have not developed the mental tools for coping with problems that require a level of computational efficiency.

(3) In fact they have no sensitivity to the need for efficiency in various situations. E.g. network services, mobile, variable rates of change.

(4) Which in turn means they are prone to delivering systems inadequate for those situations.

(5) In a world that is increasingly networked & demanding of efficiency at scale, we would expect to see substantial polarisation.

(6) The small number of successful products and services built by a few and many poor attempts by the masses.

(7) Expect commodity dev teams to repeatedly fail to meet these challenges and many wasted dollars.

(8) Expect smart startups to limit themselves to hiring a few good techies that will out-deliver the big orgs and define the future.

The Fallacies of Distributed Computing are more than twenty years old, but Arnon Rotem-Gal-Oz’s observations (five years after he first made them) still apply:

With almost 15 years since the fallacies were drafted and more than 40 years since we started building distributed systems – the characteristics and underlying problems of distributed systems remain pretty much the same. What is more alarming is that architects, designers and developers are still tempted to wave some of these problems off thinking technology solves everything.


Is it really this hard to get it right?

More importantly, how do we change this?

In order to determine a solution, we first have to understand the nature of the problem. Dan’s tweets point to the machines developers are used to, although in fairness, those of us who lived through the bad old days of personal computing can attest that developers were getting it wrong back then. In “Most software developers are not architects”, Simon Brown points out that too many teams are ignorant of or downright hostile to the need for architectural design. Uncle Bob Martin, in “Where is the Foreman?”, suggests the lack of a gatekeeper to enforce standards and quality is why “our floors squeak”. Are we over-emphasizing education and underestimating training? Has the increasing complexity and the amount of abstraction used to manage it left us with too generalized a knowledge base relative to our needs?

Like any wicked problem, I suspect that the answer to “why?” lies not in any one aspect but in the combination. Likewise, no one aspect is likely, in my opinion, to hold the answer in any given case, much less all cases.

People can be spoiled by the latest and greatest equipment, as well as by the optimal performance that comes from working and testing on the local network. However, reproducing real-world conditions is a bit more complicated than giving someone an older machine. You can simulate load and traffic on your site, but understanding and accounting for competing traffic on the local network and the internet is a bit more difficult. We cannot say “application x will handle y number of users”, only that it will handle that number of users under the exact same conditions and environment as we have simulated – a subtle, but critical difference.

Obviously, I’m partial to Simon Brown’s viewpoint. The idea of a coherent, performant design just “emerging” from doing the simplest thing that could possibly work is ludicrous. The analogy would be walking into an auto parts store, buying components individually, and expecting them to “just work” – you have to have some sort of idea of the end product in mind. On the other hand, attempting to specify too much up front is as bad as too little – the knowledge needed is not there and even if it were, a single designer doesn’t scale when dealing with any system that has a team of any real size.

Uncle Bob’s idea of a “foreman” could work under some circumstances. Like Big Design Up Front, however, it doesn’t scale. Collaboration is as important to the team leader as it is to the architect. The consequences of an all-knowing, all-powerful personality can be just as dire in this role as for an architect.

In “Hordes of Novices”, Bob Martin observed “When it’s possible to get a degree in computer science without writing any code, the quality of the graduates is questionable at best”. The problem here is that universities are geared to educate, not train. Just because training is more useful to an employer (at least in the short term), does not make education unimportant. Training deals with this tool at this time while how to determine which tool is right for a given situation is more in the province of education. It’s the difference between how to do versus how to figure out how to do. Both are necessary.

As I’ve already noted, it’s a thorny issue. Rather than offering an answer, I’d rather offer the opportunity for others to add to the conversation in the comments below. How did we get here and how do we go forward?

Think asynch for performance

It’s fairly natural to think in terms of synchronous steps when defining processes to be automated. You do this, then that, followed by a third thing and then you’re done. When applied to applications, this paradigm provides a relatively simple, easy to understand flow. The problem is that each of the steps takes time and, as the steps accumulate (due to enhancements added over time), the duration of the process increases. As the duration of the process increases, the probability that a user takes exception to the wait approaches 1.

There are several approaches to dealing with this dilemma. Hoping it goes away is a non-starter. Removing functionality, unless it’s blatant gold-plating, is probably not going to be an easy sell either. One workable option is scaling up and/or out to reduce the impact of user load on the time to completion. Another is to adjust the user’s perception of performance by executing selected steps, if not entire processes, asynchronously. Combining scale up/scale out and asynchronous operation can provide further performance gains, both actual and perceived.

When scaling up, the trade-off for increased performance is normally the cost of the hardware resources (processor/cores, memory, and/or storage). Either price or physical constraints will prove the limiting factor. For scaling out, the main trade-off will again be price, with a secondary consideration of some added complexity (e.g. session state considerations when load balancing a web application). Price would generally provide more of a limit to this method than any physical considerations. Greater complexity will be the main trade-off for asynchronous operations. Costs can also be incurred if coupled with one or more of the scaling methods above and/or if using commercial tools to host the asynchronous steps.

While a full treatment on the tools available to support asynchronous execution is beyond the scope of this post, they include everything from full-blown messaging systems to queuing to home-grown solutions. High-end tools can be expensive, complex and have considerable hardware requirements. However, when evaluating methods, it is important to include maintenance considerations. A complex system developed in-house may soon equal the cost of a third party product when you factor in both initial development and on-going maintenance.

Steps from a synchronous process can be carved out for asynchronous execution provided they meet certain criteria. These steps must be those that do not change the semantics of the process in the event of a failure or affect the response to the caller. Additionally, such steps must be amenable to retry logic (either automated or manual) or compensating transactions in order to deal with any failures. Feedback to the user must either be handled by periodically checking for messages or by the caller providing a callback mechanism. These considerations are a significant source of the additional complexity incurred.
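One way to make a carved-out step safe is a small retry wrapper around an idempotent operation. A minimal sketch; the flaky step below is hypothetical, standing in for something like sending an email:

```python
import time

def run_with_retry(step, attempts=3, delay=0.0):
    """Run an idempotent, carved-out step, retrying on failure.
    Re-raises after the final attempt, at which point a compensating
    action or a manual retry has to take over."""
    last_error = None
    for _ in range(attempts):
        try:
            return step()
        except Exception as exc:
            last_error = exc
            time.sleep(delay)  # back off before the next try
    raise last_error

# Hypothetical flaky step: fails twice with a transient error, then succeeds.
calls = {"n": 0}
def send_document():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "sent"

assert run_with_retry(send_document) == "sent"
assert calls["n"] == 3  # two failures absorbed, third attempt succeeded
```

Note that the wrapper only works because the step is reproducible; a step whose partial failure changes the semantics of the process needs a compensating transaction instead.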

As an example, consider a process where:

  • the status of an order is updated
  • a document is created and emailed to the participant assigned to fulfill the order
  • an accounting entry is made reflecting the expected cost of fulfilling the order
  • an audit entry is made reflecting the status change
  • the current state of the order is returned to the caller

Over a number of releases, the process is enhanced so that now:

  • the status of an order is updated
  • a document is created and emailed to the participant assigned to fulfill the order
  • an accounting entry is made reflecting the expected cost of fulfilling the order
  • an audit entry is made reflecting the status change
  • a snapshot of the document data is saved
  • a copy of the document is forwarded to the document management system
  • the current state of the order is returned to the caller

The creation and emailing of the document, as well as sending it to the document management system, are prime candidates for asynchronous execution. Both tasks are reproducible in the event of a failure and neither leaves the system in an unstable state should they fail to complete. Because the interaction with the document management system crosses machine boundaries and involves a large message, the improvement should be noticeable. As noted above, the trade-offs for this will include the additional complexity in dealing with failures (retry logic: manual, automatic, or both) as well as the costs and complexity associated with whatever method is used to accomplish the step out of band.
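Carving those two steps out can be sketched with a thread pool standing in for whatever queuing or messaging infrastructure actually hosts the out-of-band work; the step functions and order id are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a queue or message bus hosting the asynchronous steps.
background = ThreadPoolExecutor(max_workers=2)

def update_status(order_id):     return f"order {order_id}: fulfilling"
def record_accounting(order_id): pass
def write_audit(order_id):       pass
def email_document(order_id):    return f"emailed document for order {order_id}"
def send_to_dms(order_id):       return f"document for order {order_id} sent to DMS"

def process_order(order_id):
    # Synchronous steps: these determine the semantics and the response.
    state = update_status(order_id)
    record_accounting(order_id)
    write_audit(order_id)
    # Carved-out steps: reproducible on failure, and the caller's response
    # does not depend on their completion.
    background.submit(email_document, order_id)
    background.submit(send_to_dms, order_id)
    return state  # returned without waiting on the slow steps

print(process_order(42))
```

The caller gets the order state back as soon as the synchronous steps finish, while the document work proceeds in the background; the retry and feedback machinery discussed above attaches to the submitted tasks.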

Implementing a process that is entirely asynchronous can have a broader range of complexity. Simple actions that require little feedback to the user, such as re-sending the email from the example above, should be no more difficult than the carved out step. Processes that require greater user interaction will require more design and development effort to accomplish. Processes that were originally implemented in a synchronous manner will require the most effort.

Converting a process from a synchronous communications model to an asynchronous one will require substantial refactoring across all layers. When this refactoring crosses application boundaries (as in the case of a service application supporting other client applications), then the complexity increases. Crossing organizational boundaries (e.g. company to company) will entail the greatest complexity. In these cases, providing a new asynchronous method while maintaining the old synchronous one will simplify the migration. Trying to coordinate multiple releases, particularly across enterprises, is asking for an ulcer.

In spite of the number of times I’ve used the word “complexity” in this post, my intent is not to discourage the use of the technique. Where appropriate, it is an extremely powerful tool that can allow you to meet the needs of the customer without requiring them to pick between function and performance. They tend to like the “have your cake and eat it too” scenario.