Wait, did I just say Knuth was wrong?


In “Microservice Mistakes – Complexity as a Service”, I argued that the fine-grained nature of microservices opened up the risk of pushing complexity out to the consumers of those services. Rather than encapsulating details, microservice architectures expose them, forcing clients to know more about the internals than is common in both object-oriented and SOA traditions. In the comments, it was suggested that granularity was irrelevant as multiple granular microservices could be composed to form a coarser-grained microservice that would provide a more appropriate level of abstraction. My response was that while this is theoretically true, aggregating service calls in that manner risks issues due to network latency. This drew a response quoting Donald Knuth: “Premature optimization is the root of all evil (or at least most of it) in programming.”

Okay, in my rebuttal I did say that Knuth was wrong about this when it came to distributed systems. A better response would have been to point out that Knuth’s quote did not apply. Far from being an optimization, taking latency (as well as other network characteristics) into consideration is just basic design. For in-process calls, meeting a given completion time is a quality-of-service concern, which makes efforts to reduce that time optimizations. For remote calls, meeting a given completion time affects whether the operation functions at all. Achieving a functional state is not an optimization.

Location-agnostic components, code that “just works” whether in-process, out of process, or over the wire, have been a Holy Grail since the days of DCOM and CORBA. The laws of physics, however, just won’t be denied. Services are not JARs and DLLs. Changing components that were designed to run in-process into ones capable of running remotely will almost certainly involve major re-work, not a little optimization.
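To put rough numbers on the physics, here’s a back-of-the-envelope sketch. The latency figures are illustrative assumptions (roughly 100 ns for a local method call, roughly 1 ms for a remote round trip on a fast network), not measurements; real numbers vary widely by environment.

```python
# Illustrative latencies (assumptions, not measurements), in microseconds.
IN_PROCESS_CALL_US = 0.1    # assumed ~100 ns per in-process method call
REMOTE_CALL_US = 1_000.0    # assumed ~1 ms per remote round trip

def aggregate_latency_us(call_count: int, per_call_us: float) -> float:
    """Total latency for call_count sequential calls."""
    return call_count * per_call_us

CALLS = 50  # a coarser-grained operation composed of 50 fine-grained calls

local_us = aggregate_latency_us(CALLS, IN_PROCESS_CALL_US)
remote_us = aggregate_latency_us(CALLS, REMOTE_CALL_US)

print(f"in-process: {local_us:,.1f} us")   # 5 microseconds
print(f"remote:     {remote_us:,.1f} us")  # 50 milliseconds
print(f"slowdown:   {remote_us / local_us:,.0f}x")
```

Even with generous assumptions, the same composition that is effectively free in-process eats a 50 ms chunk of any response-time budget over the wire, which is why the decision belongs at design time, not in a later tuning pass.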

11 thoughts on “Wait, did I just say Knuth was wrong?”

  1. IBM’s CICS transaction processing environment has had “location agnostic” interprogram call, “code that ‘just works’ whether in-process, out of process, or over the wire”, for over 25 years. It started in the version of CICS for OS/2, with calls from it to CICS/MVS, and later extended from mainframe CICS to other mainframe instances and other platforms.

    It is known as Distributed Program Link. The calling program may specify in which server instance the target program resides, but the better practice is to let CICS make that determination, based on information provided by the SYSADMIN in a “Processing Program Table” entry, or selected by another program, on the fly.

    Either way, the same application calling syntax and compiled executable reaches the called program whether it is in the same server instance (in process), in another server instance on the same OS instance (interprocess), or on another machine. That other machine could use a different character set code page and endian characteristics.

    Twenty-something years ago, I led the design and implementation of a framework that extended this interoperability to (a) calls from outside CICS, (b) calls to programs running under the IMS Transaction Manager, and (c) calls from programs running under the IMS Transaction Manager.


    • David,

      The question is not whether you can call components identically whether they’re in-process, out of process, or over the wire. It’s whether you can do so under load without having the whole thing time out on you. Simple physics dictates that the time to make x calls is going to increase the farther out you go – if the method that’s aggregating those calls has any time constraint, then location is going to be significant.


  2. You are quite right. When somebody said, “This makes the location of the called program transparent to the user,” my response was, “As long as he doesn’t have a watch.” BTW, I hated that “transparent to the user” phrase. Here the location was merely syntactically invisible to the calling program. Of course, the transparency disappeared if the called program tried to access the calling transaction’s scratch pad by memory address, or returned a pointer to memory that the caller would then use. IBM eventually provided a program, the CICS Interdependency Analyzer ( http://www.slideshare.net/IBM_CICS/cics-ia-v52datasheetgi13333900 ), to flag these and other dependencies that forced resources to be owned by the same process or thread as something else.


  3. Glad to hear that Prof. Knuth is still right!

    For all the time I have worked in computing, there have been limits imposed by Mother Nature.
    We put some instructions into the 8 cells of faster memory.
    We re-wrote some subroutines in an assembler.
    We re-arranged program linking order to reduce the number of page faults.
    We moved some functionality from a mainframe to end-points.

    Prof. Knuth’s quote is about the basis of optimisation: the static view of a program (see fig 3 from http://improving-bpm-systems.blogspot.ch/2014/10/improvement-software-program.html ) or the dynamic view of a program (see fig 4 from http://improving-bpm-systems.blogspot.ch/2014/10/improvement-software-program.html ).

    Knowing the performance-critical paths, it is possible to re-arrange the topology of a distributed application – for example, temporarily moving some micro-services from a far-unlimited-cheap cloud to a near-limited-expensive in-house computing environment.

    Thanks,
    AS


    • “For example, temporarily moving some micro-services from a far-unlimited-cheap cloud to a near-limited-expensive in-house computing environment.”

      Now that’s a far better example of an optimization.


    • Cross-process calls on one box would have performance issues (less than over the wire, but still far worse than in-process) with almost none of the benefits (elastic load balancing, etc.). You would only have process isolation as a consolation prize.


  4. Pingback: Form Follows Function on SPaMCast 335 | Form Follows Function

  5. I think the root of Knuth’s statement, and the reason it still applies to network calls, is this: CONTEXT.

    Premature optimisation is premature by definition because it lacks context. For an algorithm, perhaps we don’t know the speed of the CPU in production, or the amount of memory, or what other software will be using the same box, or what the data set is that will be run through it, or how often it will be called in the context of a larger process.

    Most typically missing from the context is an expectation against which to benchmark, and here we come to network calls. The argument here is that we should avoid them, but without any context of the expected completion time. If the response is expected in 5 seconds and a network call costs 50 millis, it’s probably irrelevant! If the response is expected in 500 millis, well then we might have an issue. The context changes whether optimisation is premature or not.

    Now, one might say that if you can make it faster, you SHOULD make it faster, and removing network calls does that. But this is where Knuth says “No”. We should optimise only based on the constraints and expectations of our context, not towards the global minimum. Anything else is ‘waste’ in the true Lean sense: optimising something which is not a constraint on the system.
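    The budget arithmetic above can be made concrete. A minimal sketch, using the hypothetical figures from this comment (a 50 ms network call judged against two different response-time budgets):

    ```python
    # Hypothetical figures: one network call costing 50 ms, compared
    # against a 5-second budget and a 500 ms budget.
    NETWORK_CALL_MS = 50.0  # assumed cost of one network round trip

    def budget_share(budget_ms: float, calls: int = 1) -> float:
        """Fraction of the response-time budget consumed by network calls."""
        return (calls * NETWORK_CALL_MS) / budget_ms

    print(f"{budget_share(5000):.0%} of a 5 s budget")    # 1% -- probably irrelevant
    print(f"{budget_share(500):.0%} of a 500 ms budget")  # 10% -- worth examining
    ```

    Same call cost, different contexts, opposite conclusions – which is exactly why the optimisation question can’t be answered without the budget.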


    • Agreed re: context. It’s important to consider also, that context includes not only actual performance and desired performance, but also what other trade-offs are being satisfied. Sometimes latency is traded for other benefits. Just as it’s short-sighted to jump straight to a distributed model without being able to articulate why the choice is being made, it’s equally short-sighted to criticize the choice without bothering to understand the rationale behind it.


  6. Pingback: Laziness as a Virtue in Software Architecture | Form Follows Function
