Technical Debt & Quality – Binary Thinking in an Analog World

How many shades of gray?

I admit it, I’m a pragmatist.

Less than two weeks after starting this blog, I posted “There is no right way (though there are plenty of wrong ones)”, proclaiming my adherence to the belief that absolutes rarely stand the test of time. In design as well as development process, context is king.

Some, however, tend to take a more black and white approach to things. I recently saw an assertion that “Quality is not negotiable” and that “Only Technical Debt enthusiasts believe that”. By that logic, all but a tiny portion of working software professionals must be “Technical Debt enthusiasts”, because if you’re not the one paying for the work, then the decision about what’s negotiable is out of your hands. Likewise, there’s a difference between being an “enthusiast” and recognizing that trade-offs are sometimes required.

Seventeen years ago, Fast Company published “They Write the Right Stuff”, showcasing the quality efforts of the team working on the code that controlled the space shuttle. Their results were impressive:

Consider these stats: the last three versions of the program — each 420,000 lines long — had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

Impressive results are certainly in order given the criticality of the software:

The group writes software this good because that’s how good it has to be. Every time it fires up the shuttle, their software is controlling a $4 billion piece of equipment, the lives of a half-dozen astronauts, and the dreams of the nation. Even the smallest error in space can have enormous consequences: the orbiting space shuttle travels at 17,500 miles per hour; a bug that causes a timing problem of just two-thirds of a second puts the space shuttle three miles off course.

It should be noted, however, that while the bug rate is infinitesimally small, it’s still greater than zero. With a defined hardware environment, highly trained users, and a process that consumed a budget of $35 million annually, perfection was still out of reach. Reality often tramples over ideals, particularly considering that technical debt can arise from changing business needs and changing technical environments as much as from sloppy practice. Recognizing that circumstances may make taking on technical debt the better choice, and then managing that debt, is more realistic than taking a dogmatic approach.

For most products, it’s common to find multiple varieties with different features and different levels of quality with the choice left to the consumer as to which best suits his/her needs. It’s rare, and rightly so, for that value judgment to be taken out of the consumer’s hands. Taking the position that “quality is not negotiable” (with the implicit assertion that you are the authority on what constitutes quality) places you in just that position of dictating to your customer what is in their best interests. Under the same circumstance, what would be your reaction?

Risky Assumptions and the Assumption of Risk

Long ago, in a land far away, a newly minted developer received a lesson in the danger of untested assumptions. There was an export job that extracted and transmitted to the state data about county jail inmates each month for reimbursement. Having developed this job, our hero was called upon to diagnose and correct an error that the state’s IT staff reported: the county had claimed reimbursement for an inmate held in a nearby city. This was proven by the fact that the Social Security Number submitted for the county’s inmate was identical to that submitted for the city’s inmate.

After much investigation, the plucky newbie determined that the identity of the county inmate was correct (fingerprints, and all that). The person submitted by the city was actually a former roommate of the person in question who had been admitted to the city jail under the borrowed identity of his old pal. It was truly shocking to realize that someone who had a lengthy criminal record would stoop to using another’s identity (although he did deserve kudos for being a pioneer – this was the mid-90s, when identity theft wasn’t yet in vogue).

What was even more shocking was the fix to be used: the county should re-submit the record with a “999-99-9999” value and the state would generate a fake SSN to be used for the remainder of the inmate’s incarceration. Since the city’s submission was first in, it would have to be considered “correct”.

Really???

The truly wonderful thing about that story is that it illustrates so many potential issues that can result from an assumption being allowed to roam unchallenged. Needless error conditions were created that delayed the business process of getting reimbursed. The validity of data in the system was compromised: across multiple incarcerations the same inmate could have multiple identities and multiple inmates could share the same identity over the life of the system.

Just as you can have technical debt, you can have cognitive debt by failing to adequately think through your design. One bad assumption can snowball into others (e.g. the “first in equals correct” rule could only be a poor reaction to the belated realization that the identity logic was unable to guarantee uniqueness). Just as collaboration can help avoid design issues, so too can adequate analysis of the problem space. Making a quick and dirty assumption and running with it leaves you at risk of wasting a lot of time and money.
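To make the uniqueness problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Submission` record, the `first_in_wins` and `flag_conflicts` functions, the fingerprint identifiers, and the sample data are illustrations only, not the state’s actual system. It contrasts the “first in equals correct” rule with simply treating a duplicate SSN as a conflict to be verified.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One hypothetical monthly record sent to the state."""
    agency: str
    ssn: str
    fingerprint_id: str  # assumed: a verified biometric identifier


def first_in_wins(submissions):
    """Mimic the flawed rule: the first record seen for an SSN is 'correct'."""
    accepted, rejected = {}, []
    for sub in submissions:
        if sub.ssn in accepted:
            rejected.append(sub)
        else:
            accepted[sub.ssn] = sub
    return accepted, rejected


def flag_conflicts(submissions):
    """Alternative: report every SSN claimed by more than one distinct person."""
    people_by_ssn = defaultdict(set)
    for sub in submissions:
        people_by_ssn[sub.ssn].add(sub.fingerprint_id)
    return {ssn: ids for ssn, ids in people_by_ssn.items() if len(ids) > 1}


# Made-up sample data: the city's record arrives first under a borrowed SSN.
submissions = [
    Submission("city", "000-12-3456", "FP-IMPOSTOR"),
    Submission("county", "000-12-3456", "FP-OWNER"),
]

accepted, rejected = first_in_wins(submissions)
conflicts = flag_conflicts(submissions)
```

Under `first_in_wins`, the city’s record for the impostor is accepted and the county’s verified record is the one rejected. `flag_conflicts` instead treats the SSN as a claim that still needs checking, which surfaces the collision without declaring either party correct.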

[Photograph by Alex E. Proimos via Wikimedia Commons.]

Your Code is not Enough

That tells me what, but why?

Although the Agile Manifesto proclaims a preference for “Working software over comprehensive documentation”, that’s a far cry from suggesting no documentation. In fact, the manifesto specifically notes “…there is value in the items on the right”. Some can be tempted to equate the code with the design, but as Ruth Malan has pointed out, this is erroneous:

… but to indicate that the code is the design language and it is the full (and most accurate) expression of the design misses key points. For example, it misses the “negative space” (things we don’t do) directed by the design. It misses the notion that design is an abstraction or conception, and not just any conception — the design is conceived just-so*, and there is a premise (or a conjoint set) that links intent (or aspiration or purpose) with the particular form the design takes, the organization, the elements, their relationships, their articulation or interaction points, their collaborative interactions, and more. The code contains neither the abstraction** nor the premise. Sure, we want the code to speak to the design, to realize the design and to imply and signify and convey the design as best the code can. And if we create the design in the process of writing the code, simultaneously thinking about design issues and code detail, the point still holds — we want the code to be as expressive of the design as we can reasonably make it. But both the code and informal tribal memory are going to be missing bits, so it is good to write and draw the design as design out. Or at least the architecture — the strategically and structurally significant bits. To draw out the relationship of the system to its various contexts, the organization of architectural abstractions (or elements) and their interrelationships and the key mechanisms — with diagrams and descriptions — and explicating the reasoning that drove those design choices (and eliminated others).

* When I say “just so” I don’t by any means mean all at once, but rather that the conception is particular. It is a set of choices (sometimes explicitly reasoned, sometimes more intuitively and implicitly or subconsciously arrived at) that we either can defend or need to try out. We apply insights from experience and knowledge that has been distilled over many experiences, and reason our way to the design approach. And then we test it. Of course.

** Abstractions, yes. Abstractions indicated by the design. But not the design abstraction that contains little of what is in code statements but much of what is in code relationships and form, and in what is not in the code, what is specifically, designedly, not done.

Simon Brown, in his July 2012 Skills Matter presentation, “The code doesn’t tell the whole story”, listed several attributes of a system that code is generally insufficient to document:

  • The “big picture” context of the system
  • Quality of service requirements significant to the architecture
  • Architecturally significant constraints on the design
  • Guiding principles of the design
  • The physical environment (infrastructure) the system will operate in
  • Deployment locations of the code components
  • Operational aspects (monitoring, management, security, disaster recovery)
  • Security

In short, the code conveys the ‘what’ but not the ‘why’ (arguably the more important aspect). It defines ‘what is’ but is silent regarding ‘what should be’, ‘how fast’, etc. We cannot readily infer from the code the physical layout of the system nor its operating environment.

Although there is a need for documentation beyond the code, the traditional dangers (divergence between the documentation and reality, accessibility of the artifact, audiences ignoring artifacts, etc.) still apply. Documentation needs to be accurate and tailored to the task at hand in order to be effective. Quantity is far from the same thing as quality in this case.

Documentation that can be automatically generated from code, database schema, and other artifacts rather than maintained manually will help prevent divergence. Manual maintenance should be reserved for those things that cannot be generated and only when there is a need for continuity. Documenting architecturally significant design decisions and the rationale behind them makes sense in that the information will apply across a large portion, if not all, of the product’s lifecycle. Lower-level, tactical design documents (where used) will only be useful during the release for which they were created; time spent creating and maintaining these artifacts should be minimized.
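As a small illustration of generating documentation from the code itself, here is a sketch in Python that uses the standard library’s `inspect` module. The `generate_reference` helper and the two sample functions are hypothetical names invented for this example; a real project would more likely rely on an established documentation generator.

```python
import inspect


def generate_reference(functions):
    """Build a plain-text reference from signatures and docstrings."""
    lines = []
    for func in functions:
        signature = inspect.signature(func)
        # getdoc returns None when a function has no docstring.
        summary = inspect.getdoc(func) or "(undocumented)"
        lines.append(f"{func.__name__}{signature}")
        lines.append(f"    {summary.splitlines()[0]}")
    return "\n".join(lines)


# Hypothetical functions standing in for real application code.
def monthly_export(county, month):
    """Extract inmate records for one county and month."""
    return []


def resubmit(record_id):
    return None


reference = generate_reference([monthly_export, resubmit])
print(reference)
```

Because the reference is rebuilt from the source on every run, it cannot drift from the signatures, and gaps show up as “(undocumented)”. What it cannot supply is the rationale behind the design, which is the part that still merits manual effort.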

Clear communication should be a top priority for all documentation. Diagrams can prove the old saying that a picture is worth a thousand words when the focus is on clarity. Semantics will tend to be more important than style or syntactic correctness. The audience and intended use of an artifact will also affect its usefulness as documentation. Automated tests may be useful in conveying some types of information to technically savvy audiences, but may be less useful to others. As with code, you should determine whether the artifact can convey the information you want understood.

A beneficial side-effect of creating lightweight design documentation is that it assists in the collaboration process. Describing the design helps get it out of your head (as well as those of your collaborators) and into a format more conducive to inspection and evaluation. Sometimes the mere organization of your thoughts in order to communicate them allows you to see potential problems.

What do we mean by “architecture”?

Some things last longer than others

Every systematic development of any subject ought to begin with a definition, so that everyone may understand what the discussion is about.

Marcus Tullius Cicero (106 BC ‒ 43 BC), De Officiis, Book 1, Moral Goodness (h/t to Glen Alleman, October 8, 2013 Quote of the Day)

One of the joys of the English language is the overloading of words with multiple meanings. While it’s not hard to start an argument over technical issues under normal circumstances, having multiple definitions to conflate makes it that much easier. Take, for example, the word “architecture”. While different dictionaries give slightly different definitions (e.g. Merriam-Webster, Dictionary.com, and Wiktionary), the common set of definitions boils down to:

While there is a relationship between these concepts (a person practicing architecture might create an architectural design using one or more architectural styles), there is enough differentiation to cause trouble if we don’t clarify our terms. Far from being an academic exercise, paying attention to semantics can actually save time and frustration when there is ambiguity present.