Stopping Accidental Technical Debt

Buster Keaton looking at a poorly constructed house

In one of my earlier posts about technical debt, I differentiated between intentional debt (that taken on deliberately and purposefully) and accidental debt (that which just accrues over time without rhyme or reason or record). Dealing with (in the sense of evaluating, tracking, and resolving it) technical debt is obviously a consideration for someone in an application architect role. While someone in that role absolutely should be aware of the intentional debt, is there a way to be more attuned to the accidental debt as well?

Last summer, I published a post titled “Distance…is the one true enemy…”. The post started with a group of tweets from Gregory Brown talking about the corrosive effects of distance on software development (distance between compile and run, between failure and correction, between development and feedback, etc.). I then extended the concept to management, talking about how distance between sense-maker and decision-maker could negatively affect the quality of the decisions being made.

There’s also a distance that neither Greg nor I covered at the time, design distance. Design distance is the distance between the design and the outcome. Reducing design distance makes it easier to keep a handle on the accidental debt as well as the intentional.

Distance between the architectural decisions and the implementation can introduce technical debt. This distance can come from remote decision-makers, architecture pigeons who swoop in, deposit their “wisdom”, and then fly away home. It can come from failing to communicate the design considerations effectively across the entire team. It can also come from failing to monitor the system as it evolves. The design and the implementation need to be in alignment. Even more so, the design and the implementation need to align with particular problems to be solved/jobs to be done. Otherwise, the result may look like this:

Distance between development of the system and keeping the system running can introduce technical debt as well. The platform a system runs on is a vital part of the system, as critical as the code it supports. As with the code, the design, implementation, and context all need to be kept in alignment.

Alignment of design, implementation, and context can only be maintained by on-going architectural assessment. Stefan Dreverman’s “Using Philosophy in IT architecture” identified four questions to be asked as part of an assessment:

  1. “What is my purpose?”
  2. “What am I composed of?”
  3. “What’s in my environment?”
  4. “What do I communicate?”

These questions are applicable not only to the beginning of a system, but throughout its life-cycle. Failing to re-evaluate the architecture as a whole as the system evolves can lead to inconsistencies as design distance grows. We can get so busy dealing with the present that we create a future of pain:

At first glance, this approach might seem to be expensive, but rewriting legacy systems is expensive as well (assuming the rewrite would be successful, which is a tenuous assumption). Building applications with a one-and-done mindset is effectively building a legacy system.

Advertisement

Pride, Prejudice, and Professionalism in the Business of IT

interior of a 1958 Plymouth Savoy

Twenty-plus years in IT have led me to believe that there are very few absolutes when it comes to software systems. Two that do seem to hold true are these:

  1. Creating systems is esteemed far more highly than maintaining systems.
  2. Systems that are not maintained, will decay.

There are a variety of reasons for this situation, many of which are baked into the architecture of the enterprise. Regardless of the why, however, the two facts remain. Without a response to those issues, entropy is inevitable.

Over the past few days, I’ve seen several blog posts by two different authors dealing with this situation in two different ways:

Jason complains about the non-technical “leadership class” in his first post:

And hence we get someone making the big decisions about healthcare who knows nothing about medicine or about running hospitals or ambulance services. And we get someone in charge of all the schools who knows nothing about teaching or running a school. And we get someone in charge of a major software company whose last job was being in charge of a soft drinks company. And so on.

Again, this is fine, if they leave the technical decisions to the technical experts. And that’s where it all falls down, of course. They don’t.

The guy in charge of the NHS insists on telling doctors and nurses how they should do their jobs. The woman in charge of UK schools insists on overriding the expertise of teachers. The guy in charge of a major software company refuses to listen to the engineers about the need for automated testing. And so on.

This is the Dunning-Kruger effect writ large. CEOs and government ministers are brimming with the overconfidence of someone who doesn’t know that they don’t know.

In his second post he follows up with how to respond:

My pithy – but entirely serious – advice in that situation is Do It AnywayTM.

There are, of course, obligations implied when we Do It AnywayTM. What we’re doing must be in the best interests of stakeholders. Do It AnywayTM is not a Get Out Jail Free card for any decision you might want to justify. We are making informed decisions on their behalf. And then doing what needs to be done. Y’know. Like professionals.

I disagree. Strenuously.

If you go to the doctor and they tells you that you will need surgery at some point for some condition, would you expect to be forcibly admitted and operated on immediately?

If you were charged with a crime, would you expect your attorney to accept a plea bargain on your behalf without consultation or prior permission?

If neither of those professionals would usurp the right of their client/patient to make their own informed decision, why should we? Both of those examples would be considered malpractice and the first would be criminal assault in addition. Therefore, I disagree that acting on someone’s behalf without their knowledge or consent is a viable option.

John’s approach, rejecting helplessness and confronting the issues by communicating the costs (with justifications/evidence) is, in my opinion, the truly professional approach. We have a responsibility to make the problem visible and continue making it visible. We also have a responsibility to operate within the limits we’re given. We know far more about our area than someone higher up the management chain, but, that does not equate to knowing more in general than those higher up the management chain. Ignorance is relative. Micro-managing, getting deeper in the weeds than you need to is ineffective. If, however, you’re in the weeds, do you have the information necessary to say that the issue being “interfered with” is one without higher-level consequences? Dunning-Kruger can cut a wide swathe. Trust needs to cut both ways.

Imagine riding as a passenger in a car. You see the car drifting closer and closer to the shoulder. Do you point it out to the driver or do you just grab the wheel? You might prevent an accident or you just might cause one by steering into a vehicle coming up from behind that you didn’t see from your vantage point.

[Plymouth Savoy photo by Christopher Ziemnowicz via Wikimedia Commons]

Form Follows Function on SPaMCast 426

SPaMCAST logo

One of the benefits of being a regular on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast is getting to take part in the year-end round table (episode 426). Jeremy Berriault, Steve Tendon, Jon M. Quigley and I joined Tom for a discussion of:

  1. Whether software quality would be a focus of IT in 2017
  2. Whether Agile is over, at least as far as Agile as a principle-driven movement
  3. Whether security will be more important than quality and productivity in the year ahead

It was a great discussion and, as Tom noted, a great way to finish off the tenth year of the SPAMCast and kickoff year eleven.

You can find all my SPaMCast episodes using under the SPAMCast Appearances category on this blog. Enjoy!

Storming on Design

From Wikimedia: VORTEX2 field command vehicle with tornado in sight. Wyoming, LaGrange.

My youngest son has recently fallen in love with the idea of being a storm chaser when he gets older. Tweet storms are more my speed. There was an interesting one last week from Sarah Mei regarding the contextual nature of assessing design quality:

Context is a recurring theme for me. While the oldest post with that tag is just under three years old, a search on the term finds hits going all the way back to my second post in October, 2011. Sarah’s tweets resonated with me because in my opinion, ignoring context is a fool’s game.

Both encapsulation and silos are forms of separation of concerns. What differentiates the two is the context that makes the one a good idea and the other a bad idea. Without the context, you can come up with two mutually exclusive “universal” principles.

A key component of architectural design, is navigating the fractals that make up the contexts in which a system exists. Ruth Malan has had this to say regarding the importance of designing “outside the box”:

Russell Ackoff urged that to design a system, it must be seen in the context of the larger system of which it is part. Any system functions in a larger system (various larger systems, for that matter), and the boundaries of the system — its interaction surfaces and the capabilities it offers — are design negotiations. That is, they entail making decisions with trade-off spaces, with implications and consequences for the system and its containing system of systems. We must architect across the boundaries, not just up to the boundaries. If we don’t, we naively accept some conception of the system boundary and the consequent constraints both on the system and on its containing systems (of systems) will become clear later. But by then much of the cast will have set. Relationships and expectations, dependencies and interdependencies will create inertia. Costs of change will be higher, perhaps too high.

This interrelationship can be seen from the diagram taken from the same post:

System Context illustration, Ruth Malan

It’s important to bear in mind that contexts are multi-dimensional. All but the very simplest of systems will likely have multiple types of stakeholders, leading to multiple, potentially conflicting contexts. Accounting for these contexts while defining the problem and while designing a solution appropriate to the problem space is critical to avoiding the high costs Ruth referred to above.

Another takeaway is that context can (and likely will) change over time. Whether it’s changes in terms of staffing (as Sarah noted) or changes in the needs of users or changes in technology, a design that was fit for yesterday’s context can become unfit for today’s and a disaster for tomorrow’s.

Dealing with Technical Debt Like We Mean it

What’s the biggest problem with technical debt?

In my opinion, the biggest problem is that it works. Just like the electrical outlet pictured above, systems with technical debt get the job done, even when there’s a hidden surprise or two waiting to make life interesting for us at some later date. If it flat-out failed, getting it fixed would be far easier. Making the argument to spend time (money) changing something that “works” can be difficult.

Failing to make the argument, however, is not the answer:

Brenda Michelson‘s observation is half the battle. The argument for paying down technical debt needs to be made in business-relevant terms (cost, risk, customer impact, etc.). We need more focus on the “debt” part and remember “technical” is just a qualifier:

The other half of the battle is communicating, in the same business-relevant manner, the costs and/or risks involved when taking on technical debt is considered:

Tracking what technical debt exists and managing the payoff (or write off, removing failed experiments is a reduction technique) is important. Likewise, managing the assumption of technical debt is critical to avoid being swamped by it.

Of course, one could take the approach that the only acceptable level of technical debt is zero. This is equivalent to saying “if we can’t have a perfect product, we won’t have a product”. That might be a difficult position to sell to those writing the checks.

Even if you could get an agreement for that position, reality will conspire to frustrate you. Entropy emerges. Even if the code is perfected and then left unchanged, the system can rot as its platform ages and the needs of the business change. When a system is actively maintained over time without an eye to maintaining a coherent, intentional architecture, then the situation becomes worse. In his post “Enterprise Modernization – The Next Big Thing!”, David Sprott noted:

The problem with modernization is that it is widely perceived as slow, very expensive and high risk because the core business legacy systems are hugely complex as a result of decades of tactical change projects that inevitably compromise any original architecture. But modernization activity must not be limited to the old, core systems; I observe all enterprises old and new, traditional and internet based delivering what I call “instant legacy” [Note 1] generally as outcomes of Agile projects that prioritize speed of delivery over compliance with a well-defined reference architecture that enables ongoing agility and continuous modernization.

Kellan Elliot-McCrea, in “Towards an understanding of technical debt”, captured the problem:

All code is technical debt. All code is, to varying degrees, an incorrect bet on what the future will look like.

This means that assessing and managing technical debt should be an ongoing activity with a responsible owner rather than a one-off event that “somebody” will take care of. The alternative is a bit like using a credit card at every opportunity and ignoring the statements until the repo-man is at the door.

What We Have Here is a Failure to Communicate

“What we’ve got here is failure to communicate” and it appears to be epidemic. My own personal grand unified theory of everything is that most problems stem from or are aggravated by a lack of communication. Whether the topic is process, governance, planning, estimates, or design, chances are it’s easier to find opinions (and worse, policy and practices) based on one-sided viewpoints than a balanced understanding of the contexts involved. This is dangerous due to the simple fact that organizations are social systems (frequently fractal systems of systems) and as Ruth Malan has noted:

Russell Ackoff urged that to design a system, it must be seen in the context of the larger system of which it is part. Any system functions in a larger system (various larger systems, for that matter), and the boundaries of the system — its interaction surfaces and the capabilities it offers — are design negotiations. That is, they entail making decisions with trade-off spaces, with implications and consequences for the system and its containing system of systems. We must architect across the boundaries, not just up to the boundaries. If we don’t, we naively accept some conception of the system boundary and the consequent constraints both on the system and on its containing systems (of systems) will become clear later. But by then much of the cast will have set. Relationships and expectations, dependencies and interdependencies will create inertia. Costs of change will be higher, perhaps too high.

In other words, systems exist within an ecosystem, not a vacuum. Failure to take context into account harms systems (whether software or social) by baking in harmful structures and behaviors and we cannot take into account contexts that are not communicated and appreciated. This, by the way, is why you find posts about management and process on a site with the tagline “All Things Architectural”.

This is why I believe that successfully managing technical debt can’t happen without successfully communicating to the customer when it’s being taken on, what the costs involved are (or may be) and how it’s affecting the evolution of the product.

This is why I believe the answer to problems related to estimation lies in communication and collaboration, rather than #NoEstimates on the one hand or rigid authoritarianism on the other. This, in my opinion, holds true for all the social system issues (management, process, governance, planning, quality, and architectural design) that affect software development. Without understanding (which does not happen without communication) the goals behind the practices and what results are being achieved, it’s unlikely that the system will work to the satisfaction of anyone.

Local “optimizations” won’t fix systemic problems. We need to bridge the gaps. Can we talk?

Updated 4/8/2016 to fix a broken link.

“Dealing with “Technical Debt”” on Iasa Global

The term “technical debt” has a tendency to evoke an emotional response. Some people react puritanically – “technical debt” means sloppy code; sloppy code is sin; failing to call sin “sin” is condoning sin; to the stake with the heretic! Others will contend that technical debt solely refers solely to conscious trade-offs and by definition excludes code that is poorly written or designed.

Read “Dealing with “Technical Debt”” on Iasa Global for my take on how broadly “technical debt” should be defined.

Technical Debt – Why not just do the work better?

Soap or oil press ruins in Panayouda, Lesvos

A comment from Tom Cagley was the catalyst for my post “Design Communicates the Solution to a Problem”, and he’s done it again. I recently re-posted “Technical Debt – What it is and what to do about it” to the Iasa Global blog, and Tom was kind enough to tweet a link to it. In doing so, he added a little (friendly) barb: “Why not just do the work better?” My reply was “It’s certainly the best strategy, assuming you can“.

Obviously, we have a great deal of control over intentional coding shortcuts. It’s not absolute control; business considerations can intrude, despite our best advice to the contrary. There are a host of other factors that we have less control over: aging platforms, issues in dependencies, changing runtime circumstances (changes in load, usage patterns, etc.), and increased complexity due to organic growth are all factors that can change a “good” design into a “poor” one. As Tom has noted, some have extended the metaphor to refer to these as “Quality Debt, Design Debt, Configuration Management Debt, User Experience Debt, Architectural Debt”, etc., but in essence, they’re all technical deficits affecting user satisfaction.

Unlike coherent architectural design, these other forms of technical debt emerge quite easily. They can emerge singly or in concert. In many cases, they can emerge without your doing anything “wrong”.

Compromise is often a source of technical debt, both in terms of the traditional variety (code shortcuts) and in terms of those I listed above. While it’s easy to say “no compromises”, reality is far different. While I’m no fan of YAGNI, at least the knee-jerk kind, over-engineering is not the answer either. Not every application needs to be built for hundreds of thousands of concurrent users. This is compromise. Having insufficient documentation is a form of technical debt that can come back to haunt you if personnel changes result in loss of knowledge. How many have made a compromise in this respect? While it is a fundamental form of documentation, code is insufficient by itself.

A compromise that I’ve had to make twice in my career is the use of the shared database EAI pattern when an integration had to be in place and the target application did not have a better mechanism in place. While I am absolutely in agreement that this method is sub-standard, the business need was too great (i.e. revenue was at stake). The risk was identified and the product owner was able to make an informed decision with the understanding that the integration would be revised as soon as possible. Under the circumstances, both compromises were the right decision.

In my opinion, taking the widest possible view of technical debt is critical. Identification and management of all forms of technical debt is a key component of sustainable evolution of a product over its lifetime. Maintaining a focus on the product as a whole, rather than projects, provides a long-term focus should help make better compromises – those that are tailored to getting the optimal value out of the product. Having tunnel vision on just the code leaves too much room for surprises.

[Soap or oil press ruins in Panayouda, Lesvos Image by Fallacia83 via Wikimedia Commons.]

“Technical Debt – What it is and what to do about it” on Iasa Global Blog

Durer's Usury

In spite of all that’s been written on the subject of technical debt, it’s still a common occurrence to see it defined as simply “bad code”. Likewise, it’s still common to see the solution offered being “stop writing bad code”. Technical debt encompasses much more than that simplistic definition, so while “stop writing bad code” is good advice, it’s wholly insufficient to deal with the subject.

See the full post on the Iasa Global Blog (a re-post, originally published here).

Lord of the Repository

The man on horseback

In Robert “Uncle Bob” Martin’s “Where is the Foreman”, he advocated for a “foreman” with exclusive commit rights who would review each and every potential commit before it made its way into the repository in the interest of ensuring quality. While I am in sympathy with some of his points, ultimately the idea breaks down for a number of reasons, most particularly in terms of introducing a bottleneck. A single person will only be able to keep up with so many team members and if a sudden bout of the flu can bring your operation to a standstill, there’s a huge problem.

Unlike Jason Gorman, I believe that egalitarian development teams are not the answer. When everyone is responsible for something, it is cliche that nobody takes responsibility for it (they’ve even given the phenomena its own name). However, being responsible for something does not mean dictating. Dictators eventually tend to fall prey to tunnel vision.

Jason Gorman pointed out in a follow-up post, “Why Code Inspections Need To Be Egalitarian”, “You can’t force people, con people, bribe people or blackmail them into caring.” You can, however, help people to understand the reasons behind decisions and participate in the making of those decisions. Understanding and participation are more conducive to ownership and adoption than coercion. Promoting ownership and adoption of values vital to the mission is the essence of leadership.

A recent Tweet from Thomas Cagley illustrates the need for reflective, purposeful leadership:

In my experience, the best leaders exercise their power lightly. It’s less a question of what they can decide and more a question of should they decide out of hand. When your philosophy is “I make the decisions”, you make yourself a hostage to presence. Anywhere you’re not, no decision will be made, regardless of how disastrous that lack of action may be. I learned from an old mentor that the mark of a true leader is that they can sleep when they go on vacation. They’re still responsible for what happens, but they’ve equipped their team to respond reasonably to issues rather than to mill about helplessly.

In his follow-up post, “Oh Foreman, Where art Thou?”, Uncle Bob moderated his position a bit, introducing the idea of assistants to help in the reviews and extension of commit rights to those team members who had proved trustworthy. It’s a better position than the first post, but still a bit too controlling and self-certain. The goal should not be to grow a pack of followers who mimic the alpha wolf, but to grow the predators who snap at your heals. This keeps them and just as important, you, on the path of learning and growth.