Form Follows Function on SPaMCast 454

SPaMCAST logo

I’m back for another appearance on Tom Cagley’s Software Process and Measurement (SPaMCast) podcast.

This week’s episode, number 454, begins with Tom talking about iteration planning. Jeremy Berriault comes next with a segment on QA team leads and I bat cleanup with a Form Follows Function installment based on my post “Trash or Treasure – What’s Your Legacy?”.

Tom and I discuss the concept of legacy systems: what they are, whether they’re trash or treasure (sometimes both), and how they impact an organization.

You can find all the SPaMCast episodes I’m in under the SPaMCast Appearances category on this blog. Enjoy!

Architecture Corner: No Money – Lights Out! – Seven Deadly Sins of IT

Episode 5 of this season of Architecture Corner is out (I made a guest appearance in episode 1, “Good at Innovation”). In this installment, Chris the CEO falls victim to yet another temptation.

The CEO has decided that no more time and money will be “wasted” on old systems. Can Joakim Lindbom convince him that sloth breeds technical debt, leading to expensive outages?

Trash or Treasure – What’s Your Legacy?

Pirate's burying treasure

The topic of legacy systems is something of a contentious one. In most cases, a legacy is understood to be a good thing. What makes a system “legacy”? Is it a technical or business decision?

A little over a year ago, Greger Wikstrand took a stab at clarifying the term with his post “Legacy systems, a definition”. In the post, he looked at different definitions of what constituted a legacy system, ranging from “any code that is in use” to “outdated technology” to “high technical debt”. The definition he went with, in my opinion, is the most useful:

It should be clear that legacy systems are not about technical considerations. It is about how well the existing system meets and is able to adapt to business needs.

A pair of tweets from Joanna Young that I saw yesterday brought this to mind:

Whether or not a system has crossed the line into legacy territory is not a technical decision but a business one. As Greger and Joanna both noted, it’s about fitness for purpose. Technical considerations absolutely have immense bearing on whether the system is able to meet needs. However, they are not the sole determinant.

The standard narrative is for a system to start out “clean” and then rot via neglect and/or ad hoc enhancement. This is certainly a common scenario, but it overlooks the obvious. While failure to maintain a system and its platform will certainly degrade it, keeping the technology components up to date does not ensure that the system will continue to match the needs of those who depend on it. For that matter, it’s easy enough to build a brand new system using the latest and greatest technology that is a legacy right out the gate due to its failure to meet the needs of its stakeholders.

Age of the platform is not a problem; an inability to get support or find people knowledgeable about the platform is a problem. Technical debt in and of itself is not a problem; being impeded or prevented from maintaining/enhancing the system due to technical debt is a problem. This works for any given technical issue – substitute the tangible, stakeholder-oriented result of that technical issue and the point becomes clearer to those with the ability to address them.

The key is not to focus solely on functional aspects nor quality of service and/or technical aspects, but the system as a whole. This requires the participation of the entire set of social systems involved in the creation, maintenance, and usage of the software system. Communication and collaboration across all elements of those social systems is critical to effectively maintaining the software system and the social systems that it enables.

A critically important part of promoting that communication and collaboration is maintaining the cohesion of the social systems involved in creating and maintaining the software system. Where those social systems are ad hoc and episodic, the potential for forming the relationships necessary for effective situational awareness is minimal. IT won’t know about functional gaps until too late and the stakeholders won’t know what their options are for addressing them nor will they have advance notice of impending technical issues.

Social systems create, maintain, and use software systems. Systems that are designed to work together have a better chance of doing so than those that are just thrown together and wished well.

Stopping Accidental Technical Debt

Buster Keaton looking at a poorly constructed house

In one of my earlier posts about technical debt, I differentiated between intentional debt (that taken on deliberately and purposefully) and accidental debt (that which just accrues over time without rhyme or reason or record). Dealing with (in the sense of evaluating, tracking, and resolving it) technical debt is obviously a consideration for someone in an application architect role. While someone in that role absolutely should be aware of the intentional debt, is there a way to be more attuned to the accidental debt as well?

Last summer, I published a post titled “Distance…is the one true enemy…”. The post started with a group of tweets from Gregory Brown talking about the corrosive effects of distance on software development (distance between compile and run, between failure and correction, between development and feedback, etc.). I then extended the concept to management, talking about how distance between sense-maker and decision-maker could negatively affect the quality of the decisions being made.

There’s also a distance that neither Greg nor I covered at the time, design distance. Design distance is the distance between the design and the outcome. Reducing design distance makes it easier to keep a handle on the accidental debt as well as the intentional.

Distance between the architectural decisions and the implementation can introduce technical debt. This distance can come from remote decision-makers, architecture pigeons who swoop in, deposit their “wisdom”, and then fly away home. It can come from failing to communicate the design considerations effectively across the entire team. It can also come from failing to monitor the system as it evolves. The design and the implementation need to be in alignment. Even more so, the design and the implementation need to align with particular problems to be solved/jobs to be done. Otherwise, the result may look like this:

Distance between development of the system and keeping the system running can introduce technical debt as well. The platform a system runs on is a vital part of the system, as critical as the code it supports. As with the code, the design, implementation, and context all need to be kept in alignment.

Alignment of design, implementation, and context can only be maintained by on-going architectural assessment. Stefan Dreverman’s “Using Philosophy in IT architecture” identified four questions to be asked as part of an assessment:

  1. “What is my purpose?”
  2. “What am I composed of?”
  3. “What’s in my environment?”
  4. “What do I communicate?”

These questions are applicable not only to the beginning of a system, but throughout its life-cycle. Failing to re-evaluate the architecture as a whole as the system evolves can lead to inconsistencies as design distance grows. We can get so busy dealing with the present that we create a future of pain:

At first glance, this approach might seem to be expensive, but rewriting legacy systems is expensive as well (assuming the rewrite would be successful, which is a tenuous assumption). Building applications with a one-and-done mindset is effectively building a legacy system.

Pride, Prejudice, and Professionalism in the Business of IT

interior of a 1958 Plymouth Savoy

Twenty-plus years in IT have led me to believe that there are very few absolutes when it comes to software systems. Two that do seem to hold true are these:

  1. Creating systems is esteemed far more highly than maintaining systems.
  2. Systems that are not maintained, will decay.

There are a variety of reasons for this situation, many of which are baked into the architecture of the enterprise. Regardless of the why, however, the two facts remain. Without a response to those issues, entropy is inevitable.

Over the past few days, I’ve seen several blog posts by two different authors dealing with this situation in two different ways:

Jason complains about the non-technical “leadership class” in his first post:

And hence we get someone making the big decisions about healthcare who knows nothing about medicine or about running hospitals or ambulance services. And we get someone in charge of all the schools who knows nothing about teaching or running a school. And we get someone in charge of a major software company whose last job was being in charge of a soft drinks company. And so on.

Again, this is fine, if they leave the technical decisions to the technical experts. And that’s where it all falls down, of course. They don’t.

The guy in charge of the NHS insists on telling doctors and nurses how they should do their jobs. The woman in charge of UK schools insists on overriding the expertise of teachers. The guy in charge of a major software company refuses to listen to the engineers about the need for automated testing. And so on.

This is the Dunning-Kruger effect writ large. CEOs and government ministers are brimming with the overconfidence of someone who doesn’t know that they don’t know.

In his second post he follows up with how to respond:

My pithy – but entirely serious – advice in that situation is Do It AnywayTM.

There are, of course, obligations implied when we Do It AnywayTM. What we’re doing must be in the best interests of stakeholders. Do It AnywayTM is not a Get Out Jail Free card for any decision you might want to justify. We are making informed decisions on their behalf. And then doing what needs to be done. Y’know. Like professionals.

I disagree. Strenuously.

If you go to the doctor and they tells you that you will need surgery at some point for some condition, would you expect to be forcibly admitted and operated on immediately?

If you were charged with a crime, would you expect your attorney to accept a plea bargain on your behalf without consultation or prior permission?

If neither of those professionals would usurp the right of their client/patient to make their own informed decision, why should we? Both of those examples would be considered malpractice and the first would be criminal assault in addition. Therefore, I disagree that acting on someone’s behalf without their knowledge or consent is a viable option.

John’s approach, rejecting helplessness and confronting the issues by communicating the costs (with justifications/evidence) is, in my opinion, the truly professional approach. We have a responsibility to make the problem visible and continue making it visible. We also have a responsibility to operate within the limits we’re given. We know far more about our area than someone higher up the management chain, but, that does not equate to knowing more in general than those higher up the management chain. Ignorance is relative. Micro-managing, getting deeper in the weeds than you need to is ineffective. If, however, you’re in the weeds, do you have the information necessary to say that the issue being “interfered with” is one without higher-level consequences? Dunning-Kruger can cut a wide swathe. Trust needs to cut both ways.

Imagine riding as a passenger in a car. You see the car drifting closer and closer to the shoulder. Do you point it out to the driver or do you just grab the wheel? You might prevent an accident or you just might cause one by steering into a vehicle coming up from behind that you didn’t see from your vantage point.

[Plymouth Savoy photo by Christopher Ziemnowicz via Wikimedia Commons]

Building a Legacy

Greek Trireme image from Deutsches Museum, Munich, Germany

 

Over the last few weeks, I’ve run across a flurry of articles dealing with the issue of legacy systems used by the U.S. government.

An Associated Press story on the findings from the Government Accountability Office (GAO) issued in May reported that roughly three-fourths of the $80 billion IT budget was used to maintain legacy systems, some more than fifty years old and without an end of life date in sight. An article on CIO.com about the same GAO report detailed seven of the oldest systems. Two were over 56 years old, two 53, one 51, one 35, and one 31. Four of the seven have plans to be replaced, but the two oldest have no replacement yet planned.

Cost was not the only issue, reliability is a problem as well. An article on Timeline.com noted:

Then there’s the fact that, up until 2010, the Secret Service’s computer systems were only operational about 60% of the time, thanks to a highly outdated 1980s mainframe. When Senator Joe Lieberman spoke out on the issue back in 2010, he claimed that, in comparison, “industry and government standards are around 98 percent generally.” It’s alright though, protecting the president and vice president is a job that’s really only important about 60 percent of the time, right?

It would be easy to write this off as just another example of public-sector inefficiency, but you can find these same issues in the private sector as well. Inertia can, and does, affect systems belonging to government agencies and business alike. Even a perfectly designed implemented system (we’ve all got those, right?) is subject to platform rot if ignored. Ironically, our organizations seem designed to do just that by being project-centric.

In philosophy, there’s a paradox called the Ship of Theseus, that explores the question of identity. The question arises, if we maintain something by replacing its constituent parts, does it remain the same thing? While many hours could be spent debating this, to those whose opinion should matter most, those who use the system, the answer is yes. To them, the identity of the system is bound up in what they do with it, such that it ceases to be the same thing, not when we maintain it but when its function is degraded through neglect.

Common practice, however, separates ownership and interest. Those with the greatest interest in the system typically will not own the budget for work on it. Those owning the budget, will typically be biased towards projects which add value, not maintenance work that represents cost.

Speaking of cost, is 75% of the budget an unreasonable amount for maintenance? How well are the systems meeting the needs of their users? Is quality increasing, decreasing, or holding steady? Was more money spent because of deferred maintenance than would have been spent with earlier intervention? How much business risk is involved? Without this context, it’s extremely difficult to say. It’s understandable that someone outside an organization might lack this information, but even within it, would a centralized IT group have access to it all? Is the context as meaningful at a higher, central level as it is “at the pointy end of the spear”?

Maintaining systems bit by bit, replacing them gradually over time, is likely to be more successful and less expensive, than letting them rot and then having a big-bang re-write. In my opinion, having an effective architecture for the enterprise’s IT systems is dependent on having an effective architecture for the enterprise itself. If the various systems (social and software) are not operating in conjunction, drift and inertia will take care of building your legacy (system).

[Greek Trireme image from Deutsches Museum, Munich, Germany via Wikimedia Commons]

What’s Innovation Worth?

Animated GIF of Sherman Tank Variants

What does an old World War II tank have to do with innovation?

I’ve mentioned it before, but it bears repeating – one of benefits of having a blog is the ability to interact with and learn from people all over the world. For example, Greger Wikstrand and I have been trading blog posts on innovation for six months now. His latest post, “Switcher’s curse and legacy decisions”,is the 18th installment in the series. In this post, Greger discusses switcher’s curse, “a trap in which a decision maker systematically switches too often”.

Just as the sunk cost fallacy can keep you holding on to a legacy system long past its expiration date, switcher’s curse can cause you to waste money on too-frequent changes. As Greger points out in his post, the net benefit of the new system must outweigh both the net benefit of the old, plus the cost of switching (with a significant safety margin to account for estimation errors in assessing the costs and benefits). Newer isn’t automatically better.

“Disruption” is a two-edged sword when it comes to innovation. As Greger notes regarding legacy systems:

Existing software is much more than a series of decisions to keep it. It embodies a huge number of decisions on how the business of the company should work. The software is full of decisions about business objects and what should be done with them. These decisions, embodied in the software, forms the operating system of the company. The decision to switch is bigger than replacing some immaterial asset with another. It is a decision about replacing a proven way of working with a new way of working.

Disruption involves risk. Change involves cost; disruptive change involves higher costs. In “Innovate or Execute?”, Earl Beede asked:

So, do our employers really want us taking the processes they have paid dearly to implement and products they have scheduled out for the next 15 quarters and, individually, do something disruptive? Every team member taking a risk to see what they can learn and then build on?

Wouldn’t that be chaos?

Beede’s answer to the dilemma:

Now, please don’t think I am completely cynical. I do think that the board of directors and maybe even the C-level officers want to have innovative companies. I really believe that there needs to be parts of a company whose primary mission is to make the rest of the company obsolete. But those disruptive parts need to be small, isolated groups, kept out of the day-to-day delivery of the existing products or services.

What employers should be asking for is for most of the company to be focused on executing the existing plans and for some of the company to be trying to put the executing majority into a whole new space.

This meshes well with Greger’s recommendations:

Conservatism is often the best approach. But it needs to be a prudent conservatism. Making changes smaller and more easily reversible decreases the need for caution. We should consider a prudent application of fail fast mentality in our decision-making process. (But I prefer to call it learn fast.)

Informed decision-making (i.e. making decisions that make sense in light of your context) is critical. The alternative is to rely on blind luck. Being informed requires learning, and as Greger noted, fast turn-around on that learning is to be preferred. Likewise, limiting risk during learning is to be preferred as well. Casimir Artmann, in his post “Fail is not an option”, discussed this concept in relation to hiking in the wilderness. Assessing and controlling risk in that environment can be a matter of life in death. In a business context, it’s the same (even if the “death” is figurative, it’s not much comfort considering the lives impacted). Learning is only useful if you survive to put it to use.

Lastly, it must be understood that decision-making is not a one-time activity. Context is not static, neither should your decision-making process be. An iterative cycle of sense-making and decision-making is required to maintain the balance between innovation churn and stagnation.

So, why the tank?

The M-4 Sherman, in addition to being the workhorse of the U.S. Army’s armor forces in World War II, is also an excellent illustration of avoiding the switcher’s curse. When it was introduced, it was a match for existing German armored vehicles. Shortly afterwards, however, it was outclassed as newer, heavier, better armed German models came online. The U.S. stuck with their existing design, and were able to produce almost three times the number of tanks as Germany (not counting German tanks inferior to the Sherman). As the saying goes, “quantity has a quality all its own”, particularly when paired with other weapon systems in a way that did not disrupt production. The German strategy of producing multiple models hampered their ability to produce in quantity, negating their qualitative advantage. In this instance, progressive enhancement and innovating on the edges was a winning strategy for the U.S.