This is not a project

Gantt Chart

My apologies to René Magritte, as I appropriate his point, if not his iconic painting.

After I posted “Storming on Design”, it sparked a discussion with theslowdiyer around context and change. In that discussion, theslowdiyer commented:

‘you don’t adhere to a plan for any longer than it makes sense to.’
Heh, agree. I wonder if the “plan as a tool” vs. “plan as a goal in itself” discussion isn’t deserving of a post of its own 🙂

Indeed it is, even if it did take me nearly four months to get to it.

The key concept to understand is that the plan is not the goal, merely a stated intention of how to achieve the goal (if this causes you to suspect that the words “plan” and “design” could be substituted for each other without changing the point, move to the head of the class). Magritte’s painting stated that the picture is not the thing. The map is not the territory (and if that concept seems a bit self-evident, consider the fact that Wikipedia considers it significant enough to devote over 1700 words, not counting footnotes and links, to the topic).

Conflating plan and goal is a common problem. To illustrate the difference, consider undergoing an operation. Is it your desire that the surgeon perform the procedure as planned or that your problem gets fixed? In the former scenario, your survival is optional.

This is not, however, to say that planning (or design) is useless. The output of an effective planning/design process is critical. As Joanna Young noted in her “Four Signs of Readiness – Or Not”:

I’m all for consigning the traditional 50+ pages of adminis-trivia on scope, schedule, budget, risks that requires signing in blood to the dustbin. However no organization should forego the thoughtful and hard work on determining what needs to be done, why, how, by whom, for how much – and how this will all be governed and measured as it is proceeds through sprints and/or waterfalls to delivery.

The information derived from the process (not the form, not the presentation, but the information) is a critical tool for moving forward intelligently. If you have no idea of what to do, how to do it, who can do it, when, and for how much, you are adrift. You’re starting a trip with no idea of whether the gas tank has anything in it. Conversely, attempting to achieve 100% certainty from the outset is a fool’s errand. For any endeavor, more will be known nearer the destination. Plans without “wiggle room” are of limited usefulness as you will drift outside the cone of uncertainty from the start and never get back inside.

Having a reasonable idea of what’s acceptable variance helps determine when it’s time to abandon the current plan and go with a revised one. Planning and design are processes, not events or even phases. It’s a matter of continually monitoring context and whether our intentions are still in accordance with reality. Where they differ, reality wins. Always.
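As a minimal sketch of that idea (in Python, with hypothetical names and an arbitrarily chosen tolerance, none of which come from the original discussion), a plan can carry its own acceptable variance so that the decision to re-plan is explicit rather than ad hoc:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A stated intention: an estimate plus the variance agreed to up front."""
    estimate_hours: float
    tolerance: float  # assumption: acceptable overrun as a fraction (0.25 = 25%)

    def upper_bound(self) -> float:
        """Largest projection that still counts as 'within the plan'."""
        return self.estimate_hours * (1 + self.tolerance)

    def needs_replanning(self, projected_hours: float) -> bool:
        """True once the current projection drifts outside the agreed variance."""
        return projected_hours > self.upper_bound()


plan = Plan(estimate_hours=40, tolerance=0.25)
print(plan.needs_replanning(45))  # still inside the agreed variance
print(plan.needs_replanning(60))  # outside it: time to revise the plan
```

The numbers are illustrative only. The point is that the threshold is agreed on before the work starts and is checked continually, not that any particular tolerance is correct.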

Execution isn’t blindly marching forward according to plan. It’s surfing the wave of context.

Designing Communication, Communicating Design

The Simplest metamodel in the world ever!

We work in a communications industry.

We create and maintain systems to move information around in order to get things done. That information moves between people and systems in combinations and configurations too numerous to count. In spite of that, we don’t do that great a job of communicating what should be, for us, extremely important information. We tend to be really bad at communicating the architecture of our systems – structure, behavior, and most importantly, the reasons for the decisions made. It’s bad enough when we fail to adequately communicate that information to others; it’s really bad when we fail to communicate it to ourselves. I know I’ve let myself down more than once (“What was I thinking here?!”).

Over the past few days, I’ve been privileged to follow (and even contribute a bit to) a set of conversations on Twitter. Grady Booch, Ivar Jacobson, Ruth Malan, Simon Brown, and others have been discussing the need for architectural awareness and the state of communicating architecture.

This exchange between Simon, Chris Carroll and Eoin Woods sums it up well:

First and foremost, an understanding of what the role of a software architect is and why it’s important is needed. Any organization where the role is seen as either just a senior developer or (heaven help us!) some sort of Taylorist “thinker” who designs everything for the “worker bee” coders to implement, is almost guaranteed to be challenged in terms of application architecture. Resting on that foundation of shifting sand, the organization’s enterprise IT architecture (EITA) is likewise almost guaranteed to be challenged barring a remarkable series of “happy accidents”. The role (not necessarily position) of software architect is required, because software architecture is a distinct set of concerns that can either be addressed intentionally or left to emerge haphazardly out of the construction of the system.

Before we can communicate the architecture of a system, it’s necessary to understand what that is. In “Software Architecture: Central Concerns, Key Decisions”, Ruth Malan and Dana Bredemeyer defined it as high impact, systemic decisions involving (at a minimum):

  • system priority setting
  • system decomposition and composition
  • system properties, especially cross-cutting concerns
  • system fit to context
  • system integrity

I don’t think it’s possible to over-emphasize the use of “system” and “systemic” in the preceding paragraphs. That being said, it’s important to understand that architectural concerns do not exist in a void. There is a cyclic relationship between the architectural concerns of a system and the system’s code. The architectural concerns guide the implementation, while the implementation defines the current state of the architecture and constrains the evolution of the future state of the architecture. Code is a necessary, but insufficient, source of architectural knowledge. As Ruth Malan noted in the Visual Design portion (part II) of her presentation at the Software Architect Conference in London a year ago:

Slide from Ruth Malan's presentation on Visual Design

While the code serves as a foundation of the system, it’s also important to realize that the system exists within a larger context. There is a fractal set of systems within systems within ecosystems. Ruth illustrated this in the Intention and Reflection portion (part III) of the presentation referenced above:

Slide from Ruth Malan's presentation on Intention and Reflection

[Note: Take the time to view the entirety of the Intention and Reflection presentation. It’s an excellent overview of how to design the architecture of a system.]

The fractal nature of systems within systems within ecosystems is illustrated by the image at the top of the post (h/t to Ric Phillips for the reblog of it). Richard Sage’s humorous (though only partly, I’m sure) suggestion of it as a meta-model goes a long way towards portraying the problem of a language to communicate architecture.

Not only are we dealing with a nested set of “things”, but the understanding of those things differs according to the stakeholder. For example, while the business owner might see a “web site” as one monolithic thing, the architect might see an application made up of code components depending on other applications and services running on a collection of servers. Maintaining a coherent, normalized object model of the system yet being able to present it in multiple ways (some of which might be difficult to relate) is not a trivial exercise.

Lower-level aspects of design lend themselves to automated solutions, which can increase reliability of the model by avoiding “documentation rot”. An interesting (in my opinion) aspect that can also be automated is the evolution of code over time. What can’t be parsed from the code, however, is intention and reasoning.

Another barrier to communication is the need to be both expressive and flexible (also well illustrated by Richard’s meta-model) while also being simple enough to use. UML works well on the former, but (rightly or wrongly) is perceived to fail on the latter. Simon Brown’s C4 model aims to achieve a better balance in that aspect.

At present, I don’t think we have one tool that does it all. I suspect that even with a suite of tools, narrative documents will still be the way some aspects are captured and communicated. Having a centralized store for the non-code bits (with a way to relate them back to the code) would be a great thing.

All in all, it is encouraging to see people talking about the need for architectural design and the need to communicate the aspects of that design.

Technical Debt and Rolling Re-writes (Who Needs Architects?)

If you think building a system is challenging, try maintaining one.

Tom Cagley’s recent post “Plan to Throw One Away Re-Read Saturday: The Mythical Man-Month, Part 11” was a good reminder that while “technical debt” may be something currently on the radar for many, it’s far from a new phenomenon. The concept of instant legacy applications was in place forty years ago when Frederick Brooks wrote his masterpiece, even if they weren’t called that. As Tom observed in the post:

Rarely is the first attempt useful to the end consumer, and the usefulness of that first attempt is less in the code than in the feedback it generates. Software development is no different. The initial conceptual design and anticipated technical architecture of a large project rarely stands up to the rigors of the discovery process, and those designs should be learned from and then thrown away.

The faulty assumptions and design flaws accumulate not only from sprint to sprint leading up to the initial release, but also from release to release. In spite of the fact that a product can be so seriously flawed, throwing it away and starting over is easier said than done. While sunk costs cannot be recovered, too sanguine an attitude towards them may not enhance your credibility with the customer. Having to pay for the same thing over and over can make them grumpy.

This sets up a dilemma, one that frequently leads to living with technical debt and attempting to incrementally patch it up. There are limits, however, to the number of band-aids that can be applied. This might make it tempting to propose a rewrite, but as Erik Dietrich stated in “The Myth of the Software Rewrite”:

Sure, they know things now that they didn’t know when they started on this code 3 years ago. But won’t the same thing be true in 3 years? Won’t the developers then be looking at the code and saying, “this is a mess — if only we knew in 2015 what we now know in 2018!” And, beyond that, what makes you think that giving the same group of people the same marching orders won’t result in the same kind of code?

The “big rewrite from scratch because this is a mess” is a losing strategy.

Fortunately, there is an alternative. Quoting Tom Cagley again from the same post as above:

If change is both inevitable and good (within limits), then both systems and organizations (a type of system) need to be engineered to support and facilitate change. Architecturally, techniques such as modularization, object-oriented design and other processes that foster simplification and incremental change create an environment in which change isn’t avoided, but rather encouraged.

While we may laugh at the image of changing a tire while the vehicle is in motion, it is an accurate metaphor. Customers expect flexibility and change on the go; waiting equals lost business. The keys to evolving in place are having an intentionally designed, modular architecture and an understanding of where the weaknesses lie. Both of these are concerns that reside squarely on the architect’s plate.

Modularity not only makes an application more easily maintainable via separation of concerns, but it also embraces change by making components replaceable. This is one of the qualities that has made microservices such a hot topic, although it would be a mistake to think that microservices are the only way (or best way in all cases) to achieve modularity.

Modularity brings benefits beyond the purely technical as well. Rewrites of a fraction of an application are more easily sold than big-bang efforts. Demonstrating forethought (while you can’t predict what the change will be, predicting the need for change is more of a sure thing) demonstrates concern for the customer’s welfare, which should make for a better relationship.

Being able to throw a system away a little at a time allows us to keep the car on the road while it changes and adapts to changing conditions.

“Laziness as a Virtue in Software Architecture” on Iasa Global

Laziness may be one of the Seven Deadly Sins, but it can be a virtue in software development. As Matt Osbun observed:

Robert Heinlein noted the benefits of laziness:

See the full post on the Iasa Global Site (a re-post, originally published here).

Microservices – Sharpening the Focus

Motion Blurred London Bus

While it was not the genesis of the architectural style known as microservices, the March 2014 post by James Lewis and Martin Fowler certainly put it on the software development community’s radar. Although the level of interest generated has been considerable, the article was far from an unqualified endorsement:

Despite these positive experiences, however, we aren’t arguing that we are certain that microservices are the future direction for software architectures. While our experiences so far are positive compared to monolithic applications, we’re conscious of the fact that not enough time has passed for us to make a full judgement.

One reasonable argument we’ve heard is that you shouldn’t start with a microservices architecture. Instead begin with a monolith, keep it modular, and split it into microservices once the monolith becomes a problem. (Although this advice isn’t ideal, since a good in-process interface is usually not a good service interface.)

So we write this with cautious optimism. So far, we’ve seen enough about the microservice style to feel that it can be a worthwhile road to tread. We can’t say for sure where we’ll end up, but one of the challenges of software development is that you can only make decisions based on the imperfect information that you currently have to hand.

In the course of roughly fourteen months, Fowler’s opinion has gelled around the “reasonable argument”:

So my primary guideline would be don’t even consider microservices unless you have a system that’s too complex to manage as a monolith. The majority of software systems should be built as a single monolithic application. Do pay attention to good modularity within that monolith, but don’t try to separate it into separate services.

This mirrors what Sam Newman stated in “Microservices For Greenfield?”:

I remain convinced that it is much easier to partition an existing, “brownfield” system than to do so up front with a new, greenfield system. You have more to work with. You have code you can examine, you can speak to people who use and maintain the system. You also know what ‘good’ looks like – you have a working system to change, making it easier for you to know when you may have got something wrong or been too aggressive in your decision making process.

You also have a system that is actually running. You understand how it operates, how it behaves in production. Decomposition into microservices can cause some nasty performance issues for example, but with a brownfield system you have a chance to establish a healthy baseline before making potentially performance-impacting changes.

I’m certainly not saying ‘never do microservices for greenfield’, but I am saying that the factors above lead me to conclude that you should be cautious. Only split around those boundaries that are very clear at the beginning, and keep the rest on the more monolithic side. This will also give you time to assess how how mature you are from an operational point of view – if you struggle to manage two services, managing 10 is going to be difficult.

In short, the application architectural style known as microservice architecture (MSA) is unlikely to be an appropriate choice for the early stages of an application. Rather, it is a style that is most likely migrated to from a more monolithic beginning. Some subset of applications may benefit from that form of distributed componentization at some point, but distribution, at any degree of granularity, should be based on need. Separation of concerns and modularity do not imply a need for distribution. In fact, poorly planned distribution may actually increase complexity and coupling while destroying encapsulation. Dependencies must be managed whether local or remote.

This is probably a good point to note that there is a great deal of room between a purely monolithic approach and a full-blown MSA. Rather than a binary choice, there is a wide range of options between the two. The fractal nature of the environment we inhabit means that responsibilities can be described as singular and separate without their being required to share the same granularity. Monoliths can be carved up and the resulting component parts still be considered monolithic compared to an extremely fine-grained sub-application microservice and that’s okay. The granularity of the partitioning (and the associated complexity) can be tailored to the desired outcome (such as making components reusable across multiple applications or more easily replaceable).

The moral of the story, at least in my opinion, is that intentional design concentrating on separation of concerns, loose coupling, and high cohesion is beneficial from the very start. Vertical (functional) slices, perhaps combined with layers (what I call “dicing”), can be used to achieve these ends. Regardless of whether the components are to be distributed at first, designing them with that in mind from the start will ease any transition that comes in the future without ill effects for the present. Neglecting these issues risks hampering, if not outright preventing, breaking them out at a later date without resorting to a re-write.
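To make that concrete, here is a minimal Python sketch of one functional slice kept behind an explicit contract inside a monolith. Every name in it (`PricingService`, `InProcessPricing`, `OrderHandler`) is hypothetical and exists only for illustration:

```python
from typing import Dict, Protocol


class PricingService(Protocol):
    """Contract the rest of the application depends on (hypothetical example)."""

    def quote(self, sku: str, quantity: int) -> float: ...


class InProcessPricing:
    """Today's implementation: an ordinary component inside the monolith."""

    def __init__(self, unit_prices: Dict[str, float]) -> None:
        self._unit_prices = unit_prices

    def quote(self, sku: str, quantity: int) -> float:
        return self._unit_prices[sku] * quantity


class OrderHandler:
    """Depends only on the contract, never on a concrete pricing class."""

    def __init__(self, pricing: PricingService) -> None:
        self._pricing = pricing

    def total(self, sku: str, quantity: int) -> float:
        return self._pricing.quote(sku, quantity)


handler = OrderHandler(InProcessPricing({"widget": 2.5}))
print(handler.total("widget", 4))
```

If pricing ever needs to be broken out, a client class that calls a remote service could satisfy the same contract without `OrderHandler` changing. Bear in mind the caveat quoted earlier, though: a good in-process interface is usually not a good service interface, so the contract itself may still need to evolve at that point.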

These same concerns apply at higher levels of abstraction as well. Rather than blindly growing a monolith that is all things to all people, adding new features should be treated as an opportunity to evaluate whether that functionality coheres with the existing application or is better suited to being a service from an external provider. Just as the application architecture should aim for modularity, so too should the solution architecture.

A modular design is a flexible design. While we cannot know up front the extent of change an application will undergo over its lifetime, we can be sure that there will be change. Designing with flexibility in mind means that change, when it comes, is less likely to be an existential crisis. As Hayim Makabee noted in his write-up of Rotem Hermon’s talk, “Change Driven Design”: “Change should entail extending the system rather than refactoring.”

A full-blown MSA is one possible outcome for an application. It is, however, not the most likely outcome for most applications. What is important is to avoid unnecessary constraints and retain sufficient flexibility to deal with the needs that arise.

[London Bus Image by E01 via Wikimedia Commons.]

Laziness as a Virtue in Software Architecture

Laziness may be one of the Seven Deadly Sins, but it can be a virtue in software development. As Matt Osbun observed:

Robert Heinlein noted the benefits of laziness:

Even in the military, laziness carries potential greatness (emphasis mine):

I divide my officers into four groups. There are clever, diligent, stupid, and lazy officers. Usually two characteristics are combined. Some are clever and diligent — their place is the General Staff. The next lot are stupid and lazy — they make up 90 percent of every army and are suited to routine duties. Anyone who is both clever and lazy is qualified for the highest leadership duties, because he possesses the intellectual clarity and the composure necessary for difficult decisions. One must beware of anyone who is stupid and diligent — he must not be entrusted with any responsibility because he will always cause only mischief.

Generaloberst Kurt von Hammerstein-Equord

The lazy architect will ignore slogans like YAGNI and the Rule of Three when experience and/or information tells them that it’s far more likely that the need will arise than not. As Matt stated in “Foreseeable and Imaginary Design”, architects must ask “What changes are possible and which of those changes are foreseeable”. The slogans point out that engineering costs but the reality is that so does re-work resulting from decisions deferred. Avoiding that extra work (laziness) avoids the cost associated with it (frugality).

Likewise, lazy architects are less likely to cave when confronted with the sayings of some notable person. Rather than resign themselves to the extra work, they’re more likely to examine the statement as Kevlin Henney did:

It’s far cheaper to design and build a system according to its context than re-build a system to fix issues that were foreseeable. The re-work born of being too focused on the tactical to the detriment of the strategic is as much a form of technical debt as cutting corners. The lazy architect knows that time spent identifying and reconciling contexts can allow them to avoid the extra work caused by blind incremental design.

No Structure Services

Amoeba sketch

Some people seem to think that flexibility is universally a virtue. Flexibility, in their opinion, is key to interoperability. Postel’s Principle, “…be conservative in what you do, be liberal in what you accept from others”, is often used to justify this belief. While this sounds wonderful in theory, in practice it’s problematic. As Tom Stuart pointed out in “Postel’s Principle is a Bad Idea”:

Postel’s Principle is wrong, or perhaps wrongly applied. The problem is that although implementations will handle well formed messages consistently, they all handle errors differently. If some data means two different things to different parts of your program or network, it can be exploited—Interoperability is achieved at the expense of security.

These problems exist in TCP, the poster child for Postel’s principle. It is possible to make different machines see different input, by building packets that one machine accepts and the other rejects. In Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection, the authors use features like IP fragmentation, corrupt packets, and other ambiguous bits of the standard, to smuggle attacks through firewalls and early warning systems.

In his defense, the environment in which Postel proposed this principle is far different from what we have now. Eric Allman, writing for the ACM Queue, noted in “The Robustness Principle Reconsidered”:

The Robustness Principle was formulated in an Internet of cooperators. The world has changed a lot since then. Everything, even services that you may think you control, is suspect.

Flexibility, often sold as extensibility, too often introduces ambiguity and uncertainty. Ambiguity and uncertainty are antithetical to APIs. This is why 2 of John Sonmez’s “3 Simple Techniques to Make APIs Easier to Use and Understand” are “Using enumerations to limit choices” and “Using default values to reduce required parameters”. Constraints provide structure and structure simplifies.
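A minimal Python sketch of those two techniques follows. The `list_orders` function and its parameters are hypothetical, invented here for illustration and not taken from Sonmez’s post:

```python
from enum import Enum


class SortOrder(Enum):
    """The only sort orders the API accepts; anything else is rejected."""
    ASCENDING = "asc"
    DESCENDING = "desc"


def list_orders(customer_id: int,
                order: SortOrder = SortOrder.ASCENDING,
                page_size: int = 20) -> dict:
    """Hypothetical API call: only customer_id is required."""
    if not isinstance(order, SortOrder):
        raise TypeError("order must be a SortOrder")
    return {"customer": customer_id, "order": order.value, "page_size": page_size}


print(list_orders(42))
print(list_orders(42, order=SortOrder.DESCENDING, page_size=5))
```

The enumeration removes the question of which strings are legal, and the defaults mean the simplest call is also the shortest one.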

Taken to the extreme, I’ve seen flexibility used to justify “string in, string out” service method signatures. “Send us a string containing XML and we’ll send you one back”. There’s no need to worry about versioning, etc. because all the versions for all the clients are handled by a single endpoint. Of course, behind the scenes there’s a lot of conditional logic and “hope for the best” parsing. For the client, there’s no automated generation of messages nor even guarantee of structure. Validation of the structure can only occur at runtime.

Does this really sound robust?

I often suspect the reluctance to tie endpoints to defined contracts is due to excessive coupling between the code exposing the service and the code performing the function of the service. If domain logic is intermingled with presentation logic (which a service is), then a strict versioning scheme, an application of the Open/Closed Principle to services, now violates Don’t Repeat Yourself (DRY). If, however, the two concerns are kept separate within the application, multiple endpoints can be handled without duplicating business logic. This provides flexibility for both divergent client needs and client migrations from one message format to another with less complexity and ambiguity.
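As a rough illustration of that separation, the Python sketch below keeps the domain logic in one function and lets two versioned endpoints translate between their own message formats and that function. The message shapes, field names, and flat unit price are all assumptions made up for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Domain result, independent of any message format."""
    sku: str
    quantity: int
    total_cents: int


def price_order(sku: str, quantity: int) -> Quote:
    """Domain logic, written once (assumes a flat, made-up unit price)."""
    unit_price_cents = 250
    return Quote(sku, quantity, unit_price_cents * quantity)


def quote_v1(request: dict) -> dict:
    """v1 contract: 'item' and 'qty' in, total in dollars out."""
    q = price_order(request["item"], request["qty"])
    return {"item": q.sku, "total": q.total_cents / 100}


def quote_v2(request: dict) -> dict:
    """v2 contract: renamed fields, integer cents, explicit currency."""
    q = price_order(request["sku"], request["quantity"])
    return {"sku": q.sku, "quantity": q.quantity,
            "total_cents": q.total_cents, "currency": "USD"}


print(quote_v1({"item": "widget", "qty": 4}))
print(quote_v2({"sku": "widget", "quantity": 4}))
```

Each endpoint is closed to modification once published, and a new version is an extension: another thin translation function, with no copy of the pricing rules.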

Stable interfaces don’t buy you much when they’re achieved by unsustainable complexity on the back end. The effect of ambiguity on ease of use doesn’t help either.

Professional Software Development – Can We Mandate What We Can’t Define?

The law is a what?!?

The only true wisdom is in knowing you know nothing.
Socrates

What types of software products have you worked on: desktop applications, traditional web, single-page applications, embedded, mobile, mainframe?

How about organizations: private for-profit, government, non-profit?

How about domains: finance, retail, defense, health care, entertainment, banking, law enforcement, intelligence, real estate, etc. etc. etc.?

Given that the realm of “software development” is currently huge (and probably expanding as you read this), how logical is it that someone (or even a group) could regulate what is acceptable process and practice? I won’t say that it would be impossible to come up with one unified set of regulations that would fit all circumstances, but I’m very comfortable estimating the likelihood as a minute fraction of a percent. If the entire realm were broken down into smaller groupings, the chance might increase, but the resulting glut of regulations would become an administrative nightmare and still wouldn’t address those circumstances that aren’t in the list above but are on the horizon.

Nonetheless, people continue to float the idea of regulation.

Last fall, Bob Martin floated the idea of government regulation as a reaction to the healthcare.gov fiasco. That would be the same government whose contracting regulations contributed to the fiasco in the first place, correct? That would be the same government that has legally mandated Agile for Department of Defense contracts? Legally mandated agility just seems to sound a bit suspicious. As Jeff Sutherland noted “Many in Washington are still trying to figure out what exactly that means but it is a start”. A start, for sure, but the start of what?

Ken Schwaber’s blog post “Can Software Developers Meet the Need?” takes a different approach. Schwaber proposes that:

A software profession governing body is needed. We need to formalize and regulate the skills, techniques, and practices needed to build different types of software capabilities. On one side, there is the danger of squeezing the creativity out of software development by unknowledgeable bureaucrats. On the other side is the danger of the increasingly vital software our society relies on failing critically.

We can either create such a governance capability, or the governments will legislate it after a particularly disastrous failure.

Call me a cynic, but I’m betting that the amount of bureaucratic squeezing that would result from this would far outweigh any gain in quality.

Most of the organization types listed above are already on the hook for harm caused by their IT operations; just ask Target and Knight Capital (don’t ask the Centers for Medicare & Medicaid Services). Is it more likely that a committee, whether private or public, can better manage the quality of software across all the various categories listed above? Could they be more likely to keep up with change in the industry? Color me doubtful.

Surfing the Plan

Hang loose

In a previous post, I used the Eisenhower quote “…plans are useless but planning is indispensable”. The Agile Manifesto expresses a preference for “Responding to change over following a plan”. A tweet I saw recently illustrates both of those points and touches on why so many seem to have problems with estimates:

Programming IRL:
“ETA for an apple pie?”
“2h”
8h later:
“Where is it?”
“You didn’t tell me the dishes were dirty and you lacked an oven.”

At first glance, it’s the age-old story of being given inadequate requirements and then being held to an estimate long after it’s proven unreasonable. However, it should also be clear that the estimate was given without adequate initial planning or a “plan B”, and that when the issues were discovered, there was no communication of the need to revise the estimate by an additional 300%.

Before the torches and pitchforks come out, I’m not assigning blame. There are no villains in the scenario, just two victims. While I’ve seen my share of dysfunctional situations where the mutual distrust between IT and the business was the result of bad actors, I’ve also seen plenty that were the result of good people trapped inside bad processes. If the situation can be salvaged, communication and collaboration are going to be critical to doing so.

People deal with uncertainty every day. Construction projects face delays due to weather. Watch any home improvement show and chances are you’ll see a renovation project that has to change scope or cost due to an unforeseen situation. Even surgeons find themselves changing course due to circumstances they weren’t aware of until the patient was on the table. What the parties need to be aware of is that the critical matter is not whether or not an issue appears, but how it’s handled.

The first aspect of handling issues is not to stick to a plan that is past its “sell by” date. A plan is only valid within its context and when the context changes, sticking to the plan is delusional. If your GPS tells you to go straight and your eyes tell you the bridge is out, which should you believe?

Sometimes the expiration of a plan is strategic; the goal is not feasible and continuing will only waste time, money, and effort. Other times, the goal remains, but the original tactical approach is no longer valid. There are multiple methods appropriate to tactical decision-making. Two prominent ones are Deming’s Plan-Do-Check-Act and Boyd’s Observe-Orient-Decide-Act. Each has its place, but both have a looping nature in common. Static plans work for neither business leaders nor fighter pilots.

The second aspect of handling issues is communication. It can be easy for IT to lose sight of the fact that the plan they’re executing is a facet of the overarching plan that their customer is executing. Whether in-house IT or contractor, the relationship with the business is a symbiotic one. In my experience, success follows those who recognize that and breakdowns occur when it is ignored. Constant communication and involvement with that customer avoids the trust-killing green-green-green-RED!!! project management theater.

In his post “Setting Expectations”, George Dinwiddie nailed the whole issue with plans and estimates:

What if we were able to set expectations beyond a simple number? What if we could say what we know and what we don’t know? What if we could give our best estimate now, and give a better one next week when we know more? Would that help?

The thing is, these questions are not about the estimates. These questions are about the relationship between the person estimating and the person using the estimate. How can we improve that relationship?

One Size Fits Somebody, But Probably Not You

Why One Process Can’t Work Everywhere

Gene Hughson and Charlie Alfred

In the software business, there’s been a strong tendency to treat standardized processes like Better Homes and Gardens recipes. People pine after a standard way to make a team effective in dealing with complex problems. Agile methods, unit test strategies, and continuous integration are just a few of the examples. The hope is that we can just copy a process that was successful somewhere else, or maybe make a few small alterations (like the tailor at Jos. A. Bank does), and presto, we have a quantum leap in effectiveness. This post explores why the authors believe that this model is closer to fantasy than reality.

The “Essence of Software”

“There is no single development, in either technology or management technique, which by itself, promises even one order of magnitude of improvement within a decade, in productivity, in reliability, in simplicity”

– Fred Brooks

The above quote appeared in 1986 in an article titled “No Silver Bullet – Essence and Accidents of Software Engineering” [1]. The main premise of this article is that “the essence of software engineering is a construct of interlocking concepts among data items, algorithms, and invocations of functions. This essence is abstract, in that the conceptual construct is the same under many different representations.” Brooks goes on to say that four things make software inherently different from other engineering disciplines:

  • Complexity: The vast number of different parts and the differences between them make software profoundly different from other large-scale engineering, like computers, buildings or automobiles.
  • Conformity: Physics and chemistry have core unifying principles (General Relativity, Ohm’s Law) which drive behavior and organization. Software is more like law – designed by humans. Loose conformity exists, but with much variation and contradiction.
  • Changeability: Software is soft; firmware is firm; hardware is hard. The names accurately represent which is likely to be changed first. High rates of change combine with interdependencies between parts to increase complexity.
  • Invisibility: The reality of software is not inherently embedded in space. As much as we try, UML class, state and sequence diagrams only capture the very highest levels of structure and behavior. Even the source code fails to show the full picture, as is evident in any complex multi-threaded program.

These same four aspects apply to the processes by which software is created. For any non-trivial system, the number of humans involved and their interactions will serve to increase complexity and decrease conformity. For good or ill, requirements will change. The more detailed the rules around how the various players interact, the less those rules will bear any resemblance to reality.

Software Development Processes

According to Wikipedia [2], the roots of agile methods can be traced back to 1957, to work done at IBM’s Service Bureau Corporation. The movement gathered momentum in the early 1970s, and became a force in 2001, with the publication of the Agile Manifesto.

Agile is a collection of several methods, including Kanban, XP, Crystal Clear, Feature-Driven Development, and Scrum. As of 2012, Scrum seems to have emerged as the most widely adopted agile method [3]. Agile methods strive to address the complexity and changeability obstacles cited above by Brooks.

Traditional waterfall methods spend significant up-front time trying to define a system’s requirements and architecture. For large complex systems, this process can take years of effort and involve many people.

Agile proponents argue:

  • Textual and/or diagrammatic representations of a system are too vague and incomplete
  • During the process of representing the system, requirements change too rapidly
  • The ability to assess changes is hampered by the absence of early iterations of partial systems
  • Refactoring, the ability to restructure a design to better handle change, is an essential capability

By contrast, Waterfall proponents argue:

  • Refactoring doesn’t scale well. It works best in smaller, more localized areas
  • Early design decisions constrain the solution space of downstream decisions. Errors here multiply
  • Regardless of how well you encapsulate things, large systems have many cross-dependencies. Many systemic issues are not evident until a “critical mass” of the system can be viewed as parts and whole.

Further complicating the debate are the inherent trade-offs between three strong drivers:

  1. Velocity – At what rate is the system being developed? How long will it take to be done?
  2. Quality – How good is the result? Features? Cost? Usability? Performance? Security? Reliability? etc.
  3. Adaptability – How fast can the system adapt? To new Requirements? Geographies? Technologies?

These three drivers expand the scope of the debate. The development projects/process, the system being developed, and the deployment environment(s) are very tightly coupled.

One Size Does Not Fit All

If you accept the premises above, then one inescapable conclusion is that context matters, a lot. Brooks argued that there was “no silver bullet” in technology or management technique (and development process is, after all, a subclass of management technique). In the same way that the architecture of a system evolves, intentionally or not, to adapt to its context, so too must a process.

Evidence for this is that all methods get adapted. There are many Scrum projects, but no two practice it the same way. In the same vein, there are many waterfall projects (especially in safety-critical regulated development, like aircraft and medical devices), and few, if any, of them practice it the same way.

Take a slight deviation, and you see a new picture. Product line (also called platform) development adds a different wrinkle. A traditional development project has a specific target in mind. An automobile is designed for transporting a few passengers along roads. A boat is designed for transporting a few passengers over water. If either product ends up trying to do the work of the other, it is considered a failure condition.

With product lines, the intent is to create a shared asset base from which many related products can be built. Google’s Android framework is the foundation for smartphones and tablet computers from Samsung, Motorola, HTC and others [4]. Product lines create a new tension, which is similar to but quite different from the “change over time” tension which motivates agile. Product line development deals with change over time challenges and change over space (context) challenges at the same time. When a football player’s knee is forced to deal with concurrent changes in time and space, it often results in an ACL or MCL tear.

In December 2011, Mark Kennaley did an excellent podcast with Mike Gualtieri of Forrester Research [5]. The subject was that just because a development process works in one place does not mean it will succeed in another. This is analogous to the way a plant that thrives on sunshine, heat and water will fare poorly if planted in the shade in a cooler, drier climate. Kennaley lists ten factors which must be considered, such as size of the development team, complexity of the domain, technical complexity, whether the team is co-located or distributed, the division of labor within the organization, compliance, criticality, time to market pressures and culture.

We share this view, but believe that other factors need to be added. In particular, important variations must be considered in the nature of:

  • The system being developed – Early life-cycle stages of innovative systems are different from next-gen systems with well understood markets/solutions
  • The context of use – SUVs are used by both off-road enthusiasts and suburban families. A vehicle suited to the former has proven dangerous, at times fatal, for the latter
  • System deployment – In-house hosting is different from Software as a Service, and mobile is quite different from LAN connectivity

Some factors are also in need of expansion. In addition to its complexity, the variability and volatility of the domain will affect the fit of the process to its context. The team’s level of experience with the technology platform will likewise impact the suitability of the process as much as its size and dispersion.

Examples

In this section, we’d like to present two short case studies which illustrate how development, system, and deployment factors strongly influence process selection.

Example 1 – Connected Medical Device

In regulated development of embedded systems (medical and aerospace), human safety concerns dominate the process. However, formal process only begins when development starts. At this point in time:

  • FMEA processes assess safety risks, impact and mitigations and ensure that unmitigated risk is acceptable
  • Product (feature) and system-level (architectural) requirements are clearly specified.
  • A formal life cycle process dictates reviews, approvals and traceability for designs, implementation and testing

As a result, requirements changes carry a burden (analyze safety impact, redo traceability, redo tests). Again, given the overriding safety concern, this rigor is justifiable.

The concept phase is critical to the process. The focus of this phase is hypothesis formulation, proof of concept and risk reduction. Design controls are off in this phase. As a result, it behooves a product development shop to stay in the concept phase until the problem is well-defined, the architecture is solid, and the requirements are well-defined. A common mistake is to proclaim the end of concept too soon and carry ambiguity and risk into development. Design controls make these much more expensive to address than they would have been during concept.

This leads directly to one of the fundamental tenets of agile – requirements volatility. As mentioned earlier, once in the development phase, safety mechanisms (including risk analysis, change management, reviews and traceability) put a big tax on requirements change. With embedded systems, the good news is that physics, chemistry and biology are stable, predictable sciences, and user interfaces are relatively task focused.

The interesting situation occurs when moving to medical application and enterprise software that is also safety-critical. Regulatory design controls still apply because of the safety risks. However, physics, chemistry and biology are less important drivers and human users become a bigger factor. Now, you have the embedded software problem with significantly more requirements volatility. The best approach here isn’t always obvious. One strategy is to use architecture to separate the system into central parts whose requirements are more stable, and use agile methods on peripheral parts (e.g. reporting) whose requirements are more volatile.

Example 2 – Corporate Line of Business Application

Our second example involves an application used to provide title search services to the agency channel of a title insurance provider operating in multiple states. This application serves both external customers (title insurance agents) and internal users, and integrates with a variety of systems. The system has multiple compliance requirements because, in addition to Federal law, it must be compliant with the laws of each state served. The system is maintained by a small, technically experienced team that has become familiar with the business domain over a ten-year period. During this period, the same business owner has been in place, yielding a high degree of familiarity between the owner and the team. This gives us the following dominant process drivers:

  • Variability – The diversity of legal and regulatory requirements as well as operational workflow from one jurisdiction to another requires a high degree of flexibility from the system.
  • Complexity – Responding to the domain complexity noted above as well as the integrations with other systems yields a significant amount of technical complexity. This is compounded by the need to maintain acceptable performance of the system while the user base is growing.
  • Stability as a Priority – Users of the system value stability over new features.

These drivers have led to the development of a process that is agile without being Agile. “Just enough” is the watchword, and practices are constantly evaluated for relevance. Those that provide value are retained and those that do not are dropped.

Frequent internal releases are used to verify code and validate the product, but releases to production take place at three- to six-month intervals, according to the preference of the business owner. The release management practices of the group, described in the post “Do you have releases or escapes?” [6], are used to ensure the integrity of the release.

Rather than a time-boxed process, a negotiated method is used where the effort to deliver the desired bundle of functionality drives the projected due date for the release. Estimates are given as ranges, with variances that diminish as more is known about the individual features to be delivered. Any changes are triaged, and if needed for the release in progress, the schedule is adjusted accordingly. Collaborative requirements elicitation, constant feedback, and transparency are used to maintain the relationship between the business owner and the development team.
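To make the arithmetic of range-based estimating concrete, here is a minimal Python sketch. It is not the team’s actual tooling: the names `RangeEstimate`, `release_effort` and `projected_window` are hypothetical, the feature names and numbers in the usage note are invented, and the model makes loud simplifying assumptions (effort divides evenly across the team, calendar days rather than working days, and the bundle’s range is simply the sum of the low and high bounds).

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RangeEstimate:
    """Effort estimate for one feature, in days (hypothetical model)."""
    feature: str
    low_days: float
    high_days: float

    def __post_init__(self) -> None:
        if not 0 <= self.low_days <= self.high_days:
            raise ValueError("expected 0 <= low_days <= high_days")


def release_effort(estimates: Iterable[RangeEstimate]) -> Tuple[float, float]:
    """Sum the low and high bounds to get an effort range for the bundle."""
    items = list(estimates)
    return (sum(e.low_days for e in items), sum(e.high_days for e in items))


def projected_window(start: date, estimates: Iterable[RangeEstimate],
                     team_size: int = 1) -> Tuple[date, date]:
    """Turn the effort range into an earliest and latest projected date.

    Assumption: effort splits evenly across the team; weekends and
    holidays are ignored to keep the sketch short.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    low, high = release_effort(estimates)
    return (start + timedelta(days=math.ceil(low / team_size)),
            start + timedelta(days=math.ceil(high / team_size)))
```

For instance, early estimates of 10 to 30 days for one feature and 5 to 15 for another give a bundle range of 15 to 45 days. If those later tighten to 14 to 20 and 8 to 10, the range becomes 22 to 30 days: the window narrows as more is known, and the projected due date follows from the effort rather than the other way around.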

Conclusion

The contextual factors that determine the appropriateness of a development process will vary from industry to industry and from enterprise to enterprise. It is also important to recognize that these drivers vary from application to application within the same enterprise. A process that has achieved success with some groups in an organization may actually degrade the performance of other groups if it does not fit their context [7]. Standardization that ignores the appropriateness of a set of practices to the target environment may well do more harm than good.

[1] http://faculty.salisbury.edu/~xswang/Research/Papers/SERelated/no-silver-bullet.pdf
[2] http://en.wikipedia.org/wiki/Agile_software_development
[3] http://www.versionone.com/state_of_agile_development_survey/10/page3.asp
[4] http://www.botskool.com/geeks/list-andriod-based-smart-pnhones
[5] http://blogs.forrester.com/mike_gualtieri/12-12-11-technopolitics_podcast_agile_software_is_not_the_cats_meow
[6] https://genehughson.wordpress.com/2011/12/16/releases-or-escapes/
[7] http://thecodist.com/article/i_fear_our_mobile_group_being_forced_to_follow_scrum