From Good to Bad

Like any tool, a plan is deemed “good” or “bad” primarily based on how well it enables the user to accomplish some objective. Evaluated in this way, the quality of project planning “in the wild” ranges from outstanding to awful. A single plan can move from good to bad if the underlying circumstances change.

Fundamentally, a plan is “good” or “bad” based on how well it helps guide the project to successful completion. “Outstanding” plans pass this test very efficiently and also have characteristics that improve organizational learning and re-use potential.

Irrelevancy

Some plans are wonderful creations when viewed in isolation, full of detail and nuance, theoretically capable of efficiently guiding a project from start to finish. What happens when such plans are ignored by participants? The plans become at best irrelevant, since a different control structure is actually guiding the work.

Other plans are irrelevant because they don’t match the project circumstances. A common way for that to occur is when a past plan is imposed on a new project without the necessary adjustments.

A project plan is relevant only to the extent that it actually enables project effort. A plan that is not accepted and embraced by the project team is at best irrelevant.

Crisis

A plan that does not match the actual project circumstances is a recipe for disaster. Unlike an ignored plan, a mismatched plan that is actually followed steers decisions and effort in the wrong direction, and the resulting problems tend to surface as a crisis rather than as mere waste.

Failure Analysis

When designing anything, it is helpful to understand the various ways that failure can occur. The logic of Failure Modes and Effects Analysis (FMEA) is universal and applies to processes as well as to devices and the systems built from them. Any individual part can fail, often in multiple ways, but a large percentage of critical failures occur because an interaction between parts was either overlooked or not adequately managed.

A very similar situation exists in the world of project planning. Errors can certainly exist within a single set of task assumptions, but the errors that cause the greatest harm often result from the complex ways that multiple tasks interact with each other, with available resources, or with external stresses. The root cause may not be obvious until the full set of interactions is seen and carefully examined.

Everything you understand about quality, root cause analysis, and FMEA is equally valid in the realm of project planning.
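To make the analogy concrete, here is a minimal Python sketch of my own (an illustration, not a method from this text) that applies classical FMEA scoring, a Risk Priority Number computed as severity × occurrence × detection, to a few hypothetical plan failure modes, including a task-interaction mode:

```python
# Minimal sketch: FMEA-style scoring applied to plan failure modes.
# The failure modes and ratings below are hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class PlanFailureMode:
    description: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (almost certain)
    detection: int   # 1 (easily detected) .. 10 (very hard to detect)

    @property
    def rpn(self) -> int:
        # Risk Priority Number, as in classical FMEA
        return self.severity * self.occurrence * self.detection

modes = [
    PlanFailureMode("Single task estimate is too low", 4, 6, 3),
    PlanFailureMode("Two tasks silently compete for the same test rig", 7, 5, 8),
    PlanFailureMode("Approval cycle omitted from the schedule", 6, 4, 5),
]

# Review the highest-risk items first, just as an FMEA worksheet would.
for mode in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"RPN {mode.rpn:4d}  {mode.description}")
```

In this invented example the task-interaction mode ranks highest largely because it is hard to detect, echoing the point that interaction failures are the ones most often missed.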

Failure Types

There are many ways a project plan can fail; we’ll cover some of the most common ones here. In addition to understanding these failure modes, consider them examples of how you can extrapolate your engineering experience into the world of project planning.

Consistency

A project plan must be consistent with reality. That doesn’t mean it must be equally detailed, just that whatever is in the plan must not conflict with the real world it attempts to model. Conflicts between the plan and reality are fundamental defects that can have broad implications. Consistency can also be a purely internal issue within the plan itself.

Obvious examples of consistency errors include:
  • Expecting timely approval when stakeholder consensus has not yet been reached
  • Planning that depends upon resources becoming available at the exact moment of need, especially when resource availability is externally driven
  • Any violation of physical laws or principles
Other consistency errors are much more subtle:
  • Assuming that stakeholders will remain constant throughout the project
  • Ignoring the “switching overhead” that takes place when people multi-task
  • Assuming everyone knows what they should be doing
  • Assuming everyone wants the project to succeed

Risk is introduced whenever the project plan and reality are inconsistent. Small amounts of manageable risk are a good trade-off to simplify the planning process, but confidence in the planning and associated decision making can be quickly reduced by apparently trivial inconsistencies that interact in complex ways. Consistency errors can be subtle and very hard to spot during reviews.
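To make the internal flavor of consistency concrete, here is a minimal Python sketch (the task data, field names, and dates are assumptions for illustration only) that checks a plan for two purely mechanical conflicts: a task scheduled to start before its predecessor finishes, and one person assigned to overlapping tasks.

```python
# Minimal sketch: two mechanical internal-consistency checks on a plan.
# Task data, field names, and dates are hypothetical.
from datetime import date

tasks = {
    "design": {"start": date(2024, 3, 1),  "end": date(2024, 3, 20), "owner": "Ana", "after": []},
    "review": {"start": date(2024, 3, 18), "end": date(2024, 3, 25), "owner": "Ben", "after": ["design"]},
    "build":  {"start": date(2024, 3, 15), "end": date(2024, 4, 10), "owner": "Ana", "after": ["review"]},
}

problems = []

# 1. A task must not start before the tasks it depends on have finished.
for name, t in tasks.items():
    for dep in t["after"]:
        if t["start"] < tasks[dep]["end"]:
            problems.append(f"{name} starts before its predecessor {dep} ends")

# 2. One person assigned to two overlapping tasks implies multi-tasking overhead.
names = list(tasks)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        ta, tb = tasks[a], tasks[b]
        if ta["owner"] == tb["owner"] and ta["start"] < tb["end"] and tb["start"] < ta["end"]:
            problems.append(f"{ta['owner']} is double-booked on {a} and {b}")

for p in problems:
    print("Inconsistency:", p)
```

Checks like these catch only the mechanical conflicts; the subtler consistency errors listed above still require human judgment during review.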

Completeness

A project plan must be sufficiently complete that the information it provides is adequate to confidently make decisions and monitor progress. How to accomplish this varies widely in practice. Reaching that threshold in a critical project (such as building a nuclear power plant) will be orders of magnitude more difficult than for small, routine tasks where risk exposure is much lower.

Obvious examples of “completeness” errors that would impact decision making include:
  • Omitting required tasks and activities
  • Inadequately representing the hand-off and flow between tasks
  • Failure to account for approval and review cycles
Other “completeness” errors are much more subtle:
  • Inadequate visibility of actual task status at critical assessment points
  • “Sneak circuit” interactions between tasks, such as competition for resources
  • Preparation, set-up, tear-down and other necessary supporting activity
  • Supporting roles such as service, training, security, and others not highlighted in the mainstream task flow

Risk is introduced whenever the project plan is incomplete. Small amounts of manageable risk are a good trade-off to simplify the planning process, but confidence in the planning and associated decision making can be compromised if the holes are still present at the time when the decision must be made. Completeness errors are relatively easy to spot during reviews.

Your engineering experience is valuable in a great many ways. Here are a few more examples where familiar engineering concepts can be applied to project planning.

Latent Defects

A latent defect is a flaw that is present from the beginning but difficult to notice until triggered by an external event or condition. Many systems have latent defects that are never triggered. Consider that if the Titanic had not struck an iceberg, the design weakness of watertight compartments with open tops that allowed cascade failure might never have been revealed. Likewise, software can execute perfectly during structured testing but quickly fail in the field when complex interactions with host and user environments are first encountered.


Experience is wonderful, but limited to the combination of factors that were present during your past effort. Plans that are based solely on past experience, rather than derived from the nature of the work itself and then validated against past experience, are very likely to contain latent defects.

A plan that “didn’t fail last time” may still contain serious flaws that are simply waiting for the right set of circumstances to be revealed. The absence of failure often leads to a false sense of confidence – the Space Shuttle Challenger disaster is a classic example of this pattern.

“Drift” Failures

Some failures occur so slowly that they escape notice until they result in more visible problems. Geologic fault lines create stresses due to differential motion; those stresses produce increasing strain and, if the strain is not released, ultimately unplanned movement. Aging components in sensitive circuits eventually lead to out-of-tolerance behavior. Likewise, mechanical wear accumulates until eventual failure.

The gradual drift of a fixed plan relative to an evolving project reality creates stresses that ultimately impact the project. Without some equivalent of strain gauges, the warning signs are frequently missed.
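One way to picture a planning “strain gauge” (my illustration, not something prescribed by the text) is a running measure of how far actuals have drifted from the baseline plan, with an alarm when the drift crosses a chosen threshold. A minimal Python sketch, with all numbers assumed:

```python
# Minimal sketch: a "strain gauge" for plan drift.
# Planned vs. actual cumulative completion per reporting period is hypothetical.
planned = [10, 20, 30, 40, 50, 60]   # cumulative % complete the plan expected
actual  = [10, 19, 27, 33, 38, 42]   # cumulative % complete actually reported

DRIFT_ALARM = 10  # percentage points of drift we choose to tolerate (an assumption)

for period, (p, a) in enumerate(zip(planned, actual), start=1):
    drift = p - a
    status = "ALARM: replan or recover" if drift >= DRIFT_ALARM else "ok"
    print(f"period {period}: planned {p}%, actual {a}%, drift {drift:+d} pts -> {status}")
```

Each individual period looks tolerable, yet the cumulative drift quietly grows until the alarm finally trips, which is exactly how drift failures tend to present themselves.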

“Brittle” Failures

While some failures are gradual, others advance rapidly from an initiating event to complete failure. As an example, instead of simply deforming as it was expected to do, the hull of the Titanic is believed to have fractured on impact. In part the result of high-sulfur steel, and certainly exacerbated by the cold temperatures and the speed of the collision, those fractures greatly increased the severity of the accident.

Some projects are effectively more ductile than others. If each task and each interaction must take place in an exact manner, expect your project to behave like a ceramic!

Project plans typically experience “brittle” failures when a widespread latent defect is exposed to one or more major stresses. The combination of lack of warning and rapid progression makes this type of failure very difficult to manage.

“Tolerance Stack” Failures

It is common for parts to have allowable tolerances, and this provides a valuable trade-off between overall objectives and the cost of increasingly precise execution. Wherever multiple tolerances interact, the combination of those interactions must also be managed. The term “tolerance stack” is often used to describe the process of examining multiple tolerances as a set rather than independently at the part level.

In project planning, the tolerance stack may transform minor inaccuracies in estimates of time or resource needs into critical failure points. Consider what happens when a single task takes five percent longer than anticipated to complete. The resources that were supposed to have been freed up at the end of the task remain locked, and downstream tasks dependent upon those resources are made late. Tolerance stack issues often arise when work performed by suppliers or the customer is inadequately controlled.
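As a back-of-the-envelope illustration (the durations and the five percent overrun are assumed, not taken from any real project), here is a minimal Python sketch showing how a modest overrun on each task in a sequential chain stacks into a larger slip at the end:

```python
# Minimal sketch: how small per-task overruns stack along a dependency chain.
# Durations and the 5% overrun are illustrative assumptions.
durations = [10, 15, 20, 10, 25]   # planned working days for five sequential tasks
overrun = 0.05                     # each task runs 5% long

start = 0.0
for i, d in enumerate(durations, start=1):
    planned_start = sum(durations[:i - 1])
    print(f"task {i}: planned start day {planned_start}, actual start day {start:.1f} "
          f"(starts {start - planned_start:.1f} days late)")
    start += d * (1 + overrun)

print(f"project slips {start - sum(durations):.1f} days overall")
```

No single task looks alarming on its own, yet each downstream start date slips a little further, and the final milestone absorbs the entire accumulated error.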

Project plans can appear fine when examined at the individual task level, yet fail when the tasks are combined due to cumulative tolerance errors. Careful consideration of interactions is necessary to predict and avoid such problems.

“Physics of Failure”

A particularly useful engineering analysis method is to examine failure risk in relationship to the force needed to cause that failure. In engineering those forces are things like voltage, kinetic energy, chemical reactions, and so forth.

In project planning you’ll be dealing with forces such as:
  • Money
  • Position authority
  • Influence
  • Politics
  • Reward and punishment mechanisms


The logic is the same: below a certain activation-energy level a force can be ignored; beyond that threshold its impact becomes increasingly significant until breakdown and failure finally occur.

You can also evaluate susceptibility to a particular force. Just as a magnetic field has very little impact on some parts yet is critical to others, you can evaluate tasks to see which are vulnerable to politics, funding, or many other external forces. The stronger the force, and the more tightly it is coupled to your project, the greater the potential impact.
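Here is a minimal, hypothetical sketch of that evaluation: each force has a strength and a coupling to the project, each task has a susceptibility to each force, and any combination above a chosen activation threshold is flagged. All names, scores, and the threshold are assumptions for illustration only.

```python
# Minimal sketch: exposure of tasks to external "forces" acting on a project.
# Forces, tasks, and scores are hypothetical illustrations.
forces = {               # strength (0-1) and coupling to this project (0-1)
    "funding":  {"strength": 0.8, "coupling": 0.9},
    "politics": {"strength": 0.6, "coupling": 0.4},
}

tasks = {                # susceptibility of each task to each force (0-1)
    "prototype build":  {"funding": 0.9, "politics": 0.2},
    "status reporting": {"funding": 0.1, "politics": 0.8},
}

THRESHOLD = 0.4          # below this "activation" level we accept the risk (an assumption)

for task, susceptibility in tasks.items():
    for force, s in susceptibility.items():
        f = forces[force]
        exposure = f["strength"] * f["coupling"] * s
        if exposure >= THRESHOLD:
            print(f"{task}: significant exposure to {force} (score {exposure:.2f})")
```

The multiplication captures the intuition above: a strong force that is only loosely coupled to the project, or a tightly coupled force acting on an insensitive task, may fall below the threshold and be safely ignored.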

Project plans fail for specific reasons, just like devices do, and awareness of the forces acting upon the project is a key source of insight needed to offset planning risk.

Entropy

Last on our quick list, but perhaps the most universal threat to planning, is simply entropy. Everything, including project plans, moves toward a less ordered state unless energy is spent to offset that movement. As the poet Robert Burns famously wrote, “The best laid schemes o’ Mice an’ Men gang aft agley.”



All project plans, regardless of how perfect they may be at the moment of creation, drift toward chaos unless constantly monitored and renewed.