Recently I ran across some extremely interesting data analysis about software project estimates versus actuals, courtesy of Erik Bernhardsson (hereafter, “EB”). It provides context about how much risk is in software projects and, by implication, how to identify and manage that risk.
Since PL Programs project consultants structure and manage supply-chain projects like fulfillment centers, distribution centers, and reverse logistics sites, and since a major workstream in those projects is the Warehouse Management System (WMS) and related software (ERPs, TMSs, LMSs, CMMSs, HRISs, and so on), I was immediately interested in what this means for program risk management.
Risk management is a key source of value in project management, so being able to identify high-risk areas, and, even better, quantify them, can help project planning organizations avoid major, costly missteps.
In summary, this is empirical verification of what is ‘common knowledge’: software projects are much more likely than not to exceed budget estimates, and there is some outlier chance of truly ruinous outcomes. Fortunately, there are ways to proactively address the risk of budget (and schedule) disaster.
Let’s take a look.
Background: Statistical Models of Software Projects
What Erik found from examining a dataset of thousands of software task estimates and their actual effort was illuminating. Summarized in his words (bold mine):
The median blowup factor turns out to be exactly 1x for this dataset, whereas the mean blowup factor is 1.81x. Again, this confirms the hunch that developers estimate the median well, but the mean ends up being much higher.
- People estimate the median completion time well, but not the mean.
- The mean turns out to be substantially worse than the median, due to the distribution being skewed (log-normally).
- When you add up the estimates for n tasks, things get even worse.
- Tasks with the most uncertainty (rather than the biggest size) can often dominate the mean time it takes to complete all tasks.
- The mean time to complete a task we know nothing about is actually infinite.
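The median-versus-mean gap above is a direct consequence of log-normal skew. A quick simulation shows the effect; the σ = 1 here is an illustrative assumption, not a value fitted to EB’s dataset:

```python
import math
import random
import statistics

random.seed(42)

# Simulate "blowup factors" for tasks whose actual/estimate ratio is
# log-normal: log(ratio) ~ Normal(0, sigma). Sigma is illustrative.
sigma = 1.0
ratios = [math.exp(random.gauss(0, sigma)) for _ in range(100_000)]

median_ratio = statistics.median(ratios)  # ~ exp(0) = 1: the median estimate is right
mean_ratio = statistics.fmean(ratios)     # ~ exp(sigma**2 / 2) ≈ 1.65: the mean blows up

print(f"median blowup ≈ {median_ratio:.2f}")
print(f"mean blowup   ≈ {mean_ratio:.2f}")
```

Even though half the simulated tasks come in at or under estimate, the long right tail drags the mean well above 1x, which is exactly the pattern EB describes.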
The whole piece is really worth reading, as is the followup on judging when to cut bait on a project. But what does it mean for us?
Review of Data
This got me interested so I pulled the dataset and did some decidedly-not-so-statistically-sophisticated analysis.
I pivoted the total project estimates and actuals, with some mild cleaning: removing the small tasks and the anomalous “7” estimate/actual rows. I added a calculated field for the “blowup factor”, as Erik put it: the ratio of actual-to-estimated effort. A high blowup factor means the actual effort exceeded the estimate.
A blowup factor of 1 means that the actual effort matched the estimated effort (Remember this!).
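In code, the calculation is trivial; the task names and hours below are made up purely for illustration:

```python
# Toy illustration of the blowup factor: actual effort divided by
# estimated effort. Task names and hours are fabricated examples.
tasks = [
    {"name": "config pick waves", "estimate_h": 40, "actual_h": 40},
    {"name": "ERP integration",   "estimate_h": 80, "actual_h": 200},
    {"name": "label printing",    "estimate_h": 16, "actual_h": 12},
]

for t in tasks:
    # blowup == 1.0 means actual matched estimate; > 1.0 means overrun.
    t["blowup"] = t["actual_h"] / t["estimate_h"]
    print(f'{t["name"]}: {t["blowup"]:.2f}x')
```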
I took the project data, created a table of the by-project summaries*, and put it in a histogram of blowup factor at 0.1 increments:
Remember that Blowup Factor of 1x being where “estimate” = “actual effort”? Look at the chart. The cumulative probability of a project being at or under budget is 40%.
This means that the probability of a project exceeding budget is 60%.
The median blowup in that dataset is 1.09. The average (consistent with EB’s post, though my numbers/methodology are slightly different) is 1.25. This rings true with all sorts of literature, from The Mythical Man-Month to various other journals, software-project books, and scheduling Monte-Carlo simulations.
But, again drawing out EB’s analysis, the top 20% of cases have an average blowup of 2.6, and at the 95th percentile of this dataset the blowup was 5.32. Fitting a distribution curve to all this would let you make forecasts at any level of probability (again, see EB; a 99th-percentile case yields a 32x blowup), but this is enough to catch our attention.
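Reading tail risk off a sample like this is just a percentile lookup. The blowup values below are fabricated stand-ins (not the real dataset), and the nearest-rank method is one common convention:

```python
import math

# Fabricated sample of per-project blowup factors, sorted ascending.
blowups = sorted([0.6, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.5, 2.6, 5.3])

def percentile(sorted_data, p):
    """Nearest-rank percentile (0 < p <= 100) of an ascending list."""
    k = max(0, math.ceil(p / 100 * len(sorted_data)) - 1)
    return sorted_data[k]

print("median blowup:", percentile(blowups, 50))
print("95th pct    :", percentile(blowups, 95))
```

With real project data in place of the toy list, the same lookup gives the contingency planner a concrete number for any risk tolerance.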
It means that software projects are dangerous. Some percentage of the time you end up with extreme overruns; in this dataset, 5% of projects ran more than five times over estimate. And while the most common outcome may be meeting budget, your average expected effort is 125% of your original estimate.
This has obvious budgeting and contingency implications.
Identifying Risk Factors
What makes software projects so dangerous? Well, the level of uncertainty of the work itself.
EB points out in his second post that there are some characteristics of projects with different levels of uncertainty (represented here by the parameter σ):
Where do Warehouse systems come into play here?
Well, many Warehouse Management Systems have a defined set of core functionality. From that core functionality, they offer some level of configuration ability. They must be integrated with a variety of ERPs, automation systems, business partner systems, reporting, financial systems, timekeeping and HR systems, transportation management, and others. On top of that, customers may require customizations or enhancements for specific business uses.
So a low-σ case for a WMS implementation would have the characteristics of few or simple integrations, business processes that are configurable out-of-the-box, and no customizations. Critically, this is also the case where all requirements are identified and understood. This might be the case where the project meets budget and schedule.
The High-σ case would include implementing a new, never-before-operated system or set of functionality modules, many customizations, and many or complex integrations. Requirements here are not all identified or clear. If you are facing a high-σ implementation, then expect more complexity and effort and overall risk.
And of course there is a gradient of complexity between the two situations.
What The Project Manager Can Do
The project manager, as keeper of the risk management process, can do a few things to help out.
The first is simply being aware of the risk factor and identifying it with the team. A WMS is not always a WMS, and the project team needs to know what they’re working with.
Second, the mitigation plan may include outright risk avoidance: converting high-σ factors into lower-σ ones.
The key theme is to reduce uncertainty. Perhaps all the customizations aren’t really needed. Maybe there is an out-of-the-box option for that labor or yard management module. The cost of buying additional features from another vendor should be weighed against the uncertainty of development complexity and the ongoing maintenance requirements.
Last, the risk can be mitigated. Here are two strategies for mitigating a high-uncertainty situation:
A key driver of uncertainty is the level of completeness of the requirements-gathering. If all requirements are thoroughly captured and understood, then the uncertainty (by definition) goes down. So for any system implementation that even sniffs of uncertainty, pay extra attention and take time during the requirements-gathering part of design.
Bringing σ from high to medium, or from medium to low, dramatically increases the odds of a successful project.
In addition, project budget should include schedule and cost buffer (contingency) based on the assessed complexity. After scoping the project, the budget must be revisited from a risk standpoint and contingency evaluated on the factors above.
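One way to size that contingency is to assume a log-normal blowup distribution (per EB’s model) and pick a σ per assessed risk tier. The tier-to-σ mapping and dollar figure below are illustrative assumptions, not calibrated values:

```python
import math
from statistics import NormalDist

# Illustrative sigma per assessed risk tier; not calibrated to real data.
SIGMA_BY_TIER = {"low": 0.3, "medium": 0.6, "high": 1.0}

def budget_with_contingency(base_estimate, tier, confidence=0.85):
    """Budget that covers `confidence` of outcomes, assuming the blowup
    factor is log-normal with median 1 and the tier's sigma."""
    z = NormalDist().inv_cdf(confidence)
    blowup = math.exp(z * SIGMA_BY_TIER[tier])
    return base_estimate * blowup

base = 1_000_000  # hypothetical base estimate in dollars
for tier in SIGMA_BY_TIER:
    print(f"{tier}: ${budget_with_contingency(base, tier):,.0f}")
```

The point is not the specific numbers but the shape: a high-σ project needs disproportionately more buffer to reach the same confidence level, which is exactly why reducing σ is so valuable.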
Likewise, the integrated testing plan and ramp schedule should have room to wiggle if a complex system is in the works. Add a few weeks of “defect fix” or a slower ramp plan if the systems are high-uncertainty.
The bottom line is to be aware of the possible range of outcomes of this critical part of the project and to take appropriate risk management precautions.
If you found this interesting, please subscribe! Or if you have projects coming up and could use some expert project and risk management, set up a project consultation.
“*” Project data table: