Software development effort estimation


In software development, effort estimation is the process of predicting the most realistic amount of effort required to develop or maintain software based on incomplete, uncertain and noisy input. Effort estimates may be used as input to project plans, iteration plans, budgets, investment analyses, pricing processes and bidding rounds.

State-of-practice

Published surveys on estimation practice suggest that expert estimation is the dominant strategy when estimating software development effort.
Typically, effort estimates are over-optimistic and there is a strong over-confidence in their accuracy. The mean effort overrun seems to be about 30% and not decreasing over time. For a review of effort estimation error surveys, see. However, the measurement of estimation error is problematic, see [|Assessing the accuracy of estimates].
The strong overconfidence in the accuracy of the effort estimates is illustrated by the finding that, on average, if a software professional is 90% confident or “almost sure” to include the actual effort in a minimum-maximum interval, the observed frequency of including the actual effort is only 60-70%.
Currently the term “effort estimate” is used to denote as different concepts such as most likely use of effort, the effort that corresponds to a probability of 50% of not exceeding, the planned effort, the budgeted effort or the effort used to propose a bid or price to the client. This is believed to be unfortunate, because communication problems may occur and because the concepts serve different goals.

History

Software researchers and practitioners have been addressing the problems of effort estimation for software development projects since at least the 1960s; see, e.g., work by Farr and Nelson.
Most of the research has focused on the construction of formal software effort estimation models. The early models were typically based on regression analysis or mathematically derived from theories from other domains. Since then a high number of model building approaches have been evaluated, such as approaches founded on case-based reasoning, classification and regression trees, simulation, neural networks, Bayesian statistics, lexical analysis of requirement specifications, genetic programming, linear programming, economic production models, soft computing, fuzzy logic modeling, statistical bootstrapping, and combinations of two or more of these models. The perhaps most common estimation methods today are the parametric estimation models COCOMO, SEER-SEM and SLIM. They have their basis in estimation research conducted in the 1970s and 1980s and are since then updated with new calibration data, with the last major release being COCOMO II in the year 2000. The estimation approaches based on functionality-based size measures, e.g., function points, is also based on research conducted in the 1970s and 1980s, but are re-calibrated with modified size measures and different counting approaches, such as the use case points or object points in the 1990s.

Estimation approaches

There are many ways of categorizing estimation approaches, see for example. The top level categories are the following:
Below are examples of estimation approaches within each category.
Estimation approachCategoryExamples of support of implementation of estimation approach
Analogy-based estimationFormal estimation modelANGEL, Weighted Micro Function Points
WBS-based estimationExpert estimationProject management software, company specific activity templates
Parametric modelsFormal estimation modelCOCOMO, SLIM, SEER-SEM, TruePlanning for Software
Size-based estimation modelsFormal estimation modelFunction Point Analysis, Use Case Analysis, Use Case Points, SSU, Story points-based estimation in Agile software development, Object Points
Group estimationExpert estimationPlanning poker, Wideband delphi
Mechanical combinationCombination-based estimationAverage of an analogy-based and a Work breakdown structure-based effort estimate
Judgmental combinationCombination-based estimationExpert judgment based on estimates from a parametric model and group estimation

Selection of estimation approaches

The evidence on differences in estimation accuracy of different estimation approaches and models suggest that there is no “best approach” and that the relative accuracy of one approach or model in comparison to another depends strongly on the context
. This implies that different organizations benefit from different estimation approaches. Findings that may support the selection of estimation approach based on the expected accuracy of an approach include:
The most robust finding, in many forecasting domains, is that combination of estimates from independent sources, preferable applying different approaches, will on average improve the estimation accuracy.
It is important to be aware of the limitations of each traditional approach to measuring software development productivity.
In addition, other factors such as ease of understanding and communicating the results of an approach, ease of use of an approach, and cost of introduction of an approach should be considered in a selection process.

Assessing the accuracy of estimates

The most common measure of the average estimation accuracy is the MMRE, where the MRE of each estimate is defined as:
MRE =
This measure has been criticized
and there are several alternative measures, such as more symmetric measures, Weighted Mean of Quartiles of relative errors
and Mean Variation from Estimate.
MRE is not reliable if the individual items are skewed. PRED is preferred as a measure of estimation accuracy. PRED measures the percentage of predicted values that are within 25 percent of the actual value.
A high estimation error cannot automatically be interpreted as an indicator of low estimation ability. Alternative, competing or complementing, reasons include low cost control of project, high complexity of development work, and more delivered functionality than originally estimated. A framework for improved use and interpretation of estimation error measurement is included in.

Psychological issues

There are many psychological factors potentially explaining the strong tendency towards over-optimistic effort estimates that need to be dealt with to increase accuracy of effort estimates. These factors are essential even when using formal estimation models, because much of the input to these models is judgment-based. Factors that have been demonstrated to be important are: Wishful thinking, anchoring, planning fallacy and cognitive dissonance. A discussion on these and other factors can be found in work by Jørgensen and Grimstad.
The chronic underestimation of development effort has led to the coinage and popularity of numerous humorous adages, such as ironically referring to a task as a "small matter of programming", and citing laws about underestimation:
Adding to the fact that estimating development efforts is hard, it's worth stating that assigning more resources doesn't always help.

Comparison of development estimation software

SoftwareSchedule estimateCost estimateCost ModelsInputReport Output FormatSupported Programming LanguagesPlatformsCostLicense
AFCAA REVICREVICKLOC, Scale Factors, Cost Driversproprietary, TextanyDOSFreeProprietary Free for public distribution
Seer for SoftwareSEER-SEMSLOC, Function points, use cases, bottoms-up, object, featuresproprietary, Excel, Microsoft Project, IBM Rational, Oracle Crystal BallanyWindows, Any CommercialProprietary
SLIMSLIMSize, constraints, scale factors, historical projects, historical trendsproprietary, Excel, Microsoft Project, Microsoft PowerPoint, IBM Rational, text, HTMLanyWindows, Any CommercialProprietary
TruePlanningPRICEComponents, Structures, Activities, Cost drivers, Processes, Functional Software Size, Function Points, Use Case Conversion Points, Predictive Object Points Excel, CADanyWindowsCommercialProprietary