Chapters 7, 8, 9, 13, and 14
Ninety-Ninety Rule (Tom Cargill)
The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time.
Highlights the frequent underestimation of the final stages of software projects.
Judgement from Experience (Fred Brooks)
Good judgment comes from experience, and experience comes from bad judgment
Software Estimation Problems
Software time and cost estimates are often wrong because software projects are usually new and unique, unlike estimating the cost of a new home, where plenty of comparable data exists.
Hidden Schedule Slippage
Falling behind schedule in software projects is hard to notice until it’s too late.
Biggest Portion of Cost
Effort (person-months) - time and salaries
Cost Equation
Cost = Effort × loaded salary per person-month
Effort
Development time, everything from requirements engineering to delivery; excludes maintenance phase.
Cost Models Creation Options
Create controlled experiments to determine influence of various cost factors
Derive from real project data
Cost Model - Controlled Experiments (Description + Cons)
Keeping all factors constant, except for the one factor that is being studied, randomly assign subjects to tasks to observe impact
Very expensive; done in small scales on low-cost subjects (students)
Validity often questionable
Cost Model - Real Project Data
Use time logs from large companies for multiple SW projects; base SW efforts on sound hypotheses of how factors are related (basis for COCOMO)
Results depend on sound hypothesis
Different projects (web app vs embedded software; same results?)
Differing Results Between Cost Models (Why?)
Different:
Definitions (Man-months, LOC)
Tuning/Calibrating
Tools used
Product/Process Maturity
Expertise
Cost Model Calibration
Existing cost models must be tuned or calibrated to your software engineering work environment (adding your own data to bias the tool). Typically with past, ‘similar to’ project data.
Cost Models Commonality
They all encode some basic, common relations between software development activities and cost
Useful Relationships Found in Cost Models
More code → higher cost (so reuse existing code where possible)
Get the best from people (individual/team capabilities, better work environment/incentives cost a little more, but saves a lot of money)
Avoid rework (use info hiding, fix problems early in cycle)
Tools help (better tooling measurably reduces effort)
Algorithmic Models
Cost estimation models based on mathematical formulas that relate project attributes to costs.
Basic Cost Model
The Basic Cost Model
Effort in person-months = a+b (kLOC)^c
kLOC = 1,000 lines of code; a, b, c, are constants
Basic Cost Model (Instances)
Halstead → Effort = 0.7 kLOC^1.50
Boehm → Effort = 2.4 kLOC^1.05
Walston-Felix → Effort = 5.2 kLOC^0.91
Each may use different LOC definitions (makes it harder to compare numbers directly)
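A small sketch (constants taken from the three instances above) makes the differing growth rates concrete:

```python
# Effort (person-months) under the basic cost model E = b * kLOC^c,
# using the (b, c) constants of the three instances listed above.
MODELS = {
    "Halstead":      (0.7, 1.50),
    "Boehm":         (2.4, 1.05),
    "Walston-Felix": (5.2, 0.91),
}

def effort(kloc: float, b: float, c: float) -> float:
    """Basic cost model: effort in person-months."""
    return b * kloc ** c

for name, (b, c) in MODELS.items():
    print(f"{name:13s}: {effort(10, b, c):6.1f} PM at 10 kLOC, "
          f"{effort(100, b, c):6.1f} PM at 100 kLOC")
```

Because the exponents straddle 1, Halstead (c = 1.50) is the cheapest model for small systems but the most expensive at scale (0.7 × 1000 = 700 PM at 100 kLOC), which shows why the exponent matters more than the multiplier as size grows.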
Slowest vs Fastest Effort Growth (Halstead, Boehm, Walston-Felix)
Halstead predicts the fastest effort growth (c = 1.50), while Walston-Felix predicts the slowest effort growth (c = 0.91).
Early Estimation of kLOC
The earlier in the project, the harder it is to estimate, which reduces accuracy and usefulness of the three basic cost models. A better estimate can be made once a detailed design is in place.
Walston-Felix
A cost estimation model published in a 1976 IBM Systems Journal article where Effort = 5.2 kLOC^0.91 and Duration = 4.1 kLOC^0.36. It was derived from about 60 IBM projects of varying sizes and programming languages. Even when applied back to those same 60 projects, the model yields unsatisfactory results.
Walston-Felix Productivity Intervals
29 variables that influence productivity, introduced (along with their levels/values) to explain and address the unsatisfactory results the model produces. Each variable is rated at a level (small, medium, or large) that affects project outcome.
Interval = | Large - Small | → large intervals = big impact on cost.
COCOMO 81
A cost estimation model developed by Barry Boehm in 1981 (Book: Software Engineering Economics) that distinguishes three project classes: Organic, Embedded, Semi-detached projects, each with varying cost and schedule parameters.
Simplest form = basic COCOMO: Effort = b(kLOC)^c
COCOMO 81 - Organic Project Class
A COCOMO 81 project type that has a smaller team size, a known environment, lots of experience with similar projects. People can contribute early on with little learning overhead.
b = 2.4
c = 1.05
COCOMO 81 - Embedded Project Class
A COCOMO 81 project type where the product is embedded in an inflexible environment that poses severe constraints.
b = 3.6
c = 1.20
COCOMO 81 - Semi-detached Project Class
A COCOMO 81 project type that combines/balances aspects from the other two classes.
b = 3.0
c = 1.12
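A minimal basic-COCOMO sketch using the (b, c) pairs of the three project classes above:

```python
# Basic COCOMO 81: Effort = b * (kLOC)^c, with (b, c) per project class
# as given in the cards above.
COCOMO81 = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}

def basic_cocomo_effort(kloc: float, project_class: str) -> float:
    """Effort in person-months for one of the three COCOMO 81 classes."""
    b, c = COCOMO81[project_class]
    return b * kloc ** c
```

For a 50 kLOC system this gives roughly 146 PM (organic), 240 PM (semi-detached), and 394 PM (embedded): the class choice alone can nearly triple the estimate.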
Basic vs Intermediate/Detailed COCOMO
Basic COCOMO is simple and provides crude estimates (based on nominal values for cost drivers)
Intermediate/Detailed COCOMO is more complex: it uses 15 cost drivers that influence productivity and generally gives better estimates.
Putnam
A cost estimation model based on the Manpower Required (MR) of a software project over time; it is well suited to estimating the cost of very large software projects. It led to the rule of thumb: 40% of total effort for development, 60% for maintenance. Manpower is approximated by a Rayleigh distribution:
MR(t) = 2·K·a·t·e^(−a·t²)
a = speed-up factor
K = total manpower required
t = time
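A sketch of the manpower curve, assuming the standard Norden/Rayleigh form MR(t) = 2·K·a·t·e^(−a·t²):

```python
import math

def manpower(t: float, K: float, a: float) -> float:
    """Rayleigh manpower curve MR(t) = 2*K*a*t*exp(-a*t^2).

    K is the total manpower required (the area under the curve);
    the staffing level peaks at t = 1/sqrt(2a).
    """
    return 2 * K * a * t * math.exp(-a * t * t)
```

Integrating MR(t) from 0 to infinity recovers K exactly, which is consistent with K being the total manpower required.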
Function Point Analysis
A method to measure the size of software projects that takes the number of data structures (instead of kLOC) as input. It assumes data structures predict project size; it is well suited for applications in which data structures dominate project size, less suited for algorithm-heavy applications.
Function Point Analysis - Number of input types (I)
User inputs that cause changes to data structures; each different format or treatment counts separately.
Function Point Analysis - Number of output types (O)
User outputs that reflect changes made to data structures; each unique output format is counted separately.
Function Point Analysis - Number of inquiry types (E)
Inputs that control execution without altering internal data structures, like menu selections.
Function Point Analysis - Number of logical internal files (L)
Internal data generated, used, and maintained by the system, such as index files.
Function Point Analysis - Number of interfaces (F)
Data output to or shared with another application
Unadjusted Function Points (UFP)
Weighted sum of five concepts: UFP = 4I + 5O + 4E + 10L + 7F. Reflects average complexity of concepts.
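The weighted sum translates directly into code:

```python
def ufp(I: int, O: int, E: int, L: int, F: int) -> int:
    """Unadjusted Function Points with the average-complexity weights
    from the card above: inputs 4, outputs 5, inquiries 4,
    logical internal files 10, interfaces 7."""
    return 4 * I + 5 * O + 4 * E + 10 * L + 7 * F
```

For example, a system with 2 input types, 3 output types, 1 inquiry type, 1 logical internal file, and no interfaces scores 4·2 + 5·3 + 4·1 + 10·1 = 37 UFP.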
COCOMO 2 (COCOMO 95)
Developed by Barry Boehm et al.; introduced in a 1995 article and fully described in the 2000 book Software Cost Estimation with COCOMO II. It adapts COCOMO to modern software lifecycle practices.
ICASE
Integrated Computer Aided Software Environment; similar to an IDE like Eclipse.
Object-Points
In COCOMO 2, objects like screens, reports, and 3GL modules used to estimate effort in the Application Composition model.
Function-Points
Metric for software size based on user inputs, outputs, inquiries, internal files, and external interfaces; used in early project stages.
3GL
Third-Generation Languages like C or Fortran; integration into higher-level systems (e.g., Java) can be costly.
Three Cost Models (COCOMO 2)
Application Composition (early prototyping), Early Design (architecture stage), and Post-Architecture (development stage).
Application Composition Model
Estimates effort based on object points; involves determining screen/report complexity, adjusting for reuse, and calculating effort with productivity rates.
Early Design Model
Uses Unadjusted Function Points (UFP), converts them to SLOC, applies seven cost drivers, and estimates effort early in the design phase.
Post-Architecture Model
Most detailed model; uses constants, 5 scale factors, and 17 cost drivers to refine cost estimation after system architecture is decided.
Product Factors
Cost drivers related to reliability, database size, product complexity, required reusability, and documentation needs.
Platform Factors
Cost drivers that account for execution-time constraints, main storage requirements, and platform volatility.
Personnel Factors
Cost drivers evaluating analyst and programmer capabilities, experience, language/tool familiarity, and team continuity.
Project Factors
Cost drivers related to software tool maturity, multi-site development challenges, and schedule flexibility.
Precedentedness
A scale factor assessing how novel the project is to the organization, influencing risk and complexity.
Development Flexibility
A scale factor measuring how tightly the software must conform to preset requirements and external specifications.
Architecture/Risk Resolution
A scale factor evaluating the degree of early risk identification, architecture definition, and critical milestone planning.
Team Cohesion
A scale factor assessing team alignment, stakeholder cooperation, and ability to work toward shared goals.
Process Maturity
A scale factor rating the organization’s process maturity, often based on CMM (Capability Maturity Model).
Counting Code with Reuse
Adjusts effort estimates based on design, code, and integration modifications needed to reuse existing code, using formulas involving DM, CM, IM, SU, and AA.
Reuse Model
Factors like Software Understanding (SU) and Assessment and Assimilation (AA) adjust for the quality and integration effort of reused code.
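The cards name the reuse factors (DM, CM, IM, SU, AA) without giving the formulas. As a hedged sketch, here is the COCOMO II-style reuse model; the coefficients (0.4/0.3/0.3, the 0.02·SU·UNFM term) and the UNFM (unfamiliarity) parameter are taken from COCOMO II and are assumptions relative to this document:

```python
def equivalent_sloc(asloc: float, dm: float, cm: float, im: float,
                    su: float, aa: float, unfm: float) -> float:
    """COCOMO II-style reuse model (coefficients assumed, not from the cards).

    asloc: adapted SLOC being reused
    dm, cm, im: % of design / code / integration modified
    su: Software Understanding penalty, aa: Assessment & Assimilation,
    unfm: programmer unfamiliarity with the reused code (0..1)
    """
    # Adaptation Adjustment Factor: weighted modification percentage.
    aaf = 0.4 * dm + 0.3 * cm + 0.3 * im
    if aaf <= 50:
        return asloc * (aa + aaf * (1 + 0.02 * su * unfm)) / 100
    return asloc * (aa + aaf + su * unfm) / 100
```

Unmodified, well-understood code thus counts only a small fraction of its size toward effort, while heavily modified, hard-to-understand code can cost more than writing it from scratch.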
Use-Case Points
Estimation technique based on UML-style use cases; adjust Unadjusted Use-Case Points (UUCP) with Technical Complexity Factors (TCF) and Environmental Complexity Factors (ECF) to get UCP.
Summing Up UUCP
Count unadjusted use-case points by adding the complexities of use cases and actors.
Actor Complexity
Actor weight: Simple (1), Average (2), Complex (3), based on interaction type.
Adjusting UUCP
Multiply UUCP by the Technical Complexity Factor (TCF) and Environmental Complexity Factor (ECF).
13 Technical Complexity Factors (TCF)
Include factors like distributed system, performance, reusability, concurrency, portability, and security, among others.
8 Environmental Complexity Factors (ECF)
Development-team factors like UML familiarity, analyst capability, application experience, object-oriented experience, motivation, and requirements stability.
Formula to Calculate Use-Case Points
UCP = UUCP × TCF × ECF
State of Software Cost Models
Many rely on past project data but lack systematic collection and sharing across projects.
Limited Quality (Magne Jørgensen)
A review of studies on expert estimation of software development effort showed mixed accuracy for both cost models and expert estimates: 5 studies favored models, 5 favored expert judgment, and 5 found no difference.
Expert Estimates: Avoid Traps
Avoid traps that distort true effort estimation: compensating for lack of data, work filling the available time ("given X months, we'll take X months"), pricing to win ("the competitor's bid is X, so we bid 9/10 of X"), and the budget method.
Guidelines for Expert-Based Estimates
Don't mix estimation, planning, and bidding
Combine estimation methods (average the estimates of multiple experts)
Ask for justification/estimate rationale (no gut feelings)
Select experts with experience on similar projects
Accept and assess uncertainty
Provide opportunities to learn to estimate
Consider postponing or avoiding effort estimation
Winner's Curse Bidding
Over-optimistic bidders win fixed-price contracts, leading to high project risk (in an auction the highest bid wins, but the item may be worth far less).
For software:
Over-optimistic bidders bid lower
The client selects a low or lowest bid
A lower bid means higher risk
Good Way to Improve Estimates
The most effective approach to increasing the accuracy of cost estimation: use estimates in performance evaluations, making estimators accountable for their estimates.
Estimate Is Not a Point
An estimate is a range of values, not a single point; use three-point estimation: (optimistic + 4 × realistic + pessimistic) / 6
Cone of Uncertainty
At the start of a project uncertainty is high and decreases over time; estimates should be updated frequently. People underestimate the amount of uncertainty.
Inexperience → Overestimate Abilities
Inexperienced people tend to overestimate their skills, leading to unrealistic project estimates. More mature organizations have fewer cost overruns because they are more aware of their abilities.
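The three-point formula above is simple enough to sketch directly:

```python
def three_point_estimate(optimistic: float, realistic: float,
                         pessimistic: float) -> float:
    """PERT-style expected value: (optimistic + 4*realistic + pessimistic) / 6."""
    return (optimistic + 4 * realistic + pessimistic) / 6

# e.g. best case 4 weeks, realistic 6, worst case 14 -> 7 weeks;
# the result sits above the realistic value because the tail is asymmetric.
```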
Person-Months vs Calendar Months
Effort in person-months does not translate directly into calendar months: 20 person-months could be 1 person for 20 months, 20 people for 1 month, etc.
Effort-to-Duration Conversion
Cost models map estimated effort E (person-months) to project duration T (calendar months) via formulas of the form T = k × E^d, e.g. T = 2.5 × E^0.35.
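Using the T = 2.5 × E^0.35 mapping above, a small sketch shows how effort spreads over calendar time:

```python
def duration_months(effort_pm: float, k: float = 2.5, d: float = 0.35) -> float:
    """Calendar duration T = k * E^d, constants from the card above."""
    return k * effort_pm ** d

def average_staff(effort_pm: float) -> float:
    """Average team size implied by the effort/duration pair."""
    return effort_pm / duration_months(effort_pm)
```

For example, 20 person-months maps to about 7.1 calendar months with an average staff of about 2.8 people, not 1 month with 20 people.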
Reducing Project Duration
Reducing project duration requires adding more people, but leads to lower individual productivity: added people increase communication overhead, lose productivity while ramping up, and divert existing team members' time.
Brooks's Law
Adding manpower to a late software project makes it later, due to communication overhead and onboarding delays.
Individual Productivity Under Communication Overhead
Average individual productivity: L(P) = L − s·(P − 1)·c, where L = initial productivity, s = communication productivity loss, P = number of people, c = communication density.
Total Team Productivity
L_total = P × (L − s·(P − 1)·c). As team size increases, individual productivity decreases; beyond the optimal team size, where communication loss outweighs the extra hands, total team productivity drops as well.
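The productivity formulas above can be used to locate the optimal team size; the constants (L = 500 LOC/month, s = 30, c = 1.0) are illustrative assumptions, not values from the cards:

```python
# Individual and team productivity under communication overhead,
# per L(P) = L - s*(P-1)*c. Constants are illustrative assumptions.
def individual_productivity(P: int, L: float = 500, s: float = 30,
                            c: float = 1.0) -> float:
    return L - s * (P - 1) * c

def team_productivity(P: int) -> float:
    return P * individual_productivity(P)

# Sweep team sizes to find the optimum before overhead dominates.
best = max(range(1, 17), key=team_productivity)
print(best, team_productivity(best))
```

With these constants total output peaks at nine people; adding a tenth makes the whole team slower, which is the quantitative core of Brooks's Law.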
Schedule Compression
The empirical limit on schedule compression is about 75% of the nominal project schedule; attempting to compress further enters the "impossible region". Compressing the schedule by X% increases cost by approximately X%.
Agile Cost Estimation
Upfront full-project cost estimation is avoided; instead, cost is estimated iteratively using story points and team velocity.
Story Point Baseline
Pick a baseline feature (e.g., 5 story points), estimate other features relative to it, and maintain consistent point calibration.
Planning Poker
Team members independently estimate story points, reveal estimates simultaneously, discuss discrepancies, and re-estimate until consensus.
Story Points and Velocity
Story points are mapped to effort via estimated velocity (story points per iteration); story points are not recalibrated mid-project.
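A minimal sketch of mapping story points to schedule via velocity (the numbers in the note below are illustrative):

```python
import math

def iterations_needed(total_story_points: float, velocity: float) -> int:
    """Iterations needed to burn down the backlog at the estimated
    velocity (story points completed per iteration)."""
    return math.ceil(total_story_points / velocity)
```

For example, a 120-point backlog at a velocity of 20 points per iteration needs 6 iterations; with 3-week iterations that is about 18 calendar weeks.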
State of Practice
Projects often lack good data collection, leading to political or guess-based estimates; all cost models require calibration and ongoing reassessment.