Cost Behavior and Estimation: Fixed vs Variable; High-Low and Regression Methods

Fixed costs vs. variable costs

  • Management accounting converts data from the accounting system into a format that is more useful for cost analysis
  • Fixed costs: same total amount regardless of activity level, within the period
    • Examples: rent paid each month; salary for a salaried employee
  • Variable costs: change with activity level (costs vary with output or hours worked)
    • Example: wages for hourly employees (more hours → higher pay, fewer hours → lower pay)
  • Other examples:
    • Subscriptions (e.g., Netflix) are typically fixed costs because the charge is the same each period
  • In practice, many costs are mixed (contain both fixed and variable components)
    • Example: renting a truck with a fixed fee plus a variable cost per mile (or per hour): C = F + v \times x
    • Intercept (F) is the fixed portion; slope (v) is the variable portion
  • Why categorize costs this way?
    • Helps with forecasting and budgeting
    • Lets us predict how total cost will change with activity level based on fixed and variable portions
  • Note from lecture: most costs in reality are mixed, not purely fixed or purely variable
    • This is why we estimate both fixed and variable components rather than labeling every cost categorically

Mixed costs and a practical example

  • Truck rental example mentioned in class:
    • Fixed portion: $400 (intercept)
    • Variable portion: $0.20 per unit (e.g., per mile or per hour)
    • Graphically: the intercept represents the fixed cost, and the slope represents the variable cost per unit
  • General form: C = F + v \times x where
    • C = total cost (y)
    • F = fixed cost (intercept)
    • v = variable cost per unit (slope)
    • x = activity level (e.g., miles, hours)
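
The general form above can be sketched in a few lines of Python. The fixed fee ($400) and rate ($0.20 per unit) come from the truck rental example; the function name `total_cost` and the mileages in the loop are illustrative assumptions, not part of the lecture.

```python
def total_cost(fixed, variable_per_unit, activity):
    """Mixed-cost function C = F + v * x."""
    return fixed + variable_per_unit * activity


# Truck rental from the notes: F = $400, v = $0.20 per unit (here, per mile).
TRUCK_FIXED = 400.00
TRUCK_RATE = 0.20

# Illustrative mileages (assumed, not from the lecture).
for miles in (0, 100, 500):
    cost = total_cost(TRUCK_FIXED, TRUCK_RATE, miles)
    print(f"{miles:>4} miles -> ${cost:,.2f}")
```

At zero miles the total equals the fixed fee, which is the intercept; each additional mile adds the slope.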

Why we estimate costs

  • Primary purpose: look at total cost and determine what portion is fixed vs. variable
  • Use past data to predict future costs under different activity levels
  • The two key implications:
    • If we know the fixed and variable portions, we can forecast costs for different levels of activity
    • Helps with budgeting, pricing, and profitability analysis

Three main methods to analyze/estimate costs

  • Scattergram (scatter plot)
    • Plot observed data points (cost vs. activity)
    • A visual check: does the data roughly fall on a straight line?
    • If the data does not look linear, a simple linear method may not be appropriate
    • Outliers can distort the line, so visually inspect points that lie far from the line
  • Regression (line of best fit)
    • A statistical method that uses all data points to fit a straight line
    • The slope gives the variable cost per unit; the intercept gives the fixed cost
    • Provides goodness-of-fit measures to assess prediction quality
    • Output typically includes: intercept (fixed cost F), slope (variable cost v), correlation metrics, t-stats, p-values
  • High-Low method (a simpler, quick estimation approach)
    • Uses only the highest and lowest activity levels
    • Steps:
      1) Identify the highest and lowest activity levels and their costs: (X_\text{high}, Y_\text{high}) and (X_\text{low}, Y_\text{low})
      2) Compute the variable cost per unit (slope):
         v = \frac{Y_\text{high} - Y_\text{low}}{X_\text{high} - X_\text{low}}
      3) Compute the fixed cost (intercept) using one of the points (either high or low):
         F = Y_i - v \times X_i \text{ for } i = \text{high or low}
      4) Construct the cost function: C = F + v \times x
    • Pros: simple and quick; Cons: uses only two data points and is sensitive to outliers
    • Regression is typically more accurate because it uses all data points
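
A minimal sketch of the two numerical methods, assuming plain Python lists of activity levels and costs. The function names `high_low` and `least_squares` and the five observations are hypothetical and are not the lecture's data; the regression here is ordinary least squares written out by hand rather than the spreadsheet output discussed in class.

```python
def high_low(hours, costs):
    """Two-point estimate of (F, v) from the highest- and lowest-activity observations."""
    hi = max(range(len(hours)), key=lambda i: hours[i])
    lo = min(range(len(hours)), key=lambda i: hours[i])
    if hours[hi] == hours[lo]:
        raise ValueError("high and low activity levels must differ")
    v = (costs[hi] - costs[lo]) / (hours[hi] - hours[lo])  # slope
    f = costs[hi] - v * hours[hi]  # intercept; the low point gives the same F
    return f, v


def least_squares(hours, costs):
    """Ordinary least-squares line through all observations; returns (F, v)."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
    v = sxy / sxx
    f = mean_y - v * mean_x
    return f, v


# Hypothetical monthly observations (assumed for illustration only).
hours = [100, 200, 300, 400, 500]
costs = [1500, 2100, 2400, 3100, 3400]

f_hl, v_hl = high_low(hours, costs)
f_ls, v_ls = least_squares(hours, costs)
print(f"high-low:      C = {f_hl:,.2f} + {v_hl:.2f} * x")
print(f"least squares: C = {f_ls:,.2f} + {v_ls:.2f} * x")
```

On this made-up data the two methods give slightly different lines, because high-low looks only at the 100-hour and 500-hour months while least squares weighs all five.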

Example data from the lecture (computer repair shop)

  • Setup:
    • Y = total overhead cost (in dollars), measured monthly
    • X = number of repair hours (nonfinancial unit; used as the activity level)
  • Data description (as presented):
    • Two noted points used for high-low estimation: highest ~568 hours, lowest ~200 hours
    • Costs: at 568 hours, Y ≈ 12,083; at 200 hours, Y ≈ 9,054
  • High-Low calculation (as described in the lecture):
    • Variable cost per hour (slope) calculation:
      v = \frac{Y_\text{high} - Y_\text{low}}{X_\text{high} - X_\text{low}} = \frac{12{,}083 - 9{,}054}{568 - 200} = \frac{3{,}029}{368} \approx 8.23
    • The speaker stated the variable cost as v = 10.40 per hour, which is inconsistent with the two points quoted above (they give about 8.23 per hour); this appears to be an error or a mismatch in the transcript
    • Intercept (fixed cost) calculation (using one point), with v \approx 8.23:
      • If using the high point: F = Y_\text{high} - v \times X_\text{high} = 12{,}083 - 8.23 \times 568 \approx 7{,}408
      • If using the low point: F = Y_\text{low} - v \times X_\text{low} = 9{,}054 - 8.23 \times 200 \approx 7{,}408
    • Both points give the same fixed cost, as they must when v is computed from those same two points; with the stated v = 10.40 they would disagree (about 6,176 from the high point and 6,974 from the low point)
  • Resulting high-low cost function for these two points: C \approx 7{,}408 + 8.23 \times x
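
One way to check the high-low arithmetic is to plug the two quoted points, (568 hours, 12,083) and (200 hours, 9,054), straight into the formulas. The sketch below assumes those approximate values are the exact inputs; the variable names are illustrative.

```python
# The two points quoted in the notes (approximate values as transcribed).
x_high, y_high = 568, 12_083
x_low, y_low = 200, 9_054

# Step 2: variable cost per repair hour (slope).
v = (y_high - y_low) / (x_high - x_low)

# Step 3: fixed cost (intercept), computed from each point in turn.
f_from_high = y_high - v * x_high
f_from_low = y_low - v * x_low

print(f"v = {v:.2f} per hour")
print(f"F from high point = {f_from_high:,.2f}")
print(f"F from low point  = {f_from_low:,.2f}")
```

Because the slope is derived from these same two points, both intercept calculations agree; a disagreement between them signals that the slope was taken from somewhere else.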

Regression approach (detailed interpretation from the lecture)

  • Regression output (illustrative example from the lecture):
    • Cost function (predicted overhead): C = 64.72 + 12.52 \times x where
    • F = 64.72 (intercept, fixed cost)
    • v = 12.52 (slope, variable cost per repair hour)
  • Interpretation of regression outputs:
    • Intercept (F) represents the fixed cost component per period
    • Slope (v) represents the variable cost per additional unit of activity (here, per repair hour)
    • Multiple R (correlation) ≈ 0.91, indicating a strong linear relationship between hours and cost
    • R^2 ≈ 0.83, meaning about 83% of the variation in cost is explained by the number of repair hours
    • t-statistics and p-values assess the statistical significance of the estimated coefficients; the lecture notes mention a very small p-value (on the order of 10^{-4} or smaller), indicating high significance of the estimates
  • Predictions using the regression model:
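    • Example: for x = 300 repair hours, the equation as transcribed gives
      C(300) = 64.72 + 12.52 \times 300 = 64.72 + 3{,}756 = 3{,}820.72
    • The lecturer also stated a predicted value of $10,228 for 300 hours, which is inconsistent with the equation as transcribed; this appears to be an error or a mismatch in the transcript. An intercept of 6,472 would reproduce the stated figure (6,472 + 3,756 = 10,228) and would match the scale of the overhead data (roughly 9,000 to 12,000), so the intercept may have been 6,472 rather than 64.72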
  • Important caveat about extrapolation:
    • The data used for the regression covered up to 568 hours
    • Predicting costs for 600 hours would be outside the observed range, so the estimate may be unreliable
    • Always check the range of the data before extrapolating with the model
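
The prediction and the extrapolation check can be sketched together. This uses the coefficients exactly as transcribed (64.72 and 12.52) and assumes the regression data span the same 200 to 568 hours quoted for the high-low example; the names `predict`, `X_MIN`, and `X_MAX` are illustrative. The last lines back out which intercept would be needed to reproduce the lecturer's stated $10,228 at 300 hours.

```python
INTERCEPT = 64.72  # fixed cost as transcribed in the notes
SLOPE = 12.52      # variable cost per repair hour as transcribed

# Assumed observed range: the lowest and highest hours quoted in the notes.
X_MIN, X_MAX = 200, 568


def predict(hours):
    """Return (predicted cost, True if hours lies inside the observed range)."""
    in_range = X_MIN <= hours <= X_MAX
    return INTERCEPT + SLOPE * hours, in_range


for hours in (300, 600):
    cost, in_range = predict(hours)
    note = "within observed range" if in_range else "extrapolation, treat with caution"
    print(f"{hours} hours -> {cost:,.2f} ({note})")

# Intercept implied by the lecturer's stated prediction of 10,228 at 300 hours.
implied_intercept = 10_228 - SLOPE * 300
print(f"implied intercept = {implied_intercept:,.2f}")
```

Returning the range flag alongside the prediction keeps the caveat attached to the number instead of leaving it to the reader to remember.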

Linear vs. non-linear data and model choice

  • Scatter plot helps visually assess whether a linear model is appropriate
  • If the scatter plot does not resemble a straight line, a non-linear model or a different approach may be necessary
  • The lecturer notes that, in some cases, non-linear forms can be used (and Excel can handle such models) if the data support it

Outliers and data quality

  • Outliers: observations that lie far from the line of best fit
    • Example described: a data point at 400 repair hours with a very different cost than other 400-hour observations
    • Outliers may indicate errors in data entry or unusual events in that month
    • Decision point: determine whether the data point is representative; if it is not, consider excluding it, because it can distort the estimated cost function
  • When focusing on high-low estimation, the method is particularly sensitive to extremes/outliers
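
A rough way to screen for outliers numerically is to fit a line and look for points whose residual is much larger than is typical. The sketch below is an assumption-heavy illustration: the data are hypothetical (a clean line plus one unusual 400-hour month), and the cutoff of three times the median absolute residual is an arbitrary rule of thumb, not a method from the lecture.

```python
import statistics


def least_squares(hours, costs):
    """Ordinary least-squares line through all observations; returns (F, v)."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
    v = sxy / sxx
    return mean_y - v * mean_x, v


def flag_outliers(hours, costs, multiple=3.0):
    """Return the (hours, cost) pairs whose residual exceeds multiple x the median absolute residual."""
    f, v = least_squares(hours, costs)
    residuals = [y - (f + v * x) for x, y in zip(hours, costs)]
    typical = statistics.median(abs(r) for r in residuals)
    return [
        (x, y)
        for x, y, r in zip(hours, costs, residuals)
        if abs(r) > multiple * typical
    ]


# Hypothetical data: two months at 400 hours, one of them with an unusual cost.
hours = [200, 300, 400, 400, 500, 600]
costs = [2000, 2500, 3000, 4500, 3500, 4000]

print(flag_outliers(hours, costs))
```

A flagged point is only a candidate for review; whether to exclude it still depends on whether that month was representative.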

Practical implications and takeaways

  • Use cost behavior to forecast costs under different activity levels
  • Regression provides a more reliable estimate than high-low because it uses all data points and provides measures of fit
  • High-low offers a quick check or starting point for estimation but should be used with caution
  • Always examine data for linearity and outliers before choosing a method
  • Be mindful of extrapolation risks; predictions beyond the observed data range are less reliable
  • Treat estimates as best guesses based on historical data and the chosen model; real-world factors can cause deviations

Key takeaways summarized

  • Costs can be categorized as fixed, variable, or mixed; mixed costs combine a fixed base with a variable component
  • The general cost equation is C = F + v \times x where
    • F is the fixed cost
    • v is the variable cost per unit of activity
    • x is the level of activity
  • High-Low method (two-point estimate) steps: identify high/low activity, compute v = \frac{Y_\text{high} - Y_\text{low}}{X_\text{high} - X_\text{low}}, compute F = Y_i - v \times X_i for either point, and form C = F + v \times x
  • Regression (best-fit line) uses all data points to estimate the cost function; interpret intercept and slope, assess fit with R, R^2, t-stats, and p-values; beware extrapolation beyond the data range
  • Real-world examples include rent (fixed), hourly wages (variable), and subscriptions (fixed)
  • Outliers require careful handling, as they can distort both high-low and regression results