Chapter 10, Part B: Simple Linear Regression

Using the Estimated Regression Equation

  • The estimated regression equation can be used for estimation and prediction.
  • Key calculations involve:
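    • Confidence interval estimate of $E(y_p)$: $\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$
    • Prediction interval estimate of $y_p$: $\hat{y}_p \pm t_{\alpha/2}\, s_{\text{ind}}$
    • Here $s_{\hat{y}_p}$ is the estimated standard deviation of $\hat{y}_p$, $s_{\text{ind}}$ is the estimated standard deviation of an individual value of $y_p$, the confidence coefficient is $1 - \alpha$, and $t_{\alpha/2}$ is based on a t distribution with $n - 2$ degrees of freedom.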
Point Estimation
  • Example: If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
    $\hat{y} = 10 + 5(3) = 25 \text{ cars}$
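The point estimate is a direct substitution into the estimated regression equation. A minimal sketch, using the coefficients from the example (the function name `predict_cars` is only illustrative):

```python
# Coefficients of the estimated regression equation from the example.
B0 = 10.0  # intercept b0
B1 = 5.0   # slope b1


def predict_cars(ads: float) -> float:
    """Point estimate of the number of cars sold for a given number of TV ads."""
    return B0 + B1 * ads


print(predict_cars(3))  # 25.0
```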
Confidence Interval for $E(y_p)$
  • Estimate of the standard deviation of $\hat{y}_p$: $s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
  • Example: The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
    $25 \pm 3.1824(1.4491)$
    $25 \pm 4.61$
    $20.39 \text{ to } 29.61 \text{ cars}$
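The interval above can be reproduced from raw data. The sketch below is hedged: the five (ads, cars) pairs are an assumption, since the data are not listed in this section; they were chosen because they give $b_0 = 10$, $b_1 = 5$, and a standard error for $\hat{y}_p$ of about 1.4491. The critical value 3.1824 is taken from the example rather than computed.

```python
import math

# ASSUMPTION: illustrative (ads, cars) pairs, not listed in this section,
# chosen because they reproduce b0 = 10, b1 = 5 and SSE = 14.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
T_CRIT = 3.1824  # t_{0.025} with n - 2 = 3 degrees of freedom, from the example

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n
sxx = sum((x - x_bar) ** 2 for x in ads)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
b1 = sxy / sxx            # slope
b0 = y_bar - b1 * x_bar   # intercept

# Standard error of the estimate: s = sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
s = math.sqrt(sse / (n - 2))


def mean_confidence_interval(x_p):
    """Confidence interval for the mean response E(y_p) at x = x_p."""
    y_hat = b0 + b1 * x_p
    se_fit = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / sxx)
    margin = T_CRIT * se_fit
    return y_hat - margin, y_hat + margin


low, high = mean_confidence_interval(3)
print(f"{low:.2f} to {high:.2f} cars")  # 20.39 to 29.61 cars
```

The standard error is smallest when `x_p` equals the mean of x, so the interval is narrowest there and widens as `x_p` moves away from it.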
Prediction Interval for $y_p$
  • Estimate of the standard deviation of an individual value of $y_p$: $s_{\text{ind}} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
  • Example: The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
    $25 \pm 3.1824(2.6013)$
    $25 \pm 8.28$
    $16.72 \text{ to } 33.28 \text{ cars}$
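A prediction interval adds the variance of an individual observation to the variance of the fitted mean, which is why it is wider than the confidence interval. The sketch below works from summary statistics only. The mean square error (14/3) and n = 5 agree with the ANOVA table in the computer output; the values of `X_BAR` and `SXX` are assumptions not stated in this section, chosen so that the result matches the 2.6013 used above.

```python
import math

MSE = 14 / 3     # SSE / (n - 2), matching the ANOVA table
N = 5            # number of observations (Total DF + 1)
X_BAR = 2.0      # ASSUMPTION: mean number of ads in the sample
SXX = 4.0        # ASSUMPTION: sum of squared deviations of x about its mean
T_CRIT = 3.1824  # t_{0.025} with 3 degrees of freedom, from the example


def prediction_interval(x_p, b0=10.0, b1=5.0):
    """Prediction interval for one new observation y_p at x = x_p."""
    y_hat = b0 + b1 * x_p
    # Variance of the fitted mean at x_p.
    var_fit = MSE * (1 / N + (x_p - X_BAR) ** 2 / SXX)
    # An individual value also carries the error variance itself.
    s_ind = math.sqrt(MSE + var_fit)
    margin = T_CRIT * s_ind
    return y_hat - margin, y_hat + margin


low, high = prediction_interval(3)
print(f"{low:.2f} to {high:.2f} cars")  # 16.72 to 33.28 cars
```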

Computer Solution

  • Statistical software (e.g., Minitab) can be used to perform regression analysis.
  • The independent variable was named "Ads" and the dependent variable was named "Cars" in the example.
  • Performing the regression analysis computations without the help of a computer can be quite time-consuming.
Minitab Output
  • Minitab prints the standard error of the estimate, s, as well as information about the goodness of fit.

  • For each of the coefficients $b_0$ and $b_1$, the output shows its value, standard deviation, t value, and p-value.

  • Minitab prints the estimated regression equation (e.g., Cars = 10.0 + 5.00 Ads).

  • The standard ANOVA table is printed.

  • Also provided are the 95% confidence interval estimate of the expected number of cars sold and the 95% prediction interval estimate of the number of cars sold for an individual weekend with 3 ads.

  • Regression equation example from Minitab:

    The regression equation is
    Cars = 10 + 5.00 Ads
    
    Predictor      Coef   SE Coef      T      p
    Constant     10.000     2.366   4.23   0.024
    Ads          5.0000    1.0801   4.63   0.019
    
    S = 2.2      R-sq = 87.7%     R-sq(adj) = 83.6%
    
    Analysis of Variance
    
    SOURCE           DF      SS      MS      F      p
    Regression        1     100     100   21.43   0.019
    Residual Error    3      14   4.667
    Total             4     114
    
    Predicted Values for New Observations
    
    New Obs      Fit   SE Fit       95% C.I.           95% P.I.
          1   25.00     1.45  (20.39, 29.61)  (16.72, 33.28)
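Most of the figures in this output can be recomputed by hand or with a few lines of code. The following is a sketch using only the Python standard library, not a substitute for Minitab. The data are an assumption (they are not listed in this section) chosen to reproduce the coefficients and sums of squares shown, and p-values are omitted because the standard library has no t or F distribution functions.

```python
import math

# ASSUMPTION: illustrative data, not listed in this section; it reproduces
# the coefficients and sums of squares in the output above.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x plus the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    sst = sum((yi - y_bar) ** 2 for yi in y)                        # total
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))   # residual error
    ssr = sst - sse                                                 # regression
    mse = sse / (n - 2)
    s = math.sqrt(mse)
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    se_b1 = s / math.sqrt(sxx)
    r_sq = ssr / sst
    return {
        "b0": b0, "b1": b1,
        "se_b0": se_b0, "se_b1": se_b1,
        "t_b0": b0 / se_b0, "t_b1": b1 / se_b1,
        "s": s, "ssr": ssr, "sse": sse, "sst": sst,
        "mse": mse, "f": ssr / mse,
        "r_sq": r_sq,
        "r_sq_adj": 1 - (1 - r_sq) * (n - 1) / (n - 2),
    }


fit = simple_regression(ads, cars)
print(f"Cars = {fit['b0']:.1f} + {fit['b1']:.2f} Ads")
print(f"SE Coef: {fit['se_b0']:.3f}, {fit['se_b1']:.4f}")
print(f"T: {fit['t_b0']:.2f}, {fit['t_b1']:.2f}   F: {fit['f']:.2f}")
print(f"S = {fit['s']:.3f}  R-sq = {fit['r_sq']:.1%}  R-sq(adj) = {fit['r_sq_adj']:.1%}")
```

With one independent variable, the F statistic equals the square of the t statistic for the slope, which is why both tests report the same p-value.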
    

Residual Analysis

  • Much of the residual analysis is based on an examination of graphical plots.
  • Residual for observation i: $y_i - \hat{y}_i$, the observed value minus the value estimated by the regression equation.
  • The residuals provide the best information about $\epsilon$.
  • If the assumptions about the error term $\epsilon$ appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid.
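Residuals are simply observed values minus fitted values. A short sketch, again using assumed data (not listed in this section) together with the estimated equation from the example:

```python
# ASSUMPTION: illustrative data, not listed in this section, consistent
# with the estimated equation Cars = 10 + 5 Ads.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
B0, B1 = 10.0, 5.0  # coefficients from the example

# Residual for observation i: observed y_i minus fitted y_hat_i.
residuals = [y - (B0 + B1 * x) for x, y in zip(ads, cars)]
print(residuals)  # [-1.0, -1.0, -2.0, 2.0, 2.0]

# With a least-squares fit that includes an intercept, residuals sum to zero,
# and their squared sum is SSE.
print(sum(residuals))                  # 0.0
print(sum(r ** 2 for r in residuals))  # 14.0
```

Pairing each residual with its x value gives the points for a residual plot against x.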
Residual Plot Against x
  • If the assumption that the variance of $\epsilon$ is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then:

    • The residual plot should give an overall impression of a horizontal band of points.
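  • Good Pattern Example: the residuals fall in a horizontal band centered on zero.

  • Nonconstant Variance Example: the spread of the residuals widens or narrows as x increases.

  • Model Form Not Adequate Example: the residuals follow a curved pattern rather than a horizontal band.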