Study Notes on the Least-Squares Regression Line

Section 11.2: The Least-Squares Regression Line

OBJECTIVES

  • Objective 1: Compute the least-squares regression line.

  • Objective 2: Compute the correlation coefficient.

  • Objective 3: Use the least-squares regression line to make predictions.

  • Objective 4: Interpret predicted values, the slope, and the y-intercept of the least-squares regression line.

OBJECTIVE 1: COMPUTE THE LEAST-SQUARES REGRESSION LINE

  • A sample of house data is provided: size in square feet and selling price in thousands of dollars.

  • Previous analysis revealed a strong positive linear association between the size and sales price of houses.

  • Data Table:

      Size (sq ft)    Selling Price ($1000s)
      2521            400
      2555            426
      2735            428
      2846            435
      3028            469
      3049            475
      3198            488
      3198            455

LEAST-SQUARES REGRESSION LINE

  • The least-squares regression line minimizes the sum of the squared vertical distances between the data points and the line itself.

  • Best-Fitting Line:

    • The best-fitting line is the one for which the sum of these squared vertical distances is smallest.

  • Equation of the Least-Squares Regression Line:

    • The general form for predicting y from x is:

    • $y = a + bx$

      • where:

      • $a$ = y-intercept

      • $b$ = slope.

  • Variable Definitions:

    • The variable we want to predict (selling price) is known as the outcome variable or dependent variable.

    • The variable we are given (size) is known as the explanatory variable or independent variable.
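The notes give the defining property of the line but not a way to obtain a and b by hand. The following is a minimal sketch, assuming the standard closed-form formulas (slope = sum of cross-deviations divided by sum of squared x-deviations, intercept = mean of y minus slope times mean of x), which are not stated in the notes. The function name `least_squares_line` is illustrative only.

```python
# House data from the table: size (sq ft) and selling price ($1000s).
sizes = [2521, 2555, 2735, 2846, 3028, 3049, 3198, 3198]
prices = [400, 426, 428, 435, 469, 475, 488, 455]


def least_squares_line(x, y):
    """Return (a, b) for the line y = a + b*x that minimizes the sum of
    squared vertical distances. The closed-form formulas used here are an
    assumption; the notes do not state them."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b


a, b = least_squares_line(sizes, prices)
print(f"y = {a:.4f} + {b:.4f}x")
```

Rounded to four decimal places, the result should agree with the equation used in the prediction examples of these notes.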

CALCULATOR USAGE TO COMPUTE REGRESSION LINE

  • A one-time setting on the calculator is required to display the correlation coefficient.

  • TI-84 Plus Calculator Steps:

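    • To configure: Press 2nd, then 0 to open the CATALOG, scroll to DiagnosticOn, and press ENTER twice.

    • After this setting, enter the sizes in list L1 and the selling prices in list L2, then run LinReg(a+bx) from the STAT CALC menu to display a, b, and the correlation coefficient r for the house data.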

OBJECTIVE 2: USE THE LEAST-SQUARES REGRESSION LINE TO MAKE PREDICTIONS

  • Predicted Value: Predictions can be made by substituting the explanatory variable's value into the regression equation.

  • Example Calculation:

    • Given the equation $y = 160.1939 + 0.0992x$ for predicting selling price (in thousands of dollars) from size (in square feet):

    • Predict the selling price of a house of size 2800 sq ft:

    • Calculation: $y = 160.1939 + 0.0992 \times 2800 = 437.9539$, or about 438.0 thousand dollars.

  • Point of Averages:

    • Average size of houses: $\bar{x} = 2891.25$ sq ft.

    • Average selling price: $\bar{y} = 447.0$ thousand dollars.

    • The least-squares regression line always passes through the point of averages $(\bar{x}, \bar{y})$: substituting the average size into the regression equation gives the average selling price, $160.1939 + 0.0992 \times 2891.25 \approx 447.0$.
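The substitution step can be sketched in a few lines of code. This is an illustration only, using the rounded coefficients from the notes; the function name `predict_price` is hypothetical.

```python
# Rounded coefficients taken from the notes.
A = 160.1939  # y-intercept
B = 0.0992    # slope, in $1000s per square foot


def predict_price(size_sqft):
    """Predicted selling price in $1000s for a house of the given size."""
    return A + B * size_sqft


# Prediction for a 2800 sq ft house.
print(round(predict_price(2800), 4))     # 437.9539

# Point of averages: the mean size maps to the mean selling price.
print(round(predict_price(2891.25), 1))  # 447.0
```

Because the coefficients are rounded, the point-of-averages check matches 447.0 only after rounding.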

OBJECTIVE 3: INTERPRET PREDICTED VALUES, THE SLOPE, AND THE y-INTERCEPT

  • Predicted Values: Estimates of the average outcome for given values of the explanatory variable.

    • Example: With the equation $y = 160.1939 + 0.0992x$, estimate the average price of houses of size 3000 sq ft:

    • Substitute: $y = 160.1939 + 0.0992 \times 3000 = 457.7939$, or about 457.8 thousand dollars.

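  • Interpreting the y-Intercept (a):

    • The y-intercept is the point where the line crosses the y-axis, that is, the predicted value when $x = 0$.

    • If the data include both positive and negative x-values, the y-intercept is interpreted as the estimated outcome when the explanatory variable equals 0.

      • If the x-values are all positive or all negative, the y-intercept has no useful practical interpretation. The house sizes are all positive, so the intercept 160.1939 is not interpreted.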
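The rule for the y-intercept can be expressed as a simple check on the x-values. The sketch below is illustrative: the helper name `intercept_is_interpretable` is hypothetical, and it encodes only the rule stated in these notes (the data must include both positive and negative x-values).

```python
# House sizes (sq ft) from the data table.
sizes = [2521, 2555, 2735, 2846, 3028, 3049, 3198, 3198]


def intercept_is_interpretable(x_values):
    """True when the x-values include both negative and positive values,
    which is the condition the notes give for interpreting the intercept."""
    return min(x_values) < 0 < max(x_values)


# All house sizes are positive, so the intercept is not interpreted.
print(intercept_is_interpretable(sizes))        # False

# A made-up data set that includes both signs, for contrast.
print(intercept_is_interpretable([-2, -1, 3]))  # True
```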

INTERPRETING THE SLOPE (b)

  • The slope is the amount by which the predicted value of the outcome variable changes for a one-unit change in the explanatory variable x:

    • If the values of the explanatory variable differ by 1, the predicted values differ by the slope, $b$.

    • If the values of the explanatory variable differ by an amount $d$, the predicted values differ by $b \times d$.

  • Example Comparison:

    • Consider two houses, one of 1900 sq ft and another of 1750 sq ft. Their sizes differ by $d = 150$ sq ft, so their predicted selling prices differ by $0.0992 \times 150 = 14.88$ thousand dollars.
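The slope rule needs only the difference in x, not the intercept. A minimal sketch using the slope from the notes; the function name `predicted_difference` is hypothetical.

```python
B = 0.0992  # slope from the notes, in $1000s per square foot


def predicted_difference(x1, x2, slope=B):
    """Difference in predicted outcomes for two x-values: slope * (x1 - x2).
    The y-intercept cancels, so it does not appear."""
    return slope * (x1 - x2)


diff = predicted_difference(1900, 1750)
print(f"{diff:.2f} thousand dollars")  # 14.88 thousand dollars
```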

CHECK YOUR UNDERSTANDING - STUDENT PERFORMANCE EXAMPLE

  • Students taking a final exam reported the number of hours they studied. The least-squares regression line for predicting exam score $y$ from hours studied $x$ is:

    • $y = 50 + 5x$.

    • Predict Antoine's score if he studies for 6 hours:

    • Substitute: $y = 50 + 5 \times 6 = 80$.

    • Effect of studying more hours: If Emma studied 3 hours longer than Jeremy, her predicted score is higher than his by the slope times the difference in hours, $5 \times 3 = 15$ points.
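Both parts of this exercise can be checked with a short sketch. The function name `predict_score` and the sample values for Jeremy's hours are hypothetical; the point is that the predicted difference does not depend on how long Jeremy studied.

```python
def predict_score(hours):
    """Predicted exam score from the line y = 50 + 5x."""
    return 50 + 5 * hours


# Antoine studies for 6 hours.
antoine = predict_score(6)
print(antoine)  # 80

# Emma studies 3 hours longer than Jeremy. Whatever Jeremy's hours are
# (the values below are made up), the predicted gap is slope * 3.
gaps = [predict_score(h + 3) - predict_score(h) for h in (1, 4, 10)]
print(gaps)  # [15, 15, 15]
```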

KEY CONCEPTS TO REMEMBER

  • Definitions of outcome (response) and explanatory (predictor) variables.

  • Calculation of least-squares regression line.

  • Application of regression line for predictions.

  • Interpretation of predicted values, y-intercepts, and slopes in regression analysis.