Chapter 14: Least Squares Method

Introduction to the Least Squares Method

  • The Least Squares Method is a statistical technique used for estimating the parameters of a regression line that best fits a set of sample data points.

  • It specifically aims to determine the Y-intercept (b₀) and slope (b₁) of the regression line ŷ = b₀ + b₁x.

Objective of the Method

  • The main goal of the least squares method is to minimize the sum of squared vertical distances between the observed data points (response values) and the predicted response values on the regression line.

  • This can be expressed mathematically as minimizing:
      $S = \text{sum of squared distances} = \sum (\text{observed response} - \text{predicted response})^2$
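
As a minimal sketch of this criterion, the Python snippet below evaluates S for two candidate lines on a small data set. The data values and the function name `sum_squared_residuals` are invented for illustration and do not come from the chapter.

```python
def sum_squared_residuals(xs, ys, b0, b1):
    """Return S = sum((observed - predicted)^2) for the line y = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# S for the least squares line of this data (b0 = 2.2, b1 = 0.6)
s_fit = sum_squared_residuals(xs, ys, 2.2, 0.6)

# S for an arbitrary alternative line; it should be larger
s_other = sum_squared_residuals(xs, ys, 2.0, 0.5)

print(s_fit, s_other)
```

Any line other than the least squares line gives a larger S on the same data, which is what "best fit" means under this criterion.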

Key Components for Calculating the Regression Line

  • To find the sample regression line, several key equations and summations are required:
      - Sample Mean of x:
        $\bar{x} = \frac{1}{n} \sum x_i$
      - Sample Mean of y:
        $\bar{y} = \frac{1}{n} \sum y_i$

  • The following quantities need to be calculated:
      - Total Sum of Squares for x (SSxx):
        $SS_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$
      - Total Sum of Squares for y (SSyy):
        $SS_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}$
      - Sum of Cross Products (SSxy):
        $SS_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}$
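
The means and sums of squares above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the function name `summary_quantities`, the dictionary keys, and the sample data are all assumptions made for the example.

```python
def summary_quantities(xs, ys):
    """Compute the sample means and the sums SSxx, SSyy, SSxy."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    return {
        "x_bar": sum_x / n,
        "y_bar": sum_y / n,
        # SSxx = sum(x_i^2) - (sum(x_i))^2 / n
        "ss_xx": sum(x * x for x in xs) - sum_x ** 2 / n,
        # SSyy = sum(y_i^2) - (sum(y_i))^2 / n
        "ss_yy": sum(y * y for y in ys) - sum_y ** 2 / n,
        # SSxy = sum(x_i * y_i) - (sum(x_i) * sum(y_i)) / n
        "ss_xy": sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n,
    }


# Hypothetical sample data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

q = summary_quantities(xs, ys)
print(q)
```

For this data the sums are Σx = 15, Σy = 20, Σx² = 55, Σy² = 86, and Σxy = 66, so SSxx = 55 − 225/5 = 10, SSyy = 86 − 400/5 = 6, and SSxy = 66 − 300/5 = 6.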

Formulas for the Regression Line Parameters

  • Y-Intercept (b₀):
      The intercept of the regression line is calculated from the slope b₁ and the two sample means, so the slope must be computed first:
      $b_0 = \bar{y} - b_1 \bar{x}$

  • Slope (b₁):
      The slope of the regression line can be determined as follows:
      $b_1 = \frac{SS_{xy}}{SS_{xx}}$
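
The two formulas can be combined into a short Python sketch that computes the slope first and then the intercept. The function name `fit_line`, the error handling, and the sample data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")

    sum_x = sum(xs)
    sum_y = sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    if ss_xx == 0:
        # All x values are identical, so the slope is undefined.
        raise ValueError("SSxx is zero; cannot fit a slope")

    b1 = ss_xy / ss_xx                 # slope
    b0 = sum_y / n - b1 * sum_x / n    # intercept: y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(b0, b1)
```

With this data, SSxy = 6 and SSxx = 10, so b₁ = 0.6 and b₀ = 4 − 0.6 × 3 = 2.2.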

Covariance Calculation

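  • Covariance is a measure of how two variables change together. It can be calculated as follows:
      $Cov = \frac{\sum (y_i - \bar{y})(x_i - \bar{x})}{n}$
      The numerator is equal to SSxy. Dividing by n gives the population form of the covariance; the sample covariance divides by n − 1 instead.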
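
A small Python sketch of the covariance calculation follows. It divides by n by default, as in the formula above. The optional `ddof` parameter is an assumption added for this example so that the n − 1 (sample) divisor can also be shown; the function name and data are likewise hypothetical.

```python
def covariance(xs, ys, ddof=0):
    """Covariance of xs and ys, dividing by (n - ddof).

    ddof=0 divides by n; ddof=1 divides by n - 1 (sample covariance).
    """
    n = len(xs)
    if n != len(ys) or n - ddof <= 0:
        raise ValueError("inputs must match in length and exceed ddof")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations from the means (equals SSxy)
    cross = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    return cross / (n - ddof)


# Hypothetical sample data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_n = covariance(xs, ys)            # divisor n
cov_n1 = covariance(xs, ys, ddof=1)   # divisor n - 1
print(cov_n, cov_n1)
```

For this data the sum of cross deviations is 6, so the result is 6/5 = 1.2 with divisor n and 6/4 = 1.5 with divisor n − 1.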

Summary Charts and Visuals

  • When the regression line is plotted, the data points appear as a scatter plot and the fitted line is drawn through them, showing the best fit according to the least squares criterion.

Applications

  • The least squares method is widely used in various fields including economics, engineering, biology, and social sciences, wherever one seeks to establish a relationship between two quantitative variables.