Notes on Least Squares and Systems of Linear Equations

The Method of Least Squares

  • Purpose: To determine a straight line that best fits a set of data points (x, y) when the points are scattered about a line.
  • Application context: Used to describe and predict the relationship between two variables; for example, predicting store sales from time or other predictors.
  • Given n data points: Pᵢ(xᵢ, yᵢ), i = 1, 2, …, n.
  • The regression line (least-squares line) is the linear function
    y = f(x) = mx + b
    where the constants m (slope) and b (intercept) minimize the sum of squared residuals (the vertical distances from the data points to the line).

The normal equations for the least-squares line

  • Define the sums:
    • S_x = \sum_{i=1}^n x_i
    • S_y = \sum_{i=1}^n y_i
    • S_{xx} = \sum_{i=1}^n x_i^2
    • S_{xy} = \sum_{i=1}^n x_i y_i
    • n = number of data points.
  • The least-squares line satisfies the two normal equations:
    \begin{cases}
    S_y = m S_x + n b, \\
    S_{xy} = m S_{xx} + b S_x.
    \end{cases}
  • In matrix form, this is the system
    \begin{pmatrix} S_{xx} & S_x \\ S_x & n \end{pmatrix}
    \begin{pmatrix} m \\ b \end{pmatrix} =
    \begin{pmatrix} S_{xy} \\ S_y \end{pmatrix}.
  • Note: The sums are computed from the given data; solving for m and b yields the regression line y = mx + b.
  • Interpretation: The normal equations arise from minimizing the sum of squared residuals
    \min_{m,b} \sum_{i=1}^n (y_i - (m x_i + b))^2
    with respect to m and b.
  • Quick behavior: As data points tighten around a single straight line, the least-squares line aligns with the best linear trend through the scatter diagram.
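
The normal equations above can be solved directly for m and b with Cramer's rule on the 2x2 system. A minimal sketch in Python (the function name and the illustrative data set are our own, not from the notes):

```python
def least_squares_line(points):
    """Fit y = m*x + b to (x, y) pairs via the normal equations."""
    n = len(points)
    s_x = sum(x for x, _ in points)
    s_y = sum(y for _, y in points)
    s_xx = sum(x * x for x, _ in points)
    s_xy = sum(x * y for x, y in points)
    # Normal equations in matrix form:
    #   [S_xx  S_x] [m]   [S_xy]
    #   [S_x   n  ] [b] = [S_y ]
    # Solve by Cramer's rule (2x2 determinants).
    det = s_xx * n - s_x * s_x
    m = (s_xy * n - s_x * s_y) / det
    b = (s_xx * s_y - s_x * s_xy) / det
    return m, b

# Points lying exactly on y = 2x + 1 recover m = 2, b = 1.
print(least_squares_line([(0, 1), (1, 3), (2, 5)]))  # (2.0, 1.0)
```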

Example 1

  • Data points (illustrative from transcript): five data points lead to a regression line
    y = -0.95x + 10.35.
  • Reported slope and intercept:
    • m = -0.95, b = 10.35.
  • This yields the least-squares line:
    y = -0.95x + 10.35.

Example 2

  • Data points (as given):
    (1, 8), (2, 6), (5, 6), (7, 4), (10, 1).
  • Regression line (from transcript):
    y = -0.685x + 8.426.
  • Parameters:
    • m = -0.685, b = 8.426.
  • Regression line:
    y = -0.685x + 8.426.
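
The values reported for Example 2 can be checked by computing the four sums from the data and solving the normal equations (a sketch; variable names are our own):

```python
# Recompute Example 2: sums, then Cramer's rule on the normal equations.
points = [(1, 8), (2, 6), (5, 6), (7, 4), (10, 1)]
n = len(points)
s_x = sum(x for x, _ in points)       # 25
s_y = sum(y for _, y in points)       # 25
s_xx = sum(x * x for x, _ in points)  # 179
s_xy = sum(x * y for x, y in points)  # 88

det = s_xx * n - s_x * s_x            # 270
m = (s_xy * n - s_x * s_y) / det      # -185/270
b = (s_xx * s_y - s_x * s_xy) / det   # 2275/270
print(round(m, 3), round(b, 3))       # -0.685 8.426
```

Rounded to three decimals, this reproduces m = -0.685 and b = 8.426 as stated.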

Summary of the method

  • Steps to compute the least-squares line for a data set:
    1. Compute the sums: S_x, S_y, S_{xx}, S_{xy}, and n.
    2. Solve the normal equations for (m, b):
      \begin{cases}
      S_y = m S_x + n b, \\
      S_{xy} = m S_{xx} + b S_x.
      \end{cases}
    3. Form the regression line: y = mx + b.
  • Visualization: Plot the scatter diagram of the data and graph the least-squares line to assess fit.
  • Important note: The least-squares line minimizes vertical distances; it may not be optimal for other error metrics or non-linear relationships.

2.1 Systems of linear equations and introduction

  • L₁ and L₂ intersecting at exactly one point imply the system has one unique solution.
    • Graphically: two non-parallel lines intersecting at a single point.
  • L₁ and L₂ parallel and coincident (the same line) imply infinitely many solutions.
    • Graphically: the lines are on top of each other; every point on the line is a solution.
  • L₁ and L₂ parallel and distinct imply no solution (inconsistent system).
    • Graphically: parallel lines never meet.

Classification of a 2x2 system

  • Determine the number of solutions:
    • a) One and only one solution → unique solution.
    • b) Infinitely many solutions → the two equations represent the same line (parallel and coincident).
    • c) No solution → the two equations represent distinct parallel lines.
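
The three cases can be decided algebraically for a system a1·x + b1·y = c1, a2·x + b2·y = c2: a nonzero determinant a1·b2 − a2·b1 means the lines intersect once, while a zero determinant means parallel lines that either coincide (proportional equations) or not. A hedged sketch (function name is ours; assumes the system is not degenerate, i.e. each equation has at least one nonzero coefficient):

```python
def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "unique solution"  # non-parallel lines, one intersection
    # Slopes agree; lines coincide iff the equations are proportional
    # (cross-products avoid division by zero).
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many solutions"
    return "no solution"

print(classify_2x2(2, -4, -10, 3, 2, 1))    # unique solution
print(classify_2x2(5, -6, 8, 10, -12, 16))  # infinitely many solutions
print(classify_2x2(5, -6, 8, 10, -12, 10))  # no solution
```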

Substitution example (illustrative from transcript)

  • System 1:
    \begin{cases}
    2x - 4y = -10, \\
    3x + 2y = 1
    \end{cases}
  • Solve by substitution:
    • From the first equation: 2x - 4y = -10 \Rightarrow x - 2y = -5 \Rightarrow x = 2y - 5.
    • Substitute into the second equation:
      3(2y - 5) + 2y = 1
      6y - 15 + 2y = 1
      8y = 16
      y = 2.
    • Then x = 2(2) - 5 = -1.
  • Solution:
    (x, y) = (-1, 2).
  • Check: plug back into both equations to verify.
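
The check in the last step can be done mechanically: substitute (x, y) = (-1, 2) back into both original equations (a minimal sketch):

```python
x, y = -1, 2
# Both left-hand sides must reproduce the right-hand sides.
print(2 * x - 4 * y == -10)  # True
print(3 * x + 2 * y == 1)    # True
```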

Additional system illustrating infinite solutions

  • System 2 (demonstrates a dependent system, where one equation is a multiple of the other):
    \begin{cases}
    5x - 6y = 8, \\
    10x - 12y = 16
    \end{cases}
  • Observation: The second equation is exactly two times the first: multiplying 5x - 6y = 8 through by 2 gives 10x - 12y = 16.
  • Conclusion: Infinitely many solutions (the two equations represent the same line).
  • Note on method: When two equations are proportional (with consistent constants), the system is dependent and has infinitely many solutions.

Quick recap of key terms

  • Scatter diagram: visual plot of data points to inspect possible linear relationship.
  • Regression line: the line that best fits the data in the sense of least squares.
  • Normal equations: the two equations derived from minimizing the sum of squared residuals, used to find m and b.
  • Sums for calculations: S_x, S_y, S_{xx}, S_{xy} as defined above.
  • Solutions of linear systems: unique, infinite (dependent), or none (inconsistent), depending on whether the lines intersect, coincide, or are parallel but distinct.