Notes on Least Squares and Systems of Linear Equations

The Method of Least Squares

  • Goal: find a straight line y = mx + b that best fits a set of data points (xi, yi) by minimizing the total squared residuals.
  • Data setup: n data points Pi with coordinates (xi, yi).
  • Practical use: the resulting trend line can be used to forecast or predict future values (e.g., sales trends).
  • Visual aid: plot a scatter diagram of the data and overlay the least-squares regression line.
  • General form of the regression line:
    y = f(x) = mx + b
  • What makes the line “best” in the least-squares sense: the residuals ei = yi - (m xi + b) are as small as possible in the sum of squares, i.e., m and b minimize
    S = \sum_{i=1}^n \bigl(y_i - (m x_i + b)\bigr)^2.

Normal equations (the conditions for the least-squares line)

  • The constants m and b satisfy the normal equations obtained by setting the partial derivatives of S with respect to m and b to zero:

    \begin{cases}
    \displaystyle \sum_{i=1}^n y_i = m\sum_{i=1}^n x_i + b\,n, \\
    \displaystyle \sum_{i=1}^n x_i y_i = m\sum_{i=1}^n x_i^2 + b\sum_{i=1}^n x_i.
    \end{cases}
  • These two equations can be solved simultaneously for m and b.
  • A compact way to write them using sums:
    \sum y_i = m\sum x_i + b\,n, \quad \sum x_i y_i = m\sum x_i^2 + b\sum x_i.
  • Brief derivation sketch:
    • Define residuals ei = yi - (m x_i + b).
    • Minimize S = \sum e_i^2 with respect to m and b.
    • Set the partial derivatives ∂S/∂m = 0 and ∂S/∂b = 0 to obtain the normal equations above.
  • Practical steps to compute:
    • Compute the sums: Σxi, Σyi, Σxi^2, Σxi y_i, and n.
    • Solve the 2×2 linear system given by the normal equations for m and b.
  • Example results from transcript:
    • Example 1 (data: P1(1,8), P2(2,6), P3(5,6), P4(7,4), P5(10,1)) yields the regression line
      Y = -0.685x + 8.426.
    • Here, m ≈ -0.685 and b ≈ 8.426.
    • Example 2 (another dataset) yields the line
      Y = -0.95x + 10.35.
    • Here, m = -0.95 and b = 10.35.
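The practical steps above (compute the sums, then solve the 2×2 normal equations) can be sketched in pure Python; the function name is illustrative, and the result is checked against the Example 1 data from the transcript:

```python
# Least-squares fit via the normal equations (no external libraries).

def least_squares_line(points):
    """Return (m, b) for y = mx + b minimizing the sum of squared residuals."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)
    # Solve the 2x2 normal equations:
    #   sum_y  = m*sum_x  + b*n
    #   sum_xy = m*sum_x2 + b*sum_x
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b

# Example 1 data: P1(1,8), P2(2,6), P3(5,6), P4(7,4), P5(10,1)
data = [(1, 8), (2, 6), (5, 6), (7, 4), (10, 1)]
m, b = least_squares_line(data)
print(round(m, 3), round(b, 3))  # -0.685 8.426
```

This reproduces the transcript's Example 1 line Y = -0.685x + 8.426.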

Worked outline for applying the method

  • Step 1: collect data points (xi, yi).
  • Step 2: compute the required sums: \sum x_i, \sum y_i, \sum x_i^2, \sum x_i y_i, and n.
  • Step 3: plug sums into the normal equations and solve for m and b.
  • Step 4: plot the data and draw the regression line y = mx + b.
  • Step 5: use the model to predict future values and assess fit (e.g., via residuals, R^2, or other metrics).
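The fit assessment in Step 5 can be sketched as follows; `r_squared` is an illustrative helper (not from the transcript), applied here to the Example 1 fit:

```python
# Step 5 sketch: assess a fit y = mx + b via residuals and R^2.

def r_squared(points, m, b):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in points)  # residual sum of squares
    ss_tot = sum((y - y_mean) ** 2 for _, y in points)       # total sum of squares
    return 1 - ss_res / ss_tot

data = [(1, 8), (2, 6), (5, 6), (7, 4), (10, 1)]
m = -185 / 270        # exact slope for Example 1 (rounds to -0.685)
b = 5 - 5 * m         # exact intercept (rounds to 8.426)
print(round(r_squared(data, m, b), 3))  # 0.905
```

An R^2 near 1 indicates the line explains most of the variation in y; here about 90% of it.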

Connections to broader principles

  • The least-squares solution is the orthogonal projection of the data vector y onto the column space of the design matrix X = [ [1, x1], [1, x2], …, [1, x_n] ].
  • In matrix form, with that column ordering, the normal equations can be written as (X^T X) [b, m]^T = X^T y.
  • The method assumes:
    • A linear relationship between x and y (linear in parameters m and b).
    • Errors that are roughly normally distributed with constant variance (homoscedastic), needed for inference.
    • No or manageable influence from outliers (outliers can distort the fit).
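A minimal sketch of the matrix form: with X = [[1, x_1], ..., [1, x_n]], the entries of X^T X and X^T y reduce to the familiar sums, and the resulting 2×2 system can be solved with Cramer's rule (the function name is illustrative):

```python
# Normal equations in matrix form: (X^T X) beta = X^T y, beta = [b, m].

def normal_equations_fit(xs, ys):
    n = len(xs)
    # With X = [[1, x_i]]: X^T X = [[n, sum_x], [sum_x, sum_x2]],
    #                      X^T y = [sum_y, sum_xy]
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_x2 - sum_x * sum_x  # det(X^T X)
    # Cramer's rule on the 2x2 system
    b = (sum_y * sum_x2 - sum_x * sum_xy) / det
    m = (n * sum_xy - sum_x * sum_y) / det
    return b, m

b, m = normal_equations_fit([1, 2, 5, 7, 10], [8, 6, 6, 4, 1])
print(round(m, 3), round(b, 3))  # -0.685 8.426
```

The matrix route gives the same coefficients as the sum-based normal equations, as expected.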

Systems of Linear Equations: Overview

  • A system consists of two or more linear equations in two variables (x, y): L1: a1 x + b1 y = c1, L2: a2 x + b2 y = c2, etc.
  • Key question: how many solutions does the system have? (one, infinite, or none)
  • Terminology:
    • Unique solution: the two lines L1 and L2 intersect at exactly one point.
    • Infinitely many solutions (dependent system): the equations represent the same line (coincident lines).
    • No solution (inconsistent system): the equations represent distinct parallel lines.
  • Terminology about consistency:
    • A system is consistent if it has at least one solution (one or infinitely many).
    • A system is inconsistent if it has no solution.
  • Quick classification criterion (two-equation case):
    • If the determinant Δ = a1 b2 − a2 b1 ≠ 0, the system has a unique solution.
    • If Δ = 0, check for dependence or inconsistency by comparing ratios (a1:a2, b1:b2, c1:c2).
    • If (a2, b2, c2) is a scalar multiple of (a1, b1, c1), there are infinitely many solutions.
    • If (a2, b2, c2) is not a scalar multiple of (a1, b1, c1), there is no solution.
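The classification criterion above can be sketched as a small function (the name `classify_system` is illustrative); cross-multiplication avoids dividing by zero when comparing the ratios:

```python
# Classify a 2x2 system  a1 x + b1 y = c1,  a2 x + b2 y = c2.

def classify_system(a1, b1, c1, a2, b2, c2):
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "unique solution"
    # det == 0: left-hand sides are proportional; check whether the full
    # triples (a, b, c) are proportional too (cross-multiply, no division).
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many solutions"
    return "no solution"

print(classify_system(2, -4, -10, 3, 2, 1))    # unique solution
print(classify_system(5, -6, 8, 10, -12, 16))  # infinitely many solutions
print(classify_system(5, -6, 8, 10, -12, 17))  # no solution
```

The three calls correspond to intersecting, coincident, and parallel-distinct lines respectively; the first two match the worked examples below.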

Worked Examples (from transcript)

Example 1: Unique solution

  • System:

    \begin{cases}
    2x - 4y = -10, \\
    3x + 2y = 1.
    \end{cases}
  • Solve by substitution:
    • From the first equation: x = 2y - 5.
    • Substitute into the second: 3(2y - 5) + 2y = 1.
    • Compute: 6y - 15 + 2y = 1 \Rightarrow 8y = 16 \Rightarrow y = 2.
    • Then x = 2(2) - 5 = -1.
  • Solution:
    (x, y) = (-1, 2).
  • Verification:
    • 2(-1) - 4(2) = -2 - 8 = -10 ✔
    • 3(-1) + 2(2) = -3 + 4 = 1 ✔

Example 2: Infinitely many solutions

  • System:

    \begin{cases}
    5x - 6y = 8, \\
    10x - 12y = 16.
    \end{cases}
  • Observation: The second equation is exactly 2 times the first equation (2 × (5x - 6y) = 10x - 12y and 2 × 8 = 16).
  • Conclusion: The two equations are dependent and represent the same line; there are infinitely many solutions.
  • Description of the solution set: all points (x, y) that satisfy 5x - 6y = 8; e.g., one can parametrize quickly: x = (8 + 6y)/5 for any y.
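The parametrization can be checked numerically; a small sketch (the helper name is illustrative) that generates points on 5x - 6y = 8 and confirms each one satisfies both equations of the system:

```python
# Enumerate solutions of the dependent system from Example 2.
# Every point on 5x - 6y = 8 also satisfies 10x - 12y = 16.

def solutions_of_5x_minus_6y_eq_8(ys):
    """For each chosen y, return the (x, y) pair with x = (8 + 6y) / 5."""
    return [((8 + 6 * y) / 5, y) for y in ys]

for x, y in solutions_of_5x_minus_6y_eq_8([0, 1, 2, -3]):
    # Each generated point satisfies both equations of the system.
    assert abs(5 * x - 6 * y - 8) < 1e-9
    assert abs(10 * x - 12 * y - 16) < 1e-9
    print(f"({x}, {y})")
```

Any value of y yields a valid solution, illustrating why the solution set is infinite.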

Quick Reference: How to classify a 2×2 linear system

  • Given system:
    \begin{cases}
    a_1 x + b_1 y = c_1, \\
    a_2 x + b_2 y = c_2.
    \end{cases}
  • Compute determinant: \Delta = a_1 b_2 - a_2 b_1.
  • If \Delta \neq 0: unique solution.
  • If \Delta = 0: check for dependence vs inconsistency:
    • If there exists a scalar k with (a2, b2, c2) = k(a1, b1, c1): infinitely many solutions (coincident lines).
    • If no such k exists (the triples are not proportional): no solution (parallel distinct lines).

Connections, implications, and practical notes

  • Least-squares regression provides the best linear fit in the sense of minimizing squared errors under the model assumptions.
  • Practical considerations:
    • Data quality, outliers, and extrapolation beyond the data range can affect accuracy and predictive power.
    • The normal equations give a stable method when data are well-conditioned; ill-conditioned data (e.g., x_i very close in value) can lead to numerical instability.
  • Real-world relevance:
    • Used for forecasting sales trends, demand planning, and many other predictive analytics tasks.
    • In linear systems, understanding the number of solutions helps in modeling feasibility and consistency with observed data.

Key formulas (quick reference in LaTeX)

  • Least-squares line:
    y=mx+by = mx + b
  • Normal equations:
    \sum_{i=1}^n y_i = m\sum_{i=1}^n x_i + b\,n, \qquad \sum_{i=1}^n x_i y_i = m\sum_{i=1}^n x_i^2 + b\sum_{i=1}^n x_i.
  • Determinant criterion (2×2 system):
    \Delta = a_1 b_2 - a_2 b_1.
  • Solution existence:
    • If \Delta \neq 0, unique solution.
    • If \Delta = 0 and ratios match, infinite solutions; if not, no solution.

Notes

  • The transcript provides concrete example lines (e.g., Y = -0.685x + 8.426 and Y = -0.95x + 10.35) illustrating the least-squares fits.
  • The substitution method shown in the examples demonstrates a straightforward approach to solving linear systems by expressing one variable in terms of the other and back-substituting.
  • Ethical/practical caveats: regression results should be interpreted in light of data quality, context, and the risk of overreliance on extrapolation.