Math Foundations: Matrices and Linear Equations

Course Overview

Modules:
- Matrices and Solving System of Linear Equations
- Vectors, Distance and Angle between vectors, Linear Independence and Dependence
- Matrix Decompositions
- Vector Calculus
- Gradient Descent Algorithms, Challenges in optimization in ML
- Optimization in Machine Learning (PCA, SVM)

Why Study Vectors and Matrices?

Vectors are fundamental in machine learning (ML).
Used throughout the entire workflow, from data representation to model training and prediction.

Uses of Vectors in ML:

Data Representation:
- Feature representation: A house can be described by features like (House number, No of Bed rooms, Area in square feet) = $(133, 4, 2200)$
- Text as Vectors: Texts are represented as vectors in ML. Techniques like Word2Vec and BERT embeddings convert words/text into vectors.
Distance and Similarity:
- Distance between vectors is defined using inner product (dot product).
- Used to study the similarity of data.
Linear dependency and Data Redundancy:
- Study of linear independent and linear dependent vectors.
- If two vectors are linearly dependent, one data is redundant of the other.

Linear Algebra

Linear algebra is the study of vectors and rules to manipulate vectors.
Vectors are not only geometric vectors (points in 2D/3D space) but any objects that can be added together and multiplied by scalar values to produce another object of the same kind.
- Example: Polynomials can also be treated as vectors.
Deals with vectors in the space $R^n$

Systems of Linear Equations

Form a central part of linear algebra.
Many problems can be formulated as systems of linear equations.
Tools of linear algebra can be used to solve such problems.
Uses analytical techniques and numerical methods to solve the systems of linear equations.

Motivating Problem for Studying Linear System of Equations

A company produces products $N1, N2, … Nn$ for which resources $R1, R2, … Rm$ are required.
To produce a unit of product $Ni$ , $a{ij}$ units of resource $R_j$ are needed, where $1 ≤ i ≤ n, 1 ≤ j ≤ m$ .
Find an optimal production plan where $xi$ units of product $Ni$ are produced if a total $bj$ units of resource $Rj$ are available, and no resources are left over.

Formulation

If we produce $x1, x2, … xn$ units of the products $N1, N2 … Nn$ , we need a total of $a{1i}x1 + a{2i}x2 + … + a{ni}xn$ units of resource $R_j$ .
Equation: $a{1i}x1 + a{2i}x2 + … + a{ni}xn = b_i$
Set up the following set of linear equations in n unknowns, $x1, x2 … x_n$ .
- $a{11}x1 + a{21}x2 + · · · + a{n1}xn = b_1$
- $\vdots$
- $a{1m}x1 + a{2m}x2 + · · · + a{nm}xn = b_m$
A linear system has zero, one, or infinitely many solutions.

System of Linear Equations

A linear system of m equations in n unknowns $x1, …, xn$ is a set of equations of the form:
- $a{11}x1 + a{12}x2 + … + a{1n}xn = b_1$
- $a{21}x1 + a{22}x2 + … + a{2n}xn = b_2$
- $\vdots$
- $a{m1}x1 + a{m2}x2 + … + a{mn}xn = b_m$
The system is called linear because each variable $x_i$ appears in the first power only.
$a{11}, …, a{mn}$ are given numbers, called the coefficients of the system.
$b1, …, bm$ on the right are also given numbers.
If all the $b_j$ are zero, then the system is called a homogeneous system.
If at least one $b_j$ is not zero, then the system is called a non-homogeneous system.

Augmented Matrix

Matrix Form of the Linear System:
- Assume that coefficients $a_{ij}$ are not all zero, so that A is not a zero matrix.
- $x$ has n components, whereas $b$ has m components.
The augmented matrix of the system is:
- $\tilde{A} = \begin{bmatrix} a{11} & a{12} & … & a{1n} & b1 \ a{21} & a{22} & … & a{2n} & b2 \ \vdots & \vdots & \ddots & \vdots & \vdots \ a{m1} & a{m2} & … & a{mn} & bm \end{bmatrix}$
- The dashed vertical line is a reminder that the last column of $\tilde{A}$ came from vector $b$ .
- Matrix $A$ is augmented.

Solving System of Equations

Eliminate variables and solve.
Add a multiple of one equation to the other.
Consider the matrix form of the system of transformed equations for which solutions could be obtained essentially by inspection.
Preserve solutions of the original system of equations.
Elementary transformations performed on the coefficient matrix should also be performed on the constant vector.
Perform all elementary row transformations on the augmented matrix of the linear system.

Solving System of Linear Equations with Two Variables and Their Geometrical Interpretation

Unique Solution:
- System:
  - $x + y = 5$
  - $2x - y = 4$
- Rewrite:
  - $y = -x + 5$
  - $y = 2x - 4$
- Lines have slopes -1 and 2, and y-intercepts 5 and -4.
- They intersect at a point, the solution: $x = 3, y = 2$
No Solution:
- System:
  - $-2x + y = 3$
  - $-4x + 2y = 2$
- Rewrite:
  - $y = 2x + 3$
  - $y = 2x + 1$
- Lines have slope 2, and y-intercepts 3 and 1. They are parallel.
- No point of intersection. No solution.
Many Solutions:
- System:
  - $4x - 2y = 6$
  - $6x - 3y = 9$
- Each equation can be written as $y = 2x - 3$ .
- The graph of each equation is a line with slope 2 and y-intercept -3.
- Any point on the line is a solution. Many solutions.

Three Possibilities for System of Linear Equations

A system of linear equations has

no solution, or
exactly one solution, or
infinitely many solutions.
A system of linear equations is said to be consistent if it has either one solution or infinitely many solutions;
A system is inconsistent if it has no solution

Matrices

Definition: A matrix is a rectangular array of numbers. The numbers in the array are called the elements of the matrix.
Matrices are usually denoted by capital letters.
- Examples:
  - $A = \begin{bmatrix} 2 & 3 \ 7 & 5 \end{bmatrix}$
  - $B = \begin{bmatrix} 1 & 3 & 5 & 6 \ 0 & 5 & 8 & 3 \ 2 & 9 & 12 & 1 \end{bmatrix}$
  - $C = \begin{bmatrix} 7 \ -1 \ 5 \end{bmatrix}$
Rows and Columns: Matrices consist of rows and columns. Rows are labeled from the top of the matrix, columns from the left.
- Example:
  - $\begin{bmatrix} 2 & 3 & -4 \ 7 & 5 & -1 \end{bmatrix}$
  - Rows:
    - row 1: $\begin{bmatrix} 2 & 3 & -4 \end{bmatrix}$
    - row 2: $\begin{bmatrix} 7 & 5 & -1 \end{bmatrix}$
  - Columns:
    - column 1: $\begin{bmatrix} 2 \ 7 \end{bmatrix}$
    - column 2: $\begin{bmatrix} 3 \ 5 \end{bmatrix}$
    - column 3: $\begin{bmatrix} -4 \ -1 \end{bmatrix}$

How to Represent a System of Equations in Matrix Form

System of equations:
- $x1 + x2 + x_3 = 2$
- $2x1 + 3x2 + x_3 = 3$
- $x1 - x2 - 2x_3 = -6$
Matrix of coefficients: $\begin{bmatrix} 1 & 1 & 1 \ 2 & 3 & 1 \ 1 & -1 & -2 \end{bmatrix}$
Augmented matrix: $\begin{bmatrix} 1 & 1 & 1 & 2 \ 2 & 3 & 1 & 3 \ 1 & -1 & -2 & -6 \end{bmatrix}$
If $A = \begin{bmatrix} 1 & 1 & 1 \ 2 & 3 & 1 \ 1 & -1 & -2 \end{bmatrix}$ and $B = \begin{bmatrix} 2 \ 3 \ -6 \end{bmatrix}$ , then the above system of equations can be written in matrix form as $AX = B$ , where $X = \begin{bmatrix} x1 \ x2 \ x_3 \end{bmatrix}$

Example - Solving a System of Equations

Solve the system of equations:
- $x1 - 2x2 + x_3 = 0$
- $5x1 - 2x2 - 8x_3 = 8$
- $-5x_3 = 10$
Augmented Matrix:
$\begin{bmatrix} 1 & -2 & 1 & 0 \ 5 & -2 & -8 & 8 \ 1 & 2 & -5 & 10 \end{bmatrix}$
Elementary row operations:
- $-5 \cdot [equation 1] + [equation 2] = [new equation 2]$
- $\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 8 & -13 & 8 \ 1 & 2 & -5 & 10 \end{bmatrix}$
- $-1 \cdot [equation 1] + [equation 3] = [new equation 3]$
- $\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 8 & -13 & 8 \ 0 & 4 & -6 & 10 \end{bmatrix}$
- $\vdots$
- $-4 \cdot [equation 3] +[equation 2] = [new equation 2]$
- $\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 0 & 11 & -32 \ 0 & 4 & -6 & 10 \end{bmatrix}$
- $\vdots$
- $-1 \cdot [equation 3] +[equation 1] = [new equation 1]$
  - $\begin{bmatrix} 1 & -2 & 0 & 1 \ 0 & 1 & 0 & -1 \ 0 & 0 & 1 & -2 \end{bmatrix}$
- $\begin{bmatrix} 0 & 1 & 0 & -1 \ 0 & 0 & 1 & -2 \ 0 & 0 & 0 & 0 \end{bmatrix}$

Echelon Form and Reduced Row Echelon Form

A rectangular matrix is in echelon form (or row echelon form) if it has the following three properties:

All nonzero rows are above any rows of all zeros.
Each leading entry of a row is in a column to the right of the leading entry of the row above it.
All entries in a column below a leading entry are zeros.

If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form (or reduced row echelon form):

The leading entry in each nonzero row is 1.
Each leading 1 is the only nonzero entry in its column.

Elementary Row Operations

Interchange two rows of a matrix.
Multiply the elements of a row by a nonzero constant.
Add a multiple of the elements of one row to the corresponding elements of another row.

Systems of equations that are related through elementary transformations are called equivalent systems. Matrices that are related through elementary row operations are called row equivalent matrices. The symbol $\sim$ is used to indicate equivalence in both cases.

Gauss-Jordan Elimination Method - Example

Solve the system of equations:

$x1 - 2x2 + 4x_3 = 12$
$-2x1 - x2 + 5x_3 = 18$
$-x1 + 3x2 - 3x_3 = -8$

Solution:

Start with the augmented matrix and use the first row to create zeros in the first column.

This corresponds to using the first equation to eliminate $x_1$ from the second and third equations.

$\begin{bmatrix} 1 & -2 & 4 & 12 \ -2 & -1 & 5 & 18 \ -1 & 3 & -3 & -8 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 12 \ 0 & -5 & 13 & 42 \ 0 & 1 & 1 & 4 \end{bmatrix}$

Next multiply row 2 by $-\frac{1}{5}$ to make the $(2, 2)$ element 1. (This corresponds to making the coefficient of $x_2$ in the second equation 1.)

$\sim \begin{bmatrix} 1 & -2 & 4 & 12 \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 1 & 1 & 4 \end{bmatrix}$

Create zeros in the second column as follows. (This corresponds to using the second equation to eliminate $x_2$ from the first and third equations.)

$\sim \begin{bmatrix} 1 & 0 & -\frac{6}{5} & -\frac{24}{5} \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 0 & \frac{18}{5} & \frac{62}{5} \end{bmatrix}$

Multiply row 3 by $\frac{5}{18}$ . (This corresponds to making the coefficient of $x_3$ in the third equation 1.)

$\sim \begin{bmatrix} 1 & 0 & -\frac{6}{5} & -\frac{24}{5} \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 0 & 1 & \frac{31}{9} \end{bmatrix}$

Finally, create zeros in the third column. (This corresponds to using the third equation to eliminate $x_3$ from the first and second equations.)

$\sim \begin{bmatrix} 1 & 0 & 0 & -\frac{2}{3} \ 0 & 1 & 0 & -\frac{1}{9} \ 0 & 0 & 1 & \frac{31}{9} \end{bmatrix}$

This matrix corresponds to the system

$x_1 = -\frac{2}{3}$
$x_2 = -\frac{1}{9}$
$x_3 = \frac{31}{9}$

The solution is $x1 = -\frac{2}{3}, x2 = -\frac{1}{9}, x_3 = \frac{31}{9}$ .

System of Equations with No Solution

Solve the system of equations if possible:

$x1 + x2 + 5x_3 = 3$
$x2 + 3x3 = -1$
$x1 + 2x2 + 8x_3 = 3$

Solution:

Starting with the augmented matrix we get

$\begin{bmatrix} 1 & 1 & 5 & 3 \ 0 & 1 & 3 & -1 \ 1 & 2 & 8 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 5 & 3 \ 0 & 1 & 3 & -1 \ 0 & 1 & 3 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 2 & 4 \ 0 & 1 & 3 & -1 \ 0 & 0 & 0 & 1 \end{bmatrix}$

The last row of this reduced echelon form gives the equation

$0x1 + 0x2 + 0x_3 = 1$

This equation cannot be satisfied for any values of $x1, x2$ , and $x_3$ . Thus the system has no solution. (This information was in fact available from the next-to-last matrix.)

System of Equations with Infinite Number of Solutions

Note that in the final matrix which is reduced echelon form 2nd and 3rd column do not have pivot elements. Those columns correspond to the independent variables. That is $x2$ and $x3$ are independent variables. Hence the solutions are $x4 = 2, x2 = r, x3 = s$ and $x1 = -2 -2x2 + x3$ where r and s are any real numbers.

Homogenous System

Solve

$x1 - 2x2 + 4x_3 = 0$
$2x1 - x2 + 5x_3 = 0$
$-x1 + 3x2 - 3x_3 = 0$

$\begin{bmatrix} 1 & -2 & 4 \ 2 & -1 & 5 \ -1 & 3 & -3 \end{bmatrix}$

$x1 = 0, x2 = 0$ and $x_3 = 0$ is the only solution.

Practice Problems

The augmented matrix of a linear system has been reduced by row operations to the form shown. In each case, continue the appropriate row operations and describe the solution set of the original system.

Practice Problems

Find an equation involving g, h, and k that makes this augmented matrix correspond to a consistent system:

$\begin{bmatrix} 0 & 2 & -4 & 7 & g \ 3 & 1 & -5 & 0 & h \ -2 & 4 & -9 & 5 & k \end{bmatrix}$

Do the three planes $x1 + 2x2 + x3 = 4, x2 - x3 = 1$ , and $x1 + 3x_2 = 0$ have at least one common point of intersection? Explain.

Gaussian Elimination Method

You don’t have to compute Reduced Echelon form to solve the system of linear equations. Instead it is enough to find the Echelon form. This method is called Gaussian Elimination method. HW: Follow this method for the previous examples to solve the system of linear equations.

One More Example:

Consider the following system where we seek all solutions for some $a \in R$ :

$-2x1 + 4x2 - 2x3 - x4 + 4x_5 = -3$
$4x1 - 8x2 + 3x3 - 3x4 + x_5 = 2$
$x1 - 2x2 + x3 - x4 + x_5 = 0$
$-x1 + 2x2 - 3x4 + 4x5 = a$

Let us consider the augmented matrix

$\begin{bmatrix} -2 & 4 & -2 & -1 & 4 & -3 \ 4 & -8 & 3 & -3 & 1 & 2 \ 1 & -2 & 1 & -1 & 1 & 0 \ -1 & 2 & 0 & -3 & 4 & a \end{bmatrix}$

$\sim \begin{bmatrix} 1 & -2 & 0 & 0 & 3 & 1 \ 0 & 0 & -1 & 1 & -3 & 2 \ 0 & 0 & 0 & -2 & 2 & -1-3 \ 0 & 0 & 0 & 0 & 0 & a+1 \end{bmatrix}$

Now multiply the second row by -1 and the third row by $\frac{-1}{3}$ to get the augmented matrix in its final form, known as the row-echelon form. These equations are equivalent to the original set of equations:

$x1 - 2x2 + x3 - x4 + x_5 = 0$
$x3 - x4 + 3x_5 = -2$
$x4 - 2x5 = 1$
$0 = a + 1$

The preceding set of equations cannot be solved when
$a \neq -1$ .
The last equation is consistent only for $a = -1$ i.e when rank of the coefficient matrix is same the rank of augmented matrix then the system has solution.

In REF of a matrix, the leading entry is called as the pivot element and columns containing pivot elements are called as pivot columns. The corresponding variables are called pivot variables.
The variables corresponding to non pivot columns are called as free variables.

Consider

$x1 - 2x2 + x3 - x4 + x_5 = 0$
$x3 - x4 + 3x_5 = -2$
$x4 - 2x5 = 1$

Let $x5 = t$ and $x2 = s$
Then $x4 = 1 + 2t$ Then, we get $x3 = -1 - t$ and $x_1 = 2 + 2s + 2t$
Thus, solution is $(2 + 2s + 2t, s, -1 - t, 1 + 2t, t), \forall t, s \in R$

The system is inconsistent if the rank of the coefficient matrix is less than the rank of augmented matrix.
The system is consistent if the rank of the coefficient matrix is same as the rank of augmented matrix.
For the consistent system, if there are no free variables, the solution is unique.
For the consistent system, if there are free variables, there are infinitely many solutions.

Gaussian Elimination and Gauss Jordan

Repeated use of elementary row operations done to convert a matrix to upper triangular form is called Gaussian elimination.
Consider the system of equations $Ax = b$ where A is a n × n matrix.
If A is invertible, it means that $A^{-1}$ exists such that $AA^{-1} = A^{-1}A = I_n$
In such a case the row-reduced echelon form of A is $I_n$ , i.e every column is a pivot column where the pivot is 1.
The process of converting A to $I_n$ that we have discussed above is called the Gauss-Jordan method.

Gaussian Elimination and Gauss Jordan

In Gaussian Elimination we use multiples of the first row to eliminate the entries in the first column below the first row.
Then we use multiples of the second row to eliminate entries in the second column below the second row and so on until we get an upper-triangular matrix.
This Gauss Jordan process is shown diagramatically in the next slide.
Then we take multiples of the last row to eliminate non-zero entries in the last column above the last entry, followed by multiples of the last but one row to eliminate non-zero entries in the last but one column and so on. This gives us a diagonal matrix.

Calculating Inverse

Can the Gauss Jordan procedure calculate the inverse of a matrix?
For example, let A be a n × n matrix whose inverse $A^{-1}$ exists. We would like to compute its inverse using Gauss Jordan procedure. Is this possible?
Yes we can compute the inverse in the following way: we simply set up n linear systems of the form $Ax = ei, 1 \leq i \leq n$ where $ei$ - is the i th canonical basis vector and find their solutions x. Each solution vector constitutes a column in $A^{-1}$ . Why is this true?

Find the Inverse of A if it Exists

$A = \begin{bmatrix} 1 & -1 & -2 \ 2 & -3 & -5 \ -1 & 3 & 5 \end{bmatrix}$

Solution

Applying the method of Gauss-Jordan elimination we get

$[A:I] = \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 2 & -3 & -5 & 0 & 1 & 0 \ -1 & 3 & 5 & 0 & 0 & 1 \end{bmatrix}$

$\sim \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 0 & -1 & -1 & -2 & 1 & 0 \ 0 & 2 & 3 & 1 & 0 & 1 \end{bmatrix}$

$\sim \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 0 & 1 & 1 & 2 & -1 & 0 \ 0 & 0 & 1 & -3 & 2 & 1 \end{bmatrix}$

$\sim \begin{bmatrix} 1 & 0 & 0 & 0 & 5 & -3 & -1 \ 0 & 1 & 0 & 0 & -1 & -3 & 2 & 1 \ 0 & 0 & 1 & 1 & -3 & 2 & 1 \end{bmatrix}$

Hence inverse of A = \begin{bmatrix} 0 & 5 & -3 & -1\ 0 & -1 & -3 & 2 & 1 \1 & -3 & 2 & 1 \end{bmatrix}

Find the Inverse of A if it Exists

Since the LHS 3x3 matrix can not be brought to $I_3$ Inverse of A does not exist.