Math Foundations: Matrices and Linear Equations
Course Overview
Modules:
Matrices and Solving System of Linear Equations
Vectors, Distance and Angle between vectors, Linear Independence and Dependence
Matrix Decompositions
Vector Calculus
Gradient Descent Algorithms, Challenges in optimization in ML
Optimization in Machine Learning (PCA, SVM)
Why Study Vectors and Matrices?
Vectors are fundamental in machine learning (ML).
Used throughout the entire workflow, from data representation to model training and prediction.
Uses of Vectors in ML:
Data Representation:
Feature representation: A house can be described by features like (House number, No of Bed rooms, Area in square feet) = (133, 4, 2200)
Text as Vectors: Texts are represented as vectors in ML. Techniques like Word2Vec and BERT embeddings convert words/text into vectors.
Distance and Similarity:
Distance between vectors is defined using inner product (dot product).
Used to study the similarity of data.
Linear dependency and Data Redundancy:
Study of linear independent and linear dependent vectors.
If two vectors are linearly dependent, one data is redundant of the other.
Linear Algebra
Linear algebra is the study of vectors and rules to manipulate vectors.
Vectors are not only geometric vectors (points in 2D/3D space) but any objects that can be added together and multiplied by scalar values to produce another object of the same kind.
Example: Polynomials can also be treated as vectors.
Deals with vectors in the space R^n
Systems of Linear Equations
Form a central part of linear algebra.
Many problems can be formulated as systems of linear equations.
Tools of linear algebra can be used to solve such problems.
Uses analytical techniques and numerical methods to solve the systems of linear equations.
Motivating Problem for Studying Linear System of Equations
A company produces products N1, N2, … Nn for which resources R1, R2, … Rm are required.
To produce a unit of product Ni, a{ij} units of resource R_j are needed, where 1 ≤ i ≤ n, 1 ≤ j ≤ m.
Find an optimal production plan where xi units of product Ni are produced if a total bj units of resource Rj are available, and no resources are left over.
Formulation
If we produce x1, x2, … xn units of the products N1, N2 … Nn, we need a total of a{1i}x1 + a{2i}x2 + … + a{ni}xn units of resource R_j.
Equation: a{1i}x1 + a{2i}x2 + … + a{ni}xn = b_i
Set up the following set of linear equations in n unknowns, x1, x2 … x_n.
a{11}x1 + a{21}x2 + · · · + a{n1}xn = b_1
\vdots
a{1m}x1 + a{2m}x2 + · · · + a{nm}xn = b_m
A linear system has zero, one, or infinitely many solutions.
System of Linear Equations
A linear system of m equations in n unknowns x1, …, xn is a set of equations of the form:
a{11}x1 + a{12}x2 + … + a{1n}xn = b_1
a{21}x1 + a{22}x2 + … + a{2n}xn = b_2
\vdots
a{m1}x1 + a{m2}x2 + … + a{mn}xn = b_m
The system is called linear because each variable x_i appears in the first power only.
a{11}, …, a{mn} are given numbers, called the coefficients of the system.
b1, …, bm on the right are also given numbers.
If all the b_j are zero, then the system is called a homogeneous system.
If at least one b_j is not zero, then the system is called a non-homogeneous system.
Augmented Matrix
Matrix Form of the Linear System:
Assume that coefficients a_{ij} are not all zero, so that A is not a zero matrix.
x has n components, whereas b has m components.
The augmented matrix of the system is:
\tilde{A} = \begin{bmatrix} a{11} & a{12} & … & a{1n} & b1 \ a{21} & a{22} & … & a{2n} & b2 \ \vdots & \vdots & \ddots & \vdots & \vdots \ a{m1} & a{m2} & … & a{mn} & bm \end{bmatrix}
The dashed vertical line is a reminder that the last column of \tilde{A} came from vector b.
Matrix A is augmented.
Solving System of Equations
Eliminate variables and solve.
Add a multiple of one equation to the other.
Consider the matrix form of the system of transformed equations for which solutions could be obtained essentially by inspection.
Preserve solutions of the original system of equations.
Elementary transformations performed on the coefficient matrix should also be performed on the constant vector.
Perform all elementary row transformations on the augmented matrix of the linear system.
Solving System of Linear Equations with Two Variables and Their Geometrical Interpretation
Unique Solution:
System:
x + y = 5
2x - y = 4
Rewrite:
y = -x + 5
y = 2x - 4
Lines have slopes -1 and 2, and y-intercepts 5 and -4.
They intersect at a point, the solution: x = 3, y = 2
No Solution:
System:
-2x + y = 3
-4x + 2y = 2
Rewrite:
y = 2x + 3
y = 2x + 1
Lines have slope 2, and y-intercepts 3 and 1. They are parallel.
No point of intersection. No solution.
Many Solutions:
System:
4x - 2y = 6
6x - 3y = 9
Each equation can be written as y = 2x - 3.
The graph of each equation is a line with slope 2 and y-intercept -3.
Any point on the line is a solution. Many solutions.
Three Possibilities for System of Linear Equations
A system of linear equations has
no solution, or
exactly one solution, or
infinitely many solutions.
A system of linear equations is said to be consistent if it has either one solution or infinitely many solutions;
A system is inconsistent if it has no solution
Matrices
Definition: A matrix is a rectangular array of numbers. The numbers in the array are called the elements of the matrix.
Matrices are usually denoted by capital letters.
Examples:
A = \begin{bmatrix} 2 & 3 \ 7 & 5 \end{bmatrix}
B = \begin{bmatrix} 1 & 3 & 5 & 6 \ 0 & 5 & 8 & 3 \ 2 & 9 & 12 & 1 \end{bmatrix}
C = \begin{bmatrix} 7 \ -1 \ 5 \end{bmatrix}
Rows and Columns: Matrices consist of rows and columns. Rows are labeled from the top of the matrix, columns from the left.
Example:
\begin{bmatrix} 2 & 3 & -4 \ 7 & 5 & -1 \end{bmatrix}
Rows:
row 1: \begin{bmatrix} 2 & 3 & -4 \end{bmatrix}
row 2: \begin{bmatrix} 7 & 5 & -1 \end{bmatrix}
Columns:
column 1: \begin{bmatrix} 2 \ 7 \end{bmatrix}
column 2: \begin{bmatrix} 3 \ 5 \end{bmatrix}
column 3: \begin{bmatrix} -4 \ -1 \end{bmatrix}
How to Represent a System of Equations in Matrix Form
System of equations:
x1 + x2 + x_3 = 2
2x1 + 3x2 + x_3 = 3
x1 - x2 - 2x_3 = -6
Matrix of coefficients: \begin{bmatrix} 1 & 1 & 1 \ 2 & 3 & 1 \ 1 & -1 & -2 \end{bmatrix}
Augmented matrix: \begin{bmatrix} 1 & 1 & 1 & 2 \ 2 & 3 & 1 & 3 \ 1 & -1 & -2 & -6 \end{bmatrix}
If A = \begin{bmatrix} 1 & 1 & 1 \ 2 & 3 & 1 \ 1 & -1 & -2 \end{bmatrix} and B = \begin{bmatrix} 2 \ 3 \ -6 \end{bmatrix}, then the above system of equations can be written in matrix form as AX = B, where X = \begin{bmatrix} x1 \ x2 \ x_3 \end{bmatrix}
Example - Solving a System of Equations
Solve the system of equations:
x1 - 2x2 + x_3 = 0
5x1 - 2x2 - 8x_3 = 8
-5x_3 = 10
Augmented Matrix:
\begin{bmatrix} 1 & -2 & 1 & 0 \ 5 & -2 & -8 & 8 \ 1 & 2 & -5 & 10 \end{bmatrix}Elementary row operations:
-5 \cdot [equation 1] + [equation 2] = [new equation 2]
\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 8 & -13 & 8 \ 1 & 2 & -5 & 10 \end{bmatrix}
-1 \cdot [equation 1] + [equation 3] = [new equation 3]
\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 8 & -13 & 8 \ 0 & 4 & -6 & 10 \end{bmatrix}
\vdots
-4 \cdot [equation 3] +[equation 2] = [new equation 2]
\begin{bmatrix} 1 & -2 & 1 & 0 \ 0 & 0 & 11 & -32 \ 0 & 4 & -6 & 10 \end{bmatrix}
\vdots
-1 \cdot [equation 3] +[equation 1] = [new equation 1]
\begin{bmatrix} 1 & -2 & 0 & 1 \ 0 & 1 & 0 & -1 \ 0 & 0 & 1 & -2 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 & -1 \ 0 & 0 & 1 & -2 \ 0 & 0 & 0 & 0 \end{bmatrix}
Echelon Form and Reduced Row Echelon Form
A rectangular matrix is in echelon form (or row echelon form) if it has the following three properties:
All nonzero rows are above any rows of all zeros.
Each leading entry of a row is in a column to the right of the leading entry of the row above it.
All entries in a column below a leading entry are zeros.
If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form (or reduced row echelon form):
The leading entry in each nonzero row is 1.
Each leading 1 is the only nonzero entry in its column.
Elementary Row Operations
Elementary Row Operations
Interchange two rows of a matrix.
Multiply the elements of a row by a nonzero constant.
Add a multiple of the elements of one row to the corresponding elements of another row.
Systems of equations that are related through elementary transformations are called equivalent systems. Matrices that are related through elementary row operations are called row equivalent matrices. The symbol \sim is used to indicate equivalence in both cases.
Gauss-Jordan Elimination Method - Example
Solve the system of equations:
x1 - 2x2 + 4x_3 = 12
-2x1 - x2 + 5x_3 = 18
-x1 + 3x2 - 3x_3 = -8
Solution:
Start with the augmented matrix and use the first row to create zeros in the first column.
This corresponds to using the first equation to eliminate x_1 from the second and third equations.
\begin{bmatrix} 1 & -2 & 4 & 12 \ -2 & -1 & 5 & 18 \ -1 & 3 & -3 & -8 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 12 \ 0 & -5 & 13 & 42 \ 0 & 1 & 1 & 4 \end{bmatrix}
Next multiply row 2 by -\frac{1}{5} to make the (2, 2) element 1. (This corresponds to making the coefficient of x_2 in the second equation 1.)
\sim \begin{bmatrix} 1 & -2 & 4 & 12 \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 1 & 1 & 4 \end{bmatrix}
Create zeros in the second column as follows. (This corresponds to using the second equation to eliminate x_2 from the first and third equations.)
\sim \begin{bmatrix} 1 & 0 & -\frac{6}{5} & -\frac{24}{5} \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 0 & \frac{18}{5} & \frac{62}{5} \end{bmatrix}
Multiply row 3 by \frac{5}{18}. (This corresponds to making the coefficient of x_3 in the third equation 1.)
\sim \begin{bmatrix} 1 & 0 & -\frac{6}{5} & -\frac{24}{5} \ 0 & 1 & -\frac{13}{5} & -\frac{42}{5} \ 0 & 0 & 1 & \frac{31}{9} \end{bmatrix}
Finally, create zeros in the third column. (This corresponds to using the third equation to eliminate x_3 from the first and second equations.)
\sim \begin{bmatrix} 1 & 0 & 0 & -\frac{2}{3} \ 0 & 1 & 0 & -\frac{1}{9} \ 0 & 0 & 1 & \frac{31}{9} \end{bmatrix}
This matrix corresponds to the system
x_1 = -\frac{2}{3}
x_2 = -\frac{1}{9}
x_3 = \frac{31}{9}
The solution is x1 = -\frac{2}{3}, x2 = -\frac{1}{9}, x_3 = \frac{31}{9}.
System of Equations with No Solution
Solve the system of equations if possible:
x1 + x2 + 5x_3 = 3
x2 + 3x3 = -1
x1 + 2x2 + 8x_3 = 3
Solution:
Starting with the augmented matrix we get
\begin{bmatrix} 1 & 1 & 5 & 3 \ 0 & 1 & 3 & -1 \ 1 & 2 & 8 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 5 & 3 \ 0 & 1 & 3 & -1 \ 0 & 1 & 3 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 2 & 4 \ 0 & 1 & 3 & -1 \ 0 & 0 & 0 & 1 \end{bmatrix}
The last row of this reduced echelon form gives the equation
0x1 + 0x2 + 0x_3 = 1
This equation cannot be satisfied for any values of x1, x2, and x_3. Thus the system has no solution. (This information was in fact available from the next-to-last matrix.)
System of Equations with Infinite Number of Solutions
Note that in the final matrix which is reduced echelon form 2nd and 3rd column do not have pivot elements. Those columns correspond to the independent variables. That is x2 and x3 are independent variables. Hence the solutions are x4 = 2, x2 = r, x3 = s and x1 = -2 -2x2 + x3 where r and s are any real numbers.
Homogenous System
Solve
x1 - 2x2 + 4x_3 = 0
2x1 - x2 + 5x_3 = 0
-x1 + 3x2 - 3x_3 = 0
\begin{bmatrix} 1 & -2 & 4 \ 2 & -1 & 5 \ -1 & 3 & -3 \end{bmatrix}
x1 = 0, x2 = 0 and x_3 = 0 is the only solution.
Practice Problems
The augmented matrix of a linear system has been reduced by row operations to the form shown. In each case, continue the appropriate row operations and describe the solution set of the original system.
Practice Problems
Find an equation involving g, h, and k that makes this augmented matrix correspond to a consistent system:
\begin{bmatrix} 0 & 2 & -4 & 7 & g \ 3 & 1 & -5 & 0 & h \ -2 & 4 & -9 & 5 & k \end{bmatrix}
Do the three planes x1 + 2x2 + x3 = 4, x2 - x3 = 1, and x1 + 3x_2 = 0 have at least one common point of intersection? Explain.
Gaussian Elimination Method
You don’t have to compute Reduced Echelon form to solve the system of linear equations. Instead it is enough to find the Echelon form. This method is called Gaussian Elimination method. HW: Follow this method for the previous examples to solve the system of linear equations.
One More Example:
Consider the following system where we seek all solutions for some a \in R:
-2x1 + 4x2 - 2x3 - x4 + 4x_5 = -3
4x1 - 8x2 + 3x3 - 3x4 + x_5 = 2
x1 - 2x2 + x3 - x4 + x_5 = 0
-x1 + 2x2 - 3x4 + 4x5 = a
Let us consider the augmented matrix
\begin{bmatrix} -2 & 4 & -2 & -1 & 4 & -3 \ 4 & -8 & 3 & -3 & 1 & 2 \ 1 & -2 & 1 & -1 & 1 & 0 \ -1 & 2 & 0 & -3 & 4 & a \end{bmatrix}
\sim \begin{bmatrix} 1 & -2 & 0 & 0 & 3 & 1 \ 0 & 0 & -1 & 1 & -3 & 2 \ 0 & 0 & 0 & -2 & 2 & -1-3 \ 0 & 0 & 0 & 0 & 0 & a+1 \end{bmatrix}
Now multiply the second row by -1 and the third row by \frac{-1}{3} to get the augmented matrix in its final form, known as the row-echelon form. These equations are equivalent to the original set of equations:
x1 - 2x2 + x3 - x4 + x_5 = 0
x3 - x4 + 3x_5 = -2
x4 - 2x5 = 1
0 = a + 1
The preceding set of equations cannot be solved when
a \neq -1.
The last equation is consistent only for a = -1 i.e when rank of the coefficient matrix is same the rank of augmented matrix then the system has solution.
In REF of a matrix, the leading entry is called as the pivot element and columns containing pivot elements are called as pivot columns. The corresponding variables are called pivot variables.
The variables corresponding to non pivot columns are called as free variables.
Consider
x1 - 2x2 + x3 - x4 + x_5 = 0
x3 - x4 + 3x_5 = -2
x4 - 2x5 = 1
Let x5 = t and x2 = s
Then x4 = 1 + 2t Then, we get x3 = -1 - t and x_1 = 2 + 2s + 2t
Thus, solution is (2 + 2s + 2t, s, -1 - t, 1 + 2t, t), \forall t, s \in R
The system is inconsistent if the rank of the coefficient matrix is less than the rank of augmented matrix.
The system is consistent if the rank of the coefficient matrix is same as the rank of augmented matrix.
For the consistent system, if there are no free variables, the solution is unique.
For the consistent system, if there are free variables, there are infinitely many solutions.
Gaussian Elimination and Gauss Jordan
Repeated use of elementary row operations done to convert a matrix to upper triangular form is called Gaussian elimination.
Consider the system of equations Ax = b where A is a n × n matrix.
If A is invertible, it means that A^{-1} exists such that AA^{-1} = A^{-1}A = I_n
In such a case the row-reduced echelon form of A is I_n, i.e every column is a pivot column where the pivot is 1.
The process of converting A to I_n that we have discussed above is called the Gauss-Jordan method.
Gaussian Elimination and Gauss Jordan
In Gaussian Elimination we use multiples of the first row to eliminate the entries in the first column below the first row.
Then we use multiples of the second row to eliminate entries in the second column below the second row and so on until we get an upper-triangular matrix.
This Gauss Jordan process is shown diagramatically in the next slide.
Then we take multiples of the last row to eliminate non-zero entries in the last column above the last entry, followed by multiples of the last but one row to eliminate non-zero entries in the last but one column and so on. This gives us a diagonal matrix.
Calculating Inverse
Can the Gauss Jordan procedure calculate the inverse of a matrix?
For example, let A be a n × n matrix whose inverse A^{-1} exists. We would like to compute its inverse using Gauss Jordan procedure. Is this possible?
Yes we can compute the inverse in the following way: we simply set up n linear systems of the form Ax = ei, 1 \leq i \leq n where ei - is the i th canonical basis vector and find their solutions x. Each solution vector constitutes a column in A^{-1}. Why is this true?
Find the Inverse of A if it Exists
A = \begin{bmatrix} 1 & -1 & -2 \ 2 & -3 & -5 \ -1 & 3 & 5 \end{bmatrix}
Solution
Applying the method of Gauss-Jordan elimination we get
[A:I] = \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 2 & -3 & -5 & 0 & 1 & 0 \ -1 & 3 & 5 & 0 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 0 & -1 & -1 & -2 & 1 & 0 \ 0 & 2 & 3 & 1 & 0 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & -1 & -2 & 1 & 0 & 0 \ 0 & 1 & 1 & 2 & -1 & 0 \ 0 & 0 & 1 & -3 & 2 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & 0 & 0 & 5 & -3 & -1 \ 0 & 1 & 0 & 0 & -1 & -3 & 2 & 1 \ 0 & 0 & 1 & 1 & -3 & 2 & 1 \end{bmatrix}
Hence inverse of A = \begin{bmatrix} 0 & 5 & -3 & -1\ 0 & -1 & -3 & 2 & 1 \1 & -3 & 2 & 1 \end{bmatrix}
Find the Inverse of A if it Exists
Since the LHS 3x3 matrix can not be brought to I_3 Inverse of A does not exist.