Linear Algebra and Matrix Theory Foundations

Linear Equations and Systems

Reduction and Elementary Row Operations: The fundamental process for solving systems uses three types of row operations, none of which changes the solution set: * Scaling: Multiplying a row by a non-zero scalar ( $R_1 \rightarrow cR_1$ with $c \neq 0$ ). * Interchanging: Swapping the positions of two rows in a matrix or system ( $R_1 \leftrightarrow R_2$ ). * Replacement/Addition: Adding a multiple of one row to another row ( $R_2 \rightarrow cR_1 + R_2$ ).
Variables and Unknowns: In a system such as $2x_1 + 3x_2 = b$ , the terms $x_1, x_2$ are unknowns, while the values multiplied by them are coefficients.
System Dimensions: * 2 Dimensions: Each equation represents a line in a plane (e.g., $2x_1 + 3x_2 = b$ ). * 3 Dimensions: Each equation represents a plane in space (e.g., $3x_1 + 4x_2 + x_3 = b$ ).
The Solution Set: The set containing all possible values that satisfy every equation in the system. * Guessing Solutions: While one might guess a solution like $(1, 2, 0)$ , it is not systematic. * Possible Outcomes: A linear system has either no solution, exactly one solution, or infinitely many solutions.
Visualizing Solutions: * One Solution: The lines or planes intersect at a single point. * Infinitely Many Solutions: Occurs when the equations represent the same line/plane or the planes intersect along a line. For example, a single linear equation in 3 variables defines a plane, so every point of that plane is a solution. * No Solution (Inconsistent): Occurs when the lines or planes have no point in common, for example distinct parallel lines or planes.

Row Reduction and Echelon Forms

Echelon Form (REF): A matrix is in echelon form if: * All nonzero rows are above any rows of all zeros. * Each leading entry (the leftmost non-zero entry) of a row is in a column to the right of the leading entry of the row above it. * All entries in a column below a leading entry are zeros.
Reduced Row Echelon Form (RREF): A matrix is in RREF if it satisfies REF conditions plus: * The leading entry in each nonzero row is 1. * Each leading 1 is the only nonzero entry in its column.
Pivot Position and Column: * Pivot Position: A location in matrix $A$ that corresponds to a leading 1 in the RREF of $A$ . * Pivot Column: A column of $A$ that contains a pivot position.
Consistency Test: A system is consistent if and only if the rightmost column of the augmented matrix is not a pivot column (i.e., no row of the form $[0, 0, \text{...}, 0 | b]$ where $b \neq 0$ ).
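The row operations and the consistency test above can be sketched in a few lines of Python. This is a minimal illustration, not a course-provided routine: the function names `rref` and `is_consistent` are assumptions made for this example, and exact `Fraction` arithmetic is used so that no rounding hides a pivot.

```python
from fractions import Fraction

def rref(matrix):
    """Return (R, pivot_cols): the reduced row echelon form and its pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    m, n = len(rows), len(rows[0])
    pivot_cols = []
    r = 0  # index of the row that will hold the next pivot
    for c in range(n):
        # Look for a nonzero entry in column c, at or below row r.
        pivot = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]          # interchange
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]                # scaling
        for i in range(m):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                # replacement: R_i -> R_i - factor * R_r
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == m:
            break
    return rows, pivot_cols

def is_consistent(augmented):
    """Consistent iff the rightmost column of [A | b] is not a pivot column."""
    _, pivots = rref(augmented)
    return (len(augmented[0]) - 1) not in pivots

# Example system: x1 + 2*x2 = 5 and 3*x1 + 4*x2 = 6
R, pivots = rref([[1, 2, 5], [3, 4, 6]])
```

Here `R` encodes the unique solution $x_1 = -4$ , $x_2 = 9/2$ , while `is_consistent([[1, 1, 1], [1, 1, 2]])` is false because the reduced form contains a row $[0, 0 | 1]$ .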

Vector Equations and Span

Vectors in $\mathbb{R}^n$ : The set of all ordered n-tuples of real numbers ( $x_1, x_2, \text{...}, x_n$ ).
Operations: * Scalar Multiple: Multiplying a vector by a real number $c$ such that $c\mathbf{u} = \begin{pmatrix} cu_1 \\ cu_2 \end{pmatrix}$ . * Vector Addition: Adding components such that $\mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix}$ . Addition follows the Parallelogram Rule.
Linear Combinations: Given vectors $\mathbf{v_1}, \mathbf{v_2}, \text{...}, \mathbf{v_p}$ and weights $c_1, c_2, \text{...}, c_p$ , the vector $\mathbf{y} = c_1\mathbf{v_1} + \text{...} + c_p\mathbf{v_p}$ is a linear combination.
Span: $Span\{\mathbf{v_1}, \text{...}, \mathbf{v_p}\}$ is the set of all linear combinations of the vectors. Geometrically, in $\mathbb{R}^3$ , the span of two non-parallel vectors is a plane through the origin.

Matrix Equations

Matrix-Vector Product: If $A$ is an $m \times n$ matrix with columns $\mathbf{a_1}, \text{...}, \mathbf{a_n}$ and $\mathbf{x} \in \mathbb{R}^n$ , then $A\mathbf{x}$ is the linear combination of the columns of $A$ using the entries in $\mathbf{x}$ as weights: $A\mathbf{x} = x_1\mathbf{a_1} + x_2\mathbf{a_2} + \text{...} + x_n\mathbf{a_n}$ .
Equivalent Descriptions: The following have the same solution set: * The matrix equation $A\mathbf{x} = \mathbf{b}$ . * The vector equation $x_1\mathbf{a_1} + \text{...} + x_n\mathbf{a_n} = \mathbf{b}$ . * The linear system with augmented matrix $[A | \mathbf{b}]$ .
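As a small check of the column view of $A\mathbf{x}$ , the sketch below computes the product two ways: as a weighted sum of columns and by the usual row-by-row rule. The function names and the sample matrix are chosen only for illustration.

```python
def matvec_as_combination(A, x):
    """Compute A x as x1*a1 + ... + xn*an, where a_j is column j of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        column = [A[i][j] for i in range(m)]
        # Add the j-th column scaled by the weight x[j].
        result = [r + x[j] * a for r, a in zip(result, column)]
    return result

def matvec_by_rows(A, x):
    """Compute A x entry by entry as (row of A) dot x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[1, 2, -1],
     [0, -5, 3]]
x = [4, 3, 7]
by_columns = matvec_as_combination(A, x)
by_rows = matvec_by_rows(A, x)
```

Both computations give the same vector, which is why the matrix equation and the vector equation have the same solution set.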

Solution Sets of Linear Systems

Homogeneous Systems: A system $A\mathbf{x} = \mathbf{0}$ is homogeneous. It always has the trivial solution $\mathbf{x} = \mathbf{0}$ . * It has a non-trivial solution if and only if there is at least one free variable (a column without a pivot).
Inhomogeneous Systems: Systems of the form $A\mathbf{x} = \mathbf{b}$ where $\mathbf{b} \neq \mathbf{0}$ .
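Parametric Vector Form: Solutions can be expressed as $\mathbf{x} = \mathbf{p} + t\mathbf{v}$ , where $\mathbf{p}$ is a particular solution of $A\mathbf{x} = \mathbf{b}$ and $t\mathbf{v}$ ranges over the solutions of the corresponding homogeneous system $A\mathbf{x} = \mathbf{0}$ . This form has one free variable $t$ ; with more free variables, each contributes its own term ( $\mathbf{x} = \mathbf{p} + t_1\mathbf{v_1} + \dots + t_k\mathbf{v_k}$ ).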

Linear Independence

Linearly Independent: A set of vectors $\mathbf{v_1}, \text{...}, \mathbf{v_p}$ is independent if the equation $c_1\mathbf{v_1} + \text{...} + c_p\mathbf{v_p} = \mathbf{0}$ has only the trivial solution ( $c_i = 0$ for all $i$ ). * This occurs if and only if the matrix $A = [\mathbf{v_1} \dots \mathbf{v_p}]$ has a pivot in every column.
Linearly Dependent: The equation $c_1\mathbf{v_1} + \text{...} + c_p\mathbf{v_p} = \mathbf{0}$ has a non-trivial solution. For a set of two or more vectors, this means at least one vector can be written as a linear combination of the others. * Any set containing the zero vector is linearly dependent. * If a set contains more vectors than there are entries in each vector ( $p > n$ in $\mathbb{R}^n$ ), it must be linearly dependent.

Linear Transformations

Definition: A transformation $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$ is linear if: * $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$ * $T(c\mathbf{u}) = cT(\mathbf{u})$
Terminology: * Domain: $\mathbb{R}^n$ (input space). * Codomain: $\mathbb{R}^m$ (target space). * Range: The set of all images $T(\mathbf{x})$ .
Standard Matrix: Every linear transformation $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$ can be represented by a matrix $A = [T(\mathbf{e_1}) \dots T(\mathbf{e_n})]$ , where $\mathbf{e_1}, \text{...}, \mathbf{e_n}$ are columns of the identity matrix $I_n$ .
Geometric Transformations in $\mathbb{R}^2$ : * Rotation: Rotates vectors counterclockwise by angle $\theta$ . Standard matrix: $A = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}$ . * Reflection: Across $x_2 = 0$ ( $x_1$ -axis): $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ ; across $x_1 = 0$ : $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ ; across $x_2 = x_1$ : $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ . * Contraction/Expansion: Horizontal ( $\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$ ) or Vertical ( $\begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$ ). * Shear: Horizontal ( $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$ ) or Vertical ( $\begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}$ ). * Projection: Onto $x_1$ -axis: $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ .
Properties: * Onto: $T$ is onto if the range equals the codomain (A has a pivot in every row). * One-to-one: $T$ is one-to-one if each image is the result of at most one input (A has a pivot in every column).
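The recipe $A = [T(\mathbf{e_1}) \dots T(\mathbf{e_n})]$ can be sketched directly: apply $T$ to each column of the identity and collect the images as columns. The helper names below ( `standard_matrix` , `rotate` , `horizontal_shear` ) are assumptions made for this example, and the map passed in is assumed to be linear.

```python
import math

def standard_matrix(T, n):
    """Build A = [T(e1) ... T(en)] for a linear map T given as a Python function."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th column of I_n
        columns.append(T(e_j))
    m = len(columns[0])
    # columns[j][i] is entry (i, j) of A, so transpose into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def rotate(theta):
    """Counterclockwise rotation of R^2 by the angle theta."""
    def T(x):
        return [math.cos(theta) * x[0] - math.sin(theta) * x[1],
                math.sin(theta) * x[0] + math.cos(theta) * x[1]]
    return T

def horizontal_shear(x, k=2):
    """Horizontal shear of R^2 with factor k."""
    return [x[0] + k * x[1], x[1]]

A_shear = standard_matrix(horizontal_shear, 2)
A_rot = standard_matrix(rotate(math.pi / 2), 2)
```

`A_shear` reproduces the horizontal shear matrix with $k = 2$ , and `A_rot` matches the rotation matrix for $\theta = \pi/2$ up to floating-point rounding.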

Matrix Operations and Inverses

Matrix Algebra: * Addition: Defined for matrices of the same dimension. * Multiplication: For $A (m \times n)$ and $B (n \times p)$ , the product $AB$ is $(m \times p)$ . Note: Generally, $AB \neq BA$ . * Transpose ( $A^T$ ): Columns become rows.
Inverses: A square matrix $A$ is invertible if there exists $C$ such that $AC = CA = I$ . * $2 \times 2$ Matrix Inverse: If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ , then $A^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ , provided the determinant $ad-bc \neq 0$ . * Algorithm for $n \times n$ Inverse: Row reduce the augmented matrix $[A | I]$ . If it reduces to $[I | B]$ , then $B = A^{-1}$ .
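A minimal sketch of the $2 \times 2$ inverse formula, using exact fractions; the function name `inverse_2x2` and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Invert [[a, b], [c, d]] using the formula (1/(ad - bc)) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible: ad - bc = 0")
    s = 1 / Fraction(det)
    return [[s * d, -s * b],
            [-s * c, s * a]]

A = [[3, 4],
     [5, 6]]
A_inv = inverse_2x2(A)  # det = 3*6 - 4*5 = -2
```

Multiplying `A` by `A_inv` in either order gives the identity, which is the defining property $AC = CA = I$ .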
Invertible Matrix Theorem (IMT): For a square $n \times n$ matrix $A$ , the following are equivalent: * $A$ is invertible. * $A$ is row equivalent to $I_n$ . * $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. * The columns of $A$ are linearly independent and span $\mathbb{R}^n$ . * $det(A) \neq 0$ .

LU Factorization

Definition: Writing a matrix $A$ as the product of a lower triangular matrix $L$ (with 1s on diagonal) and an upper triangular matrix $U$ (echelon form): $A = LU$ .
Solving Systems with LU: To solve $A\mathbf{x} = \mathbf{b}$ , let $U\mathbf{x} = \mathbf{y}$ . 1. Solve $L\mathbf{y} = \mathbf{b}$ (Forward substitution). 2. Solve $U\mathbf{x} = \mathbf{y}$ (Backward substitution).
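The two-step solve can be sketched as follows. This example assumes a square matrix whose pivots are all nonzero, so no row interchanges are needed; the function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction

def lu_factor(A):
    """LU factorization without row interchanges; L has 1s on its diagonal."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0:
            raise ValueError("zero pivot: this sketch does not do row interchanges")
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]   # multiplier used to clear entry (i, k)
            U[i] = [u_i - L[i][k] * u_k for u_i, u_k in zip(U[i], U[k])]
    return L, U

def forward_substitution(L, b):
    """Solve L y = b from the top row down."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[j] * y[j] for j in range(i))) / row[i])
    return y

def backward_substitution(U, y):
    """Solve U x = y from the bottom row up."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

A = [[2, 1],
     [4, 5]]
b = [3, 9]
L, U = lu_factor(A)
y = forward_substitution(L, b)   # step 1: L y = b
x = backward_substitution(U, y)  # step 2: U x = y
```

Once $L$ and $U$ are known, solving for a new right-hand side $\mathbf{b}$ only requires the two substitution steps, which is the main reason to factor $A$ first.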

Subspaces, Dimension, and Rank

Subspace: A subset of $\mathbb{R}^n$ that contains the zero vector and is closed under addition and scalar multiplication.
Fundamental Subspaces: * Column Space ( $Col A$ ): Span of the columns of $A$ . * Null Space ( $Nul A$ ): Set of all solutions to $A\mathbf{x} = \mathbf{0}$ .
Basis: A linearly independent set in a subspace that spans it. * Basis for $Col A$ : Use the pivot columns of the original matrix. * Basis for $Nul A$ : Use the vectors from the parametric solution of $A\mathbf{x} = \mathbf{0}$ .
Dimension ( $dim$ ): The number of vectors in a basis.
Rank: $rank(A) = dim(Col A)$ .
Rank Theorem: $rank(A) + dim(Nul A) = n$ (number of columns).

Determinants

Determinant ( $det A$ ): Computed via cofactor expansion along any row or column. * $C_{ij} = (-1)^{i+j} det(A_{ij})$ . * Triangular matrices: Determinant is the product of diagonal entries.
Properties and Operations: * Row swap: Changes sign ( $-det A$ ). * Row scaling by $k$ : Scales determinant by $k$ . * Row addition: Does not change the determinant. * $det(AB) = det(A) det(B)$ . * $det(A^T) = det(A)$ .
Geometric interpretation: In $\mathbb{R}^2$ , $|det A|$ is the area of a parallelogram. In $\mathbb{R}^3$ , $|det A|$ is the volume of a parallelepiped.
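Cofactor expansion translates into a short recursive function. The sketch below expands along the first row only and is meant to illustrate the definition, not to be efficient; the names `minor` and `det` are chosen for this example.

```python
def minor(A, i, j):
    """Return A with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based column index j, the cofactor sign (-1)^(i+j) becomes (-1)^j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

A = [[1, 5, 0],
     [2, 4, -1],
     [0, -2, 0]]
swapped = [A[1], A[0], A[2]]   # interchange rows 1 and 2
triangular = [[2, 7],
              [0, 3]]
```

For this `A` the determinant is $-2$ , the row swap flips the sign to $2$ , and the triangular example gives the product of its diagonal entries, $6$ .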

Markov Chains

Stochastic Matrix: A square matrix whose columns are probability vectors (entries are $\ge 0$ and sum to 1).
Steady-State Vector: A probability vector $\mathbf{q}$ such that $P\mathbf{q} = \mathbf{q}$ .
Regular Stochastic Matrix: There exists some power $P^k$ where all entries are strictly positive. * Convergence Theorem: If $P$ is regular, there is a unique steady-state vector $\mathbf{q}$ , and for any initial $\mathbf{x_0}$ , the sequence $P^n\mathbf{x_0}$ converges to $\mathbf{q}$ .
Google PageRank: Uses a Google Matrix $G = pP^* + (1-p)K$ , where $p$ is a damping factor (typically $0.85$ ) and $K$ has every entry equal to $1/n$ .
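The convergence theorem can be observed numerically by repeatedly applying $P$ to a starting probability vector. The $2 \times 2$ matrix below is made up for illustration, and `google_matrix` is a hypothetical helper that simply applies the formula $G = pP^* + (1-p)K$ to a stochastic matrix it is given.

```python
def step(P, x):
    """One step of the chain: return P x."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]

def google_matrix(P_star, p=0.85):
    """Form G = p * P_star + (1 - p) * K, where every entry of K is 1/n."""
    n = len(P_star)
    return [[p * P_star[i][j] + (1 - p) / n for j in range(n)] for i in range(n)]

# Illustrative regular stochastic matrix: each column sums to 1.
P = [[0.9, 0.3],
     [0.1, 0.7]]

x = [1.0, 0.0]           # initial probability vector x_0
for _ in range(200):     # x approaches the steady-state vector q
    x = step(P, x)

G = google_matrix(P)
```

Solving $P\mathbf{q} = \mathbf{q}$ by hand for this matrix gives $\mathbf{q} = (0.75, 0.25)$ , and the iteration settles on the same vector. The columns of `G` still sum to 1, so `G` is again stochastic.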

Eigenvalues and Eigenvectors

Definitions: If $A\mathbf{v} = \lambda \mathbf{v}$ for $\mathbf{v} \neq \mathbf{0}$ , $\mathbf{v}$ is an eigenvector and $\lambda$ is an eigenvalue.
Characteristic Equation: $det(A - \lambda I) = 0$ . The roots are the eigenvalues.
Multiplicity: * Algebraic Multiplicity: The count of an eigenvalue as a root of the characteristic polynomial. * Geometric Multiplicity: The dimension of the eigenspace ( $dim(Nul(A - \lambda I))$ ).
Diagonalization: $A = PDP^{-1}$ where $D$ is a diagonal matrix of eigenvalues and $P$ is a matrix of corresponding eigenvectors. A matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors.
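Complex Eigenvalues: For a real matrix, complex eigenvalues occur in conjugate pairs. For a $2 \times 2$ rotation-dilation matrix $C = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ , eigenvalues are $a \pm bi$ , corresponding to a scaling of factor $r = \sqrt{a^2 + b^2}$ and rotation by the angle $\phi$ of the point $(a, b)$ , so that $a = r\cos\phi$ and $b = r\sin\phi$ (for $a > 0$ , $\phi = \tan^{-1}(b/a)$ ).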
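For a $2 \times 2$ matrix the characteristic equation is the quadratic $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$ , so the eigenvalues follow from the quadratic formula. The sketch below is an illustration under that assumption (it handles only the $2 \times 2$ case), with `eigenvalues_2x2` as a made-up name.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of det(A - lambda*I) = lambda^2 - trace*lambda + det for a 2x2 matrix."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)  # complex square root
    return (trace + disc) / 2, (trace - disc) / 2

# Rotation-dilation matrix with a = 3, b = 4
C = [[3, -4],
     [4, 3]]
lam1, lam2 = eigenvalues_2x2(C)
r = abs(lam1)             # scaling factor sqrt(a^2 + b^2)
phi = cmath.phase(lam1)   # rotation angle, the argument of a + bi
```

The two eigenvalues are the conjugate pair $3 \pm 4i$ , with scaling factor $r = 5$ . Using the argument of $a + bi$ rather than $\tan^{-1}(b/a)$ gives the correct angle in every quadrant.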

Orthogonality and Least Squares

Dot Product: $\mathbf{u} \cdot \mathbf{v} = u_1v_1 + \dots + u_nv_n$ . If $\mathbf{u} \cdot \mathbf{v} = 0$ , they are orthogonal.
Length (Magnitude): $\lVert \mathbf{u} \rVert = \sqrt{\mathbf{u} \cdot \mathbf{u}}$ .
Orthogonal Projection: The projection of $\mathbf{y}$ onto a subspace $W$ is $\hat{\mathbf{y}} = proj_W \mathbf{y}$ . * Best Approximation Theorem: $\hat{\mathbf{y}}$ is the unique closest vector in $W$ to $\mathbf{y}$ .
Gram-Schmidt Process: An algorithm to produce an orthogonal basis $\mathbf{v_1}, \dots, \mathbf{v_p}$ from any basis $\mathbf{x_1}, \dots, \mathbf{x_p}$ .
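A minimal sketch of the process: each new vector is $\mathbf{x_k}$ minus its projections onto the orthogonal vectors already found. The function names are illustrative, the input is assumed to be linearly independent, and the output is orthogonal but not normalized.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return an orthogonal (not normalized) basis for the span of `basis`."""
    orthogonal = []
    for x in basis:
        v = [Fraction(c) for c in x]
        for w in orthogonal:
            # Subtract the projection of x onto the earlier vector w.
            coeff = dot(x, w) / dot(w, w)
            v = [vi - coeff * wi for vi, wi in zip(v, w)]
        orthogonal.append(v)
    return orthogonal

x1 = [3, 6, 0]
x2 = [1, 2, 2]
v1, v2 = gram_schmidt([x1, x2])
```

Here $\mathbf{v_1} = \mathbf{x_1}$ and $\mathbf{v_2} = \mathbf{x_2} - \frac{15}{45}\mathbf{x_1} = (0, 0, 2)$ , and the two results are orthogonal. Dividing each vector by its length would give an orthonormal basis.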
QR Decomposition: Factoring $A = QR$ where $Q$ has orthonormal columns and $R$ is upper triangular.
Least-Squares Solution: Finding $\hat{\mathbf{x}}$ that minimizes $\lVert \mathbf{b} - A\mathbf{x} \rVert$ . Solutions must satisfy the Normal Equations: $A^T A \mathbf{x} = A^T \mathbf{b}$ .
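As a worked instance of the normal equations, the sketch below fits a line $y = c_0 + c_1 t$ to data points. It assumes the design matrix $A$ has rows $[1, t_i]$ , so $A^T A$ is $2 \times 2$ and can be solved in closed form; the function name and the data points are chosen for illustration.

```python
from fractions import Fraction

def least_squares_line(points):
    """Fit y = c0 + c1*t by solving the normal equations A^T A c = A^T b."""
    n = len(points)
    st = sum(Fraction(t) for t, _ in points)        # sum of t_i
    stt = sum(Fraction(t) ** 2 for t, _ in points)  # sum of t_i^2
    sy = sum(Fraction(y) for _, y in points)        # sum of y_i
    sty = sum(Fraction(t) * y for t, y in points)   # sum of t_i * y_i
    # With rows [1, t_i]: A^T A = [[n, st], [st, stt]] and A^T b = [sy, sty].
    det = n * stt - st * st
    if det == 0:
        raise ValueError("A^T A is singular: the t values are all equal")
    c0 = (stt * sy - st * sty) / det
    c1 = (n * sty - st * sy) / det
    return c0, c1

points = [(2, 1), (5, 2), (7, 3), (8, 3)]
c0, c1 = least_squares_line(points)
```

For these points $A^T A = \begin{pmatrix} 4 & 22 \\ 22 & 142 \end{pmatrix}$ and $A^T \mathbf{b} = \begin{pmatrix} 9 \\ 57 \end{pmatrix}$ , which gives $c_0 = 2/7$ and $c_1 = 5/14$ .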

Singular Value Decomposition (SVD)

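Definition: Writing any $m \times n$ matrix as $A = U\Sigma V^T$ , where $U$ ( $m \times m$ ) and $V$ ( $n \times n$ ) are orthogonal matrices and $\Sigma$ ( $m \times n$ ) has the singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$ on its diagonal, with $\sigma_i = \sqrt{\lambda_i}$ (square roots of the eigenvalues of $A^T A$ ).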
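Fundamental Subspaces and SVD: With $r = rank(A)$ (the number of nonzero singular values): * Columns $\mathbf{u_1} \dots \mathbf{u_r}$ of $U$ form an orthonormal basis for $Col A$ . * Columns $\mathbf{v_{r+1}} \dots \mathbf{v_n}$ of $V$ form an orthonormal basis for $Nul A$ .
Spectral Decomposition: $A = \sum_{i=1}^r \sigma_i \mathbf{u_i}\mathbf{v_i}^T$ . Keeping only the first $j$ terms (the largest singular values) gives a rank-j approximation of $A$ , used extensively in image compression and data analysis.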

Questions & Discussion

Course Organization: Worksheet solutions are intentionally not provided, to prepare students for upper-level courses that lack studios and provided answers. Students should use Piazza, office hours, and software like MATLAB or Octave to check their work.
True/False Strategies: TAs recommend looking at simple examples or counter-examples. For instance, if a system has more unknowns than equations, it cannot have a unique solution (True, as there will be free variables).
Optimization: How to find the max value of a quadratic form $Q(\mathbf{x}) = \mathbf{x^T}A\mathbf{x}$ subject to $\lVert \mathbf{x} \rVert = 1$ ? The maximum value is the largest eigenvalue $\lambda_1$ of the symmetric matrix $A$ .

A symmetric matrix can always be orthogonally diagonalized, meaning that it can be expressed in the form $A = PDP^{-1} = PDP^T$ , where:

$D$ is a diagonal matrix containing the eigenvalues of the symmetric matrix.
$P$ is a matrix whose columns are the corresponding normalized (orthonormal) eigenvectors of the symmetric matrix.

Properties of Symmetric Matrices and Diagonalization:

Real Eigenvalues: All eigenvalues of a symmetric matrix are real numbers.
Orthogonal Eigenvectors: Eigenvectors corresponding to distinct eigenvalues are orthogonal to each other.
Diagonalization: A matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors. An $n \times n$ symmetric matrix always does, and they can be chosen to be orthonormal.
Implication: For a symmetric matrix, the process of diagonalization can simplify many matrix operations, particularly in applications involving quadratic forms or solving differential equations.