Linear Algebra and Matrix Theory Foundations

Linear Equations and Systems

  • Reduction and Elementary Row Operations: The fundamental process for solving systems involves three types of row operations to reach a solution set:     * Scaling: Multiplying a row by a non-zero scalar (e.g., E×tE \times t outside to 6).     * Interchanging: Swapping the positions of two rows in a matrix or system.     * Replacement/Addition: Adding a multiple of one row to another row (R2cR1+B2R_2 \rightarrow cR_1 + B_2).

  • Variables and Unknowns: In a system such as 2x1+3x2=b2x_1 + 3x_2 = b, the terms x1,x2x_1, x_2 are unknowns, while the values multiplied by them are coefficients.

  • System Dimensions:     * 2 Dimensions: Represents lines in a plane (e.g., 2x1+3x2=b2x_1 + 3x_2 = b).     * 3 Dimensions: Represents planes in space (e.g., 3x1+4x2+x3=z3x_1 + 4x_2 + x_3 = z).

  • The Solution Set: The set containing all possible values that satisfy every equation in the system.     * Guessing Solutions: While one might guess a solution like (1,2,0)(1, 2, 0), it is not systematic.     * Possible Outcomes: A linear system has either no solution, exactly one solution, or infinitely many solutions.

  • Visualizing Solutions:     * One Solution: The intersection of lines or planes at a single point.     * Infinitely Many Solutions: Occurs when equations represent the same line/plane or intersect along a line. For example, a linear equation in 3 variables defines a plane.     * No Solution (Inconsistent): Occurs when lines or planes are parallel and do not intersect.

Row Reduction and Echelon Forms

  • Echelon Form (REF): A matrix is in echelon form if:     * All nonzero rows are above any rows of all zeros.     * Each leading entry (the leftmost non-zero entry) of a row is in a column to the right of the leading entry of the row above it.     * All entries in a column below a leading entry are zeros.

  • Reduced Row Echelon Form (RREF): A matrix is in RREF if it satisfies REF conditions plus:     * The leading entry in each nonzero row is 1.     * Each leading 1 is the only nonzero entry in its column.

  • Pivot Position and Column:     * Pivot Position: A location in matrix AA that corresponds to a leading 1 in the RREF of AA.     * Pivot Column: A column of AA that contains a pivot position.

  • Consistency Test: A system is consistent if and only if the rightmost column of the augmented matrix is not a pivot column (i.e., no row of the form [0,0,...,0b][0, 0, \text{...}, 0 | b] where b0b \neq 0).

Vector Equations and Span

  • Vectors in Rn\mathbb{R}^n: The set of all ordered n-tuples of real numbers (x1,x2,...,xnx_1, x_2, \text{...}, x_n).

  • Operations:     * Scalar Multiple: Multiplying a vector by a real number cc such that cu=(cu1cu2)c\mathbf{u} = \begin{pmatrix} cu_1 \\ cu_2 \end{pmatrix}.     * Vector Addition: Adding components such that u+v=(u1+v1u2+v2)\mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \end{pmatrix}. Addition follows the Parallelogram Rule.

  • Linear Combinations: Given vectors v1,v2,...,vp\mathbf{v_1}, \mathbf{v_2}, \text{...}, \mathbf{v_p} and weights c1,c2,...,cpc_1, c_2, \text{...}, c_p, the vector y=c1v1+...+cpvp\mathbf{y} = c_1\mathbf{v_1} + \text{...} + c_p\mathbf{v_p} is a linear combination.

  • Span: Span{v1,...,vp}Span\{\mathbf{v_1}, \text{...}, \mathbf{v_p}\} is the set of all linear combinations of the vectors. Geometrically, in R3\mathbb{R}^3, the span of two non-parallel vectors is a plane through the origin.

Matrix Equations

  • Matrix-Vector Product: If AA is an m×nm \times n matrix with columns a1,...,an\mathbf{a_1}, \text{...}, \mathbf{a_n} and xRn\mathbf{x} \in \mathbb{R}^n, then AxA\mathbf{x} is the linear combination of the columns of AA using the entries in x\mathbf{x} as weights: Ax=x1a1+x2a2+...+xnanA\mathbf{x} = x_1\mathbf{a_1} + x_2\mathbf{a_2} + \text{...} + x_n\mathbf{a_n}.

  • Equivalent Descriptions: The following have the same solution set:     * The matrix equation Ax=bA\mathbf{x} = \mathbf{b}.     * The vector equation x1a1+...+xnan=bx_1\mathbf{a_1} + \text{...} + x_n\mathbf{a_n} = \mathbf{b}.     * The linear system with augmented matrix [Ab][A | \mathbf{b}].

Solution Sets of Linear Systems

  • Homogeneous Systems: A system Ax=0A\mathbf{x} = \mathbf{0} is homogeneous. It always has the trivial solution x=0\mathbf{x} = \mathbf{0}.     * It has a non-trivial solution if and only if there is at least one free variable (a column without a pivot).

  • Inhomogeneous Systems: Systems of the form Ax=bA\mathbf{x} = \mathbf{b} where b0\mathbf{b} \neq \mathbf{0}.

  • Parametric Vector Form: Solutions can be expressed as x=p+tv\mathbf{x} = \mathbf{p} + t\mathbf{v}, where p\mathbf{p} is a particular solution and tvt\mathbf{v} represents the solution to the corresponding homogeneous system.

Linear Independence

  • Linearly Independent: A set of vectors v1,...,vp\mathbf{v_1}, \text{...}, \mathbf{v_p} is independent if the equation c1v1+...+cpvp=0c_1\mathbf{v_1} + \text{...} + c_p\mathbf{v_p} = \mathbf{0} has only the trivial solution (ci=0c_i = 0 for all ii).     * This occurs if the matrix A=[v1vp]A = [\mathbf{v_1} \dots \mathbf{v_p}] has a pivot in every column.

  • Linearly Dependent: At least one vector can be written as a linear combination of the others.     * Any set containing the zero vector is linearly dependent.     * If a set contains more vectors than there are entries in each vector (p > n in Rn\mathbb{R}^n), it must be linearly dependent.

Linear Transformations

  • Definition: A transformation T:RnRmT: \mathbb{R}^n \rightarrow \mathbb{R}^m is linear if:     * T(u+v)=T(u)+T(v)T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})     * T(cu)=cT(u)T(c\mathbf{u}) = cT(\mathbf{u})

  • Terminology:     * Domain: Rn\mathbb{R}^n (input space).     * Codomain: Rm\mathbb{R}^m (target space).     * Range: The set of all images T(x)T(\mathbf{x}).

  • Standard Matrix: Every linear transformation T:RnRmT: \mathbb{R}^n \rightarrow \mathbb{R}^m can be represented by a matrix A=[T(e1)T(en)]A = [T(\mathbf{e_1}) \dots T(\mathbf{e_n})], where e1,...,en\mathbf{e_1}, \text{...}, \mathbf{e_n} are columns of the identity matrix InI_n.

  • Geometric Transformations in R2\mathbb{R}^2:     * Rotation: Rotates vectors counterclockwise by angle θ\theta. Standard matrix: A=(cos(θ)amp;sin(θ)sin(θ)amp;cos(θ))A = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.     * Reflection: Across x2=0x_2 = 0 (x1x_1-axis): (1amp;00amp;1)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; across x1=0x_1 = 0: (1amp;00amp;1)\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}; across x2=x1x_2 = x_1: (0amp;11amp;0)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.     * Contraction/Expansion: Horizontal ((kamp;00amp;1)\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}) or Vertical ((1amp;00amp;k)\begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}).     * Shear: Horizontal ((1amp;k0amp;1)\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}) or Vertical ((1amp;0kamp;1)\begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}).     * Projection: Onto x1x_1-axis: (1amp;00amp;0)\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.

  • Properties:     * Onto: TT is onto if the range equals the codomain (A has a pivot in every row).     * One-to-one: TT is one-to-one if each image is the result of at most one input (A has a pivot in every column).

Matrix Operations and Inverses

  • Matrix Algebra:     * Addition: Defined for matrices of the same dimension.     * Multiplication: For A(m×n)A (m \times n) and B(n×p)B (n \times p), the product ABAB is (m×p)(m \times p). Note: Generally, ABBAAB \neq BA.     * Transpose (ATA^T): Columns become rows.

  • Inverses: A square matrix AA is invertible if there exists CC such that AC=CA=IAC = CA = I.     * 2×22 \times 2 Matrix Inverse: If A=(aamp;bcamp;d)A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, then A1=1adbc(damp;bcamp;a)A^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, provided the determinant adbc0ad-bc \neq 0.     * Algorithm for n×nn \times n Inverse: Row reduce the augmented matrix [AI][A | I]. If it reduces to [IB][I | B], then B=A1B = A^{-1}.

  • Invertible Matrix Theorem (IMT): For a square n×nn \times n matrix AA, the following are equivalent:     * AA is invertible.     * AA is row equivalent to InI_n.     * Ax=0A\mathbf{x} = \mathbf{0} has only the trivial solution.     * The columns of AA are linearly independent and span Rn\mathbb{R}^n.     * det(A)0det(A) \neq 0.

LU Factorization

  • Definition: Writing a matrix AA as the product of a lower triangular matrix LL (with 1s on diagonal) and an upper triangular matrix UU (echelon form): A=LUA = LU.

  • Solving Systems with LU: To solve Ax=bA\mathbf{x} = \mathbf{b}, let Ux=yU\mathbf{x} = \mathbf{y}.     1. Solve Ly=bL\mathbf{y} = \mathbf{b} (Forward substitution).     2. Solve Ux=yU\mathbf{x} = \mathbf{y} (Backward substitution).

Subspaces, Dimension, and Rank

  • Subspace: A subset of Rn\mathbb{R}^n that contains the zero vector and is closed under addition and scalar multiplication.

  • Fundamental Subspaces:     * Column Space (ColACol A): Span of the columns of AA.     * Null Space (NulANul A): Set of all solutions to Ax=0A\mathbf{x} = \mathbf{0}.

  • Basis: A linearly independent set in a subspace that spans it.     * Basis for ColACol A: Use the pivot columns of the original matrix.     * Basis for NulANul A: Use the vectors from the parametric solution of Ax=0A\mathbf{x} = \mathbf{0}.

  • Dimension (dimdim): The number of vectors in a basis.

  • Rank: rank(A)=dim(ColA)rank(A) = dim(Col A).

  • Rank Theorem: rank(A)+dim(NulA)=nrank(A) + dim(Nul A) = n (number of columns).

Determinants

  • Determinant (detAdet A): Computed via cofactor expansion along any row or column.     * Cij=(1)i+jdet(Aij)C_{ij} = (-1)^{i+j} det(A_{ij}).     * Triangular matrices: Determinant is the product of diagonal entries.

  • Properties and Operations:     * Row swap: Changes sign (detA-det A).     * Row scaling by kk: Scales determinant by kk.     * Row addition: Does not change the determinant.     * det(AB)=det(A)det(B)det(AB) = det(A) det(B).     * det(AT)=det(A)det(A^T) = det(A).

  • Geometric interpretation: In R2\mathbb{R}^2, detA|det A| is the area of a parallelogram. In R3\mathbb{R}^3, detA|det A| is the volume of a parallelepiped.

Markov Chains

  • Stochastic Matrix: A square matrix whose columns are probability vectors (entries are 0\ge 0 and sum to 1).

  • Steady-State Vector: A probability vector q\mathbf{q} such that Pq=qP\mathbf{q} = \mathbf{q}.

  • Regular Stochastic Matrix: There exists some power PkP^k where all entries are strictly positive.     * Convergence Theorem: If PP is regular, there is a unique steady-state vector q\mathbf{q}, and for any initial x0\mathbf{x_0}, the sequence Pnx0P^n\mathbf{x_0} converges to q\mathbf{q}.

  • Google PageRank: Uses a Google Matrix G=pP+(1p)KG = pP^* + (1-p)K, where pp is a damping factor (typically 0.850.85) and KK has every entry equal to 1/n1/n.

Eigenvalues and Eigenvectors

  • Definitions: If Av=λvA\mathbf{v} = \lambda \mathbf{v} for v0\mathbf{v} \neq \mathbf{0}, v\mathbf{v} is an eigenvector and λ\lambda is an eigenvalue.

  • Characteristic Equation: det(AλI)=0det(A - \lambda I) = 0. The roots are the eigenvalues.

  • Multiplicity:     * Algebraic Multiplicity: The count of an eigenvalue as a root of the characteristic polynomial.     * Geometric Multiplicity: The dimension of the eigenspace (dim(Nul(AλI))dim(Nul(A - \lambda I))).

  • Diagonalization: A=PDP1A = PDP^{-1} where DD is a diagonal matrix of eigenvalues and PP is a matrix of corresponding eigenvectors. A matrix is diagonalizable if and only if it has nn linearly independent eigenvectors.

  • Complex Eigenvalues: Occur in conjugate pairs. For a 2×22 \times 2 rotation-dilation matrix C=(aamp;bbamp;a)C = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, eigenvalues are a±bia \pm bi, corresponding to a scaling of factor r=a2+b2r = \sqrt{a^2 + b^2} and rotation by ϕ=tan1(b/a)\phi = \tan^{-1}(b/a).

Orthogonality and Least Squares

  • Dot Product: uv=u1v1++unvn\mathbf{u} \cdot \mathbf{v} = u_1v_1 + \dots + u_nv_n. If uv=0\mathbf{u} \cdot \mathbf{v} = 0, they are orthogonal.

  • Length (Magnitude): u=uu\lVert \mathbf{u} \rVert = \sqrt{\mathbf{u} \cdot \mathbf{u}}.

  • Orthogonal Projection: The projection of y\mathbf{y} onto a subspace WW is y^=projWy\hat{\mathbf{y}} = proj_W \mathbf{y}.     * Best Approximation Theorem: y^\hat{\mathbf{y}} is the unique closest vector in WW to y\mathbf{y}.

  • Gram-Schmidt Process: An algorithm to produce an orthogonal basis v1,,vp\mathbf{v_1}, \dots, \mathbf{v_p} from any basis x1,,xp\mathbf{x_1}, \dots, \mathbf{x_p}.

  • QR Decomposition: Factoring A=QRA = QR where QQ has orthonormal columns and RR is upper triangular.

  • Least-Squares Solution: Finding x^\hat{\mathbf{x}} that minimizes bAx\lVert \mathbf{b} - A\mathbf{x} \rVert. Solutions must satisfy the Normal Equations: ATAx=ATbA^T A \mathbf{x} = A^T \mathbf{b}.

Singular Value Decomposition (SVD)

  • Definition: Writing any m×nm \times n matrix as A=UΣVTA = U\Sigma V^T, where UU and VV are orthogonal matrices and Σ\Sigma contains the singular values σi=λi\sigma_i = \sqrt{\lambda_i} (square roots of eigenvalues of ATAA^T A).

  • Fundamental Subspaces and SVD:     * Columns u1ur\mathbf{u_1} \dots \mathbf{u_r} of UU span ColACol A.     * Columns vr+1vn\mathbf{v_{r+1}} \dots \mathbf{v_n} of VV span NulANul A.

  • Spectral Decomposition: A=i=1rσiuiviTA = \sum_{i=1}^r \sigma_i \mathbf{u_i}\mathbf{v_i}^T. This allows for rank-j approximations, used extensively in image compression and data analysis.

Questions & Discussion

  • Course Organization: Worksheet solutions are not provided intentionally to prepare students for upper-level courses that lack studios and provided answers. Students should use Piazza, office hours, and software like MATLAB or Octave to check work.

  • True/False Strategies: TAs recommend looking at simple examples or counter-examples. For instance, if a system has more unknowns than equations, it cannot have a unique solution (True, as there will be free variables).

  • Optimization: How to find the max value of a quadratic form Q(x)=xTAxQ(\mathbf{x}) = \mathbf{x^T}A\mathbf{x} subject to x=1\lVert \mathbf{x} \rVert = 1? The maximum value is the largest eigenvalue λ1\lambda_1 of the symmetric matrix AA.

A symmetric matrix has the property that it can be diagonalized, meaning that it can be expressed in the form A=PDP1A = PDP^{-1}, where:

  • DD is a diagonal matrix containing the eigenvalues of the symmetric matrix.

  • PP is a matrix whose columns are the corresponding normalized (orthonormal) eigenvectors of the symmetric matrix.

Properties of Symmetric Matrices and Diagonalization:
  • Real Eigenvalues: All eigenvalues of a symmetric matrix are real numbers.

  • Orthogonal Eigenvectors: Eigenvectors corresponding to distinct eigenvalues are orthogonal to each other.

  • Diagonalization: A symmetric matrix is diagonalizable if and only if it has enough linearly independent eigenvectors (which it always does).

  • Implication: For a symmetric matrix, the process of diagonalization can simplify many matrix operations, particularly in applications involving quadratic forms or solving differential equations.