Ch 2.2 - The Inverse of a Matrix

Understanding Matrix Operations and Efficiency

  • Equality of Transposed Products:

    • The quantities (Ax)^T and x^T A^T are equal, as indicated by Theorem 3(d).

  • Scalar Product (Dot Product) vs. Outer Product:

    • For a column vector x = [[x_1], [x_2]]:

      • The scalar product x^T x = [x_1 \ x_2] [[x_1], [x_2]] = [x_1^2 + x_2^2]. This results in a 1 \times 1 matrix, usually written without brackets (a scalar).

      • Example (with x = [[5], [3]], as in the transcript): x^T x = [5 \ 3] [[5], [3]] = [25 + 9] = 34.

      • The outer product x x^T = [[x_1], [x_2]] [x_1 \ x_2] = [[x_1^2, x_1 x_2], [x_2 x_1, x_2^2]].

      • Example (with x = [[3], [5]], as in the transcript): x x^T = [[3], [5]] [3 \ 5] = [[9, 15], [15, 25]]. Both products are checked numerically below.
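      • A minimal NumPy sketch of the two products (the variable names are ours, not from the text):

```python
import numpy as np

x = np.array([[5], [3]])      # column vector, shape (2, 1)
inner = x.T @ x               # 1x1 matrix [[25 + 9]] = [[34]]
print(inner.item())           # 34, written as a scalar

y = np.array([[3], [5]])
outer = y @ y.T               # 2x2 outer product
print(outer)                  # [[ 9 15]
                              #  [15 25]]
```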

  • Matrix Product Definition and Undefined Operations:

    • The product A^T x^T is defined only when the number of columns of A^T equals the number of rows of x^T. For instance, if A^T is m \times 2, the product requires x^T to have 2 rows; but if x is a column vector, then x^T is a row vector with exactly one row, so A^T x^T is undefined. (The correct rearrangement of (Ax)^T is x^T A^T, per Theorem 3(d), not A^T x^T.)

  • Computational Efficiency of Matrix Multiplication:

    • When computing A^2 x, it is generally more efficient to compute it as A(Ax).

    • Comparison for a 4 \times 4 matrix A and 4 \times 1 vector x:

      • Method 1: A(Ax).

        • Computing Ax requires 4 \times 4 = 16 multiplications (4 for each of the 4 entries in the resulting vector).

        • Computing A(Ax) then requires another 16 multiplications.

        • Total: 16 + 16 = 32 multiplications.

      • Method 2: A^2 x.

        • Computing A^2 = A \cdot A requires 4^3 = 64 multiplications (4 for each of the 16 entries in the 4 \times 4 matrix A^2).

        • Computing A^2 x then requires another 16 multiplications.

        • Total: 64 + 16 = 80 multiplications.

    • Conclusion: Computing A(Ax) is significantly faster than computing A^2 x: 32 versus 80 multiplications here, and in general 2n^2 versus n^3 + n^2 for an n \times n matrix. A sketch that tallies both counts follows.
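    • A short sketch of the tally above (a minimal illustration of the naive multiplication counts; the random data is ours):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Method 1: A(Ax) -- two matrix-vector products, n*n multiplications each.
mults_a_ax = n * n + n * n          # 16 + 16 = 32
# Method 2: (A^2)x -- n multiplications per entry of the n*n matrix A^2,
# then one more matrix-vector product.
mults_a2_x = n * n * n + n * n      # 64 + 16 = 80

print(mults_a_ax, mults_a2_x)       # 32 80
assert np.allclose(A @ (A @ x), (A @ A) @ x)   # both orders give the same vector
```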

  • Properties of Matrix Products AB with Identical Rows or Columns:

    • If all columns of matrix B are identical, then all columns of the product AB are also identical, since \text{col}_j(AB) = A \, \text{col}_j(B).

    • If all rows of matrix A are identical (i.e., \text{row}_i(A) = \text{row}_j(A) for all i, j), then all rows of the product AB are also identical, since \text{row}_i(AB) = \text{row}_i(A) \cdot B.

    • If all entries of A are equal, then all rows of AB are identical; if, in addition, all entries of B are equal, then every entry of AB is the same. (The first two facts are checked numerically below.)
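    • A quick numerical check of the first two facts (random matrices; a sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)

# Rows of A identical  =>  rows of AB identical (each row is row(A) . B).
r = rng.standard_normal(3)
A = np.tile(r, (3, 1))              # every row of A equals r
B = rng.standard_normal((3, 3))
assert np.allclose(A @ B, np.tile(r @ B, (3, 1)))

# Columns of B identical  =>  columns of AB identical (each column is A c).
c = rng.standard_normal((3, 1))
B2 = np.tile(c, (1, 3))             # every column of B2 equals c
A2 = rng.standard_normal((3, 3))
assert np.allclose(A2 @ B2, np.tile(A2 @ c, (1, 3)))
```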

The Inverse of a Matrix

  • Matrix Analogue of a Reciprocal:

    • The concept of a matrix inverse (A^{-1}) is analogous to the multiplicative inverse (reciprocal) of a non-zero real number (e.g., 5^{-1} = 1/5).

    • For real numbers, the inverse satisfies 5^{-1}(5) = 1 and 5(5^{-1}) = 1.

  • Key Distinctions for Matrices:

    • Two-Sided Definition: Because matrix multiplication is generally not commutative (AB \neq BA), the matrix generalization requires both equations to be satisfied: CA = I and AC = I.

    • No Division Notation: Slanted-line notation (e.g., A/B) is avoided for matrices, since it would be ambiguous whether it means AB^{-1} or B^{-1}A.

    • Square Matrices Only: A full generalization of the inverse is possible only for square matrices (n \times n).

  • Definition of an Invertible Matrix:

    • An n \times n matrix A is invertible (or nonsingular) if there exists an n \times n matrix C such that CA = I and AC = I, where I = I_n is the n \times n identity matrix.

    • C is called an inverse of A.

    • Uniqueness of the Inverse: The inverse C is uniquely determined by A.

      • Proof: If B were another inverse of A, then B = BI = B(AC) = (BA)C = IC = C. Thus, B=C.

    • The unique inverse is denoted as A^{-1}, so the defining equations are A^{-1}A = I and AA^{-1} = I.

    • A matrix that is not invertible is called a singular matrix.

  • Example 1: Verifying an Inverse for a 2 \times 2 Matrix:

    • Given A = [[3, 2], [7, 5]] and C = [[5, -2], [-7, 3]].

    • CA = [[5, -2], [-7, 3]] [[3, 2], [7, 5]] = [[(5)(3) + (-2)(7), (5)(2) + (-2)(5)], [(-7)(3) + (3)(7), (-7)(2) + (3)(5)]] = [[15 - 14, 10 - 10], [-21 + 21, -14 + 15]] = [[1, 0], [0, 1]] = I.

    • AC = [[3, 2], [7, 5]] [[5, -2], [-7, 3]] = [[(3)(5) + (2)(-7), (3)(-2) + (2)(3)], [(7)(5) + (5)(-7), (7)(-2) + (5)(3)]] = [[15 - 14, -6 + 6], [35 - 35, -14 + 15]] = [[1, 0], [0, 1]] = I.

    • Since both conditions are met, C = A^{-1}.

Inverse of a 2 \times 2 Matrix

  • Theorem 4: Formula for the Inverse of a 2 \times 2 Matrix:

    • Let A = [[a, b], [c, d]].

    • If ad - bc \neq 0, then A is invertible and its inverse is given by:
      A^{-1} = \frac{1}{ad - bc} [[d, -b], [-c, a]]

    • If ad - bc = 0, then A is not invertible.

  • Determinant of a 2 \times 2 Matrix:

    • The quantity ad - bc is called the determinant of A, denoted \det A = ad - bc.

    • Theorem 4 implies that a 2 \times 2 matrix A is invertible if and only if \det A \neq 0.

  • Example 2: Finding the Inverse of a 2 \times 2 Matrix:

    • Find the inverse of A = [[3, 4], [5, 6]].

    • First, calculate the determinant: \det A = (3)(6) - (4)(5) = 18 - 20 = -2.

    • Since \det A = -2 \neq 0, A is invertible.

    • Using Theorem 4:
      A^{-1} = \frac{1}{-2} [[6, -4], [-5, 3]] = [[-3, 2], [5/2, -3/2]]
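    • Theorem 4 translates directly into code. A minimal sketch reproducing Example 2 (the function name inverse_2x2 is ours):

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via Theorem 4; raises if det A = 0."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c                 # det A = ad - bc
    if det == 0:
        raise ValueError("A is not invertible (det A = 0)")
    return (1 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[3.0, 4.0], [5.0, 6.0]])
A_inv = inverse_2x2(A)
print(A_inv)                            # [[-3.   2. ]  [ 2.5 -1.5]]
assert np.allclose(A @ A_inv, np.eye(2))   # AA^{-1} = I
```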

Solving Linear Systems with Invertible Matrices

  • Importance of Invertible Matrices:

    • They are essential for algebraic calculations and formula derivations in linear algebra.

    • They can provide insights into mathematical models of real-life situations.

  • Theorem 5: Unique Solution for Ax = b:

    • If A is an invertible n \times n matrix, then for any vector \mathbf{b} in R^n, the matrix equation A\mathbf{x} = \mathbf{b} has a unique solution given by \mathbf{x} = A^{-1}\mathbf{b}.

    • Proof of Existence: Substituting A^{-1}\mathbf{b} for \mathbf{x} in the equation: A(A^{-1}\mathbf{b}) = (AA^{-1})\mathbf{b} = I\mathbf{b} = \mathbf{b}. Thus, A^{-1}\mathbf{b} is a solution.

    • Proof of Uniqueness: Assume \mathbf{u} is any solution such that A\mathbf{u} = \mathbf{b}. Multiplying both sides by A^{-1} on the left:
      A^{-1}(A\mathbf{u}) = A^{-1}\mathbf{b}
      (A^{-1}A)\mathbf{u} = A^{-1}\mathbf{b}
      I\mathbf{u} = A^{-1}\mathbf{b}
      \mathbf{u} = A^{-1}\mathbf{b}
      Consequently, any solution must equal A^{-1}\mathbf{b}, proving uniqueness.

Practical Application: Elastic Beam Deflection (Example 3)

  • Scenario: A horizontal elastic beam is supported at its ends and subjected to forces at three points (1, 2, 3), causing deflections.

  • Variables:

    • \mathbf{f} \in R^3: A vector listing the forces at the three points.

    • \mathbf{y} \in R^3: A vector listing the amounts of deflection at the three points.

  • Hooke's Law Relationship: \mathbf{y} = D\mathbf{f}

    • D: The flexibility matrix.

    • D^{-1}: The stiffness matrix.

  • Physical Significance of Columns of D (Flexibility Matrix):

    • Express D as D = D I_3 = [D\mathbf{e}_1 \ D\mathbf{e}_2 \ D\mathbf{e}_3], where the \mathbf{e}_j are the standard basis vectors (the columns of the identity matrix).

    • Interpreting \mathbf{e_1}: The vector (1, 0, 0) represents a unit force applied downward at point 1, with zero force at the other two points.

    • First column of D (D\mathbf{e_1}): Contains the beam deflections that result from applying a unit force at point 1 (and zero force at points 2 and 3).

    • Similarly, the second and third columns of D list the deflections caused by a unit force at points 2 and 3, respectively.

  • Physical Significance of Columns of D^{-1} (Stiffness Matrix):

    • The inverse equation is \mathbf{f} = D^{-1}\mathbf{y}, which computes the force vector \mathbf{f} required to produce a given deflection vector \mathbf{y} (i.e., this matrix describes the beam's stiffness).

    • Express D^{-1} as D^{-1} = [D^{-1}\mathbf{e}_1 \ D^{-1}\mathbf{e}_2 \ D^{-1}\mathbf{e}_3].

    • Interpreting \mathbf{e_1} as a deflection vector: The vector (1, 0, 0) now represents a unit deflection at point 1, with zero deflections at the other two points.

    • First column of D^{-1} (D^{-1}\mathbf{e_1}): Lists the forces that must be applied at the three points to produce a unit deflection at point 1 and zero deflections at points 2 and 3.

    • Similarly, columns 2 and 3 of D^{-1} list the forces required to produce unit deflections at points 2 and 3, respectively.

    • Note on Forces: To achieve specific deflections, some forces in these columns might be negative (indicating an upward force).

    • Units: If flexibility is measured in inches of deflection per pound of load, then stiffness matrix entries are given in pounds of load per inch of deflection.
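  • A small numerical sketch of the flexibility/stiffness relationship. The entries of D below are made-up illustrative values (inches per pound), not from the text:

```python
import numpy as np

# Hypothetical flexibility matrix D: entry (i, j) is the deflection at
# point i caused by a unit force at point j (inches per pound).
D = np.array([[0.005, 0.002, 0.001],
              [0.002, 0.004, 0.002],
              [0.001, 0.002, 0.005]])

f = np.array([10.0, 20.0, 15.0])   # forces at points 1, 2, 3 (pounds)
y = D @ f                          # deflections (inches), y = Df

K = np.linalg.inv(D)               # stiffness matrix D^{-1} (pounds per inch)
# Column j of K lists the forces producing a unit deflection at point j and
# zero deflection elsewhere; some entries may be negative (upward forces).
print(K[:, 0])
assert np.allclose(K @ y, f)       # f = D^{-1} y recovers the forces
```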

Practicalities of Using A^{-1} to Solve Ax=b

  • Computational Efficiency: The formula \mathbf{x} = A^{-1}\mathbf{b} is rarely used for numerical computations of A\mathbf{x}=\mathbf{b} for large matrices.

    • Row reduction of the augmented matrix [A \ \mathbf{b}] is almost always faster and generally more accurate (as it can minimize rounding errors).

  • Exception: The 2 \times 2 case can be an exception, where mental calculation of A^{-1} might make using the formula quicker.

  • Example 4: Solving a 2 \times 2 System Using A^{-1}

    • System:
      3x_1 + 4x_2 = 3
      5x_1 + 6x_2 = 7

    • This is equivalent to A\mathbf{x} = \mathbf{b} where A = [[3, 4], [5, 6]] and \mathbf{b} = [[3], [7]].

    • From Example 2, we found A^{-1} = [[-3, 2], [5/2, -3/2]].

    • Solution:
      \mathbf{x} = A^{-1}\mathbf{b} = [[-3, 2], [5/2, -3/2]] [[3], [7]]
      = [[(-3)(3) + (2)(7)], [(5/2)(3) + (-3/2)(7)]]
      = [[-9 + 14], [15/2 - 21/2]]
      = [[5], [-6/2]] = [[5], [-3]]

    • So, x_1 = 5 and x_2 = -3.
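    • The same computation in NumPy, with a cross-check against np.linalg.solve (which row reduces instead of forming A^{-1}, per the practicality note above):

```python
import numpy as np

A = np.array([[3.0, 4.0], [5.0, 6.0]])
b = np.array([3.0, 7.0])

A_inv = np.array([[-3.0, 2.0], [2.5, -1.5]])   # A^{-1} from Example 2
x_via_inverse = A_inv @ b                      # x = A^{-1} b
x_via_solve = np.linalg.solve(A, b)            # row reduction, no inverse formed
assert np.allclose(x_via_inverse, x_via_solve)
print(x_via_solve)                             # [ 5. -3.]
```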

Properties of Invertible Matrices

  • Theorem 6: Important Facts about Invertible Matrices:

    • a. Inverse of an Inverse: If A is an invertible matrix, then its inverse, A^{-1}, is also invertible, and (A^{-1})^{-1} = A.

      • Proof: By definition, A^{-1}A = I and AA^{-1} = I. These equations satisfy the conditions for A to be the inverse of A^{-1}.

    • b. Inverse of a Product: If A and B are n \times n invertible matrices, then their product AB is also invertible. The inverse of AB is the product of their inverses in reverse order: (AB)^{-1} = B^{-1}A^{-1}

      • Proof: We need to show that (AB)(B^{-1}A^{-1}) = I and (B^{-1}A^{-1})(AB) = I:
        (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I
        A similar calculation shows (B^{-1}A^{-1})(AB) = I.

    • c. Inverse of a Transpose: If A is an invertible matrix, then its transpose, A^T, is also invertible. The inverse of A^T is the transpose of A^{-1}: (A^T)^{-1} = (A^{-1})^T

      • Proof: Using Theorem 3(d) (which states (XY)^T = Y^T X^T):
        (A^{-1})^T A^T = (AA^{-1})^T = I^T = I
        And also:
        A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I
        Thus, A^T is invertible, and its inverse is (A^{-1})^T.

  • Generalization of Theorem 6(b):

    • The product of any number of n \times n invertible matrices is invertible, and its inverse is the product of their inverses in reverse order. For example, (ABC)^{-1} = C^{-1}B^{-1}A^{-1}.
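    • A numerical spot check of Theorem 6(b) and its generalization (this sketch assumes the random matrices are invertible, which holds with probability 1):

```python
import numpy as np

rng = np.random.default_rng(2)
A, B, C = (rng.standard_normal((4, 4)) for _ in range(3))
inv = np.linalg.inv

assert np.allclose(inv(A @ B), inv(B) @ inv(A))               # (AB)^{-1} = B^{-1}A^{-1}
assert np.allclose(inv(A @ B @ C), inv(C) @ inv(B) @ inv(A))  # (ABC)^{-1} = C^{-1}B^{-1}A^{-1}
```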

  • Role of Definitions in Proofs: Proofs rigorously demonstrate that a proposed inverse (or other property) satisfies the formal definition. For example, showing (B^{-1}A^{-1}) is the inverse of AB means showing it satisfies the definition of an inverse with AB.

Elementary Matrices and Computing A^{-1}

  • Connection to Row Operations:

    • A significant connection exists between invertible matrices and elementary row operations.

    • An invertible matrix A is row equivalent to the identity matrix I_n.

    • This relationship provides a systematic method for finding A^{-1}.

  • Definition of an Elementary Matrix:

    • An elementary matrix is a matrix obtained by performing a single elementary row operation on an identity matrix (I_m).

    • There are three types of elementary matrices, corresponding to the three elementary row operations:

      1. Row Replacement: Adding a multiple of one row to another.

      2. Row Interchange: Swapping two rows.

      3. Row Scaling: Multiplying a row by a nonzero scalar.

  • Example 5: Elementary Matrices and Row Operations:

    • Let A = [[a, b, c], [d, e, f], [g, h, i]].

    • E_1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (obtained by R_3 \leftarrow R_3 - 4R_1 on I_3).

      • E_1 A performs the operation R_3 \leftarrow R_3 - 4R_1 on A.

    • E_2 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]] (obtained by R_1 \leftrightarrow R_2 on I_3).

      • E_2 A performs the operation R_1 \leftrightarrow R_2 on A.

    • E_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 5]] (obtained by R_3 \leftarrow 5R_3 on I_3).

      • E_3 A performs the operation R_3 \leftarrow 5R_3 on A.

  • General Fact: If an elementary row operation is performed on an m \times n matrix A, the resulting matrix can be written as EA, where E is the m \times m elementary matrix created by performing the same row operation on I_m.

  • Invertibility of Elementary Matrices:

    • Every elementary matrix E is invertible.

    • This is because elementary row operations are reversible (Section 1.1).

    • The inverse of an elementary matrix E is simply the elementary matrix of the same type that performs the reverse operation, transforming E back into the identity matrix I.

  • Example 6: Finding the Inverse of an Elementary Matrix:

    • Given E_1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (which adds -4 times row 1 to row 3 of I_3).

    • To reverse this operation and transform E_1 back into I_3, one must add +4 times row 1 to row 3.

    • Therefore, the inverse is E_1^{-1} = [[1, 0, 0], [0, 1, 0], [4, 0, 1]].
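    • The facts from Examples 5 and 6 are easy to replicate numerically (a minimal sketch; the matrix A here is random):

```python
import numpy as np

# Build E_1 by performing R_3 <- R_3 - 4R_1 on I_3.
E1 = np.eye(3)
E1[2, :] -= 4 * E1[0, :]           # E1 = [[1,0,0],[0,1,0],[-4,0,1]]

# Left-multiplying by E_1 performs the same row operation on any A.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
A_rowop = A.copy()
A_rowop[2, :] -= 4 * A[0, :]       # R_3 <- R_3 - 4R_1 applied directly to A
assert np.allclose(E1 @ A, A_rowop)

# E_1^{-1} is the elementary matrix of the reverse operation, R_3 <- R_3 + 4R_1.
E1_inv = np.eye(3)
E1_inv[2, :] += 4 * E1_inv[0, :]
assert np.allclose(E1 @ E1_inv, np.eye(3))
```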

The Algorithm for Finding A^{-1}

  • Theorem 7: Invertibility and Row Equivalence to Identity:

    • An n \times n matrix A is invertible if and only if A is row equivalent to the n \times n identity matrix (I_n).

    • Furthermore, if A is invertible, any sequence of elementary row operations that reduces A to I_n will also transform I_n into A^{-1}.

  • Proof of Theorem 7:

    • Part 1: If A is invertible, then A \sim I_n (A is row equivalent to I_n).

      • By Theorem 5, if A is invertible, the equation A\mathbf{x} = \mathbf{b} has a solution for every \mathbf{b}.

      • This implies that A has a pivot position in every row (Theorem 4 in Section 1.4).

      • Since A is a square n \times n matrix, having n pivot positions in n rows means all n pivot positions must lie on the main diagonal.

      • Therefore, the reduced echelon form of A must be I_n, meaning A \sim I_n.

    • Part 2: If A \sim I_n, then A is invertible.

      • If A \sim I_n, it means A can be transformed into I_n by a sequence of elementary row operations.

      • Each elementary row operation corresponds to left-multiplication by an elementary matrix, so there exist elementary matrices E_1, E_2, \ldots, E_p such that E_p \cdots E_2 E_1 A = I_n.

      • Since each elementary matrix is invertible, their product E_p \cdots E_1 is also invertible (by the generalization of Theorem 6(b)).

      • Let K = E_p \cdots E_1, so that KA = I_n. Since K is invertible, multiplying both sides on the left by K^{-1} gives A = K^{-1}.

      • Thus A is invertible (it is the inverse of the invertible matrix K), and A^{-1} = (K^{-1})^{-1} = K = E_p \cdots E_1.

      • This means that A^{-1} is precisely the matrix obtained by applying the same sequence of elementary row operations (E_1, E_2, \ldots, E_p) to I_n, because E_p \cdots E_1 I_n = E_p \cdots E_1 = K = A^{-1}.

  • Algorithm for Finding A^{-1}:

    • To find the inverse of an n \times n matrix A, form the augmented matrix [A \ I] by placing A and the identity matrix I_n side-by-side.

    • Perform row operations to reduce this augmented matrix.

    • If A is row equivalent to I_n: The augmented matrix will transform from [A \ I] to [I \ A^{-1}]. The matrix on the right side will be A^{-1}.

    • If A is not row equivalent to I_n: If, during row reduction, a row of zeros appears on the left side (where A initially was), then A is not invertible, and no inverse exists.

  • Example 7: Finding the Inverse of a 3 \times 3 Matrix:

    • Find the inverse of A = [[1, 2, 1], [0, 1, 0], [3, 0, 1]], if it exists.

    • Form the augmented matrix [A \ I]:
      [[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [3, 0, 1, |, 0, 0, 1]]

    • Perform row operations:

      1. R_3 \leftarrow R_3 - 3R_1:
        [[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, -6, -2, |, -3, 0, 1]]

      2. R_3 \leftarrow R_3 + 6R_2:
        [[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, -2, |, -3, 6, 1]]

      3. R_3 \leftarrow (-1/2)R_3:
        [[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]

      4. R_1 \leftarrow R_1 - R_3:
        [[1, 2, 0, |, -1/2, 3, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]

      5. R_1 \leftarrow R_1 - 2R_2:
        [[1, 0, 0, |, -1/2, 1, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]

    • Since A is row equivalent to I, A is invertible. The inverse is:
      A^{-1} = [[-1/2, 1, 1/2], [0, 1, 0], [3/2, -3, -1/2]]
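  • The algorithm is short to implement. A teaching sketch (the function name is ours; it adds partial pivoting for numerical safety, so its row operations may differ from the hand sequence above while producing the same inverse):

```python
import numpy as np

def inverse_via_row_reduction(A):
    """Row reduce [A | I] to [I | A^{-1}]; raises if A is not invertible."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # augmented matrix [A | I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))  # choose a usable pivot row
        if np.isclose(M[p, j], 0.0):
            raise ValueError("A is not invertible")
        M[[j, p]] = M[[p, j]]                # row interchange
        M[j] /= M[j, j]                      # row scaling: pivot becomes 1
        for i in range(n):                   # row replacement: clear column j
            if i != j:
                M[i] -= M[i, j] * M[j]
    return M[:, n:]                          # right half is now A^{-1}

A = [[1, 2, 1], [0, 1, 0], [3, 0, 1]]
print(inverse_via_row_reduction(A))          # matches Example 7's A^{-1}
```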

  • Checking the Answer:

    • It's good practice to verify the calculated inverse by checking if AA^{-1} = I (or A^{-1}A = I).

    • Note: If A is known to be invertible and a matrix C satisfies AC = I, then C must be A^{-1}; it is not strictly necessary to also check CA = I, since the algorithm guarantees that the matrix emerging on the right is the unique inverse.

  • Another View of Matrix Inversion (Solving Multiple Systems Simultaneously):

    • Finding A^{-1} by row reducing [A \ I] can be viewed as simultaneously solving n separate matrix equations:
      A\mathbf{x}_1 = \mathbf{e}_1, \ A\mathbf{x}_2 = \mathbf{e}_2, \ \ldots, \ A\mathbf{x}_n = \mathbf{e}_n
      where the \mathbf{e}_j are the columns of the identity matrix I_n. The augmented columns for these systems are simply the columns of I_n, forming [A \ \mathbf{e}_1 \ \mathbf{e}_2 \ \cdots \ \mathbf{e}_n] = [A \ I].

    • The property AA^{-1} = I and the definition of matrix multiplication demonstrate that the columns of A^{-1} are precisely the solutions \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n to these systems.

    • Practical Use: This perspective is valuable if an applied problem only requires finding one or two specific columns of A^{-1}. In such cases, only the corresponding systems A\mathbf{x}_j = \mathbf{e}_j need to be solved, rather than computing the full inverse (see the sketch below).
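    • A minimal sketch of this shortcut, reusing Example 7's matrix (np.linalg.solve row reduces one augmented system rather than inverting A):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 0.0],
              [3.0, 0.0, 1.0]])
e1 = np.array([1.0, 0.0, 0.0])        # first column of I_3

x1 = np.linalg.solve(A, e1)           # solves A x_1 = e_1 only
print(x1)                             # [-0.5  0.   1.5] = first column of A^{-1}
assert np.allclose(x1, np.linalg.inv(A)[:, 0])
```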