Ch 2.2 - The Inverse of a Matrix
Understanding Matrix Operations and Efficiency
Equality of Transposed Products:
The quantities (Ax)^T and x^T A^T are equal, as indicated by Theorem 3(d).
Scalar Product (Dot Product) vs. Outer Product:
For a column vector x = [[x1], [x2]]:
The scalar product x^T x = [x1 \ x2] [[x1], [x2]] = [x1^2 + x2^2]. This results in a 1 \times 1 matrix, usually written without brackets (a scalar).
Example from transcript (assuming x=[5; 3] for calculation of 34): If x = [[5], [3]], then x^T x = [5 \ 3] [[5], [3]] = [25+9] = 34.
The outer product x x^T = [[x1], [x2]] [x1 \ x2] = [[x1^2, x1 x2], [x2 x1, x2^2]].
Example from transcript (assuming x=[3; 5] for calculation of matrix): If x = [[3], [5]], then x x^T = [[3], [5]] [3 \ 5] = [[9, 15], [15, 25]].
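As a quick check, the inner and outer products can be computed with NumPy (using the transcript's vector x = (3, 5), which also yields 34 for x^T x):

```python
import numpy as np

# Column vector x = [3, 5]
x = np.array([[3], [5]])

inner = x.T @ x      # 1x1 matrix: x^T x = [[9 + 25]] = [[34]], usually read as a scalar
outer = x @ x.T      # 2x2 matrix: x x^T

print(inner)         # [[34]]
print(outer)         # [[ 9 15]
                     #  [15 25]]
```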
Matrix Product Definition and Undefined Operations:
The product A^T x^T is not defined if the number of columns in A^T does not match the number of rows in x^T. For instance, if A^T has two columns and x is a column vector, then x^T is a row vector (a matrix with a single row), so the two columns of A^T have nothing to multiply against. The transcript states only that "x does not have two rows to match the two columns of A^T"; the exact dimensions of A^T and x were not given beyond that.
Computational Efficiency of Matrix Multiplication:
When computing A^2 x, it is generally more efficient to compute it as A(Ax).
Comparison for a 4 \times 4 matrix A and 4 \times 1 vector x:
Method 1: A(Ax).
Computing Ax requires 4 \times 4 = 16 multiplications (4 for each of the 4 entries in the resulting vector).
Computing A(Ax) then requires another 16 multiplications.
Total: 16 + 16 = 32 multiplications.
Method 2: A^2 x.
Computing A^2 = A \cdot A requires 4^3 = 64 multiplications (4 for each of the 16 entries in the 4 \times 4 matrix A^2).
Computing A^2 x then requires another 16 multiplications.
Total: 64 + 16 = 80 multiplications.
Conclusion: Computing A(Ax) is significantly faster than computing A^2 x.
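The operation counts above are easy to tabulate, and the two orders of evaluation agree numerically; a small NumPy sketch (the matrix entries are arbitrary test data):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
x = rng.standard_normal((n, 1))

# Method 1: A(Ax) -- two matrix-vector products, n*n multiplications each.
cost_method1 = n * n + n * n        # 16 + 16 = 32

# Method 2: (A^2) x -- one matrix-matrix product (n^3) plus one matrix-vector product.
cost_method2 = n ** 3 + n * n       # 64 + 16 = 80

# Same result either way; only the cost differs.
assert np.allclose(A @ (A @ x), (A @ A) @ x)
print(cost_method1, cost_method2)   # 32 80
```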
Properties of Matrix Products (AB) with Identical Rows/Columns in A:
If all columns of matrix A are identical, then all columns of the product AB are also identical.
If all rows of matrix A are identical (i.e., \text{row}_i(A) = \text{row}_j(A) for all i, j), then all rows of the product AB are also identical (\text{row}_i(AB) = \text{row}_i(A) \cdot B).
If all entries in A are the same (implying both identical rows and identical columns), then all entries in AB will also be the same.
The Inverse of a Matrix
Matrix Analogue of a Reciprocal:
The concept of a matrix inverse (A^{-1}) is analogous to the multiplicative inverse (reciprocal) of a non-zero real number (e.g., 5^{-1} = 1/5).
For real numbers, the inverse satisfies 5^{-1}(5) = 1 and 5(5^{-1}) = 1.
Key Distinctions for Matrices:
Two-Sided Definition: Because matrix multiplication is generally not commutative (AB \neq BA), the matrix generalization requires both equations to be satisfied: CA = I and AC = I.
No Division Notation: Slanted-line notation (e.g., A/B) is avoided for matrices.
Square Matrices Only: A full generalization of the inverse is possible only for square matrices (n \times n).
Definition of an Invertible Matrix:
An n \times n matrix A is invertible (or nonsingular) if there exists an n \times n matrix C such that CA = I and AC = I, where I = I_n is the n \times n identity matrix.
C is called an inverse of A.
Uniqueness of the Inverse: The inverse C is uniquely determined by A.
Proof: If B were another inverse of A, then B = BI = B(AC) = (BA)C = IC = C. Thus, B=C.
The unique inverse is denoted as A^{-1}, so the defining equations are A^{-1}A = I and AA^{-1} = I.
A matrix that is not invertible is called a singular matrix.
Example 1: Verifying an Inverse for a 2 \times 2 Matrix:
Given A = [[3, 2], [7, 5]] and C = [[5, -2], [-7, 3]].
CA = [[5, -2], [-7, 3]] [[3, 2], [7, 5]] = [[(5)(3) + (-2)(7), (5)(2) + (-2)(5)], [(-7)(3) + (3)(7), (-7)(2) + (3)(5)]] = [[15 - 14, 10 - 10], [-21 + 21, -14 + 15]] = [[1, 0], [0, 1]] = I.
AC = [[3, 2], [7, 5]] [[5, -2], [-7, 3]] = [[(3)(5) + (2)(-7), (3)(-2) + (2)(3)], [(7)(5) + (5)(-7), (7)(-2) + (5)(3)]] = [[15 - 14, -6 + 6], [35 - 35, -14 + 15]] = [[1, 0], [0, 1]] = I.
Since both conditions are met, C = A^{-1}.
Inverse of a 2 \times 2 Matrix
Theorem 4: Formula for the Inverse of a 2 \times 2 Matrix:
Let A = [[a, b], [c, d]].
If ad - bc \neq 0, then A is invertible and its inverse is given by:
A^{-1} = \frac{1}{ad - bc} [[d, -b], [-c, a]]
If ad - bc = 0, then A is not invertible.
Determinant of a 2 \times 2 Matrix:
The quantity ad - bc is called the determinant of A, denoted \text{det} A = ad - bc.
Theorem 4 implies that a 2 \times 2 matrix A is invertible if and only if \text{det} A \neq 0.
Example 2: Finding the Inverse of a 2 \times 2 Matrix:
Find the inverse of A = [[3, 4], [5, 6]].
First, calculate the determinant: \text{det} A = (3)(6) - (4)(5) = 18 - 20 = -2.
Since \text{det} A = -2 \neq 0, A is invertible.
Using Theorem 4:
A^{-1} = \frac{1}{-2} [[6, -4], [-5, 3]] = [[-3, 2], [5/2, -3/2]]
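Theorem 4 translates directly into a short function; a minimal sketch (the helper name inverse_2x2 is ours, not from the text):

```python
import numpy as np

def inverse_2x2(A):
    """Theorem 4: inverse of a 2x2 matrix, or None when det A = 0 (A singular)."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        return None                      # not invertible
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3.0, 4.0], [5.0, 6.0]])
print(inverse_2x2(A))    # [[-3, 2], [5/2, -3/2]], matching Example 2
```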
Solving Linear Systems with Invertible Matrices
Importance of Invertible Matrices:
They are essential for algebraic calculations and formula derivations in linear algebra.
They can provide insights into mathematical models of real-life situations.
Theorem 5: Unique Solution for Ax = b:
If A is an invertible n \times n matrix, then for any vector \mathbf{b} in R^n, the matrix equation A\mathbf{x} = \mathbf{b} has a unique solution given by \mathbf{x} = A^{-1}\mathbf{b}.
Proof of Existence: Substituting A^{-1}\mathbf{b} for \mathbf{x} in the equation: A(A^{-1}\mathbf{b}) = (AA^{-1})\mathbf{b} = I\mathbf{b} = \mathbf{b}. Thus, A^{-1}\mathbf{b} is a solution.
Proof of Uniqueness: Assume \mathbf{u} is any solution such that A\mathbf{u} = \mathbf{b}. Multiplying both sides by A^{-1} on the left:
A^{-1}(A\mathbf{u}) = A^{-1}\mathbf{b}
(A^{-1}A)\mathbf{u} = A^{-1}\mathbf{b}
I\mathbf{u} = A^{-1}\mathbf{b}
\mathbf{u} = A^{-1}\mathbf{b} Consequently, any solution must be equal to A^{-1}\mathbf{b}, proving uniqueness.
Practical Application: Elastic Beam Deflection (Example 3)
Scenario: A horizontal elastic beam is supported at its ends and subjected to forces at three points (1, 2, 3), causing deflections.
Variables:
\mathbf{f} \in R^3: A vector listing the forces at the three points.
\mathbf{y} \in R^3: A vector listing the amounts of deflection at the three points.
Hooke's Law Relationship: \mathbf{y} = D\mathbf{f}
D: The flexibility matrix.
D^{-1}: The stiffness matrix.
Physical Significance of Columns of D (Flexibility Matrix):
Express D as D = D I_3 = [D\mathbf{e_1} \ D\mathbf{e_2} \ D\mathbf{e_3}], where \mathbf{e_j} are the standard basis vectors (columns of the identity matrix).
Interpreting \mathbf{e_1}: The vector (1, 0, 0) represents a unit force applied downward at point 1, with zero force at the other two points.
First column of D (D\mathbf{e_1}): Contains the beam deflections that result from applying a unit force at point 1 (and zero force at points 2 and 3).
Similarly, the second and third columns of D list the deflections caused by a unit force at points 2 and 3, respectively.
Physical Significance of Columns of D^{-1} (Stiffness Matrix):
The inverse equation is \mathbf{f} = D^{-1}\mathbf{y}, which computes the force vector \mathbf{f} required to produce a given deflection vector \mathbf{y} (i.e., this matrix describes the beam's stiffness).
Express D^{-1} as D^{-1} = [D^{-1}\mathbf{e_1} \ D^{-1}\mathbf{e_2} \ D^{-1}\mathbf{e_3}].
Interpreting \mathbf{e_1} as a deflection vector: The vector (1, 0, 0) now represents a unit deflection at point 1, with zero deflections at the other two points.
First column of D^{-1} (D^{-1}\mathbf{e_1}): Lists the forces that must be applied at the three points to produce a unit deflection at point 1 and zero deflections at points 2 and 3.
Similarly, columns 2 and 3 of D^{-1} list the forces required to produce unit deflections at points 2 and 3, respectively.
Note on Forces: To achieve specific deflections, some forces in these columns might be negative (indicating an upward force).
Units: If flexibility is measured in "inches of deflection per pound of load," then stiffness matrix entries are given in "pounds of load per inch of deflection."
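As an illustration only (the text gives no numerical data; the flexibility matrix D below is hypothetical), the column interpretations can be checked in NumPy:

```python
import numpy as np

# Hypothetical flexibility matrix D (inches of deflection per pound of load);
# the entries are illustrative, not from the text.
D = np.array([[0.005, 0.002, 0.001],
              [0.002, 0.004, 0.002],
              [0.001, 0.002, 0.005]])

S = np.linalg.inv(D)        # stiffness matrix D^{-1} (pounds per inch)

# D e_1: deflections produced by a unit force at point 1 -- the first column of D.
print(D @ np.array([1.0, 0.0, 0.0]))

# Column 1 of D^{-1}: forces needed at the three points for a unit deflection
# at point 1 and zero deflection at points 2 and 3; entries may be negative.
print(S[:, 0])
```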
Practicalities of Using A^{-1} to Solve Ax=b
Computational Efficiency: The formula \mathbf{x} = A^{-1}\mathbf{b} is rarely used for numerical computations of A\mathbf{x}=\mathbf{b} for large matrices.
Row reduction of the augmented matrix [A \ \mathbf{b}] is almost always faster and generally more accurate (as it can minimize rounding errors).
Exception: The 2 \times 2 case can be an exception, where mental calculation of A^{-1} might make using the formula quicker.
Example 4: Solving a 2 \times 2 System Using A^{-1}
System:
3x1 + 4x2 = 3
5x1 + 6x2 = 7
This is equivalent to A\mathbf{x} = \mathbf{b}, where A = [[3, 4], [5, 6]] and \mathbf{b} = [[3], [7]].
From Example 2, we found A^{-1} = [[-3, 2], [5/2, -3/2]].
Solution:
\mathbf{x} = A^{-1}\mathbf{b} = [[-3, 2], [5/2, -3/2]] [[3], [7]]
= [[(-3)(3) + (2)(7)], [(5/2)(3) + (-3/2)(7)]]
= [[-9 + 14], [15/2 - 21/2]]
= [[5], [-6/2]] = [[5], [-3]]
So, x1 = 5 and x2 = -3.
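Example 4 in NumPy; a minimal sketch:

```python
import numpy as np

A = np.array([[3.0, 4.0], [5.0, 6.0]])
b = np.array([[3.0], [7.0]])

x = np.linalg.inv(A) @ b     # x = A^{-1} b  (fine for a 2x2 demonstration)
print(x)                     # x1 = 5, x2 = -3

# For larger systems, prefer a solver over forming the inverse (see the
# "Practicalities" discussion): row reduction is faster and more accurate.
assert np.allclose(np.linalg.solve(A, b), x)
```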
Properties of Invertible Matrices
Theorem 6: Important Facts about Invertible Matrices:
a. Inverse of an Inverse: If A is an invertible matrix, then its inverse, A^{-1}, is also invertible, and (A^{-1})^{-1} = A.
Proof: By definition, A^{-1}A = I and AA^{-1} = I. These equations satisfy the conditions for A to be the inverse of A^{-1}.
b. Inverse of a Product: If A and B are n \times n invertible matrices, then their product AB is also invertible. The inverse of AB is the product of their inverses in reverse order: (AB)^{-1} = B^{-1}A^{-1}
Proof: We need to show that (AB)(B^{-1}A^{-1}) = I and (B^{-1}A^{-1})(AB) = I:
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I
A similar calculation shows (B^{-1}A^{-1})(AB) = I.
c. Inverse of a Transpose: If A is an invertible matrix, then its transpose, A^T, is also invertible. The inverse of A^T is the transpose of A^{-1}: (A^T)^{-1} = (A^{-1})^T
Proof: Using Theorem 3(d) (which states (XY)^T = Y^T X^T):
(A^{-1})^T A^T = (AA^{-1})^T = I^T = I
And also:
A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I
Thus, A^T is invertible, and its inverse is (A^{-1})^T.
Generalization of Theorem 6(b):
The product of any number of n \times n invertible matrices is invertible, and its inverse is the product of their inverses in reverse order. For example, (ABC)^{-1} = C^{-1}B^{-1}A^{-1}.
Role of Definitions in Proofs: Proofs rigorously demonstrate that a proposed inverse (or other property) satisfies the formal definition. For example, showing (B^{-1}A^{-1}) is the inverse of AB means showing it satisfies the definition of an inverse with AB.
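The three parts of Theorem 6 are easy to spot-check numerically; a sketch using random matrices (which are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

inv = np.linalg.inv

# (a) (A^{-1})^{-1} = A
assert np.allclose(inv(inv(A)), A)

# (b) (AB)^{-1} = B^{-1} A^{-1}  -- note the reverse order
assert np.allclose(inv(A @ B), inv(B) @ inv(A))

# (c) (A^T)^{-1} = (A^{-1})^T
assert np.allclose(inv(A.T), inv(A).T)
```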
Elementary Matrices and Computing A^{-1}
Connection to Row Operations:
A significant connection exists between invertible matrices and elementary row operations.
An invertible matrix A is row equivalent to the identity matrix I_n.
This relationship provides a systematic method for finding A^{-1}.
Definition of an Elementary Matrix:
An elementary matrix is a matrix obtained by performing a single elementary row operation on an identity matrix (I_m).
There are three types of elementary matrices, corresponding to the three elementary row operations:
Row Replacement: Adding a multiple of one row to another.
Row Interchange: Swapping two rows.
Row Scaling: Multiplying a row by a nonzero scalar.
Example 5: Elementary Matrices and Row Operations:
Let A = [[a, b, c], [d, e, f], [g, h, i]].
E1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (obtained by R3 \leftarrow R3 - 4R1 on I_3).
E1A performs the operation R3 \leftarrow R3 - 4R1 on A.
E2 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]] (obtained by R1 \leftrightarrow R2 on I_3).
E2A performs the operation R1 \leftrightarrow R2 on A.
E3 = [[1, 0, 0], [0, 1, 0], [0, 0, 5]] (obtained by R3 \leftarrow 5R3 on I_3).
E3A performs the operation R3 \leftarrow 5R3 on A.
General Fact: If an elementary row operation is performed on an m \times n matrix A, the resulting matrix can be written as EA, where E is the m \times m elementary matrix created by performing the same row operation on I_m.
Invertibility of Elementary Matrices:
Every elementary matrix E is invertible.
This is because elementary row operations are reversible (Section 1.1).
The inverse of an elementary matrix E is simply the elementary matrix of the same type that performs the reverse operation, transforming E back into the identity matrix I.
Example 6: Finding the Inverse of an Elementary Matrix:
Given E1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (which adds -4 times row 1 to row 3 of I3).
To reverse this operation and transform E1 back into I3, one must add +4 times row 1 to row 3.
Therefore, the inverse is E_1^{-1} = [[1, 0, 0], [0, 1, 0], [4, 0, 1]].
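A quick NumPy check that E1 and its proposed inverse multiply to the identity, and that left-multiplying by E1 performs the row operation on any matrix A:

```python
import numpy as np

E1 = np.array([[1, 0, 0],
               [0, 1, 0],
               [-4, 0, 1]])          # adds -4 * row 1 to row 3

E1_inv = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [4, 0, 1]])       # reverse operation: adds +4 * row 1 to row 3

assert (E1 @ E1_inv == np.eye(3)).all()
assert (E1_inv @ E1 == np.eye(3)).all()

# Left-multiplication by E1 performs R3 <- R3 - 4R1 on a 3xn matrix A:
A = np.arange(9).reshape(3, 3)
assert (E1 @ A == np.vstack([A[0], A[1], A[2] - 4 * A[0]])).all()
```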
The Algorithm for Finding A^{-1}
Theorem 7: Invertibility and Row Equivalence to Identity:
An n \times n matrix A is invertible if and only if A is row equivalent to the n \times n identity matrix (I_n).
Furthermore, if A is invertible, any sequence of elementary row operations that reduces A to I_n will also transform I_n into A^{-1}.
Proof of Theorem 7:
Part 1: If A is invertible, then A \sim I_n (A is row equivalent to I_n).
By Theorem 5, if A is invertible, the equation A\mathbf{x} = \mathbf{b} has a solution for every \mathbf{b}.
This implies that A has a pivot position in every row (Theorem 4 in Section 1.4).
Since A is a square n \times n matrix, having n pivot positions in n rows means all n pivot positions must lie on the main diagonal.
Therefore, the reduced echelon form of A must be I_n, meaning A \sim I_n.
Part 2: If A \sim I_n, then A is invertible.
If A \sim I_n, then A can be transformed into I_n by a sequence of elementary row operations.
Each elementary row operation corresponds to left-multiplication by an elementary matrix, so there exist elementary matrices E_1, E_2, \ldots, E_p such that: E_p \cdots E_2 E_1 A = I_n \quad (1)
Since each elementary matrix is invertible, the product K = E_p \cdots E_1 is also invertible (by the generalization of Theorem 6b).
From KA = I_n, multiplying both sides on the left by K^{-1} gives A = K^{-1}. Hence A is invertible, and by Theorem 6(a), A^{-1} = (K^{-1})^{-1} = K = E_p \cdots E_1.
This means that A^{-1} is precisely the matrix obtained by applying the same sequence of elementary row operations to I_n, because E_p \cdots E_1 I_n = E_p \cdots E_1 = K = A^{-1}.
Algorithm for Finding A^{-1}:
To find the inverse of an n \times n matrix A, form the augmented matrix [A \ I] by placing A and the identity matrix I_n side-by-side.
Perform row operations to reduce this augmented matrix.
If A is row equivalent to I_n: The augmented matrix will transform from [A \ I] to [I \ A^{-1}]. The matrix on the right side will be A^{-1}.
If A is not row equivalent to I_n: If, during row reduction, a row of zeros appears on the left side (where A initially was), then A is not invertible, and no inverse exists.
Example 7: Finding the Inverse of a 3 \times 3 Matrix:
Find the inverse of A = [[1, 2, 1], [0, 1, 0], [3, 0, 1]], if it exists.
Form the augmented matrix [A \ I]:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [3, 0, 1, |, 0, 0, 1]]
Perform row operations:
R3 \leftarrow R3 - 3R1:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, -6, -2, |, -3, 0, 1]]
R3 \leftarrow R3 + 6R2:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, -2, |, -3, 6, 1]]
R3 \leftarrow (-1/2)R3:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
R1 \leftarrow R1 - R3:
[[1, 2, 0, |, -1/2, 3, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
R1 \leftarrow R1 - 2R2:
[[1, 0, 0, |, -1/2, 1, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
Since A is row equivalent to I, A is invertible. The inverse is:
A^{-1} = [[-1/2, 1, 1/2], [0, 1, 0], [3/2, -3, -1/2]]
Checking the Answer:
It's good practice to verify the calculated inverse by checking if AA^{-1} = I (or A^{-1}A = I).
Note: If A is known to be invertible and you derive a matrix C such that AC = I, then C must be A^{-1}. It is not strictly necessary to also check CA=I in this context of computation, as the algorithm guarantees that if a matrix on the right emerges, it is the unique inverse.
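The algorithm can be sketched as a short Gauss-Jordan routine; a minimal implementation (the function name invert is ours, and partial pivoting is added for numerical stability, beyond what the hand calculation needs):

```python
import numpy as np

def invert(A):
    """Row-reduce [A | I]; return A^{-1}, or None if A is not invertible."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])   # form [A | I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))       # best pivot row for column j
        if np.isclose(M[p, j], 0.0):
            return None                           # no pivot: A is singular
        M[[j, p]] = M[[p, j]]                     # row interchange
        M[j] /= M[j, j]                           # row scaling
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]            # row replacement
    return M[:, n:]                               # right half is A^{-1}

A = np.array([[1, 2, 1], [0, 1, 0], [3, 0, 1]])
print(invert(A))    # matches A^{-1} from Example 7
```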
Another View of Matrix Inversion (Solving Multiple Systems Simultaneously):
Finding A^{-1} by row reducing [A \ I] can be viewed as simultaneously solving n separate matrix equations:
A\mathbf{x}_1 = \mathbf{e}_1, \ A\mathbf{x}_2 = \mathbf{e}_2, \ \ldots, \ A\mathbf{x}_n = \mathbf{e}_n
where \mathbf{e}_j are the columns of the identity matrix I_n. The augmented columns for these systems are simply the columns of I_n, forming [A \ \mathbf{e}_1 \ \mathbf{e}_2 \ \cdots \ \mathbf{e}_n] = [A \ I].
The property AA^{-1} = I and the definition of matrix multiplication show that the columns of A^{-1} are precisely the solutions \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n of these systems.
Practical Use: This perspective is valuable if an applied problem only requires finding one or two specific columns of A^{-1}. In such cases, only the corresponding systems A\mathbf{x}j = \mathbf{ej} need to be solved, rather than computing the full inverse.
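If only one column of A^{-1} is needed, solving the single system A\mathbf{x}_j = \mathbf{e}_j suffices; a sketch using Example 7's matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]])
e1 = np.array([1.0, 0.0, 0.0])

# Solve A x = e1 alone: x is the first column of A^{-1},
# with no need to compute the other columns.
col1 = np.linalg.solve(A, e1)
assert np.allclose(col1, np.linalg.inv(A)[:, 0])
```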