Ch 2.2 - The Inverse of a Matrix
Understanding Matrix Operations and Efficiency
Equality of Transposed Products:
The quantities (Ax)^T and x^T A^T are equal, as indicated by Theorem 3(d).
Scalar Product (Dot Product) vs. Outer Product:
For a column vector x = [[x1], [x2]]:
The scalar product x^T x = [x1 \ x2] [[x1], [x2]] = [x1^2 + x2^2]. This results in a 1 \times 1 matrix, usually written without brackets (a scalar).
Example from transcript (assuming x=[5; 3] for calculation of 34): If x = [[5], [3]], then x^T x = [5 \ 3] [[5], [3]] = [25+9] = 34.
The outer product x x^T = [[x1], [x2]] [x1 \ x2] = [[x1^2, x1 x2], [x2 x1, x2^2]].
Example from transcript (assuming x=[3; 5] for calculation of matrix): If x = [[3], [5]], then x x^T = [[3], [5]] [3 \ 5] = [[9, 15], [15, 25]].
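As a quick check, the inner and outer products can be computed with NumPy (using the transcript's vector x = (3, 5), which also yields 34 for x^T x):

```python
import numpy as np

# Column vector x = [3, 5]
x = np.array([[3], [5]])

inner = x.T @ x      # 1x1 matrix: x^T x = [[9 + 25]] = [[34]], usually read as a scalar
outer = x @ x.T      # 2x2 matrix: x x^T

print(inner)         # [[34]]
print(outer)         # [[ 9 15]
                     #  [15 25]]
```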
Matrix Product Definition and Undefined Operations:
The product A^T x^T is not defined if the number of columns in A^T does not match the number of rows in x^T. For instance, if A^T has two columns and x is a column vector, then x^T is a row vector (a matrix with a single row), so the two columns of A^T have nothing to multiply against. The transcript states only that "x does not have two rows to match the two columns of A^T"; the exact dimensions of A^T and x were not given beyond that.
Computational Efficiency of Matrix Multiplication:
When computing A^2 x, it is generally more efficient to compute it as A(Ax).
Comparison for a 4 \times 4 matrix A and 4 \times 1 vector x:
Method 1: A(Ax).
Computing Ax requires 4 \times 4 = 16 multiplications (4 for each of the 4 entries in the resulting vector).
Computing A(Ax) then requires another 16 multiplications.
Total: 16 + 16 = 32 multiplications.
Method 2: A^2 x.
Computing A^2 = A \cdot A requires 4^3 = 64 multiplications (4 for each of the 16 entries in the 4 \times 4 matrix A^2).
Computing A^2 x then requires another 16 multiplications.
Total: 64 + 16 = 80 multiplications.
Conclusion: Computing A(Ax) is significantly faster than computing A^2 x.
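The operation counts above are easy to tabulate, and the two orders of evaluation agree numerically; a small NumPy sketch (the matrix entries are arbitrary test data):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
x = rng.standard_normal((n, 1))

# Method 1: A(Ax) -- two matrix-vector products, n*n multiplications each.
cost_method1 = n * n + n * n        # 16 + 16 = 32

# Method 2: (A^2) x -- one matrix-matrix product (n^3) plus one matrix-vector product.
cost_method2 = n ** 3 + n * n       # 64 + 16 = 80

# Same result either way; only the cost differs.
assert np.allclose(A @ (A @ x), (A @ A) @ x)
print(cost_method1, cost_method2)   # 32 80
```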
Properties of Matrix Products (AB) with Identical Rows/Columns in A:
If all columns of matrix A are identical, then all columns of the product AB are also identical.
If all rows of matrix A are identical (i.e., \text{row}_i(A) = \text{row}_j(A) for all i, j), then all rows of the product AB are also identical (\text{row}_i(AB) = \text{row}_i(A) \cdot B).
If all entries in A are the same (implying both identical rows and identical columns), then all entries in AB will also be the same.
The Inverse of a Matrix
Matrix Analogue of a Reciprocal:
The concept of a matrix inverse (A^{-1}) is analogous to the multiplicative inverse (reciprocal) of a non-zero real number (e.g., 5^{-1} = 1/5).
For real numbers, the inverse satisfies 5^{-1}(5) = 1 and 5(5^{-1}) = 1.
Key Distinctions for Matrices:
Two-Sided Definition: Because matrix multiplication is generally not commutative (AB \neq BA), the matrix generalization requires both equations to be satisfied: CA = I and AC = I.
No Division Notation: Slanted-line notation (e.g., A/B) is avoided for matrices.
Square Matrices Only: A full generalization of the inverse is possible only for square matrices (n \times n).
Definition of an Invertible Matrix:
An n \times n matrix A is invertible (or nonsingular) if there exists an n \times n matrix C such that CA = I and AC = I, where I = I_n is the n \times n identity matrix.
C is called an inverse of A.
Uniqueness of the Inverse: The inverse C is uniquely determined by A.
Proof: If B were another inverse of A, then B = BI = B(AC) = (BA)C = IC = C. Thus, B=C.
The unique inverse is denoted as A^{-1}, so the defining equations are A^{-1}A = I and AA^{-1} = I.
A matrix that is not invertible is called a singular matrix.
Example 1: Verifying an Inverse for a 2 \times 2 Matrix:
Given A = [[3, 2], [7, 5]] and C = [[5, -2], [-7, 3]].
CA = [[5, -2], [-7, 3]] [[3, 2], [7, 5]] = [[(5)(3) + (-2)(7), (5)(2) + (-2)(5)], [(-7)(3) + (3)(7), (-7)(2) + (3)(5)]] = [[15 - 14, 10 - 10], [-21 + 21, -14 + 15]] = [[1, 0], [0, 1]] = I.
AC = [[3, 2], [7, 5]] [[5, -2], [-7, 3]] = [[(3)(5) + (2)(-7), (3)(-2) + (2)(3)], [(7)(5) + (5)(-7), (7)(-2) + (5)(3)]] = [[15 - 14, -6 + 6], [35 - 35, -14 + 15]] = [[1, 0], [0, 1]] = I.
Since both conditions are met, C = A^{-1}.
Inverse of a 2 \times 2 Matrix
Theorem 4: Formula for the Inverse of a 2 \times 2 Matrix:
Let A = [[a, b], [c, d]].
If ad - bc \neq 0, then A is invertible and its inverse is given by:
A^{-1} = \frac{1}{ad - bc} [[d, -b], [-c, a]]
If ad - bc = 0, then A is not invertible.
Determinant of a 2 \times 2 Matrix:
The quantity ad - bc is called the determinant of A, denoted \text{det} A = ad - bc.
Theorem 4 implies that a 2 \times 2 matrix A is invertible if and only if \text{det} A \neq 0.
Example 2: Finding the Inverse of a 2 \times 2 Matrix:
Find the inverse of A = [[3, 4], [5, 6]].
First, calculate the determinant: \text{det} A = (3)(6) - (4)(5) = 18 - 20 = -2.
Since \text{det} A = -2 \neq 0, A is invertible.
Using Theorem 4:
A^{-1} = \frac{1}{-2} [[6, -4], [-5, 3]] = [[-3, 2], [5/2, -3/2]]
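Theorem 4 translates directly into a short function; a minimal sketch (the helper name inverse_2x2 is ours, not from the text):

```python
import numpy as np

def inverse_2x2(A):
    """Theorem 4: inverse of a 2x2 matrix, or None when det A = 0 (A singular)."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        return None                      # not invertible
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3.0, 4.0], [5.0, 6.0]])
print(inverse_2x2(A))    # [[-3, 2], [5/2, -3/2]], matching Example 2
```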
Solving Linear Systems with Invertible Matrices
Importance of Invertible Matrices:
They are essential for algebraic calculations and formula derivations in linear algebra.
They can provide insights into mathematical models of real-life situations.
Theorem 5: Unique Solution for Ax = b:
If A is an invertible n \times n matrix, then for any vector \mathbf{b} in R^n, the matrix equation A\mathbf{x} = \mathbf{b} has a unique solution given by \mathbf{x} = A^{-1}\mathbf{b}.
Proof of Existence: Substituting A^{-1}\mathbf{b} for \mathbf{x} in the equation: A(A^{-1}\mathbf{b}) = (AA^{-1})\mathbf{b} = I\mathbf{b} = \mathbf{b}. Thus, A^{-1}\mathbf{b} is a solution.
Proof of Uniqueness: Assume \mathbf{u} is any solution such that A\mathbf{u} = \mathbf{b}. Multiplying both sides by A^{-1} on the left:
A^{-1}(A\mathbf{u}) = A^{-1}\mathbf{b}
(A^{-1}A)\mathbf{u} = A^{-1}\mathbf{b}
I\mathbf{u} = A^{-1}\mathbf{b}
\mathbf{u} = A^{-1}\mathbf{b} Consequently, any solution must be equal to A^{-1}\mathbf{b}, proving uniqueness.
Practical Application: Elastic Beam Deflection (Example 3)
Scenario: A horizontal elastic beam is supported at its ends and subjected to forces at three points (1, 2, 3), causing deflections.
Variables:
\mathbf{f} \in R^3: A vector listing the forces at the three points.
\mathbf{y} \in R^3: A vector listing the amounts of deflection at the three points.
Hooke's Law Relationship: \mathbf{y} = D\mathbf{f}
D: The flexibility matrix.
D^{-1}: The stiffness matrix.
Physical Significance of Columns of D (Flexibility Matrix):
Express D as D = D I_3 = [D\mathbf{e_1} \ D\mathbf{e_2} \ D\mathbf{e_3}], where \mathbf{e_j} are the standard basis vectors (columns of the identity matrix).
Interpreting \mathbf{e_1}: The vector (1, 0, 0) represents a unit force applied downward at point 1, with zero force at the other two points.
First column of D (D\mathbf{e_1}): Contains the beam deflections that result from applying a unit force at point 1 (and zero force at points 2 and 3).
Similarly, the second and third columns of D list the deflections caused by a unit force at points 2 and 3, respectively.
Physical Significance of Columns of D^{-1} (Stiffness Matrix):
The inverse equation is \mathbf{f} = D^{-1}\mathbf{y}, which computes the force vector \mathbf{f} required to produce a given deflection vector \mathbf{y} (i.e., this matrix describes the beam's stiffness).
Express D^{-1} as D^{-1} = [D^{-1}\mathbf{e_1} \ D^{-1}\mathbf{e_2} \ D^{-1}\mathbf{e_3}].
Interpreting \mathbf{e_1} as a deflection vector: The vector (1, 0, 0) now represents a unit deflection at point 1, with zero deflections at the other two points.
First column of D^{-1} (D^{-1}\mathbf{e_1}): Lists the forces that must be applied at the three points to produce a unit deflection at point 1 and zero deflections at points 2 and 3.
Similarly, columns 2 and 3 of D^{-1} list the forces required to produce unit deflections at points 2 and 3, respectively.
Note on Forces: To achieve specific deflections, some forces in these columns might be negative (indicating an upward force).
Units: If flexibility is measured in "inches of deflection per pound of load," then stiffness matrix entries are given in "pounds of load per inch of deflection."
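As an illustration only (the text gives no numerical data; the flexibility matrix D below is hypothetical), the column interpretations can be checked in NumPy:

```python
import numpy as np

# Hypothetical flexibility matrix D (inches of deflection per pound of load);
# the entries are illustrative, not from the text.
D = np.array([[0.005, 0.002, 0.001],
              [0.002, 0.004, 0.002],
              [0.001, 0.002, 0.005]])

S = np.linalg.inv(D)        # stiffness matrix D^{-1} (pounds per inch)

# D e_1: deflections produced by a unit force at point 1 -- the first column of D.
print(D @ np.array([1.0, 0.0, 0.0]))

# Column 1 of D^{-1}: forces needed at the three points for a unit deflection
# at point 1 and zero deflection at points 2 and 3; entries may be negative.
print(S[:, 0])
```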
Practicalities of Using A^{-1} to Solve Ax=b
Computational Efficiency: The formula \mathbf{x} = A^{-1}\mathbf{b} is rarely used for numerical computations of A\mathbf{x}=\mathbf{b} for large matrices.
Row reduction of the augmented matrix [A \ \mathbf{b}] is almost always faster and generally more accurate (as it can minimize rounding errors).
Exception: The 2 \times 2 case can be an exception, where mental calculation of A^{-1} might make using the formula quicker.
Example 4: Solving a 2 \times 2 System Using A^{-1}
System:
3x1 + 4x2 = 3
5x1 + 6x2 = 7
This is equivalent to A\mathbf{x} = \mathbf{b}, where A = [[3, 4], [5, 6]] and \mathbf{b} = [[3], [7]].
From Example 2, we found A^{-1} = [[-3, 2], [5/2, -3/2]].
Solution:
\mathbf{x} = A^{-1}\mathbf{b} = [[-3, 2], [5/2, -3/2]] [[3], [7]]
= [[(-3)(3) + (2)(7)], [(5/2)(3) + (-3/2)(7)]]
= [[-9 + 14], [15/2 - 21/2]]
= [[5], [-6/2]] = [[5], [-3]]
So, x1 = 5 and x2 = -3.
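Example 4 in NumPy; a minimal sketch:

```python
import numpy as np

A = np.array([[3.0, 4.0], [5.0, 6.0]])
b = np.array([[3.0], [7.0]])

x = np.linalg.inv(A) @ b     # x = A^{-1} b  (fine for a 2x2 demonstration)
print(x)                     # x1 = 5, x2 = -3

# For larger systems, prefer a solver over forming the inverse (see the
# "Practicalities" discussion): row reduction is faster and more accurate.
assert np.allclose(np.linalg.solve(A, b), x)
```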
Properties of Invertible Matrices
Theorem 6: Important Facts about Invertible Matrices:
a. Inverse of an Inverse: If A is an invertible matrix, then its inverse, A^{-1}, is also invertible, and (A^{-1})^{-1} = A.
Proof: By definition, A^{-1}A = I and AA^{-1} = I. These equations satisfy the conditions for A to be the inverse of A^{-1}.
b. Inverse of a Product: If A and B are n \times n invertible matrices, then their product AB is also invertible. The inverse of AB is the product of their inverses in reverse order: (AB)^{-1} = B^{-1}A^{-1}
Proof: We need to show that (AB)(B^{-1}A^{-1}) = I and (B^{-1}A^{-1})(AB) = I:
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I
A similar calculation shows (B^{-1}A^{-1})(AB) = I.
c. Inverse of a Transpose: If A is an invertible matrix, then its transpose, A^T, is also invertible. The inverse of A^T is the transpose of A^{-1}: (A^T)^{-1} = (A^{-1})^T
Proof: Using Theorem 3(d) (which states (XY)^T = Y^T X^T):
(A^{-1})^T A^T = (AA^{-1})^T = I^T = I
And also:
A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I
Thus, A^T is invertible, and its inverse is (A^{-1})^T.
Generalization of Theorem 6(b):
The product of any number of n \times n invertible matrices is invertible, and its inverse is the product of their inverses in reverse order. For example, (ABC)^{-1} = C^{-1}B^{-1}A^{-1}.
Role of Definitions in Proofs: Proofs rigorously demonstrate that a proposed inverse (or other property) satisfies the formal definition. For example, showing (B^{-1}A^{-1}) is the inverse of AB means showing it satisfies the definition of an inverse with AB.
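The three parts of Theorem 6 are easy to spot-check numerically; a sketch using random matrices (which are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

inv = np.linalg.inv

# (a) (A^{-1})^{-1} = A
assert np.allclose(inv(inv(A)), A)

# (b) (AB)^{-1} = B^{-1} A^{-1}  -- note the reverse order
assert np.allclose(inv(A @ B), inv(B) @ inv(A))

# (c) (A^T)^{-1} = (A^{-1})^T
assert np.allclose(inv(A.T), inv(A).T)
```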
Elementary Matrices and Computing A^{-1}
Connection to Row Operations:
A significant connection exists between invertible matrices and elementary row operations.
An invertible matrix A is row equivalent to the identity matrix I_n.
This relationship provides a systematic method for finding A^{-1}.
Definition of an Elementary Matrix:
An elementary matrix is a matrix obtained by performing a single elementary row operation on an identity matrix (I_m).
There are three types of elementary matrices, corresponding to the three elementary row operations:
Row Replacement: Adding a multiple of one row to another.
Row Interchange: Swapping two rows.
Row Scaling: Multiplying a row by a nonzero scalar.
Example 5: Elementary Matrices and Row Operations:
Let A = [[a, b, c], [d, e, f], [g, h, i]].
E1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (obtained by R3 \leftarrow R3 - 4R1 on I_3).
E1A performs the operation R3 \leftarrow R3 - 4R1 on A.
E2 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]] (obtained by R1 \leftrightarrow R2 on I_3).
E2A performs the operation R1 \leftrightarrow R2 on A.
E3 = [[1, 0, 0], [0, 1, 0], [0, 0, 5]] (obtained by R3 \leftarrow 5R3 on I_3).
E3A performs the operation R3 \leftarrow 5R3 on A.
General Fact: If an elementary row operation is performed on an m \times n matrix A, the resulting matrix can be written as EA, where E is the m \times m elementary matrix created by performing the same row operation on I_m.
Invertibility of Elementary Matrices:
Every elementary matrix E is invertible.
This is because elementary row operations are reversible (Section 1.1).
The inverse of an elementary matrix E is simply the elementary matrix of the same type that performs the reverse operation, transforming E back into the identity matrix I.
Example 6: Finding the Inverse of an Elementary Matrix:
Given E1 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]] (which adds -4 times row 1 to row 3 of I3).
To reverse this operation and transform E1 back into I3, one must add +4 times row 1 to row 3.
Therefore, the inverse is E_1^{-1} = [[1, 0, 0], [0, 1, 0], [4, 0, 1]].
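A quick NumPy check that E1 and its proposed inverse multiply to the identity, and that left-multiplying by E1 performs the row operation on any matrix A:

```python
import numpy as np

E1 = np.array([[1, 0, 0],
               [0, 1, 0],
               [-4, 0, 1]])          # adds -4 * row 1 to row 3

E1_inv = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [4, 0, 1]])       # reverse operation: adds +4 * row 1 to row 3

assert (E1 @ E1_inv == np.eye(3)).all()
assert (E1_inv @ E1 == np.eye(3)).all()

# Left-multiplication by E1 performs R3 <- R3 - 4R1 on a 3xn matrix A:
A = np.arange(9).reshape(3, 3)
assert (E1 @ A == np.vstack([A[0], A[1], A[2] - 4 * A[0]])).all()
```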
The Algorithm for Finding A^{-1}
Theorem 7: Invertibility and Row Equivalence to Identity:
An n \times n matrix A is invertible if and only if A is row equivalent to the n \times n identity matrix (I_n).
Furthermore, if A is invertible, any sequence of elementary row operations that reduces A to I_n will also transform I_n into A^{-1}.
Proof of Theorem 7:
Part 1: If A is invertible, then A \sim I_n (A is row equivalent to I_n).
By Theorem 5, if A is invertible, the equation A\mathbf{x} = \mathbf{b} has a solution for every \mathbf{b}.
This implies that A has a pivot position in every row (Theorem 4 in Section 1.4).
Since A is a square n \times n matrix, having n pivot positions in n rows means all n pivot positions must lie on the main diagonal.
Therefore, the reduced echelon form of A must be I_n, meaning A \sim I_n.
Part 2: If A \sim I_n, then A is invertible.
If A \sim I_n, then A can be transformed into I_n by a sequence of elementary row operations.
Each elementary row operation corresponds to left-multiplication by an elementary matrix, so there exist elementary matrices E_1, E_2, \ldots, E_p such that: E_p \cdots E_2 E_1 A = I_n \quad (1)
Since each elementary matrix is invertible, the product K = E_p \cdots E_1 is also invertible (by the generalization of Theorem 6b).
From KA = I_n, multiplying both sides on the left by K^{-1} gives A = K^{-1}. Hence A is invertible, and by Theorem 6(a), A^{-1} = (K^{-1})^{-1} = K = E_p \cdots E_1.
This means that A^{-1} is precisely the matrix obtained by applying the same sequence of elementary row operations to I_n, because E_p \cdots E_1 I_n = E_p \cdots E_1 = K = A^{-1}.
Algorithm for Finding A^{-1}:
To find the inverse of an n \times n matrix A, form the augmented matrix [A \ I] by placing A and the identity matrix I_n side-by-side.
Perform row operations to reduce this augmented matrix.
If A is row equivalent to I_n: The augmented matrix will transform from [A \ I] to [I \ A^{-1}]. The matrix on the right side will be A^{-1}.
If A is not row equivalent to I_n: If, during row reduction, a row of zeros appears on the left side (where A initially was), then A is not invertible, and no inverse exists.
Example 7: Finding the Inverse of a 3 \times 3 Matrix:
Find the inverse of A = [[1, 2, 1], [0, 1, 0], [3, 0, 1]], if it exists.
Form the augmented matrix [A \ I]:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [3, 0, 1, |, 0, 0, 1]]
Perform row operations:
R3 \leftarrow R3 - 3R1:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, -6, -2, |, -3, 0, 1]]
R3 \leftarrow R3 + 6R2:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, -2, |, -3, 6, 1]]
R3 \leftarrow (-1/2)R3:
[[1, 2, 1, |, 1, 0, 0], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
R1 \leftarrow R1 - R3:
[[1, 2, 0, |, -1/2, 3, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
R1 \leftarrow R1 - 2R2:
[[1, 0, 0, |, -1/2, 1, 1/2], [0, 1, 0, |, 0, 1, 0], [0, 0, 1, |, 3/2, -3, -1/2]]
Since A is row equivalent to I, A is invertible. The inverse is:
A^{-1} = [[-1/2, 1, 1/2], [0, 1, 0], [3/2, -3, -1/2]]
Checking the Answer:
It's good practice to verify the calculated inverse by checking if AA^{-1} = I (or A^{-1}A = I).
Note: If A is known to be invertible and you derive a matrix C such that AC = I, then C must be A^{-1}. It is not strictly necessary to also check CA=I in this context of computation, as the algorithm guarantees that if a matrix on the right emerges, it is the unique inverse.
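The algorithm can be sketched as a short Gauss-Jordan routine; a minimal implementation (the function name invert is ours, and partial pivoting is added for numerical stability, beyond what the hand calculation needs):

```python
import numpy as np

def invert(A):
    """Row-reduce [A | I]; return A^{-1}, or None if A is not invertible."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])   # form [A | I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))       # best pivot row for column j
        if np.isclose(M[p, j], 0.0):
            return None                           # no pivot: A is singular
        M[[j, p]] = M[[p, j]]                     # row interchange
        M[j] /= M[j, j]                           # row scaling
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]            # row replacement
    return M[:, n:]                               # right half is A^{-1}

A = np.array([[1, 2, 1], [0, 1, 0], [3, 0, 1]])
print(invert(A))    # matches A^{-1} from Example 7
```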
Another View of Matrix Inversion (Solving Multiple Systems Simultaneously):
Finding A^{-1} by row reducing [A \ I] can be viewed as simultaneously solving n separate matrix equations:
A\mathbf{x}_1 = \mathbf{e}_1, \ A\mathbf{x}_2 = \mathbf{e}_2, \ \ldots, \ A\mathbf{x}_n = \mathbf{e}_n
where \mathbf{e}_j are the columns of the identity matrix I_n. The augmented columns for these systems are simply the columns of I_n, forming [A \ \mathbf{e}_1 \ \mathbf{e}_2 \ \cdots \ \mathbf{e}_n] = [A \ I].
The property AA^{-1} = I and the definition of matrix multiplication show that the columns of A^{-1} are precisely the solutions \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n of these systems.
Practical Use: This perspective is valuable if an applied problem only requires finding one or two specific columns of A^{-1}. In such cases, only the corresponding systems A\mathbf{x}j = \mathbf{ej} need to be solved, rather than computing the full inverse.
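If only one column of A^{-1} is needed, solving the single system A\mathbf{x}_j = \mathbf{e}_j suffices; a sketch using Example 7's matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]])
e1 = np.array([1.0, 0.0, 0.0])

# Solve A x = e1 alone: x is the first column of A^{-1},
# with no need to compute the other columns.
col1 = np.linalg.solve(A, e1)
assert np.allclose(col1, np.linalg.inv(A)[:, 0])
```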