1.4 The Matrix Equation Ax = b
Equivalence of System, Vector, and Matrix Equations
Connection to Span (from Section 1.3)
A vector equation is equivalent to a system of linear equations.
Example (Page 1): A system involving a vector y and vectors {v1, v2, v3} can be represented by an augmented matrix.
Row reduction of the augmented matrix:
\begin{bmatrix} 1 & 5 & -3 & -4 \\ -2 & -7 & 0 & h \\ 3 & -6 & -5 & -8 \end{bmatrix} \sim \begin{bmatrix} 1 & 5 & -3 & -4 \\ 0 & 1 & -2 & -1 \\ 0 & 0 & 0 & h-5 \end{bmatrix}
The system is consistent if and only if there is no pivot in the last (fourth) column, which requires the entry h - 5 to be zero.
Therefore, y is in Span{v1, v2, v3} if and only if h = 5.
Important Note: The presence of a free variable does not guarantee that a system is consistent.
Linear Combinations and Span Properties
If vectors u and v are in Span{w1, w2, w3}, then there exist scalars c1, c2, c3 and d1, d2, d3 such that:
u = c1w1 + c2w2 + c3w3
v = d1w1 + d2w2 + d3w3
Then their sum u + v is also in Span{w1, w2, w3}:
u + v = (c1w1 + c2w2 + c3w3) + (d1w1 + d2w2 + d3w3)
u + v = (c1 + d1)w1 + (c2 + d2)w2 + (c3 + d3)w3
Since (c1 + d1), (c2 + d2), and (c3 + d3) are also scalars, this shows u + v is a linear combination of {w1, w2, w3}.
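The closure argument above can be checked numerically. A minimal sketch with NumPy, where the vectors w1, w2, w3 and the weights are hypothetical values chosen only for illustration:

```python
import numpy as np

# Hypothetical spanning vectors (any choice works for this check)
w1 = np.array([1.0, 0.0, 2.0])
w2 = np.array([0.0, 1.0, -1.0])
w3 = np.array([3.0, 1.0, 0.0])

c = np.array([2.0, -1.0, 4.0])   # weights c1, c2, c3 for u
d = np.array([1.0, 3.0, -2.0])   # weights d1, d2, d3 for v

W = np.column_stack([w1, w2, w3])
u = W @ c                        # u = c1*w1 + c2*w2 + c3*w3
v = W @ d                        # v = d1*w1 + d2*w2 + d3*w3

# u + v equals the linear combination with weights (c1+d1, c2+d2, c3+d3):
lhs = u + v
rhs = W @ (c + d)
print(np.allclose(lhs, rhs))     # True
```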
Definition of the Matrix-Vector Product Ax
A fundamental idea in linear algebra is to view a linear combination of vectors as the product of a matrix and a vector.
Definition (Page 1): If A is an m \times n matrix with columns a1, \dots, an, and if x is in R^n (a column vector with n entries), then the product of A and x, denoted by Ax, is the linear combination of the columns of A using the corresponding entries in x as weights.
Ax = \begin{bmatrix} a1 & a2 & \dots & an \end{bmatrix} \begin{bmatrix} x1 \\ x2 \\ \vdots \\ xn \end{bmatrix} = x1a1 + x2a2 + \dots + xnan
Condition: Ax is defined only if the number of columns of A (which is n) equals the number of entries in x (also n).
Example 1 (Page 1):
a.
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 4 \\ 3 \\ 7 \end{bmatrix} = 4 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ 5 \end{bmatrix} + 7 \begin{bmatrix} 3 \\ 6 \end{bmatrix} = \begin{bmatrix} 4 \\ 16 \end{bmatrix} + \begin{bmatrix} 6 \\ 15 \end{bmatrix} + \begin{bmatrix} 21 \\ 42 \end{bmatrix} = \begin{bmatrix} 31 \\ 73 \end{bmatrix}
b.
\begin{bmatrix} 8 & 3 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 7 \end{bmatrix} = 4 \begin{bmatrix} 8 \\ 5 \end{bmatrix} + 7 \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 32 \\ 20 \end{bmatrix} + \begin{bmatrix} 21 \\ 7 \end{bmatrix} = \begin{bmatrix} 53 \\ 27 \end{bmatrix}
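The definition of Ax as a weighted sum of columns can be reproduced directly with NumPy, using the numbers from Example 1(a); the built-in matrix-vector product gives the same answer:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
x = np.array([4, 3, 7])

# Definition: Ax = x1*a1 + x2*a2 + x3*a3, the columns of A weighted by x
by_definition = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]
print(by_definition)   # [31 73]
print(A @ x)           # [31 73], same result via NumPy's matrix-vector product
```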
Example 2 (Page 2): To write the linear combination 3v1 - 5v2 + 7v3 as a matrix times a vector:
Place the vectors v1, v2, v3 into the columns of a matrix A.
Place the weights 3, -5, 7 into a vector x.
Thus, 3v1 - 5v2 + 7v3 = \begin{bmatrix} v1 & v2 & v3 \end{bmatrix} \begin{bmatrix} 3 \\ -5 \\ 7 \end{bmatrix} = Ax
The Matrix Equation Ax = b
Any system of linear equations can be written as an equivalent matrix equation in the form Ax = b.
Example (System to Matrix Equation, Page 2):
System of linear equations:
x1 - 2x2 - x3 = 4
-5x2 + 3x3 = 1
Equivalent vector equation:
x1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + x2 \begin{bmatrix} -2 \\ -5 \end{bmatrix} + x3 \begin{bmatrix} -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}
Equivalent matrix equation:
\begin{bmatrix} 1 & -2 & -1 \\ 0 & -5 & 3 \end{bmatrix} \begin{bmatrix} x1 \\ x2 \\ x3 \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}
The matrix here is the coefficient matrix of the system.
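A quick numerical check of this example: taking the free variable x3 = 2 and back-substituting gives x2 = 1 and x1 = 8 (one particular solution among infinitely many), which can be verified against the matrix form:

```python
import numpy as np

A = np.array([[1, -2, -1],
              [0, -5,  3]])
b = np.array([4, 1])

# One particular solution, obtained by choosing the free variable x3 = 2
x = np.array([8, 1, 2])
print(A @ x)   # [4 1], equal to b
```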
Theorem 3 (Page 2): Equivalence of Three Forms
If A is an m \times n matrix with columns a1, \dots, an, and b is in R^m, then the matrix equation Ax = b has the same solution set as:
The vector equation: x1a1 + x2a2 + \dots + xnan = b
The system of linear equations whose augmented matrix is:
\begin{bmatrix} a1 & a2 & \dots & an & b \end{bmatrix}
Significance: This theorem provides a powerful tool, allowing problems to be viewed in three equivalent ways: as a matrix equation, a vector equation, or a system of linear equations. This flexibility aids in constructing mathematical models and solving problems through row reduction of the augmented matrix.
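As a sketch of the third viewpoint, the augmented matrix of the example system above can be row reduced symbolically (SymPy assumed here); no pivot falls in the last column, confirming consistency:

```python
from sympy import Matrix

A = Matrix([[1, -2, -1],
            [0, -5,  3]])
b = Matrix([4, 1])

# Augmented matrix [a1 a2 a3 b], then row reduce
aug = A.row_join(b)
rref, pivots = aug.rref()
print(rref)     # reduced echelon form of the augmented matrix
print(pivots)   # (0, 1): pivots in columns 1 and 2, none in the augmented column
```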
Existence of Solutions
Key Fact (Page 3): The equation Ax = b has a solution if and only if b is a linear combination of the columns of A. Equivalently, Ax = b is consistent if and only if b is in Span{a1, \dots, an}.
Determining Consistency for All Possible b
Example 3 (Page 3): Let A = \begin{bmatrix} 1 & 3 & 4 \\ -4 & 2 & -6 \\ -3 & -2 & -7 \end{bmatrix}. Is the equation Ax = b consistent for all possible b = \begin{bmatrix} b1 \\ b2 \\ b3 \end{bmatrix}?
Row reduce the augmented matrix [A \ b]:
\begin{bmatrix} 1 & 3 & 4 & b1 \\ -4 & 2 & -6 & b2 \\ -3 & -2 & -7 & b3 \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 4 & b1 \\ 0 & 14 & 10 & b2+4b1 \\ 0 & 7 & 5 & b3+3b1 \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 4 & b1 \\ 0 & 14 & 10 & b2+4b1 \\ 0 & 0 & 0 & b3+3b1 - \frac{1}{2}(b2+4b1) \end{bmatrix}
The third entry in the augmented column is b3 + 3b1 - \frac{1}{2}b2 - 2b1 = b1 - \frac{1}{2}b2 + b3.
The system Ax=b is not consistent for every b because some choices of b can make (b1 - \frac{1}{2}b2 + b3) nonzero, leading to an inconsistent system.
The reduced matrix shows that Ax = b is consistent if and only if the entries in b satisfy:
b1 - \frac{1}{2}b2 + b3 = 0
This equation describes a plane through the origin in R^3, which is the set of all linear combinations of the columns of A. The equation Ax = b fails to be consistent for all b because the echelon form of A has a row of zeros.
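Example 3 can be checked numerically: Ax = b is consistent exactly when rank(A) = rank([A | b]), so vectors b on the plane b1 - (1/2)b2 + b3 = 0 succeed and vectors off it fail. A sketch with NumPy:

```python
import numpy as np

A = np.array([[ 1,  3,  4],
              [-4,  2, -6],
              [-3, -2, -7]])

def consistent(b):
    """Ax = b is consistent iff rank(A) == rank of the augmented matrix [A | b]."""
    aug = np.column_stack([A, b])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(aug)

b_on_plane  = np.array([1.0, 2.0, 0.0])   # satisfies b1 - (1/2)b2 + b3 = 0
b_off_plane = np.array([1.0, 0.0, 0.0])   # violates the condition

print(consistent(b_on_plane))    # True
print(consistent(b_off_plane))   # False
```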
Theorem 4 (Page 3): Equivalence of Four Statements
Let A be an m \times n matrix. The following statements are logically equivalent (all true or all false):
a. For each b in R^m, the equation Ax = b has a solution.
b. Each b in R^m is a linear combination of the columns of A.
c. The columns of A span R^m. (That is, Span{a1, \dots, an} = R^m, where a1, \dots, an are the columns of A.)
d. The matrix A has a pivot position in every row.
Warning (Page 4): Theorem 4 is about the coefficient matrix A, not the augmented matrix [A \ b]. If an augmented matrix [A \ b] has a pivot position in every row, the equation Ax = b may or may not be consistent (e.g., if the pivot is in the augmented column).
Proof Sketch (Page 6):
Statements (a), (b), and (c) are equivalent by definition.
To show (a) and (d) are equivalent: Let U be an echelon form of A.
If (d) is true (pivot in every row of U), then [U \ d] cannot have a pivot in the augmented column, so Ax = b has a solution for any b (meaning (a) is true).
If (d) is false (the last row of U is all zeros), choose a vector d with a 1 in its last entry; then [U \ d] represents an inconsistent system. Since row operations are reversible, [U \ d] can be transformed back into an augmented matrix [A \ b] whose system is also inconsistent, so (a) is false.
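Statement (d) can be tested computationally: an m-row matrix has a pivot in every row exactly when its rank equals m. A sketch comparing the matrix from Example 3 (rank 2, so its columns do not span R^3) with the identity:

```python
import numpy as np

# Matrix from Example 3; its echelon form has a zero row, so rank 2 < 3
A = np.array([[ 1,  3,  4],
              [-4,  2, -6],
              [-3, -2, -7]])
print(np.linalg.matrix_rank(A))          # 2: no pivot in every row, columns do not span R^3

# The 3x3 identity has a pivot in every row, so its columns span R^3
print(np.linalg.matrix_rank(np.eye(3)))  # 3
```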
Computation of Ax: The Row-Vector Rule
This rule provides a more efficient method for calculating entries in Ax by hand.
Example 4 (Page 4): Compute Ax, where A = \begin{bmatrix} 2 & 3 & 4 \\ -1 & 5 & -3 \\ 6 & -2 & 8 \end{bmatrix} and x = \begin{bmatrix} x1 \\ x2 \\ x3 \end{bmatrix}.
By definition: Ax = x1 \begin{bmatrix} 2 \\ -1 \\ 6 \end{bmatrix} + x2 \begin{bmatrix} 3 \\ 5 \\ -2 \end{bmatrix} + x3 \begin{bmatrix} 4 \\ -3 \\ 8 \end{bmatrix} = \begin{bmatrix} 2x1 + 3x2 + 4x3 \\ -x1 + 5x2 - 3x3 \\ 6x1 - 2x2 + 8x3 \end{bmatrix}
Notice that the first entry in Ax (2x1 + 3x2 + 4x3) is the sum of products of entries from the first row of A and the entries in x.
Similarly, the second entry (-x1 + 5x2 - 3x3) is from the second row of A and x. And so on.
Row-Vector Rule for Computing Ax (Page 4): If the product Ax is defined, then the i^{th} entry in Ax is the sum of the products of corresponding entries from row i of A and from the vector x.
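The row-vector rule translates directly into a short loop. A sketch using the matrix from Example 4 and hypothetical entries for x, compared against NumPy's own product:

```python
import numpy as np

def matvec_row_rule(A, x):
    """Row-vector rule: entry i of Ax is the sum of products of row i of A with x."""
    m, n = A.shape
    result = np.empty(m)
    for i in range(m):
        result[i] = sum(A[i, j] * x[j] for j in range(n))
    return result

A = np.array([[ 2,  3,  4],
              [-1,  5, -3],
              [ 6, -2,  8]], dtype=float)
x = np.array([1.0, 2.0, 3.0])   # hypothetical values for x1, x2, x3
print(matvec_row_rule(A, x))    # matches A @ x
```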
Example 5 (Page 5):
a.
\begin{bmatrix} 2 & -3 \\ 8 & 0 \\ -5 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ 7 \end{bmatrix} = \begin{bmatrix} (2)(4) + (-3)(7) \\ (8)(4) + (0)(7) \\ (-5)(4) + (2)(7) \end{bmatrix} = \begin{bmatrix} 8 - 21 \\ 32 + 0 \\ -20 + 14 \end{bmatrix} = \begin{bmatrix} -13 \\ 32 \\ -6 \end{bmatrix}
b. \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r \\ s \\ t \end{bmatrix}
This specific matrix is called an identity matrix and is denoted by I.
The calculation shows that I \begin{bmatrix} r \\ s \\ t \end{bmatrix} = \begin{bmatrix} (1)(r) + (0)(s) + (0)(t) \\ (0)(r) + (1)(s) + (0)(t) \\ (0)(r) + (0)(s) + (1)(t) \end{bmatrix} = \begin{bmatrix} r \\ s \\ t \end{bmatrix}, so Ix = x for every x in R^3.
There is an analogous n \times n identity matrix, often written as In, for which Ix = x for every x in R^n.
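The identity property Ix = x is easy to confirm with NumPy's `eye`, using a hypothetical vector x:

```python
import numpy as np

I = np.eye(3)                   # the 3x3 identity matrix
x = np.array([2.0, -1.0, 5.0])  # hypothetical entries r, s, t
print(I @ x)                    # [ 2. -1.  5.], i.e. Ix = x
```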
Properties of the Matrix-Vector Product (Theorem 5)
If A is an m \times n matrix, u and v are vectors in R^n, and c is a scalar, then:
a. Distributive Property: A(u + v) = Au + Av
b. Scalar Multiplication Property: A(cu) = c(Au)
Proof Sketch (Page 5, for n = 3):
Let A = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix}, and let ui and vi be the i^{th} entries of u and v respectively.
For (a):
A(u + v) = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix} \begin{bmatrix} u1 + v1 \\ u2 + v2 \\ u3 + v3 \end{bmatrix}
= (u1 + v1)a1 + (u2 + v2)a2 + (u3 + v3)a3
= (u1a1 + u2a2 + u3a3) + (v1a1 + v2a2 + v3a3)
= Au + Av
For (b):
A(cu) = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix} \begin{bmatrix} cu1 \\ cu2 \\ cu3 \end{bmatrix}
= (cu1)a1 + (cu2)a2 + (cu3)a3
= c(u1a1) + c(u2a2) + c(u3a3)
= c(u1a1 + u2a2 + u3a3)
= c(Au)
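Both properties of Theorem 5 can be spot-checked numerically. A sketch with randomly generated (but seeded, hence reproducible) test data:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)
u = rng.integers(-5, 5, size=4).astype(float)
v = rng.integers(-5, 5, size=4).astype(float)
c = 2.5

print(np.allclose(A @ (u + v), A @ u + A @ v))   # True  (Theorem 5a)
print(np.allclose(A @ (c * u), c * (A @ u)))     # True  (Theorem 5b)
```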
Numerical Note on Computation of Ax (Page 6)
To optimize computer algorithms for Ax, the sequence of calculations should involve data stored in contiguous memory locations.
Fortran: Stores matrices as sets of columns. Algorithms in Fortran compute Ax as a linear combination of the columns of A (consistent with the definition).
C: Stores matrices by rows. Algorithms in C should compute Ax using the row-vector rule (multiplication of rows of A by vector x).
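NumPy exposes both storage conventions, which makes the distinction concrete: `order='C'` stores rows contiguously (as in C) and `order='F'` stores columns contiguously (as in Fortran). The product is identical either way; only the memory access pattern differs:

```python
import numpy as np

A_rowmajor = np.arange(12.0).reshape(3, 4)     # C order: rows are contiguous
A_colmajor = np.asfortranarray(A_rowmajor)     # Fortran order: columns are contiguous

print(A_rowmajor.flags['C_CONTIGUOUS'])        # True
print(A_colmajor.flags['F_CONTIGUOUS'])        # True

# Either layout yields the same Ax; layout affects only memory traversal
x = np.ones(4)
print(np.allclose(A_rowmajor @ x, A_colmajor @ x))   # True
```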
Practice Problems (Page 6)
Examples of exercises include:
Exhibiting b as a linear combination of columns of A given that x is a solution to Ax=b.
Verifying Theorem 5(a) for specific matrices and vectors.
Constructing matrices A and vectors b, c such that Ax=b has a solution but Ax=c does not.
Computing matrix-vector products using both the definition and the row-vector rule.
Converting between matrix equations and vector equations.