Ch 1.4 The Matrix Equation Ax = b

1.4 The Matrix Equation Ax=bAx = b

Equivalence of System, Vector, and Matrix Equations

Connection to Span (from Section 1.3)
  • A vector equation is equivalent to a system of linear equations.

  • Example (Page 1): A system involving a vector yy and vectors V1,V2,V3{V1, V2, V3} can be represented by an augmented matrix.

    • Row reduction of the augmented matrix:
      [1amp;5amp;3amp;4 2amp;7amp;0amp;h 3amp;6amp;5amp;8 ][1amp;5amp;3amp;4 0amp;1amp;2amp;1 0amp;0amp;1amp;h5 ]\begin{bmatrix} 1 & 5 & -3 & -4 \ -2 & -7 & 0 & h \ 3 & -6 & -5 & -8 \ \end{bmatrix} \sim \begin{bmatrix} 1 & 5 & -3 & -4 \ 0 & 1 & -2 & -1 \ 0 & 0 & 1 & h-5 \ \end{bmatrix}

    • The system is consistent if and only if there's no pivot in the last (fourth) column. This means (h5)(h-5) must be zero.

    • Therefore, yy is in SpanV1,V2,V3Span{V1, V2, V3} if and only if h=5h = 5.

  • Important Note: The presence of a free variable does not guarantee that a system is consistent.

Linear Combinations and Span Properties
  • If vectors uu and vv are in SpanW1,W2,W3Span{W1, W2, W3}, then there exist scalars C1,C2,C3C1, C2, C3 and d1,d2,d3d1, d2, d3 such that:

    • u=C1W1+C2W2+C3W3u = C1W1 + C2W2 + C3W3

    • v=d1W1+d2W2+d3W3v = d1W1 + d2W2 + d3W3

  • Then, their sum u+vu + v is also in SpanW1,W2,W3Span{W1, W2, W3}:

    • u+v=(C1W1+C2W2+C3W3)+(d1W1+d2W2+d3W3)u + v = (C1W1 + C2W2 + C3W3) + (d1W1 + d2W2 + d3W3)

    • u+v=(C1+d1)W1+(C2+d2)W2+(C3+d3)W3u + v = (C1 + d1)W1 + (C2 + d2)W2 + (C3 + d3)W3

    • Since (C1+d1)(C1 + d1), (C2+d2)(C2 + d2), and (C3+d3)(C3 + d3) are also scalars, this shows u+vu + v is a linear combination of W1,W2,W3{W1, W2, W3}.

Definition of the Matrix-Vector Product AxAx

  • A fundamental idea in linear algebra is to view a linear combination of vectors as the product of a matrix and a vector.

  • Definition (Page 1): If AA is an m×nm \times n matrix with columns a1,,ana1, \dots, an, and if xx is in RnR^n (a column vector with nn entries), then the product of AA and xx, denoted by AxAx, is the linear combination of the columns of AA using the corresponding entries in xx as weights.

    • Ax=[a1amp;a2amp;amp;an][x1 x2  xn]=x1a1+x2a2++xnanAx = \begin{bmatrix} a1 & a2 & \dots & an \end{bmatrix} \begin{bmatrix} x1 \ x2 \ \vdots \ xn \end{bmatrix} = x1a1 + x2a2 + \dots + xnan

  • Condition: AxAx is defined only if the number of columns of AA (which is nn) equals the number of entries in xx (also nn).

  • Example 1 (Page 1):

    • a.
      [1amp;2amp;3 4amp;5amp;6 ][4 3 7 ]=4[1 4 ]+3[2 5 ]+7[3 6 ]=[4 16 ]+[6 15 ]+[21 42 ]=[31 73 ]\begin{bmatrix} 1 & 2 & 3 \ 4 & 5 & 6 \ \end{bmatrix} \begin{bmatrix} 4 \ 3 \ 7 \ \end{bmatrix} = 4 \begin{bmatrix} 1 \ 4 \ \end{bmatrix} + 3 \begin{bmatrix} 2 \ 5 \ \end{bmatrix} + 7 \begin{bmatrix} 3 \ 6 \ \end{bmatrix} = \begin{bmatrix} 4 \ 16 \ \end{bmatrix} + \begin{bmatrix} 6 \ 15 \ \end{bmatrix} + \begin{bmatrix} 21 \ 42 \ \end{bmatrix} = \begin{bmatrix} 31 \ 73 \ \end{bmatrix}

    • b.
      [8amp;3 5amp;1 ][4 7 ]=4[8 5 ]+7[3 1 ]=[32 20 ]+[21 7 ]=[53 27 ]\begin{bmatrix} 8 & 3 \ 5 & 1 \ \end{bmatrix} \begin{bmatrix} 4 \ 7 \ \end{bmatrix} = 4 \begin{bmatrix} 8 \ 5 \ \end{bmatrix} + 7 \begin{bmatrix} 3 \ 1 \ \end{bmatrix} = \begin{bmatrix} 32 \ 20 \ \end{bmatrix} + \begin{bmatrix} 21 \ 7 \ \end{bmatrix} = \begin{bmatrix} 53 \ 27 \ \end{bmatrix}

  • Example 2 (Page 2): To write the linear combination 3v15v2+7v33v1 - 5v2 + 7v3 as a matrix times a vector:

    • Place the vectors v1,v2,v3v1, v2, v3 into the columns of a matrix AA.

    • Place the weights 3,5,73, -5, 7 into a vector xx.

    • Thus, 3v15v2+7v3=[v1amp;v2amp;v3][3 5 7 ]=Ax3v1 - 5v2 + 7v3 = \begin{bmatrix} v1 & v2 & v3 \end{bmatrix} \begin{bmatrix} 3 \ -5 \ 7 \ \end{bmatrix} = Ax

The Matrix Equation Ax=bAx = b

  • Any system of linear equations can be written as an equivalent matrix equation in the form Ax=bAx = b.

  • Example (System to Matrix Equation, Page 2):

    • System of linear equations:
      x12x2x3=4x1 - 2x2 - x3 = 4
      5x2+3x3=1-5x2 + 3x3 = 1

    • Equivalent vector equation:
      x1[1 0 ]+x2[2 5 ]+x3[1 3 ]=[4 1 ]x1 \begin{bmatrix} 1 \ 0 \ \end{bmatrix} + x2 \begin{bmatrix} -2 \ -5 \ \end{bmatrix} + x3 \begin{bmatrix} -1 \ 3 \ \end{bmatrix} = \begin{bmatrix} 4 \ 1 \ \end{bmatrix}

    • Equivalent matrix equation:
      [1amp;2amp;1 0amp;5amp;3 ][x1 x2 x3 ]=[4 1 ]\begin{bmatrix} 1 & -2 & -1 \ 0 & -5 & 3 \ \end{bmatrix} \begin{bmatrix} x1 \ x2 \ x3 \ \end{bmatrix} = \begin{bmatrix} 4 \ 1 \ \end{bmatrix}

    • The matrix here is the coefficient matrix of the system.

Theorem 3 (Page 2): Equivalence of Three Forms
  • If AA is an m×nm \times n matrix with columns a1,,ana1, \dots, an, and bb is in RmR^m, then the matrix equation Ax=bAx = b has the same solution set as:

    • The vector equation: x1a1+x2a2++xnan=bx1a1 + x2a2 + \dots + xnan = b

    • The system of linear equations whose augmented matrix is:
      [a1amp;a2amp;amp;anamp;b ]\begin{bmatrix} a1 & a2 & \dots & an & b \ \end{bmatrix}

  • Significance: This theorem provides a powerful tool, allowing problems to be viewed in three equivalent ways: as a matrix equation, a vector equation, or a system of linear equations. This flexibility aids in constructing mathematical models and solving problems through row reduction of the augmented matrix.

Existence of Solutions

  • Key Fact (Page 3): The equation Ax=bAx = b has a solution if and only if bb is a linear combination of the columns of AA. Equivalently, Ax=bAx = b is consistent if and only if bb is in Spana1,,anSpan{a1, \dots, an}.

Determining Consistency for All Possible bb
  • Example 3 (Page 3): Let A=[1amp;3amp;4 4amp;2amp;6 3amp;2amp;7 ]A = \begin{bmatrix} 1 & 3 & 4 \ -4 & 2 & -6 \ -3 & -2 & -7 \ \end{bmatrix}. Is the equation Ax=bAx = b consistent for all possible b=[b1 b2 b3 ]b = \begin{bmatrix} b1 \ b2 \ b3 \ \end{bmatrix}?

    • Row reduce the augmented matrix [A b][A \ b]:
      [1amp;3amp;4amp;b1 4amp;2amp;6amp;b2 3amp;2amp;7amp;b3 ][1amp;3amp;4amp;b1 0amp;14amp;10amp;b2+4b1 0amp;7amp;5amp;b3+3b1 ][1amp;3amp;4amp;b1 0amp;14amp;10amp;b2+4b1 0amp;0amp;0amp;b3+3b112(b2+4b1) ]\begin{bmatrix} 1 & 3 & 4 & b1 \ -4 & 2 & -6 & b2 \ -3 & -2 & -7 & b3 \ \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 4 & b1 \ 0 & 14 & 10 & b2+4b1 \ 0 & 7 & 5 & b3+3b1 \ \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 4 & b1 \ 0 & 14 & 10 & b2+4b1 \ 0 & 0 & 0 & b3+3b1 - \frac{1}{2}(b2+4b1) \ \end{bmatrix}

    • The third entry in the augmented column is b3+3b112b22b1=b112b2+b3b3+3b1 - \frac{1}{2}b2 - 2b1 = b1 - \frac{1}{2}b2 + b3.

    • The system Ax=bAx=b is not consistent for every bb because some choices of bb can make (b112b2+b3)(b1 - \frac{1}{2}b2 + b3) nonzero, leading to an inconsistent system.

    • The reduced matrix shows that Ax=bAx = b is consistent if and only if the entries in bb satisfy:
      b112b2+b3=0b1 - \frac{1}{2}b2 + b3 = 0

    • This equation describes a plane through the origin in R3R^3, which is the set of all linear combinations of the columns of AA. The equation fails to be consistent for all bb because the echelon form of AA has a row of zeros.

Theorem 4 (Page 3): Equivalence of Four Statements
  • Let AA be an m×nm \times n matrix. The following statements are logically equivalent (all true or all false):

    • a. For each bb in RmR^m, the equation Ax=bAx = b has a solution.

    • b. Each bb in RmR^m is a linear combination of the columns of AA.

    • c. The columns of AA span RmR^m. (This means Spanv1,,vp=RmSpan{v1, \dots, vp} = R^m if v1,,vp{v1, \dots, vp} are the columns of AA).

    • d. The matrix AA has a pivot position in every row.

  • Warning (Page 4): Theorem 4 is about the coefficient matrix AA, not the augmented matrix [A b][A \ b]. If an augmented matrix [A b][A \ b] has a pivot position in every row, the equation Ax=bAx = b may or may not be consistent (e.g., if the pivot is in the augmented column).

  • Proof Sketch (Page 6):

    • Statements (a), (b), and (c) are equivalent by definition.

    • To show (a) and (d) are equivalent: Let UU be an echelon form of AA.

      • If (d) is true (pivot in every row of UU), then [U d][U \ d] cannot have a pivot in the augmented column, so Ax=bAx = b has a solution for any bb (meaning (a) is true).

      • If (d) is false (last row of UU is all zeros), then we can construct a vector dd with a 11 in its last entry. The system [U d][U \ d] represents an inconsistent system. Since row operations are reversible, this implies an original system [A b][A \ b] that is also inconsistent, thus (a) is false.

Computation of AxAx: The Row-Vector Rule

  • This rule provides a more efficient method for calculating entries in AxAx by hand.

  • Example 4 (Page 4): Compute AxAx, where A=[2amp;3amp;4 1amp;5amp;3 6amp;2amp;8 ]A = \begin{bmatrix} 2 & 3 & 4 \ -1 & 5 & -3 \ 6 & -2 & 8 \ \end{bmatrix} and x=[x1 x2 x3 ]x = \begin{bmatrix} x1 \ x2 \ x3 \ \end{bmatrix}.

    • By definition: Ax=x1[2 1 6 ]+x2[3 5 2 ]+x3[4 3 8 ]=[2x1+3x2+4x3 x1+5x23x3 6x12x2+8x3 ]Ax = x1 \begin{bmatrix} 2 \ -1 \ 6 \ \end{bmatrix} + x2 \begin{bmatrix} 3 \ 5 \ -2 \ \end{bmatrix} + x3 \begin{bmatrix} 4 \ -3 \ 8 \ \end{bmatrix} = \begin{bmatrix} 2x1 + 3x2 + 4x3 \ -x1 + 5x2 - 3x3 \ 6x1 - 2x2 + 8x3 \ \end{bmatrix}

    • Notice that the first entry in AxAx (2x1+3x2+4x32x1 + 3x2 + 4x3) is the sum of products of entries from the first row of AA and the entries in xx.

    • Similarly, the second entry (x1+5x23x3-x1 + 5x2 - 3x3) is from the second row of AA and xx. And so on.

  • Row-Vector Rule for Computing AxAx (Page 4): If the product AxAx is defined, then the ithi^{th} entry in AxAx is the sum of the products of corresponding entries from row ii of AA and from the vector xx.

  • Example 5 (Page 5):

    • a.
      [2amp;3 8amp;0 5amp;2 ][4 7 ]=[(2)(4)+(3)(7) (8)(4)+(0)(7) (5)(4)+(2)(7) ]=[821 32+0 20+14 ]=[13 32 6 ]\begin{bmatrix} 2 & -3 \ 8 & 0 \ -5 & 2 \ \end{bmatrix} \begin{bmatrix} 4 \ 7 \ \end{bmatrix} = \begin{bmatrix} (2)(4) + (-3)(7) \ (8)(4) + (0)(7) \ (-5)(4) + (2)(7) \ \end{bmatrix} = \begin{bmatrix} 8 - 21 \ 32 + 0 \ -20 + 14 \ \end{bmatrix} = \begin{bmatrix} -13 \ 32 \ -6 \ \end{bmatrix}

    • b. [1amp;0amp;0 0amp;1amp;0 0amp;0amp;1 ][r s t ]\begin{bmatrix} 1 & 0 & 0 \ 0 & 1 & 0 \ 0 & 0 & 1 \ \end{bmatrix} \begin{bmatrix} r \ s \ t \ \end{bmatrix}

      • This specific matrix is called an identity matrix and is denoted by II.

      • The calculation shows that I[r s t ]=[(1)(r)+(0)(s)+(0)(t) (0)(r)+(1)(s)+(0)(t) (0)(r)+(0)(s)+(1)(t) ]=[r s t ]I \begin{bmatrix} r \ s \ t \ \end{bmatrix} = \begin{bmatrix} (1)(r) + (0)(s) + (0)(t) \ (0)(r) + (1)(s) + (0)(t) \ (0)(r) + (0)(s) + (1)(t) \ \end{bmatrix} = \begin{bmatrix} r \ s \ t \ \end{bmatrix}, so Ix=xIx = x for every xx in R3R^3.

      • There is an analogous n×nn \times n identity matrix, often written as InIn, for which Ix=xIx = x for every xx in RnR^n.

Properties of the Matrix-Vector Product (Theorem 5)

  • If AA is an m×nm \times n matrix, uu and vv are vectors in RnR^n, and cc is a scalar, then:

    • a. Distributive Property: A(u+v)=Au+AvA(u + v) = Au + Av

    • b. Scalar Multiplication Property: A(cu)=c(Au)A(cu) = c(Au)

  • Proof Sketch (Page 5, for n=3n = 3):

    • Let A=[a1amp;a2amp;a3]A = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix}, and let uiui and vivi be the ithi^{th} entries of uu and vv respectively.

    • For (a):
      A(u+v)=[a1amp;a2amp;a3][u1+v1 u2+v2 u3+v3 ]A(u + v) = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix} \begin{bmatrix} u1 + v1 \ u2 + v2 \ u3 + v3 \ \end{bmatrix}
      =(u1+v1)a1+(u2+v2)a2+(u3+v3)a3= (u1 + v1)a1 + (u2 + v2)a2 + (u3 + v3)a3
      =(u1a1+u2a2+u3a3)+(v1a1+v2a2+v3a3)= (u1a1 + u2a2 + u3a3) + (v1a1 + v2a2 + v3a3)
      =Au+Av= Au + Av

    • For (b):
      A(cu)=[a1amp;a2amp;a3][cu1 cu2 cu3 ]A(cu) = \begin{bmatrix} a1 & a2 & a3 \end{bmatrix} \begin{bmatrix} cu1 \ cu2 \ cu3 \ \end{bmatrix}
      =(cu1)a1+(cu2)a2+(cu3)a3= (cu1)a1 + (cu2)a2 + (cu3)a3
      =c(u1a1)+c(u2a2)+c(u3a3)= c(u1a1) + c(u2a2) + c(u3a3)
      =c(u1a1+u2a2+u3a3)= c(u1a1 + u2a2 + u3a3)
      =c(Au)= c(Au)

Numerical Note on Computation of AxAx (Page 6)

  • To optimize computer algorithms for AxAx, the sequence of calculations should involve data stored in contiguous memory locations.

  • Fortran: Stores matrices as sets of columns. Algorithms in Fortran compute AxAx as a linear combination of the columns of AA (consistent with the definition).

  • C: Stores matrices by rows. Algorithms in C should compute AxAx using the row-vector rule (multiplication of rows of AA by vector xx).

Practice Problems (Page 6)

  • Examples of exercises include:

    • Exhibiting bb as a linear combination of columns of AA given that xx is a solution to Ax=bAx=b.

    • Verifying Theorem 5(a) for specific matrices and vectors.

    • Constructing matrices AA and vectors b,cb, c such that Ax=bAx=b has a solution but Ax=cAx=c does not.

    • Computing matrix-vector products using both the definition and the row-vector rule.

    • Converting between matrix equations and vector equations.