LU Factorization Notes

LU Factorization

Introduction

These notes cover LU factorization, based on Section 2.5 of LAE, focusing on the pure LU factorization method, which does not involve switching rows. The key is to understand under what conditions an LU factorization exists for a given matrix $A$.

Existence of LU Factorization

For an $m \times n$ matrix $A$ to have an LU factorization, it must be reducible to echelon form using only type one row operations, following the standard Gaussian elimination procedure.

This means:

  1. Each pivot must already be in place when its turn comes, starting with the first pivot in the top row.

  2. No row swaps are allowed, even to avoid a zero in a pivot position.

  3. Use each pivot to eliminate all entries below it.

  4. No scaling of individual rows.

If these conditions are met, $A$ can be factored as $A = LU$, where:

  • $U$ is the echelon form obtained from the above procedure.

  • $L$ is a square matrix of size $m \times m$, where $m$ is the number of rows of $A$.

Properties of L

$L$ has the following properties:

  • Lower triangular: zeros everywhere above its main diagonal.

  • Main diagonal consists of ones.

Solving Ax = b using LU Factorization

If $A$ has an LU factorization, solving $Ax = b$ becomes computationally efficient:

  1. Solve $Ly = b$ using forward substitution.

  2. Solve $Ux = y$ using backward substitution.

This two-step process yields the solution to $Ax = b$.
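The two triangular solves can be sketched in code. This is a minimal illustration assuming a square unit lower triangular $L$ and an upper triangular $U$ with nonzero diagonal entries; the function names and matrix representation (plain lists of lists) are my own, not from the notes.

```python
def forward_sub(L, b):
    """Solve Ly = b from the top row down."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # subtract the already-known terms, then divide by the diagonal entry
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def back_sub(U, y):
    """Solve Ux = y from the bottom row up."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

def solve_via_lu(L, U, b):
    """Solve Ax = b given A = LU: first Ly = b, then Ux = y."""
    return back_sub(U, forward_sub(L, b))
```

Neither step ever touches $A$ itself: once the factorization is known, only the triangular pieces are needed.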

Example

Consider a matrix $A$ that can be reduced to echelon form $U$ using type one row operations. Suppose the row operations:

$$R_2 \rightarrow R_2 + R_1$$

$$R_3 \rightarrow R_3 - 2R_1$$

result in the echelon form $U$. In this case, $L$ is given by:

$$L = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}$$

Notice that the entries in the lower portion of $L$ are the negatives of the coefficients used in reducing $A$ to $U$. This is because $L$ is a product of the inverses of the corresponding elementary matrices.

The key point: $A = LU$.

Solving Ax = b

To solve $Ax = b$, we first solve $Ly = b$.

Given

$$L = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}$$

and

$$b = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix},$$

then

$$Ly = b$$

implies

$$\begin{aligned} y_1 &= 2 \\ -y_1 + y_2 &= -4 \\ 2y_1 + y_3 &= 6 \end{aligned}$$

Solving this system yields:

$$\begin{aligned} y_1 &= 2 \\ y_2 &= -4 + y_1 = -2 \\ y_3 &= 6 - 2y_1 = 2 \end{aligned}$$

So, $y = \begin{bmatrix} 2 \\ -2 \\ 2 \end{bmatrix}$.
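The forward-substitution steps above can be checked with a short loop, using the $L$ and $b$ from this example. Because $L$ has ones on its diagonal, each $y_i$ comes straight from its equation with no division:

```python
# L and b taken from this example; L is unit lower triangular
L = [[1, 0, 0],
     [-1, 1, 0],
     [2, 0, 1]]
b = [2, -4, 6]

y = []
for i in range(3):
    # move the already-known terms to the right-hand side
    y.append(b[i] - sum(L[i][j] * y[j] for j in range(i)))

# y == [2, -2, 2], matching the hand computation
```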

Next, solve $Ux = y$.

Using the previously determined $y = \begin{bmatrix} 2 \\ -2 \\ 2 \end{bmatrix}$ and given some $U$ (echelon form), we solve $Ux = y$ via backward substitution.

Backward Substitution

Given an upper triangular matrix (echelon form) $U$, solve $Ux = y$ from the bottom up.

For example, assume $U$ looks like this:

$$U = \begin{bmatrix} 4 & 3 & -5 \\ 0 & -2 & -2 \\ 0 & 0 & 2 \end{bmatrix}$$

then $Ux = y$ means solving:

$$\begin{aligned} 4x_1 + 3x_2 - 5x_3 &= 2 \\ -2x_2 - 2x_3 &= -2 \\ 2x_3 &= 2 \end{aligned}$$

  1. From the bottom equation, $x_3 = 1$.

  2. Rearrange the second equation: $-2x_2 = -2 + 2x_3$, which gives $x_2 = 0$.

  3. Rearrange the first equation: $4x_1 = 2 - 3x_2 + 5x_3$, which gives $x_1 = 7/4$.

Remember to present the solution in the correct order: $x = \begin{bmatrix} 7/4 \\ 0 \\ 1 \end{bmatrix}$.
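A quick numeric check of the backward step, taking $U$'s last row as $(0, 0, 2)$ so that the bottom equation ($2x_3 = 2$) is consistent with $y_3 = 2$ from the forward step:

```python
U = [[4, 3, -5],
     [0, -2, -2],
     [0, 0, 2]]
y = [2, -2, 2]

x = [0.0] * 3
for i in (2, 1, 0):                 # bottom row first
    x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, 3))) / U[i][i]

# x solves Ux = y; expect x = [7/4, 0, 1]
```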

Computational Efficiency

Solving $Ax = b$ as $LUx = b$ involves first solving for $y$ in $Ly = b$ and then solving for $x$ in $Ux = y$.

This is a computationally efficient approach, especially when solving $Ax = b$ for many different vectors $b$: the factorization is computed once, and only the two triangular solves are repeated.
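A sketch of the reuse idea: $L$ and $U$ are taken as given (here the values from the worked example, with $U$'s last row as $(0, 0, 2)$), and each new right-hand side costs only two $O(n^2)$ triangular solves instead of a fresh $O(n^3)$ elimination. The second $b$ vector is illustrative.

```python
def solve_with_lu(L, U, b):
    n = len(b)
    y = [0] * n
    for i in range(n):                        # forward: Ly = b (unit diagonal, no division)
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):            # backward: Ux = y
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1, 0, 0], [-1, 1, 0], [2, 0, 1]]
U = [[4, 3, -5], [0, -2, -2], [0, 0, 2]]

# same factorization, many right-hand sides
solutions = [solve_with_lu(L, U, b) for b in ([2, -4, 6], [1, 1, 1])]
```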

Finding the LU Factorization

To find the LU factorization, we need to reduce $A$ to its echelon form $U$ using only type one row operations, maintaining the order of pivots from top to bottom.

If this is not possible, then a pure LU factorization does not exist.

As row reduction is performed, track the row operations applied.

Determining L

Take the negatives of the coefficients used in the row reduction and place them in the corresponding lower triangular entries of $L$. This makes the process simple.

For example:

Given row operations:

$$\begin{aligned} R_2 &\rightarrow R_2 + R_1 \\ R_3 &\rightarrow R_3 - 2R_1 \\ R_3 &\rightarrow R_3 - 2R_2 \end{aligned}$$

The lower triangular matrix $L$ will have ones on its diagonal. The coefficients in these operations are $1$, $-2$, and $-2$; placing their negatives $-1$, $2$, and $2$ in the corresponding lower triangular positions gives:

$$L = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 2 & 1 \end{bmatrix}$$
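The multiplier-tracking procedure can be sketched in code. The matrix $A$ below is made up so that its elimination uses exactly the three row operations above; only the rule itself comes from the notes.

```python
def lu_no_pivot(A):
    """Type-one elimination only; assumes every pivot is nonzero (no swaps)."""
    n = len(A)
    U = [row[:] for row in A]                  # work on a copy of A
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]              # row op is R_i -> R_i - m*R_k,
            L[i][k] = m                        # so L records m, the negated coefficient
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

# Hypothetical A whose elimination is R2 -> R2 + R1, R3 -> R3 - 2*R1, R3 -> R3 - 2*R2:
A = [[2, 1, 1],
     [-2, 2, 0],
     [4, 8, 8]]
L, U = lu_no_pivot(A)
# L == [[1, 0, 0], [-1, 1, 0], [2, 2, 1]], the matrix derived above
```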

Lay Textbook Reference

The Lay textbook presents the algorithm for finding $L$ differently, but it yields the same result. See Example 2 in Section 2.5, page 128 of the fifth edition.

Why the Algorithm Works

Inverses of Elementary Matrices

Understanding how inverses of elementary matrices work is crucial.

  • Type One Row Operations: If $E$ replaces row $i$ with itself plus $c$ copies of row $j$, then $E^{-1}$ replaces row $i$ with itself minus $c$ copies of row $j$.

  • Row Swap Operations: Performing the same row swap twice returns the original matrix. Thus, the inverse is the matrix itself.

  • Row Scaling: If $E$ scales row $i$ by $k$ (where $k \neq 0$), then $E^{-1}$ scales row $i$ by $1/k$.

The inverses of elementary matrices correspond exactly to the inverse row operations.
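The type one rule can be spot-checked numerically. The operation and its coefficient ($R_3 \rightarrow R_3 + 5R_1$, with the value 5 made up for illustration) are assumptions of this sketch:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_multiple(n, i, j, c):
    """Elementary matrix E for the type one operation R_i -> R_i + c*R_j."""
    E = [[1 if r == s else 0 for s in range(n)] for r in range(n)]
    E[i][j] = c
    return E

E = add_multiple(3, 2, 0, 5)        # R3 -> R3 + 5*R1  (rows are 0-indexed)
E_inv = add_multiple(3, 2, 0, -5)   # inverse: R3 -> R3 - 5*R1
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# matmul(E, E_inv) == I and matmul(E_inv, E) == I
```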

Sequence of Row Operations

If $A$ can be reduced to echelon form $U$ using type one row operations, then there is a sequence of elementary matrices with:

$$E_k \dots E_2 E_1 A = U$$

Taking the inverses in reverse order:

$$A = E_1^{-1} E_2^{-1} \dots E_k^{-1} U$$

Here, $L = E_1^{-1} E_2^{-1} \dots E_k^{-1}$. Thus, $A = LU$.

Action of Inverse Row Operations

The inverse row operations modify the lower triangular portion of $I$ (the identity matrix), leaving the diagonal unchanged.

Filling in the negatives of the original row operation coefficients into the corresponding lower triangular positions gives $L$.

By applying these operations in reverse order, we maintain the structure of the matrix and correctly compute $L$.

Example

Suppose the row operations $E_1$ to $E_6$ in sequence convert $A$ to echelon form $U$:

$$E_6 E_5 E_4 E_3 E_2 E_1 A = U$$

Then, multiplying the equation by the inverses of these elementary matrices in reverse order, we obtain:

$$A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} E_5^{-1} E_6^{-1} U$$

$$L = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} E_5^{-1} E_6^{-1}$$

which yields $A = LU$.
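The same bookkeeping can be verified on a smaller case; a minimal sketch using the three operations from the "Determining L" example above, with a hand-rolled matrix product:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Inverses of the elementary matrices for the operations
# R2 -> R2 + R1, R3 -> R3 - 2*R1, R3 -> R3 - 2*R2 (in that order):
E1_inv = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]   # undoes "add R1 to R2"
E2_inv = [[1, 0, 0], [0, 1, 0], [2, 0, 1]]    # undoes "subtract 2*R1 from R3"
E3_inv = [[1, 0, 0], [0, 1, 0], [0, 2, 1]]    # undoes "subtract 2*R2 from R3"

L = matmul(E1_inv, matmul(E2_inv, E3_inv))
# L == [[1, 0, 0], [-1, 1, 0], [2, 2, 1]]: unit lower triangular, with the
# negated coefficients sitting exactly where each operation acted
```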

We still want to justify that this process actually produces a lower triangular matrix with ones down its diagonal, and to see what each inverse row operation does to it.