Comprehensive Guide on LU Factorization
LU factorization
Let $\boldsymbol{A}$ be a square matrix. The $\boldsymbol{LU}$ factorization of $\boldsymbol{A}$ involves expressing $\boldsymbol{A}$ as the product of a lower triangular matrix $\boldsymbol{L}$ and an upper triangular matrix $\boldsymbol{U}$ like so:
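$$\boldsymbol{A} = \boldsymbol{L}\boldsymbol{U} = \begin{pmatrix}l_{11}&0&0\\l_{21}&l_{22}&0\\l_{31}&l_{32}&l_{33}\end{pmatrix}\begin{pmatrix}u_{11}&u_{12}&u_{13}\\0&u_{22}&u_{23}\\0&0&u_{33}\end{pmatrix}$$

Here, the factorization is illustrated in the $3\times3$ case; the same pattern holds for any size.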
As we shall see later, not all square matrices have an $\boldsymbol{LU}$ factorization.
LU factorization of 2x2 and 3x3 matrices
Below are some examples of $\boldsymbol{LU}$ factorization:
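For instance, the following are valid $\boldsymbol{LU}$ factorizations (these particular matrices are illustrative):

$$\begin{pmatrix}2&1\\6&8\end{pmatrix} = \begin{pmatrix}1&0\\3&1\end{pmatrix}\begin{pmatrix}2&1\\0&5\end{pmatrix},\qquad \begin{pmatrix}1&2&1\\2&5&4\\1&5&10\end{pmatrix} = \begin{pmatrix}1&0&0\\2&1&0\\1&3&1\end{pmatrix}\begin{pmatrix}1&2&1\\0&1&2\\0&0&3\end{pmatrix}$$

In each case, the first factor is lower triangular and the second is upper triangular.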
Performing LU factorization on a matrix
Let $\boldsymbol{A}$ be a square matrix. If $\boldsymbol{A}$ can be reduced to a row echelon form $\boldsymbol{U}$ by Gaussian elimination without performing any row swaps, then $\boldsymbol{A}$ can be factorized into $\boldsymbol{LU}$ form. Specifically, $\boldsymbol{L}$ is a lower triangular matrix equal to the product of the inverses of the elementary matrices required to reduce $\boldsymbol{A}$ to $\boldsymbol{U}$.
Proof. The row echelon form $\boldsymbol{U}$ of a square matrix $\boldsymbol{A}$ can be obtained by applying a series of elementary row operations to $\boldsymbol{A}$. Assume that none of the elementary row operations is a row interchange. By an earlier theorem, performing an elementary row operation on $\boldsymbol{A}$ is equivalent to left-multiplying $\boldsymbol{A}$ by the corresponding elementary matrix, like so:
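$$\boldsymbol{E}_k\cdots\boldsymbol{E}_2\boldsymbol{E}_1\boldsymbol{A} = \boldsymbol{U}$$

Here, $\boldsymbol{E}_1$, $\boldsymbol{E}_2$, $\cdots$, $\boldsymbol{E}_k$ are the elementary matrices corresponding to the row operations, applied in order.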
By an earlier theorem, elementary matrices are invertible, so we can make $\boldsymbol{A}$ the subject:
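\begin{equation}\label{eq:Ty2QBEaT0q9NbNmLu1S}
\boldsymbol{A} = \boldsymbol{E}_1^{-1}\boldsymbol{E}_2^{-1}\cdots\boldsymbol{E}_k^{-1}\boldsymbol{U}
\end{equation}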
Recall that there are three types of elementary matrices:
elementary matrix corresponding to multiplying a row by some scalar $k$.
elementary matrix corresponding to interchanging two rows.
elementary matrix corresponding to adding a multiple of one row to another row.
We will focus on the first and third types of elementary matrices, and then later explain why $\boldsymbol{LU}$ factorization does not work when we allow the second type of elementary matrix.
Firstly, consider the elementary matrix corresponding to the elementary row operation of multiplying a row by some scalar $k$. This elementary matrix resembles the identity matrix where one of the $1$s is multiplied by $k$. An example of such an elementary matrix might look like so:
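$$\begin{pmatrix}1&0&0\\0&k&0\\0&0&1\end{pmatrix}$$

Here, the second row of the $3\times3$ identity matrix has been scaled by $k$; the choice of row and size is illustrative.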
This means that this type of elementary matrix is a triangular matrix. Note that it can be treated as either a lower or an upper triangular matrix.
Secondly, consider the elementary matrix corresponding to the elementary row operation of multiplying one row by a scalar and then adding it to another row. This type of elementary matrix is constructed by replacing one of the $0$s below or above the diagonal of an identity matrix with $k$. Examples of such elementary matrices might look like so:
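$$\begin{pmatrix}1&0&k\\0&1&0\\0&0&1\end{pmatrix},\qquad \begin{pmatrix}1&0&0\\0&1&0\\0&k&1\end{pmatrix}$$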
Here, the left elementary matrix corresponds to multiplying the third row by $k$ and then adding it to the first row. The right elementary matrix corresponds to multiplying the second row by $k$ and then adding it to the third row. Therefore, depending on the operation we perform, this type of elementary matrix can be either an upper or a lower triangular matrix. However, we only require one of these two forms to obtain the row echelon form. For instance, consider the following matrix:
To obtain the row echelon form, we can either:
multiply the bottom row by $-1/2$ and then add it to the top row. The associated elementary matrix will be upper triangular.
multiply the top row by $-2$ and then add it to the bottom row. The associated elementary matrix will be lower triangular.
Both ways give us the row echelon form. This means that, without loss of generality, we can take all elementary matrices of this type to be lower triangular.
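In the $2\times2$ case, the elementary matrices corresponding to these two operations are:

$$\begin{pmatrix}1&-\tfrac{1}{2}\\0&1\end{pmatrix}\;\text{(upper triangular)},\qquad \begin{pmatrix}1&0\\-2&1\end{pmatrix}\;\text{(lower triangular)}$$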
Now, for your reference, here's \eqref{eq:Ty2QBEaT0q9NbNmLu1S} again:
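\begin{equation}\label{eq:vNlDGOwhsFrqkw9fegY}
\boldsymbol{A} = \boldsymbol{E}_1^{-1}\boldsymbol{E}_2^{-1}\cdots\boldsymbol{E}_k^{-1}\boldsymbol{U}
\end{equation}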
We know that the elementary matrices $\boldsymbol{E}_1$, $\boldsymbol{E}_2$, $\cdots$, $\boldsymbol{E}_k$ are all lower triangular matrices. By an earlier theorem, the inverse of a lower triangular matrix is also lower triangular. Therefore, $\boldsymbol{E}_1^{-1}$, $\boldsymbol{E}^{-1}_2$, $\cdots$, $\boldsymbol{E}^{-1}_k$ are all lower triangular matrices. By another theorem, the product of lower triangular matrices is also a lower triangular matrix. Let's express this product like so:
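\begin{equation}\label{eq:nwLkhk5cMunkwsPtPTh}
\boldsymbol{L} = \boldsymbol{E}_1^{-1}\boldsymbol{E}_2^{-1}\cdots\boldsymbol{E}_k^{-1}
\end{equation}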
where $\boldsymbol{L}$ is a lower triangular matrix. Therefore, \eqref{eq:vNlDGOwhsFrqkw9fegY} becomes:
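$$\boldsymbol{A} = \boldsymbol{L}\boldsymbol{U}$$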
Finally, let's go over why the elementary row operation of row interchanges is not allowed. As an example, here's the elementary matrix corresponding to interchanging the first and second rows:
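$$\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}$$

This is shown here in the $3\times3$ case.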
Clearly, this is not a triangular matrix. If we include a non-triangular matrix in \eqref{eq:nwLkhk5cMunkwsPtPTh}, then the product of the elementary matrices is no longer guaranteed to be a lower triangular matrix.
This completes the proof.
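As a computational companion to this theorem, here is a minimal Python sketch of the procedure (the function name and the test matrix are illustrative choices, not taken from the text above). Each row operation's multiplier is recorded in $\boldsymbol{L}$, mirroring the product of inverse elementary matrices from the proof:

```python
import numpy as np

def lu_no_pivot(A):
    """Factorize A = L @ U by Gaussian elimination without row swaps.

    L is unit lower triangular and U is the resulting row echelon form.
    Raises ValueError on a zero pivot, in which case a row swap would
    be needed and the procedure cannot continue.
    """
    U = np.array(A, dtype=float)
    n = U.shape[0]
    L = np.eye(n)
    for j in range(n - 1):
        if U[j, j] == 0.0:
            raise ValueError("zero pivot encountered; a row swap is required")
        for i in range(j + 1, n):
            m = U[i, j] / U[j, j]     # multiplier for R_i <- R_i - m * R_j
            U[i, j:] -= m * U[j, j:]  # eliminate the entry below the pivot
            L[i, j] = m               # store the inverse row operation in L
    return L, U

# Quick check on a small example.
A = np.array([[2.0, 1.0],
              [6.0, 8.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)
```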
Performing LU factorization on a 2x2 matrix
Perform $\boldsymbol{LU}$ factorization on the following matrix:
Solution. We first obtain the row echelon form of $\boldsymbol{A}$ without performing any row swaps:
Here, we've performed the elementary row operation of multiplying the first row by $-3$ and then adding it to the bottom row. The corresponding elementary matrix is:
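$$\boldsymbol{E} = \begin{pmatrix}1&0\\-3&1\end{pmatrix}$$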
By an earlier theorem, the inverse matrix of $\boldsymbol{E}$ is:
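$$\boldsymbol{E}^{-1} = \begin{pmatrix}1&0\\3&1\end{pmatrix}$$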
By the theorem above, we can express $\boldsymbol{A}$ as:
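$$\boldsymbol{A} = \boldsymbol{E}^{-1}\boldsymbol{U} = \underbrace{\begin{pmatrix}1&0\\3&1\end{pmatrix}}_{\boldsymbol{L}}\boldsymbol{U}$$

where $\boldsymbol{U}$ is the row echelon form obtained above.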
Performing LU factorization on a 3x3 matrix
Perform $\boldsymbol{LU}$ factorization on the following matrix:
Solution. We first obtain the row echelon form of $\boldsymbol{A}$ without performing any row interchanges:
The corresponding elementary row operations are:
By an earlier theorem, the inverse matrices are:
The product of these inverse matrices is:
By the theorem above, we have that:
LU factorization of a matrix is not unique
In general, the $\boldsymbol{LU}$ factorization of a matrix is not unique.
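To see why, suppose $\boldsymbol{A}=\boldsymbol{LU}$ and let $\boldsymbol{D}$ be any invertible diagonal matrix. Then $\boldsymbol{LD}$ is still lower triangular and $\boldsymbol{D}^{-1}\boldsymbol{U}$ is still upper triangular, so:

$$\boldsymbol{A} = \boldsymbol{L}\boldsymbol{U} = (\boldsymbol{L}\boldsymbol{D})(\boldsymbol{D}^{-1}\boldsymbol{U})$$

is another $\boldsymbol{LU}$ factorization of $\boldsymbol{A}$.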
Counter example. Recall the earlier example in which we performed $\boldsymbol{LU}$ factorization on the following matrix:
The row echelon form of $\boldsymbol{A}$ is:
The corresponding elementary matrix and its inverse are:
Therefore, $\boldsymbol{A}$ has the following $\boldsymbol{LU}$ form:
However, we can further reduce the row echelon form of $\boldsymbol{A}$ in \eqref{eq:GiGJS7dCfh5KEbUDdiw} to get:
Here, we multiplied the second row by $-2$. The elementary matrices are:
By an earlier theorem, the inverses of $\boldsymbol{E}_1$ and $\boldsymbol{E}_2$ are:
This means that $\boldsymbol{A}$ can also be factorized into $\boldsymbol{LU}$ form like so:
In fact, we could multiply the first or second row by any nonzero scalar, which would give us another row echelon form $\boldsymbol{U}$. Therefore, there are infinitely many $\boldsymbol{LU}$ factorizations of $\boldsymbol{A}$. This completes the proof.
Not all square matrices can be LU factorized
The theorem above guarantees the existence of an $\boldsymbol{LU}$ form of a square matrix $\boldsymbol{A}$ if $\boldsymbol{A}$ can be reduced to a row echelon form without performing any row swaps. If this condition is not satisfied, then $\boldsymbol{A}$ generally cannot be $\boldsymbol{LU}$ factorized.
Counter example. Consider the following matrix:
Notice how we cannot eliminate the $1$ below the $0$ because we can only add multiples of the top row to the bottom row. Therefore, this matrix can only be reduced to a row echelon form by interchanging the rows. This means that the existence of an $\boldsymbol{LU}$ form of $\boldsymbol{A}$ is not guaranteed. Let's show that $\boldsymbol{A}$ in fact does not have an $\boldsymbol{LU}$ form. Let $\boldsymbol{L}$ and $\boldsymbol{U}$ be defined like so:
The product $\boldsymbol{LU}$ is:
Notice how the top-right entry of $\boldsymbol{LU}$ is $0$. This means that $\boldsymbol{LU}$ can never equal $\boldsymbol{A}$, and so $\boldsymbol{A}$ does not have an $\boldsymbol{LU}$ form.
As another interesting example, consider the following matrix:
Again, the theorem above cannot guarantee the existence of an $\boldsymbol{LU}$ form of $\boldsymbol{A}$. However, this does not mean that an $\boldsymbol{LU}$ form does not exist. In fact, an $\boldsymbol{LU}$ form does exist in this case:
Solving linear systems using LU factorization
Consider the linear system $\boldsymbol{Ax}=\boldsymbol{b}$ and suppose $\boldsymbol{A}$ can be factorized as $\boldsymbol{A}=\boldsymbol{LU}$. Substituting this into $\boldsymbol{Ax}=\boldsymbol{b}$ gives:
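\begin{equation}\label{eq:r9NAn9g56oN1waXrjoV}
\boldsymbol{LUx} = \boldsymbol{b}
\end{equation}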
Define a vector $\boldsymbol{y}$ such that:
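\begin{equation}\label{eq:jVmN4hvWsxyeJFPABTe}
\boldsymbol{y} = \boldsymbol{Ux}
\end{equation}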
Substituting $\boldsymbol{y}$ into \eqref{eq:r9NAn9g56oN1waXrjoV} gives:
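$$\boldsymbol{Ly} = \boldsymbol{b}$$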
We solve this system to obtain $\boldsymbol{y}$, which we then substitute into \eqref{eq:jVmN4hvWsxyeJFPABTe} to solve for $\boldsymbol{x}$.
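To carry out these two steps programmatically, here is a short Python sketch using SciPy's LU routines (the matrix $\boldsymbol{A}$ and vector $\boldsymbol{b}$ below are arbitrary examples, not the system from the text). Note that `lu_factor` additionally performs partial pivoting, recording row swaps in `piv`, so it also handles matrices without a plain $\boldsymbol{LU}$ form:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0],
              [6.0, 8.0]])
b = np.array([3.0, 10.0])

# Factorize A once; lu packs L and U together, piv records row swaps.
lu, piv = lu_factor(A)

# Internally: solve Ly = b by forward substitution,
# then Ux = y by back substitution.
x = lu_solve((lu, piv), b)
assert np.allclose(A @ x, b)
```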
Solving a linear system using LU factorization
Solve the following system using $\boldsymbol{LU}$ factorization:
Solution. The system can be written as $\boldsymbol{Ax}=\boldsymbol{b}$ where:
The first step is to obtain an $\boldsymbol{LU}$ form of $\boldsymbol{A}$ like so:
The corresponding elementary matrix and its inverse are:
By the theorem above, $\boldsymbol{A}$ can be factorized into:
We first solve the system $\boldsymbol{Ly}=\boldsymbol{b}$ for $\boldsymbol{y}$ like so:
We have that $y_1=6$ and $y_2=-4$. We now solve $\boldsymbol{y}=\boldsymbol{Ux}$ for $\boldsymbol{x}$ like so:
We have that $x_2=4$ and $x_1=-5$. Therefore, the solution to $\boldsymbol{Ax}=\boldsymbol{b}$ is:
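$$\boldsymbol{x} = \begin{pmatrix}-5\\4\end{pmatrix}$$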
Benefits of using LU factorization to solve linear systems
You may wonder why we go through the hassle of solving a system of linear equations using $\boldsymbol{LU}$ factorization instead of the more straightforward approach of Gaussian elimination. There are two main reasons:
although $\boldsymbol{LU}$ factorization is cumbersome to perform by hand, a computer implementation of solving a linear system using $\boldsymbol{LU}$ factorization is just as fast as Gaussian elimination.
if we are given multiple linear systems $\boldsymbol{Ax}=\boldsymbol{b}_1$, $\boldsymbol{Ax}=\boldsymbol{b}_2$, $\cdots$, $\boldsymbol{Ax}=\boldsymbol{b}_k$, then solving them becomes much easier once we work out the $\boldsymbol{LU}$ factorization of $\boldsymbol{A}$ at the start, since each system then requires only a pair of cheap triangular solves (see the sketch below). If we were to use Gaussian elimination, we would have to solve each system separately.
For these reasons, $\boldsymbol{LU}$ factorization is commonly used in computer programs to solve linear systems.
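To illustrate the second point, here is an illustrative Python sketch (the sizes and data are arbitrary): the expensive factorization is computed once, and every additional right-hand side costs only two triangular solves:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
bs = [rng.standard_normal(100) for _ in range(5)]

# O(n^3) factorization, performed only once.
lu, piv = lu_factor(A)

# Each solve reuses the factors and costs only O(n^2).
xs = [lu_solve((lu, piv), b) for b in bs]

for b, x in zip(bs, xs):
    assert np.allclose(A @ x, b)
```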