Introduction to Diagonalization

Last updated: Aug 11, 2023

Tags: Linear Algebra
Definition.

Diagonalizable matrix

An $n\times{n}$ matrix $\boldsymbol{A}$ is said to be diagonalizable if there exists an invertible matrix $\boldsymbol{P}$ and a diagonal matrixlink $\boldsymbol{D}$ such that:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Note that $\boldsymbol{P}$ is said to diagonalize $\boldsymbol{A}$. We say that $\boldsymbol{PDP}^{-1}$ is an eigen-decomposition of $\boldsymbol{A}$.

Example.

2x2 diagonalizable matrix

Here's an example of a diagonalizable $2\times2$ matrix:

$$\begin{pmatrix} 2&3\\4&1\\ \end{pmatrix}=\begin{pmatrix} 1&3\\1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\1&-4\\ \end{pmatrix}^{-1}$$
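To check a factorization like this numerically, a short NumPy sketch (not part of this guide's material, just an illustration) can multiply the three matrices back together:

```python
import numpy as np

# Matrices from the example above
A = np.array([[2, 3],
              [4, 1]])
P = np.array([[1, 3],
              [1, -4]])
D = np.diag([5, -2])

# P D P^(-1) should reproduce A
reconstructed = P @ D @ np.linalg.inv(P)
print(reconstructed)                  # [[2. 3.]
                                      #  [4. 1.]]
print(np.allclose(reconstructed, A))  # True
```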
Theorem.

Diagonalizable matrices and the corresponding diagonal matrices are similar

If $\boldsymbol{A}$ is a diagonalizable matrix and $\boldsymbol{D}$ is the corresponding diagonal matrix, then $\boldsymbol{A}$ and $\boldsymbol{D}$ are similarlink.

Proof. Let $\boldsymbol{A}$ be a diagonalizable matrix:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where $\boldsymbol{P}$ is an invertible matrix and $\boldsymbol{D}$ is a diagonal matrix. By definitionlink of similar matrices, $\boldsymbol{A}$ and $\boldsymbol{D}$ are similar.

This means that a diagonalizable matrix and its associated diagonal matrix share the properties of similar matrices, such as having the same determinant, trace, rank and characteristic polynomial (and hence the same eigenvalues).

Theorem.

Diagonalization Theorem

Suppose we have an $n\times{n}$ matrix $\boldsymbol{A}$ that is diagonalizable, which means $\boldsymbol{A}$ can be written as:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Where $\boldsymbol{P}$ is some invertible matrix and $\boldsymbol{D}$ is a diagonal matrix.

The diagonalization theorem states that:

  • $\boldsymbol{D}$ is an $n\times{n}$ diagonal matrix with the eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

  • $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns, where the $i$-th column of $\boldsymbol{P}$ is an eigenvector corresponding to the $i$-th diagonal entry of $\boldsymbol{D}$.

Proof. Suppose we have an $n\times{n}$ diagonalizable matrix $\boldsymbol{A}$ as follows:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Where $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix and $\boldsymbol{D}$ is an $n\times{n}$ diagonal matrix.

Suppose $\boldsymbol{P}$ has columns $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ like so:

$$\boldsymbol{P}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix}$$

Suppose the diagonal matrix $\boldsymbol{D}$ is:

$$\boldsymbol{D}= \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash{\ddots}&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}$$

Where the diagonal entries $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$ are some real numbers. Note that we have defined $\boldsymbol{P}$ and $\boldsymbol{D}$ as above but we haven't imposed any conditions on them - we're simply expressing them as generic matrices.

By theoremlink, the matrix product $\boldsymbol{PD}$ is:

$$\begin{equation}\label{eq:LDiIcKRQr8b5m9eHnuG} \begin{aligned}[b] \boldsymbol{PD} &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}\\ &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{aligned} \end{equation}$$

By theoremlink, the matrix product $\boldsymbol{AP}$ is:

$$\begin{equation}\label{eq:JWfoh7K8qb73Xx4MAR1} \begin{aligned}[b] \boldsymbol{AP} &= \boldsymbol{A} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}\\ &= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{aligned} \end{equation}$$

Now recall that $\boldsymbol{A}=\boldsymbol{PDP}^{-1}$. Multiplying both sides on the right by $\boldsymbol{P}$ gives us $\boldsymbol{AP}=\boldsymbol{PD}$. Using equations \eqref{eq:LDiIcKRQr8b5m9eHnuG} and \eqref{eq:JWfoh7K8qb73Xx4MAR1}, we have that:

$$\begin{align*} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{align*}$$

Equating the columns gives us:

$$\begin{align*} \boldsymbol{Ax}_1&=\lambda_1\boldsymbol{x}_1\\ \boldsymbol{Ax}_2&=\lambda_2\boldsymbol{x}_2\\ &\;\;\vdots\\ \boldsymbol{Ax}_n&=\lambda_n\boldsymbol{x}_n\\ \end{align*}$$

By theoremlink, because $\boldsymbol{P}$ is invertible, its column vectors $\boldsymbol{x}_i$ are non-zero vectors. Therefore, by the definitionlink of eigenvalues and eigenvectors:

  • $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$ must represent the eigenvalues.

  • $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ must represent the corresponding eigenvectors!

This completes the proof.

Example.

Diagonalizing a 2x2 matrix

Diagonalize the following matrix:

$$\boldsymbol{A}=\begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix}$$

That is, find the diagonal matrix $\boldsymbol{D}$ and invertible matrix $\boldsymbol{P}$ such that $\boldsymbol{A}=\boldsymbol{PDP}^{-1}$.

Solution. We will use the diagonalization theorem to diagonalize matrix $\boldsymbol{A}$. We know that the columns of matrix $\boldsymbol{P}$ are the eigenvectors of $\boldsymbol{A}$ whereas the diagonal entries of matrix $\boldsymbol{D}$ are the eigenvalues of $\boldsymbol{A}$.

Therefore, we must compute the eigenvalues and eigenvectors of $\boldsymbol{A}$. We already went through the step-by-step calculations for this specific matrix $\boldsymbol{A}$ in this sectionlink of our guide on eigenvalues and eigenvectors, so we will just skip the calculations here. The eigenvalues of $\boldsymbol{A}$ are as follows:

$$\lambda_1=5,\;\;\; \lambda_2=-2$$

The corresponding eigenvectors are:

$$\boldsymbol{x}_1=\begin{pmatrix} 1\\ 1\\ \end{pmatrix},\;\;\; \boldsymbol{x}_2=\begin{pmatrix} 3\\ -4\\ \end{pmatrix}$$

We can now fully define matrices $\boldsymbol{P}$ and $\boldsymbol{D}$ like so:

$$\boldsymbol{P} =\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix},\;\;\; \boldsymbol{D} =\begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix}$$

Matrix $\boldsymbol{A}$ can therefore be written as:

$$\begin{align*} \boldsymbol{A}&=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1} \\ &=\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1} \end{align*}$$
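If we wanted to reproduce this diagonalization programmatically, one possibility is the short NumPy sketch below. Note that `np.linalg.eig` returns unit-length eigenvectors, so its $\boldsymbol{P}$ differs from ours only by a scaling of each column, which does not affect the factorization:

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# corresponding (unit-length) eigenvectors
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

print(eigenvalues)                               # e.g. [ 5. -2.] (order not guaranteed)
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True
```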
Theorem.

Taking the power of a diagonalized matrix

If $\boldsymbol{A}$ is a diagonalizable $n\times{n}$ matrix, then $\boldsymbol{A}$ raised to the power of $k$ is:

$$\boldsymbol{A}^k=\boldsymbol{P}\boldsymbol{D}^k\boldsymbol{P}^{-1}$$

Proof. Since $\boldsymbol{A}$ is assumed to be a diagonalizable matrix, we have that:

$$\boldsymbol{A}=\boldsymbol{PDP}^{-1}$$

Raising both sides to the power of $k$ yields:

$$\begin{align*} \boldsymbol{A}^k&=(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})^k\\ &=\underbrace{(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})\cdots(\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1})}_{k}\\ &=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\cdots\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\boldsymbol{D}\boldsymbol{I}\boldsymbol{D}\boldsymbol{I}\boldsymbol{D}\boldsymbol{I}\cdots\boldsymbol{I}\boldsymbol{D}\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\underbrace{\boldsymbol{D}\boldsymbol{D}\boldsymbol{D}\cdots\boldsymbol{D}}_k\boldsymbol{P}^{-1}\\ &=\boldsymbol{P}\boldsymbol{D}^k\boldsymbol{P}^{-1}\\ \end{align*}$$

This completes the proof.

Example.

Computing the power of a diagonalizable matrix

Consider the same matrix as earlier:

$$\boldsymbol{A}=\begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix}$$

Compute $\boldsymbol{A}^{3}$.

Solution. Recall that the diagonalized form of $\boldsymbol{A}$ is:

$$\begin{align*} \boldsymbol{A}&=\boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1} \\ &=\begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix} \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1} \end{align*}$$

We now apply the theorem:

$$\begin{align*} \boldsymbol{A}^3&= \boldsymbol{P} \boldsymbol{D}^3 \boldsymbol{P}^{-1}\\ &= \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix} \begin{pmatrix} 5&0\\ 0&-2\\ \end{pmatrix}^3 \begin{pmatrix} 1&3\\ 1&-4\\ \end{pmatrix}^{-1}\\ &= \begin{pmatrix}1&3\\1&-4\\\end{pmatrix} \begin{pmatrix}5^3&0\\0&(-2)^3\\\end{pmatrix} \begin{pmatrix}\frac{4}{7}&\frac{3}{7}\\\frac{1}{7}&-\frac{1}{7}\\\end{pmatrix}\\ &= \begin{pmatrix}68&57\\76&49\\\end{pmatrix} \end{align*}$$

This theorem is particularly useful when computing much higher powers. For instance, suppose we wanted to compute $\boldsymbol{A}^{100}$. Without the theorem, this would involve computing matrix products $99$ times:

$$\boldsymbol{A}^{100} = \underbrace{ \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} \cdots \begin{pmatrix} 2 & 3\\ 4 & 1\\ \end{pmatrix} }_{100}$$

In contrast, using our theorem involves computing the following:

$$\begin{align*} \boldsymbol{A}^{100}&= \boldsymbol{P} \boldsymbol{D}^{100} \boldsymbol{P}^{-1}\\ &= \begin{pmatrix}1&3\\1&-4\\\end{pmatrix} \begin{pmatrix}5^{100}&0\\0&(-2)^{100}\\\end{pmatrix} \begin{pmatrix}\frac{4}{7}&\frac{3}{7}\\\frac{1}{7}&-\frac{1}{7}\\\end{pmatrix} \end{align*}$$

Computing the powers of a few scalars is computationally much cheaper than performing $99$ matrix multiplications!
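As an illustration (again just a sketch, with helper names of our own choosing), the computation $\boldsymbol{A}^k=\boldsymbol{P}\boldsymbol{D}^k\boldsymbol{P}^{-1}$ can be carried out in NumPy by powering the diagonal entries directly:

```python
import numpy as np

# A^k = P D^k P^(-1): powering a diagonal matrix just powers its entries
def power_via_diagonalization(P, eigenvalues, k):
    Dk = np.diag(eigenvalues ** k)
    return P @ Dk @ np.linalg.inv(P)

A = np.array([[2, 3],
              [4, 1]])
P = np.array([[1, 3],
              [1, -4]])
eigenvalues = np.array([5.0, -2.0])

A3 = power_via_diagonalization(P, eigenvalues, 3)
print(A3)                                             # [[68. 57.]
                                                      #  [76. 49.]]
print(np.allclose(A3, np.linalg.matrix_power(A, 3)))  # True
```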

Theorem.

Matrix of size n is diagonalizable if and only if it has n linearly independent eigenvectors

Let $\boldsymbol{A}$ be an $n\times{n}$ matrix. $\boldsymbol{A}$ is diagonalizable if and only if $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

Proof. We start by proving the forward proposition. Assume $\boldsymbol{A}$ is diagonalizable, which means that:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where:

  • $\boldsymbol{D}$ is a diagonal matrix with eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

  • $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns.

Since $\boldsymbol{A}$ is diagonalizable, both $\boldsymbol{D}$ and $\boldsymbol{P}$ exist. Since $\boldsymbol{P}$ is invertible, $\boldsymbol{P}$ has $n$ linearly independent columns by theoremlink. Since the columns of $\boldsymbol{P}$ are the eigenvectors of $\boldsymbol{A}$, we have that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors.

* * *

Let's now prove the converse. We assume that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors $\boldsymbol{x}_1$, $\boldsymbol{x}_2$, $\cdots$, $\boldsymbol{x}_n$ and corresponding eigenvalues $\lambda_1$, $\lambda_2$, $\cdots$, $\lambda_n$. By definitionlink of eigenvectors and eigenvalues, the following equations hold:

$$\begin{equation}\label{eq:x4dH09uGAwai0JZqNxh} \begin{gathered} \boldsymbol{A}\boldsymbol{x}_1 =\lambda_1\boldsymbol{x}_1\\ \boldsymbol{A}\boldsymbol{x}_2 =\lambda_2\boldsymbol{x}_2\\ \vdots\\ \boldsymbol{A}\boldsymbol{x}_n =\lambda_n\boldsymbol{x}_n\\ \end{gathered} \end{equation}$$

In matrix form, we can express this as:

$$\begin{equation}\label{eq:ymonPOqHL1v7Dg4EZ2F} \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix} \end{equation}$$

Note that equating the columns of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} recovers \eqref{eq:x4dH09uGAwai0JZqNxh}. By theoremlink, the left-hand side of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{Ax_1}&\boldsymbol{Ax_2}&\cdots&\boldsymbol{Ax_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \boldsymbol{A}\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}= \boldsymbol{AP}$$

Where $\boldsymbol{P}$ is a matrix whose columns are the $n$ linearly independent eigenvectors.

Next, by theoremlink, the right-hand side of \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as:

$$\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \lambda_1\boldsymbol{x_1}&\lambda_2\boldsymbol{x_2}&\cdots&\lambda_n\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert \end{pmatrix}=\begin{pmatrix} \vert&\vert&\cdots&\vert\\ \boldsymbol{x_1}&\boldsymbol{x_2}&\cdots&\boldsymbol{x_n}\\ \vert&\vert&\cdots&\vert\\ \end{pmatrix} \begin{pmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\smash\ddots&\vdots\\ 0&0&\cdots&\lambda_n\\ \end{pmatrix}= \boldsymbol{PD}$$

Where $\boldsymbol{D}$ is a diagonal matrix whose diagonal entries are the eigenvalues of $\boldsymbol{A}$.

Now, \eqref{eq:ymonPOqHL1v7Dg4EZ2F} can be expressed as $\boldsymbol{AP}=\boldsymbol{PD}$. Since the columns of $\boldsymbol{P}$ are linearly independent, $\boldsymbol{P}$ is invertible by theoremlink. This means that $\boldsymbol{P}^{-1}$ exists and thus:

$$\begin{align*} \boldsymbol{AP}&=\boldsymbol{PD}\\ \boldsymbol{APP}^{-1}&=\boldsymbol{PDP}^{-1}\\ \boldsymbol{AI}_n&=\boldsymbol{PDP}^{-1}\\ \boldsymbol{A}&=\boldsymbol{PDP}^{-1}\\ \end{align*}$$

By definitionlink, $\boldsymbol{A}$ is therefore diagonalizable. This completes the proof.
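This criterion also suggests a simple numerical check. The sketch below (a heuristic for floating-point input, with a tolerance we chose ourselves) tests whether the matrix of eigenvectors returned by NumPy has full rank, that is, whether there are $n$ linearly independent eigenvectors:

```python
import numpy as np

def is_diagonalizable(A, tol=1e-10):
    # A is diagonalizable iff its eigenvector matrix has full rank,
    # i.e. iff A has n linearly independent eigenvectors
    _, P = np.linalg.eig(A)
    return np.linalg.matrix_rank(P, tol=tol) == A.shape[0]

A = np.array([[2, 3],
              [4, 1]])
B = np.array([[1, 1],
              [0, 1]])  # defective: only one linearly independent eigenvector

print(is_diagonalizable(A))  # True
print(is_diagonalizable(B))  # False
```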

Theorem.

Matrix of size n with n distinct eigenvalues is diagonalizable

If $\boldsymbol{A}$ is an $n\times{n}$ matrix with $n$ distinct eigenvalues, then $\boldsymbol{A}$ is diagonalizable.

Proof. By definitionlink of diagonalizable matrices and the diagonalization theoremlink, we know that $\boldsymbol{A}$ is diagonalizable if we can write $\boldsymbol{A}$ as:

$$\boldsymbol{A}= \boldsymbol{P}\boldsymbol{D}\boldsymbol{P}^{-1}$$

Where:

  • $\boldsymbol{D}$ is a diagonal matrix with eigenvalues of $\boldsymbol{A}$ as its diagonal entries.

  • $\boldsymbol{P}$ is an invertible $n\times{n}$ matrix with the eigenvectors of $\boldsymbol{A}$ as its columns.

By theoremlink, if an $n\times{n}$ matrix $\boldsymbol{A}$ has $n$ distinct eigenvalues, then $\boldsymbol{A}$ has $n$ linearly independent eigenvectors. By theoremlink, $\boldsymbol{A}$ is therefore diagonalizable. This completes the proof.

Note that the converse is not necessarily true, that is, a diagonalizable $n\times{n}$ matrix may not have $n$ distinct eigenvalues. For instance, the $2\times2$ identity matrix is diagonalizable (it is already diagonal), yet its only eigenvalue is $\lambda=1$.
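For completeness, here is a small sketch (our own helper, not part of the guide) that checks the distinct-eigenvalue condition numerically; it flags the identity matrix as having a repeated eigenvalue even though the identity matrix is diagonalizable:

```python
import numpy as np

def has_distinct_eigenvalues(A, tol=1e-10):
    # True if no two eigenvalues of A are within tol of each other
    eigenvalues = np.linalg.eigvals(A)
    n = A.shape[0]
    diffs = np.abs(eigenvalues[:, None] - eigenvalues[None, :])
    off_diagonal = ~np.eye(n, dtype=bool)
    return bool(np.all(diffs[off_diagonal] > tol))

A = np.array([[2, 3],
              [4, 1]])   # eigenvalues 5 and -2: distinct, hence diagonalizable
I = np.eye(2)            # diagonalizable (already diagonal) but eigenvalue 1 repeats

print(has_distinct_eigenvalues(A))  # True
print(has_distinct_eigenvalues(I))  # False
```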

Theorem.

Matrix is diagonalizable if and only if it has an eigenbasis

Let $\boldsymbol{A}$ be a square matrix. $\boldsymbol{A}$ is diagonalizable if and only if $\boldsymbol{A}$ has an eigenbasis.

Proof. This theorem is analogous to theoremlink. Let's first prove the forward proposition. Assume an $n\times{n}$ matrix $\boldsymbol{A}$ is diagonalizable. By theoremlink, $\boldsymbol{A}$ has $n$ linearly independent eigenvectors, which form an eigenbasis for $\mathbb{R}^n$.

Let's now prove the converse. Assume that $\boldsymbol{A}$ has an eigenbasis. This means that $\boldsymbol{A}$ has $n$ linearly independent eigenvectors. By theoremlink once again, $\boldsymbol{A}$ is diagonalizable.

This completes the proof.

Published by Isshin Inada