search
Search
Login
Unlock 100+ guides
menu
menu
web
search toc
close
Comments
Log in or sign up
Cancel
Post
account_circle
Profile
exit_to_app
Sign out
What does this mean?
Why is this true?
Give me some examples!
search
keyboard_voice
close
Searching Tips
Search for a recipe:
"Creating a table in MySQL"
Search for an API documentation: "@append"
Search for code: "!dataframe"
Apply a tag filter: "#python"
Useful Shortcuts
/ to open search panel
Esc to close search panel
to navigate between search results
d to clear all current filters
Enter to expand content preview
icon_star
Doc Search
icon_star
Code Search Beta
SORRY NOTHING FOUND!
mic
Start speaking...
Voice search is only supported in Safari and Chrome.
Navigate to
check_circle
Mark as learned
thumb_up
0
thumb_down
0
chat_bubble_outline
0
Comment
auto_stories Bi-column layout
settings

Introduction to Eigenvalues and Eigenvectors

schedule Aug 12, 2023
Last updated
local_offer
Linear Algebra
Tags
mode_heat
Master the mathematics behind data science with 100+ top-tier guides
Start your free 7-days trial now!
Definition.

Eigenvalues and eigenvectors

Eigenvectors are special vectors that can be transformed by some square matrix to become a scalar multiple of themselves, that is:

Ax=λx

Where:

  • A is the given square matrix.

  • x is a non-zero vector referred to as an eigenvector of A.

  • λ is a scalar referred to as an eigenvalue of A.

Given a square matrix and a vector, we can easily verify whether or not the vector is an eigenvector of the matrix. Let's now go through some examples.

Example.

Verifying that a vector is not an eigenvector of a matrix

Suppose we have the following matrix and vector:

A=(1232),x=(21)

Show that x is not an eigenvector of A.

Solution. By definition, for x to be an eigenvector of A, the following equation must be satisfied:

Ax=λx

Where λ is some scalar. Substituting our matrix and vector gives:

(1232)(21)=λ(21)(48)=λ(21)

Clearly, the left vector is not a scalar multiple of the right vector, which means that a λ satisfying the equation does not exist. Therefore, x is not an eigenvector of A.

Example.

Verifying that a vector is an eigenvector of a matrix

Suppose we have the following matrix and vector:

A=(1342),x=(34)

Show that x is an eigenvector of A.

Solution. Again, we check whether Ax=λx holds for some scalar λ like so:

(1342)(34)=λ(34)(1520)=λ(34)

We see that if λ=5, then the equation holds. Therefore, x is an eigenvector of A and the corresponding eigenvalue is λ=5.

Interestingly, A may have other eigenvectors and eigenvalues as well. For instance, consider the same matrix A but a different vector:

A=(1342),x=(11)

Let's check whether x is an eigenvector of A or not:

(1342)(11)=λ(11)(22)=λ(11)

This equation holds if λ=2. Therefore, x is indeed also an eigenvector of A. We will later prove that an n×n matrix can have at most n eigenvalues and corresponding eigenvectors.

As we have just demonstrated, verifying whether or not the given vector is an eigenvector of a matrix is easy. We will later introduce a procedure by which we can derive the eigenvalues and eigenvectors of a given matrix, but let's first go through their geometric intuition.

Geometric intuition behind eigenvalues and eigenvectors

Recall from this sectionlink that multiplying vectors by a scalar λ geometrically means that we either shrink or stretch the vector by a factor of λ. This is illustrated below:

The matrix-vector product Ax can be thought of as a transformation by which A converts x into some other vector. In most cases, Ax results in a vector that has a different direction to x, but eigenvectors are a special type of vectors that do not change direction but instead shrinks or stretches after the transformation.

We will now derive a method to compute the eigenvalues and eigenvectors of given a matrix.

Theorem.

Characteristic equation and polynomial

The characteristic equation of a square matrix A is defined as:

(1)det(AλI)=0

Where:

  • λ is an eigenvalue of A.

  • I is an identity matrix.

The left-hand side of (1) is called the characteristic polynomial of A - often denoted as pA(λ).

Proof. By definition, for a given square matrix A, its eigenvectors x and eigenvalues λ satisfy the following equation:

(2)Ax=λx

Remember, eigenvectors are defined to be non-zero, that is, not all elements in the vector are zero.

Let's rearrange (2) to derive a formula that allows us to compute the eigenvalues:

(3)Ax=λxAxλx=0AxλIx=0(AλI)x=0

We are now going to assume that the matrix AλI is invertible and thus has an inverse (AλI)1. Multiplying this inverse on both sides gives:

(AλI)1(AλI)x=0Ix=0x=0

This means that the eigenvector x equals the zero vector. However, we started out by specifying that the eigenvector must be non-zero. Therefore, our assumption that AIλ is invertible is false. Since AIλ is not invertible, we know from theoremlink that the determinant of AλI is zero:

(4)det(AλI)=0

This equation is called the characteristic equation of A. This completes the proof.

Computing eigenvalues

Observe how there is only one unknown value in (4) - the eigenvalue λ. Therefore, this is the equation that allows us to find the eigenvalues of a given matrix! Let's now go through an example.

Example.

Computing the eigenvalues of a 2x2 matrix

Compute the eigenvalues of the following matrix:

A=(2341)

Solution. The characteristic equation of A is:

det(AλI)=0det[(2341)λ(1001)]=0det[(2λ341λ)]=0

By theoremlink, we can compute the determinant of the 2×2 matrix:

(2λ)(1λ)(3)(4)=022λλ+λ212=0λ23λ10=0(λ5)(λ+2)=0λ=5,2

This means that the eigenvalues of A are 5 and 2. Typically, we label the eigenvalues using a subscript like so:

λ1=5λ2=2

Computing eigenvectors

Now that we have found the eigenvalues of a given matrix A, we must find the corresponding eigenvectors. Recall from (3) that:

(5)(AλI)x=0

Here, x is the eigenvectors that we are trying to find. Remember, (5) holds for eigenvalues λ1 and λ2. Let's substitute λ=λ1 to get:

(6)(Aλ1I)x1=0

Here, we've added a subscript to x1 to indicate that it is an eigenvector corresponding to the eigenvalue λ1. If A is an 2×2 matrix, then the matrix form of (6) is:

(7)[(a11a12a21a22)λ1(1001)](x11x12)=(00)(a11λ1a12a21a22λ1)(x11x12)=(00)

Since A is given, the left matrix in (7) is fully defined.

NOTE

Because det(AλI)=0, we have that AλI is not invertible. By theoremlink, this means that at least one of the rows of AλI contains a row with all zeros. In other words, one of the rows of AλI is redundant.

Let's focus on the first row of (7) - the corresponding linear equation is:

(a11λ1)x11+a12x12=0

The only unknown here is the components of the eigenvectors x11 and x12. This means that we can express x11 in terms of x12 (or vice versa) like so:

(8)x11=a12x12a11λ1

For any value x12, equation (8) automatically determines the value of x11. For instance, if we set x12=1, then x11 becomes:

x11=a12a11λ1

The right-hand side is fully defined and is equal to some scalar value, say c. Therefore, the eigenvector corresponding to eigenvalue λ1 is:

x1=(x11x12)=(c1)

Note that we found this particular eigenvector by setting x12=1. If we had set a different value for x12, then we would end up with a different eigenvector for eigenvalue λ1. This means that there exist infinitely many eigenvectors that correspond to a single eigenvalue!

Now that we have found the eigenvectors x1 corresponding to eigenvalue λ1, we must repeat the same process to find eigenvectors x2 corresponding to eigenvalue λ2. This involves:

  • substituting λ2 into (AλI)x=0.

  • expressing x21 in terms of x22 or vice versa.

  • setting some value for x22 to find x21.

Don't worry if you find this step confusing - we will now go through a concrete example of computing the eigenvectors of a 2×2 matrix.

Example.

Computing eigenvectors of a 2x2 matrix

Find the eigenvectors of the following matrix:

A=(2341)

Note that this is the same matrix as the one in the previous examplelink.

Solution. Recall that the eigenvalues of A were:

λ1=5λ2=2

Since each eigenvalue has corresponding eigenvectors, we need to compute a set of eigenvectors for λ1 and another set for λ2 separately.

When ​λ1=5, we have that:

(Aλ1I)x1=0[(2341)5(1001)](x11x12)=(00)(253415)(x11x12)=(00)(3344)(x11x12)=(00)

Notice how dividing the top row by 3 and dividing the bottom row by 4 would make the rows identical. This means that one of the rows is redundant. Again, this is guaranteed to happen because AλI is not invertible and by theoremlink, we have that one of the rows is redundant.

Let's now focus on the top row:

3x11+3x12=03x11=3x12x11=x12

This tells us that components of eigenvector x1 must be equal. As a general rule of thumb, we normally pick simple values such as 1 like so:

x1=(x11x12)=(11)

Note that because eigenvectors are defined to be non-zero vectors, we cannot set x11=0, which leads to x12=0.

Now that we've computed an eigenvector for λ1, let's find an eigenvector for λ2. We repeat the same process:

(Aλ2I)x2=0(2(2)341(2))(x21x22)=(00)(4343)(x21x22)=(00)

Taking the top row gives:

4x21+3x22=03x22=4x21x22=43x21

To avoid fractions, let's pick x21=3. This means that the eigenvector x2 is:

x2=(x21x22)=(34)

To summarise our results, an eigenvector corresponding to the eigenvalue λ1=5 is:

x1=(11)

An eigenvector corresponding to the eigenvalue λ2=2 is:

x2=(34)

Again, keep in mind that there is an infinite number of eigenvectors and that these are just one of them.

We have so far covered examples of computing eigenvalues and eigenvectors of 2×2 matrices. The same logic applies to larger square matrices except that we typically use Gaussian elimination to solve for the eigenvectors. Let's go through a 3×3 case.

Example.

Computing eigenvalues and eigenvectors of a 3x3 matrix

Compute the eigenvalues and eigenvectors of the following matrix:

A=(231103332)

Solution. The flow is to find the eigenvalues first and then find a corresponding eigenvector for each eigenvalue.

Computing eigenvalues

The first step is to find the eigenvalues λ of A using the characteristic equation of A, that is:

det(AλI)=0

In matrix form, this translates to:

|2λ3110λ3332λ|=0

From our guide on determinants, we know how to compute the determinant of a 3×3 matrix:

(2λ)|λ332λ|3|1332λ|+|1λ33|=0(2λ)(λ)(2λ)+93(2λ)9+(3+3λ)=0(2λ)(λ22λ+9)3(λ7)+(3λ3)=02λ2+4λ18λ3+2λ29λ+3λ+21+3λ3=0λ3+λ=0λ(λ21)=0λ(λ+1)(λ1)=0λ=0,1,1

We now compute the eigenvectors for each of these eigenvalues.

Computing 1st eigenvector

Recall that to compute the eigenvectors of an eigenvalue, we use the following equation:

(AλI)x=0

When λ=0, we have that:

(9)(203110033320)(x1x2x3)=(000)(231103332)(x1x2x3)=(000)

Let's perform Gaussian elimination to solve the system:

(10)(231103332)(103231332)(103037037)(103037000)

Because the last row is all zeros, we have that x3 is a free variable, which means that we are free to give it any value.

The second row of (10) gives us:

(11)3x2+7x3=0x2=73x3

Finally, the first row of (10) gives us:

(12)x1+3x3=0x1=3x3

Therefore, the eigenvector corresponding to eigenvalue λ=0 is:

x1=(x1x2x3)=(3x373x3x3)

Just as we did for the 2×2 case, we can pick any value for x3 and the values of the other variables are automatically determined. To avoid fractions, let's set x3=3, which gives us the following eigenvector:

x1=(973)

We've managed to find an eigenvector corresponding to the first eigenvalue! To find the eigenvectors corresponding to the other eigenvalues, we just repeat the same process.

Computing 2nd eigenvector

When λ=1, we have that:

(213110133321)(x1x2x3)=(000)(331113331)(x1x2x3)=(000)

Performing Gaussian elimination gives:

(331113331)(113331331)(1130010002)(113001000)

From the second row, we have that x3=0. Substituting this into the first row gives:

x1x2=0x1=x2

Therefore, the eigenvector corresponding to the second eigenvalue is:

x2=(x1x2x3)=(x1x10)

For simplicity, let's set x1=1. This gives us the eigenvector:

x2=(110)

Now on to the eigenvectors of the 3rd eigenvalue!

Computing thse 3rd eigenvector
expand_more

Once again, we first obtain the system of linear equations:

(2(1)3110(1)3332(1))(x1x2x3)=(000)(131113333)(x1x2x3)=(000)

Performing Gaussian elimination gives:

(131113333)(113131333)(113044066)(113011011)(113011000)(102011000)

Again, because the third row contains all zeros, x3 is free to vary. The second row gives us:

x2=x3

The first row gives us:

x1=2x3

Therefore, the general form of the eigenvector is:

x3=(x1x2x3)=(2x3x3x3)

Let's set x3=1 to obtain a particular eigenvector:

x3=(211)

Summary of results

The eigenvalues of our matrix A are:

λ1=0,λ2=1,λ3=1

The corresponding eigenvectors are:

x1=(973),x2=(110),x3=(211)

Again, be reminded there are infinitely many eigenvectors that correspond to a single eigenvalue - we just chose simple eigenvectors here.

What is the importance of eigenvalues and eigenvectors in machine learning?

Eigenvalues and eigenvectors crop up in several machine learning topics such as:

  • principal component analysis (PCA) - a popular technique for dimensionality reduction. It turns out that projecting data points onto the eigenvectors of the variance-covariance matrix of the features preserves the most amount of information.

  • spectral clustering - a clustering technique that can handle data points with a non-convex layout. Similar to PCA, eigenvalues and eigenvectors are computed to perform dimensionality reduction before the actual clustering takes place.

robocat
Published by Isshin Inada
Edited by 0 others
Did you find this page useful?
thumb_up
thumb_down
Comment
Citation
Ask a question or leave a feedback...
Cookie Policy
close
By using our site, you acknowledge that you agree to our Privacy Policy and Terms and Conditions.