Comprehensive Guide on Gram-Schmidt Process

Last updated: Aug 10, 2023
Tags: Linear Algebra

Before we dive into the Gram-Schmidt process, let's go through a quick example that demonstrates how we can construct an orthogonal basis given any two basis vectors for $\mathbb{R}^2$.

Example.

Constructing an orthogonal basis using projections

Consider the following vectors:

$$\boldsymbol{x}_1= \begin{pmatrix} 1\\2 \end{pmatrix},\;\;\;\;\; \boldsymbol{x}_2= \begin{pmatrix}2\\2 \end{pmatrix}$$

Construct an orthogonal basis for $\mathbb{R}^2$ using $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$.

Solution. Let's start by visualizing the two vectors:

We can clearly see that the two vectors are not perpendicular to each other, which means that they do not form an orthogonal basis for $\mathbb{R}^2$. We can fix one vector, say $\boldsymbol{x}_1$, and modify the other vector $\boldsymbol{x}_2$ such that $\boldsymbol{x}_2$ becomes perpendicular to $\boldsymbol{x}_1$.

Let's set $\boldsymbol{v}_1=\boldsymbol{x}_1$; our goal is then to find a vector $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{v}_1$. Recall that $\boldsymbol{v}_2$ can be constructed by subtracting from $\boldsymbol{x}_2$ its projection onto $\boldsymbol{v}_1$, as shown below:

From the geometric interpretation of vector addition, we have the following relationship, where $L$ is the line spanned by $\boldsymbol{v}_1$:

$$\begin{equation}\label{eq:PsFQHOCQvWAd02tK9IA} \boldsymbol{v}_2= \boldsymbol{x}_2-\mathrm{Proj}_{L}(\boldsymbol{x}_2) \end{equation}$$

From the formula for projecting a vector onto a line, we know that:

$$\begin{equation}\label{eq:WnFx9a9ZbZWJGQEa9UF} \text{Proj}_L(\boldsymbol{x}_2)= \frac{\boldsymbol{x}_2\cdot\boldsymbol{v}_1} {\boldsymbol{v}_1\cdot\boldsymbol{v}_1}\boldsymbol{v}_1 \end{equation}$$

Substituting \eqref{eq:WnFx9a9ZbZWJGQEa9UF} into \eqref{eq:PsFQHOCQvWAd02tK9IA} gives:

$$\begin{align*} \boldsymbol{v}_2&= \boldsymbol{x}_2- \frac{\boldsymbol{x}_2\cdot\boldsymbol{v}_1} {\boldsymbol{v}_1\cdot\boldsymbol{v}_1}\boldsymbol{v}_1\\ &= \begin{pmatrix}2\\2\end{pmatrix}- \frac{(2)(1)+(2)(2)} {(1)(1)+(2)(2)}\begin{pmatrix}1\\2\end{pmatrix}\\ &= \begin{pmatrix}2\\2\end{pmatrix}- \frac{6} {5}\begin{pmatrix}1\\2\end{pmatrix}\\ &= \begin{pmatrix}0.8\\-0.4\end{pmatrix} \end{align*}$$

We can quickly check that $\boldsymbol{v}_1\cdot\boldsymbol{v}_2=(1)(0.8)+(2)(-0.4)=0$, so the two vectors are indeed perpendicular. Therefore, an orthogonal basis for $\mathbb{R}^2$ is:

$$\boldsymbol{v}_1= \begin{pmatrix} 1\\2 \end{pmatrix},\;\;\;\;\; \boldsymbol{v}_2= \begin{pmatrix}0.8\\-0.4 \end{pmatrix}$$
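
To double-check the arithmetic, here is a minimal NumPy snippet that reproduces the computation above (the variable names are ours, not from the original):

```python
import numpy as np

x1 = np.array([1.0, 2.0])
x2 = np.array([2.0, 2.0])

v1 = x1
# subtract from x2 its projection onto v1
v2 = x2 - (x2 @ v1) / (v1 @ v1) * v1

print(v2)       # [ 0.8 -0.4]
print(v1 @ v2)  # ~0 (up to floating-point round-off), so v1 and v2 are orthogonal
```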

The Gram-Schmidt process is a generalization of what we just performed to higher dimensions. To derive the Gram-Schmidt process, we require the following theorem.

Theorem.

Expressing the projection vector using an orthogonal basis

Let $W$ be a finite-dimensional subspace of a vector space $V$. If $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ is an orthogonal basis for $W$ and $\boldsymbol{v}$ is any vector in $V$, then:

$$\mathrm{Proj}_{W}(\boldsymbol{v})= \frac{\boldsymbol{v}\cdot\boldsymbol{v}_1}{\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1 + \frac{\boldsymbol{v}\cdot\boldsymbol{v}_2}{\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2 + \cdots + \frac{\boldsymbol{v}\cdot\boldsymbol{v}_r}{\Vert\boldsymbol{v}_r\Vert^2}\boldsymbol{v}_r $$

If $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ is an orthonormal basis for $W$, then:

$$\mathrm{Proj}_{W}(\boldsymbol{v})= (\boldsymbol{v}\cdot\boldsymbol{v}_1)\boldsymbol{v}_1 + (\boldsymbol{v}\cdot\boldsymbol{v}_2)\boldsymbol{v}_2 + \cdots + (\boldsymbol{v}\cdot\boldsymbol{v}_r)\boldsymbol{v}_r $$

Proof. Let $W$ be a finite-dimensional subspace of a vector space $V$ and let $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ be an orthogonal basis for $W$. By a previous theorem, any vector $\boldsymbol{v}\in{V}$ can be expressed as a sum of two vectors that are orthogonal to each other, that is, $\boldsymbol{v}=\boldsymbol{w}_1+\boldsymbol{w}_2$ where $\boldsymbol{w}_1\in{W}$ and $\boldsymbol{w}_2\in{W}^\perp$. Let $\boldsymbol{w}_1=\mathrm{Proj}_W(\boldsymbol{v})$. Since $\boldsymbol{w}_1\in{W}$ and $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ is an orthogonal basis for $W$, another previous theorem lets us express $\boldsymbol{w}_1$ as:

$$\begin{equation}\label{eq:htY2XMEdsQyceWOhHP7} \boldsymbol{w}_1= \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_1}{\Vert{\boldsymbol{v}_1}\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_2}{\Vert{\boldsymbol{v}_2}\Vert^2}\boldsymbol{v}_2+ \cdots+ \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_r}{\Vert{\boldsymbol{v}_r}\Vert^2}\boldsymbol{v}_r \end{equation}$$

Since $\boldsymbol{w}_2$ is orthogonal to the subspace $W$, we have that $\boldsymbol{w}_2$ is orthogonal to every vector of $W$, including the basis vectors $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ of $W$. Because the dot product of two perpendicular vectors is zero:

$$\begin{gather*} \boldsymbol{w}_2\cdot\boldsymbol{v}_1=0\\ \boldsymbol{w}_2\cdot\boldsymbol{v}_2=0\\ \vdots\\ \boldsymbol{w}_2\cdot\boldsymbol{v}_r=0\\ \end{gather*}$$

Since these dot products are all zero, we can add them to the numerators in \eqref{eq:htY2XMEdsQyceWOhHP7} without changing anything:

$$\begin{align*} \boldsymbol{w}_1&= \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_1+\boldsymbol{w}_2\cdot\boldsymbol{v}_1}{\Vert{\boldsymbol{v}_1}\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_2+\boldsymbol{w}_2\cdot\boldsymbol{v}_2}{\Vert{\boldsymbol{v}_2}\Vert^2}\boldsymbol{v}_2+ \cdots+ \frac{\boldsymbol{w}_1\cdot\boldsymbol{v}_r+\boldsymbol{w}_2\cdot\boldsymbol{v}_r}{\Vert{\boldsymbol{v}_r}\Vert^2}\boldsymbol{v}_r\\ &=\frac{(\boldsymbol{w}_1+\boldsymbol{w}_2)\cdot\boldsymbol{v}_1}{\Vert{\boldsymbol{v}_1}\Vert^2}\boldsymbol{v}_1+ \frac{(\boldsymbol{w}_1+\boldsymbol{w}_2)\cdot\boldsymbol{v}_2}{\Vert{\boldsymbol{v}_2}\Vert^2}\boldsymbol{v}_2+ \cdots+ \frac{(\boldsymbol{w}_1+\boldsymbol{w}_2)\cdot\boldsymbol{v}_r}{\Vert{\boldsymbol{v}_r}\Vert^2}\boldsymbol{v}_r\\ \end{align*}$$

Now, because $\boldsymbol{v}=\boldsymbol{w}_1+\boldsymbol{w}_2$, this reduces to:

$$\begin{equation}\label{eq:PCh2udaAc6cnzmwjYVC} \boldsymbol{w}_1 =\frac{\boldsymbol{v}\cdot\boldsymbol{v}_1}{\Vert{\boldsymbol{v}_1}\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{v}\cdot\boldsymbol{v}_2}{\Vert{\boldsymbol{v}_2}\Vert^2}\boldsymbol{v}_2+ \cdots+ \frac{\boldsymbol{v}\cdot\boldsymbol{v}_r}{\Vert{\boldsymbol{v}_r}\Vert^2}\boldsymbol{v}_r\\ \end{equation}$$

Note that if $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ is an orthonormal basis, then the length of each basis vector is one. Therefore, \eqref{eq:PCh2udaAc6cnzmwjYVC} reduces to:

$$\boldsymbol{w}_1 =(\boldsymbol{v}\cdot\boldsymbol{v}_1)\boldsymbol{v}_1+ (\boldsymbol{v}\cdot\boldsymbol{v}_2)\boldsymbol{v}_2+ \cdots+ (\boldsymbol{v}\cdot\boldsymbol{v}_r)\boldsymbol{v}_r$$

This completes the proof.
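
As an aside, the projection formula above is straightforward to compute numerically. Below is a minimal NumPy sketch, assuming the orthogonal basis is given as a list of vectors (the function name `project_onto_subspace` and the example vectors are ours):

```python
import numpy as np

def project_onto_subspace(v, orthogonal_basis):
    # Proj_W(v) = sum over i of (v . v_i) / ||v_i||^2 * v_i
    return sum((v @ b) / (b @ b) * b for b in orthogonal_basis)

# orthogonal basis for the xy-plane in R^3 (illustrative example)
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v = np.array([3.0, 5.0, 7.0])

print(project_onto_subspace(v, [v1, v2]))  # [3. 5. 0.]
```

If the basis is orthonormal, each denominator $\boldsymbol{v}_i\cdot\boldsymbol{v}_i$ equals one, matching the second formula in the theorem.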

Theorem.

Gram-Schmidt process

Given a basis $\{\boldsymbol{w}_1,\boldsymbol{w}_2,\cdots,\boldsymbol{w}_r\}$ for the subspace $W$ of $\mathbb{R}^n$, we can construct an orthogonal basis $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ for $W$ where:

$$\begin{align*} \boldsymbol{v}_1&=\boldsymbol{w}_1\\ \boldsymbol{v}_2 &=\boldsymbol{w}_2- \frac{\boldsymbol{w}_2\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1\\ \boldsymbol{v}_3&=\boldsymbol{w}_3- \Big(\frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_2}{\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2\Big)\\ \boldsymbol{v}_4&=\boldsymbol{w}_4- \Big(\frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_3} {\Vert\boldsymbol{v}_3\Vert^2}\boldsymbol{v}_3 \Big)\\\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\\\ \boldsymbol{v}_r&=\boldsymbol{w}_r- \Big(\frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_3} {\Vert\boldsymbol{v}_3\Vert^2}\boldsymbol{v}_3+\cdots+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_{r-1}} {\Vert\boldsymbol{v}_{r-1}\Vert^2}\boldsymbol{v}_{r-1} \Big) \end{align*}$$

Note that we can easily convert any orthogonal basis into an orthonormal basis by normalizing each vector, that is, by dividing each vector by its length.
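
The recursion above translates almost line-for-line into code. Here is a minimal NumPy sketch (the function name `gram_schmidt` is ours), applied to the two vectors from the earlier $\mathbb{R}^2$ example:

```python
import numpy as np

def gram_schmidt(ws):
    """Given linearly independent vectors w_1, ..., w_r, return an
    orthogonal basis v_1, ..., v_r via the Gram-Schmidt process."""
    vs = []
    for w in ws:
        # subtract from w its projection onto each previously constructed v
        v = w - sum((w @ u) / (u @ u) * u for u in vs)
        vs.append(v)
    return vs

w1 = np.array([1.0, 2.0])
w2 = np.array([2.0, 2.0])
print(gram_schmidt([w1, w2]))  # [array([1., 2.]), array([ 0.8, -0.4])]
```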

Proof. Let $W$ be any subspace of a vector space $V$ and let $\{\boldsymbol{w}_1,\boldsymbol{w}_2,\cdots,\boldsymbol{w}_r\}$ be a basis for $W$.

We first let $\boldsymbol{v}_1=\boldsymbol{w}_1$. Let $W_1$ be the subspace spanned by $\boldsymbol{w}_1$, which means that $\{\boldsymbol{w}_1\}$ is the basis for $W_1$. Since $W_1$ is spanned by a single basis vector, $W_1$ is simply a line. We can construct $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{v}_1$ using $\boldsymbol{w}_2$ like so:

$$\begin{equation}\label{eq:DlntcNN5PCsJKfkc7Sz} \boldsymbol{v}_2= \boldsymbol{w}_2-\mathrm{Proj}_{W_1}(\boldsymbol{w}_2) \end{equation}$$

Visually, the relationship between the vectors is:

By the theorem above, we can express the projection as:

$$\begin{equation}\label{eq:NOHA67ECOblfxSXhX0m} \mathrm{Proj}_{W_1}(\boldsymbol{w}_2)= \frac{\boldsymbol{w}_2\cdot\boldsymbol{v}_1}{\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1 \end{equation}$$

Substituting \eqref{eq:NOHA67ECOblfxSXhX0m} into \eqref{eq:DlntcNN5PCsJKfkc7Sz} gives:

$$\boldsymbol{v}_2 =\boldsymbol{w}_2- \frac{\boldsymbol{w}_2\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1$$

Great, we've managed to find $\boldsymbol{v}_2$ that is orthogonal to $\boldsymbol{v}_1$. Next, to construct $\boldsymbol{v}_3$ that is orthogonal to both $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$ using $\boldsymbol{w}_3$, we look for a vector orthogonal to the subspace $W_2$ spanned by $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$. Since $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$ are linearly independent and span $W_2$, they form a basis for $W_2$. Note that, unlike $W_1$, which was a line, $W_2$ is a plane.

Now, $\boldsymbol{v}_3$ can be computed by:

$$\begin{equation}\label{eq:C263WvfQPmXAvC5J3zj} \boldsymbol{v}_3= \boldsymbol{w}_3-\mathrm{Proj}_{W_2}(\boldsymbol{w}_3) \end{equation}$$

Visually, the relationship between the vectors is:

By the theorem above, because $\{\boldsymbol{v}_1,\boldsymbol{v}_2\}$ is an orthogonal basis for $W_2$, we know that $\mathrm{Proj}_{W_2}(\boldsymbol{w}_3)\in{W_2}$ can be written as:

$$\mathrm{Proj}_{W_2}(\boldsymbol{w}_3)= \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2$$

Therefore, \eqref{eq:C263WvfQPmXAvC5J3zj} becomes:

$$\begin{align*} \boldsymbol{v}_3 &=\boldsymbol{w}_3- \Big(\frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2 \Big) \end{align*}$$

We've now managed to find three vectors $\boldsymbol{v}_1$, $\boldsymbol{v}_2$ and $\boldsymbol{v}_3$ that are mutually orthogonal! Now, $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}$ forms a basis for the subspace $W_3$. Similarly, to construct $\boldsymbol{v}_4$ that is orthogonal to these vectors using $\boldsymbol{w}_4$, we compute:

$$\begin{align*} \boldsymbol{v}_4 &=\boldsymbol{w}_4-\mathrm{Proj}_{W_3}(\boldsymbol{w}_4)\\ &=\boldsymbol{w}_4- \Big(\frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_3} {\Vert\boldsymbol{v}_3\Vert^2}\boldsymbol{v}_3 \Big) \end{align*}$$

We keep on repeating this process until we end up with an orthogonal basis for $W$. Remember, if $W$ originally had $r$ basis vectors, then we must continue this process until we get $r$ orthogonal basis vectors. This completes the proof.

Example.

Constructing an orthogonal basis using the Gram-Schmidt process

Consider the basis $\{\boldsymbol{w}_1,\boldsymbol{w}_2,\boldsymbol{w}_3\}$ for $\mathbb{R}^3$ where:

$$\boldsymbol{w}_1= \begin{pmatrix} 1\\1\\0 \end{pmatrix},\;\;\;\;\; \boldsymbol{w}_2= \begin{pmatrix} 1\\0\\1 \end{pmatrix},\;\;\;\;\; \boldsymbol{w}_3= \begin{pmatrix} 0\\1\\1 \end{pmatrix}$$

Use the Gram-Schmidt process to construct an orthogonal basis for $\mathbb{R}^3$.

Solution. Let $\boldsymbol{v}_1=\boldsymbol{w}_1$. Next, $\boldsymbol{v}_2$, which is orthogonal to $\boldsymbol{v}_1$, can be constructed by:

$$\begin{align*} \boldsymbol{v}_2 &=\boldsymbol{w}_2- \frac{\boldsymbol{w}_2\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1\\ &=\begin{pmatrix}1\\0\\1\end{pmatrix}- \frac{(1)(1)+(0)(1)+(1)(0)} {(1)^2+(1)^2+(0)^2}\begin{pmatrix}1\\1\\0\end{pmatrix}\\ &=\begin{pmatrix}1\\0\\1\end{pmatrix}- \frac{1} {2}\begin{pmatrix}1\\1\\0\end{pmatrix}\\ &=\begin{pmatrix}0.5\\-0.5\\1\end{pmatrix} \end{align*}$$

Finally, $\boldsymbol{v}_3$, which is orthogonal to both $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$, can be constructed by:

$$\begin{align*} \boldsymbol{v}_3&=\boldsymbol{w}_3- \Big(\frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_2}{\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2\Big)\\ &=\begin{pmatrix}0\\1\\1\end{pmatrix}- \left(\frac{(0)(1)+(1)(1)+(1)(0)}{(1)^2+(1)^2+(0)^2}\begin{pmatrix}1\\1\\0\end{pmatrix} +\frac{(0)(0.5)+(1)(-0.5)+(1)(1)}{(0.5)^2+(-0.5)^2+(1)^2}\begin{pmatrix}0.5\\-0.5\\1\end{pmatrix}\right)\\ &=\begin{pmatrix}0\\1\\1\end{pmatrix}- \left(\frac{1}{2}\begin{pmatrix}1\\1\\0\end{pmatrix} +\frac{0.5}{1.5}\begin{pmatrix}0.5\\-0.5\\1\end{pmatrix}\right)\\ &=\begin{pmatrix}-2/3\\2/3\\2/3\end{pmatrix} \end{align*}$$

By the Gram-Schmidt process, we have managed to construct an orthogonal basis $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}$ for $\mathbb{R}^3$ where:

$$\boldsymbol{v}_1= \begin{pmatrix} 1\\1\\0 \end{pmatrix},\;\;\;\;\; \boldsymbol{v}_2= \begin{pmatrix} 0.5\\-0.5\\1 \end{pmatrix},\;\;\;\;\; \boldsymbol{v}_3= \begin{pmatrix} -2/3\\2/3\\2/3 \end{pmatrix}$$

Let's confirm that $\boldsymbol{v}_1$, $\boldsymbol{v}_2$ and $\boldsymbol{v}_3$ are indeed orthogonal:

$$\begin{align*} \boldsymbol{v}_1\cdot\boldsymbol{v}_2&= \begin{pmatrix}1\\1\\0\end{pmatrix}\cdot \begin{pmatrix}0.5\\-0.5\\1\end{pmatrix}\\ &=(1)(0.5)+(1)(-0.5)+(0)(1)\\&=0\\\\ \boldsymbol{v}_1\cdot\boldsymbol{v}_3&= \begin{pmatrix}1\\1\\0\end{pmatrix}\cdot \begin{pmatrix}-2/3\\2/3\\2/3\end{pmatrix}\\ &=(1)(-2/3)+(1)(2/3)+(0)(2/3)\\&=0\\\\ \boldsymbol{v}_2\cdot\boldsymbol{v}_3&= \begin{pmatrix}0.5\\-0.5\\1\end{pmatrix}\cdot \begin{pmatrix}-2/3\\2/3\\2/3\end{pmatrix}\\ &=(0.5)(-2/3)+(-0.5)(2/3)+(1)(2/3)\\ &=0 \end{align*}$$

Indeed, $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}$ is an orthogonal basis for $\mathbb{R}^3$.
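
We can also verify this numerically with NumPy (a quick sanity check, not part of the original solution):

```python
import numpy as np

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([0.5, -0.5, 1.0])
v3 = np.array([-2/3, 2/3, 2/3])

# each pairwise dot product should be ~0 (up to floating-point round-off)
print(v1 @ v2, v1 @ v3, v2 @ v3)
```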

There are two ways to obtain an orthonormal basis using the Gram-Schmidt process:

  • at the end, convert each orthogonal basis vector into a unit vector by normalizing it.

  • at every step, convert the newly constructed orthogonal basis vector into a unit vector.

If we take the latter approach, the Gram-Schmidt process can be expressed as below.

Theorem.

Constructing an orthonormal basis using the Gram-Schmidt process

Given a basis $\{\boldsymbol{w}_1, \boldsymbol{w}_2,\cdots, \boldsymbol{w}_r\}$, we can construct an orthogonal basis $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ using the Gram-Schmidt process. If $\boldsymbol{q}_i$ is the unit vector obtained by normalizing $\boldsymbol{v}_i$, for $i=1,2,\cdots,r$, then the equations for the Gram-Schmidt process can be simplified as:

$$\begin{align*} \boldsymbol{v}_1&=\boldsymbol{w}_1\\ \boldsymbol{v}_2&=\boldsymbol{w}_2- (\boldsymbol{w}_2\cdot\boldsymbol{q}_1)\boldsymbol{q}_1\\ \boldsymbol{v}_3&=\boldsymbol{w}_3- \Big((\boldsymbol{w}_3\cdot\boldsymbol{q}_1)\boldsymbol{q}_1+ (\boldsymbol{w}_3\cdot\boldsymbol{q}_2)\boldsymbol{q}_2\Big)\\ \boldsymbol{v}_4&=\boldsymbol{w}_4- \Big((\boldsymbol{w}_4\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_4\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+ (\boldsymbol{w}_4\cdot\boldsymbol{q}_3)\boldsymbol{q}_3 \Big)\\\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\\\ \boldsymbol{v}_r&=\boldsymbol{w}_r- \Big((\boldsymbol{w}_r\cdot\boldsymbol{q}_1)\boldsymbol{q}_1+ (\boldsymbol{w}_r\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+ (\boldsymbol{w}_r\cdot\boldsymbol{q}_3)\boldsymbol{q}_3+\cdots+ (\boldsymbol{w}_r\cdot\boldsymbol{q}_{r-1}) \boldsymbol{q}_{r-1} \Big) \end{align*}$$

Note that the set $\{\boldsymbol{q}_1, \boldsymbol{q}_2,\cdots, \boldsymbol{q}_r\}$ is an orthonormal basis by definition.
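
In code, this normalize-as-you-go variant might look like the following minimal NumPy sketch (the function name `gram_schmidt_orthonormal` is ours), shown here on the basis from the earlier $\mathbb{R}^3$ example:

```python
import numpy as np

def gram_schmidt_orthonormal(ws):
    """Return an orthonormal basis q_1, ..., q_r, normalizing at every step."""
    qs = []
    for w in ws:
        # subtract the components of w along the already-built unit vectors
        v = w - sum((w @ q) * q for q in qs)
        qs.append(v / np.linalg.norm(v))
    return qs

ws = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
for q in gram_schmidt_orthonormal(ws):
    print(q)
```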

Proof. Recall the Gram-Schmidt process used for constructing an orthogonal basis:

$$\begin{align*} \boldsymbol{v}_1&=\boldsymbol{w}_1\\ \boldsymbol{v}_2 &=\boldsymbol{w}_2- \frac{\boldsymbol{w}_2\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1\\ \boldsymbol{v}_3&=\boldsymbol{w}_3- \Big(\frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_3\cdot\boldsymbol{v}_2}{\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2\Big)\\ \boldsymbol{v}_4&=\boldsymbol{w}_4- \Big(\frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2+ \frac{\boldsymbol{w}_4\cdot\boldsymbol{v}_3} {\Vert\boldsymbol{v}_3\Vert^2}\boldsymbol{v}_3 \Big)\\\\ &\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\vdots\\\\ \boldsymbol{v}_r&=\boldsymbol{w}_r- \Big(\frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_1} {\Vert\boldsymbol{v}_1\Vert^2}\boldsymbol{v}_1+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_2} {\Vert\boldsymbol{v}_2\Vert^2}\boldsymbol{v}_2+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_3} {\Vert\boldsymbol{v}_3\Vert^2}\boldsymbol{v}_3+\cdots+ \frac{\boldsymbol{w}_r\cdot\boldsymbol{v}_{r-1}} {\Vert\boldsymbol{v}_{r-1}\Vert^2}\boldsymbol{v}_{r-1} \Big) \end{align*}$$

Where $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_r\}$ is an orthogonal basis. At step $1$, we can convert $\boldsymbol{v}_1$ into a unit vector $\boldsymbol{q}_1$ by normalizing it. Because $\Vert\boldsymbol{q}_1\Vert=1$, step $2$ can now be simplified to:

$$\boldsymbol{v}_2= \boldsymbol{w}_2- (\boldsymbol{w}_2\cdot\boldsymbol{q}_1)\boldsymbol{q}_1$$

We now convert $\boldsymbol{v}_2$ into a unit vector $\boldsymbol{q}_2$, which gives us an orthonormal set $\{\boldsymbol{q}_1, \boldsymbol{q}_2\}$. Step $3$ can be written as:

$$\boldsymbol{v}_3= \boldsymbol{w}_3- \Big((\boldsymbol{w}_3\cdot\boldsymbol{q}_1)\boldsymbol{q}_1+ (\boldsymbol{w}_3\cdot\boldsymbol{q}_2)\boldsymbol{q}_2\Big)$$

For step $r$, we have the following:

$$\boldsymbol{v}_r= \boldsymbol{w}_r- \Big((\boldsymbol{w}_r\cdot\boldsymbol{q}_1)\boldsymbol{q}_1+ ( \boldsymbol{w}_r\cdot\boldsymbol{q}_2)\boldsymbol{q}_2 +\cdots+ (\boldsymbol{w}_r\cdot\boldsymbol{q}_{r-1})\boldsymbol{q}_{r-1} \Big)$$

Note that to obtain a complete orthonormal basis, we will have to convert $\boldsymbol{v}_r$ into a unit vector as well. This completes the proof.

Theorem.

Dot product of an original basis vector and the corresponding orthonormal basis vector

Let $\{\boldsymbol{w}_1,\boldsymbol{w}_2, \cdots,\boldsymbol{w}_r\}$ be any basis for a subspace $W$. Let $\{\boldsymbol{v}_1, \boldsymbol{v}_2,\cdots, \boldsymbol{v}_r\}$ be an orthogonal basis constructed using the Gram-Schmidt process and let $\{\boldsymbol{q}_1,\boldsymbol{q}_2,\cdots,\boldsymbol{q}_r\}$ be the corresponding orthonormal basis.

For $i=1,2,\cdots,r$, we have that:

$$\boldsymbol{w}_i\cdot\boldsymbol{q}_i=\Vert\boldsymbol{v}_i\Vert$$

Proof. Suppose we use the variant of the Gram-Schmidt process in which we build up an orthonormal basis at every step. The $i$-th orthogonal basis vector is constructed by:

$$\begin{align*} \boldsymbol{v}_i &=\boldsymbol{w}_i- \Big((\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1}\Big) \end{align*}$$

Multiplying both sides by $\frac{1}{\Vert\boldsymbol{v}_i\Vert}$ gives:

$$\begin{align*} \frac{1}{\Vert\boldsymbol{v}_i\Vert}\boldsymbol{v}_i &=\frac{1}{\Vert\boldsymbol{v}_i\Vert} \Big(\boldsymbol{w}_i- \big((\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1}\big) \Big) \end{align*}$$

The left-hand side is equal to the unit vector $\boldsymbol{q}_i$ by definition, and so:

$$\begin{align*} \boldsymbol{q}_i &=\frac{1}{\Vert\boldsymbol{v}_i\Vert} \Big(\boldsymbol{w}_i- \big((\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1}\big) \Big)\\ \Vert\boldsymbol{v}_i\Vert\boldsymbol{q}_i &=\boldsymbol{w}_i- \big((\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1}\big)\\ \boldsymbol{w}_i&= \Vert\boldsymbol{v}_i\Vert\boldsymbol{q}_i +(\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1} \end{align*}$$

Now, performing a dot product with $\boldsymbol{q}_i$ on both sides gives:

$$\begin{align*} \boldsymbol{w}_i\cdot\boldsymbol{q}_i&= \Big(\Vert\boldsymbol{v}_i\Vert\boldsymbol{q}_i +(\boldsymbol{w}_i\cdot\boldsymbol{q}_1) \boldsymbol{q}_1+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)\boldsymbol{q}_2+\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1})\boldsymbol{q}_{i-1}\Big) \cdot\boldsymbol{q}_i\\ &=\Vert\boldsymbol{v}_i\Vert(\boldsymbol{q}_i\cdot\boldsymbol{q}_i) +(\boldsymbol{w}_i\cdot\boldsymbol{q}_1) (\boldsymbol{q}_1\cdot\boldsymbol{q}_i)+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2)(\boldsymbol{q}_2\cdot\boldsymbol{q}_i) +\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1}) (\boldsymbol{q}_{i-1}\cdot\boldsymbol{q}_i)\\ &=\Vert\boldsymbol{v}_i\Vert\Vert\boldsymbol{q}_i\Vert^2 +(\boldsymbol{w}_i\cdot\boldsymbol{q}_1) (0)+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_2) (0) +\cdots+ (\boldsymbol{w}_i\cdot\boldsymbol{q}_{i-1}) (0)\\ &=\Vert\boldsymbol{v}_i\Vert(1)^2\\ &=\Vert\boldsymbol{v}_i\Vert \end{align*}$$

This completes the proof.
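
As a quick numerical sanity check of this identity, we can rerun the normalize-at-every-step procedure on the basis from the earlier $\mathbb{R}^3$ example (the variable names below are ours):

```python
import numpy as np

ws = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

vs, qs = [], []
for w in ws:
    v = w - sum((w @ q) * q for q in qs)  # orthogonal vector v_i
    vs.append(v)
    qs.append(v / np.linalg.norm(v))      # corresponding unit vector q_i

for w, q, v in zip(ws, qs, vs):
    # w_i . q_i should match ||v_i|| (up to floating-point round-off)
    print(w @ q, np.linalg.norm(v))
```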

Theorem.

Orthogonal relationship between original basis vectors and orthogonal basis vectors

Let $\{\boldsymbol{w}_1,\boldsymbol{w}_2,\cdots,\boldsymbol{w}_n\}$ be any basis for the subspace $W$. If $\{\boldsymbol{v}_1,\boldsymbol{v}_2,\cdots,\boldsymbol{v}_n\}$ is the corresponding orthogonal basis constructed using the Gram-Schmidt process, then $\boldsymbol{w}_i$ is orthogonal to vectors $\boldsymbol{v}_{i+1}$, $\boldsymbol{v}_{i+2}$, $\cdots$, $\boldsymbol{v}_{n}$ for $i=1,2,\cdots,n-1$.

Proof. Recall that we start the Gram-Schmidt process by setting $\boldsymbol{v}_1=\boldsymbol{w}_1$. We then select $\boldsymbol{v}_2$ such that $\boldsymbol{v}_2$ is orthogonal to the subspace $W_1$ spanned by $\boldsymbol{v}_1$. Since $\boldsymbol{w}_1\in{W}_1$, we have that:

$$\boldsymbol{w_1} \;\text{ is orthogonal to }\; \boldsymbol{v}_2$$

In the 2nd step, we select $\boldsymbol{v}_3$ such that $\boldsymbol{v}_3$ is orthogonal to the subspace $W_2$ spanned by $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$. Recall the relationship between $\boldsymbol{w}_2$ and $\boldsymbol{v}_2$ below:

$$\boldsymbol{w}_2= \boldsymbol{v}_2+\mathrm{Proj}_{W_1}(\boldsymbol{w}_2)$$

Where $\mathrm{Proj}_{W_1}(\boldsymbol{w}_2)\in{W_1}$. Because $W_1$ is spanned by $\boldsymbol{v}_1$, we know that $\mathrm{Proj}_{W_1}(\boldsymbol{w}_2)$ can be expressed as some linear combination of $\boldsymbol{v}_1$. This means that $\boldsymbol{w}_2$ can be expressed as some linear combination of $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$. Because $W_2$ is spanned by $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$, we have that $\boldsymbol{w}_2\in{W_2}$. Also $\boldsymbol{w}_1\in{W_2}$ since $\boldsymbol{w}_1\in{W_1}$ where $W_1$ is spanned by $\boldsymbol{v}_1$. Because $\boldsymbol{w}_1,\boldsymbol{w}_2\in{W_2}$ and $\boldsymbol{v}_3$ is orthogonal to any vector in $W_2$, we have that:

$$\begin{align*} &\boldsymbol{w_1} \;\text{ is orthogonal to }\; \boldsymbol{v}_2, \boldsymbol{v}_3\\ &\boldsymbol{w_2} \;\text{ is orthogonal to }\; \boldsymbol{v}_3\\ \end{align*}$$

In the 3rd step, we select $\boldsymbol{v}_4$ such that $\boldsymbol{v}_4$ is orthogonal to the subspace $W_3$ spanned by $\boldsymbol{v}_1$, $\boldsymbol{v}_2$ and $\boldsymbol{v}_3$. Recall the relationship between $\boldsymbol{v}_3$ and $\boldsymbol{w}_3$ below:

$$\boldsymbol{w}_3= \boldsymbol{v}_3+\mathrm{Proj}_{W_2}(\boldsymbol{w}_3) $$

Where $\mathrm{Proj}_{W_2}(\boldsymbol{w}_3)\in{W_2}$. This means that $\boldsymbol{w}_3$ can be expressed as some linear combination of $\boldsymbol{v}_3$ and the basis vectors of $W_2$, which we know are $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$. Because $W_3$ is spanned by $\boldsymbol{v}_1$, $\boldsymbol{v}_2$ and $\boldsymbol{v}_3$, we have that $\boldsymbol{w}_3\in{W_3}$. Next, we already established that:

  • $\boldsymbol{w}_2\in{W}_2$ where $W_2$ is spanned by $\boldsymbol{v}_1$ and $\boldsymbol{v}_2$.

  • $\boldsymbol{w}_1\in{W}_1$ where $W_1$ is spanned by $\boldsymbol{v}_1$.

Therefore, we have that $\boldsymbol{w}_1,\boldsymbol{w}_2,\boldsymbol{w}_3\in{W_3}$. The vector $\boldsymbol{v}_4$ is orthogonal to any vector in $W_3$, which means that:

$$\begin{align*} &\boldsymbol{w_1} \;\text{ is orthogonal to }\; \boldsymbol{v}_2,\boldsymbol{v}_3,\boldsymbol{v}_4\\ &\boldsymbol{w_2} \;\text{ is orthogonal to }\; \boldsymbol{v}_3,\boldsymbol{v}_4\\ &\boldsymbol{w_3} \;\text{ is orthogonal to }\; \boldsymbol{v}_4\\ \end{align*}$$

In general, in the $j$-th step, we add the vector $\boldsymbol{v}_{j+1}$ to the list of vectors orthogonal to $\boldsymbol{w}_1$, $\boldsymbol{w}_2$, $\cdots$, $\boldsymbol{w}_{j}$. In other words, $\boldsymbol{w}_i$ is orthogonal to $\boldsymbol{v}_{i+1}$, $\boldsymbol{v}_{i+2}$, $\cdots$, $\boldsymbol{v}_{n}$ for $i=1,2,\cdots,n-1$. This completes the proof.
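
This relationship is easy to check numerically on the earlier $\mathbb{R}^3$ example (an illustrative check, not part of the original proof):

```python
import numpy as np

ws = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

# Gram-Schmidt without normalization
vs = []
for w in ws:
    vs.append(w - sum((w @ v) / (v @ v) * v for v in vs))

# each w_i should be orthogonal to v_{i+1}, ..., v_n (up to round-off)
for i, w in enumerate(ws):
    for j in range(i + 1, len(vs)):
        print(f"w_{i+1} . v_{j+1} = {w @ vs[j]:.1e}")
```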

Published by Isshin Inada