Comprehensive Guide on Root Mean Squared Error (RMSE)

Last updated: Jul 1, 2022

What is root mean squared error (RMSE)?

The root mean squared error (RMSE) is a common way to quantify the error between actual and predicted values. It is defined as the square root of the average of the squared differences between the actual and predicted values. The mathematical formula is as follows:

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum^n_{i=1}(y_i-\hat{y}_i)^2} $$

Where:

  • $n$ is the number of predicted values

  • $y_i$ is the actual (true) value of the $i$-th data point

  • $\hat{y}_i$ is the predicted value of the $i$-th data point

As the name suggests, RMSE is simply the square root of the mean squared error (MSE). The MSE involves taking the square of the difference in the predicted and actual target, which means that the unit of MSE is not the same as that of the target value. The fact that RMSE takes the square root of MSE means that the unit of RMSE is the same as that of the response variable $y$, thereby making the interpretation easier. Loosely speaking, we can interpret RMSE as how far off the predicted values are on average.
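The point about units can be checked numerically. The sketch below uses hypothetical measurements (the values and the helper names `mse` and `rmse` are illustrative assumptions, not part of the original guide) expressed first in metres and then in centimetres. If the reasoning above holds, converting the unit by a factor of 100 should scale RMSE by 100 but MSE by 100 squared.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical lengths in metres (illustrative values only).
actual_m = [10, 20, 30]
predicted_m = [12, 20, 29]

# The same data expressed in centimetres.
actual_cm = [100 * v for v in actual_m]
predicted_cm = [100 * v for v in predicted_m]

mse_ratio = mse(actual_cm, predicted_cm) / mse(actual_m, predicted_m)
rmse_ratio = rmse(actual_cm, predicted_cm) / rmse(actual_m, predicted_m)

print(f"MSE grows by a factor of {mse_ratio:.0f}")    # squared units
print(f"RMSE grows by a factor of {rmse_ratio:.0f}")  # same units as the target
```

The RMSE tracks the unit of the target directly, which is what makes it easier to read than the MSE.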

A lower RMSE is preferable, since it indicates that the predictions lie closer to the actual values.

Simple example of computing root mean squared error (RMSE)

Suppose we built a simple linear model to predict three different y-values given their x-values:

Here, our predictions are off by 2 for the first data point, 0 for the second, and 1 for the last point. To compute the RMSE of our model:

$$\begin{align*} \mathrm{RMSE}&=\sqrt{\frac{1}{3}\left((1-3)^2+(2-2)^2+(3-2)^2\right)}\\ &=\sqrt{\frac{1}{3}\left(4+0+1\right)}\\ &\approx1.29 \end{align*}$$

The RMSE of our model is therefore 1.29, which we can loosely interpret as meaning our predictions are off by 1.29 on average. Consider computing mean absolute error (MAE) if you wish to compute exactly the average difference between predicted and actual values.
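The calculation above can be reproduced in a few lines of plain Python. This is a minimal sketch: the function names `rmse` and `mae` are chosen here for illustration, and the actual and predicted values are read off the terms of the formula above, assuming each term has the form $(y_i-\hat{y}_i)^2$.

```python
import math

def rmse(y_true, y_pred):
    """Square root of the average squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def mae(y_true, y_pred):
    """Average absolute difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Assumed from the worked formula: actual values 1, 2, 3 and predictions 3, 2, 2.
y_true = [1, 2, 3]
y_pred = [3, 2, 2]

print(round(rmse(y_true, y_pred), 2))  # 1.29
print(mae(y_true, y_pred))             # 1.0
```

The MAE of 1.0 is the exact average of the absolute errors 2, 0 and 1, while the RMSE of 1.29 is pulled upwards by the largest error.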

Difference between root mean squared error (RMSE) and mean absolute error (MAE)

Mathematically, since RMSE involves squaring the differences before taking the average, the differences become more pronounced compared to MAE.

For example, consider the following dataset:

| i-th data point | Error | Absolute Error | Squared Error |
|---|---|---|---|
| 1 | 2 | 2 | 4 |
| 2 | -2 | 2 | 4 |
| 3 | 2 | 2 | 4 |
| 4 | 2 | 2 | 4 |

Here, the MAE would be $2$, while RMSE would be $2$.

Now consider the case when the magnitudes of our errors vary:

| i-th data point | Error | Absolute Error | Squared Error |
|---|---|---|---|
| 1 | 2 | 2 | 4 |
| 2 | -2 | 2 | 4 |
| 3 | 4 | 4 | 16 |
| 4 | 4 | 4 | 16 |

Here, the MAE would be $3$, while RMSE would be around $3.16$.

Now consider the case when we have an outlier:

| i-th data point | Error | Absolute Error | Squared Error |
|---|---|---|---|
| 1 | 2 | 2 | 4 |
| 2 | -2 | 2 | 4 |
| 3 | 2 | 2 | 4 |
| 4 | -102 | 102 | 10404 |

Here, the MAE would be $27$, while RMSE would be around $51.03$.

As we can see, RMSE penalises predictions that are far off much more heavily than MAE does. This means that we should use RMSE whenever we want to add more penalty when the predictions differ greatly from the actual values. For instance, suppose you have two prediction errors, 5 and 10. If you wish to give more weight to the larger error term (10 in this case), then using RMSE is desirable.
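The three scenarios above can be recomputed directly from their error columns. The following is a small sketch in plain Python; the helper names and the scenario labels are illustrative choices, not part of the original guide.

```python
import math

def mae(errors):
    """Mean absolute error computed from a list of raw errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error computed from a list of raw errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Error columns copied from the three datasets above.
scenarios = {
    "constant magnitude": [2, -2, 2, 2],
    "varying magnitude": [2, -2, 4, 4],
    "one outlier": [2, -2, 2, -102],
}

for name, errors in scenarios.items():
    print(f"{name}: MAE={mae(errors):.2f}, RMSE={rmse(errors):.2f}")
# constant magnitude: MAE=2.00, RMSE=2.00
# varying magnitude: MAE=3.00, RMSE=3.16
# one outlier: MAE=27.00, RMSE=51.03
```

The gap between the two metrics widens as the errors become more uneven, which is the penalising effect described above.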

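Note that we can mathematically prove that MAE will always be less than or equal to RMSE. The two are equal only when every error has the same magnitude, as in the first dataset above.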

Upper bound of RMSE

We can prove the upper bound of RMSE in relation to MAE like so:

$$\text{RMSE}\le\mathrm{MAE}\times\sqrt{n}$$

Where $n$ is the sample size. This means that as the number of samples increases, the upper bound of RMSE relative to MAE also increases. This is why we should be cautious when comparing RMSE values derived from two different sample sizes.
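One way to see where this bound comes from, offered as a sketch rather than a formal proof, is to write $e_i=y_i-\hat{y}_i$ for the $i$-th error (a shorthand introduced here). Expanding the square of the sum of absolute errors produces the sum of squares plus cross terms $|e_i||e_j|$, all of which are non-negative, so:

$$\begin{align*} \mathrm{RMSE}^2 &=\frac{1}{n}\sum^n_{i=1}e_i^2\\ &\le\frac{1}{n}\left(\sum^n_{i=1}|e_i|\right)^2\\ &=n\cdot\mathrm{MAE}^2 \end{align*}$$

In the other direction, the Cauchy-Schwarz inequality gives $\left(\sum^n_{i=1}|e_i|\right)^2\le n\sum^n_{i=1}e_i^2$, so:

$$\mathrm{MAE}^2=\frac{1}{n^2}\left(\sum^n_{i=1}|e_i|\right)^2\le\frac{1}{n}\sum^n_{i=1}e_i^2=\mathrm{RMSE}^2$$

Taking square roots of both results yields:

$$\mathrm{MAE}\le\mathrm{RMSE}\le\mathrm{MAE}\times\sqrt{n}$$

The lower bound is attained when all errors have the same magnitude, and the upper bound is attained when at most one error is non-zero.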

Usage of RMSE and MAE

RMSE is often used in loss functions that are optimised through standard techniques such as gradient descent, while MAE is used less often for this purpose. This is because MAE involves taking the absolute value, which is not differentiable at zero, whereas the squared error underlying RMSE is smooth everywhere.
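The difference in differentiability can be made concrete by writing out the derivatives with respect to each prediction. The sketch below uses the MSE rather than the RMSE, on the assumption that minimising one also minimises the other because the square root is monotonic. The function names are illustrative, and returning 0 where the prediction equals the target is one conventional subgradient choice, not the only valid one.

```python
def mse_gradient(y_true, y_pred):
    """Gradient of the mean squared error with respect to each prediction.

    Each component is 2 * (prediction - target) / n, which shrinks smoothly
    towards zero as the prediction approaches the target.
    """
    n = len(y_true)
    return [2 * (p - t) / n for t, p in zip(y_true, y_pred)]

def mae_subgradient(y_true, y_pred):
    """A subgradient of the mean absolute error with respect to each prediction.

    Each component is sign(prediction - target) / n. The derivative does not
    exist where prediction == target, so 0 is chosen there by convention.
    """
    n = len(y_true)
    return [((p > t) - (p < t)) / n for t, p in zip(y_true, y_pred)]

# Illustrative values: errors of 2, 0 and -1 in the predictions.
y_true = [1, 2, 3]
y_pred = [3, 2, 2]

print(mse_gradient(y_true, y_pred))     # magnitude scales with the error
print(mae_subgradient(y_true, y_pred))  # magnitude is constant away from zero
```

The squared-error gradient carries information about how far off each prediction is, while the absolute-error subgradient only carries the direction and jumps abruptly at zero.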

Interpretability of RMSE and MAE

MAE is easier to interpret than RMSE. MAE is exactly the average absolute difference between the actual and predicted values. In contrast, RMSE can only loosely be read as an average error, with the added caveat that larger errors are penalised more heavily.

Computing root mean squared error (RMSE) using scikit-learn

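RMSE is equal to the square root of MSE. To compute RMSE using scikit-learn, use the mean_squared_error function and set the argument squared=False. Note that the squared argument was deprecated in scikit-learn 1.4 in favour of the dedicated root_mean_squared_error function, so newer versions should call that function instead:

from sklearn.metrics import mean_squared_error

y_true = [2, 6, 5]
y_pred = [7, 4, 3]
# squared=False returns the RMSE instead of the MSE
mean_squared_error(y_true, y_pred, squared=False)
3.3166247903554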
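Because the scikit-learn interface for RMSE has changed between versions, a version-tolerant variant may be useful. The sketch below assumes scikit-learn 1.4 or later for root_mean_squared_error and otherwise falls back to a plain-Python implementation written here for illustration; the alias `rmse` is a name chosen for this example.

```python
import math

try:
    # Dedicated RMSE function, available in scikit-learn 1.4 and later.
    from sklearn.metrics import root_mean_squared_error as rmse
except ImportError:
    # Fallback for older versions or when scikit-learn is not installed.
    def rmse(y_true, y_pred):
        """Plain-Python RMSE: square root of the mean squared difference."""
        squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
        return math.sqrt(sum(squared_errors) / len(squared_errors))

y_true = [2, 6, 5]
y_pred = [7, 4, 3]

# float() normalises the NumPy scalar that scikit-learn returns.
result = float(rmse(y_true, y_pred))
print(result)
```

Either branch should agree with the value shown above, since the squared errors are 25, 4 and 4, giving a mean of 11 and a square root of roughly 3.3166.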
Published by Isshin Inada