Comprehensive Guide on Metrics of Recommendation Systems

Last updated: Aug 10, 2023

Tags: Machine Learning

Unlike the metrics for regression and classification problems, the metrics used to evaluate the performance of recommendation systems are less well known and less agreed upon in the literature. In this guide, we will cover four key metrics that each shed light on a different characteristic of a recommendation system.

Intra-list similarity (ILS)

The intra-list similarity (ILS) is an average measure of how similar the recommended items are to their seed items. The formula to compute the intra-list similarity is as follows:

$$\begin{equation}\label{eq:GYof2UhFtqlWir9SLID} \text{ILS}=\frac{1}{n} \sum_{i\ne{j}} \text{sim}(P_i,P_j) \end{equation}$$

Where:

  • $n$ is the number of recommendations made.

  • $\text{sim}(P_i,P_j)$ is the similarity score between product $P_i$ and $P_j$ where $i\ne{j}$.

Note the following properties of intra-list similarity:

  • this metric ranges between $0$ and $1$ since $\text{sim}(P_i,P_j)$ is usually normalized to fall between $0$ and $1$.

  • a high ILS score suggests that the recommended products are similar to their seed products.

  • a low ILS score indicates that the recommended products are dissimilar to their seed products.

For example, consider the following table of recommendation similarities:

$$\begin{array}{c|cccc} & P_1 & P_2 & P_3 & P_4 \\ \hline P_1 & 1 & 0.2 & 0.4 & 0.1 \\ P_2 & 0.2 & 1 & 0.5 & 0.4 \\ P_3 & 0.4 & 0.5 & 1 & 0.5 \\ P_4 & 0.1 & 0.3 & 0.5 & 1 \end{array}$$

Here, we see that the similarity score between $P_1$ and $P_2$ is $0.2$.

Suppose we pick the top two recommendations:

  • $P_1$ recommends $P_3$ (0.4) and $P_2$ (0.2).

  • $P_2$ recommends $P_3$ (0.5) and $P_4$ (0.3).

  • $P_3$ recommends $P_2$ (0.5) and $P_4$ (0.5).

  • $P_4$ recommends $P_3$ (0.5) and $P_2$ (0.4).

In total, we are making $8$ recommendations ($n=8$). The intra-list similarity of our recommendation system for this case is:

$$\begin{align*} \text{ILS}&= \frac{1}{8}\big(\text{sim}(P_1,P_3)+ \text{sim}(P_1,P_2)+\text{sim}(P_2,P_3)+ \text{sim}(P_2,P_4)+\cdots+\text{sim}(P_4,P_3)+ \text{sim}(P_4,P_2)\big)\\ &=\frac{1}{8}\big(0.4+ 0.2+0.5+0.3+0.5+0.5+0.5+0.4\big)\\ &=0.4125 \end{align*}$$

This means that, on average, the similarity between a seed product and its recommended products is $0.4125$.

NOTE

Since we are only considering the top $2$ recommendations for each seed product, we sometimes include this detail in the metric name, as in $\text{ILS}@2$.
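
To make the computation concrete, here is a minimal Python sketch of $\text{ILS}@k$ using NumPy, assuming the pairwise similarities are already available as a matrix. The `ils_at_k` helper is an illustrative name of our own, not part of any particular library:

```python
import numpy as np

# Pairwise similarity matrix from the table above (rows/columns are P1..P4).
sim = np.array([
    [1.0, 0.2, 0.4, 0.1],
    [0.2, 1.0, 0.5, 0.4],
    [0.4, 0.5, 1.0, 0.5],
    [0.1, 0.3, 0.5, 1.0],
])

def ils_at_k(sim, k=2):
    """Average similarity between each seed item and its top-k recommended items."""
    n_items = sim.shape[0]
    scores = []
    for i in range(n_items):
        # Exclude the seed itself, then keep the k most similar items.
        others = np.delete(np.arange(n_items), i)
        top_k = others[np.argsort(sim[i, others])[::-1][:k]]
        scores.extend(sim[i, top_k])
    return float(np.mean(scores))

print(round(ils_at_k(sim, k=2), 4))  # 0.4125
```

This reproduces the $\text{ILS}@2$ value of $0.4125$ from the worked example above.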

Diversity

Diversity is the complement of intra-list similarity (ILS); it is an average measure of how dissimilar the recommended items are from their seed items. The formula to compute diversity is:

$$\begin{equation}\label{eq:f94cSCPGhzrkFyS0jae} \text{Diversity} = 1-\text{ILS} = 1- \frac{1}{n}\sum_{i\ne{j}} \text{sim}(P_i,P_j) \end{equation}$$

Where:

  • $n$ is the number of recommended items.

  • $\text{sim}(P_i,P_j)$ is the similarity score between items $P_i$ and $P_j$ where $i\ne{j}$.

A high diversity score indicates that the recommendations are very different from their seed products. For instance, a system that recommends horror movies for a romance movie would have high diversity.

To demonstrate, let's use the same simple example used when explaining ILS. Recall that the ILS score for our recommendation system was $0.4125$. This means that the diversity is:

$$\begin{align*} \text{Diversity} &=1-\text{ILS}\\ &=1-0.4125\\ &=0.5875 \end{align*}$$

This means that, on average, the dissimilarity score for seed items and their recommendations is $0.5875$.
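
Continuing the earlier sketch (assuming the illustrative `ils_at_k` helper and `sim` matrix defined above are still in scope), diversity is simply the complement of ILS:

```python
# Diversity is one minus the intra-list similarity computed earlier.
diversity = 1 - ils_at_k(sim, k=2)
print(round(diversity, 4))  # 0.5875
```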

Coverage

Coverage measures the percentage of available items that actually get recommended. Intuitively, coverage captures how well the recommendation system covers the full range of items in the catalog. The formula for coverage is as follows:

$$\text{coverage}= \frac{\text{Number of unique items recommended}} {\text{Total number of unique items}}$$

Note the following:

  • a recommendation system with high coverage recommends most items.

  • a recommendation system with low coverage recommends only a select few items.

  • coverage can also be interpreted as a measure of how diverse the recommended products are.

To demonstrate, suppose we have $5$ products $P_1$, $P_2$, $P_3$, $P_4$ and $P_5$. Consider a recommendation system that recommends two different products for each product:

  • $P_1$ recommends $P_2$ and $P_5$.

  • $P_2$ recommends $P_1$ and $P_5$.

  • $P_3$ recommends $P_1$ and $P_5$.

  • $P_4$ recommends $P_1$ and $P_2$.

  • $P_5$ recommends $P_1$ and $P_2$.

Here, we see that only $3$ products are getting recommended ($P_1$, $P_2$ and $P_5$) out of a total of $5$ products. The coverage of our recommendation system is:

$$\text{coverage}= \frac{3}{5}=0.6$$

This means that $60\%$ of the items are getting recommended, that is, $40\%$ of the products never appear in the recommendations.
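
As a quick illustration, here is a minimal Python sketch of the coverage calculation for this example; the `catalog` and `recommendations` structures are just illustrative names, not part of any particular library:

```python
# Catalog of all products and the recommendation lists from the example above.
catalog = {"P1", "P2", "P3", "P4", "P5"}
recommendations = {
    "P1": ["P2", "P5"],
    "P2": ["P1", "P5"],
    "P3": ["P1", "P5"],
    "P4": ["P1", "P2"],
    "P5": ["P1", "P2"],
}

# Coverage = unique items appearing in any recommendation list / catalog size.
recommended_items = {item for recs in recommendations.values() for item in recs}
coverage = len(recommended_items) / len(catalog)
print(coverage)  # 0.6
```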

Novelty

Novelty is a recommendation system metric that measures how surprising or unique the recommended items are to the user, that is, how different the recommended items are from what the user has already seen. A simple way to quantify this is through item popularity - the less popular an item, the more novel it is likely to feel to the user - so the metric below is computed from the popularity scores of the recommended items:

$$\begin{equation}\label{eq:XPf6di4PnMSHM2HVB3o} \text{novelty}= \frac{\text{Sum of popularity scores}} {\text{Number of recommendations}} \end{equation}$$

To demonstrate how to apply this formula, suppose we have a total of $100$ users and the following number of ratings for each movie:

  • movie 1 has 80 ratings.

  • movie 2 has 70 ratings.

  • movie 3 has 20 ratings.

  • movie 4 has 10 ratings.

Here, note that we don't care about what the rating is (e.g. a rating of 1 star and a rating of 5 stars are treated the same). This is because popularity in this context measures how well known the movie is - a movie with lots of negative ratings is still considered popular in the sense that many people know about it.

NOTE

Here, we have used the number of ratings to infer a movie's popularity but we could also use other information such as the number of views and sales revenue instead.

Using the data available, we can infer the popularity of each movie like so:

$$\mathrm{popularity}(M_i)= \frac{\text{Number of users who have rated movie }M_i} {\text{Total number of users}}$$

For example, the popularity of movie one ($M_1$) is:

$$\mathrm{popularity}(M_1)= \frac{80} {100}=0.8$$

Let's summarize the popularity scores of the movies:

$$\begin{array}{c|c} \text{Movie} & \text{Popularity score} \\ \hline 1 & 0.8 \\ 2 & 0.7 \\ 3 & 0.2 \\ 4 & 0.1 \end{array}$$

Now, suppose our recommendation system recommends movies $M_1$ and $M_2$ to a user. We now use formula \eqref{eq:XPf6di4PnMSHM2HVB3o} to compute the novelty score of our recommendation system:

$$\begin{align*} \text{novelty}&= \frac{\text{Sum of popularity scores}} {\text{Number of recommendations}}\\ &=\frac{0.8+0.7}{2}\\ &=0.75 \end{align*}$$
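
Below is a minimal Python sketch of this novelty calculation, assuming popularity is estimated as the fraction of users who rated each movie; the `novelty` helper and variable names are illustrative, not from a specific library:

```python
# Popularity inferred as the fraction of users who rated each movie.
num_users = 100
num_ratings = {"M1": 80, "M2": 70, "M3": 20, "M4": 10}
popularity = {movie: count / num_users for movie, count in num_ratings.items()}

def novelty(recommended, popularity):
    """Mean popularity score of the recommended items."""
    return sum(popularity[m] for m in recommended) / len(recommended)

print(round(novelty(["M1", "M2"], popularity), 2))  # (0.8 + 0.7) / 2 = 0.75
```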

Note the following:

  • a system with a high novelty score (under this definition) is mostly recommending popular, well-known movies.

  • a system with a low novelty score is recommending lesser-known movies, that is, more novel recommendations.

Novelty is an important metric for recommendation systems because it encourages the system to recommend items that are different from what the user has seen before, which can lead to a more engaging and diverse user experience.

Published by Isshin Inada