Counting the occurrence of values in columns of a Pandas DataFrame
Start your free 7-days trial now!
Counting occurrence of a single value in a column
Consider the following DataFrame:
df
A0 a1 b2 a
Solution
To count the number of times the value "a"
occurs in column A
:
2
Explanation
To break this down, we are first fetching a Series
of booleans where True
indicates a match:
(df["A"] == "a")
0 True1 False2 TrueName: A, dtype: bool
Since the internal representation of a True
is 1
, and False
is 0
, we can simply take the sum of this Series to count the total occurrence:
2
Counting occurrence of a single value in multiple columns
Consider the following DataFrame:
df
A B0 a a1 b a2 a a
Solution
To get the number of "a"
in each column:
A 2B 3dtype: int64
The idea is the exact same as that of the single-column case above.
Counting occurrences of multiple values in a column
Consider the following DataFrame:
df
A0 a1 b2 a3 c
Solution
To count the occurrences of multiple values in column A
:
3
Explanation
We first obtain a frequency count of the values in column A
using Series' value_counts()
:
counts
a 2c 1b 1Name: A, dtype: int64
We then extract the values we are interested in using []
syntax:
counts[values] # returns a Series
a 2b 1Name: A, dtype: int64
We then use the Series' sum()
method:
3
Counting the total number of occurrences
Consider the same df
as above:
df
A B0 a a1 b a2 a a
Solution
To count the total number of "a"
in df
:
Explanation
Once again, we first check for the presence of "a"
like so:
df == "a" # returns a Series
A B0 True True1 False True2 True True
True
is internally represented as a 1
, while False
as a 0
. Taking the sum of each column yields:
A 2B 3dtype: int64
This tells us that we have 2
occurrences of "a"
in column A
, and 3
in B
. What we want is the total number so we must take a second sum: