Pandas Series | value_counts method
Pandas Series.value_counts(~) method returns the count of each unique value in the Series.
Parameters
1. normalize | boolean | optional
Whether or not to normalize the counts so that their sum is one. By default, normalize=False.
2. sort | boolean | optional
Whether or not to sort by count. By default, sort=True.
3. ascending | boolean | optional
This parameter is only relevant if sort=True.
If True, then sort in ascending order.
If False, then sort in descending order.
By default, ascending=False.
4. bins | int | optional
If bins is set, then counts will be based on intervals. The width of each interval is determined like so:
[max(Series) - min(Series)] / bins
By default, bins=None.
5. dropna | boolean | optional
If True, then NaN will be ignored.
If False, then NaN will be counted as well.
By default, dropna=True.
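As a quick check of the defaults listed above, the following sketch passes every parameter explicitly and compares the result with a bare call. The sample data is hypothetical and does not come from this page.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not taken from this page)
s = pd.Series([3, 3, 5, 4, np.nan])

# A bare call relies on the documented defaults
defaults = s.value_counts()

# The same call with every default spelled out
explicit = s.value_counts(
    normalize=False, sort=True, ascending=False, bins=None, dropna=True
)

# Both calls should produce identical results
print(explicit.equals(defaults))
```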
Return Value
A Series whose index holds the unique values and whose values hold their counts.
Examples
Basic usage
To get the count of the unique values in the Series:
s.value_counts()
3.0    2
5.0    1
4.0    1
dtype: int64
Here, notice how the missing value was ignored since dropna=True by default.
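The examples on this page call value_counts on a Series s whose definition is not shown. A minimal self-contained sketch, assuming s holds the values 3, 3, 5, 4 and one missing value (the exact data and its order are an assumption), might look like this:

```python
import numpy as np
import pandas as pd

# Assumed data: consistent with the counts shown on this page,
# but the original definition of s is not given
s = pd.Series([3, 3, 5, 4, np.nan])

# Count each unique value; the NaN is skipped because dropna=True by default
counts = s.value_counts()
print(counts)

# Individual counts can be looked up by label
print(counts.loc[3.0])
```

The relative order of values with equal counts (here 5.0 and 4.0) may differ between pandas versions.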
Specifying normalize
To normalize the counts so that they sum up to one:
s.value_counts(normalize=True)
3.0    0.50
5.0    0.25
4.0    0.25
dtype: float64
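To make the normalization concrete, the sketch below compares normalize=True with dividing the raw counts by their total. The Series definition is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not given on this page)
s = pd.Series([3, 3, 5, 4, np.nan])

counts = s.value_counts()
proportions = s.value_counts(normalize=True)

# Manual normalization: each count divided by the total of all counts
manual = counts / counts.sum()

# Align both results by label before comparing
print(np.allclose(proportions.sort_index(), manual.sort_index()))
print(proportions.sum())
```

Since dropna=True by default, the proportions here are relative to the four non-missing values, not to the full length of the Series.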
Specifying sort
By default, sort=True, which means that the resulting Series is sorted by the count:
s.value_counts()
3.0    2
5.0    1
4.0    1
dtype: int64
Setting sort=False will disable such sorting:
s.value_counts(sort=False)
4.0    1
3.0    2
5.0    1
dtype: int64
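The row order returned with sort=False can vary between pandas versions, so it is safer not to rely on it. One option, sketched below with assumed sample data, is to disable the count-based sort and then order the result by label with sort_index.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not given on this page)
s = pd.Series([3, 3, 5, 4, np.nan])

# Skip sorting by count; the resulting row order is version-dependent
unsorted_counts = s.value_counts(sort=False)

# Order by the unique values themselves for a deterministic result
by_label = unsorted_counts.sort_index()
print(by_label)
```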
Specifying ascending
By default, ascending=False, which means that the resulting Series is sorted by count in descending order:
s.value_counts()
3.0    2
5.0    1
4.0    1
dtype: int64
To sort by ascending order instead, set ascending=True like so:
s.value_counts(ascending=True)
4.0    1
5.0    1
3.0    2
dtype: int64
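A small sketch of both sort directions, using assumed sample data, is given below. It checks only where the most frequent value ends up, since the order of tied counts may vary.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not given on this page)
s = pd.Series([3, 3, 5, 4, np.nan])

# Default: most frequent value first
descending_counts = s.value_counts()

# ascending=True: most frequent value last
ascending_counts = s.value_counts(ascending=True)

print(descending_counts.index[0])  # label of the most frequent value
print(ascending_counts.index[-1])  # same label, now in the last position
```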
Specifying bins
Instead of counting unique values, we can count based on intervals by passing bins like so:
s.value_counts(bins=3)
(2.9970000000000003, 3.667]    2
(4.333, 5.0]                   1
(3.667, 4.333]                 1
dtype: int64
Here, 2 values fall in the interval (2.997, 3.667]. The lower edge sits slightly below the minimum value 3 so that the minimum itself is included.
Note that the width of each interval is computed by:
[max(s) - min(s)] / 3 = (5 - 3) / 3 = 2/3 ≈ 0.667
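The width formula can be checked directly. The sketch below, which assumes a sample Series not given on this page, computes the width by hand and then lists each interval with its count.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not given on this page)
s = pd.Series([3, 3, 5, 4, np.nan])

n_bins = 3

# Width of each interval: (max - min) / bins; NaN is skipped by max() and min()
width = (s.max() - s.min()) / n_bins
print(width)

# Count values per interval instead of per unique value
binned = s.value_counts(bins=n_bins)

# Each index label is a pandas Interval, which supports membership tests
for interval, count in binned.items():
    print(interval, count, 3.0 in interval)
```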
Counting NaN in value_counts
By default, dropna=True, which means that all NaN are ignored. We can choose to include them in the count by passing in dropna=False like so:
s.value_counts(dropna=False)
3.0    2
NaN    1
5.0    1
4.0    1
dtype: int64
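Because NaN cannot be looked up with an ordinary equality comparison, one way to read its count is to mask the index with isna. The sketch below assumes a sample Series with a single missing value.

```python
import numpy as np
import pandas as pd

# Assumed sample data (not given on this page)
s = pd.Series([3, 3, 5, 4, np.nan])

# Include missing values in the result
with_nan = s.value_counts(dropna=False)

# Select the NaN row through a boolean mask on the index
nan_count = with_nan[with_nan.index.isna()].sum()
print(nan_count)

# With dropna=False the counts add up to the full length of the Series
print(with_nan.sum() == len(s))
```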