Pandas Series | value_counts method

Last updated: Aug 10, 2023

Tags: Python, Pandas

Parameters

1. normalize | boolean | optional

Whether or not to normalize the counts so that their sum is one. By default, normalize=False.

2. sort | boolean | optional

Whether or not to sort by count. By default, sort=True.

3. ascending | boolean | optional

This parameter is only relevant if sort=True.

If True, then sort by count in ascending order.
If False, then sort by count in descending order.

By default, ascending=False.

4. bins | int | optional

If bins is set, the values are grouped into that many equal-width intervals, and the counts are reported per interval rather than per unique value. This only works for numeric data. The width of each interval is:

    [max(Series) - min(Series)] / bins

By default, bins=None.

5. dropna | boolean | optional

If True, then NaN will be ignored.
If False, then NaN will be counted as well.

By default, dropna=True.

Return Value

A Series.
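As a minimal sketch of what the returned Series looks like (the variable name counts is only illustrative), the unique values become the index and the counts become the values, so a single count can be looked up by label:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])
counts = s.value_counts()

# The result is itself a Series: unique values form the index,
# and the number of occurrences of each value forms the data.
print(type(counts).__name__)

# Look up the count of one particular value by its label.
print(counts.loc[3.0])

# The distinct non-missing values, sorted for display.
print(sorted(counts.index))
```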

Examples

Basic usage

To get the count of the unique values in the Series:

    import numpy as np
    import pandas as pd

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts()

    3.0   2
    5.0   1
    4.0   1
    dtype: int64

Here, notice how the missing value was ignored since dropna=True by default.

Specifying normalize

To normalize the counts so that they sum up to one:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts(normalize=True)

    3.0   0.50
    5.0   0.25
    4.0   0.25
    dtype: float64
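One way to read normalize=True is as the raw counts divided by their total. The sketch below checks this on the same Series; the names proportions and manual are illustrative, not part of the pandas API. Because NaN is dropped by default, the denominator here is 4 rather than 5.

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])

counts = s.value_counts()
proportions = s.value_counts(normalize=True)

# Dividing the raw counts by their total should give the same proportions.
manual = counts / counts.sum()

print(proportions.sum())                       # proportions add up to one
print(proportions.sort_index().tolist())
print(manual.sort_index().tolist())
```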

Specifying sort

By default, sort=True, which means that the resulting Series is sorted by the count:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts()

    3.0   2
    5.0   1
    4.0   1
    dtype: int64

Setting sort=False will disable such sorting:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts(sort=False)

    4.0   1
    3.0   2
    5.0   1
    dtype: int64
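If the goal is to order the result by the values themselves rather than by their counts, one common approach (an addition here, not part of value_counts itself) is to chain sort_index() onto the result. A minimal sketch, with by_value as an illustrative name:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])

# Order the counts by the unique values (the index) instead of by count.
by_value = s.value_counts().sort_index()

print(by_value.index.tolist())
print(by_value.tolist())
```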

Specifying ascending

By default, ascending=False, which means that the resulting Series is sorted by count in descending order:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts()

    3.0   2
    5.0   1
    4.0   1
    dtype: int64

To sort in ascending order instead, set ascending=True like so:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts(ascending=True)

    4.0   1
    5.0   1
    3.0   2
    dtype: int64
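The sketch below compares the two orderings by looking only at the counts, since the relative order of values with equal counts may differ between pandas versions. The names descending and ascending are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])

# Default behaviour: largest count first.
descending = s.value_counts()

# ascending=True: smallest count first.
ascending = s.value_counts(ascending=True)

print(descending.tolist())
print(ascending.tolist())
```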

Specifying bins

Instead of counting unique values, we can count based on an interval by passing bins like so:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts(bins=3)

    (2.9970000000000003, 3.667]    2
    (4.333, 5.0]                   1
    (3.667, 4.333]                 1
    dtype: int64

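Here, two values fall in the interval (2.997, 3.667]. The left edge of the first interval sits slightly below the minimum of the Series so that the minimum value itself is included.

Note that the width of each interval is computed by:

    [max(s) - min(s)] / 3 = (5 - 3) / 3 = 2/3 ≈ 0.667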
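The width calculation can be reproduced directly, and the binned counts can be compared against an explicit call to pd.cut, for which the bins parameter is a convenience. This is a rough sketch: n_bins, width, binned and via_cut are illustrative names, and include_lowest=True is assumed here so that the minimum value lands in the first interval.

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])
n_bins = 3  # illustrative choice, matching the example above

# Series.max() and Series.min() skip NaN by default.
width = (s.max() - s.min()) / n_bins
print(width)

# Counts per interval via value_counts.
binned = s.value_counts(bins=n_bins)
print(binned)

# A comparable result built explicitly: bin first, then count per bin.
via_cut = pd.cut(s, bins=n_bins, include_lowest=True).value_counts()
print(via_cut)
```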

Specifying dropna

By default, dropna=True, which means that all NaN values are ignored. We can choose to include them in the count by passing in dropna=False like so:

    s = pd.Series([4,3,3,5,np.nan])
    s.value_counts(dropna=False)

    3.0    2
    NaN    1
    5.0    1
    4.0    1
    dtype: int64
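A quick way to see the effect of dropna is to compare the totals of the two results: with NaN dropped, the counts add up to the number of non-missing entries, and with NaN kept, they add up to the full length of the Series. A minimal sketch with illustrative variable names:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 3, 3, 5, np.nan])

without_nan = s.value_counts()           # dropna=True by default
with_nan = s.value_counts(dropna=False)  # NaN gets its own row
n_missing = int(s.isna().sum())

print(without_nan.sum(), s.count())  # both are the number of non-missing entries
print(with_nan.sum(), len(s))        # both are the full length of the Series
print(with_nan.sum() - without_nan.sum(), n_missing)
```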

Published by Isshin Inada

Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.value_counts.html
