Value	Description
`"average"`	Return the average of the ranks.
`"min"`	Return the minimum of the ranks.
`"max"`	Return the maximum of the ranks.
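`"first"`	Break ties by order of appearance: among equal values, the one that appears earlier in the DataFrame gets the lower rank.
`"dense"`	Similar to `"min"`, but the rank always increases by exactly one from one group to the next, so no ranks are skipped.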

See the examples below for clarification. By default, method="average".

3. numeric_only · boolean · optional

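If True, only numeric columns are ranked and all other columns are dropped from the result. In pandas 2.0 and later, the default is numeric_only=False, so non-numeric columns such as strings are ranked as well.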

4. na_option · string · optional

How to deal with NaN values:

Value	Description
`"keep"`	Leave the `NaN`s intact, and ignore them in the ordering.
`"top"`	Assign the lowest (`1`, `2`, ...) ordering to the `NaN`s.
`"bottom"`	Assign the highest ordering to the `NaN`s.

By default, na_option="keep".

5. ascending · boolean · optional

If True, then the smallest value will have a rank of 1.
If False, then the largest value will have a rank of 1.

By default, ascending=True.

6. pct · boolean · optional

If True, then rank will be in terms of percentiles instead. By default, pct=False.

Return Value

A DataFrame containing the ordering of the values in the source DataFrame.

Examples

Consider the following DataFrame:

df = pd.DataFrame({"A":[4,5,3,3], "B":["b","a","c","d"]})
df

   A  B
0  4  b
1  5  a
2  3  c
3  3  d

Ranking column-wise

To obtain the ordering of the values of each column:

df.rank()   # axis=0

     A    B
0  3.0  2.0
1  4.0  1.0
2  1.5  3.0
3  1.5  4.0

Notice how we have two 1.5 in column A. This is because of a tie: the entries of A at index 2 and 3 share the same value, so the rank(~) method assigned each of them the average of the ranks they occupy (method="average" by default), that is, the average of 1 and 2.
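
The output above ranks the string column B alongside the numeric column A. As a minimal sketch of how numeric_only changes this, the snippet below rebuilds the same DataFrame and passes numeric_only=True; the variable name numeric_ranks is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 3, 3], "B": ["b", "a", "c", "d"]})

# Rank only the numeric columns; the string column B is left out of the result.
numeric_ranks = df.rank(numeric_only=True)
print(numeric_ranks)
```

The result contains column A only, with the same ranks as before.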

Ranking row-wise

Consider the following DataFrame:

df = pd.DataFrame({"A":[3,4],"B":[1,2],"C":[5,6]})
df

   A  B  C
0  3  1  5
1  4  2  6

To rank the values for each row, set axis=1:

df.rank(axis=1)

     A    B    C
0  2.0  1.0  3.0
1  2.0  1.0  3.0

Specifying method

Consider the following DataFrame:

df = pd.DataFrame({"A":[8,6,6,8]})
df

   A
0  8
1  6
2  6
3  8

average

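By default, method="average", which means that tied values each receive the average of the ranks they occupy. For df, the two 6s occupy ranks 1 and 2 and so each get 1.5, while the two 8s occupy ranks 3 and 4 and so each get 3.5.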
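
As a small sketch of the default behaviour, the snippet below rebuilds the DataFrame from this section and calls rank(~) without specifying method; the variable name avg_ranks is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [8, 6, 6, 8]})

# method="average" is the default: tied values share the mean of their ranks.
avg_ranks = df.rank()
print(avg_ranks)
#      A
# 0  3.5
# 1  1.5
# 2  1.5
# 3  3.5
```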

max

To use the largest rank of each group:

df.rank(method="max")

     A
0  4.0
1  2.0
2  2.0
3  4.0

min

To use the smallest rank of each group:

df.rank(method="min")

     A
0  3.0
1  1.0
2  1.0
3  3.0

first

To use the ordering of the values in the original DataFrame:

df.rank(method="first")

     A
0  3.0
1  1.0
2  2.0
3  4.0

Here, notice how the first value 8 is assigned a rank of 3, while the last value 8 is assigned a rank of 4. This is because of their ordering in df, that is, the first 8 is assigned a lower rank since it appears earlier in df.

dense

This is similar to "min", except that the ranks are incremented by one after each duplicate group:

df.rank(method="dense")

     A
0  2.0
1  1.0
2  1.0
3  2.0

To clarify, with "min" the two 8s are assigned a rank of 3 because the two 6s take up ranks 1 and 2. With "dense", the rank increases by only 1 from one group to the next, so the 8s are assigned a rank of 2.
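
To see the five methods next to each other, the following sketch ranks the same column once per method and collects the results into one DataFrame. The names methods and comparison are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [8, 6, 6, 8]})

# Rank column A once per tie-breaking method and place the results side by side.
methods = ["average", "min", "max", "first", "dense"]
comparison = pd.DataFrame({m: df["A"].rank(method=m) for m in methods})
print(comparison)
```

Each column of comparison matches the corresponding example in this section, which makes it easier to see that only the handling of ties differs between methods.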

Specifying na_option

Consider the following DataFrame with some missing values:

import numpy as np

df = pd.DataFrame({"A":[np.nan,6,np.nan,5]})
df

     A
0  NaN
1  6.0
2  NaN
3  5.0

By default, na_option="keep", which means that NaNs are ignored during the ranking and kept in the resulting DataFrame:

df.rank()   # na_option="keep"

     A
0  NaN
1  2.0
2  NaN
3  1.0

To assign the lowest ranks (1, 2, ...) to missing values:

df.rank(na_option="top")

     A
0  1.5
1  4.0
2  1.5
3  3.0

Here, both NaNs are assigned 1.5 because they are treated as a tie, and so the average of their ranks (1 and 2) is computed.

To assign the highest ranks to the missing values:

df.rank(na_option="bottom")

     A
0  3.5
1  2.0
2  3.5
3  1.0
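
The three na_option values can also be compared in one place. The sketch below rebuilds the DataFrame with missing values and ranks it once per option; the name na_comparison is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 6, np.nan, 5]})

# Rank column A once per na_option value and place the results side by side.
options = ["keep", "top", "bottom"]
na_comparison = pd.DataFrame({opt: df["A"].rank(na_option=opt) for opt in options})
print(na_comparison)
```

With "keep" the NaNs stay as NaN, while "top" and "bottom" give them a shared rank at the low or high end respectively.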

Ranking in descending order

Consider the same DataFrame we had before:

df = pd.DataFrame({"A":[4,5,3,3], "B":["b","a","c","d"]})
df

   A  B
0  4  b
1  5  a
2  3  c
3  3  d

To rank in descending order (largest value has a rank of 1), simply set ascending=False:

df.rank(ascending=False)

     A    B
0  2.0  3.0
1  1.0  4.0
2  3.5  2.0
3  3.5  1.0
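
For a numeric column, one way to think about ascending=False is that it gives the same result as ranking the negated values in ascending order. The sketch below checks this for column A; the names descending and negated are only for illustration, and the trick does not apply to string columns such as B.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 3, 3]})

# Rank from largest to smallest.
descending = df["A"].rank(ascending=False)

# Rank the negated values from smallest to largest.
negated = (-df["A"]).rank()

print(descending.equals(negated))
```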

Ranking using percentiles

Consider the following DataFrame:

df = pd.DataFrame({"A":[4,5,3,3], "B":["b","a","c","d"]})
df

   A  B
0  4  b
1  5  a
2  3  c
3  3  d

To rank using percentiles, set pct=True:

df.rank(pct=True)

       A     B
0  0.750  0.50
1  1.000  0.25
2  0.375  0.75
3  0.375  1.00
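
The percentile values above can be reproduced by dividing the ordinary ranks by the number of values in the column. The sketch below checks this for column A, which has no missing values; the names pct_ranks and manual are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 3, 3]})

# Percentile ranks computed by pandas.
pct_ranks = df["A"].rank(pct=True)

# The same values computed by hand: average rank divided by the number of values.
manual = df["A"].rank() / df["A"].count()

print(pct_ranks.tolist())
print(manual.tolist())
```

For example, the value 4 has a rank of 3 out of 4 values, which gives 0.75.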

Ranking by multiple columns

Consider the following DataFrame:

df = pd.DataFrame({"A":[8,9,9], "B":[7,6,5]})
df

   A  B
0  8  7
1  9  6
2  9  5

To rank by column A while using column B as a tie-breaker:

df[["A","B"]].apply(tuple, axis=1).rank()

0    1.0
1    3.0
2    2.0
dtype: float64

Note the following:

the first row is assigned a rank of 1 because its value of A is the lowest.
the second and third rows have the same value of A, so their values of B are used as a tie-breaker; since the third row has the smaller value of B, it is assigned a rank of 2, and the second row is assigned a rank of 3.

Let's now break down the code. We first use the apply(~) method to combine the two columns into a single column of tuples:

df[["A","B"]].apply(tuple, axis=1)

0    (8, 7)
1    (9, 6)
2    (9, 5)
dtype: object

We then use the rank method like so:

df[["A","B"]].apply(tuple, axis=1).rank()

0    1.0
1    3.0
2    2.0
dtype: float64
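
As an alternative sketch (not the approach used above), the same ranks can be obtained by sorting on both columns and numbering the rows in sorted order. The names order and ranks are only for illustration. Note the assumption that no two rows share the same (A, B) pair: identical pairs would receive distinct consecutive ranks here rather than a shared rank.

```python
import pandas as pd

df = pd.DataFrame({"A": [8, 9, 9], "B": [7, 6, 5]})

# Sort by A, then by B, and keep the original row labels in sorted order.
order = df.sort_values(["A", "B"]).index

# Number the rows 1, 2, 3, ... in sorted order, then restore the original row order.
ranks = pd.Series(range(1, len(df) + 1), index=order, dtype="float64").sort_index()
print(ranks)
```

This avoids building a column of tuples, at the cost of handling ties differently.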

Published by Isshin Inada

Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rank.html
