What is Fancy Indexing in Pandas?
Fancy indexing means passing a list or array of indices to access multiple values of an array-like structure at once. In the context of Pandas, array-like structures include, but are not limited to, NumPy arrays, Series and DataFrames.
Examples
Fancy Indexing for Series
Suppose we have the following Series:
To get the values at indices 0 and 2:
indices = [0,2]
s[indices]

0    5
2    6
dtype: int64
The return type here is Series, since we are accessing values from a Series.
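The definition of s is not shown above, so the following is only a minimal sketch that reproduces the output under an assumption: the Series values [5, 4, 6] are hypothetical. Only the values at labels 0 and 2 are known from the output, and the middle value 4 is made up. The sketch also shows the explicit .loc (label-based) and .iloc (position-based) forms, which give the same result here because the default index labels coincide with the positions.

```python
import pandas as pd

# Hypothetical data: only s[0] == 5 and s[2] == 6 come from the output above;
# the middle value 4 is an assumption made for illustration.
s = pd.Series([5, 4, 6])

indices = [0, 2]
result = s[indices]            # a list of labels returns a Series
by_label = s.loc[indices]      # explicit label-based selection
by_position = s.iloc[indices]  # explicit position-based selection

print(result)
print(type(result).__name__)
```

When the index labels are not the default 0, 1, 2, ..., the .loc and .iloc forms can differ, so spelling out which one you mean is usually the safer choice.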
Fancy Indexing for Numpy Arrays
Consider the following 1D Numpy array:
Suppose we wanted to create a 2D array using some of the values in a.
To do so, we must first create a 2D array of indices:
indices = np.array([[1,3],[0,0]])
indices

array([[1, 3],
       [0, 0]])
Now, to create the array with the values that correspond to these indices:
a[indices]
array([[8, 7],
       [5, 5]])
Notice how the shape of the resulting array is the same as that of the indices. The return type here is a Numpy array since we are accessing values from a Numpy array.
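As a runnable sketch of the steps above: the 1D array a is not shown in the text, so the array below is an assumption. The values at indices 0, 1 and 3 (5, 8 and 7) are taken from the output, while the value 9 at index 2 is hypothetical.

```python
import numpy as np

# Assumed array: a[0], a[1] and a[3] match the output above;
# a[2] == 9 is a made-up placeholder.
a = np.array([5, 8, 9, 7])

# A 2D array of indices into the 1D array.
indices = np.array([[1, 3], [0, 0]])

result = a[indices]

print(result)
print(result.shape == indices.shape)
```

Each entry of indices is replaced by the element of a at that position, which is why the result takes the shape of indices rather than the shape of a.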
Fancy indexing in multi-dimensions
Consider the following 2D NumPy array:
a = np.array([[5,8,3],[6,7,2]])
a

array([[5, 8, 3],
       [6, 7, 2]])
To fetch multiple values in this array:
indices_row = [0,1,0]
indices_column = [2,0,1]
a[indices_row, indices_column]
array([3, 6, 8])
Here, the row and column indices are paired up element by element, so we're fetching the values at (0,2)=3, (1,0)=6 and (0,1)=8. The return type here is a NumPy array since we are accessing values from a NumPy array.
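To make the pairing explicit, the sketch below compares the fancy-indexed result with an element-by-element lookup over zip(indices_row, indices_column). The names picked and pairwise are introduced here for illustration only.

```python
import numpy as np

a = np.array([[5, 8, 3], [6, 7, 2]])

indices_row = [0, 1, 0]
indices_column = [2, 0, 1]

# Fancy indexing pairs the i-th row index with the i-th column index.
picked = a[indices_row, indices_column]

# The same lookup written out one (row, column) pair at a time.
pairwise = np.array([a[r, c] for r, c in zip(indices_row, indices_column)])

print(picked)
print(np.array_equal(picked, pairwise))
```

Note that this is not the same as selecting a sub-grid of rows and columns: the two index lists must have matching (or broadcastable) lengths, and the result has one value per pair.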
Slicing using Fancy Indexing
Slicing syntax can be combined with fancy indexing.
Consider the same 2D Numpy array:
a = np.array([[5,8,3],[6,7,2]])
a

array([[5, 8, 3],
       [6, 7, 2]])
To get the columns with indices 0 and 2:
a[:, [0,2]]
array([[5, 3],
       [6, 2]])
To break this down, the : in the row position means to fetch all rows, and the [0,2] in the column position means to fetch the columns with indices 0 and 2.
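The breakdown above can be sketched as follows. The example also uses a bounded row slice (0:1) and checks that the selection is a copy rather than a view of a. The names cols and first_row are chosen here for illustration.

```python
import numpy as np

a = np.array([[5, 8, 3], [6, 7, 2]])

# All rows, columns 0 and 2.
cols = a[:, [0, 2]]

# A bounded slice keeps the row axis, so the result is still 2D.
first_row = a[0:1, [0, 2]]

# Fancy indexing returns a copy, so editing the result leaves a unchanged.
cols[0, 0] = 99

print(cols)
print(first_row)
print(a)
```

Because the result is a copy, changing the original array through fancy indexing requires assigning to the indexed expression directly, rather than modifying the returned array.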
Assigning values using Fancy Indexing
You can assign new values using fancy indexing as well.
Consider the same 2D Numpy array:
a = np.array([[5,8,3],[6,7,2]])
a

array([[5, 8, 3],
       [6, 7, 2]])
Let's change the values 3 and 7:
indices_row = [0,1]
indices_column = [2,1]
a[indices_row, indices_column] = 10
a

array([[ 5,  8, 10],
       [ 6, 10,  2]])
Here, notice how we assigned a scalar value of 10 instead of [10,10]. The scalar is broadcast (i.e. repeated) to match the number of selected elements.
Of course, if you wanted to assign individual values instead, you could just supply an array, like so:
a[indices_row, indices_column] = [10,20]
a

array([[ 5,  8, 10],
       [ 6, 20,  2]])
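As a self-contained sketch of both assignments, the example below also shows what happens when the number of supplied values does not match the number of selected elements: NumPy raises a ValueError and leaves the array unchanged. The flag mismatch_raised is a name introduced here for illustration.

```python
import numpy as np

a = np.array([[5, 8, 3], [6, 7, 2]])

indices_row = [0, 1]
indices_column = [2, 1]

# A scalar is broadcast to every selected position.
a[indices_row, indices_column] = 10

# A sequence assigns one value per selected position.
a[indices_row, indices_column] = [10, 20]

# Three values cannot be broadcast to two selected positions.
try:
    a[indices_row, indices_column] = [1, 2, 3]
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(a)
print(mismatch_raised)
```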