Pandas DataFrame | iloc property
Pandas' DataFrame.iloc is used to access or update specific rows/columns of the DataFrame using integer indices.
Although it can also be used to access a single value in the DataFrame, we typically use the DataFrame.iat property instead in such cases.
Return Value
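A Series is returned if a single integer is used in [] to select one row or one column, and a scalar is returned if a single integer is given for both the row and the column. Otherwise, a DataFrame is returned.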
Examples
Consider the following DataFrame:
df
   A  B
a  1  4
b  2  5
c  3  6
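The code that builds df is not shown above. As a minimal sketch, assuming two integer columns and the row labels a, b and c seen in the output, the DataFrame could be created like this:

```python
import pandas as pd

# Assumed construction: the original page does not show how df was built
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])
print(df)
```

Note that the row labels are strings, while iloc always works with integer positions starting from 0.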
Accessing a single row
To access the second row:
df.iloc[1]
A    2
B    5
Name: b, dtype: int64
Since we have a single scalar within the [], the return type here is Series. The name of the Series is the label of the selected row.
Accessing a subset of rows
To access rows at positions 0 and 2:
df.iloc[[0,2]]
   A  B
a  1  4
c  3  6
Since we have a list within the [], the return type here is DataFrame.
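To make the difference in return types concrete, here is a small sketch that selects the same row once with a scalar and once with a one-element list. The df construction and the variable names are assumptions for illustration.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

row_as_series = df.iloc[1]    # single integer -> Series
row_as_frame = df.iloc[[1]]   # one-element list -> DataFrame with one row

print(type(row_as_series).__name__)  # Series
print(type(row_as_frame).__name__)   # DataFrame
print(row_as_frame.shape)            # (1, 2)
```

Wrapping the position in a list is a handy way to keep the result two-dimensional even when only one row is needed.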
Accessing rows using slicing syntax
For reference, we show the df here again:
df
   A  B
a  1  4
b  2  5
c  3  6
Slicing works in a similar manner to that of Python's standard lists.
To access rows from position 0 (inclusive) up until position 2 (exclusive):
df.iloc[0:2]
   A  B
a  1  4
b  2  5
If you do not specify the start point (e.g. [:2]) or the end point (e.g. [1:]), then iloc will return all rows from the beginning or until the end, respectively. For instance, to get all rows from position 1 onwards:
df.iloc[1:]
   A  B
b  2  5
c  3  6
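Because slicing follows the same rules as Python lists, negative positions and step sizes should work too. The following is a minimal sketch; the df construction and the variable names are assumptions for illustration.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

last_row = df.iloc[-1]      # negative positions count from the end
last_two = df.iloc[-2:]     # the last two rows (b and c)
every_other = df.iloc[::2]  # every second row (a and c)

print(last_two)
print(every_other)
```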
Accessing rows using boolean masks
We can provide a boolean mask (i.e. an array-like structure of booleans) to fetch rows as well.
For your reference, we show the df here again:
df
   A  B
a  1  4
b  2  5
c  3  6
For instance, consider the following mask:
df.iloc[[False, True, False]]
   A  B
b  2  5
With this approach, all rows corresponding to True will be returned. Since rows at positions 0 and 2 correspond to False in the mask, those rows are excluded.
Note that the length of the boolean mask must equal the number of rows in the DataFrame.
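In practice, masks are usually computed from the data rather than typed by hand. The sketch below builds a mask from a condition and also shows what happens when the mask length does not match. The df construction and variable names are assumptions, and the mask is converted with to_numpy() because iloc is positional and, as far as we know, does not accept a boolean Series as a mask.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

# Build the mask from a condition and convert it to a plain NumPy array
mask = (df["A"] > 1).to_numpy()
selected = df.iloc[mask]
print(selected)

# A mask with the wrong length (2 values for 3 rows) is rejected
try:
    df.iloc[[True, False]]
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__
print(error_name)
```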
Accessing rows using functions
We can also pass a function to iloc to specify what rows to fetch.
For your reference, we show the df here again:
df
   A  B
a  1  4
b  2  5
c  3  6
To fetch rows with index greater than "b":
df.iloc[lambda x: x.index > "b"]
   A  B
c  3  6
For those unfamiliar with Python lambdas, the function can be interpreted as follows:
def foo(x):  # The naming here is irrelevant
    return x.index > "b"
Here, note the following:
x represents the source DataFrame, that is, df.
The function returns an array of booleans, where rows corresponding to True will be returned.
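To check that a named function and a lambda behave the same way, we can pass each to iloc and compare the results. This is a minimal sketch; the df construction and the function name rows_after_b are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

def rows_after_b(frame):  # hypothetical name; receives the source DataFrame
    # Returns a boolean array with one entry per row
    return frame.index > "b"

via_function = df.iloc[rows_after_b]
via_lambda = df.iloc[lambda x: x.index > "b"]

print(via_function.equals(via_lambda))  # True
```

A named function is easier to reuse and test when the selection logic grows beyond a single comparison.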
Accessing a single value
We can also access a single value in the DataFrame with iloc.
For your reference, we show the df here again:
df
   A  B
a  1  4
b  2  5
c  3  6
To access the value at position [1,1]:
df.iloc[1,1]
5
Although iloc can also be used to access a single value in the DataFrame, we typically use the DataFrame.iat property instead in such cases.
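For comparison, here is a short sketch that reads the same value with both iloc and iat. The df construction and variable names are assumptions for illustration.

```python
import pandas as pd

# Assumed construction of the example DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

value_iloc = df.iloc[1, 1]  # general-purpose positional access
value_iat = df.iat[1, 1]    # scalar-only positional access

print(value_iloc, value_iat)  # 5 5
```

Unlike iloc, iat only accepts a single row position and a single column position, so it cannot be used with lists, slices or masks.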
Accessing values using rows and columns
Consider the following DataFrame:
df
   A  B  C
a  1  4  7
b  2  5  8
c  3  6  9
Using arrays
To access the subset residing at row positions 0 and 2 with column positions 0 and 1:
df.iloc[[0,2], [0,1]]
   A  B
a  1  4
c  3  6
Using slicing
You can use slicing syntax here too:
df.iloc[1:, :2]
   A  B
b  2  5
c  3  6
Here, we are fetching all rows including and after position 1, as well as all columns up until the column at position 2 (exclusive).
Using boolean masks
Boolean masks work here as well:
df.iloc[1:, [True, False, True]]
   A  C
b  2  8
c  3  9
Here, we are fetching all rows including and after position 1, with the columns that correspond to True in the boolean mask. In this case, we are including columns at positions 0 and 2.
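When we know a column by its label rather than its position, one option is to look up the position first with the get_loc method of the column index and then pass it to iloc. This is a minimal sketch; the df construction and variable names are assumptions for illustration.

```python
import pandas as pd

# Assumed construction of the three-column example DataFrame
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["a", "b", "c"],
)

# Translate the column label "C" into its integer position
col_pos = df.columns.get_loc("C")

# Rows from position 1 onwards, columns at position 0 and the looked-up position
subset = df.iloc[1:, [0, col_pos]]
print(subset)
```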
Copy versus view
Depending on the context, iloc can return either a view or a copy. Unfortunately, the rules that determine which one is returned are convoluted, so it is best practice to check this yourself using the (private) _is_view property.
There is one rule that is handy to remember: iloc returns a view of the data when a single column is extracted:
col_B = df.iloc[:,1]
col_B._is_view
True
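Since col_B is a view, modifying col_B will mutate the original df. Note that this does not hold when pandas' Copy-on-Write mode is enabled, in which case col_B behaves like a copy.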
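If we want to be certain that changes to the extracted column never reach the original DataFrame, one option is to take an explicit copy. The sketch below assumes the three-column df from the previous section; the variable name col_B_copy is chosen only for illustration.

```python
import pandas as pd

# Assumed construction of the three-column example DataFrame
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["a", "b", "c"],
)

# An explicit copy is independent of df regardless of view/copy rules
col_B_copy = df.iloc[:, 1].copy()
col_B_copy.iloc[0] = 100

print(col_B_copy.iloc[0])  # 100
print(df.iloc[0, 1])       # 4 (the original is unchanged)
```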
Updating values using iloc
Consider the following DataFrame:
df
   A  B
a  3  5
b  4  6
Updating a single value
To update the value at row 1, column 1:
df.iloc[1,1] = 10
df
   A   B
a  3   5
b  4  10
Updating multiple values
To update multiple values, simply use any of the access patterns described above and then assign a new value using =. For instance, to update the second column:
df.iloc[:,1] = [9,10]
df
   A   B
a  3   9
b  4  10
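The same idea extends to other access patterns. As a minimal sketch, assuming the two-row df from the start of this section, we can assign a scalar to a whole row and to a slice of a column:

```python
import pandas as pd

# Assumed construction of the two-row example DataFrame
df = pd.DataFrame({"A": [3, 4], "B": [5, 6]}, index=["a", "b"])

df.iloc[0, :] = 0    # set every value in the first row to 0
df.iloc[1:, 1] = 20  # set the second column to 20 from row position 1 onwards

print(df)
```

When a single scalar is assigned, it is broadcast to every selected cell, so there is no need to supply a list of matching length.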