Getting integer indexes of rows with NaN in Pandas DataFrame
Rows with a missing value in a specific column
Consider the following DataFrame with some missing values:
import numpy as np
import pandas as pd

df = pd.DataFrame({"A":[3,np.nan,np.nan],"B":[5,6,np.nan]}, index=["a","b","c"])
df
     A    B
a  3.0  5.0
b  NaN  6.0
c  NaN  NaN
Solution
To get the integer indexes of rows where the value for column A is missing:
np.where(df["A"].isna())[0] # returns a NumPy array
array([1, 2])
Explanation
We first call isna() to extract a Series of booleans where True indicates rows with a missing value for column A:
df["A"].isna()
a    False
b     True
c     True
Name: A, dtype: bool
We then call NumPy's where(~), which, when given only a condition, returns a tuple containing the integer indexes of entries that are True:
np.where(df["A"].isna())
(array([1, 2]),)
Finally, we use [0] to access the NumPy array of integer indexes within the tuple.
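The three steps above can be put together in one short, self-contained sketch. The variable names (mask, result, positions) are chosen here for illustration. The np.flatnonzero call at the end is not part of the original solution; it is shown as an alternative that should return the same positions without the tuple.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [3, np.nan, np.nan], "B": [5, 6, np.nan]}, index=["a", "b", "c"])

# Step 1: boolean Series, True where column A is missing
mask = df["A"].isna()

# Step 2: with a single argument, np.where returns a tuple of index arrays
result = np.where(mask)

# Step 3: take the first (and only) array out of the tuple
positions = result[0]
print(positions)

# Alternative (an addition, not from the original): flatnonzero skips the tuple
positions_alt = np.flatnonzero(mask.to_numpy())
print(positions_alt)
```

Both approaches give integer positions rather than the row labels ("a", "b", "c"), so the result is suited to positional access such as df.iloc.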
Rows with all missing values
Consider the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({"A":[3,np.nan,np.nan],"B":[5,6,np.nan]}, index=["a","b","c"])
df
     A    B
a  3.0  5.0
b  NaN  6.0
c  NaN  NaN
Solution
To get the integer indexes of rows with all missing values:
np.where(df.isna().all(axis=1))[0] # returns a NumPy array
array([2])
Explanation
We first use isna() to obtain a DataFrame of booleans where True represents entries with missing values:
df.isna()
       A      B
a  False  False
b   True  False
c   True   True
We then call all(axis=1), which returns a Series of booleans where True indicates a row whose values are all True, that is, a row in which every column is missing:
df.isna().all(axis=1)
a    False
b    False
c     True
dtype: bool
We pass this into NumPy's where(~) function, which returns a tuple containing the integer indexes of entries that are True:
np.where(df.isna().all(axis=1))
(array([2]),)
Finally, we use [0] to access the NumPy array of integer indexes within the tuple:
np.where(df.isna().all(axis=1))[0]
array([2])
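As a minimal sketch of how the resulting positions might be used, the snippet below repeats the solution and then maps the integer indexes back to row labels and rows. The mapping step is an addition for illustration and is not part of the original solution. The names row_all_nan, positions, labels and rows are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [3, np.nan, np.nan], "B": [5, 6, np.nan]}, index=["a", "b", "c"])

# True for rows in which every column is missing
row_all_nan = df.isna().all(axis=1)

# Integer positions of those rows
positions = np.where(row_all_nan)[0]
print(positions)

# Illustrative follow-up: recover the row labels and the rows themselves
labels = df.index[positions]   # label-based view of the same rows
rows = df.iloc[positions]      # positional selection with the integer indexes
print(list(labels))
print(rows)
```

Because np.where works on positions, df.iloc is the matching accessor here; df.loc would expect the labels instead.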