Difference between None and NaN in Pandas
The distinction between None and NaN in Pandas is subtle:
None represents a missing entry, but it is not a numeric type. This means that any column (Series) that actually holds a None cannot have a numeric dtype (e.g. int or float).
NaN, which stands for not-a-number, is a floating-point value. This means that NaN can appear in columns of type float, but not in columns of a plain integer dtype such as int64.
Numeric Series
Consider a Series initialised with None:
import pandas as pd
s = pd.Series([3, None])
s
0    3.0
1    NaN
dtype: float64
The resulting Series contains a NaN instead of None. This is because Pandas automatically converted None to NaN given that the other value (3) is numeric, which allows the column dtype to be float64. If None were not cast to NaN, the column dtype would end up as object, which is less precise and makes certain operations in Pandas less performant.
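To make the conversion described above visible, here is a minimal sketch that compares the default behaviour with a Series whose dtype is forced to object. Passing dtype=object is an assumption made purely for illustration and is not something the original example does; the variable names are hypothetical.

```python
import pandas as pd

# Default: None is converted to NaN and the integer 3 is upcast to 3.0
converted = pd.Series([3, None])

# Explicit object dtype (illustrative assumption): None is stored as-is
kept = pd.Series([3, None], dtype=object)

print(converted.dtype)       # float64
print(kept.dtype)            # object
print(kept.iloc[1] is None)  # True
```

The object version keeps the original Python objects, at the cost of losing the numeric dtype.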
Let us create a Series with NaN:
import numpy as np
s = pd.Series([3, np.nan])
s
0    3.0
1    NaN
dtype: float64
As you would expect, the result is identical. The only difference is that Pandas did not need to convert None to NaN since NaN was given directly.
Non-numeric Series
We have seen that None is automatically converted into NaN when the Series type is numeric.
For non-numeric Series, None is not cast to NaN:
s = pd.Series(["3", None])
s
0       3
1    None
dtype: object
In comparison, creating a Series with NaN:
s = pd.Series(["3", np.nan])
s
0      3
1    NaN
dtype: object
Here, NaN simply remains a NaN since numeric values are allowed in a Series that holds other data types (a string in this case). Note that since the Series holds mixed types, the dtype is object.
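One way to confirm what an object Series actually stores is to inspect the Python type of each missing entry. The sketch below assumes dtype=object is passed explicitly so that the result does not depend on how a given Pandas version infers string columns; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# dtype=object is set explicitly here (an assumption for reproducibility)
s_none = pd.Series(["3", None], dtype=object)
s_nan = pd.Series(["3", np.nan], dtype=object)

# The missing entries keep their original Python types
print(type(s_none.iloc[1]))  # <class 'NoneType'>
print(type(s_nan.iloc[1]))   # <class 'float'>
```

So even though both Series report dtype object, one holds a None object and the other holds a float NaN.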
Arithmetic
The fact that None is not a numeric type, whereas NaN is, has consequences when performing arithmetic.
When performing arithmetic with None:
None + 5
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Here, we get an error because addition between a non-numeric type (None) and a number is not defined.
In contrast:
np.nan + 5
nan
Here, no error is thrown and a NaN is returned instead. Almost any arithmetic operation that involves a NaN will result in another NaN (one exception is np.nan ** 0, which evaluates to 1.0).
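The same propagation applies element-wise to a Series. The following is a minimal sketch with hypothetical variable names; it also shows that the sum() reduction skips NaN by default unless skipna=False is passed.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, np.nan])

shifted = s + 5                     # element-wise: 8.0 and NaN
total = s.sum()                     # NaN is skipped by default
strict_total = s.sum(skipna=False)  # NaN propagates into the result

print(shifted.tolist())  # [8.0, nan]
print(total)             # 3.0
print(strict_total)      # nan
```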
Equality comparison
Another difference in how None and NaN behave is in equality comparison.
Comparing None with itself results in True:
None == None
True
Comparing NaN with itself results in False:
np.nan == np.nan
False
As a side note, equating anything with NaN will result in False:
np.nan == None
False
To check for values that are NaN, instead of using ==, opt to use isna(~):
pd.isna(np.nan)
True
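The practical consequence shows up when building a boolean mask over a Series. This is a minimal sketch with hypothetical variable names: comparing against np.nan with == matches nothing, whereas the Series method isna() flags the missing entry.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, np.nan])

eq_mask = s == np.nan  # NaN never equals anything, so every entry is False
na_mask = s.isna()     # True exactly where the value is missing

print(eq_mask.tolist())  # [False, False]
print(na_mask.tolist())  # [False, True]
```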
Note that isna(~) returns True for None as well:
pd.isna(None)
True
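Because both kinds of missing value are detected, an object Series that mixes None and NaN can be handled uniformly. The sketch below assumes dtype=object is set explicitly so that both values are stored as given; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Explicit object dtype (an assumption) so None and NaN are both kept as-is
s = pd.Series(["3", None, np.nan], dtype=object)

mask = s.isna()       # flags None and NaN alike
cleaned = s.dropna()  # drops both kinds of missing entry

print(mask.tolist())     # [False, True, True]
print(cleaned.tolist())  # ['3']
```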