Difference between None and NaN in Pandas
The distinction between None and NaN in Pandas is subtle:
None represents a missing entry, but its type is not numeric. This means that a column (Series) that actually stores a None cannot have a numeric dtype (e.g. int or float); it has dtype object.
NaN, which stands for not-a-number, is a floating-point value. This means that NaN can appear in columns of type float, but not in columns of NumPy integer type: an integer column that receives a NaN is converted to float.
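As a minimal sketch of the two points above, the snippet below inspects the Python types of None and NaN and compares the dtype of a Series with and without a NaN. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# None is Python's null object; NaN is an ordinary float value.
none_type = type(None).__name__
nan_type = type(np.nan).__name__

# A Series of whole numbers is stored with an integer dtype...
ints = pd.Series([1, 2, 3])
# ...while the same kind of data with one NaN is stored as float64.
with_nan = pd.Series([1, 2, np.nan])

print(none_type, nan_type)
print(ints.dtype, with_nan.dtype)
```

The float dtype of the second Series reflects that NaN itself is a float.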
Numeric Series
Consider a Series initialised with None:
import pandas as pd
s = pd.Series([3, None])
s
0    3.0
1    NaN
dtype: float64
The resulting Series contains a NaN instead of None. This is because Pandas automatically converted None to NaN given that the other value (3) is numeric, which then allows the column type to be float64. If None were not cast to NaN, the column type would end up as object, which hides the fact that the data is numeric and makes many operations in Pandas slower.
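To illustrate the alternative, the sketch below forces the object dtype by passing dtype=object explicitly. In that case the None is kept as-is rather than converted. The names inferred and forced are illustrative.

```python
import pandas as pd

# Default inference: None becomes NaN and the dtype is float64.
inferred = pd.Series([3, None])

# Forcing dtype=object keeps the original Python objects, including None.
forced = pd.Series([3, None], dtype=object)

print(inferred.dtype, forced.dtype)
print(forced.iloc[1] is None)
```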
Let us create a Series with NaN:
import numpy as np
s = pd.Series([3, np.nan])
s
0    3.0
1    NaN
dtype: float64
As you would expect, the result is identical; the only difference is that Pandas did not need to convert None to NaN since NaN was given directly.
Non-numeric Series
We have seen that None is automatically converted into NaN when the Series type is numeric.
For non-numeric Series, None is not cast to NaN:
s = pd.Series(["3", None])
s
0       3
1    None
dtype: object
In comparison, creating a Series with NaN:
s = pd.Series(["3", np.nan])
s
0      3
1    NaN
dtype: object
Here, NaN simply remains a NaN, since a Series of dtype object can hold values of any type, including a float such as NaN alongside a string. Because the Series holds mixed types, its dtype is object.
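The sketch below checks what is actually stored in each object Series. It passes dtype=object explicitly, which is an assumption added here so that the result does not depend on how a given Pandas version infers the dtype of string data. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# dtype=object is set explicitly so both Series store plain Python objects.
with_none = pd.Series(["3", None], dtype=object)
with_nan = pd.Series(["3", np.nan], dtype=object)

# The first Series still holds the None object itself...
print(with_none.iloc[1] is None)
# ...while the second holds a float (NaN).
print(isinstance(with_nan.iloc[1], float))

# Both are reported as missing by isna().
print(with_none.isna().tolist(), with_nan.isna().tolist())
```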
Arithmetic
The fact that None is not a numeric type, whereas NaN is, has consequences when performing arithmetic.
When performing arithmetic with None:
None + 5
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Here, we get an error because addition between a non-numeric type (None) and a number is not defined.
In contrast:
np.nan + 5
nan
Here, no error is raised and a NaN is returned instead. Arithmetic that involves a NaN generally results in another NaN.
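The same propagation applies element-wise inside a Series, as the sketch below shows. One thing worth noting is that reductions such as sum() skip NaN by default (skipna=True), so they do not return NaN unless asked to. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, np.nan])

# Element-wise arithmetic: the NaN entry stays NaN.
shifted = s + 5

# Reductions ignore NaN by default...
total = s.sum()
# ...unless skipna=False is passed, in which case NaN propagates.
strict_total = s.sum(skipna=False)

print(shifted.tolist(), total, strict_total)
```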
Equality comparison
Another difference in how None and NaN behave is in equality comparison.
Comparing None with itself results in True:
None == None
True
Comparing NaN with itself results in False:
np.nan == np.nan
False
As a side note, comparing anything with NaN using == results in False:
np.nan == None
False
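A consequence of this rule is that NaN is the one float value that is not equal to itself. The short sketch below shows that property next to the standard-library check math.isnan. The variable names are illustrative.

```python
import math

import numpy as np

x = np.nan

# NaN compares unequal to everything, including itself.
not_equal_to_itself = x != x

# The standard library offers an explicit check for a float NaN.
is_nan = math.isnan(x)

print(not_equal_to_itself, is_nan)
```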
To check for values that are NaN, instead of using ==, opt to use isna(~):
pd.isna(np.nan)
True
Note that isna(~) returns True for None as well:
pd.isna(None)
True
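Applied to a whole Series, the same idea looks roughly like the sketch below: comparing with == np.nan finds nothing, whereas isna() flags both the NaN and the entry that started out as None. The dropna() and fillna() calls are included only as examples of common follow-up steps, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, None])

# Comparing against NaN with == never matches, even for the NaN entries.
eq_mask = s == np.nan

# isna() flags both missing entries (the None was stored as NaN).
na_mask = s.isna()

# Typical follow-ups: drop the missing entries or fill them with a value.
cleaned = s.dropna()
filled = s.fillna(0)

print(eq_mask.tolist())
print(na_mask.tolist())
print(cleaned.tolist(), filled.tolist())
```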