Pandas | to_numeric method
Pandas' to_numeric(~) method converts the input to a numerical type. By default, either int64 or float64 will be used.
Parameters
1. arg | array-like
The input array, which could be a scalar, list, NumPy array or Series.
2. errors | string | optional
How to deal with values that cannot be parsed as numeric:
Value | Description |
---|---|
"raise" | Raise an error. |
"coerce" | Convert into NaN. |
"ignore" | Leave the value as is. |
By default, errors="raise".
3. downcast | string | optional
Whether to downcast the converted numerics to the smallest numeric type that can hold them (e.g. int64 to int8):
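Value | Description |
---|---|
"integer" | Convert type to the smallest signed integer type that fits (np.int8 at minimum). |
"signed" | Same as "integer". |
"unsigned" | Convert type to the smallest unsigned integer type that fits (np.uint8 at minimum). |
"float" | Convert type to the smallest float type that fits (np.float32 at minimum). |
None | Do not perform any downcasting. |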
Note that downcasting is performed after the main numeric conversion, so any error raised during downcasting will surface regardless of what you specified for errors.
By default, downcast=None.
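As a rough illustration of how the smallest type is chosen, the sketch below downcasts a few made-up Series (the values are assumptions chosen for this example, not taken from this page). It shows that the chosen type depends on the range of the values, and that values which cannot be represented in the requested family are left as they are.

```python
import pandas as pd

# Hypothetical inputs, chosen only to exercise different value ranges.
small = pd.to_numeric(pd.Series([1, 2, 3]), downcast="integer")
wide = pd.to_numeric(pd.Series([1, 300]), downcast="integer")
unsigned = pd.to_numeric(pd.Series([1, 200]), downcast="unsigned")
fractional = pd.to_numeric(pd.Series([1.5, 2.5]), downcast="integer")

print(small.dtype)       # all values fit in int8
print(wide.dtype)        # 300 does not fit in int8, so int16 is used
print(unsigned.dtype)    # 200 fits in uint8
print(fractional.dtype)  # fractional values cannot become integers, so float64 is kept
```

If no smaller type can hold the data without changing it, the result simply keeps the dtype produced by the main conversion.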
Return Value
If arg is a Series, then a new Series is returned. If arg is a scalar, then a numeric scalar is returned. Otherwise, a new NumPy array is returned.
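To make the return types concrete, here is a minimal sketch that passes a Series, a list and a scalar to to_numeric(~). The input values are placeholders invented for this example.

```python
import numpy as np
import pandas as pd

# Placeholder inputs; only the container type matters here.
from_series = pd.to_numeric(pd.Series(["1", "2"]))
from_list = pd.to_numeric(["1", "2"])
from_scalar = pd.to_numeric("1.5")

print(type(from_series).__name__)  # Series
print(type(from_list).__name__)    # ndarray
print(from_scalar)                 # a single numeric value, not an array
```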
Examples
Basic usage
To convert the type of all values in a Series to numeric type:
pd.to_numeric(s)
0    1.0
1    2.0
2    3.0
dtype: float64
Note that the source Series s is left intact and a new Series is returned here.
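The definition of s is not shown above, so the following self-contained sketch assumes a Series that mixes numeric strings and an integer. It checks both the converted dtype and that the source Series is unchanged.

```python
import pandas as pd

# Assumed input: the original definition of s is not shown on this page.
s = pd.Series(["1", "2.0", 3])

converted = pd.to_numeric(s)

print(converted.dtype)  # float64, because "2.0" parses as a float
print(s.dtype)          # object: the source Series still holds the raw values
print(s.tolist())       # ['1', '2.0', 3]
```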
Error handling
By default, errors="raise", which means that when a value cannot be converted to a numeric type, an error will be thrown:
pd.to_numeric(s)
ValueError: Unable to parse string "2.3.4" at position 1
Instead of throwing an error, we can convert invalid values into NaN by specifying errors="coerce" like so:
pd.to_numeric(s, errors="coerce")
0    2.0
1    NaN
dtype: float64
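Coercion silently replaces bad values, so it is often useful to see which entries were affected. The sketch below assumes a two-element Series of strings (the definition of s is not shown above) and uses the NaN positions to recover the values that failed to parse.

```python
import pandas as pd

# Assumed input, consistent with the error message shown earlier.
s = pd.Series(["2", "2.3.4"])

converted = pd.to_numeric(s, errors="coerce")

# Entries that became NaN during conversion but were not missing originally.
invalid = s[converted.isna() & s.notna()]

print(converted.dtype)   # float64, since NaN requires a float type
print(invalid.tolist())  # ['2.3.4']
```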
We could also leave invalid values intact by using errors="ignore" (note that this option is deprecated as of pandas 2.2, where it emits a FutureWarning):
pd.to_numeric(s, errors="ignore")
0        2
1    2.3.4
dtype: object
Notice how the dtype of the Series is object. This is because when even one value ("2.3.4" in this case) cannot be parsed, the input is returned unconverted, so every value keeps its original type and the Series stays as the more general object type.
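The same keep-as-is behavior can be sketched without errors="ignore" by catching the exception explicitly. The helper name to_numeric_or_original is hypothetical and introduced only for this example, as is the input Series.

```python
import pandas as pd


def to_numeric_or_original(values):
    """Hypothetical helper: convert if possible, otherwise return the input untouched."""
    try:
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        # At least one value could not be parsed, so nothing is converted.
        return values


# Assumed inputs for illustration.
bad = pd.Series(["2", "2.3.4"])
good = pd.Series(["2", "3"])

kept = to_numeric_or_original(bad)
converted = to_numeric_or_original(good)

print(kept.tolist())       # ['2', '2.3.4'], still strings
print(converted.tolist())  # [2, 3]
```

This is an all-or-nothing fallback: a single unparseable value leaves the whole input unconverted, which mirrors the output shown above.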
Downcasting
By default, numerics are converted to either int64 or float64:
pd.to_numeric(s)
0    1.0
1    2.0
2    3.0
dtype: float64
Here, float64 is used since "2.0" is parsed as a float rather than an int under the hood, and a single float makes the whole result float64.
We can convert this to float32 instead by passing in downcast="float" like so:
pd.to_numeric(s, downcast="float")
0    1.0
1    2.0
2    3.0
dtype: float32
In this case, since every value (including 2.0) is a whole number and can be represented as an int as well, we can also pass downcast="integer" to convert the values into type int8:
pd.to_numeric(s, downcast="integer")
0    1
1    2
2    3
dtype: int8
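The three calls in this section can be run side by side as follows. The Series s is an assumption, since its definition is not shown above. It mixes numeric strings with an integer so that the default conversion yields float64.

```python
import pandas as pd

# Assumed input: the original definition of s is not shown on this page.
s = pd.Series(["1", "2.0", 3])

results = {
    "default": pd.to_numeric(s),
    "float": pd.to_numeric(s, downcast="float"),
    "integer": pd.to_numeric(s, downcast="integer"),
}

for name, converted in results.items():
    print(name, converted.dtype)
# default float64
# float float32
# integer int8
```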