PySpark Column | alias method
PySpark Column's alias(~) method assigns a column label to a PySpark Column.
Parameters
1. *alias | string
The column label.
2. metadata | dict | optional
A dictionary holding additional meta-information to store in the StructField of the returned Column.
Return Value
A new PySpark Column.
Examples
Consider the following PySpark DataFrame:
+-----+---+
| name|age|
+-----+---+
| ALEX| 20|
|  BOB| 30|
|CATHY| 40|
+-----+---+
Most methods in the PySpark SQL Functions library return Column objects whose default label is determined by the method used. For instance, consider the lower(~) method.
When applied to the name column, the PySpark Column returned by lower(~) has the label lower(name) by default.
Assigning a new label to a PySpark Column using the alias method
We can assign a new label to a column by using the alias(~) method.
For example, calling alias("lower_name") on the column returned by lower(~) assigns it the label "lower_name".
Storing metadata using PySpark Column's alias method
To store metadata in a PySpark Column, we can pass the metadata option to alias(~).
The metadata is a dictionary that is stored in the StructField of the resulting Column.
To access the metadata, we can use the PySpark DataFrame's schema property:
df_new.schema["lower_name"].metadata["some_data"]
10