PySpark SQL Functions | explode method
PySpark SQL Functions' explode(~) method flattens the specified column values of type list or dictionary: each list element (or each key-value pair of a dictionary) ends up in its own row.
Parameters
1. col | string or Column
The column containing lists or dictionaries to flatten.
Return Value
A new PySpark Column.
Examples
Flattening lists
Consider the following PySpark DataFrame:
+------+
|  vals|
+------+
|[a, b]|
|   [d]|
+------+
Here, the column vals contains lists.
To flatten the lists in the column vals, use the explode(~) method:
Here, we are using the alias(~) method to assign a label to the column returned by explode(~).
Flattening dictionaries
Consider the following PySpark DataFrame:
+----------------+
|            vals|
+----------------+
|        {a -> b}|
|{e -> f, c -> d}|
+----------------+
Here, the column vals contains dictionaries.
To flatten each dictionary in column vals, use the explode(~) method:
In the case of dictionaries, the explode(~) method returns two columns, named key and value by default: the first column contains all the keys while the second column contains all the values.