PySpark SQL Functions | element_at method
PySpark SQL Functions' element_at(~) method is used to extract values from lists or maps in a PySpark Column.
Parameters
1. col | string or Column
The column of lists or maps from which to extract values.
2. extraction | int
The position of the value that you wish to extract. Positions are one-based rather than index-based, so extraction=1 extracts the first value in each list or map. Negative positioning is also supported: extraction=-1 extracts the last element from each list.
Return Value
A new PySpark Column.
Examples
Extracting the n-th value from arrays in a PySpark Column
Consider the following PySpark DataFrame that contains some lists:
+------+
|  vals|
+------+
|[5, 6]|
|[7, 8]|
+------+
To extract the second value from each list in vals, we can use element_at(~) like so:
Here, note the following:

- the position 2 is a one-based position, not a zero-based index.
- we are using the alias(~) method to assign a label to the column returned by element_at(~).
Note that extracting values that are out of bounds will return null:
We can also extract the last element by supplying a negative value for extraction:
Extracting values from maps in a PySpark Column
Consider the following PySpark DataFrame containing some dict values:
+----------------+
|            vals|
+----------------+
|        {A -> 4}|
|{A -> 5, B -> 6}|
+----------------+
To extract the values associated with the key 'A' in the vals column:
Note that extracting values using keys that do not exist will return null:
Here, the key 'B' does not exist in the map {'A': 4}, so null was returned for that row.