PySpark DataFrame | toPandas method
Last updated: Mar 10, 2022
Tags: PySpark
PySpark DataFrame's toPandas(~) method converts a PySpark DataFrame into a Pandas DataFrame.
WARNING
Watch out for the following:
All the data from the worker nodes is transferred to the driver, so make sure that the driver has sufficient memory to hold the entire DataFrame.
The driver must have the Pandas library installed.
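One way to guard against the memory issue above is to cap the number of rows before collecting them. The following is a minimal sketch, not part of the PySpark API: the helper name to_pandas_capped and its max_rows parameter are hypothetical, and it assumes that sdf is a PySpark DataFrame.

```python
def to_pandas_capped(sdf, max_rows=10_000):
    """Collect at most `max_rows` rows of a PySpark DataFrame into Pandas.

    Hypothetical helper for illustration; `sdf` is assumed to be a
    PySpark DataFrame (anything exposing limit() and toPandas()).
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be a positive integer")
    # limit() is lazy, so the cap is applied on the cluster and only the
    # reduced result is transferred to the driver by toPandas().
    return sdf.limit(max_rows).toPandas()
```

For example, to_pandas_capped(df, max_rows=2) would return a Pandas DataFrame with at most two rows. Without an explicit ordering, which rows are kept is not guaranteed.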
Examples
Consider the following DataFrame:
df = spark.createDataFrame([["Alex", 20], ["Bob", 24], ["Cathy", 22]], ["name", "age"])
df.show()
+-----+---+
| name|age|
+-----+---+
| Alex| 20|
|  Bob| 24|
|Cathy| 22|
+-----+---+
To convert this PySpark DataFrame into a Pandas DataFrame:
df.toPandas()
    name  age
0   Alex   20
1    Bob   24
2  Cathy   22
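Once converted, the result is an ordinary Pandas DataFrame held in driver memory, so the usual Pandas operations apply. As a small sketch (the function name mean_age_on_driver is hypothetical, and it assumes the DataFrame has a numeric age column like the one above):

```python
def mean_age_on_driver(sdf):
    """Convert a small PySpark DataFrame to Pandas and average its `age` column.

    Hypothetical helper for illustration; assumes `sdf` has a numeric
    column named `age`.
    """
    # After this call, all rows live in the driver's memory as a Pandas DataFrame.
    pdf = sdf.toPandas()
    # Standard Pandas column access and aggregation.
    return float(pdf["age"].mean())
```

Calling mean_age_on_driver(df) on the example DataFrame would average the ages 20, 24 and 22. For large data, computing the aggregate in Spark and collecting only the result avoids the transfer to the driver.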
Published by Isshin Inada
Official PySpark Documentation
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.toPandas.html