PySpark | User Guide
Last updated: Aug 12, 2023
Tags: PySpark
Guides
PySpark is the Python API for Apache Spark, an open-source distributed computing framework for processing big data. It lets you write Python code that runs on Spark.
The RDD (Resilient Distributed Dataset) is the central data structure of Spark. Its data is partitioned across a number of worker nodes so that operations can run in parallel.
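The partitioning idea behind RDDs can be illustrated with a small sketch. Note that this is plain Python using only the standard library, not Spark itself: the thread pool stands in for worker nodes, and the helper names (`split_into_partitions`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical and chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a list into num_partitions contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions each take one additional element.
        end = start + size + (1 if i < extra else 0)
        partitions.append(data[start:end])
        start = end
    return partitions


def sum_of_squares(partition):
    """Work done independently on one partition (one 'worker')."""
    return sum(x * x for x in partition)


def parallel_sum_of_squares(data, num_partitions=4):
    """Process each partition in parallel, then combine the partial results."""
    partitions = split_into_partitions(data, num_partitions)
    # Assumption: threads stand in for Spark worker nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_sums = list(pool.map(sum_of_squares, partitions))
    return sum(partial_sums)


if __name__ == "__main__":
    numbers = list(range(1, 11))
    print(split_into_partitions(numbers, 4))
    print(parallel_sum_of_squares(numbers))
```

In PySpark, the roughly analogous steps would be `SparkContext.parallelize(data, numSlices)` to create a partitioned RDD, followed by `RDD.map` and `RDD.reduce`. Spark also handles scheduling and fault tolerance, which this sketch leaves out.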
Getting Started with PySpark on Databricks
Databricks offers a platform for gaining hands-on experience with PySpark for free through its Community Edition.
Published by Isshin Inada