Using SQL against a PySpark DataFrame
Consider the following PySpark DataFrame:
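One way to build it (a minimal sketch, assuming a standard SparkSession; the column names and values are taken from the output shown below):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Build the DataFrame whose contents are shown below
df = spark.createDataFrame([("Alex", 20), ("Bob", 30), ("Cathy", 40)], ["name", "age"])
df.show()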
+-----+---+
| name|age|
+-----+---+
| Alex| 20|
|  Bob| 30|
|Cathy| 40|
+-----+---+
Registering PySpark DataFrame as a SQL table
Before we can run SQL queries against a PySpark DataFrame, we must first register the DataFrame as a SQL table:
df.createOrReplaceTempView("users")
Here, we have registered the DataFrame as a SQL table called users. This temporary view will be dropped whenever the Spark session ends. In contrast, a view created with createGlobalTempView(~) is shared across Spark sessions and is only dropped when the Spark application ends.
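As a quick sketch of the global variant (the view name users_global is hypothetical), note that global temporary views live in the global_temp database and must be referenced with that qualifier:

# Register a global temporary view, visible to all sessions in this application
df.createGlobalTempView("users_global")

# Global temp views must be qualified with the global_temp database
spark.sql("SELECT * FROM global_temp.users_global").show()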
Running SQL queries against a PySpark DataFrame
We can now run SQL queries against our PySpark DataFrame:
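For example, a simple SELECT against the registered users view (a minimal query consistent with the output below):

spark.sql("SELECT * FROM users").show()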
+-----+---+
| name|age|
+-----+---+
| Alex| 20|
|  Bob| 30|
|Cathy| 40|
+-----+---+
Only read-only SQL statements are allowed. Data manipulation language (DML) statements such as UPDATE and DELETE are not supported, since PySpark has no notion of transactions.
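For instance, attempting a DELETE against the temporary view raises an AnalysisException (a sketch; the exact error message varies by Spark version):

from pyspark.sql.utils import AnalysisException

try:
    spark.sql("DELETE FROM users WHERE age = 20")
except AnalysisException as e:
    # Spark rejects the statement at analysis time, e.g. complaining
    # that DELETE is not supported for this kind of table
    print(e)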
Using variables in SQL queries
The sql(~) method takes in a SQL query expression (string), so variables can be incorporated using f-strings:
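As a sketch consistent with the output below (min_age is a hypothetical variable, chosen so that all rows match):

min_age = 15  # hypothetical threshold; all three rows satisfy it
spark.sql(f"SELECT * FROM users WHERE age > {min_age}").show()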
+-----+---+
| name|age|
+-----+---+
| Alex| 20|
|  Bob| 30|
|Cathy| 40|
+-----+---+