Writing a Pandas DataFrame to Google Cloud Storage in Python
Prerequisites
To follow along with this guide, please make sure to have:
created a service account and downloaded the private key (JSON file) for authentication (please check out my detailed guide)
installed the Python client library:
pip install --upgrade google-cloud-storage
Writing Pandas DataFrame to Google Cloud Storage as a CSV file
Consider the following Pandas DataFrame:
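The DataFrame itself is not shown here, but its contents can be recovered from the CSV string quoted later in this guide. A construction consistent with that output:

```python
import pandas as pd

# A small DataFrame whose CSV form is ',A,B\n0,3,5\n1,4,6\n'
df = pd.DataFrame({'A': [3, 4], 'B': [5, 6]})
print(df)
#    A  B
# 0  3  5
# 1  4  6
```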
Case when you already have a bucket
To write this Pandas DataFrame to Google Cloud Storage (GCS) as a CSV file, use the blob's upload_from_string(~) method:
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

# The bucket on GCS in which to write the CSV file
bucket = client.bucket('test-bucket-skytowner')
# The name assigned to the CSV file on GCS
blob = bucket.blob('my_data.csv')
blob.upload_from_string(df.to_csv(), 'text/csv')
Note the following:
if the bucket with the specified name does not exist, then an error will be thrown
the DataFrame's to_csv() method converts the DataFrame into a CSV string: ',A,B\n0,3,5\n1,4,6\n'
the second argument of upload_from_string(~) is the content type of the file
After running this code, we can see that my_data.csv has been written to our test-bucket-skytowner bucket on the GCS web console.
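To verify the upload, or to load the CSV back into a DataFrame later, the blob's download_as_text() method can be used. The sketch below assumes the bucket and blob names used above; the read_csv_blob helper name is my own:

```python
import io

import pandas as pd

def read_csv_blob(bucket, blob_name):
    # Download the blob's contents as a string and parse it back into a
    # DataFrame, treating the first CSV column as the index
    text = bucket.blob(blob_name).download_as_text()
    return pd.read_csv(io.StringIO(text), index_col=0)

# e.g. df = read_csv_blob(client.bucket('test-bucket-skytowner'), 'my_data.csv')
```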
Case when you do not have a bucket
The above solution only works when you have already created a bucket in which to place the file on GCS - specifying a bucket that does not exist will throw an error. Therefore, we must first create a bucket on GCS using the method create_bucket(~), which returns the created bucket:
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

# The NEW bucket on GCS in which to write the CSV file
bucket = client.create_bucket('test-v2-bucket-skytowner')
# The name assigned to the CSV file on GCS
blob = bucket.blob('my_data.csv')
blob.upload_from_string(df.to_csv(), 'text/csv')
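Note that calling create_bucket(~) with a name that already exists raises an error, so it can be convenient to look the bucket up first. The client's lookup_bucket(~) method returns None when the bucket does not exist; the helper below is a sketch of that pattern (the function name is my own):

```python
def get_or_create_bucket(client, bucket_name):
    # lookup_bucket(~) returns None if the bucket does not exist,
    # in which case we create it
    bucket = client.lookup_bucket(bucket_name)
    if bucket is None:
        bucket = client.create_bucket(bucket_name)
    return bucket
```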
Writing Pandas DataFrame to Google Cloud Storage as a feather file
The logic for writing a Pandas DataFrame to GCS as a feather file is very similar to the CSV case, except that we must first write the feather file locally, and then upload this file using the method upload_from_filename(~):
import pyarrow.feather as feather

feather.write_feather(df, './feather_df')

# The bucket in which to place the feather file on GCS
bucket = storage.Bucket(client, 'example-bucket-skytowner')
# The name to assign to the feather file on GCS
blob = bucket.blob('my_data.feather')
blob.upload_from_filename('./feather_df')
After running this code, we should see the my_data.feather file appear on the GCS web console.