Removing files on Google Cloud Storage using Python
Prerequisites
To follow along with this guide, please make sure to have:
created a service account and downloaded the private key (JSON file) for authentication (please check out my detailed guide)
installed the Python client library for Google Cloud Storage:
pip install --upgrade google-cloud-storage
Removing a single file on Google Cloud Storage
Suppose we have a file called cat.png on Google Cloud Storage (GCS) at the root level. To remove this file on GCS, call blob.delete():
from google.cloud import storage

# Authorize ourselves using the private key of the service account
path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

bucket = storage.Bucket(client, 'example-bucket-skytowner')
blob = bucket.blob('cat.png')
blob.delete()
Equivalently, we can delete a blob using the delete_blob(~) method of the bucket:
bucket = storage.Bucket(client, 'example-bucket-skytowner')
bucket.delete_blob('cat.png')
Unless object versioning or soft delete is enabled on the bucket, a delete operation is irreversible, so be cautious when deleting files!
Handling the case when the file does not exist
Note that if the specified file does not exist, then a NotFound error is raised:
blob = bucket.blob('non_existing_file.png')
blob.delete()
NotFound: 404 DELETE https://storage.googleapis.com/storage/v1/b/example-bucket-skytowner/o/non_existing_file.png?prettyPrint=false: No such object: example-bucket-skytowner/non_existing_file.png
You can handle this NotFound error by using the try-except clause:
from google.cloud.exceptions import NotFound

try:
    blob = bucket.blob('non_existing_file.png')
    blob.delete()
except NotFound:
    print(f'🚨 {blob.name} does not exist - do something')
🚨 non_existing_file.png does not exist - do something
Notice how we had to import Google Cloud's NotFound exception here.
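When a missing file is an expected situation rather than an error, the check can be wrapped in a small helper. The following is a minimal sketch of one possible approach, not something the client library provides: delete_if_exists is a hypothetical name, and it uses blob.exists() to test for the file first instead of catching NotFound. It accepts a bucket object such as the one created earlier.

```python
def delete_if_exists(bucket, blob_name):
    """Delete blob_name from bucket if present; return True if a delete was issued.

    Hypothetical helper: `bucket` is assumed to be a google.cloud.storage
    Bucket (any object exposing .blob(name) behaves the same way here).
    """
    blob = bucket.blob(blob_name)
    # exists() costs one extra request, and another process could remove the
    # object between this check and the delete, so NotFound remains possible.
    if not blob.exists():
        return False
    blob.delete()
    return True
```

With this helper, delete_if_exists(bucket, 'non_existing_file.png') would return False instead of raising. When other processes may be deleting the same files, the try-except form is likely the safer choice because it does not depend on a separate existence check.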
Removing multiple files on Google Cloud Storage
GCS does not offer a single call that removes several files at once. Instead, we must call blob.delete() for each file.
Suppose we have two files called cat.png and sample.txt on GCS at the root level. To remove these files, we iteratively call blob.delete() for each blob item:
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

bucket = storage.Bucket(client, 'example-bucket-skytowner')
str_files_to_delete_on_gcs = ['cat.png', 'sample.txt']
for str_file in str_files_to_delete_on_gcs:
    blob = bucket.blob(str_file)
    blob.delete()
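The loop above stops at the first file that does not exist, because blob.delete() raises NotFound. One way to keep going and report the missing names afterwards is sketched below. It assumes the on_error callback of Bucket.delete_blobs, which the library calls for each blob that raises NotFound; delete_files and record_missing are hypothetical names introduced for illustration.

```python
def delete_files(bucket, blob_names):
    """Delete each name in blob_names; return the names that were not found.

    Hypothetical helper built on Bucket.delete_blobs. Its on_error callback is
    invoked for every blob that raises NotFound, so the loop is not aborted.
    """
    missing = []

    def record_missing(blob):
        # The callback receives the item as it was passed in, which may be a
        # plain name (str) or a Blob object.
        missing.append(blob if isinstance(blob, str) else blob.name)

    bucket.delete_blobs(list(blob_names), on_error=record_missing)
    return missing
```

Calling delete_files(bucket, ['cat.png', 'sample.txt']) should return an empty list when both files exist, and the names of any absent files otherwise.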
Removing a folder on Google Cloud Storage
Suppose we have a folder called my_folder that holds two files:
📁 my_folder
├─ cat.png
└─ sample.txt
There is no direct method to remove a folder on GCS. We must first fetch all the files belonging to a folder using the list_blobs() method with the prefix argument, and then delete them iteratively:
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

# The bucket must be defined before we can list its blobs
bucket = storage.Bucket(client, 'example-bucket-skytowner')
blobs = bucket.list_blobs(prefix='my_folder/')
for blob in blobs:
    print(f'Deleting file {blob.name}')
    blob.delete()
Deleting file my_folder/
Deleting file my_folder/cat.png
Deleting file my_folder/uploaded_sample.txt
Here, we are deleting the folder blob as well as all the blobs (files) within the folder individually.
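The prefix is matched as a plain string, so the trailing slash matters: prefix='my_folder' would also match a blob such as my_folder_backup/cat.png. The sketch below shows one way to guard against that by normalizing the folder name first. The names folder_prefix and delete_folder are hypothetical and not part of the client library.

```python
def folder_prefix(folder):
    """Return folder as a prefix that ends in exactly one '/'."""
    name = folder.strip('/')
    if not name:
        # An empty prefix would match every blob in the bucket.
        raise ValueError('folder name must not be empty')
    return name + '/'


def delete_folder(bucket, folder):
    """Delete every blob under folder; return the names that were deleted.

    Hypothetical helper: `bucket` is assumed to be a google.cloud.storage Bucket.
    """
    deleted = []
    for blob in bucket.list_blobs(prefix=folder_prefix(folder)):
        blob.delete()
        deleted.append(blob.name)
    return deleted
```

Calling delete_folder(bucket, 'my_folder') would then list with the prefix my_folder/ and leave blobs under similarly named folders untouched.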
Alternatively, one could delete the blobs using the Bucket.delete_blobs(~) method:
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

# The bucket must be defined before we can list its blobs
bucket = storage.Bucket(client, 'example-bucket-skytowner')
blobs = bucket.list_blobs(prefix='my_folder/')
bucket.delete_blobs(blobs)
Note that delete_blobs(~) does not perform the deletion in bulk; it simply calls the delete_blob(~) method for each blob in turn.
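Each blob still needs its own delete call, but the client library can group those calls into fewer HTTP requests through the client.batch() context manager. The following is a rough sketch under a few assumptions: the bucket was created with the same client, the batch size of 100 is a conservative guess at the per-batch limit that should be checked against the current GCS documentation, and chunked and delete_in_batches are hypothetical helper names.

```python
from itertools import islice

BATCH_SIZE = 100  # assumed per-batch call limit; verify against the GCS docs


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_in_batches(client, bucket, prefix, batch_size=BATCH_SIZE):
    """Delete every blob under prefix, grouping the deletes into batches.

    Hypothetical helper; returns the number of delete calls issued.
    """
    # List first: calls made inside the batch context are deferred until the
    # context exits, so the listing has to happen outside of it.
    names = [blob.name for blob in bucket.list_blobs(prefix=prefix)]
    for chunk in chunked(names, batch_size):
        with client.batch():
            for name in chunk:
                bucket.delete_blob(name)
    return len(names)
```

A call such as delete_in_batches(client, bucket, 'my_folder/') would send the deletes for up to 100 blobs per request. Errors for individual blobs are expected to surface when each batch context exits rather than at the delete_blob call itself.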
Removing all files on Google Cloud Storage
To remove all files on GCS, first use the list_blobs(~) method to get a reference to all the blobs (files) available in the GCS bucket. We then iteratively delete them one by one using blob.delete():
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

blobs = client.list_blobs('example-bucket-skytowner')
for blob in blobs:
    print(f'Deleting file {blob.name}')
    blob.delete()
Deleting file cat.png
Deleting file uploaded_sample.txt
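Since emptying a bucket is easy to get wrong, it can help to preview what would be removed before deleting anything. The sketch below adds a dry-run switch around the same list-then-delete loop. The function name delete_all_blobs and the dry_run parameter are hypothetical, not part of the client library.

```python
def delete_all_blobs(client, bucket_name, dry_run=True):
    """List every blob in bucket_name and delete it unless dry_run is True.

    Hypothetical helper; returns the blob names that were (or would be) deleted.
    """
    names = []
    for blob in client.list_blobs(bucket_name):
        if not dry_run:
            blob.delete()
        names.append(blob.name)
    return names
```

A cautious workflow would call delete_all_blobs(client, 'example-bucket-skytowner') first, inspect the returned names, and only then repeat the call with dry_run=False.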