cloudpathlib
cloudpathlib provides pathlib-style classes for interacting with files and directories in cloud storage services such as AWS S3, Google Cloud Storage, and Azure Blob Storage. It offers a familiar filesystem interface that abstracts away cloud-specific details. The library is actively maintained; the current version is 0.23.0.
Warnings
- breaking The `CloudPath.copy` method's first parameter was renamed from `destination` to `target`. Code relying on keyword arguments or positional argument names will break.
- breaking Support for Python 3.7 has been removed. The last version compatible with Python 3.7 was v0.18.1.
- breaking The `CloudPath` constructor changed how it handles a client object as the second argument. Previously, it could implicitly accept a client. Now, it needs to be passed explicitly as a keyword argument (e.g., `CloudPath('s3://...', client=my_client)`).
- deprecated The environment variable `CLOUPATHLIB_FILE_CACHE_MODE` (with a typo) was deprecated and support for it has been removed. The correct environment variable is `CLOUDPATHLIB_FILE_CACHE_MODE`.
- gotcha The default for `missing_ok` in `CloudPath.unlink()` is `True`, unlike `pathlib.Path.unlink()` where it defaults to `False`. This means `unlink()` will not raise an error if the file does not exist by default.
- gotcha `CloudPath.rmdir()` will raise a `DirectoryNotEmptyError` if the directory is not empty. To recursively remove a non-empty directory, you must use `CloudPath.rmtree()`.
- gotcha An `ImportError` due to an incompatible `google-cloud-storage` version was fixed in v0.18.1 by not using `transfer_manager` if unavailable. Users on older `google-cloud-storage` versions might encounter this.
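The `missing_ok` difference is easy to trip over when porting code between `pathlib` and cloudpathlib. The local half of the contrast can be demonstrated with the standard library alone (the cloud path below is a placeholder and shown only in a comment):

```python
import tempfile
from pathlib import Path

# A local path that does not exist
missing = Path(tempfile.gettempdir()) / "cloudpathlib-demo-does-not-exist.txt"

# pathlib: missing_ok defaults to False, so unlink() on a missing file raises
try:
    missing.unlink()
except FileNotFoundError:
    print("pathlib raised FileNotFoundError")

# Passing missing_ok=True suppresses the error, matching cloudpathlib's default
missing.unlink(missing_ok=True)  # no error

# cloudpathlib, for contrast (requires cloud credentials to actually run):
# CloudPath("s3://your-bucket/missing.txt").unlink()  # does NOT raise by default
```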
Install
- pip install cloudpathlib
- pip install "cloudpathlib[s3]"
- pip install "cloudpathlib[gs]"
- pip install "cloudpathlib[azure]"
- pip install "cloudpathlib[all]"
Imports
- CloudPath
from cloudpathlib import CloudPath
- AnyPath
from cloudpathlib import AnyPath
- S3Path
from cloudpathlib.s3 import S3Path
- GSPath
from cloudpathlib.gs import GSPath
- AzureBlobPath
from cloudpathlib.azure import AzureBlobPath
- HttpsPath
from cloudpathlib import HttpsPath
Quickstart
from cloudpathlib import CloudPath

# Ensure environment variables are set for the chosen cloud provider:
# - S3: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# - GS: GOOGLE_APPLICATION_CREDENTIALS (path to a JSON key file)
# - Azure: AZURE_STORAGE_CONNECTION_STRING

# Example for S3; replace with your actual bucket and file name
s3_file_path = CloudPath("s3://your-test-bucket/hello.txt")

try:
    # Write to the cloud file
    s3_file_path.write_text("Hello from cloudpathlib!")
    print(f"Successfully wrote to {s3_file_path}")

    # Read from the cloud file
    content = s3_file_path.read_text()
    print(f"Content read: '{content}'")

    # Check if the file exists
    if s3_file_path.exists():
        print(f"File {s3_file_path} exists.")

    # Clean up (optional)
    s3_file_path.unlink()
    print(f"File {s3_file_path} deleted.")
except Exception as e:
    print(f"An error occurred: {e}")