s3pathlib
s3pathlib is a Python package that provides an intuitive, object-oriented interface for manipulating AWS S3 objects and directories. Its API closely resembles the Python standard library's `pathlib` module, so S3 interactions feel familiar and Pythonic. The current version is 2.3.6, and the library is actively maintained with regular minor releases.
Warnings
- gotcha S3Path objects are immutable. Operations that modify the S3 object (like `write_text()` or `write_bytes()`) return a *new* S3Path object rather than updating the one you called them on; this matters most in versioning-enabled buckets. Always reassign the result if you need to work with the updated path object.
- gotcha Write operations (`write_text()`, `write_bytes()`) will silently overwrite existing files by default. There is no automatic error or warning if the target S3 object already exists.
- gotcha S3 does not have a true directory concept; `s3pathlib` provides a 'logical' or 'soft' directory abstraction. A path ending with `/` is treated as a directory. Understanding this distinction is crucial for directory-related operations.
- gotcha While `s3pathlib` mimics the `pathlib.Path` API, `S3Path` is not a direct subclass of `pathlib.Path`. This means `isinstance(s3_path_object, pathlib.Path)` checks will return `False` and code expecting a `pathlib.Path` might break.
- gotcha Implicit `boto3` session usage can lead to unexpected AWS credential behavior. If you don't explicitly attach a `boto3` session using `context.attach_boto_session()`, `s3pathlib` will rely on `boto3`'s default credential chain (environment variables, shared credential file, IAM roles), which might not be what you intend.
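The overwrite and reassignment gotchas above can be handled together with a small guard. The following is a minimal sketch, not part of s3pathlib: `write_text_if_absent` is a hypothetical helper that relies only on the `exists()` and `write_text()` methods mentioned in this document, and `InMemoryPath` is a stand-in class invented for the demo so the snippet runs without AWS access. Note that the check and the write are two separate calls, so this does not protect against a concurrent writer.

```python
from typing import Protocol


class WritablePath(Protocol):
    """Minimal duck type: the two methods the guard relies on."""

    def exists(self) -> bool: ...
    def write_text(self, data: str): ...


def write_text_if_absent(path: WritablePath, text: str):
    """Write `text` only when nothing exists at `path` yet.

    Hypothetical helper, not an s3pathlib API. The existence check and the
    write are not atomic, so a concurrent writer can still slip in between.
    """
    if path.exists():
        raise FileExistsError(f"refusing to overwrite existing object: {path!r}")
    # Return whatever write_text() returns so the caller can reassign it.
    return path.write_text(text)


class InMemoryPath:
    """Demo-only stand-in that mimics the return-a-new-object behavior."""

    def __init__(self, key: str, store: dict):
        self.key = key
        self.store = store

    def exists(self) -> bool:
        return self.key in self.store

    def write_text(self, data: str) -> "InMemoryPath":
        self.store[self.key] = data
        return InMemoryPath(self.key, self.store)

    def __repr__(self) -> str:
        return f"InMemoryPath({self.key!r})"


store = {}
path = InMemoryPath("my-folder/hello.txt", store)
path = write_text_if_absent(path, "Hello, s3pathlib!")  # reassign the result

try:
    write_text_if_absent(path, "second write")
    overwritten = True
except FileExistsError:
    overwritten = False  # the guard blocked the silent overwrite
```

Assuming `S3Path` exposes the same two methods as described above, the call would presumably look the same with a real path object: `s3_path = write_text_if_absent(s3_path, "...")`.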
Install
pip install s3pathlib
Imports
- S3Path
from s3pathlib import S3Path
- context
from s3pathlib import context
Quickstart
import boto3
import os
from s3pathlib import S3Path, context
# Configure AWS credentials (e.g., from environment variables or a profile)
# In a real application, consider using AWS IAM roles or proper credential management.
aws_region = os.environ.get("AWS_REGION", "us-east-1")
aws_profile = os.environ.get("AWS_PROFILE", None)
# Attach a boto3 session for s3pathlib to use
if aws_profile:
    session = boto3.session.Session(region_name=aws_region, profile_name=aws_profile)
else:
    session = boto3.session.Session(region_name=aws_region)
context.attach_boto_session(session)
# Define an S3 path object
bucket_name = os.environ.get("S3_BUCKET_NAME", "your-test-bucket-12345")
s3_path = S3Path(bucket_name, "my-folder", "hello.txt")
# Example: Write text to S3
print(f"Writing to {s3_path.uri}...")
s3_path.write_text("Hello, s3pathlib!")
print("Content written.")
# Example: Read text from S3
if s3_path.exists():
    content = s3_path.read_text()
    print(f"Content read from S3: '{content}'")
# Example: Check whether a folder (a key prefix ending in "/") exists
s3_dir = S3Path(bucket_name, "my-folder/")
print(f"Checking if {s3_dir.uri} exists: {s3_dir.exists()}")
# Clean up (optional)
# s3_path.delete() # Deletes the object
# s3_dir.delete_dir() # Deletes all objects under the prefix (careful!)
# Detach the boto3 session when done (optional, for explicit resource management)
context.detach_boto_session()
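The folder check in the quickstart depends on the trailing `/` convention, because S3 itself only stores flat keys. The sketch below illustrates that idea in plain Python, with no AWS calls: a "directory" is just a key prefix, and listing it means filtering keys by that prefix. The function names and the sample keys are hypothetical, and this is not a description of how s3pathlib is implemented internally.

```python
def dir_exists(keys, prefix):
    """A logical directory 'exists' if at least one key starts with its prefix."""
    return any(key.startswith(prefix) for key in keys)


def list_under(keys, prefix):
    """Return the keys that live inside the logical directory `prefix`."""
    if not prefix.endswith("/"):
        raise ValueError("a logical directory prefix must end with '/'")
    # Exclude the prefix itself in case an empty placeholder object uses it as key.
    return sorted(key for key in keys if key.startswith(prefix) and key != prefix)


# Hypothetical flat key listing for one bucket.
keys = [
    "my-folder/hello.txt",
    "my-folder/sub/data.csv",
    "my-folder-backup/hello.txt",
]

children = list_under(keys, "my-folder/")
# With the trailing slash, "my-folder-backup/..." is correctly left out.
loose_matches = [key for key in keys if key.startswith("my-folder")]
```

Here `children` holds the two keys under `my-folder/`, while `loose_matches` also picks up the `my-folder-backup/` key, which is why a directory path should end with `/`.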