s5cmd Python Distributions

0.3.3 · active · verified Mon Apr 13

This project provides Python wheels for `s5cmd`, a high-performance command-line tool written in Go for managing S3 and S3-compatible object storage. `s5cmd` focuses on speed and efficiency for bulk operations, parallel processing, and advanced filtering. Installing the Python package makes the `s5cmd` executable available on the user's PATH, so Python applications can invoke the CLI through `subprocess`. The current version is 0.3.3; releases follow updates to the underlying `s5cmd` binary and improvements to the build infrastructure.
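Because the package's job is to put the executable on PATH, a script can confirm that it is reachable before building any commands. The following is a minimal sketch using only the standard library; the helper name `locate_s5cmd` is hypothetical and is not part of the package.

```python
import shutil
from typing import Optional


def locate_s5cmd(name: str = "s5cmd") -> Optional[str]:
    """Return the path of the executable found on PATH, or None if absent."""
    # shutil.which searches the directories listed in PATH, as a shell would.
    return shutil.which(name)


if __name__ == "__main__":
    path = locate_s5cmd()
    if path is None:
        print("s5cmd not found on PATH; try `pip install s5cmd`.")
    else:
        print(f"s5cmd found at {path}")
```

Checking up front gives a clearer message than waiting for `subprocess` to raise `FileNotFoundError` partway through a job.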

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to check that `s5cmd` is available and run basic S3 operations (list, upload, download) using Python's `subprocess` module. It assumes `s5cmd` was installed with `pip install s5cmd` and that AWS credentials are available to `s5cmd` through environment variables or the standard AWS configuration files.

import subprocess
import os

# s5cmd needs AWS credentials, e.g. from ~/.aws/credentials or from the
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION environment variables.
# Set them outside this script (shell profile, CI secrets) rather than
# hard-coding values or placeholder fallbacks here, which would hide a
# missing configuration behind invalid credentials.

try:
    # Verify s5cmd is installed and accessible in PATH
    version_output = subprocess.run(['s5cmd', 'version'], capture_output=True, text=True, check=True)
    print(f"s5cmd version:\n{version_output.stdout}")

    # Example: List objects in an S3 bucket (replace with a real bucket)
    bucket_name = "your-test-s5cmd-bucket"
    list_command = ['s5cmd', 'ls', f's3://{bucket_name}/']
    list_result = subprocess.run(list_command, capture_output=True, text=True, check=True)
    print(f"\nListing objects in s3://{bucket_name}/:\n{list_result.stdout}")

    # Example: Create a dummy local file and upload it
    local_file_name = "hello_s5cmd.txt"
    with open(local_file_name, "w") as f:
        f.write("Hello from s5cmd Python distribution!")
    
    upload_command = ['s5cmd', 'cp', local_file_name, f's3://{bucket_name}/{local_file_name}']
    upload_result = subprocess.run(upload_command, capture_output=True, text=True, check=True)
    print(f"\nUploaded {local_file_name}:\n{upload_result.stdout}")

    # Example: Download the file back
    download_command = ['s5cmd', 'cp', f's3://{bucket_name}/{local_file_name}', f'./downloaded_{local_file_name}']
    download_result = subprocess.run(download_command, capture_output=True, text=True, check=True)
    print(f"\nDownloaded 'downloaded_{local_file_name}':\n{download_result.stdout}")
    
    # Clean up local files
    os.remove(local_file_name)
    os.remove(f'./downloaded_{local_file_name}')

    # Note: For production, consider robust error handling and command construction.
    # Ensure the bucket 'your-test-s5cmd-bucket' exists and credentials have write access.

except FileNotFoundError:
    print("Error: 's5cmd' command not found. Ensure it's installed and in your system's PATH.")
except subprocess.CalledProcessError as e:
    print(f"Error executing s5cmd command: {e}")
    print(f"Stdout: {e.stdout}")
    print(f"Stderr: {e.stderr}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
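The quickstart prints raw stdout. When a script needs to act on a listing, s5cmd's global `--json` flag emits one JSON object per line, which is easier to parse than the text format. The sketch below keeps command construction and parsing separate from execution. The helper names are hypothetical, and the field names in the sample (`key`, `size`) are assumptions about the output shape that should be checked against the installed s5cmd version.

```python
import json
import subprocess
from typing import Dict, List


def build_ls_command(bucket: str, prefix: str = "") -> List[str]:
    """Build an argument list for `s5cmd --json ls` (no shell involved)."""
    return ["s5cmd", "--json", "ls", f"s3://{bucket}/{prefix}"]


def parse_json_lines(text: str) -> List[Dict]:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def list_objects(bucket: str, prefix: str = "") -> List[Dict]:
    """Run the listing and return one dict per output line."""
    result = subprocess.run(
        build_ls_command(bucket, prefix),
        capture_output=True, text=True, check=True,
    )
    return parse_json_lines(result.stdout)


if __name__ == "__main__":
    # Sample text standing in for real output; the field names are assumed.
    sample = (
        '{"key": "s3://bucket/a.txt", "size": 12}\n'
        "\n"
        '{"key": "s3://bucket/b.txt", "size": 34}\n'
    )
    for entry in parse_json_lines(sample):
        print(entry["key"], entry["size"])
```

Passing the command as a list avoids shell quoting problems with bucket names and prefixes, and keeping `parse_json_lines` free of I/O means it can be tested without network access.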
