{"id":580,"library":"smart-open","title":"Smart-open","description":"Smart-open is a Python 3 library (current version 7.5.1) for efficient streaming of very large files from and to various storage systems, including S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, and local filesystems. It provides transparent, on-the-fly (de-)compression for formats like gzip, bz2, and zst, acting as a drop-in replacement for Python's built-in `open()` function. The library is actively maintained with frequent releases, offering a unified Pythonic API to simplify working with remote files and cloud storage services.","status":"active","version":"7.5.1","language":"python","source_language":"en","source_url":"https://github.com/piskvorky/smart_open","tags":["file-io","cloud-storage","s3","gcs","azure","hdfs","compression","streaming"],"install":[{"cmd":"pip install smart-open","lang":"bash","label":"Base installation (no cloud/compression dependencies)"},{"cmd":"pip install 'smart-open[s3,gcs,azure,http,webhdfs,ssh,zst]'","lang":"bash","label":"Installation with common cloud and compression dependencies"}],"dependencies":[{"reason":"Required for Amazon S3 integration. Installed with `smart-open[s3]` extra.","package":"boto3","optional":true},{"reason":"Required for Google Cloud Storage integration. Installed with `smart-open[gcs]` extra.","package":"google-cloud-storage","optional":true},{"reason":"Required for Azure Blob Storage integration. Installed with `smart-open[azure]` extra.","package":"azure-storage-blob","optional":true},{"reason":"Required for SSH/SFTP integration. Installed with `smart-open[ssh]` extra.","package":"paramiko","optional":true},{"reason":"Required for HTTP/HTTPS streaming. Installed with `smart-open[http]` extra.","package":"requests","optional":true},{"reason":"Required for HDFS/WebHDFS integration. Installed with `smart-open[webhdfs]` extra.","package":"hdfs","optional":true},{"reason":"Required for Zstandard (ZST) compression. 
Installed with `smart-open[zst]` extra.","package":"python-zstandard","optional":true}],"imports":[{"symbol":"open","correct":"from smart_open import open"},{"note":"Prior to v1.8.1, the main function was imported as `smart_open.smart_open`. Since v1.8.1 (and solidified in v2.0.0), it's `smart_open.open` to align with Python's built-in `open`.","wrong":"from smart_open import smart_open","symbol":"smart_open","correct":"import smart_open\n# Access the main function as smart_open.open"}],"quickstart":{"code":"import os\nfrom smart_open import open\n\n# Example for S3; similar patterns apply to GCS, Azure, etc.\n# Ensure AWS credentials are configured (e.g., via environment variables, AWS CLI config, or IAM role).\n# For production, consider explicit credential management via transport_params.\nS3_BUCKET_NAME = os.environ.get('SMART_OPEN_S3_BUCKET', 'my-smart-open-test-bucket')\nS3_KEY = 'example.txt'\nS3_URL = f\"s3://{S3_BUCKET_NAME}/{S3_KEY}\"\n\n# Write to S3\nprint(f\"Writing to {S3_URL}...\")\nwith open(S3_URL, 'w') as fout:\n    fout.write('Hello, smart-open from S3!\\n')\n    fout.write('This is a second line.\\n')\nprint(\"Write complete.\")\n\n# Read from S3\nprint(f\"Reading from {S3_URL}...\")\nwith open(S3_URL, 'r') as fin:\n    for line in fin:\n        print(f\"Read line: {line.strip()}\")\nprint(\"Read complete.\")","lang":"python","description":"This quickstart demonstrates how to use `smart_open.open` to read from and write to an S3 bucket. It automatically handles transparent compression/decompression based on file extension and integrates with underlying SDKs like boto3 for S3 access. Make sure your environment has appropriate cloud credentials configured."},"warnings":[{"fix":"Upgrade your Python environment to 3.10 or newer. If you need to use an older Python version, pin `smart-open` to a compatible version (e.g., `<7.5.1`).","message":"As of `smart-open` v7.5.1, the minimum supported Python version is 3.10. 
Earlier versions (e.g., v2.0.0) supported Python 3.5+.","severity":"breaking","affected_versions":">=7.5.1"},{"fix":"Update your import statements from `from smart_open import smart_open` to `from smart_open import open`.","message":"The primary import for the `open` function changed from `from smart_open import smart_open` (pre-v1.8.1) to `from smart_open import open` (post-v1.8.1, solidified in v2.0.0) to align with Python's built-in `open`.","severity":"breaking","affected_versions":"<2.0.0 to >=2.0.0"},{"fix":"If your code implicitly relied on 'rb' as the default, explicitly pass `mode='rb'` to `smart_open.open`.","message":"The default read mode for `smart_open.open` changed from 'rb' (read binary) to 'r' (read text) in v1.8.1 to match the behavior of Python's built-in `open`.","severity":"breaking","affected_versions":"<1.8.1 to >=1.8.1"},{"fix":"Install `smart-open` with the necessary extras, e.g., `pip install 'smart-open[s3,gcs]'` for S3 and GCS support.","message":"`smart-open` does not install cloud or compression library dependencies by default to keep installation size small. Functionality like S3 or GCS will fail if their respective dependencies (`boto3`, `google-cloud-storage`) are not installed.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Refer to the documentation for your cloud provider's SDK (e.g., boto3 for AWS, google-cloud-storage for GCS) for credential setup. You can also pass client objects or credentials via the `transport_params` argument to `smart_open.open`.","message":"Cloud storage operations (S3, GCS, Azure) require proper credential configuration. Failing to provide credentials (e.g., via environment variables, SDK defaults, or `transport_params`) will result in authentication errors.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Avoid installing or upgrading to `smart-open==7.3.0`. 
Install a subsequent patch release like `7.3.1` or the latest stable version.","message":"Version 7.3.0 was yanked from PyPI because its `pyproject.toml` incorrectly claimed Python 3.7 support, even though it had already been dropped in that release train.","severity":"deprecated","affected_versions":"7.3.0"},{"fix":"Review any existing code that directly used or configured `smart_open.s3.iter_bucket` for custom thread pool or client management, as its internal concurrency model has changed. If you relied on separate clients per thread/process, adjust your logic accordingly.","message":"In `smart-open` v7.4.0, the `smart_open.s3.iter_bucket` function was updated to use a single shared `concurrent.futures.ThreadPoolExecutor` and a single shared thread-safe `S3.Client`.","severity":"breaking","affected_versions":">=7.4.0"}],"env_vars":null,"last_verified":"2026-05-12T16:17:38.796Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"pip install smart-open","cause":"The `smart-open` library has not been installed in your current Python environment.","error":"ModuleNotFoundError: No module named 'smart_open'"},{"fix":"pip install 'smart-open[s3]'","cause":"You are attempting to open a file from Amazon S3, but the required `boto3` dependency (part of the `s3` extra) is not installed.","error":"ImportError: Missing optional dependency 'boto3'. Use pip or conda to install smart-open[s3]."},{"fix":"pip install 'smart-open[gcs]'","cause":"You are attempting to open a file from Google Cloud Storage, but the required `google-cloud-storage` dependency (part of the `gcs` extra) is not installed.","error":"ImportError: Missing optional dependency 'google-cloud-storage'. 
Use pip or conda to install smart-open[gcs]."},{"fix":"Verify that the file path or URI is correct and the file exists at the specified location and that you have necessary permissions.","cause":"The specified file path or URI (e.g., S3 object key, GCS blob path, local file path) does not exist in the given storage system.","error":"FileNotFoundError: [Errno 2] No such file or directory: 's3://your-bucket/non-existent-file.txt'"}],"ecosystem":"pypi","meta_description":null,"install_score":100,"install_tag":"verified","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":"7.6.1","install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"sdist","failure_reason":null,"install_time_s":null,"import_time_s":2.8,"mem_mb":40,"disk_size":"100.8M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.18,"mem_mb":7.6,"disk_size":"18.7M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.69,"mem_mb":39.6,"disk_size":"99.6M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.2,"mem_mb":7.6,"disk_size":"18.7M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim 
(glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":9.2,"import_time_s":2.2,"mem_mb":38.7,"disk_size":"102M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":2,"import_time_s":0.14,"mem_mb":7.6,"disk_size":"19M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.01,"mem_mb":38.3,"disk_size":"100M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.13,"mem_mb":7.6,"disk_size":"19M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"sdist","failure_reason":null,"install_time_s":null,"import_time_s":3.51,"mem_mb":44.7,"disk_size":"108.4M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.26,"mem_mb":8.4,"disk_size":"20.7M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.02,"mem_mb":44.4,"disk_size":"107.2M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.29,"mem_mb":8.4,"disk_size":"20.7M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim 
(glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":8.3,"import_time_s":3.06,"mem_mb":43.6,"disk_size":"109M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.9,"import_time_s":0.22,"mem_mb":8.4,"disk_size":"21M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.84,"mem_mb":43.3,"disk_size":"108M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.21,"mem_mb":8.4,"disk_size":"21M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"sdist","failure_reason":null,"install_time_s":null,"import_time_s":3.78,"mem_mb":44.2,"disk_size":"99.4M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.24,"mem_mb":8.1,"disk_size":"12.6M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.08,"mem_mb":43.9,"disk_size":"98.2M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.25,"mem_mb":8.1,"disk_size":"12.5M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":7,"import_time_s":3.45,"mem_mb":43.1,"disk_size":"100M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.7,"import_time_s":0.24,"mem_mb":8.1,"disk_size":"13M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":3.75,"mem_mb":42.8,"disk_size":"99M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.24,"mem_mb":8.1,"disk_size":"13M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"sdist","failure_reason":null,"install_time_s":null,"import_time_s":3.61,"mem_mb":46,"disk_size":"98.9M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.23,"mem_mb":8.6,"disk_size":"12.3M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.07,"mem_mb":45.7,"disk_size":"97.6M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.25,"mem_mb":8.6,"disk_size":"12.2M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim 
(glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":7.2,"import_time_s":3.37,"mem_mb":44.9,"disk_size":"100M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.7,"import_time_s":0.25,"mem_mb":8.6,"disk_size":"13M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":3.72,"mem_mb":44.6,"disk_size":"98M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.24,"mem_mb":8.6,"disk_size":"13M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"sdist","failure_reason":null,"install_time_s":null,"import_time_s":2.63,"mem_mb":39.1,"disk_size":"100.6M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":0.16,"mem_mb":7.4,"disk_size":"18.2M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.4,"mem_mb":39,"disk_size":"99.6M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.18,"mem_mb":7.4,"disk_size":"18.2M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":10.9,"import_time_s":2.53,"mem_mb":37.9,"disk_size":"101M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":2.4,"import_time_s":0.16,"mem_mb":7.4,"disk_size":"19M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"s3,gcs,azure,http,webhdfs,ssh,zst","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.18,"mem_mb":37.7,"disk_size":"100M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":0.14,"mem_mb":7.4,"disk_size":"19M"}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}