obstore: Python Object Storage Interface
obstore is a Python library providing a simple, high-throughput interface to object storage services such as Amazon S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible APIs. It offers both synchronous and asynchronous APIs, streaming downloads and uploads, and automatic multipart uploads for large files, and it is backed by a Rust implementation for performance. The current version is 0.9.2. Releases are frequent and mostly contain minor updates and fixes.
Warnings
- breaking Support for Python 3.9 was deprecated in `obstore` version 0.9.0. Users on Python 3.9 should upgrade their Python environment.
- breaking `S3Store.from_session()` and `S3Store._from_native()` were removed in a release prior to 0.7.0. Users should switch to credential providers for S3 authentication.
- breaking Path encoding behavior changed in version 0.8.2 to prevent unintentional double-encoding. Users must now ensure that paths provided to `obstore` are valid and correctly encoded.
- breaking In version 0.5.0, the `container` parameter for `AzureStore`'s constructor was renamed to `container_name` and became a keyword-only argument. Using `container` will raise an error.
- gotcha Unlike many object storage libraries, `obstore`'s core operations (`put`, `get`, `list`, `delete`, `copy`, etc.) are top-level functions (e.g., `obs.put(store, ...)`) rather than methods on the store object (e.g., `store.put(...)`).
- gotcha Listing objects with `obs.list(store, return_arrow=True)` requires the optional `arro3-core` dependency. Without it installed, the call fails.
Install
pip install obstore
Imports
- obstore
import obstore as obs
- MemoryStore
from obstore.store import MemoryStore
- put
obs.put(store, 'path', b'data')
Quickstart
import obstore as obs
from obstore.store import MemoryStore
# Initialize an in-memory store for demonstration
store = MemoryStore()
# Define a file path and content
file_path = "my_document.txt"
file_content = b"Hello, obstore world!"
# Put the object into the store
obs.put(store, file_path, file_content)
print(f"Object '{file_path}' put into store.")
# Get the object from the store
response = obs.get(store, file_path)
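# response.bytes() returns a buffer-protocol object; convert it to Python bytes before decoding
retrieved_content = bytes(response.bytes())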
print(f"Retrieved content: {retrieved_content.decode()}")
assert retrieved_content == file_content
print("Content matches!")
# Asynchronous variants are also available (they must run inside a coroutine):
# import asyncio
#
# async def main():
#     await obs.put_async(store, 'async_file.txt', b'async data')
#     res_async = await obs.get_async(store, 'async_file.txt')
#     data = bytes(await res_async.bytes_async())
#     print(f"Async retrieved: {data.decode()}")
#
# asyncio.run(main())