Conda Package Streaming
An efficient library for reading conda packages in both the newer `.conda` format and the older `.tar.bz2` format. It can download conda metadata from remote packages without transferring the entire file, and read metadata from local `.tar.bz2` packages without reading the entire file. The library, currently at version 0.12.0, uses an enhanced version of pip's `lazy_wheel` for `.conda` files and `tarfile.open` for `.tar.bz2` files to stream data efficiently. Releases are regular, with major updates roughly every few months.
Common errors
- ModuleNotFoundError: No module named 'conda_package_streaming.package_streaming'
  cause: Attempting to import `package_streaming` or its functions directly from the top-level `conda_package_streaming` module, instead of as a submodule.
  fix: Correct the import path to `from conda_package_streaming import package_streaming` and then access functions like `package_streaming.stream_conda_info`.
- json.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
  cause: Attempting to `json.load()` a `tarfile.TarInfo` object directly, or an extracted file-like object that is empty or does not contain valid JSON, often due to an incorrect `member.name` check or reading the wrong file.
  fix: Ensure `member.name` correctly identifies a JSON file (e.g., `'info/index.json'`) and that `tar.extractfile(member)` returns a valid, non-empty file-like object containing well-formed JSON before attempting to load it.
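The `JSONDecodeError` case above can be guarded against with a small defensive loader. The following is a minimal sketch, not part of conda-package-streaming: `load_json_member` and `make_demo_tar` are hypothetical helpers, and the in-memory archive built with the standard `tarfile` module only stands in for a real package.

```python
import io
import json
import tarfile


def load_json_member(tar, member):
    """Hypothetical helper: load JSON from one tar member, failing with clear errors."""
    # extractfile() returns None for members that are not regular files or links
    fileobj = tar.extractfile(member)
    if fileobj is None:
        raise ValueError(f"{member.name} is not a regular file")
    data = fileobj.read()
    if not data.strip():
        raise ValueError(f"{member.name} is empty")
    return json.loads(data)


def make_demo_tar(files):
    """Build an in-memory tar archive from a {name: bytes} mapping (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


demo = make_demo_tar({
    "info/index.json": b'{"name": "demo", "version": "1.0"}',
    "info/empty.json": b"",
})

with tarfile.open(fileobj=demo, mode="r") as tar:
    for member in tar:
        # Pass the (tar, member) pair to the loader, never the TarInfo alone
        if member.name == "info/index.json":
            index_json = load_json_member(tar, member)
            print(index_json["name"], index_json["version"])
            break
```

Raising a `ValueError` that names the member makes it easier to tell an empty or wrong file apart from malformed JSON, which still surfaces as `json.JSONDecodeError`.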
Warnings
- gotcha The behavior of `stream_conda_info` differs between `.conda` and `.tar.bz2` formats. For `.tar.bz2`, it yields all members, while for `.conda`, it yields members from the requested inner archive, allowing early termination.
- gotcha When using `conda_reader_for_url` for `.conda` files, the returned file-like object *must* be seekable. For `.tar.bz2` files, it only needs to be readable.
- gotcha This library is optimized for *streaming metadata* and individual components without full package download. Using its URL-based APIs to extract an entire package is inefficient and against its design; prefer pre-downloading the package for full extraction.
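Early termination, as mentioned in the first warning, comes down to leaving the loop as soon as the wanted member has been read. The sketch below assumes only that the iterable yields `(tar, member)` pairs, as `stream_conda_info` does in the Quickstart. `read_first_member` and `demo_pairs` are hypothetical names, and the in-memory archive is a stand-in for a real package.

```python
import io
import tarfile


def read_first_member(pairs, name):
    """Return the bytes of the first member called `name`, or None if absent.

    The member is read inside the loop, while the archive is still positioned
    on it; returning then stops consuming the generator.
    """
    for tar, member in pairs:
        if member.name == name:
            return tar.extractfile(member).read()
    return None


def demo_pairs(fileobj, visited):
    """Stand-in generator (demo only): yields (tar, member) and records each visit."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            visited.append(member.name)
            yield tar, member


def make_archive():
    """Build a small in-memory archive with three members (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in ("info/index.json", "info/about.json", "info/paths.json"):
            payload = b'{"name": "demo"}'
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


visited = []
data = read_first_member(demo_pairs(make_archive(), visited), "info/index.json")
print(data, visited)
```

In this demo only the first member is visited before the helper returns. A name that never matches forces a walk over every member, which is the cost the warnings describe for `.tar.bz2` packages.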
Install
- pip install conda-package-streaming
- conda install conda-package-streaming -c conda-forge
Imports
- stream_conda_info
from conda_package_streaming import stream_conda_info # wrong: the top-level package does not export this function
from conda_package_streaming.url import stream_conda_info # correct, for URLs
- stream_conda_info (for S3)
from conda_package_streaming.s3 import stream_conda_info
- stream_conda_info (for local files)
from conda_package_streaming import package_streaming # then use package_streaming.stream_conda_info(...)
- conda_reader_for_url
from conda_package_streaming.url import conda_reader_for_url
Quickstart
import json
from conda_package_streaming.url import stream_conda_info

# Replace with a valid .conda or .tar.bz2 URL
# For example: url = 'https://repo.anaconda.com/pkgs/main/linux-64/python-3.9.7-h62f7035_1.conda'
url = 'https://repo.anaconda.com/pkgs/main/linux-64/zlib-1.2.13-h5eee18b_0.conda'
print(f"Streaming info from: {url}")

try:
    found_index_json = False
    for tar, member in stream_conda_info(url):
        if member.name == 'info/index.json':
            index_json = json.load(tar.extractfile(member))
            print("\n--- Found info/index.json ---")
            print(json.dumps(index_json, indent=2))
            found_index_json = True
            break  # Stop once index.json is found
    if not found_index_json:
        print("\n--- info/index.json not found in package ---")
except Exception as e:
    print(f"An error occurred: {e}")
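The same loop should work for a package that is already on disk, using the file-based `package_streaming.stream_conda_info` listed under Imports. The sketch below rests on assumptions: it assumes that function accepts a local path as its first argument, and it builds a tiny throwaway `.tar.bz2` with the standard library instead of using a real package. `iter_info_members` is a hypothetical wrapper; its fallback to plain `tarfile` exists only so the sketch runs where the library is not installed.

```python
import contextlib
import io
import json
import os
import tarfile
import tempfile

try:
    from conda_package_streaming import package_streaming
except ImportError:  # library not installed: use the standard library instead
    package_streaming = None


def iter_info_members(path):
    """Hypothetical wrapper: yield (tar, member) pairs from a local .tar.bz2 package."""
    if package_streaming is not None:
        # Assumption: the file-based API takes the package path as its first argument
        yield from package_streaming.stream_conda_info(path)
    else:
        # Fallback for .tar.bz2 only: a streaming bz2 tar reader
        with tarfile.open(path, mode="r|bz2") as tar:
            for member in tar:
                yield tar, member


with tempfile.TemporaryDirectory() as tmp:
    # Build a minimal demo package containing only info/index.json
    path = os.path.join(tmp, "demo-1.0-0.tar.bz2")
    payload = json.dumps({"name": "demo", "version": "1.0"}).encode()
    with tarfile.open(path, mode="w:bz2") as tar:
        info = tarfile.TarInfo("info/index.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    index_json = None
    # closing() makes sure the archive is released before the directory is removed
    with contextlib.closing(iter_info_members(path)) as pairs:
        for tar, member in pairs:
            if member.name == "info/index.json":
                index_json = json.load(tar.extractfile(member))
                break

print(index_json)
```

For full extraction the warnings recommend downloading the package first, and this local-path loop is the kind of code that would then run against the downloaded file.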