Numcodecs

0.16.5 · active · verified Thu Apr 09

Numcodecs is a Python package providing a diverse set of buffer compression and transformation codecs. It is widely used in data storage and communication applications, especially as a dependency for the Zarr N-dimensional array store. The current version is 0.16.5, with frequent patch and minor releases, typically on a monthly or bi-monthly cadence.

Warnings

Install

Imports

Quickstart

This quickstart instantiates a codec (Blosc, with GZip as a fallback), encodes a NumPy array into bytes, and decodes it back. It also shows how codecs are registered in and retrieved from the global registry, a mechanism that frameworks such as Zarr rely on to rebuild codecs from stored configuration.

import numpy as np
import numcodecs

# Define some data to encode
data = np.arange(10000, dtype='i4').reshape(100, 100)

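# Choose a codec. Blosc is compiled into the standard numcodecs wheels,
# so a plain 'pip install numcodecs' normally provides it.
# GZip is kept as a fallback for builds where the Blosc extension is unavailable.
try:
    codec = numcodecs.blosc.Blosc(cname='lz4', clevel=5, shuffle=numcodecs.blosc.SHUFFLE)
    print("Using Blosc codec.")
except (ImportError, AttributeError):
    # A missing extension surfaces when the module is accessed, not when Blosc() is called.
    print("Blosc not available. Falling back to GZip codec.")
    codec = numcodecs.gzip.GZip(level=5)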

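# Encode the data. Codecs accept any object exporting the buffer protocol, so the
# array can be passed directly; unlike data.tobytes(), this lets Blosc see the
# 4-byte item size, which makes byte shuffling effective.
encoded_data = codec.encode(data)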
print(f"Original data shape: {data.shape}, dtype: {data.dtype}")
print(f"Original bytes: {data.nbytes}")
print(f"Encoded bytes: {len(encoded_data)}")

# Decode the data
decoded_bytes = codec.decode(encoded_data)

# Reconstruct the numpy array
decoded_data = np.frombuffer(decoded_bytes, dtype=data.dtype).reshape(data.shape)

# Verify that the decoded data matches the original
assert np.array_equal(data, decoded_data)
print("Data successfully encoded and decoded!")

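# Codecs can also be retrieved from the global registry by configuration.
# Built-in codecs are registered on import; register_codec takes a codec class
# (not an instance) and is mainly needed for custom codecs.
numcodecs.register_codec(type(codec))
retrieved_codec = numcodecs.get_codec(codec.get_config())  # the config already contains 'id'
assert retrieved_codec.codec_id == codec.codec_id
print(f"Codec '{retrieved_codec.codec_id}' registered and retrieved successfully.")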
