Apache Avro (Python 3)

1.10.2 · active · verified Thu Apr 09

Apache Avro (`avro-python3`) is the official Python 3 implementation of Apache Avro, a data serialization and remote procedure call (RPC) framework. It lets you define language-agnostic schemas in JSON and serialize data into a compact binary format for cross-language data exchange. The current stable version on PyPI is 1.10.2; releases follow the schedule of the broader Apache Avro project.

Warnings

Install

Imports

Quickstart

This example defines an Avro schema, writes Python dictionaries to an Avro container file held in an in-memory `BytesIO` stream (standing in for a file on disk), and reads them back as Python dictionaries.

import avro.schema
import avro.io
import avro.datafile
import io

# 1. Define the Avro schema
schema_str = """
{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]}
    ]
}
"""
schema = avro.schema.parse(schema_str)

# 2. Write data to a BytesIO object (simulating a file)
writer = avro.io.DatumWriter(schema)
bytes_writer = io.BytesIO()
data_file_writer = avro.datafile.DataFileWriter(bytes_writer, writer, schema)

data_file_writer.append({"name": "Alyssa", "favorite_number": 256, "favorite_color": None})
data_file_writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
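# Flush buffered records and copy the bytes out BEFORE closing:
# close() also closes the underlying BytesIO, after which getvalue() raises ValueError.
data_file_writer.flush()
avro_data = bytes_writer.getvalue()
data_file_writer.close()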

# 3. Read data from the BytesIO object
bytes_reader = io.BytesIO(avro_data)
reader = avro.io.DatumReader(schema)
data_file_reader = avro.datafile.DataFileReader(bytes_reader, reader)

read_records = [record for record in data_file_reader]
data_file_reader.close()

# Print the read records
print(read_records)
# Expected output:
# [{'name': 'Alyssa', 'favorite_number': 256, 'favorite_color': None}, {'name': 'Ben', 'favorite_number': 7, 'favorite_color': 'red'}]
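
To give a feel for why the binary format is compact, the following standard-library-only sketch hand-encodes a single `User` datum the way the Avro specification describes: ints and longs are zigzag-encoded and written as variable-length integers, strings are a length followed by UTF-8 bytes, and a union is the branch index followed by the value. The helper names (`encode_long`, `encode_string`, `encode_user`) are hypothetical and not part of the `avro` package, and the sketch covers only the bare datum, not the container file written by `DataFileWriter` (which also carries a header with the schema and sync markers).

```python
def encode_long(n: int) -> bytes:
    """Encode a 64-bit-range integer as an Avro int/long (zigzag, then varint)."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small negative numbers become small unsigned ones
    out = bytearray()
    while z & ~0x7F:                    # more than 7 bits left
        out.append((z & 0x7F) | 0x80)   # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_user(user: dict) -> bytes:
    """Hand-rolled encoder for the quickstart's User schema (illustration only)."""
    out = encode_string(user["name"])
    # Union ["int", "null"]: branch index (0 = int, 1 = null), then the value.
    number = user["favorite_number"]
    out += encode_long(1) if number is None else encode_long(0) + encode_long(number)
    # Union ["string", "null"]: branch index (0 = string, 1 = null), then the value.
    color = user["favorite_color"]
    out += encode_long(1) if color is None else encode_long(0) + encode_string(color)
    return out


alyssa = {"name": "Alyssa", "favorite_number": 256, "favorite_color": None}
encoded = encode_user(alyssa)
print(encoded.hex(), len(encoded))
```

Field names never appear in the output: the reader relies on the schema to know which field comes next, so the first record takes 11 bytes. In practice the library's `DatumWriter` does this work; the sketch is only meant to show what ends up on the wire.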
