Avro-gen
Avro-gen is a Python library that generates Python classes from Avro schemas (.avsc files). It enables static type checking and simplifies the manipulation of Avro records within Python applications. The library integrates with `avro-python3` for reading and writing specific records. Its current version is 0.3.0, and releases appear to be infrequent, typically coinciding with feature additions or dependency updates.
Warnings
- gotcha The generated Python package structure mirrors the Avro schema's `namespace`. If your schema has `"namespace": "com.example.app"`, the generated classes will reside in `output_dir/com/example/app/`, so with `output_dir` on `sys.path` you import them with `from com.example.app import MyClass`.
- gotcha Any changes to your Avro schema files (e.g., adding a new field, changing a type) require you to re-run the `avro-gen` command line tool to regenerate the Python classes. The library does not automatically detect or recompile schemas.
- gotcha `avro-gen` relies on `avro-python3` for core Avro functionality. While `avro-gen` pins a compatible version range, using an incompatible or outdated `avro-python3` version in your environment can lead to unexpected errors during class generation or at runtime when handling specific records.
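The namespace-to-directory mapping described in the first gotcha can be sketched as follows. This is only an illustration of the layout, assuming each dot in the namespace becomes one directory level; the helper names `namespace_to_package_dir` and `import_statement` are hypothetical and are not part of `avro-gen`.

```python
import json
from pathlib import Path


def namespace_to_package_dir(schema: dict, output_dir: str) -> Path:
    """Return the directory where classes for `schema` are expected to land.

    Assumption: each dot-separated namespace component becomes one
    directory level under output_dir.
    """
    namespace = schema.get("namespace", "")
    parts = [part for part in namespace.split(".") if part]
    return Path(output_dir).joinpath(*parts)


def import_statement(schema: dict) -> str:
    """Build the import line for a namespaced record schema."""
    namespace = schema.get("namespace")
    if not namespace:
        # Layout for schemas without a namespace is not covered by this sketch.
        raise ValueError("schema has no namespace")
    return f"from {namespace} import {schema['name']}"


schema = json.loads(
    '{"type": "record", "name": "MyClass", '
    '"namespace": "com.example.app", "fields": []}'
)
print(namespace_to_package_dir(schema, "output_dir").as_posix())
print(import_statement(schema))
```

Running a check like this before generation makes it easier to see which directory has to be on `sys.path` and which import line to expect.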
Install
pip install avro-gen
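After installing, it can help to confirm which versions of `avro-gen` and `avro-python3` ended up in the environment, since a mismatched `avro-python3` is one of the failure modes listed under Warnings. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution names are assumed to match the names used on this page.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names assumed from this page: "avro-gen" and "avro-python3".
for dist in ("avro-gen", "avro-python3"):
    version = installed_version(dist)
    print(f"{dist}: {version if version is not None else 'not installed'}")
```

This only reports what is installed; it does not check the versions against the range that `avro-gen` pins.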
Imports
- avro_gen
import avro_gen
Quickstart
import os
import shutil
import subprocess
import sys

# Create a dummy Avro schema file
schema_content = '''
{
  "type": "record",
  "name": "User",
  "namespace": "example.avro",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "favorite_number", "type": ["int", "null"]},
    {"name": "favorite_color", "type": ["string", "null"]}
  ]
}
'''
schema_file = 'user.avsc'
output_dir = 'generated_avro_classes'

with open(schema_file, 'w') as f:
    f.write(schema_content)

# Generate Python classes from the schema
# For robust execution in CI/CD, consider passing paths explicitly
try:
    print(f"Generating classes for {schema_file} into {output_dir}")
    # Use the current interpreter, and capture output so that e.stderr
    # is populated if the command fails
    subprocess.run(
        [sys.executable, '-m', 'avro_gen.avro_gen', '-s', schema_file, '-o', output_dir],
        check=True,
        capture_output=True,
        text=True,
    )
    print("Generation successful.")

    # Import and use the generated class
    # The generated module structure follows the Avro namespace
    sys.path.insert(0, os.path.abspath(output_dir))
    from example.avro import User  # Assuming output_dir/example/avro/__init__.py was created

    user_instance = User(name="Alice", favorite_number=7, favorite_color="blue")
    print(f"Created user: {user_instance.name}, favorite number: {user_instance.favorite_number}, color: {user_instance.favorite_color}")

    user_instance_2 = User(name="Bob", favorite_number=None)  # Optional fields can be None
    print(f"Created user 2: {user_instance_2.name}, favorite number: {user_instance_2.favorite_number}")

    # Demonstrate type checking (if using a linter/IDE)
    # user_instance.favorite_number = "not an int"  # Would be a type error
except subprocess.CalledProcessError as e:
    print(f"Error generating Avro classes: {e}")
    print(f"Stderr: {e.stderr}")
except ImportError as e:
    print(f"Error importing generated class. Did generation succeed, and was sys.path updated correctly? Error: {e}")
finally:
    # Clean up generated files
    if os.path.exists(schema_file):
        os.remove(schema_file)
    if os.path.exists(output_dir):
        shutil.rmtree(output_dir)
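
Because the classes are not regenerated automatically when a schema changes, a build script may want to decide when to re-run the generation step. The following is a minimal sketch of one possible approach, comparing file modification times; it is an assumption about workflow, not something `avro-gen` provides, and the helper name `needs_regeneration` is hypothetical.

```python
from pathlib import Path
from typing import Iterable


def needs_regeneration(schema_files: Iterable[str], output_dir: str) -> bool:
    """Return True if the generated code looks missing or older than a schema.

    Heuristic: regenerate when output_dir holds no .py files, or when the
    newest schema was modified after the oldest generated .py file.
    """
    schemas = [Path(s) for s in schema_files]
    if not schemas:
        return False  # nothing to generate from

    out = Path(output_dir)
    if not out.is_dir():
        return True  # nothing has been generated yet

    generated = list(out.rglob("*.py"))
    if not generated:
        return True

    newest_schema = max(s.stat().st_mtime for s in schemas)
    oldest_generated = min(g.stat().st_mtime for g in generated)
    return newest_schema > oldest_generated
```

A build step could call `needs_regeneration(["user.avsc"], "generated_avro_classes")` and run the generation command only when it returns `True`. Modification times can be unreliable after a fresh checkout, so comparing content hashes is a reasonable alternative.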