{"id":2395,"library":"avro-gen3","title":"Avro Record Class and Specific Record Reader Generator","description":"avro-gen3 is a Python library that generates concrete Avro record classes with type hints and a specific record reader. It addresses the typeless nature of default Avro Python implementations by wrapping the standard Avro DatumReader to return these type-hinted classes instead of generic dictionaries. This project is a fork of `avro_gen`, enhanced with improved Python 3 support, better namespace handling, documentation generation, and JSON (de-)serialization capabilities. The current version is 0.7.16, released on September 5, 2024, indicating an active but irregular release cycle.","status":"active","version":"0.7.16","language":"en","source_language":"en","source_url":"https://github.com/acryldata/avro_gen","tags":["avro","schema generation","data serialization","type hints","code generation"],"install":[{"cmd":"pip install avro-gen3","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Provides the underlying Avro serialization/deserialization framework (DatumReader, etc.) 
which avro-gen3 builds upon.","package":"apache-avro","optional":false}],"imports":[{"note":"Used to generate Python classes from Avro schemas.","symbol":"write_schema_files","correct":"from avrogen import write_schema_files"},{"note":"Generated record classes are created in a user-specified output directory, organized by Avro namespace, and must be imported from that dynamic path, not directly from 'avrogen'.","wrong":"from avrogen import <RecordName>","symbol":"GeneratedRecordClass","correct":"from <output_directory>.<avro_namespace_path> import <RecordName>"},{"note":"The generated SpecificDatumReader is created in the root of the output directory for specific record deserialization.","symbol":"SpecificDatumReader","correct":"from <output_directory> import SpecificDatumReader"}],"quickstart":{"code":"import sys\nimport tempfile\nfrom pathlib import Path\nfrom avrogen import write_schema_files\n\n# 1. Define a simple Avro schema (\"null\" comes first in the union so the null default is valid)\navro_schema_json = '''\n{\n  \"type\": \"record\",\n  \"name\": \"User\",\n  \"namespace\": \"com.example.app\",\n  \"fields\": [\n    {\"name\": \"name\", \"type\": \"string\"},\n    {\"name\": \"favorite_number\", \"type\": [\"null\", \"int\"], \"default\": null}\n  ]\n}\n'''\n\n# 2. Define an output directory for generated classes\nwith tempfile.TemporaryDirectory() as tmpdir_name:\n    output_dir = Path(tmpdir_name)\n    print(f\"Generated Avro classes will be written to: {output_dir}\")\n\n    # 3. Generate Python classes from the Avro schema\n    write_schema_files(avro_schema_json, str(output_dir))\n\n    # Add the output directory to sys.path to enable import\n    sys.path.insert(0, str(output_dir))\n\n    try:\n        # 4. 
Import the generated class and the standard Avro I/O types\n        # The Avro namespace 'com.example.app' translates to a path within the output_dir\n        from com.example.app import User  # Access the generated User class\n        from avro.io import DatumWriter, DatumReader\n        from avro.datafile import DataFileWriter, DataFileReader\n\n        # 5. Create an instance of the generated class\n        user_record = User(name=\"Alice\", favorite_number=123)\n        print(f\"Created user record: {user_record}\")\n        print(f\"User name: {user_record.name}, Favorite number: {user_record.favorite_number}\")\n\n        # 6. Serialize and deserialize with the standard Avro file writer and reader.\n        # avro-gen3 also generates a SpecificDatumReader; this example uses the plain\n        # DatumReader, which returns dicts, and re-wraps them in the generated class.\n\n        # Generated records are DictWrapper instances; _inner_dict holds the plain dict that DatumWriter expects\n        output_file = output_dir / \"users.avro\"\n        writer = DataFileWriter(open(output_file, \"wb\"), DatumWriter(), user_record.RECORD_SCHEMA)\n        writer.append(user_record._inner_dict)  # write the wrapped dict\n        writer.close()\n\n        reader = DataFileReader(open(output_file, \"rb\"), DatumReader())\n        for read_user_dict in reader:\n            # The plain DatumReader returns dicts; re-wrap them for typed access.\n            read_user = User(**read_user_dict)\n            print(f\"Deserialized user: {read_user.name}, {read_user.favorite_number}\")\n        reader.close()\n\n    finally:\n        # Clean up sys.path\n        sys.path.remove(str(output_dir))\n","lang":"python","description":"This quickstart demonstrates how to define an Avro schema, use `avro-gen3` to generate Python classes for it, and then serialize/deserialize data using these generated classes. 
It highlights importing generated classes by the schema's namespace and passing the schema attached to the generated class to the standard Avro I/O tools."},"warnings":[{"fix":"Writing a record as a plain dictionary or through the standard `DatumWriter` enforces the schema only to the extent the underlying `apache-avro` library does. The primary benefit of `avro-gen3` is static type checking and IDE support via the generated classes.","message":"avro-gen3 generates specific record classes as `DictWrapper` subclasses and does NOT provide an overloaded `DictWriter`. Generated records therefore offer type-hinted access but are serialized like regular Python dictionaries by the standard Avro `DatumWriter`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Update `avro-gen3` to a version compatible with your `apache-avro` library (e.g., `avro-gen3==0.7.16` or newer). If necessary, pin `apache-avro` to whichever side of 1.10 your `avro-gen3` version supports. Regenerate classes if the error persists after updating.","message":"A breaking change in `apache-avro` 1.10 and later moved `AvroTypeException` out of `avro.io` into a different module, which can cause `AttributeError: module 'avro.io' has no attribute 'AvroTypeException'` if `avro-gen3` generated code (or its dependencies) expects the old location. This often manifests when custom properties contain non-string values.","severity":"breaking","affected_versions":"avro-gen3 < 0.7.16 with apache-avro >= 1.10"},{"fix":"Always declare optional fields like `\"fields\": [{\"name\": \"optional_field\", \"type\": [\"null\", \"string\"], \"default\": null}]`.","message":"An optional field in an Avro schema must have a union `type` with `\"null\"` as the *first* member, and its `default` must be the literal `null` (not the string `\"null\"`). 
Incorrectly declared optional fields can raise exceptions on the consumer side even when the schema appears valid for encoding.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If your Avro schema has `\"namespace\": \"com.example.app\"` and you generate into `my_generated_code/`, import with `from my_generated_code.com.example.app import MyRecord` (this assumes the directory containing `my_generated_code/` is on `sys.path`).","message":"Generated Avro classes are organized into submodules that mirror their Avro namespaces within the output directory. The Python import path is therefore built from the chosen output directory followed by the Avro namespace.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}