AWS Glue Schema Registry Python Client

version: 1.1.3 | verified: Mon Apr 27 | auth: no | language: python

A Python library for integrating with AWS Glue Schema Registry, with support for Avro and JSON schemas. The current version is 1.1.3, which requires Python >=3.8; the project is under active development.

pip install aws-glue-schema-registry
error botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the GetSchema operation
cause The IAM principal making the call lacks the required Glue Schema Registry permissions.
fix Attach the AWSGlueSchemaRegistryFullAccess managed policy, or grant glue:GetSchema, glue:GetSchemaVersion, and glue:RegisterSchemaVersion.
error ModuleNotFoundError: No module named 'aws_glue_schema_registry'
cause The library is not installed in the Python environment that is running the code.
fix Run pip install aws-glue-schema-registry in that same environment.
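When the ModuleNotFoundError persists after a successful install, the package has often been installed for a different interpreter than the one running the code. The following is a minimal diagnostic sketch using only the standard library. The helper name describe_install is hypothetical, and the import name is taken from the error message above.

```python
import importlib.metadata
import importlib.util
import sys


def describe_install(import_name, dist_name):
    """Report whether a top-level module is importable and which version is installed."""
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(import_name)
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,  # the Python binary actually in use
        "importable": spec is not None,
        "version": version,
    }


print(describe_install("aws_glue_schema_registry", "aws-glue-schema-registry"))
```

If "importable" is False, compare the reported interpreter path with the one pip installs into (for example by running python -m pip install with that same interpreter).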
gotcha The library calls boto3's Glue client directly, so the IAM role or policy in use must allow glue:GetSchema, glue:GetSchemaVersion, and glue:RegisterSchemaVersion.
fix Attach the AWSGlueSchemaRegistryFullAccess managed policy or grant those specific permissions.
breaking Version 1.0 changed the import path from aws_glue_schema_registry.serde to aws_glue_schema_registry.serde.avro and aws_glue_schema_registry.serde.json.
fix Update imports to use the new submodules.
deprecated The 'serialize' method with positional arguments is deprecated in favor of keyword arguments.
fix Use serializer.serialize(data=value) instead of serializer.serialize(value).
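For the permissions gotcha above, a narrowly scoped policy is one alternative to the full-access managed policy. The sketch below builds such a policy document as plain JSON. The helper build_registry_policy, the region, and the account ID are hypothetical placeholders, and the action list is an assumption that should be adjusted to the Glue operations your code actually calls.

```python
import json


def build_registry_policy(region, account_id, registry_name, schema_name):
    """Build an IAM policy document scoped to one registry and one schema."""
    prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action list; extend it if other Glue calls are denied.
                "Action": [
                    "glue:GetSchema",
                    "glue:GetSchemaVersion",
                    "glue:RegisterSchemaVersion",
                ],
                "Resource": [
                    f"{prefix}:registry/{registry_name}",
                    f"{prefix}:schema/{registry_name}/{schema_name}",
                ],
            }
        ],
    }


# Placeholder region and account ID; substitute your own values.
policy = build_registry_policy("us-east-1", "123456789012", "my-registry", "my-schema")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console or attached with your usual infrastructure tooling.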

Initialize a serializer for an Avro schema and serialize a record.

import boto3
from aws_glue_schema_registry.serde.avro import AvroSerializer

# Create a Glue client; credentials and region come from the default boto3 session.
session = boto3.Session()
glue_client = session.client('glue')
registry_name = 'my-registry'
schema_name = 'my-schema'

# Schema definition (Avro example)
avro_schema = {
    'type': 'record',
    'name': 'User',
    'fields': [{'name': 'name', 'type': 'string'}]
}

serializer = AvroSerializer(glue_client, registry_name, schema_name, avro_schema)
data = {'name': 'John'}
# Pass the record by keyword; the positional form is deprecated.
serialized_bytes = serializer.serialize(data=data)
print(serialized_bytes)
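A record that omits a field the Avro schema requires will typically be rejected at serialization time. The sketch below shows one way to catch that earlier, before any call reaches the registry. The helper missing_fields is hypothetical and not part of the library; it inspects only top-level field names and does no type checking.

```python
def missing_fields(schema, record):
    """Return names of schema fields that have no default and are absent from the record."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        if "default" not in field and field["name"] not in record
    ]


avro_schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}

print(missing_fields(avro_schema, {"name": "John"}))   # []
print(missing_fields(avro_schema, {"nickname": "J"}))  # ['name']
```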