TensorFlow Metadata

1.17.3 · active · verified Fri Apr 10

TensorFlow Metadata (TFMD) provides standard representations for metadata that is useful when training machine learning models with TensorFlow. These include formats for describing tabular data schemas (e.g., for `tf.Example` records), collections of summary statistics over datasets, and problem statements. It is a foundational library used by other TensorFlow Extended (TFX) components such as TensorFlow Data Validation (TFDV) and ML Metadata (MLMD). The library is actively maintained; the current release is version 1.17.3.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to define a simple data schema using the protobuf definitions in `tensorflow-metadata`. It shows how to add features with different types and constraints, and then how to serialize and deserialize the schema for storage or transfer.

from tensorflow_metadata.proto.v0 import schema_pb2

# Create a simple schema definition
schema = schema_pb2.Schema()

# Add a feature named 'age' of type INT
feature_age = schema.feature.add()
feature_age.name = "age"
feature_age.type = schema_pb2.FeatureType.INT
feature_age.int_domain.is_categorical = False
feature_age.presence.min_fraction = 1.0 # 'age' must always be present
feature_age.int_domain.min = 0
feature_age.int_domain.max = 120

# Add a feature named 'city' of type BYTES (string), which is categorical
feature_city = schema.feature.add()
feature_city.name = "city"
feature_city.type = schema_pb2.FeatureType.BYTES
feature_city.string_domain.is_categorical = True
feature_city.string_domain.value.extend(["New York", "London", "Tokyo"])

print("Generated Schema (protobuf format):")
print(schema)

# Serialize the schema to bytes
serialized_schema = schema.SerializeToString()
print(f"\nSerialized Schema (bytes): {len(serialized_schema)} bytes")

# Deserialize the schema back from bytes
deserialized_schema = schema_pb2.Schema()
deserialized_schema.ParseFromString(serialized_schema)
print("\nDeserialized Schema:")
print(deserialized_schema)
