Protocol Buffers (protobuf) for Python
Google's language-neutral, platform-neutral mechanism for serializing structured data. You define message schemas in .proto files, compile them with protoc into _pb2.py modules, and use the runtime library (google.protobuf.*) to serialize, deserialize, and manipulate messages. Currently at version 7.34.1 (Python major version bumped from 6.x to 7.x in the 7.34.0 release). Releases follow a quarterly cadence; breaking major-version bumps are targeted at Q1 of each year.
Common errors
-
protoc: command not found
cause: The Protocol Buffers compiler (protoc) is not installed, or its executable is not on your PATH.
fix: Install protoc via a package manager (e.g., 'brew install protobuf' on macOS, 'sudo apt-get install protobuf-compiler' on Linux), or download the pre-compiled binary from GitHub releases and add its 'bin' directory to your PATH.
-
ModuleNotFoundError: No module named 'google.protobuf'
cause: The protobuf runtime library is not installed in your active Python environment.
fix: Install it with 'pip install protobuf', or with 'conda install protobuf' if you are using an Anaconda environment.
-
AttributeError: module 'google.protobuf.descriptor' has no attribute '_internal_create_key'
cause: This error typically indicates a version incompatibility, either between your installed protobuf package and a dependent library (such as TensorFlow) or between generated code and the runtime library.
fix: Upgrade or downgrade the protobuf package to a version compatible with your dependencies (e.g., 'pip install protobuf==3.20.0' for older TensorFlow versions), and regenerate the _pb2.py files from your .proto sources if they were compiled with an incompatible protoc version.
-
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. grpcio-status X requires protobuf<7.0.0,>=6.31.1, but you have protobuf 7.34.0 which is incompatible.
cause: Other installed packages (e.g., 'grpcio-status' or 'googleapis-common-protos') declare a dependency on an older major version of protobuf (e.g., '<7.0.0'), which conflicts with a newly installed protobuf 7.x.
fix: Downgrade protobuf to a compatible version (e.g., 'pip install "protobuf<7"'; the quotes stop the shell from treating '<' as a redirect), or isolate the conflicting dependencies in a virtual environment until the dependent packages update their version requirements.
-
ModuleNotFoundError: No module named 'your_message_pb2'
cause: When .proto files are organized in nested directories, protoc generates absolute Python imports rather than relative ones, so Python cannot locate the generated _pb2.py modules inside the package structure.
fix: Put the directory containing the generated _pb2.py files on your Python path, or rewrite the imports in the generated files to be relative (e.g., 'from . import your_message_pb2'), or use a tool like 'fix-protobuf-imports'.
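As a minimal sketch of the first fix for the last error above, the directory holding the generated modules can be put on the import path before the first import. The `generated` directory name and the `add_gencode_dir` helper are hypothetical, chosen only for illustration.

```python
import sys
from pathlib import Path


def add_gencode_dir(path):
    """Put the directory containing generated _pb2.py files on sys.path.

    Hypothetical helper: generated modules import each other with absolute
    imports (e.g. `import your_message_pb2`), so that directory must be
    importable as a top-level location.
    """
    resolved = str(Path(path).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumed layout: generated modules live in ./generated
gen_dir = add_gencode_dir("generated")
# `import your_message_pb2` would now resolve, provided
# ./generated/your_message_pb2.py exists.
```

Setting PYTHONPATH achieves the same thing without touching code; relative imports remain the sturdier choice when the generated modules ship inside a package.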
Warnings
- breaking Python major version bumped to 7 with the 7.34.0 release (previous line was 6.x). Boolean values are now rejected when setting enum or int fields—the API raises a TypeError instead of implicitly converting them. The deprecated float_precision option in json_format and float_format/double_format in text_format were also removed.
- breaking Gencode/runtime version mismatch raises google.protobuf.runtime_version.VersionError at import time. Generated _pb2.py files embed a minimum runtime version; loading them against an older installed protobuf package fails hard. This is especially common when grpcio-tools generates code with a newer bundled protoc than the protobuf runtime you have installed.
- breaking Release 4.21.0 of the Python package (2022) switched the C extension to the upb library. Sharing message objects between Python and C++ (e.g. via SWIG or pybind11) stopped working by default. Libraries that relied on this, such as older TensorFlow, crash with AttributeError on import.
- breaking message.UnknownFields() was deprecated in the 4.25 line and removed as of 5.26. Calling it raises AttributeError; use unknown_fields.UnknownFieldSet(message) instead.
- gotcha Accessing an undefined key in a proto map field creates that key with a zero/false/empty value (defaultdict-like behaviour). This silently mutates the message during read-only access, which can cause unexpected serialization differences and test failures.
- gotcha The Python package name declared in a .proto file does NOT affect generated Python module names or import paths. Python packages are determined purely by directory structure relative to the --proto_path flag. Hyphens in filenames are silently converted to underscores (foo-bar.proto → foo_bar_pb2.py).
- gotcha Do not subclass generated message classes. They use a metaclass and internal descriptor machinery that makes subclassing produce subtle bugs ('fragile base class' problems). The official docs explicitly warn against it.
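The map-field gotcha above can be reproduced without a custom .proto. This is a small sketch that assumes the built-in Struct well-known type, whose `fields` member is a map from string to Value. It contrasts read-only checks (`in`, `.get()`) with subscript access.

```python
from google.protobuf.struct_pb2 import Struct

s = Struct()  # Struct.fields is a map<string, Value>
before = s.SerializeToString()

# Read-only checks: neither of these inserts a key.
has_key = "missing" in s.fields
value = s.fields.get("missing")
size_after_safe_reads = len(s.fields)

# Subscript access on an absent key inserts it with an empty Value.
_ = s.fields["missing"]
size_after_subscript = len(s.fields)
after = s.SerializeToString()

print(has_key, value, size_after_safe_reads, size_after_subscript, before == after)
```

In code that only inspects a message, `in` or `.get()` leaves the serialized form unchanged.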
Install
-
pip install protobuf
-
pip install grpcio-tools
-
# Install the standalone protoc compiler (platform-specific)
# macOS: brew install protobuf
# Ubuntu: apt-get install -y protobuf-compiler
# Or download from https://github.com/protocolbuffers/protobuf/releases
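After installing, a quick check along these lines can confirm that both the runtime and the compiler are visible. It is a sketch using only the standard library; the `check_protobuf_toolchain` helper name is made up for this example.

```python
import shutil
from importlib import metadata


def check_protobuf_toolchain():
    """Report the installed runtime version and the protoc location, if any."""
    try:
        runtime_version = metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        # The runtime is missing; importing google.protobuf would fail.
        runtime_version = None
    return {
        "runtime_version": runtime_version,
        # None here corresponds to "protoc: command not found".
        "protoc_path": shutil.which("protoc"),
    }


report = check_protobuf_toolchain()
print(report)
```

If only grpcio-tools is installed, there is no standalone protoc on the PATH; its bundled compiler is invoked as `python -m grpc_tools.protoc` instead.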
Imports
- MessageToJson / ParseDict
import google.protobuf.json_format as jf; jf.MessageToJson(msg)
from google.protobuf import json_format
json_format.MessageToJson(msg)
json_format.ParseDict(d, MyMessage())
- MessageToString / Parse (text format)
from google.protobuf import text_format
text_format.MessageToString(msg)
text_format.Parse(text, MyMessage())
- descriptor_pool / DescriptorPool
from google.protobuf import descriptor_pool
pool = descriptor_pool.Default()
- Generated message class (user proto)
# Wrong: protoc names the generated module my_message_pb2, not my_message
import my_message; msg = my_message.MyMessage()
# Correct, after running: protoc --python_out=. my_message.proto
from my_message_pb2 import MyMessage
msg = MyMessage(field1='hello', field2=42)
- Well-known types (Timestamp, Duration, Any, Struct …)
# Wrong: the importable package is google.protobuf, not protobuf
from protobuf.timestamp_pb2 import Timestamp
# Correct
from google.protobuf.timestamp_pb2 import Timestamp
from google.protobuf.any_pb2 import Any
- UnknownFieldSet
# Wrong: the UnknownFields() method has been removed from messages
msg.UnknownFields()
# Correct
from google.protobuf import unknown_fields
unk = unknown_fields.UnknownFieldSet(msg)
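The imports listed above can be exercised together using only well-known types, so no protoc step is needed. This is an illustrative sketch, not a canonical recipe. The choice of Duration, Empty, and Struct is an assumption made so that the example is self-contained.

```python
from google.protobuf import json_format, text_format, unknown_fields
from google.protobuf.any_pb2 import Any
from google.protobuf.duration_pb2 import Duration
from google.protobuf.empty_pb2 import Empty
from google.protobuf.struct_pb2 import Struct

d = Duration(seconds=90)

# Text format round trip.
text = text_format.MessageToString(d)
d_from_text = text_format.Parse(text, Duration())

# Any: pack a message, then unpack it into a fresh instance.
wrapped = Any()
wrapped.Pack(d)
d_from_any = Duration()
unpacked_ok = wrapped.Unpack(d_from_any)

# Dict/JSON conversion; Struct stores every number as a double.
s = json_format.ParseDict({"name": "Alice", "id": 42}, Struct())
as_dict = json_format.MessageToDict(s)

# Unknown fields: Empty declares no fields, so Duration's field 1
# is preserved as an unknown field when parsed into it.
e = Empty()
e.ParseFromString(d.SerializeToString())
unknown = unknown_fields.UnknownFieldSet(e)
unknown_numbers = [unknown[i].field_number for i in range(len(unknown))]
```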
Quickstart
# pip install protobuf
# No custom .proto needed for this example — uses the built-in Timestamp well-known type.
from google.protobuf.timestamp_pb2 import Timestamp
from google.protobuf import json_format
# --- Create and populate a message ---
ts = Timestamp()
ts.GetCurrentTime() # sets seconds + nanos to now
# --- Binary serialization round-trip ---
binary = ts.SerializeToString()
ts2 = Timestamp()
ts2.ParseFromString(binary) # returns number of bytes consumed
assert ts == ts2, "Round-trip failed"
# --- JSON serialization ---
json_str = json_format.MessageToJson(ts)
print("JSON:", json_str)
ts3 = json_format.Parse(json_str, Timestamp())
assert ts == ts3, "JSON round-trip failed"
print("All assertions passed.")
# --- Typical workflow with a custom proto ---
# 1. Write my_message.proto:
# syntax = "proto3";
# message Person { string name = 1; int32 id = 2; }
# 2. Compile:
# protoc --python_out=. my_message.proto
# 3. Use generated code:
# from my_message_pb2 import Person
# p = Person(name='Alice', id=42)
# data = p.SerializeToString()
# p2 = Person()
# p2.ParseFromString(data)
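To try the Person workflow above without running protoc, the same schema can be described at runtime and turned into a message class. This is a sketch of an alternative route, not a replacement for generated code. It assumes a private DescriptorPool and the `message_factory.GetMessageClass` helper from recent runtime versions.

```python
from google.protobuf import descriptor_pb2, descriptor_pool, message_factory

FieldProto = descriptor_pb2.FieldDescriptorProto

# Describe my_message.proto in code instead of compiling it with protoc.
file_proto = descriptor_pb2.FileDescriptorProto(
    name="my_message.proto", syntax="proto3"
)
person = file_proto.message_type.add(name="Person")
person.field.add(
    name="name", number=1,
    type=FieldProto.TYPE_STRING, label=FieldProto.LABEL_OPTIONAL,
)
person.field.add(
    name="id", number=2,
    type=FieldProto.TYPE_INT32, label=FieldProto.LABEL_OPTIONAL,
)

# A private pool keeps this definition out of the process-wide default pool.
pool = descriptor_pool.DescriptorPool()
pool.AddSerializedFile(file_proto.SerializeToString())
Person = message_factory.GetMessageClass(pool.FindMessageTypeByName("Person"))

# Same round trip as in the commented workflow.
p = Person(name="Alice", id=42)
data = p.SerializeToString()
p2 = Person()
p2.ParseFromString(data)
```

Generated _pb2.py modules remain the usual choice for application code; the dynamic route is mainly useful for experiments and for tools that receive schemas at runtime.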