PyArrow Hotfix
`pyarrow-hotfix` is a pure Python package that mitigates the PyArrow security vulnerability CVE-2023-47248, which affects PyArrow versions 0.14.0 through 14.0.0. It disables the vulnerable deserialization feature, giving users who cannot immediately upgrade to PyArrow 14.0.1 or later a temporary fix. New releases are published only as needed for security patches.
Warnings
- breaking `pyarrow-hotfix` explicitly disables the `pyarrow.PyExtensionType` feature. If your existing workloads rely on `pyarrow.PyExtensionType` to process Parquet or other Arrow files, importing the hotfix will make them fail with a `RuntimeError` whose message reports forbidden deserialization of `arrow.py_extension_type`.
- gotcha While `pyarrow-hotfix` addresses the CVE-2023-47248 vulnerability, it is a temporary measure. The Apache Arrow community strongly recommends upgrading to PyArrow 14.0.1 or later as the definitive solution.
- gotcha `pip` accepts both `pyarrow-hotfix` and `pyarrow_hotfix` as the package name, and both resolve to the same package on PyPI. The module name in Python code is always `pyarrow_hotfix`. Using `pyarrow-hotfix` with `pip` and `import pyarrow_hotfix` in code keeps the two consistent.
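To decide whether an environment still needs the hotfix, the installed PyArrow version can be compared against the affected range given above (0.14.0 through 14.0.0). The following is a minimal sketch, not part of `pyarrow-hotfix`; the names `parse_release` and `is_affected` are hypothetical helpers introduced here for illustration, and pre-release suffixes such as `.dev0` are simply ignored.

```python
import re

# Version bounds taken from the description above.
FIRST_AFFECTED = (0, 14, 0)
FIRST_FIXED = (14, 0, 1)


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string.

    Hypothetical helper: anything after the numeric release segment
    (for example ".dev0" or "rc1") is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_affected(version: str) -> bool:
    """Return True if the version lies in the range 0.14.0 through 14.0.0."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED
```

In practice the argument would typically be `pyarrow.__version__`. A result of `True` suggests either upgrading PyArrow or importing the hotfix.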
Install
pip install pyarrow-hotfix
Imports
- pyarrow_hotfix
import pyarrow_hotfix
Quickstart
import pyarrow_hotfix
import pyarrow as pa
# The hotfix is applied simply by importing the module.
# Any subsequent PyArrow operations will have the vulnerable feature disabled.
# Example: reading a file that contains a PyExtensionType column now raises
# a RuntimeError instead of deserializing the embedded payload.
# try:
#     pa.ipc.open_file('malicious_data.arrow')
# except RuntimeError as e:
#     print(f"Caught expected error: {e}")
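Because the hotfix takes effect on import, code that treats it as an optional dependency may want to import it only when it is installed and report whether that happened. The sketch below is one possible pattern under that assumption; `apply_hotfix_if_available` and its `module_name` parameter are hypothetical names introduced here, not part of `pyarrow-hotfix`.

```python
import importlib
import importlib.util


def apply_hotfix_if_available(module_name: str = "pyarrow_hotfix") -> bool:
    """Import the hotfix module if it is installed.

    Returns True when the module was found and imported, False otherwise.
    The module_name parameter exists only to make the helper easy to test.
    """
    if importlib.util.find_spec(module_name) is None:
        # The package is not installed, so nothing was applied.
        return False
    # Importing the module is what applies the hotfix.
    importlib.import_module(module_name)
    return True
```

A caller running an affected PyArrow version would probably want to treat a `False` result as an error or log a warning, rather than continue silently without the protection.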