Pantab: Pandas DataFrames to Tableau Hyper Extracts
Pantab is a Python library that converts between pandas DataFrames and Tableau Hyper Extracts (.hyper files). It provides a high-performance way to get data into and out of Tableau's Hyper engine, which is used for data storage and querying within Tableau products. The current version is 5.3.0. Minor versions are released every few months and often include bug fixes and Python version support updates.
Common errors
- RuntimeError: hyper_error_code = 701: A table with name 'MyTable' already exists.
  cause A write tried to create a table whose name already exists in a non-empty Hyper file.
  fix Pass `table_mode='a'` to append to the existing table, or `table_mode='w'` (the default for `frame_to_hyper`) to overwrite the *entire file* if you intend to replace the whole content. To replace only one table, first delete that table or rewrite the file. The `table` argument in `frame_to_hyper` and `frame_from_hyper` specifies the table within the Hyper file.
- ModuleNotFoundError: No module named 'pantab'
  cause The `pantab` library is not installed in the current Python environment.
  fix Run `pip install pantab` to install the library.
- tableauhyperapi.TableauHyperAPILibraryError: The Hyper API library could not be loaded. Please ensure that the Tableau Hyper API is installed and accessible.
  cause The underlying `tableauhyperapi` library or its native components are missing or corrupted, or the environment is not set up correctly to find them.
  fix Ensure `pantab` and `tableauhyperapi` are correctly installed: `pip install --upgrade pantab tableauhyperapi`. If `pip` cannot resolve the problem, a fresh environment or a reinstall of the Hyper API dependencies may be needed.
- TypeError: 'file_path' must be a Path or str, not NoneType
  cause The file path argument for `frame_to_hyper` or `frame_from_hyper` was not provided or was `None`.
  fix Pass a valid string or `pathlib.Path` object for the file path argument, pointing to the `.hyper` file you want to read or write.
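Two of the errors above (a missing module and a `None` path) can be caught before any Hyper file is touched. The following is a minimal sketch using only the standard library. The helper names `missing_modules` and `check_hyper_path` are hypothetical and not part of pantab, and the `.hyper` suffix check is an extra assumption of this sketch.

```python
from importlib.util import find_spec
from pathlib import Path


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


def check_hyper_path(file_path):
    """Validate a Hyper file path before handing it to pantab (hypothetical helper)."""
    if not isinstance(file_path, (str, Path)):
        raise TypeError(
            f"file_path must be a Path or str, not {type(file_path).__name__}"
        )
    path = Path(file_path)
    # Assumption of this sketch: extracts are expected to use the .hyper suffix.
    if path.suffix != ".hyper":
        raise ValueError(f"expected a .hyper file, got {path.name!r}")
    return path
```

Calling `missing_modules(["pantab", "tableauhyperapi"])` at startup gives a list of packages to install, and `check_hyper_path` turns a `None` path into an early, readable `TypeError`.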
Warnings
- breaking Pantab v5.3.0 dropped support for Python 3.9 and 3.10. Users on these Python versions will need to upgrade to Python 3.11 or later to use the latest Pantab.
- breaking Pantab relies on `tableauhyperapi` and periodically raises the minimum version it requires. Older `tableauhyperapi` versions can lead to `RuntimeError` or `ValueError`.
- gotcha Data type mapping between pandas and Hyper can have nuances. For instance, 8-bit integer columns in pandas will be stored as 16-bit integers in Hyper. Other types might also undergo conversions.
- gotcha Hyper files are single-writer, single-reader. Concurrent access or insufficient file permissions can lead to errors, especially on network drives or shared environments.
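The 8-bit integer gotcha above means a frame read back from Hyper may not have the same dtypes as the frame that was written. One way to keep round trips predictable is to widen those columns explicitly before writing. This is a sketch under that assumption; `widen_small_ints` is a hypothetical helper, not a pantab function, and it only covers the `int8`/`Int8` case named in the warning.

```python
import pandas as pd


def widen_small_ints(df):
    """Return a copy of df with 8-bit integer columns cast to 16-bit.

    Covers both the NumPy dtype (int8) and the nullable pandas dtype (Int8).
    """
    widened = {"int8": "int16", "Int8": "Int16"}
    mapping = {
        name: widened[str(dtype)]
        for name, dtype in df.dtypes.items()
        if str(dtype) in widened
    }
    # astype with a per-column mapping leaves all other columns unchanged.
    return df.astype(mapping) if mapping else df.copy()
```

Applying this before the write makes the widening visible in your own code rather than leaving it as an implicit side effect of the conversion.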
Install
- pip install pantab
Imports
- frame_to_hyper
from pantab import frame_to_hyper
- frame_from_hyper
from pantab import frame_from_hyper
Quickstart
import os

import pandas as pd
import pantab as pt

data = {
    'col1': [1, 2, 3],
    'col2': ['A', 'B', 'C'],
    'col3': [True, False, True],
}
df = pd.DataFrame(data)
hyper_file = 'my_data.hyper'
table_name = 'MyTable'

try:
    # Write DataFrame to Hyper file
    pt.frame_to_hyper(df, hyper_file, table=table_name)
    print(f"DataFrame written to {hyper_file} successfully.")

    # Read DataFrame from Hyper file
    read_df = pt.frame_from_hyper(hyper_file, table=table_name)
    print(f"DataFrame read from {hyper_file} successfully:")
    print(read_df)
finally:
    # Clean up the generated file
    if os.path.exists(hyper_file):
        os.remove(hyper_file)
        print(f"Cleaned up {hyper_file}.")
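The `table_mode` argument discussed under Common errors matters most when data arrives in several pieces. The sketch below writes the first frame with `table_mode='w'` and appends the rest with `table_mode='a'`. The function `write_in_batches` and its `writer` parameter are hypothetical and exist only for illustration; the default writer is assumed to be `pantab.frame_to_hyper` called with the `table` and `table_mode` keywords.

```python
def write_in_batches(frames, file_path, table, writer=None):
    """Write the first frame with table_mode='w', then append the rest with 'a'.

    Returns the list of table modes used, in order.
    """
    if writer is None:
        # Imported lazily so this function can be defined without pantab installed.
        from pantab import frame_to_hyper as writer

    modes = []
    for index, frame in enumerate(frames):
        # 'w' starts a fresh file; 'a' adds rows to the existing table.
        mode = "w" if index == 0 else "a"
        writer(frame, file_path, table=table, table_mode=mode)
        modes.append(mode)
    return modes
```

A call such as `write_in_batches([df_jan, df_feb], 'my_data.hyper', 'MyTable')` would leave one table holding the rows of both frames. Making the writer injectable is a design choice for this sketch: it lets the mode sequence be checked without creating a real Hyper file.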