Hyperleaup
v0.1.2 · verified Mon Apr 27
Create and publish Tableau Hyper files from Apache Spark DataFrames and Spark SQL. Current version 0.1.2, beta-level with monthly-ish releases on PyPI.
pip install hyperleaup

Common errors
error HyperException: Type 'timestamp' is incompatible to type 'timestamptz' of column 'last_updated' in Parquet file. ↓
cause The timestamp column in the Spark DataFrame is a timestamp without a timezone, but the Hyper table expects timestamptz.
fix Set timestamp_with_timezone=True in the HyperFile/Creator constructor.

error ModuleNotFoundError: No module named 'hyperleaup' ↓
cause The package is not installed (or is imported under an old name).
fix Install via pip install hyperleaup and import as from hyperleaup import HyperFile.

Warnings
breaking v0.1.1 changed the default of `timestamp_with_timezone` to False for backward compatibility; you may need to set it to True to avoid type-mismatch errors. ↓
fix Explicitly set `timestamp_with_timezone=True` when creating a HyperFile if your data contains timezone-aware timestamps.
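The mismatch comes down to naive versus timezone-aware timestamps. A minimal illustration in plain Python (no hyperleaup required; the column semantics map onto these two datetime shapes):

```python
from datetime import datetime, timezone

# A 'timestamp' column holds naive values like this one (no tzinfo)...
naive = datetime(2024, 4, 27, 12, 0)

# ...while a 'timestamptz' column expects timezone-aware values.
aware = datetime(2024, 4, 27, 12, 0, tzinfo=timezone.utc)

print(naive.tzinfo)  # None
print(aware.tzinfo)  # UTC
```

Spark's default timestamp type is the naive form, which is why the flag is needed when the Hyper side declares timestamptz.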
gotcha Hyperleaup is tightly coupled to PySpark and Tableau Hyper API. Ensure both are installed and compatible. ↓
fix Run `pip install pyspark tableauhyperapi` before using hyperleaup.
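A quick preflight check for that gotcha, sketched with only the standard library (`missing_deps` is a hypothetical helper; the package names are the ones from the fix above):

```python
from importlib.util import find_spec

def missing_deps(packages=("pyspark", "tableauhyperapi", "hyperleaup")):
    """Return the required packages that are not importable in this environment."""
    return [pkg for pkg in packages if find_spec(pkg) is None]

missing = missing_deps()
if missing:
    print("Install first: pip install " + " ".join(missing))
```

Running this before importing hyperleaup turns a mid-job ModuleNotFoundError into an actionable message up front.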
gotcha The `LARGEFILE` creation mode is experimental and only works with Databricks File System (DBFS). Do not use on other distributed file systems. ↓
fix Use default creation mode unless you are on Databricks and understand the implications.
Imports
- HyperFile: from hyperleaup import HyperFile
- Creator: from hyperleaup import Creator
Quickstart
from pyspark.sql import SparkSession
from hyperleaup import HyperFile
spark = SparkSession.builder.appName("example").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
hf = HyperFile(name="example", df=df)  # pass the DataFrame at construction time
hf.save("/tmp/hyperleaup/example/")    # writes example.hyper under the target path
print("Hyper file created successfully.")