Hyperleaup

0.1.2 verified Mon Apr 27 auth: no python

Create and publish Tableau Hyper files from Apache Spark DataFrames and Spark SQL. The current version is 0.1.2; the package is in beta, with roughly monthly releases on PyPI.

pip install hyperleaup
error HyperException: Type 'timestamp' is incompatible to type 'timestamptz' of column 'last_updated' in Parquet file.
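cause The Parquet file written from the Spark DataFrame stores the column as timestamptz, while the Hyper table definition declares it as timestamp (without time zone).
fix Set `timestamp_with_timezone=True` on a `HyperFileConfig` and pass it to `HyperFile` through the `config` argument.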
error ModuleNotFoundError: No module named 'hyperleaup'
cause Package was renamed or not installed.
fix Install with `pip install hyperleaup` and import with `from hyperleaup import HyperFile`.
breaking In v0.1.1, the default for `timestamp_with_timezone` was changed to False to maintain backward compatibility; set it to True if you hit timestamp type-mismatch errors.
fix Explicitly set `timestamp_with_timezone=True` (via `HyperFileConfig`) when creating the HyperFile if your data contains timestamps with time zones.
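
As a minimal sketch of the fix above, the helper below turns the flag on only when the DataFrame actually contains timestamp columns. It assumes the flag is carried by a `HyperFileConfig` object passed to `HyperFile` through `config=`, as described in the hyperleaup README. `timestamp_columns` and `build_hyper_file` are hypothetical names introduced here for illustration.

```python
from typing import List, Tuple


def timestamp_columns(dtypes: List[Tuple[str, str]]) -> List[str]:
    """Return the names of columns whose Spark type string is 'timestamp'.

    `dtypes` has the same shape as PySpark's DataFrame.dtypes:
    a list of (column_name, type_string) tuples.
    """
    return [name for name, dtype in dtypes if dtype == "timestamp"]


def build_hyper_file(df, name: str):
    """Create a HyperFile, enabling timestamptz only when it is needed."""
    # Imported lazily so timestamp_columns() works without hyperleaup installed.
    # Assumption: both names are exported from the package top level.
    from hyperleaup import HyperFile, HyperFileConfig

    use_tz = bool(timestamp_columns(df.dtypes))
    config = HyperFileConfig(timestamp_with_timezone=use_tz)
    return HyperFile(name=name, df=df, config=config)
```

Calling `build_hyper_file(df, "example")` requires a live Spark session and hyperleaup; `timestamp_columns` can be checked on its own against any `df.dtypes` output.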
gotcha Hyperleaup is tightly coupled to PySpark and the Tableau Hyper API. Ensure both are installed and compatible with each other.
fix Run `pip install pyspark tableauhyperapi` before using hyperleaup.
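
A quick way to confirm the environment before importing hyperleaup is to query the installed distributions. The sketch below uses only the standard library; `installed_versions` and `REQUIRED` are hypothetical names, and the distribution names are taken from the `pip install` commands on this page.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Distribution names as they appear in the pip commands above.
REQUIRED: Tuple[str, ...] = ("hyperleaup", "pyspark", "tableauhyperapi")


def installed_versions(packages: Tuple[str, ...] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions


if __name__ == "__main__":
    for pkg, version in installed_versions().items():
        print(f"{pkg}: {version or 'NOT INSTALLED'}")
```

This only reports what is installed; it does not check that the PySpark and Hyper API versions are compatible with each other.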
gotcha The `LARGEFILE` creation mode is experimental and only works with Databricks File System (DBFS). Do not use on other distributed file systems.
fix Use default creation mode unless you are on Databricks and understand the implications.
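
One way to follow this advice is to request `LARGEFILE` only when a DBFS mount is actually present and otherwise leave the creation mode at its default. The sketch below makes several assumptions: that `HyperFile` accepts a `creation_mode` keyword taking the string `"LARGEFILE"`, and that the presence of a `/dbfs` directory is a reasonable heuristic for running on Databricks. `creation_mode_kwargs` is a hypothetical helper.

```python
import os
from typing import Dict


def creation_mode_kwargs(want_largefile: bool, dbfs_root: str = "/dbfs") -> Dict[str, str]:
    """Return extra HyperFile keyword arguments for the creation mode.

    An empty dict means "do not pass creation_mode", which leaves
    hyperleaup on its default mode.
    """
    # Heuristic (assumption): DBFS is exposed as a local directory on Databricks.
    on_dbfs = os.path.isdir(dbfs_root)
    if want_largefile and on_dbfs:
        return {"creation_mode": "LARGEFILE"}
    return {}
```

The result can be splatted into the constructor, for example `HyperFile(name="sales", df=df, **creation_mode_kwargs(want_largefile=True))`, so that the same code falls back to the default mode outside Databricks.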

Creates a simple Hyper file from a Spark DataFrame.

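from pyspark.sql import SparkSession
from hyperleaup import HyperFile

spark = SparkSession.builder.appName("example").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# The DataFrame is passed to the constructor together with a name
# (assumed signature: HyperFile(name=..., df=...), per the hyperleaup README).
hf = HyperFile(name="example", df=df)

# save() takes only a destination path; this directory is a placeholder.
hf.save("/tmp/hyperleaup_output")
print("Hyper file created successfully.")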