Hyperleaup
v0.1.2 · verified Mon Apr 27
Create and publish Tableau Hyper files from Apache Spark DataFrames and Spark SQL. Current version 0.1.2, beta-level with monthly-ish releases on PyPI.
pip install hyperleaup

Common errors
error HyperException: Type 'timestamp' is incompatible to type 'timestamptz' of column 'last_updated' in Parquet file. ↓
cause The timestamp column in the Spark DataFrame is a timestamp without a timezone, but the Hyper table expects timestamptz.
fix Set timestamp_with_timezone=True in the HyperFile/Creator constructor.

error ModuleNotFoundError: No module named 'hyperleaup' ↓
cause The package is not installed (or is imported under an old name).
fix Install via pip install hyperleaup and import as from hyperleaup import HyperFile.

Warnings
breaking v0.1.1 changed the default of `timestamp_with_timezone` to False for backward compatibility; you may need to set it to True to avoid type-mismatch errors. ↓
fix Explicitly set `timestamp_with_timezone=True` when creating a HyperFile if your data contains timezone-aware timestamps.
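The mismatch comes down to naive versus timezone-aware timestamps. A minimal illustration in plain Python (no hyperleaup required; the column semantics map onto these two datetime shapes):

```python
from datetime import datetime, timezone

# A 'timestamp' column holds naive values like this one (no tzinfo)...
naive = datetime(2024, 4, 27, 12, 0)

# ...while a 'timestamptz' column expects timezone-aware values.
aware = datetime(2024, 4, 27, 12, 0, tzinfo=timezone.utc)

print(naive.tzinfo)  # None
print(aware.tzinfo)  # UTC
```

Spark's default timestamp type is the naive form, which is why the flag is needed when the Hyper side declares timestamptz.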
gotcha Hyperleaup is tightly coupled to PySpark and Tableau Hyper API. Ensure both are installed and compatible. ↓
fix Run `pip install pyspark tableauhyperapi` before using hyperleaup.
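A quick preflight check for that gotcha, sketched with only the standard library (`missing_deps` is a hypothetical helper; the package names are the ones from the fix above):

```python
from importlib.util import find_spec

def missing_deps(packages=("pyspark", "tableauhyperapi", "hyperleaup")):
    """Return the required packages that are not importable in this environment."""
    return [pkg for pkg in packages if find_spec(pkg) is None]

missing = missing_deps()
if missing:
    print("Install first: pip install " + " ".join(missing))
```

Running this before importing hyperleaup turns a mid-job ModuleNotFoundError into an actionable message up front.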
gotcha The `LARGEFILE` creation mode is experimental and only works with Databricks File System (DBFS). Do not use on other distributed file systems. ↓
fix Use default creation mode unless you are on Databricks and understand the implications.
Imports
- HyperFile: from hyperleaup import HyperFile
- Creator: from hyperleaup import Creator
Quickstart
from pyspark.sql import SparkSession
from hyperleaup import HyperFile
spark = SparkSession.builder.appName("example").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
hf = HyperFile(name="example", df=df)  # pass the DataFrame at construction time
hf.save("/tmp/hyperleaup/example/")    # writes example.hyper under the target path
print("Hyper file created successfully.")