Triad
Triad is a Python utility library built primarily to support the Fugue project. It collects common utilities for data processing, schema management, and function dispatching. The current version is 1.0.2; releases are frequent and often track Fugue updates or compatibility changes in key data science libraries such as Pandas and PyArrow.
Warnings
- breaking: The `fs` module and its associated functionality were removed in version 1.0.0. Projects that rely on `triad`'s file system abstractions need to be refactored.
- gotcha: Triad is updated frequently to stay compatible with new Pandas versions (e.g., Pandas 2.0, 2.2, 3.0). Mismatched `triad` and `pandas` versions can cause unexpected behavior or errors, so pin the two together.
- gotcha: The `ciso8601` package is an optional dependency for faster datetime parsing. If it is not installed, `triad` falls back to slower Python-native parsing, which can hurt performance. This matters most on Windows, where `ciso8601` has historically been handled as a soft dependency.
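The two gotchas above can be checked from a script. The following is a minimal sketch that uses only the standard library: `installed_versions` and `parse_iso` are hypothetical helper names, and `parse_iso` illustrates the general optional-dependency fallback pattern, not triad's internal implementation.

```python
from datetime import datetime
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence


def installed_versions(
    packages: Sequence[str] = ("triad", "pandas", "pyarrow", "ciso8601"),
) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, preferring ciso8601 when it is available."""
    try:
        import ciso8601  # optional C-accelerated parser
    except ImportError:
        # Fallback: the slower parser from the standard library.
        return datetime.fromisoformat(ts)
    return ciso8601.parse_datetime(ts)


# Report what is installed so version mismatches are visible up front.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")

print(parse_iso("2024-01-02T03:04:05"))
```

Running this before an upgrade makes it easier to see which `triad` and `pandas` versions are paired and whether the faster parser is present.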
Install
- pip install triad
- pip install triad[ciso8601]
Imports
- FunctionWrapper
from triad.collections.function_wrapper import FunctionWrapper
- Schema
from triad.collections import Schema
- run_at_def
from triad.utils.dispatcher import run_at_def
- assertion
from triad.utils import assertion
Quickstart
import pandas as pd
from triad.collections import Schema
from triad.collections.function_wrapper import FunctionWrapper

# Example of FunctionWrapper for signature validation
def process_dataframes(pa: pd.DataFrame, pb: pd.DataFrame, *args) -> pd.DataFrame:
    """A dummy function to demonstrate FunctionWrapper."""
    print(f"[FunctionWrapper] Received dataframes with columns: {pa.columns.tolist()}, {pb.columns.tolist()}")
    print(f"[FunctionWrapper] Received other args: {args}")
    return pd.DataFrame({"sum_rows": [len(pa) + len(pb)]})

# FunctionWrapper takes the function as its first argument. params_re and
# return_re are regular expressions matched against triad's type codes for
# the signature; the defaults (".*") accept any signature.
wrapped = FunctionWrapper(process_dataframes, params_re=".*", return_re=".*")

# Create some dummy dataframes
df1 = pd.DataFrame({"col1": [1, 2], "col2": ["a", "b"]})
df2 = pd.DataFrame({"colA": [3, 4, 5], "colB": ["c", "d", "e"]})

# Call the wrapped function
result = wrapped(df1, df2, "extra_param", 42)
print(f"\nFunction result:\n{result}")

# Example of Schema usage
s = Schema("id:int,name:str,value:double")
print(f"\nCreated Schema: {str(s)}")

# Convert a dictionary to a Pandas DataFrame conforming to the schema,
# going through a PyArrow table built from the schema's pa_schema
import pyarrow as pa
data = {"id": [101, 102], "name": ["Alice", "Bob"], "value": [1.1, 2.2]}
df_from_schema = pa.Table.from_pydict(data, schema=s.pa_schema).to_pandas()
print(f"\nDataFrame from Schema:\n{df_from_schema}")