TeradataML
teradataml is a Python package that provides an interface for advanced analytics on Teradata Vantage. It lets users draw on Vantage's massively parallel processing for data manipulation, transformation, and analytic functions without writing extensive SQL. The current version is 20.0.0.10, and it receives frequent minor updates within its major releases.
Warnings
- breaking Starting with teradataml 20.0.0.9, optional dependencies are no longer installed by default, to reduce the install footprint and modularize the package. Users must install them explicitly with `pip install teradataml[<feature_name>]` (e.g., `teradataml[automl]`, `teradataml[openml]`, `teradataml[visualization]`, `teradataml[eda-ui]`).
- gotcha The `fastload()` utility in teradataml does not support table names that start with a digit, even though such names work with `copy_to_sql()` and with SQL directly.
- gotcha Connections can fail if the Teradata system has 'Require Confidentiality' enabled in `gtwcontrol` due to data encryption issues with older `teradatasql` versions.
- gotcha When upgrading `teradataml`, `pip install` may use a cached version. To ensure the new version is downloaded, use the `--no-cache-dir` option.
- breaking In version 20.0.0.10, the `set_auth_token` function's return type changed from a boolean to the class object. While functionality remains the same, this might affect code expecting a boolean return.
- deprecated In earlier versions (e.g., 16.20.00.01), old analytic functions were deprecated due to namespace changes. While not directly breaking for current versions, users upgrading from very old `teradataml` might encounter this.
- gotcha When creating a SQLAlchemy engine for use with `teradataml`'s `create_context`, specify the `teradatasql` dialect, not a generic `teradata` dialect, to avoid `NoSuchModuleError`.
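Two of the gotchas above can be guarded against in code. The following is a minimal sketch, not part of teradataml: `teradatasql_url` and `fastload_safe` are hypothetical helper names, and the URL layout assumes the usual SQLAlchemy `dialect://user:password@host` form.

```python
from urllib.parse import quote


def teradatasql_url(host, username, password):
    """Build a SQLAlchemy URL string that names the teradatasql dialect.

    Hypothetical helper. Credentials are percent-encoded so characters
    such as '@' or '/' do not break URL parsing.
    """
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"teradatasql://{user}:{pwd}@{host}"


def fastload_safe(table_name):
    """Return False when the table name starts with a digit.

    Hypothetical helper reflecting the fastload() limitation noted above;
    for such names, copy_to_sql() is the documented alternative.
    """
    return not table_name[:1].isdigit()


# Assumed usage (needs sqlalchemy, teradatasqlalchemy and a reachable system):
#   from sqlalchemy import create_engine
#   from teradataml import create_context
#   engine = create_engine(teradatasql_url("your_teradata_host", "user", "pw"))
#   create_context(tdsqlengine=engine)
```

The helpers only build and check strings, so they can be used before any connection exists.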
Install
- pip install teradataml
- pip install teradataml[automl]
- pip install teradataml[openml]
- pip install teradataml[visualization]
- pip install teradataml[eda-ui]
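Because pip may reuse a cached wheel when upgrading, it can help to confirm which version is actually installed. The sketch below uses only the standard library; `installed_version` and `pip_upgrade_command` are hypothetical helper names, and the command is built but not executed.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="teradataml"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pip_upgrade_command(extra=None):
    """Build (but do not run) a pip upgrade command that bypasses the cache.

    `extra` is an optional feature name such as "automl".
    """
    target = f"teradataml[{extra}]" if extra else "teradataml"
    return [sys.executable, "-m", "pip", "install",
            "--upgrade", "--no-cache-dir", target]


print("teradataml version:", installed_version())
print("upgrade command:", " ".join(pip_upgrade_command("automl")))
```

The returned list can be passed to `subprocess.run` if the upgrade should be triggered from Python.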
Imports
- create_context
from teradataml import create_context
- DataFrame
from teradataml import DataFrame
- copy_to_sql
from teradataml import copy_to_sql
- remove_context
from teradataml import remove_context
- in_schema
from teradataml import in_schema
from teradataml.dataframe.dataframe import in_schema
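The imports above are typically used together: `copy_to_sql` writes a pandas DataFrame to a table, and `in_schema` qualifies a table name with its database when building a `DataFrame`. The sketch below assumes an active connection context; `upload_and_read` is a hypothetical helper name, and the table and database names are placeholders.

```python
def upload_and_read(pandas_df, table_name, schema_name):
    """Write a pandas DataFrame to Vantage, then read it back.

    Hypothetical helper. Assumes create_context() has already been called.
    The import is deferred so this function can be defined without a
    teradataml installation.
    """
    from teradataml import DataFrame, copy_to_sql, in_schema

    # if_exists="replace" overwrites an existing table of the same name
    copy_to_sql(pandas_df, table_name=table_name,
                schema_name=schema_name, if_exists="replace")

    # in_schema() produces the database-qualified name for DataFrame()
    return DataFrame(in_schema(schema_name, table_name))


# Assumed usage (requires a live connection):
#   import pandas as pd
#   td_df = upload_and_read(pd.DataFrame({"id": [1, 2]}), "demo_table", "your_database")
#   print(td_df.head())
```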
Quickstart
import os
from teradataml import create_context, DataFrame, remove_context

# Read connection settings from the environment; the fallbacks are placeholders
host = os.environ.get('TD_HOST', 'your_teradata_host')
username = os.environ.get('TD_USERNAME', 'your_username')
password = os.environ.get('TD_PASSWORD', 'your_password')
temp_database_name = os.environ.get('TD_TEMP_DB', 'your_temp_db')

connected = False
# Establish connection to Teradata Vantage
try:
    create_context(host=host, username=username, password=password,
                   logmech='TD2', temp_database_name=temp_database_name)
    connected = True
    print("Successfully connected to Teradata Vantage.")

    # Create a teradataml DataFrame from an existing table.
    # 'sample_data' is a placeholder; replace it with a table in your default database.
    td_df = DataFrame('sample_data')
    print("\nTeradataml DataFrame head:")
    print(td_df.head())
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up the connection only if one was established
    if connected:
        remove_context()
        print("Connection context removed.")
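The connect-then-clean-up pattern in the Quickstart can also be packaged as a context manager so the cleanup is not repeated in every script. This is a sketch under assumptions rather than a teradataml feature: `vantage_session` is a hypothetical name, and it reuses the `TD_HOST`, `TD_USERNAME`, and `TD_PASSWORD` environment variables from the Quickstart.

```python
import os
from contextlib import contextmanager


@contextmanager
def vantage_session(**overrides):
    """Open a teradataml context and always remove it on exit.

    Hypothetical helper. Keyword arguments override the environment
    defaults and are passed through to create_context().
    """
    # Deferred import so the helper can be defined without teradataml installed
    from teradataml import create_context, remove_context

    params = {
        "host": os.environ.get("TD_HOST", "your_teradata_host"),
        "username": os.environ.get("TD_USERNAME", "your_username"),
        "password": os.environ.get("TD_PASSWORD", "your_password"),
        "logmech": "TD2",
    }
    params.update(overrides)

    # If create_context() raises, nothing was opened, so nothing is removed
    create_context(**params)
    try:
        yield
    finally:
        remove_context()


# Assumed usage (requires a reachable Vantage system):
#   from teradataml import DataFrame
#   with vantage_session():
#       print(DataFrame("sample_data").head())
```

Because `remove_context()` sits in a `finally` clause, the connection is released even when the body of the `with` block raises.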