Databricks AutoML Runtime
The Databricks AutoML Runtime package provides utilities and custom transformers designed to integrate with Databricks AutoML, particularly for time series and other specialized machine learning tasks. It offers scikit-learn-compatible transformers, hyperparameter tuning wrappers for models such as Prophet and pmdarima, and MLflow logging integrations. The current version is 0.2.21. Releases arrive at a moderate cadence and mostly address compatibility issues and bug fixes.
Common errors
- ModuleNotFoundError: No module named 'prophet'
  cause: The Prophet library is an optional dependency and must be installed explicitly before Prophet-related functionality can be used.
  fix: Install `databricks-automl-runtime` with the `prophet` extra: `pip install "databricks-automl-runtime[prophet]"`
- hyperopt.exceptions.UnsupportedSpace: The search space is not supported
  cause: This error, like similar `hyperopt`-related failures, often indicates an incompatible `hyperopt` version. The library explicitly requires `hyperopt>=0.2.7`.
  fix: Upgrade `hyperopt` to the required version: `pip install -U "hyperopt>=0.2.7"`
- TypeError: '...' object is not iterable (related to hyper-parameters) or ValueError during Prophet tuning.
  cause: Older versions of `databricks-automl-runtime` had bugs in Prophet's hyperparameter space, especially for enumeration types, which caused failures during tuning.
  fix: Upgrade `databricks-automl-runtime` to the latest version (`0.2.4.1` or newer) to pick up the bug fixes for Prophet hyper-parameter handling: `pip install -U databricks-automl-runtime`
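Before calling any Prophet or pmdarima functionality, it can help to confirm that the optional modules are importable at all. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of `databricks-automl-runtime`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "prophet" and "pmdarima" are the optional dependencies named in this document.
    missing = missing_modules(["prophet", "pmdarima"])
    if missing:
        print("Missing optional modules:", ", ".join(missing))
    else:
        print("All optional modules are importable.")
```

Running this in the target environment shows which extras still need to be installed, without triggering the `ModuleNotFoundError` in the middle of a tuning job.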
Warnings
- breaking: The library explicitly requires `hyperopt>=0.2.7`. Using older versions of hyperopt can lead to unexpected errors or crashes during hyperparameter tuning tasks, particularly when using `HyperoptEstimator` classes.
- gotcha: Prophet and pmdarima are optional dependencies. If you intend to use `ProphetHyperoptEstimator` or `ArimaHyperoptEstimator`, you must install the respective packages separately using the extra syntax (e.g., `pip install "databricks-automl-runtime[prophet]"`). A `ModuleNotFoundError` will occur otherwise.
- gotcha: Versions of the library before `0.2.4.1` (e.g., `0.2.3.1`, `0.2.4`) had known issues with Prophet hyper-parameter enumeration. This could lead to incorrect tuning outcomes or runtime errors when defining certain hyperparameter search spaces for Prophet models.
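The version requirements above (`hyperopt>=0.2.7` and `databricks-automl-runtime` at `0.2.4.1` or newer) can be checked programmatically. This is a sketch under stated assumptions: the helpers `parse_version`, `meets_minimum`, and `check_distribution` are hypothetical names, and the comparison handles only purely numeric dotted versions. A pre-release suffix such as `rc1` truncates the parse, so such versions are treated conservatively. For anything more involved, a dedicated version-parsing library is the safer choice.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.2.4.1' into (0, 2, 4, 1); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions after padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_distribution(dist_name, minimum):
    """Return (installed_version_or_None, requirement_satisfied)."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, meets_minimum(installed, minimum)


if __name__ == "__main__":
    # Minimum versions taken from the warnings in this document.
    for name, minimum in [("hyperopt", "0.2.7"), ("databricks-automl-runtime", "0.2.4.1")]:
        installed, ok = check_distribution(name, minimum)
        print(f"{name}: installed={installed}, requires>={minimum}, ok={ok}")
```

Padding with zeros means `0.2.4` compares as `0.2.4.0`, which correctly sorts below `0.2.4.1`.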
Install
- pip install databricks-automl-runtime
- pip install "databricks-automl-runtime[prophet]"
- pip install "databricks-automl-runtime[pmdarima]"
Imports
- DateTransformer
from automl_runtime.sklearn.date_time_transformers import DateTransformer
- ProphetHyperoptEstimator
from automl_runtime.hyperopt.prophet_hyperopt_estimator import ProphetHyperoptEstimator
- ArimaHyperoptEstimator
from automl_runtime.hyperopt.arima_hyperopt_estimator import ArimaHyperoptEstimator
Quickstart
import pandas as pd
from automl_runtime.sklearn.date_time_transformers import DateTransformer

# Create a sample DataFrame with a datetime column
df = pd.DataFrame({
    'timestamp_col': pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03']),
    'value': [10, 12, 15]
})

# Instantiate the DateTransformer
# This transformer extracts date-related features like year, month, day, day_of_week, etc.
date_transformer = DateTransformer(
    timestamp_col='timestamp_col',
    output_timestamp_col_name='datetime_features'
)

# Fit and transform the DataFrame
X_transformed = date_transformer.fit_transform(df)

print("Original DataFrame:\n", df)
print("\nTransformed DataFrame (first 5 columns):\n", X_transformed.iloc[:, :5])
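To make "date-related features" concrete, the sketch below derives year, month, day, and day of week from the same three dates using only the standard library. It illustrates the idea only and is not the `DateTransformer` implementation; the function name `date_features` and the output keys are hypothetical, and the features the library actually produces may differ.

```python
from datetime import datetime


def date_features(ts):
    """Derive simple calendar features from a single datetime value."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "day_of_week": ts.weekday(),  # Monday is 0, Sunday is 6
    }


if __name__ == "__main__":
    # Same dates as the sample DataFrame in the Quickstart.
    for text in ["2023-01-01", "2023-01-02", "2023-01-03"]:
        print(text, date_features(datetime.strptime(text, "%Y-%m-%d")))
```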