FLAML: A Fast Library for Automated Machine Learning and Tuning
FLAML (Fast Library for Automated Machine Learning) is an open-source Python library developed by Microsoft for efficient automation of machine learning and AI operations. It streamlines tasks such as model selection and hyperparameter optimization, and supports a wide range of models including classical machine learning algorithms, deep neural networks, and large language models. As of version 2.5.0, FLAML is under active development; recent releases have broadened Python version compatibility (e.g., Python 3.13 support), improved performance, and updated the documentation.
Warnings
- breaking FLAML's Python version requirements have become stricter. As of version 2.5.0, it requires Python >= 3.10 and < 3.14. Using older Python versions will result in installation failures or runtime errors.
- breaking The `autogen` module, previously bundled with FLAML, has been moved to its own independent `autogen` library. Attempting to import `flaml.autogen` will fail.
- gotcha When integrating with Ray Tune, the import path for `tune.report` might cause issues if not updated for `ray>=2`. Older `ray` versions or inconsistent imports can lead to errors.
- gotcha When using `flaml.tune.run` for hyperparameter optimization, a warning might appear regarding missing 'low_cost_partial_config'. This parameter is crucial for cost-frugal search, and omitting it can lead to less efficient tuning.
- gotcha In specific deployment environments (e.g., Snowflake Snowpark), the log file path set via `log_file_name` in `automl_settings` might not be writable or accessible, causing failures. The path `flaml_iris.log` in the quickstart is relative to the current working directory, which might not be appropriate in sandboxed environments.
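The `low_cost_partial_config` warning can be illustrated with a small sketch. The objective below is a made-up toy (the names `evaluate_config`, `run_search`, `n_rounds`, and `learning_rate` are hypothetical, not part of FLAML); it only assumes that `flaml.tune.run` accepts `config`, `low_cost_partial_config`, `metric`, `mode`, `time_budget_s`, and `num_samples`, and that the evaluation function may return a dictionary of metrics. The idea is to tell the search which hyperparameter value is cheap to evaluate so it can start there.

```python
def evaluate_config(config):
    """Toy objective: evaluation cost grows with n_rounds, loss shrinks with it."""
    n_rounds = int(config["n_rounds"])
    lr = config["learning_rate"]
    loss = 1.0
    for _ in range(n_rounds):  # more rounds means more work per trial
        loss *= 1.0 - 0.1 * lr
    loss += (lr - 0.3) ** 2  # penalty for straying from an arbitrary "good" rate
    return {"loss": loss}


def run_search(time_budget_s=5):
    """Run a short search, starting from the cheapest setting of n_rounds."""
    from flaml import tune  # imported lazily; requires FLAML to be installed

    return tune.run(
        evaluate_config,
        config={
            "n_rounds": tune.lograndint(lower=1, upper=1000),
            "learning_rate": tune.uniform(lower=0.01, upper=1.0),
        },
        # Only cost-related hyperparameters need to appear here.
        low_cost_partial_config={"n_rounds": 1},
        metric="loss",
        mode="min",
        time_budget_s=time_budget_s,
        num_samples=-1,  # no trial limit; the time budget stops the search
    )
```

Calling `run_search()` should return an analysis object whose `best_config` holds the best configuration found. Returning a dictionary from the evaluation function, rather than calling `tune.report`, also sidesteps the Ray-related import concern in this sketch.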
Install
- pip install flaml
- pip install "flaml[automl]"
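Because the warnings give a supported interpreter range of Python >= 3.10 and < 3.14 for version 2.5.0, a quick pre-install check can save a failed install. The following is a minimal sketch using only the standard library; the helper name `python_supported` and the two constants are illustrative, and the bounds are copied from the warning rather than read from the package metadata.

```python
import sys

# Bounds taken from the warning for FLAML 2.5.0 (an assumption for other versions).
MIN_PYTHON = (3, 10)            # inclusive
MAX_PYTHON_EXCLUSIVE = (3, 14)  # exclusive


def python_supported(version_info=None):
    """Return True if (major, minor) lies in [MIN_PYTHON, MAX_PYTHON_EXCLUSIVE)."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


if not python_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the range stated for FLAML 2.5.0"
    )
```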
Imports
- AutoML
from flaml import AutoML
- tune
from flaml import tune
- LGBMRegressor
from flaml.default import LGBMRegressor
- autogen
import autogen
Quickstart
from flaml import AutoML
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
# Load a sample dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize AutoML
automl = AutoML()
# Define settings for AutoML
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": "accuracy",
    "task": "classification",
    "log_file_name": "flaml_iris.log",  # Optional: logs will be saved here
}
# Train the AutoML model
print("Starting AutoML training...")
automl.fit(X_train=X_train, y_train=y_train, **automl_settings)
print("AutoML training finished.")
# Best model details
print(f"Best estimator: {automl.model.estimator}")
# FLAML minimizes a loss; for the "accuracy" metric the loss is 1 - accuracy
print(f"Best validation accuracy: {1 - automl.best_loss:.4g}")
# Make predictions
predictions = automl.predict(X_test)
print(f"Sample predictions: {predictions[:5]}")
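The quickstart writes its log to a path relative to the current working directory, which the warnings note may not be writable in sandboxed environments. One possible workaround, sketched here with the standard library only, is to pick a writable directory before building `automl_settings`. The helper `writable_log_path` is hypothetical (not a FLAML function), and `os.access` is only a best-effort check that may not reflect every sandbox's restrictions.

```python
import os
import tempfile


def writable_log_path(filename, preferred_dir=None):
    """Return a path for `filename` inside a directory that appears writable.

    Tries `preferred_dir` (default: current working directory) first and
    falls back to the system temporary directory.
    """
    candidates = [preferred_dir or os.getcwd(), tempfile.gettempdir()]
    for directory in candidates:
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            return os.path.join(directory, filename)
    raise OSError(f"No writable directory found for {filename!r}")


# Same settings as the quickstart, with the log path resolved up front.
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": "accuracy",
    "task": "classification",
    "log_file_name": writable_log_path("flaml_iris.log"),
}
```

The resulting dictionary can be passed to `automl.fit(X_train=X_train, y_train=y_train, **automl_settings)` exactly as in the quickstart.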