AutoGluon Core
AutoGluon is an open-source AutoML library developed by AWS AI that automates machine learning tasks with minimal code. It supports tabular, image, text, and time series data, so users can train and deploy accurate models with little manual tuning. The current version is 1.5.0. Releases add features, performance improvements, and bug fixes, with major releases arriving roughly quarterly and patch releases in between.
Warnings
- breaking: Models trained with an older version of AutoGluon are generally not compatible with newer versions. Users are advised to re-train models after upgrading the library.
- breaking: Python 3.8 support was dropped in AutoGluon v1.2.0. Users on Python 3.8 will need to upgrade their Python environment to use v1.2.0 or newer.
- deprecated: Several methods of `TabularPredictor` (e.g., `persist_models`, `get_model_names`, `get_pred_from_proba`) were deprecated in v1.0.0, started raising errors in v1.2.0, and were removed in v1.3.0.
- gotcha: `autogluon-core` is a foundational package primarily containing core utilities, searchers, and schedulers for hyperparameter tuning. Most users performing machine learning tasks will interact with higher-level packages like `autogluon.tabular`, `autogluon.timeseries`, or `autogluon.multimodal` (via `pip install autogluon`).
- gotcha: The `extreme` preset for `TabularPredictor`, introduced in v1.4.0, often requires a CUDA-compatible GPU (ideally with 32+ GB vRAM) for optimal performance and is most effective for datasets with at most 30,000 samples. Inference time can also be longer than with other presets.
- gotcha: Python 3.13 support, introduced in v1.5.0, is currently experimental. Some features may not be available when running on Python 3.13, particularly on Windows.
- breaking: For users previously using `"TABPFNV2"` as a model, AutoGluon v1.4.0 (and newer) strongly recommends switching to `"REALTABPFN-V2"` to avoid breaking changes related to the underlying TabPFN releases.
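The Python-version and re-training warnings above can be turned into a small pre-flight check. The following is a minimal sketch using only the standard library. The helper names `python_support_status` and `needs_retraining` are hypothetical and not part of AutoGluon, and the "any version difference means re-train" rule is a deliberately conservative assumption based on the first warning.

```python
import sys


def python_support_status(version=None):
    """Classify a (major, minor) Python version using only the warnings above.

    Hypothetical helper, not an AutoGluon API.
    """
    major, minor = version or sys.version_info[:2]
    if (major, minor) <= (3, 8):
        # Python 3.8 support was dropped in AutoGluon v1.2.0.
        return "unsupported"
    if (major, minor) == (3, 13):
        # Python 3.13 support is experimental as of v1.5.0.
        return "experimental"
    # The warnings say nothing specific about other versions.
    return "not flagged"


def needs_retraining(trained_version, installed_version):
    """Assumed conservative rule: re-train whenever the version strings differ."""
    return trained_version.strip() != installed_version.strip()


print(python_support_status())              # status of the running interpreter
print(needs_retraining("1.1.1", "1.5.0"))   # model trained on an older release
```

A check like this could run at application start-up, before any saved predictor is loaded.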
Install
pip install autogluon
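After installing, it can help to confirm which AutoGluon packages are actually present, since `autogluon-core` alone does not provide the predictors. A minimal sketch using the standard library's `importlib.metadata` follows. The helper name `installed_version` is hypothetical, and the list of distribution names is an assumption about what `pip install autogluon` pulls in.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumed distribution names; adjust to the packages you actually installed.
for name in ("autogluon", "autogluon.core", "autogluon.tabular"):
    print(name, installed_version(name) or "not installed")
```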
Imports
- TabularPredictor
from autogluon.tabular import TabularPredictor
- TabularDataset
from autogluon.tabular import TabularDataset
Quickstart
import pandas as pd
from autogluon.tabular import TabularPredictor, TabularDataset
# Create a dummy CSV for demonstration
data = {
    'feature1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'feature2': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C'],
    'target': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
}
df = pd.DataFrame(data)
df.to_csv('train.csv', index=False)
# Load data using TabularDataset
train_data = TabularDataset('train.csv')
# Initialize and train a TabularPredictor
label = 'target'
predictor = TabularPredictor(label=label, path='AutogluonModels').fit(train_data)
# Make predictions (example test data)
test_data = TabularDataset(pd.DataFrame({
    'feature1': [11, 12, 13],
    'feature2': ['A', 'B', 'C']
}))
predictions = predictor.predict(test_data)
print("Predictions:")
print(predictions)
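The quickstart predicts on unlabeled rows, so it gives no measure of model quality. One possible way to add that is to hold out labeled rows and score them, sketched below. The synthetic data, the labelling rule, and the helpers `make_rows`, `holdout_split`, and `train_and_evaluate` are all hypothetical. The sketch assumes `fit` accepts `presets` and `time_limit`, and that `evaluate` and `TabularPredictor.load` behave as in recent AutoGluon releases. The AutoGluon imports are kept inside the function so the data helpers run without the library installed.

```python
import random


def make_rows(n, seed=0):
    """Build n synthetic rows shaped like the quickstart data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f1 = rng.randint(1, 100)
        f2 = rng.choice(["A", "B", "C"])
        # Hypothetical labelling rule, chosen only so there is a signal to learn.
        rows.append({"feature1": f1, "feature2": f2, "target": int(f1 > 50)})
    return rows


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train_rows, test_rows)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_and_evaluate(train_rows, test_rows, label="target",
                       path="AutogluonModels", presets=None, time_limit=60):
    """Train on the training rows, score the held-out rows, then reload from disk."""
    import pandas as pd
    from autogluon.tabular import TabularPredictor

    train_df = pd.DataFrame(train_rows)
    test_df = pd.DataFrame(test_rows)
    predictor = TabularPredictor(label=label, path=path).fit(
        train_df, presets=presets, time_limit=time_limit
    )
    scores = predictor.evaluate(test_df)       # metrics on the labeled holdout
    reloaded = TabularPredictor.load(path)     # same library version as training
    predictions = reloaded.predict(test_df.drop(columns=[label]))
    return scores, predictions


train_rows, test_rows = holdout_split(make_rows(500))
print(len(train_rows), len(test_rows))
```

With AutoGluon installed, calling `train_and_evaluate(train_rows, test_rows)` would train and score a model. Passing `presets="extreme"` is subject to the GPU and dataset-size caveats listed under Warnings.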