pmdarima (Auto-ARIMA)
pmdarima is a Python library that provides an equivalent to R's `auto.arima` function, automating the selection of ARIMA (AutoRegressive Integrated Moving Average) model orders for time series forecasting. It builds on `statsmodels` but exposes a scikit-learn-style estimator API (`fit`, `predict`), so a model can be fitted and used for forecasting in a few lines. The library is currently at version 2.1.1 and receives regular updates for Python version compatibility and dependency support.
Warnings
- breaking Version 2.1.0 removed support for Python 3.7, 3.8, and 3.9. Projects using these older Python versions must either remain on pmdarima < 2.1.0 or upgrade their Python environment.
- breaking Version 2.1.0 introduced support for NumPy 2.x and simultaneously removed support for NumPy 1.x. Environments pinned to NumPy 1.x may hit build or runtime errors until NumPy is upgraded.
- breaking Version 2.1.0 raised the minimum required versions of SciPy to >=1.13.0 and statsmodels to >=0.14.5. Older versions of these dependencies will likely cause installation issues or runtime errors.
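- gotcha The `m` parameter (seasonal period) in `auto_arima` is not automatically detected and must be specified by the user (for example, 12 for monthly data with a yearly cycle, or 7 for daily data with a weekly cycle). The default `m=1` means no seasonality. An incorrect `m` value can lead to suboptimal or erroneous seasonal models.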
- deprecated The `exogenous` and `sarimax_kwargs` arguments to `ARIMA` and `auto_arima` were deprecated in earlier 1.x versions and raise a `TypeError` if used in 2.0.0 and later. Pass exogenous regressors through the `X` argument instead.
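The version requirements in the warnings above can be checked before installing or upgrading. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `check_environment` are hypothetical, the Python 3.10 floor is inferred from the removal of 3.7 through 3.9, and the NumPy floor of 2.0 is an assumed reading of "NumPy 2.x".

```python
import sys
from importlib import metadata

# Minimums copied from the warnings above. The NumPy floor (2.0) and the
# Python floor (3.10) are assumptions inferred from those warnings.
MINIMUMS = {"numpy": (2, 0), "scipy": (1, 13, 0), "statsmodels": (0, 14, 5)}
MIN_PYTHON = (3, 10)


def parse_version(text):
    """Turn a string such as '1.13.0rc1' into (1, 13, 0).

    Only the leading digits of the first three dot-separated parts are kept,
    which is enough for a coarse minimum-version comparison.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment():
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(f"Python {found} is older than 3.10")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


for problem in check_environment():
    print("warning:", problem)
```

An empty result only means that none of the listed minimums were violated; it does not guarantee that installation will succeed.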
Install
pip install pmdarima
Imports
- auto_arima
from pmdarima import auto_arima
- ARIMA
from pmdarima.arima import ARIMA
Quickstart
import pmdarima as pm
import numpy as np
import matplotlib.pyplot as plt  # only needed for the optional plot at the end
# Generate some sample time series data
y = np.random.rand(100) * 10 + np.arange(100) # Simple trend + noise
# Fit a stepwise auto_arima model
model = pm.auto_arima(y,
                      start_p=1, start_q=1,
                      test='adf',        # use adftest to find optimal 'd'
                      max_p=3, max_q=3,  # maximum p and q
                      m=1,               # frequency of series
                      d=None,            # let model determine 'd'
                      seasonal=False,    # No seasonality
                      start_P=0,
                      D=0,
                      trace=False,       # Suppress verbose output
                      error_action='ignore',
                      suppress_warnings=True,
                      stepwise=True)
# Make predictions
forecast, conf_int = model.predict(n_periods=10, return_conf_int=True)
print("Forecast:", forecast)
print("Confidence Interval:", conf_int)
# Optional: plot results
# plt.plot(y, label='Actual')
# plt.plot(np.arange(len(y), len(y) + len(forecast)), forecast, label='Forecast')
# plt.fill_between(np.arange(len(y), len(y) + len(forecast)),
#                  conf_int[:, 0], conf_int[:, 1], alpha=0.1)
# plt.legend()
# plt.show()