Hierarchical Forecast
HierarchicalForecast, currently at version 1.5.1, is a Python library offering a comprehensive collection of cross-sectional and temporal reconciliation methods for hierarchical time series forecasting. It provides various reconciliation techniques, including BottomUp, TopDown, MiddleOut, MinTrace, and ERM, as well as probabilistic coherent prediction methods like Normality, Bootstrap, and Conformal. The library is actively maintained with regular releases and focuses on bridging the gap between statistical modeling and machine learning in time series analysis.
Common errors
- ValueError: unique_id column not found. The input dataframe must contain a unique_id column.
cause: The input DataFrame (e.g., `Y_df`, `fcst_df`, `S_df`) is indexed by `unique_id` or lacks a `unique_id` column entirely. Using `unique_id` as the index is no longer supported as of v1.0.0.
fix: Convert the `unique_id` index to a column with `.reset_index(names='unique_id')`, or make sure a column named `unique_id` exists in the DataFrame.
- The forecasts are not coherent (e.g., sums of lower levels do not match higher levels).
cause: Base forecasts are generated independently for each series in the hierarchy, with no aggregation constraints enforced, so lower-level forecasts need not add up to the higher-level ones.
fix: Apply one of the reconciliation methods through `hierarchicalforecast.core.HierarchicalReconciliation`. For example, use `BottomUp`, `TopDown`, or `MinTrace` to enforce coherence across the hierarchy.
- ModuleNotFoundError: No module named 'statsforecast'
cause: The quickstart and examples rely on `statsforecast` to generate base forecasts, but the package is not installed.
fix: Install the `statsforecast` library: `pip install statsforecast`.
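To make the coherence entry above concrete, here is a minimal NumPy sketch of what an incoherent set of forecasts looks like and how a bottom-up pass removes the mismatch. The two-item hierarchy, the forecast values, and the names `S`, `y_hat`, and `y_bottom_up` are illustrative assumptions; this is not the library's implementation.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B.
# Rows of S: Total, A, B. Columns: the bottom-level series A, B.
S = np.array([
    [1.0, 1.0],  # Total
    [1.0, 0.0],  # A
    [0.0, 1.0],  # B
])

# Made-up base forecasts produced independently for Total, A, B.
y_hat = np.array([105.0, 40.0, 60.0])

# Incoherent: the Total forecast differs from the sum of A and B.
gap = y_hat[0] - y_hat[1:].sum()

# Bottom-up: keep the bottom-level forecasts and rebuild every level from them.
y_bottom_up = S @ y_hat[1:]

print(gap)          # 5.0
print(y_bottom_up)  # Total is now exactly A + B
```

After the bottom-up step the first entry equals the sum of the other two by construction, which is the property the reconciliation methods enforce.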
Warnings
- breaking As of v1.0.0, the `unique_id` column is no longer supported as a DataFrame index for input data. It must be a regular column.
- deprecated Numba-based implementations for some operations are being deprecated in favor of C++ for improved performance. While still functional, users are encouraged to rely on newer C++ optimized paths or future versions that may remove Numba.
- gotcha To run the quickstart and most practical examples, `hierarchicalforecast` typically requires `statsforecast` for generating base forecasts and `datasetsforecast` for easily loading sample hierarchical datasets. These are not direct core dependencies but are essential for a complete forecasting pipeline.
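The breaking change listed above typically shows up with frames from older pipelines that keep `unique_id` in the index. A minimal pandas sketch of the conversion follows; the frame `Y_indexed` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame in the old layout: 'unique_id' is the index.
Y_indexed = pd.DataFrame(
    {
        'ds': pd.to_datetime(['2020-01-01', '2020-01-02'] * 2),
        'y': [1.0, 2.0, 3.0, 4.0],
    },
    index=pd.Index(['id_0', 'id_0', 'id_1', 'id_1'], name='unique_id'),
)

# Move the index into a regular column; a named index keeps its name.
Y_df = Y_indexed.reset_index()

print(list(Y_df.columns))  # ['unique_id', 'ds', 'y']
```

If the index has no name, `reset_index(names='unique_id')` (pandas 1.5 or later) names the new column in the same step.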
Install
- pip install hierarchicalforecast
- conda install -c conda-forge hierarchicalforecast
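Since the companion packages used by the examples are installed separately, it can help to check what is importable before running anything. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


needed = ['hierarchicalforecast', 'statsforecast', 'datasetsforecast']
missing = missing_packages(needed)

if missing:
    print('Install with: pip install ' + ' '.join(missing))
else:
    print('All packages are available.')
```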
Imports
- HierarchicalReconciliation
from hierarchicalforecast.core import HierarchicalReconciliation
- BottomUp
from hierarchicalforecast.methods import BottomUp
- TopDown
from hierarchicalforecast.methods import TopDown
- MinTrace
from hierarchicalforecast.methods import MinTrace
- HierarchicalForecast
`HierarchicalForecast` is the name of the package, not an importable class; the reconciliation entry point is `HierarchicalReconciliation`:
from hierarchicalforecast.core import HierarchicalReconciliation
Quickstart
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import Naive
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, TopDown, MinTrace
# Dummy hierarchical data (replace with datasetsforecast.hierarchical.HierarchicalData.load for real data)
# Y_df: DataFrame with 'unique_id', 'ds', 'y' columns
# S_df: Summing matrix (DataFrame) for reconciliation
# tags: Dictionary defining the hierarchy levels
# Create dummy data for demonstration
n_series = 5
n_dates = 10
unique_ids = [f'id_{i}' for i in range(n_series)]
all_dates = pd.to_datetime(pd.date_range(start='2020-01-01', periods=n_dates, freq='D'))
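# Create Y_df (time series data): the bottom-level series plus their aggregate,
# because base forecasts are needed for every row of the summing matrix
bottom_df = pd.DataFrame({
    'unique_id': [uid for uid in unique_ids for _ in range(n_dates)],
    'ds': list(all_dates) * n_series,
    'y': [float(i * 10 + j) for i in range(n_series) for j in range(n_dates)],  # Simple increasing data
})
total_df = bottom_df.groupby('ds', as_index=False)['y'].sum().assign(unique_id='Total')
Y_df = pd.concat([total_df[['unique_id', 'ds', 'y']], bottom_df], ignore_index=True)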
# Create S_df (summing matrix): one row per series, one column per bottom-level series
# Example: A simple 2-level hierarchy: Total -> id_0, id_1, ..., id_N-1
S_df = pd.DataFrame({'unique_id': ['Total'] + unique_ids})
for uid in unique_ids:
    # The 'Total' row sums every bottom series; each bottom row selects only itself
    S_df[uid] = [1.0] + [1.0 if other == uid else 0.0 for other in unique_ids]
tags = {'Total': ['Total'], 'Items': unique_ids}
# 1. Generate base forecasts using StatsForecast
# (Requires `pip install statsforecast`)
sf = StatsForecast(models=[Naive()], freq='D')
# forecast() fits and predicts in one call; predict() alone requires a prior fit()
fcst_df = sf.forecast(df=Y_df, h=3)
# 2. Reconcile forecasts
reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),  # Or 'average_proportions', 'proportion_averages'
    MinTrace(method='ols'),  # Or 'wls_struct', 'wls_var', 'mint_shrink'
]
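# The summing matrix and tags are passed to reconcile(), not to the constructor.
# Y_df (actuals plus in-sample fitted values) is only needed by methods that use
# in-sample data, e.g. MinTrace(method='mint_shrink'), so it is omitted here.
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
reconciled_fcst_df = hrec.reconcile(Y_hat_df=fcst_df, S_df=S_df, tags=tags)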
print("Original Forecasts:")
print(fcst_df.head())
print("\nReconciled Forecasts (BottomUp, TopDown, MinTrace):")
print(reconciled_fcst_df.head())
# Verify coherence for 'Total': at each forecast date, the reconciled total
# should equal the sum of the reconciled bottom-level forecasts.
if 'Total' in reconciled_fcst_df['unique_id'].values:
    col = 'Naive/BottomUp'
    is_total = reconciled_fcst_df['unique_id'] == 'Total'
    is_item = reconciled_fcst_df['unique_id'].isin(unique_ids)
    total_by_ds = reconciled_fcst_df[is_total].set_index('ds')[col]
    items_by_ds = reconciled_fcst_df[is_item].groupby('ds')[col].sum()
    print(f"\nBottomUp Coherence Check:\n{pd.DataFrame({'Total': total_by_ds, 'Sum of Items': items_by_ds})}")
    assert ((total_by_ds - items_by_ds).abs() < 1e-6).all(), "BottomUp reconciliation failed coherence check!"
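For intuition about `MinTrace(method='ols')`, the OLS reconciliation described in the forecasting literature projects the base forecasts onto the set of coherent forecasts spanned by the summing matrix: `y_tilde = S (S'S)^-1 S' y_hat`. The NumPy sketch below applies that formula to made-up numbers. It is an illustration under that assumption, not the library's code, and the names `S`, `y_hat`, `bottom`, and `y_tilde` are hypothetical.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B. Rows: Total, A, B. Columns: A, B.
S = np.array([
    [1.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
])

# Made-up, incoherent base forecasts for Total, A, B (105 != 40 + 60).
y_hat = np.array([105.0, 40.0, 60.0])

# OLS projection: solve (S'S) b = S' y_hat for the bottom level, then aggregate.
bottom = np.linalg.solve(S.T @ S, S.T @ y_hat)
y_tilde = S @ bottom

print(y_tilde)
```

Unlike bottom-up, which overwrites only the total, this projection adjusts all three forecasts so that the result is coherent while staying as close as possible to the base forecasts in the least-squares sense.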