linearmodels
linearmodels is a Python library that extends `statsmodels` with additional econometric models: panel data models (fixed effects, random effects), instrumental variable (IV) estimators (2SLS, GMM), factor asset pricing models, and system regression models (SUR, 3SLS). The current version is 7.0, and the project is actively developed, with several releases per year.
Warnings
- breaking Starting with version 5.0, `from_formula` preserves the order in which variables appear in the formula instead of sorting them alphabetically. Code that relies on the old sorted order (for example, code that indexes coefficients by position) may read the wrong coefficient.
- breaking Versions 5.0 and 7.0 each significantly raised the minimum required versions of Python, NumPy, SciPy, pandas, statsmodels, and formulaic. Older environments may not be compatible.
- gotcha The name for clustered covariance was corrected from 'cluster' to 'clustered' in version 7.0. Using the old name will likely result in an error.
- gotcha The library transitioned to `formulaic` as the preferred formula parser. While `patsy` might have been implicitly used or supported in older versions, `formulaic` is now the standard for R-style formula parsing.
- gotcha Estimating models with rank-deficient regressors can lead to unreliable estimates. `linearmodels` runs a rank check by default; it can be skipped by passing `check_rank=False`, at the cost of possibly numerically unstable results.
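The rank-deficiency warning above can be illustrated with a minimal NumPy sketch. This is not the check that `linearmodels` runs internally; it only shows, on hypothetical simulated regressors, how an exactly collinear column lowers the rank of the regressor matrix below its column count.

```python
import numpy as np

# Hypothetical regressors, simulated for illustration only
rng = np.random.default_rng(0)
n = 50
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)

# The third column is an exact linear combination of the first two,
# so it adds no independent information
x3 = 2.0 * x1 - x2
exog = np.column_stack([x1, x2, x3])

# A matrix is rank deficient when its rank is below its number of columns
rank = np.linalg.matrix_rank(exog)
n_cols = exog.shape[1]
is_rank_deficient = rank < n_cols

print(f"columns={n_cols}, rank={rank}, rank deficient={is_rank_deficient}")
```

When a check like this flags a problem, the usual remedy is to drop or recombine the redundant columns before fitting, rather than to disable the check.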
Install
pip install linearmodels
Imports
- PanelOLS
from linearmodels import PanelOLS
- IV2SLS
from linearmodels.iv import IV2SLS
- SUR
from linearmodels.system import SUR
- PooledOLS
from linearmodels.panel import PooledOLS
- RandomEffects
from linearmodels.panel import RandomEffects
Quickstart
import numpy as np
from statsmodels.datasets import grunfeld
from linearmodels.panel import PanelOLS

# The Grunfeld investment data ships with statsmodels
data = grunfeld.load_pandas().data
data.year = data.year.astype(np.int64)

# Create a MultiIndex (entity - time) for panel data
data = data.set_index(['firm', 'year'])

# Define dependent and independent variables
dep = data.invest
exog = data[['value', 'capital']]

# Initialize and fit the PanelOLS model with entity effects,
# using standard errors clustered by entity
mod = PanelOLS(dep, exog, entity_effects=True)
res = mod.fit(cov_type='clustered', cluster_entity=True)
print(res)
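To show what the entity effects in the quickstart do conceptually, here is a hedged NumPy sketch of the within transformation: each variable is demeaned by entity, which removes any entity-specific intercept, and the slopes are then estimated by ordinary least squares. The data, the coefficient values, and the helper `demean_by_group` are hypothetical and exist only for illustration; this is not how `PanelOLS` is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods = 5, 8

# Entity label for each row of a balanced panel (entity-major order)
entity = np.repeat(np.arange(n_entities), n_periods)

# Hypothetical data: y = 1.5*x1 - 0.5*x2 + entity-specific intercept (no noise)
x = rng.standard_normal((n_entities * n_periods, 2))
alpha = 10.0 * rng.standard_normal(n_entities)
beta_true = np.array([1.5, -0.5])
y = x @ beta_true + alpha[entity]


def demean_by_group(values, groups):
    """Subtract each group's column means from its rows (within transformation)."""
    out = values.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean(axis=0)
    return out


# Demeaning removes alpha, because it is constant within each entity
y_within = demean_by_group(y[:, None], entity)[:, 0]
x_within = demean_by_group(x, entity)

# OLS on the demeaned data recovers the slope coefficients
beta_hat, *_ = np.linalg.lstsq(x_within, y_within, rcond=None)
print(np.round(beta_hat, 6))
```

Because the simulated data contain no noise, the estimated slopes match the hypothetical `beta_true` up to floating-point error. With real data the estimates would differ from any true values, and the standard errors would need a degrees-of-freedom adjustment for the absorbed effects.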