{"id":808,"library":"statsmodels","title":"statsmodels","description":"statsmodels is a Python package offering a wide array of statistical models, hypothesis tests, and statistical data exploration tools. It provides classes and functions for the estimation of many different statistical models, including linear regression, generalized linear models, discrete choice models, and time series analysis. The current version is 0.14.6. The library raises the minimum supported versions of its dependencies on a loose, time-based schedule, typically every one and a half to two years. [2, 3, 5, 7]","status":"active","version":"0.14.6","language":"python","source_language":"en","source_url":"https://github.com/statsmodels/statsmodels","tags":["statistics","econometrics","modeling","data science","time series","regression"],"install":[{"cmd":"pip install statsmodels","lang":"bash","label":"PyPI"}],"dependencies":[{"reason":"Runtime environment","package":"Python","minimum_version":"3.9"},{"reason":"Numerical operations","package":"NumPy","minimum_version":"1.22.3"},{"reason":"Scientific computing","package":"SciPy","minimum_version":"1.8"},{"reason":"Data structures and manipulation","package":"Pandas","minimum_version":"1.4"},{"reason":"R-style formula parsing","package":"Patsy","minimum_version":"0.5.6"},{"reason":"Plotting functions and examples","package":"Matplotlib","optional":true,"minimum_version":"3"}],"imports":[{"note":"Main entry point for most common models (e.g., OLS, GLM) when using NumPy arrays or pre-processed Pandas DataFrames. It's stable and recommended for direct model fitting. [18, 20]","symbol":"statsmodels.api","correct":"import statsmodels.api as sm"},{"note":"Provides an R-style formula interface, highly recommended for exploratory data analysis and when working directly with Pandas DataFrames and categorical variables. 
[3, 18, 20]","symbol":"statsmodels.formula.api","correct":"import statsmodels.formula.api as smf"},{"note":"Direct imports from submodules are used for specialized functionality (e.g., time series). Older submodule paths (like `statsmodels.tsa.arima_model`) may be deprecated or removed in newer versions; use the `statsmodels.tsa.arima.model` path instead. [20, 34]","wrong":"from statsmodels.tsa.arima_model import ARIMA","symbol":"Specific Submodule","correct":"from statsmodels.tsa.arima.model import ARIMA"}],"quickstart":{"code":"import statsmodels.formula.api as smf\nimport pandas as pd\nimport numpy as np\n\n# 1. Create a sample DataFrame\nnp.random.seed(42)\ndata = {\n    'y': 10 + 2 * np.random.rand(100) + 3 * np.random.randn(100),\n    'x1': np.random.rand(100) * 10,\n    'x2': np.random.randint(0, 2, 100) # categorical variable example\n}\ndf = pd.DataFrame(data)\n\n# 2. Fit OLS (Ordinary Least Squares) model using R-style formula\n#    'y ~ x1 + C(x2)' means y is dependent on x1 and categorical x2\nmodel = smf.ols('y ~ x1 + C(x2)', data=df)\nresults = model.fit()\n\n# 3. Print the summary of the regression results\nprint(results.summary())","lang":"python","description":"This example demonstrates how to fit a simple Ordinary Least Squares (OLS) regression model using the R-style formula interface provided by `statsmodels.formula.api`. It shows creating sample data, defining the model with a formula, fitting it, and then printing a comprehensive summary of the results, including coefficients, R-squared, and various statistical tests. [3, 6, 33]"},"warnings":[{"fix":"Always use `import statsmodels.api as sm` or direct imports from `statsmodels.<submodule>` (e.g., `statsmodels.regression.linear_model`). [29, 31]","message":"The `scikits` namespace was deprecated and eventually removed in versions prior to 0.5.0. 
Direct imports from `scikits.statsmodels` are no longer valid.","severity":"breaking","affected_versions":"<0.5.0"},{"fix":"Either predict from the fitted results object, e.g., `results.predict(exog)`, which uses the estimated parameters, or pass the parameters explicitly to the model, e.g., `model.predict(results.params, exog)`. [29, 31]","message":"The signature of `model.predict` methods changed in versions prior to 0.5.0. It now explicitly requires the `params` argument (e.g., `model.predict(params, exog)`), rather than assuming the model has already been fit and omitting `params`.","severity":"breaking","affected_versions":"<0.5.0"},{"fix":"Migrate to `statsmodels.tsa.arima.model.ARIMA`. The new API provides more consistent handling and features. [34]","message":"The `statsmodels.tsa.arima_model.ARMA` and `statsmodels.tsa.arima_model.ARIMA` classes have been deprecated and, in recent releases, removed. Older releases emit a `FutureWarning` on use; in 0.13 and later, instantiating them raises `NotImplementedError`.","severity":"deprecated","affected_versions":">=0.11.0"},{"fix":"Explicitly add a constant term using `X = sm.add_constant(X)` from `statsmodels.api` before fitting the model, or use the `statsmodels.formula.api` interface which handles intercepts automatically. [19, 33]","message":"When using the direct `statsmodels.api.OLS(y, X)` interface (without formulas), an intercept term (constant) is NOT automatically added to the `X` (exog) design matrix. This differs from some other statistical software and can lead to incorrect models if an intercept is expected.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For OLS functionality, `statsmodels.api.OLS` is the recommended replacement. For panel data, Pandas recommends using a `MultiIndex` DataFrame or `xarray`, which can then be used with `statsmodels` models where appropriate (e.g., `MixedLM` for some panel-like structures). [37]","message":"Pandas' `Panel` object and `pandas.stats.ols` (among others) were deprecated and removed in Pandas 0.20.1 and later. 
Users relying on these for panel data or OLS directly from Pandas will need to switch.","severity":"breaking","affected_versions":"Pandas >=0.20.1"},{"fix":"Ensure your entire scientific Python environment has compatible versions of all libraries when moving to NumPy 2.0. Check dependency release notes for NumPy 2.0 compatibility. [35]","message":"Statsmodels 0.14.2 introduced compatibility with NumPy 2.0.0. While `statsmodels` itself may run on older NumPy versions, if you upgrade to NumPy 2.0, all other Python scientific stack dependencies (like SciPy and Pandas) *must also be NumPy 2.0 compatible* to avoid runtime issues. This release also increased the minimum Python version to 3.9 to match NumPy 2.0.","severity":"breaking","affected_versions":">=0.14.2 (especially when using NumPy >= 2.0)"}],"env_vars":null,"last_verified":"2026-05-12T19:40:02.118Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"pip install statsmodels","cause":"The 'statsmodels' package is not installed in the current Python environment.","error":"ModuleNotFoundError: No module named 'statsmodels'"},{"fix":"Ensure that the `endog` and `exog` arrays/Series passed to the model constructor have the exact same number of rows/observations, often by aligning their indices or handling missing values consistently.","cause":"The dependent variable (endog) and independent variables (exog) arrays or series have a different number of observations.","error":"ValueError: endog and exog are of different lengths"},{"fix":"Identify and remove redundant independent variables from your model (e.g., duplicate columns, a constant column when an intercept is automatically added, or dummy variable trap).","cause":"Occurs when one independent variable can be perfectly predicted from a linear combination of other independent variables, leading to an ill-conditioned design matrix.","error":"ValueError: Perfect multicollinearity detected."},{"fix":"Increase the maximum number of iterations (e.g., 
`model.fit(maxiter=1000)`), check for perfect multicollinearity, or try different optimization methods if available for the specific model.","cause":"The iterative optimization algorithm used by the model (e.g., GLM, discrete choice models) failed to converge to a solution within the specified maximum number of iterations.","error":"ConvergenceWarning: Maximum number of iterations has been reached."}],"ecosystem":"pypi","meta_description":null,"install_score":92,"install_tag":"verified","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"0.14.6","cli_name":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":2.99,"mem_mb":70.7,"disk_size":"357.3M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.91,"mem_mb":70.6,"disk_size":"357.1M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":12.9,"import_time_s":2.37,"mem_mb":70.7,"disk_size":"345M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.09,"mem_mb":70.6,"disk_size":"344M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":4.89,"mem_mb":87.6,"disk_size":"384.5M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.25,"mem_mb":87.5,"disk_size":"384.1M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":12.4,"import_time_s":4.62,"mem_mb":87.6,"disk_size":"370M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.15,"mem_mb":87.5,"disk_size":"369M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":4.53,"mem_mb":85.4,"disk_size":"366.2M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.72,"mem_mb":85.3,"disk_size":"365.9M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":13.7,"import_time_s":4.82,"mem_mb":85.4,"disk_size":"352M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.09,"mem_mb":85.3,"disk_size":"351M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":null,"import_time_s":4.13,"mem_mb":84.8,"disk_size":"364.1M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.27,"mem_mb":84.6,"disk_size":"363.7M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":13.7,"import_time_s":4.13,"mem_mb":84.8,"disk_size":"349M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.41,"mem_mb":84.6,"disk_size":"349M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":0.1,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":15.3,"import_time_s":2.58,"mem_mb":68.4,"disk_size":"352M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.41,"mem_mb":68.4,"disk_size":"352M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":-1},{"runtime":"python:3.9-slim","exit_code":0}]}}