Scripts · May 24, 2026 · 1 min read

statsmodels — Statistical Modeling and Econometrics in Python

A Python library providing classes and functions for estimation of statistical models, performing tests, and exploring data with a focus on transparency and completeness of results.

Universal CLI install command
npx tokrepo install c95fc338-578c-11f1-9bc6-00163e2b0d79

Introduction

statsmodels complements scikit-learn by focusing on classical statistical inference rather than prediction. It provides detailed model summaries with coefficients, standard errors, p-values, and confidence intervals — the output statisticians and economists expect from tools like R or Stata.

What statsmodels Does

  • Fits linear and generalized linear models with comprehensive diagnostic output
  • Implements time-series analysis including ARIMA, VAR, state-space models, and seasonal decomposition
  • Provides nonparametric methods like kernel density estimation and lowess smoothing
  • Runs hypothesis tests (t-test, F-test, Granger causality, unit root tests)
  • Generates publication-ready regression tables and diagnostic plots

Architecture Overview

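statsmodels follows a model-fit-results pattern. You specify a model class (OLS, Logit, ARIMA), call .fit() to estimate parameters, and receive a results object with attributes such as params (coefficients), bse (standard errors), pvalues, aic, and bic, plus residuals (resid for linear models) and methods for summaries and statistical tests. Under the hood, linear models are solved with NumPy/SciPy linear algebra routines, while likelihood-based models are typically optimized through scipy.optimize.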

Installation & Configuration

  • Install via pip: pip install statsmodels
  • Depends on NumPy, SciPy, pandas, and patsy for formula-based model specification
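  • Use R-style formulas: sm.OLS.from_formula("y ~ x1 + x2", data=df), or equivalently smf.ols(...) from statsmodels.formula.api
  • Set optimizer options (such as method and maxiter) and the covariance estimator (cov_type) per model, usually as arguments to .fit()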
  • Works in Jupyter notebooks with rich HTML output for model summaries
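A short sketch of per-model configuration, assuming a hypothetical DataFrame with columns x1, x2, y, and z. It shows the covariance estimator being chosen at fit time for OLS and optimizer options being passed to a Logit fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data with heteroscedastic noise (assumption for illustration)
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
noise = rng.normal(size=n) * (1.0 + np.abs(df["x1"]))
df["y"] = 0.5 + 1.5 * df["x1"] - 0.8 * df["x2"] + noise

# Covariance estimator is selected when calling .fit()
ols_default = smf.ols("y ~ x1 + x2", data=df).fit()
ols_robust = smf.ols("y ~ x1 + x2", data=df).fit(cov_type="HC3")
print(ols_default.bse)   # classical standard errors
print(ols_robust.bse)    # heteroscedasticity-robust standard errors

# Binary outcome for a Logit model, with optimizer settings passed to .fit()
prob = 1.0 / (1.0 + np.exp(-(0.3 + df["x1"])))
df["z"] = (rng.uniform(size=n) < prob).astype(int)
logit_res = smf.logit("z ~ x1 + x2", data=df).fit(
    method="bfgs", maxiter=200, disp=False
)
print(logit_res.params)
```

The coefficient estimates are identical between the two OLS fits; only the standard errors and the inference built on them change.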

Key Features

  • Comprehensive model summaries matching R/Stata output with AIC, BIC, R-squared, and residual diagnostics
  • Time-series toolbox with ARIMA, SARIMAX, VAR, and exponential smoothing
  • Robust covariance estimators (HC0-HC3, HAC, clustered) for correct inference under heteroscedasticity
  • Mixed-effects models for hierarchical and panel data
  • Survival analysis with Kaplan-Meier and Cox proportional hazards

Comparison with Similar Tools

  • scikit-learn — focused on prediction accuracy; statsmodels provides inference statistics (p-values, confidence intervals)
  • R (stats package) — the gold standard for statistical computing; statsmodels brings similar functionality to the Python ecosystem
  • SciPy (scipy.stats) — provides individual tests; statsmodels offers full model estimation and diagnostics
  • linearmodels — extends statsmodels with panel data and IV models; statsmodels covers the broader foundation

FAQ

Q: When should I use statsmodels instead of scikit-learn? A: Use statsmodels when you need to understand relationships (coefficients, significance, confidence intervals) rather than just predict outcomes.

Q: Does statsmodels support regularized regression? A: Yes. OLS and GLM classes support elastic net regularization via fit_regularized(), though scikit-learn may be more convenient for pure prediction tasks.

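Q: Can I use statsmodels for time-series forecasting? A: Yes. ARIMA, SARIMAX, and other state-space models support forecasting with prediction intervals. Order selection is supported through helpers such as arma_order_select_ic and by comparing information criteria (AIC, BIC) across candidate fits; there is no built-in auto-ARIMA routine.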

Q: How does the formula API work? A: Use patsy-style formulas like "y ~ x1 + x2 + x1:x2" to specify models declaratively from a DataFrame, similar to R.
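The formula from the answer above can be tried on a small simulated DataFrame (the data and true coefficients are assumptions for illustration). The sketch also shows that `x1 * x2` is shorthand for `x1 + x2 + x1:x2`.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data with a true interaction effect of 0.8 (assumption)
rng = np.random.default_rng(11)
n = 250
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = (
    1.0
    + 0.5 * df["x1"]
    - 0.3 * df["x2"]
    + 0.8 * df["x1"] * df["x2"]
    + rng.normal(scale=0.5, size=n)
)

# "x1:x2" adds only the interaction term; the intercept is added automatically
explicit = smf.ols("y ~ x1 + x2 + x1:x2", data=df).fit()

# "x1 * x2" expands to main effects plus the interaction
shorthand = smf.ols("y ~ x1 * x2", data=df).fit()

print(explicit.params)    # indexed by term name
print(shorthand.params)
```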
