# StatsPAI: Agent-Native Causal Inference & Econometrics
StatsPAI is the agent-native Python package for causal inference and applied econometrics: one `import statspai as sp` gives access to 390+ functions covering the complete empirical research workflow.
- Source: https://github.com/brycewang-stanford/StatsPAI
- PyPI: `pip install statspai`
- Paper: published in the Journal of Open Source Software (JOSS)
## Why StatsPAI for Agents?

StatsPAI is the first econometrics toolkit purpose-built for LLM-driven research workflows:

- **Self-describing API**: `sp.list_functions()`, `sp.describe_function("did")`, and `sp.function_schema("rdrobust")` let agents discover and understand functions without documentation lookup
- **Unified result objects**: every function returns a `CausalResult` with `.summary()`, `.plot()`, `.to_latex()`, `.to_word()`, `.to_excel()`, and `.cite()`
- **One import**: no need to juggle 20+ packages; `import statspai as sp` covers everything
- **Publication-ready output**: Word, Excel, LaTeX, and HTML export built into every function
## Core Methods

### Classical Econometrics

```python
sp.regress(df, "y ~ x1 + x2", cluster="firm_id")                     # OLS
sp.ivreg(df, "y ~ x1 | z1 + z2", cluster="state")                    # IV/2SLS
sp.panel(df, "y ~ x1 + x2", entity="firm", time="year", model="fe")  # Panel FE
sp.heckman(df, "y ~ x1", "select ~ z1 + z2")                         # Heckman selection
sp.qreg(df, "y ~ x1 + x2", quantile=0.5)                             # Quantile regression
```
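To make concrete what `sp.regress(df, "y ~ x1 + x2")` estimates, here is a minimal numpy-only OLS sketch on simulated data (StatsPAI is not used; the coefficients 1.0, 2.0, and -0.5 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

# Design matrix with intercept, matching the formula "y ~ x1 + x2"
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta)  # approximately [1.0, 2.0, -0.5]
```

Per the API above, `sp.regress` wraps this fit in a `CausalResult` and adds cluster-robust standard errors via the `cluster=` argument.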
### Difference-in-Differences

```python
sp.did(df, "y", "treated", "post")                  # Auto-dispatch (2x2 or staggered)
sp.callaway_santanna(df, "y", "group", "time")      # Staggered DID (CS 2021)
sp.sun_abraham(df, "y", "cohort", "time")           # Interaction-weighted event study
sp.bacon_decomposition(df, "y", "treated", "time")  # TWFE diagnostic
sp.honest_did(result, method="smoothness")          # Sensitivity to PT violations
sp.continuous_did(df, "y", "dose", "time")          # Continuous treatment
```
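For the 2x2 case that `sp.did` auto-dispatches to, the estimator is the familiar difference in group-by-period means. A self-contained numpy sketch on simulated data (the true effect of 3.0 and all names are illustrative; StatsPAI itself is not used):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
treated = rng.integers(0, 2, n)
post = rng.integers(0, 2, n)
# Group effect 0.5, time effect 1.5, treatment effect 3.0
y = 1.0 + 0.5 * treated + 1.5 * post + 3.0 * treated * post + rng.normal(size=n)

def cell_mean(t, p):
    mask = (treated == t) & (post == p)
    return y[mask].mean()

# (treated: post - pre) minus (control: post - pre)
att = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(att)  # approximately 3.0
```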
### Regression Discontinuity

```python
sp.rdrobust(df, "y", "running_var", cutoff=0)            # Sharp RD (CCT 2014)
sp.rdrobust(df, "y", "running_var", fuzzy="treatment")   # Fuzzy RD
sp.rddensity(df, "running_var")                          # McCrary density test
sp.rdmc(df, "y", "running_var", cutoffs=[0, 5, 10])      # Multi-cutoff RD
sp.rkd(df, "y", "running_var", cutoff=0)                 # Regression kink design
```
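At its core, the sharp RD estimate behind `sp.rdrobust` is a difference of local linear fits at the cutoff. A simplified numpy sketch with a hand-picked bandwidth (the actual CCT procedure selects the bandwidth data-dependently and adds robust bias correction; the jump of 2.0 and bandwidth 0.3 here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000
x = rng.uniform(-1, 1, n)          # running variable, cutoff at 0
tau = 2.0                          # true jump at the cutoff (assumed)
y = 1.0 + 0.8 * x + tau * (x >= 0) + rng.normal(scale=0.5, size=n)

h = 0.3                            # fixed bandwidth (illustrative)

def intercept_at_cutoff(mask):
    # Local linear fit on one side; the intercept is the value at x = 0
    X = np.column_stack([np.ones(mask.sum()), x[mask]])
    return np.linalg.lstsq(X, y[mask], rcond=None)[0][0]

left = (x < 0) & (x > -h)
right = (x >= 0) & (x < h)
rd_est = intercept_at_cutoff(right) - intercept_at_cutoff(left)
print(rd_est)  # approximately 2.0
```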
### Matching & Reweighting

```python
sp.match(df, "treatment", covariates, method="psm")   # Propensity score matching
sp.match(df, "treatment", covariates, method="cem")   # Coarsened exact matching
sp.ebalance(df, "treatment", covariates)              # Entropy balancing
```
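Conceptually, matching estimators like `sp.match` pair each treated unit with similar controls and average the outcome differences. A deliberately simplified sketch, matching on a single confounder with 1-nearest-neighbor (real PSM matches on an estimated propensity score over many covariates; the true ATT of 1.5 is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
# Treatment probability depends on x, so naive comparison is confounded
t = (rng.uniform(size=n) < 1 / (1 + np.exp(-x))).astype(int)
y = 2.0 * x + 1.5 * t + rng.normal(scale=0.5, size=n)

xt, yt = x[t == 1], y[t == 1]
xc, yc = x[t == 0], y[t == 0]
# For each treated unit, find the nearest control on x (1-NN with replacement)
idx = np.abs(xt[:, None] - xc[None, :]).argmin(axis=1)
att = (yt - yc[idx]).mean()
print(att)  # approximately 1.5
```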
### Synthetic Control

```python
sp.synth(df, "y", "unit", "time", treated_unit=1, treated_period=2000)  # ADH SCM
sp.sdid(df, "y", "unit", "time", treated_units, treated_periods)        # Synthetic DID
```
### Machine Learning Causal Inference

```python
sp.dml(df, "y", "treatment", controls, model="PLR")            # Double/Debiased ML
sp.causal_forest(df, "y", "treatment", controls)               # Causal Forest (GRF)
sp.metalearner(df, "y", "treatment", controls, learner="dr")   # DR-Learner
sp.tmle(df, "y", "treatment", controls)                        # Targeted MLE
sp.aipw(df, "y", "treatment", controls)                        # Augmented IPW
```
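The AIPW estimator behind `sp.aipw` combines an outcome-model prediction with a propensity-weighted residual correction into a doubly robust score. A numpy sketch using the true propensity and per-arm OLS outcome models as stand-ins for the cross-fitted ML nuisances that `sp.aipw`/`sp.dml` would estimate (the true ATE of 2.0 is assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))                  # true propensity score
t = (rng.uniform(size=n) < e).astype(int)
y = x + 2.0 * t + rng.normal(size=n)      # true ATE = 2.0

def ols_predict(mask):
    # OLS of y on x within one arm, then predict for every unit
    X = np.column_stack([np.ones(mask.sum()), x[mask]])
    b = np.linalg.lstsq(X, y[mask], rcond=None)[0]
    return b[0] + b[1] * x

m1, m0 = ols_predict(t == 1), ols_predict(t == 0)
# AIPW / doubly robust score, averaged over the sample
ate = np.mean(m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e))
print(ate)  # approximately 2.0
```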
### Neural Causal Models

```python
sp.tarnet(df, "y", "treatment", controls)     # TARNet
sp.cfrnet(df, "y", "treatment", controls)     # CFRNet
sp.dragonnet(df, "y", "treatment", controls)  # DragonNet
```
### Robustness & Workflow

```python
sp.spec_curve(df, "y", "treatment", controls, specs)    # Specification curve
sp.robustness_report(result)                            # Automated robustness report
sp.subgroup_analysis(df, "y", "treatment", subgroups)   # Heterogeneity with Wald test
result.to_latex()                                       # Export to LaTeX
result.to_word("output.docx")                           # Export to Word
result.cite()                                           # Auto-generate citation
```
## Interactive Visualization (v0.6+)

```python
fig = result.plot()
sp.interactive(fig)  # Stata Graph Editor-style WYSIWYG editing, 29 academic themes
```
## Agent Integration Pattern

```python
import statspai as sp

# Step 1: Discover available functions
functions = sp.list_functions()

# Step 2: Understand a specific function
info = sp.describe_function("callaway_santanna")

# Step 3: Get a JSON schema for structured calls
schema = sp.function_schema("callaway_santanna")

# Step 4: Execute and get structured results
result = sp.callaway_santanna(df, "y", "group", "time")
print(result.summary())
result.to_latex("tables/did_results.tex")
```
## When to Use StatsPAI vs Other Packages
| Scenario | Use StatsPAI | Alternative |
|---|---|---|
| Agent-driven analysis pipeline | ✅ Best choice — self-describing API | pyfixest (no agent API) |
| Full causal inference workflow | ✅ 390+ functions, one import | Assemble 10+ R/Python packages |
| Publication-ready output needed | ✅ Word/Excel/LaTeX/HTML built-in | statsmodels (no export) |
| Staggered DID with diagnostics | ✅ CS + SA + Bacon + HonestDID | differences (partial) |
| Neural causal models | ✅ TARNet/CFRNet/DragonNet | econml (partial) |
| Stata users migrating to Python | ✅ Stata-equivalent function names | linearmodels (limited) |