statistical-reporting

Statistical Reporting Best Practices

Test Selection Quick Reference

Comparing two groups (independent, normal): Independent t-test
Comparing two groups (independent, non-normal): Mann-Whitney U test
Comparing two groups (paired, normal): Paired t-test
Comparing two groups (paired, non-normal): Wilcoxon signed-rank test
Comparing 3+ groups (independent, normal): One-way ANOVA + post-hoc
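Comparing 3+ groups (independent, non-normal): Kruskal-Wallis test
Relationship between continuous variables: Pearson correlation (linear, roughly normal data) or Spearman correlation (monotonic, non-normal, or ordinal data)
Categorical outcomes: Chi-square test, or Fisher's exact test when expected cell counts are small (commonly below 5)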
Predicting continuous outcome: Linear regression
Predicting binary outcome: Logistic regression
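The group-comparison rows of the quick reference can be read as a small decision procedure. The sketch below encodes only those rows; the function name `choose_group_test` and its parameters are hypothetical, and it deliberately refuses designs the table does not cover (for example, 3+ paired groups).

```python
def choose_group_test(groups: int, paired: bool, normal: bool) -> str:
    """Return the test named in the quick reference for a group comparison.

    groups: number of groups being compared
    paired: True if the same units are measured in each group
    normal: True if the normality assumption is judged reasonable
    """
    if groups < 2:
        raise ValueError("need at least two groups to compare")
    if groups == 2:
        if paired:
            return "Paired t-test" if normal else "Wilcoxon signed-rank test"
        return "Independent t-test" if normal else "Mann-Whitney U test"
    # Three or more groups: the table lists independent designs only.
    if paired:
        raise ValueError("3+ paired groups are not covered by the quick reference")
    return "One-way ANOVA + post-hoc" if normal else "Kruskal-Wallis test"


if __name__ == "__main__":
    print(choose_group_test(groups=2, paired=False, normal=False))
    print(choose_group_test(groups=4, paired=False, normal=True))
```

A lookup like this is only a starting point; the choice still depends on the assumption checks for the data at hand.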

Assumption Checking

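Normality: Shapiro-Wilk test (commonly recommended for small samples, n < 50) or visual inspection of Q-Q plots
Homogeneity of variance: Levene's test before independent t-tests and ANOVA; if variances differ, use the Welch versions of these tests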
Independence: Verify study design ensures independent observations
Linearity: Scatter plots and residual plots for regression
Multicollinearity: VIF < 5 for multiple regression predictors
When assumptions are violated, use non-parametric alternatives or robust methods
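To make the VIF threshold concrete, here is a minimal pure-Python sketch for the simplest case of exactly two predictors, where VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The helper names `pearson_r` and `vif_two_predictors` are hypothetical, and models with more predictors need a regression of each predictor on all the others, which this sketch does not do.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    tolerance = 1.0 - r ** 2
    if tolerance <= 0.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / tolerance


if __name__ == "__main__":
    x1 = [1, 2, 3, 4, 5]
    x2 = [2, 1, 4, 3, 5]
    vif = vif_two_predictors(x1, x2)
    print(f"VIF = {vif:.2f}, below 5: {vif < 5}")
```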

APA Reporting Format

t-test: t(df) = X.XX, p = .XXX, d = X.XX
ANOVA: F(df_between, df_within) = X.XX, p = .XXX, eta-squared = .XX
Correlation: r(df) = .XX, p = .XXX, 95% CI [.XX, .XX]
Chi-square: chi-square(df, N = XXX) = X.XX, p = .XXX
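Regression: b = X.XX, SE = X.XX, t(df) = X.XX, p = .XXX (b is the unstandardized coefficient; report beta separately if standardized coefficients are needed)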
Always report exact p-values (e.g., p = .032, not "p < .05"); for values below .001, report p < .001
Use a leading zero for values that can exceed 1 (e.g., t = 0.50) but not for those bounded by 1 (e.g., p = .032, r = .45)
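The p-value and leading-zero rules are easy to get wrong by hand, so a small formatter can help. The following is a sketch only; the function names (`drop_leading_zero`, `format_p`, `format_t_test`, `format_correlation`) are hypothetical, and the rounding precision (two decimals for statistics, three for p) follows the templates above.

```python
def drop_leading_zero(value: float, decimals: int) -> str:
    """Format a value bounded by 1 (such as p or r) without the leading zero."""
    text = f"{abs(value):.{decimals}f}"
    if text.startswith("0."):
        text = text[1:]
    # Keep the minus sign only if the rounded value is not zero.
    negative = value < 0 and float(f"{abs(value):.{decimals}f}") != 0
    return ("-" if negative else "") + text


def format_p(p: float) -> str:
    """Exact p-value to three decimals, or 'p < .001' for smaller values."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < 0.001:
        return "p < .001"
    return f"p = {drop_leading_zero(p, 3)}"


def format_t_test(t: float, df: int, p: float, d: float) -> str:
    # t and d can exceed 1, so they keep their leading zero.
    return f"t({df}) = {t:.2f}, {format_p(p)}, d = {d:.2f}"


def format_correlation(r: float, df: int, p: float) -> str:
    return f"r({df}) = {drop_leading_zero(r, 2)}, {format_p(p)}"


if __name__ == "__main__":
    print(format_t_test(t=2.31, df=28, p=0.0281, d=0.85))
    print(format_correlation(r=-0.45, df=48, p=0.0004))
```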

Effect Sizes

ALWAYS report effect sizes alongside p-values
Cohen's d for group comparisons: small = 0.2, medium = 0.5, large = 0.8
Eta-squared for ANOVA: small = .01, medium = .06, large = .14
R-squared for regression: report adjusted R-squared for multiple predictors
Odds ratios for logistic regression with 95% confidence intervals
Distinguish statistical significance from practical significance
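As an illustration of the Cohen's d benchmarks, the sketch below computes d for two independent groups using the pooled standard deviation and attaches the conventional label. The function names are hypothetical, and the "negligible" label for |d| below 0.2 is an assumption added for completeness, not one of the listed benchmarks.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b) -> float:
    """Cohen's d for two independent groups, using the pooled SD."""
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def label_cohens_d(d: float) -> str:
    """Map |d| onto the small/medium/large benchmarks (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"  # assumption: label for values below the "small" benchmark
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


if __name__ == "__main__":
    treatment = [2, 4, 6]
    control = [1, 3, 5]
    d = cohens_d(treatment, control)
    print(f"d = {d:.2f} ({label_cohens_d(d)})")
```

The benchmarks are conventions rather than rules, so the label should accompany, not replace, a judgment about practical significance in context.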

Common Mistakes to Avoid

Never say "the results were not significant, therefore there is no effect"
Do not confuse correlation with causation in observational data
Apply multiple comparison corrections (Bonferroni, FDR) when running many tests
Report confidence intervals, not just point estimates
State whether tests are one-tailed or two-tailed and justify the choice
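The two multiple comparison corrections named above can be sketched in a few lines of pure Python. This is a minimal illustration that returns adjusted p-values in the original order; the function names are hypothetical, and FDR is implemented here as the Benjamini-Hochberg procedure, which is one common choice rather than the only one.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg (FDR) adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(raw))
    print("Benjamini-Hochberg:", benjamini_hochberg(raw))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate and typically retains more power when many tests are run.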

Repository

aiming-lab/auto…archclaw

First Seen

Apr 1, 2026