Parameter Recovery Checker

Purpose

This skill encodes expert methodological knowledge for conducting parameter recovery studies -- a critical validation step before interpreting fitted model parameters. Parameter recovery determines whether a model's parameters are identifiable given the experimental design and sample size. A general-purpose programmer unfamiliar with computational modeling would not know that fitting a model is insufficient validation, or how to diagnose parameter tradeoffs and non-identifiability.

When to Use This Skill

  • Before trusting fitted parameter values from any computational cognitive model
  • When developing a new model and assessing whether parameters can be distinguished from data
  • When planning an experiment and determining the minimum trial count for reliable parameter estimation
  • When a reviewer asks for evidence of model identifiability
  • When comparing models and you need evidence that they can be distinguished from each other (model recovery)
  • When fitted parameters produce suspiciously extreme values or hit bounds

Research Planning Protocol

Before executing the domain-specific steps below, you MUST:

  1. State the research question -- What specific question is this analysis/paradigm addressing?
  2. Justify the method choice -- Why is this approach appropriate? What alternatives were considered?
  3. Declare expected outcomes -- What results would support vs. refute the hypothesis?
  4. Note assumptions and limitations -- What does this method assume? Where could it mislead?
  5. Present the plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.

⚠️ Verification Notice

This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.

Why Parameter Recovery Matters

Fitting a model to data and obtaining parameter estimates does NOT guarantee those estimates are meaningful (Wilson & Collins, 2019; Navarro, 2019). Common failure modes:

  1. Non-identifiability: Multiple parameter combinations produce identical model predictions (e.g., drift rate and boundary in DDM trade off; Ratcliff & Tuerlinckx, 2002)
  2. Insufficient data: Too few trials for the fitting procedure to recover true values
  3. Local minima: Optimization converges to wrong parameter values
  4. Model misspecification: The fitting procedure recovers parameters that do not reflect the assumed cognitive process

Parameter recovery is the standard diagnostic for these problems (Heathcote et al., 2015; Wilson & Collins, 2019).

Step-by-Step Recovery Procedure

Step 1: Define the Parameter Space

Choose ground-truth parameter values that span the plausible range for each parameter.

How many parameter sets to simulate?
 |
 +-- Minimum: 100 parameter sets (Wilson & Collins, 2019)
 |
 +-- Recommended: 500-1000 parameter sets for smooth recovery landscapes
 |
 +-- For publication: 1000+ parameter sets (Heathcote et al., 2015)

Sampling strategy:

| Strategy | When to Use | Source |
| --- | --- | --- |
| Uniform grid | Few parameters (1-2); want complete coverage | Standard practice |
| Latin hypercube | 3+ parameters; want space-filling coverage without excessive samples | McKay et al., 1979 |
| Random uniform | Simple; adequate for many parameters | Wilson & Collins, 2019 |
| Prior-based sampling | Informative priors on parameter ranges are available | Palestro et al., 2018 |

Range selection: Use ranges from published parameter estimates in the domain. For example:

  • DDM drift rate v: 0.5 -- 4.0 (Ratcliff & McKoon, 2008)
  • DDM boundary a: 0.5 -- 2.5 (Ratcliff & McKoon, 2008)
  • DDM non-decision time Ter: 0.1 -- 0.5 s (Ratcliff & McKoon, 2008)
  • ACT-R activation noise s: 0.1 -- 0.8 (Anderson, 2007)
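The sampling step can be sketched in a few lines. The example below draws ground-truth sets by Latin hypercube sampling over the DDM ranges quoted above; the function name and the set count of 500 are illustrative choices, and `scipy.stats.qmc` is assumed available (SciPy >= 1.7).

```python
# Sketch: draw ground-truth parameter sets via Latin hypercube sampling.
# Ranges follow the DDM values quoted above; names are illustrative.
import numpy as np
from scipy.stats import qmc

def sample_parameter_sets(n_sets=500, seed=0):
    """Return an (n_sets, 3) array of [drift v, boundary a, non-decision Ter]."""
    lower = [0.5, 0.5, 0.1]   # v, a, Ter lower bounds
    upper = [4.0, 2.5, 0.5]   # v, a, Ter upper bounds
    sampler = qmc.LatinHypercube(d=3, seed=seed)
    unit = sampler.random(n=n_sets)       # space-filling draws in [0, 1)^3
    return qmc.scale(unit, lower, upper)  # rescale to the plausible ranges

params = sample_parameter_sets(500)
```

Each row is one ground-truth parameter set to be simulated and refit in the following steps.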

Step 2: Simulate Data

For each ground-truth parameter set:

  1. Match the experimental design exactly -- Same number of trials, conditions, and structure as the real experiment
  2. Use the same model -- The generative model must be identical to the model you will fit
  3. Include realistic noise -- Use the model's noise mechanism (do not add external noise)
  4. Store the ground-truth parameters for later comparison

Critical: The number of simulated trials per participant must match the actual experiment. Recovery with 10,000 trials tells you nothing about recovery with 100 trials (Wilson & Collins, 2019).
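A minimal sketch of the simulation step, assuming a bare-bones Euler-Maruyama DDM with unit diffusion and an unbiased midpoint start; `simulate_ddm`, the 0.01 s step size, and the example parameter values are illustrative assumptions, not a validated simulator.

```python
# Sketch: simulate one synthetic dataset per ground-truth set, matching
# the real experiment's trial count. Noise comes from the model itself.
import numpy as np

def simulate_ddm(v, a, ter, n_trials, dt=0.01, rng=None):
    """Return (choices, rts): choice 1 = upper boundary, 0 = lower."""
    rng = np.random.default_rng(rng)
    choices, rts = np.empty(n_trials, dtype=int), np.empty(n_trials)
    for i in range(n_trials):
        x, t = a / 2.0, 0.0                 # start midway between boundaries
        while 0.0 < x < a:
            x += v * dt + rng.normal(0.0, np.sqrt(dt))  # model's own noise
            t += dt
        choices[i] = int(x >= a)
        rts[i] = ter + t                    # add non-decision time
    return choices, rts

choices, rts = simulate_ddm(v=2.0, a=1.5, ter=0.3, n_trials=100, rng=1)
```

Note that `n_trials=100` here stands in for your actual per-participant trial count, per the matching requirement above.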

Step 3: Fit the Model to Simulated Data

Apply the exact same fitting procedure you use for real data:

  • Same optimization algorithm (e.g., MLE, Bayesian, chi-square minimization)
  • Same parameter bounds and constraints
  • Same starting values or initialization strategy
  • Same convergence criteria

Multiple starting points: Run the optimizer from at least 5-10 random starting points per simulated dataset to avoid local minima (Heathcote et al., 2015).
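The multi-start logic can be sketched as follows; `neg_log_lik`, the bounds, and the L-BFGS-B choice are placeholders for whatever objective and optimizer you already use on real data.

```python
# Sketch: fit from several random starting points and keep the best result.
import numpy as np
from scipy.optimize import minimize

def fit_multistart(neg_log_lik, bounds, n_starts=10, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)            # random initialisation in bounds
        res = minimize(neg_log_lik, x0, bounds=bounds, method="L-BFGS-B")
        if best is None or res.fun < best.fun:
            best = res                      # keep the lowest objective value
    return best

# Toy objective with a known minimum at (1.0, 2.0):
best = fit_multistart(lambda p: (p[0] - 1) ** 2 + (p[1] - 2) ** 2,
                      bounds=[(0, 3), (0, 3)])
```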

Step 4: Evaluate Recovery Quality

Compare recovered parameters to true (ground-truth) parameters using multiple metrics.

Primary Metrics

| Metric | Formula | Good | Acceptable | Concerning | Source |
| --- | --- | --- | --- | --- | --- |
| Pearson correlation (r) | cor(true, recovered) | r > 0.9 | r > 0.8 | r < 0.7 | Heathcote et al., 2015 (rough benchmarks) |
| Bias | mean(recovered - true) | Near 0 | < 10% of range | > 20% of range | Wilson & Collins, 2019 |
| RMSE | sqrt(mean((recovered - true)^2)) | Small relative to range | -- | Large relative to range | Standard |
| Coverage | % of 95% CIs containing true value | ~95% | 85-100% | < 80% | Bayesian recovery |
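The three primary frequentist metrics are one-liners; a minimal sketch for a single parameter, given matched arrays of ground-truth and recovered values:

```python
# Sketch: compute correlation, bias, and RMSE for one parameter.
import numpy as np

def recovery_metrics(true, recovered):
    true, recovered = np.asarray(true, float), np.asarray(recovered, float)
    r = np.corrcoef(true, recovered)[0, 1]   # Pearson correlation
    bias = np.mean(recovered - true)         # systematic over/under-estimation
    rmse = np.sqrt(np.mean((recovered - true) ** 2))
    return r, bias, rmse

# A constant +0.1 offset: perfect correlation, but biased.
r, bias, rmse = recovery_metrics([1.0, 2.0, 3.0, 4.0],
                                 [1.1, 2.1, 3.1, 4.1])
```

The toy call illustrates why correlation alone is insufficient: r = 1.0 despite a consistent overestimate.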

Visualization (essential)

  1. Scatter plot: Recovered vs. true for each parameter (identity line = perfect recovery)
  2. Bland-Altman plot: Difference vs. mean (detect range-dependent bias)
  3. Parameter correlation matrix: Off-diagonal correlations reveal tradeoffs

See references/recovery-diagnostics.md for visualization templates.

Step 5: Check Parameter Tradeoffs

Correlation between recovered parameters:

Are any pairs of recovered parameters correlated |r| > 0.5?
 |
 +-- YES --> These parameters trade off. Consider:
 | - Fixing one to a theoretically motivated value
 | - Reparameterizing the model
 | - Collecting more data to improve identifiability
 | - Reporting the tradeoff and interpreting cautiously
 |
 +-- NO --> Parameters are identifiable given this design

Common parameter tradeoffs in cognitive models:

| Model | Correlated Parameters | Nature of Tradeoff | Source |
| --- | --- | --- | --- |
| DDM | Drift rate (v) and boundary (a) | Speed-accuracy tradeoff | Ratcliff & Tuerlinckx, 2002 |
| DDM | Non-decision time (Ter) and boundary (a) | Boundary absorbs timing variance | Ratcliff & Tuerlinckx, 2002 |
| ACT-R | Noise (s) and threshold (tau) | Both affect retrieval probability | Anderson, 2007 |
| RL models | Learning rate (alpha) and inverse temperature (beta) | Both control exploitation | Daw, 2011 |
| Signal detection | d-prime and criterion (c) | Criterion shift mimics sensitivity change | Macmillan & Creelman, 2005 |
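The tradeoff check above reduces to scanning the off-diagonal of the recovered-parameter correlation matrix; a minimal sketch, with `flag_tradeoffs` and the synthetic data as illustrative stand-ins:

```python
# Sketch: flag cross-parameter tradeoffs in the recovered estimates.
# `recovered` is an (n_sets, n_params) array; |r| > 0.5 uses the
# decision threshold from the flowchart above.
import numpy as np

def flag_tradeoffs(recovered, names, threshold=0.5):
    corr = np.corrcoef(recovered, rowvar=False)   # parameters in columns
    flags = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                flags.append((names[i], names[j], corr[i, j]))
    return flags

# Synthetic example: "v" and "a" trade off strongly; "Ter" is independent.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
recovered = np.column_stack([base,
                             -0.9 * base + 0.1 * rng.normal(size=200),
                             rng.normal(size=200)])
flags = flag_tradeoffs(recovered, ["v", "a", "Ter"])
```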

Model Recovery (Confusion Matrix)

Model recovery extends parameter recovery to test whether the correct model can be identified from data (Wagenmakers et al., 2004).

Procedure

  1. For each candidate model M_k (k = 1, ..., K):
     a. Simulate data from M_k with representative parameters
     b. Fit ALL candidate models to the simulated data
     c. Select the best-fitting model using your comparison metric (AIC, BIC, Bayes factor)
  2. Construct a K x K confusion matrix: rows = generating model, columns = selected model
  3. Diagonal entries should dominate (correct model selected)
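The loop structure can be sketched as below. The `(simulate, score)` pairs are hypothetical stand-ins for your own simulation and fitting code; here the "models" are just biased coins scored by negative log-likelihood, so the example runs end to end.

```python
# Sketch: build a K x K model-recovery confusion matrix.
import numpy as np

def model_confusion(models, n_reps=50, seed=0):
    """Rows = generating model, columns = selected (best-scoring) model."""
    rng = np.random.default_rng(seed)
    K = len(models)
    confusion = np.zeros((K, K))
    for k, (simulate, _) in enumerate(models):
        for _ in range(n_reps):
            data = simulate(rng)
            scores = [score(data) for _, score in models]  # e.g. BIC per model
            confusion[k, np.argmin(scores)] += 1           # lower score wins
    return confusion / n_reps                              # row proportions

def bernoulli_model(p):
    simulate = lambda rng, p=p: rng.random(100) < p
    score = lambda data, p=p: -np.sum(np.where(data, np.log(p), np.log(1 - p)))
    return simulate, score

confusion = model_confusion([bernoulli_model(0.2), bernoulli_model(0.8)])
```

With two well-separated toy models, the diagonal dominates completely; real cognitive models will show more confusion.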

Quality Criteria

| Metric | Good | Concerning | Source |
| --- | --- | --- | --- |
| Diagonal proportion | > 90% correct | < 70% correct | Wagenmakers et al., 2004 |
| Off-diagonal patterns | Symmetric confusion | Asymmetric (one model always "wins") | Wilson & Collins, 2019 |

Warning: If model A is selected when data are generated from model B more than 20% of the time, those models are not distinguishable with your experimental design (Wilson & Collins, 2019).

Sample Size Effects

How Trial Count Affects Recovery

Recovery quality improves with more trials per participant. Test recovery at multiple trial counts:

| Trial Count | Expected Recovery | Recommendation |
| --- | --- | --- |
| < 50 trials | Often poor (r < 0.7) | Increase trials or simplify model |
| 50-100 trials | Marginal for simple models | May suffice for 2-3 parameter models |
| 100-200 trials | Adequate for most models | Standard for DDM (Ratcliff & McKoon, 2008) |
| 200-500 trials | Good for complex models | Recommended for models with > 4 parameters |
| 500+ trials | Excellent for most models | Required for hierarchical models |

Source: Wilson & Collins (2019); Ratcliff & Tuerlinckx (2002) for DDM-specific guidance.

Recovery as a Function of N

Plot recovery metrics (r, RMSE) as a function of trial count to determine the minimum viable N for your specific model and paradigm.
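A minimal sketch of such a curve. The "model" is a toy one-parameter estimator (the sample mean of noisy trials) so the loop runs standalone; substitute your actual simulate-and-fit pipeline for the inner step.

```python
# Sketch: trace recovery correlation as a function of trial count.
import numpy as np

def recovery_curve(trial_counts, n_sets=200, seed=0):
    rng = np.random.default_rng(seed)
    true = rng.uniform(0.0, 2.0, n_sets)        # ground-truth parameter values
    curve = {}
    for n in trial_counts:
        # Toy fit: estimate each parameter as the mean of n noisy trials.
        recovered = np.array([rng.normal(mu, 1.0, n).mean() for mu in true])
        curve[n] = np.corrcoef(true, recovered)[0, 1]
    return curve

curve = recovery_curve([10, 50, 200])
```

Plotting `curve` against trial count shows where recovery plateaus, which is the minimum viable N for this toy setup.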

Landscape Analysis

Parameter Sensitivity Surfaces

For 1-2 key parameters, compute and visualize the objective function surface:

  1. Fix all parameters except the target parameter(s)
  2. Evaluate the objective function (e.g., negative log-likelihood) at a grid of values
  3. Plot the surface (1D: line; 2D: contour or heatmap)

What to look for:

| Surface Feature | Interpretation | Action |
| --- | --- | --- |
| Single sharp minimum | Well-identified parameter | Proceed with confidence |
| Broad flat minimum | Parameter poorly constrained | Widen prior or collect more data |
| Multiple minima | Non-convex; risk of local minima | Use multiple starting points; consider reparameterization |
| Ridge (elongated valley) | Parameter tradeoff | Two parameters are correlated; consider fixing one |
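Grid evaluation of the surface is straightforward; in this sketch the toy objective constrains only the sum of the two parameters, producing exactly the ridge signature described in the table above. Both the helper and the objective are illustrative.

```python
# Sketch: evaluate a 2-D objective (e.g., negative log-likelihood) on a grid.
import numpy as np

def nll_surface(nll, grid1, grid2):
    """Return a (len(grid1), len(grid2)) matrix of objective values."""
    return np.array([[nll(g1, g2) for g2 in grid2] for g1 in grid1])

# Toy objective: the data constrain only the SUM of the two parameters,
# so the surface has a flat valley along the line m1 + m2 = 2 (a ridge).
nll = lambda m1, m2: (m1 + m2 - 2.0) ** 2
grid = np.linspace(0.0, 2.0, 21)
surface = nll_surface(nll, grid, grid)
```

A contour plot of `surface` would show the elongated valley; any point on the line m1 + m2 = 2 fits equally well, i.e., the two parameters are not separately identifiable.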

Reporting Standards

Minimum Reporting Checklist

When publishing a parameter recovery study:

  • Number of simulated parameter sets (minimum 100; Wilson & Collins, 2019)
  • Sampling strategy for ground-truth parameters (uniform, LHS, prior-based)
  • Range of ground-truth values for each parameter (with justification)
  • Number of simulated trials per dataset (must match real experiment)
  • Fitting procedure used (same as for real data)
  • Number of starting points for optimization
  • Recovery metrics for each parameter: correlation (r), bias, RMSE
  • Scatter plots: recovered vs. true for each parameter
  • Parameter correlation matrix (recovered parameters)
  • Model recovery confusion matrix (if performing model comparison)
  • Recovery as a function of trial count (if applicable)

Where to Report

  • Main text: Summary of recovery quality (r values, key plots)
  • Supplementary: Full correlation matrices, all scatter plots, landscape analyses
  • Parameter recovery is increasingly expected in top journals (Wilson & Collins, 2019; Navarro, 2019)

Common Pitfalls

  1. Testing recovery with too many trials: Simulating 10,000 trials when the experiment has 100. Recovery will look excellent but is irrelevant to your actual data (Wilson & Collins, 2019).
  2. Using different fitting procedures: The recovery study must use the identical optimization pipeline as the real-data analysis. Different starting values, bounds, or algorithms invalidate the test.
  3. Ignoring parameter correlations: High marginal recovery (good r for each parameter) can coexist with strong parameter tradeoffs that distort interpretation. Always check the cross-parameter correlation matrix.
  4. Reporting only correlation: Correlation measures rank-order recovery but ignores systematic bias. A parameter can have r = 0.95 but be consistently overestimated by 30%. Report bias and RMSE alongside r.
  5. Sampling only near defaults: If ground-truth values cluster around typical defaults, recovery may look good only in that region. Sample across the full plausible range.
  6. Neglecting model recovery: Good parameter recovery does not guarantee good model recovery. Two models can have recoverable parameters individually but be indistinguishable when competing (Wagenmakers et al., 2004).
  7. Confusing identifiability with validity: A model can have perfectly recoverable parameters and still be a poor model of cognition. Recovery is necessary but not sufficient (Navarro, 2019).

References

  • Anderson, J. R. (2007). How Can the Human Mind Occur in the Physical Universe? Oxford University Press.
  • Daw, N. D. (2011). Trial-by-trial data analysis using computational models. In M. R. Delgado, E. A. Phelps, & T. W. Robbins (Eds.), Decision Making, Affect, and Learning. Oxford University Press.
  • Heathcote, A., Brown, S. D., & Wagenmakers, E.-J. (2015). An introduction to good practices in cognitive modeling. In B. U. Forstmann & E.-J. Wagenmakers (Eds.), An Introduction to Model-Based Cognitive Neuroscience. Springer.
  • Macmillan, N. A., & Creelman, C. D. (2005). Detection Theory: A User's Guide (2nd ed.). Lawrence Erlbaum Associates.
  • McKay, M. D., Beckman, R. J., & Conover, W. J. (1979). A comparison of three methods for selecting values of input variables. Technometrics, 21(2), 239-245.
  • Navarro, D. J. (2019). Between the devil and the deep blue sea: Tensions between scientific judgement and statistical model selection. Computational Brain & Behavior, 2(1), 28-34.
  • Palestro, J. J., Sederberg, P. B., Osth, A. F., Van Zandt, T., & Turner, B. M. (2018). Likelihood-free methods for cognitive science. Springer.
  • Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873-922.
  • Ratcliff, R., & Tuerlinckx, F. (2002). Estimating parameters of the diffusion model. Psychonomic Bulletin & Review, 9(3), 438-481.
  • Wagenmakers, E.-J., Ratcliff, R., Gomez, P., & Iverson, G. J. (2004). Assessing model mimicry using the parametric bootstrap. Journal of Mathematical Psychology, 48(1), 28-50.
  • Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, e49547.

See references/ for diagnostic visualization templates and worked examples.
