Parameter Recovery Checker
Purpose
This skill encodes expert methodological knowledge for conducting parameter recovery studies -- a critical validation step before interpreting fitted model parameters. Parameter recovery determines whether a model's parameters are identifiable given the experimental design and sample size. A general-purpose programmer unfamiliar with computational modeling would not know that fitting a model is insufficient validation, or how to diagnose parameter tradeoffs and non-identifiability.
When to Use This Skill
- Before trusting fitted parameter values from any computational cognitive model
- When developing a new model and assessing whether parameters can be distinguished from data
- When planning an experiment and determining the minimum trial count for reliable parameter estimation
- When a reviewer asks for evidence of model identifiability
- When comparing models and needing to ensure each model can be distinguished (model recovery)
- When fitted parameters produce suspiciously extreme values or hit bounds
Research Planning Protocol
Before executing the domain-specific steps below, you MUST:
- State the research question -- What specific question is this analysis/paradigm addressing?
- Justify the method choice -- Why is this approach appropriate? What alternatives were considered?
- Declare expected outcomes -- What results would support vs. refute the hypothesis?
- Note assumptions and limitations -- What does this method assume? Where could it mislead?
- Present the plan to the user and WAIT for confirmation before proceeding.
For detailed methodology guidance, see the research-literacy skill.
⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Why Parameter Recovery Matters
Fitting a model to data and obtaining parameter estimates does NOT guarantee those estimates are meaningful (Wilson & Collins, 2019; Navarro, 2019). Common failure modes:
- Non-identifiability: Multiple parameter combinations produce identical model predictions (e.g., drift rate and boundary in DDM trade off; Ratcliff & Tuerlinckx, 2002)
- Insufficient data: Too few trials for the fitting procedure to recover true values
- Local minima: Optimization converges to wrong parameter values
- Model misspecification: The fitting procedure recovers parameters that do not reflect the assumed cognitive process
Parameter recovery is the standard diagnostic for these problems (Heathcote et al., 2015; Wilson & Collins, 2019).
Step-by-Step Recovery Procedure
Step 1: Define the Parameter Space
Choose ground-truth parameter values that span the plausible range for each parameter.
How many parameter sets to simulate?
|
+-- Minimum: 100 parameter sets (Wilson & Collins, 2019)
|
+-- Recommended: 500-1000 parameter sets for smooth recovery landscapes
|
+-- For publication: 1000+ parameter sets (Heathcote et al., 2015)
Sampling strategy:
| Strategy | When to Use | Source |
|---|---|---|
| Uniform grid | Few parameters (1-2), want complete coverage | Standard practice |
| Latin hypercube | 3+ parameters, want space-filling without excessive samples | McKay et al., 1979 |
| Random uniform | Simple, adequate for many parameters | Wilson & Collins, 2019 |
| Prior-based sampling | Have informative priors on parameter ranges | Palestro et al., 2018 |
Range selection: Use ranges from published parameter estimates in the domain. For example:
- DDM drift rate v: 0.5 -- 4.0 (Ratcliff & McKoon, 2008)
- DDM boundary a: 0.5 -- 2.5 (Ratcliff & McKoon, 2008)
- DDM non-decision time Ter: 0.1 -- 0.5 s (Ratcliff & McKoon, 2008)
- ACT-R activation noise s: 0.1 -- 0.8 (Anderson, 2007)
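The sampling strategies above can be sketched in a few lines. A minimal Latin hypercube sampler using only NumPy; the ranges are the illustrative DDM values listed above, not validated defaults:

```python
import numpy as np

# Illustrative ranges from the list above -- verify against the cited sources
RANGES = {"v": (0.5, 4.0), "a": (0.5, 2.5), "ter": (0.1, 0.5)}

def latin_hypercube(n_sets, ranges, rng):
    """Draw one sample per equal-probability stratum for each parameter,
    shuffling strata independently across parameters (space-filling)."""
    samples = {}
    for name, (lo, hi) in ranges.items():
        # permutation picks each stratum exactly once; random() jitters within it
        u = (rng.permutation(n_sets) + rng.random(n_sets)) / n_sets  # in [0, 1)
        samples[name] = lo + u * (hi - lo)
    return samples

truth = latin_hypercube(500, RANGES, np.random.default_rng(0))
```

Each parameter's 500 values cover all 500 equal-width strata of its range, which is what makes the design space-filling without a full grid.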
Step 2: Simulate Data
For each ground-truth parameter set:
- Match the experimental design exactly -- Same number of trials, conditions, and structure as the real experiment
- Use the same model -- The generative model must be identical to the model you will fit
- Include realistic noise -- Use the model's noise mechanism (do not add external noise)
- Store the ground-truth parameters for later comparison
Critical: The number of simulated trials per participant must match the actual experiment. Recovery with 10,000 trials tells you nothing about recovery with 100 trials (Wilson & Collins, 2019).
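As a sketch, data from a basic diffusion model can be generated with a simple Euler scheme. Note that the within-trial noise is the model's own diffusion term, as the list above requires, and `n_trials` is set to match the real experiment:

```python
import numpy as np

def simulate_ddm(v, a, ter, n_trials, rng, dt=0.001, s=1.0):
    """Euler simulation of an unbiased diffusion process (start point a/2).
    Returns response times and choices (1 = upper boundary)."""
    rts = np.empty(n_trials)
    choices = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        x, t = a / 2.0, 0.0
        while 0.0 < x < a:
            # drift plus the model's own diffusion noise -- no external noise added
            x += v * dt + s * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts[i] = ter + t
        choices[i] = int(x >= a)
    return rts, choices

rng = np.random.default_rng(1)
rts, choices = simulate_ddm(v=2.0, a=1.5, ter=0.3, n_trials=100, rng=rng)
```

Store the generating (v, a, ter) triple alongside `rts` and `choices` for the comparison in Step 4.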
Step 3: Fit the Model to Simulated Data
Apply the exact same fitting procedure you use for real data:
- Same optimization algorithm (e.g., MLE, Bayesian, chi-square minimization)
- Same parameter bounds and constraints
- Same starting values or initialization strategy
- Same convergence criteria
Multiple starting points: Run the optimizer from at least 5-10 random starting points per simulated dataset to avoid local minima (Heathcote et al., 2015).
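A minimal multi-start wrapper, sketched with SciPy's `minimize`; the bimodal test objective below is a stand-in for a real model's negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize

def fit_multistart(neg_loglik, bounds, n_starts=10, seed=0):
    """Run a bounded optimizer from several random starting points, keep the best fit."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)  # fresh random start inside the bounds
        res = minimize(neg_loglik, x0, method="L-BFGS-B", bounds=bounds)
        if best is None or res.fun < best.fun:
            best = res
    return best

# Stand-in objective with a local minimum near +1 and the global minimum near -1
bimodal = lambda x: (x[0] ** 2 - 1) ** 2 + 0.3 * x[0]
best = fit_multistart(bimodal, bounds=[(-2.0, 2.0)])
```

A single start launched in the right-hand basin would report the local minimum near +1; the multi-start loop recovers the global one, which is the failure mode the recommendation above guards against.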
Step 4: Evaluate Recovery Quality
Compare recovered parameters to true (ground-truth) parameters using multiple metrics.
Primary Metrics
| Metric | Formula | Good | Acceptable | Concerning | Source |
|---|---|---|---|---|---|
| Pearson correlation (r) | cor(true, recovered) | r > 0.9 | r > 0.8 | r < 0.7 | Heathcote et al., 2015; rough benchmarks |
| Bias | mean(recovered - true) | Near 0 | < 10% of range | > 20% of range | Wilson & Collins, 2019 |
| RMSE | sqrt(mean((recovered - true)^2)) | Small relative to range | -- | Large relative to range | Standard |
| Coverage | % of 95% CIs containing true value | ~95% | 85-100% | < 80% | Bayesian recovery |
Visualization (essential)
- Scatter plot: Recovered vs. true for each parameter (identity line = perfect recovery)
- Bland-Altman plot: Difference vs. mean (detect range-dependent bias)
- Parameter correlation matrix: Off-diagonal correlations reveal tradeoffs
See references/recovery-diagnostics.md for visualization templates.
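The primary metrics in the table above reduce to a few lines (a minimal sketch):

```python
import numpy as np

def recovery_metrics(true_vals, recovered):
    """Correlation, bias, and RMSE between ground-truth and recovered values."""
    t = np.asarray(true_vals, dtype=float)
    r = np.asarray(recovered, dtype=float)
    return {
        "r": np.corrcoef(t, r)[0, 1],            # linear agreement
        "bias": np.mean(r - t),                  # systematic over/underestimation
        "rmse": np.sqrt(np.mean((r - t) ** 2)),  # overall error magnitude
    }

true_v = np.linspace(0.5, 4.0, 100)
recovered_v = true_v + 0.2  # perfectly correlated but biased upward
metrics = recovery_metrics(true_v, recovered_v)
```

Here `r` is 1.0 despite a constant +0.2 bias, which is exactly why bias and RMSE must be reported alongside correlation.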
Step 5: Check Parameter Tradeoffs
Correlation between recovered parameters:
Are any pairs of recovered parameters correlated at |r| > 0.5?
|
+-- YES --> These parameters trade off. Consider:
| - Fixing one to a theoretically motivated value
| - Reparameterizing the model
| - Collecting more data to improve identifiability
| - Reporting the tradeoff and interpreting cautiously
|
+-- NO --> Parameters are identifiable given this design
Common parameter tradeoffs in cognitive models:
| Model | Correlated Parameters | Nature of Tradeoff | Source |
|---|---|---|---|
| DDM | Drift rate (v) and boundary (a) | Speed-accuracy tradeoff | Ratcliff & Tuerlinckx, 2002 |
| DDM | Non-decision time (Ter) and boundary (a) | Boundary absorbs timing variance | Ratcliff & Tuerlinckx, 2002 |
| ACT-R | Noise (s) and threshold (tau) | Both affect retrieval probability | Anderson, 2007 |
| RL models | Learning rate (alpha) and inverse temperature (beta) | Both scale how strongly learned values drive choice | Daw, 2011 |
| Signal detection | d-prime and criterion (c) | Criterion shift mimics sensitivity change | Macmillan & Creelman, 2005 |
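The decision-tree check above can be automated. A sketch that flags recovered-parameter pairs exceeding the |r| > 0.5 threshold; the parameter names and the built-in v-a correlation are illustrative:

```python
import numpy as np

def flag_tradeoffs(recovered, names, threshold=0.5):
    """recovered: (n_datasets, n_params) array of fitted values.
    Returns the correlation matrix and the parameter pairs that trade off."""
    corr = np.corrcoef(np.asarray(recovered), rowvar=False)
    flagged = [
        (names[i], names[j], corr[i, j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if abs(corr[i, j]) > threshold
    ]
    return corr, flagged

rng = np.random.default_rng(2)
v = rng.uniform(0.5, 4.0, 300)
a = 0.4 * v + 0.1 * rng.standard_normal(300)  # artificial v-a tradeoff
ter = rng.uniform(0.1, 0.5, 300)              # independent of both
corr, flagged = flag_tradeoffs(np.column_stack([v, a, ter]), ["v", "a", "ter"])
```

Only the (v, a) pair is flagged here; in a real study the input would be the matrix of recovered parameters from Step 3.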
Model Recovery (Confusion Matrix)
Model recovery extends parameter recovery to test whether the correct model can be identified from data (Wagenmakers et al., 2004).
Procedure
- For each candidate model M_k (k = 1, ..., K):
  a. Simulate data from M_k with representative parameters
  b. Fit ALL candidate models to the simulated data
  c. Select the best-fitting model using your comparison metric (AIC, BIC, Bayes factor)
- Construct a K x K confusion matrix: rows = generating model, columns = selected model
- Diagonal entries should dominate (correct model selected)
Quality Criteria
| Metric | Good | Concerning | Source |
|---|---|---|---|
| Diagonal proportion | > 90% correct | < 70% correct | Wagenmakers et al., 2004 |
| Off-diagonal patterns | Symmetric confusion | Asymmetric (one model always "wins") | Wilson & Collins, 2019 |
Warning: If model A is selected when data are generated from model B more than 20% of the time, those models are not distinguishable with your experimental design (Wilson & Collins, 2019).
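A sketch of the confusion-matrix procedure. The two toy "models" below (Gaussians with different means, scored by sum of squared errors) stand in for real candidate models and a real comparison metric such as BIC:

```python
import numpy as np

def model_confusion(simulators, scorers, n_datasets=50, seed=0):
    """rows = generating model, cols = selected model (lowest score wins)."""
    rng = np.random.default_rng(seed)
    K = len(simulators)
    cm = np.zeros((K, K))
    for gen in range(K):
        for _ in range(n_datasets):
            data = simulators[gen](rng)
            scores = [score(data) for score in scorers]
            cm[gen, int(np.argmin(scores))] += 1
    return cm / n_datasets  # row-normalized selection proportions

simulators = [lambda rng: rng.normal(0.0, 1.0, 30),
              lambda rng: rng.normal(3.0, 1.0, 30)]
scorers = [lambda d: np.sum((d - 0.0) ** 2),
           lambda d: np.sum((d - 3.0) ** 2)]
cm = model_confusion(simulators, scorers)
```

Diagonal entries near 1.0 indicate the models are distinguishable under this design; compare each off-diagonal cell against the 20% threshold in the warning above.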
Sample Size Effects
How Trial Count Affects Recovery
Recovery quality improves with more trials per participant. Test recovery at multiple trial counts:
| Trial Count | Expected Recovery | Recommendation |
|---|---|---|
| < 50 trials | Often poor (r < 0.7) | Increase trials or simplify model |
| 50-100 trials | Marginal for simple models | May suffice for 2-3 parameter models |
| 100-200 trials | Adequate for most models | Standard for DDM (Ratcliff & McKoon, 2008) |
| 200-500 trials | Good for complex models | Recommended for models with > 4 parameters |
| 500+ trials | Excellent for most models | Required for hierarchical models |
Source: Wilson & Collins (2019); Ratcliff & Tuerlinckx (2002) for DDM-specific guidance.
Recovery as a Function of N
Plot recovery metrics (r, RMSE) as a function of trial count to determine the minimum viable N for your specific model and paradigm.
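A toy illustration of the recovery-vs-trial-count curve, using a Bernoulli response-rate parameter whose MLE is simply the observed proportion; a real model would need the full simulate-and-fit pipeline at each trial count:

```python
import numpy as np

def recovery_r_vs_n(trial_counts, n_sets=300, seed=0):
    """Recovery correlation for a response-rate parameter at several trial counts."""
    rng = np.random.default_rng(seed)
    curve = {}
    for n in trial_counts:
        true_p = rng.uniform(0.2, 0.8, n_sets)     # ground-truth rates
        recovered_p = rng.binomial(n, true_p) / n  # MLE from n trials each
        curve[n] = np.corrcoef(true_p, recovered_p)[0, 1]
    return curve

curve = recovery_r_vs_n([10, 50, 500])
```

Even in this one-parameter case recovery is poor at 10 trials and near-perfect at 500; the minimum viable N is read off wherever the curve crosses your target r.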
Landscape Analysis
Parameter Sensitivity Surfaces
For 1-2 key parameters, compute and visualize the objective function surface:
- Fix all parameters except the target parameter(s)
- Evaluate the objective function (e.g., negative log-likelihood) at a grid of values
- Plot the surface (1D: line; 2D: contour or heatmap)
What to look for:
| Surface Feature | Interpretation | Action |
|---|---|---|
| Single sharp minimum | Well-identified parameter | Proceed with confidence |
| Broad flat minimum | Parameter poorly constrained | Widen prior or collect more data |
| Multiple minima | Non-convex; local minima risk | Use multiple starting points; consider reparameterization |
| Ridge (elongated valley) | Parameter tradeoff | Two parameters are correlated; consider fixing one |
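A sketch of the two-parameter surface evaluation, using a Gaussian negative log-likelihood over (mean, SD) as a stand-in for a real model's objective:

```python
import numpy as np

def nll_surface(neg_loglik, grid_x, grid_y):
    """Evaluate a two-parameter objective on a grid (other parameters held fixed)."""
    return np.array([[neg_loglik(x, y) for y in grid_y] for x in grid_x])

data = np.arange(10, dtype=float)  # toy dataset: mean 4.5, SD ~2.87

def gaussian_nll(mu, sigma):
    return len(data) * np.log(sigma) + np.sum((data - mu) ** 2) / (2 * sigma ** 2)

grid_mu = np.linspace(3.0, 6.0, 31)
grid_sd = np.linspace(2.0, 4.0, 21)
surface = nll_surface(gaussian_nll, grid_mu, grid_sd)
i, j = np.unravel_index(np.argmin(surface), surface.shape)
# a single sharp minimum near the MLE indicates a well-identified parameter pair
```

Plot `surface` as a contour or heatmap and inspect it against the table above: here the minimum sits at the grid point nearest the analytic MLE, with no ridge or secondary basin.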
Reporting Standards
Minimum Reporting Checklist
When publishing a parameter recovery study:
- Number of simulated parameter sets (minimum 100; Wilson & Collins, 2019)
- Sampling strategy for ground-truth parameters (uniform, LHS, prior-based)
- Range of ground-truth values for each parameter (with justification)
- Number of simulated trials per dataset (must match real experiment)
- Fitting procedure used (same as for real data)
- Number of starting points for optimization
- Recovery metrics for each parameter: correlation (r), bias, RMSE
- Scatter plots: recovered vs. true for each parameter
- Parameter correlation matrix (recovered parameters)
- Model recovery confusion matrix (if performing model comparison)
- Recovery as a function of trial count (if applicable)
Where to Report
- Main text: Summary of recovery quality (r values, key plots)
- Supplementary: Full correlation matrices, all scatter plots, landscape analyses
- Parameter recovery is increasingly expected in top journals (Wilson & Collins, 2019; Navarro, 2019)
Common Pitfalls
- Testing recovery with too many trials: Simulating 10,000 trials when the experiment has 100. Recovery will look excellent but is irrelevant to your actual data (Wilson & Collins, 2019).
- Using different fitting procedures: The recovery study must use the identical optimization pipeline as the real-data analysis. Different starting values, bounds, or algorithms invalidate the test.
- Ignoring parameter correlations: High marginal recovery (good r for each parameter) can coexist with strong parameter tradeoffs that distort interpretation. Always check the cross-parameter correlation matrix.
- Reporting only correlation: Correlation measures rank-order recovery but ignores systematic bias. A parameter can have r = 0.95 but be consistently overestimated by 30%. Report bias and RMSE alongside r.
- Sampling only near defaults: If ground-truth values cluster around typical defaults, recovery may look good only in that region. Sample across the full plausible range.
- Neglecting model recovery: Good parameter recovery does not guarantee good model recovery. Two models can have recoverable parameters individually but be indistinguishable when competing (Wagenmakers et al., 2004).
- Confusing identifiability with validity: A model can have perfectly recoverable parameters and still be a poor model of cognition. Recovery is necessary but not sufficient (Navarro, 2019).
References
- Anderson, J. R. (2007). How Can the Human Mind Occur in the Physical Universe? Oxford University Press.
- Daw, N. D. (2011). Trial-by-trial data analysis using computational models. In M. R. Delgado, E. A. Phelps, & T. W. Robbins (Eds.), Decision Making, Affect, and Learning. Oxford University Press.
- Heathcote, A., Brown, S. D., & Wagenmakers, E.-J. (2015). An introduction to good practices in cognitive modeling. In B. U. Forstmann & E.-J. Wagenmakers (Eds.), An Introduction to Model-Based Cognitive Neuroscience. Springer.
- Macmillan, N. A., & Creelman, C. D. (2005). Detection Theory: A User's Guide (2nd ed.). Lawrence Erlbaum Associates.
- McKay, M. D., Beckman, R. J., & Conover, W. J. (1979). A comparison of three methods for selecting values of input variables. Technometrics, 21(2), 239-245.
- Navarro, D. J. (2019). Between the devil and the deep blue sea: Tensions between scientific judgement and statistical model selection. Computational Brain & Behavior, 2(1), 28-34.
- Palestro, J. J., Sederberg, P. B., Osth, A. F., Van Zandt, T., & Turner, B. M. (2018). Likelihood-free methods for cognitive science. Springer.
- Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873-922.
- Ratcliff, R., & Tuerlinckx, F. (2002). Estimating parameters of the diffusion model. Psychonomic Bulletin & Review, 9(3), 438-481.
- Wagenmakers, E.-J., Ratcliff, R., Gomez, P., & Iverson, G. J. (2004). Assessing model mimicry using the parametric bootstrap. Journal of Mathematical Psychology, 48(1), 28-50.
- Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8, e49547.
See references/ for diagnostic visualization templates and worked examples.