Lesion-Symptom Mapping Guide

Purpose

This skill encodes expert methodological knowledge for lesion-symptom mapping in clinical and cognitive neuroscience. A competent programmer without neuropsychology and neuroimaging training will get this wrong because:

Lesion distributions are not random. Stroke lesions cluster in the middle cerebral artery (MCA) territory, creating systematic collinearity between brain regions. Standard voxelwise methods confound truly causal regions with regions that are simply co-damaged (Sperber, 2020).
Lesion volume is a massive confound. Larger lesions produce worse behavioral deficits simply because more tissue is damaged. Any analysis that does not control for total lesion volume will attribute behavioral deficits to whichever regions are most often part of large lesions (DeMarco & Turkeltaub, 2018).
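White matter disconnection can matter as much as grey matter damage. A focal lesion can impair distant, structurally intact regions by interrupting the white matter pathways that connect them (one mechanism behind diaschisis). Standard VLSM maps only the lesion site and cannot capture these remote effects; disconnection analysis is needed (Foulon et al., 2018).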
Standard multiple comparison correction is insufficient. Permutation-based FWE correction is required because voxelwise tests are massively non-independent (lesion voxels are spatially correlated), violating assumptions of FDR and parametric corrections (Kimberg et al., 2007).
Small samples produce false localizations. With fewer than 50 patients, VLSM lacks power and produces unreliable maps that do not replicate (Sperber, 2020).

When to Use This Skill

Planning a voxel-based lesion-symptom mapping (VLSM) study
Choosing between VLSM, multivariate, and disconnection-based approaches
Setting statistical thresholds and correction methods for lesion analyses
Evaluating whether a published lesion study adequately controlled for confounds
Implementing lesion segmentation and registration pipelines
Conducting network-based lesion mapping with normative connectome data

Do NOT use this skill for:

Radiological lesion diagnosis (requires clinical neuroradiology)
General fMRI analysis (see fmri-glm-analysis-guide)
White matter tractography methodology (upstream of this skill)

Research Planning Protocol

Before executing the domain-specific steps below, you MUST:

State the research question -- What specific question is this analysis/paradigm addressing?
Justify the method choice -- Why is this approach appropriate? What alternatives were considered?
Declare expected outcomes -- What results would support vs. refute the hypothesis?
Note assumptions and limitations -- What does this method assume? Where could it mislead?
Present the plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.

⚠️ Verification Notice

This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.

Method Selection Decision Tree

What is your research question?
|
+-- "Which brain voxels are associated with a behavioral deficit?"
|   |
|   +-- N >= 50 patients, continuous outcome
|   |     --> VLSM (mass-univariate)
|   |
|   +-- N >= 50 patients, binary outcome
|   |     --> VLSM with Liebermeister or chi-square
|   |
|   +-- N >= 100 patients, distributed representations expected
|   |     --> SVR-LSM (multivariate)
|   |
|   +-- N < 50 patients
|         --> Underpowered for VLSM; consider ROI-based approach
|             or case-series descriptive analysis
|
+-- "Which white matter pathways mediate the deficit?"
|     --> Disconnection analysis (BCBToolkit, Disconnectome)
|         Can supplement VLSM or replace it when tracts are the question
|
+-- "Which brain networks, when lesioned, produce this symptom?"
      --> Lesion network mapping (normative connectome)
          Maps lesion location to network disruption
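The branching logic of the tree can be sketched as a small function. This is only an illustration: the function name, the argument labels ("voxels", "tracts", "networks") and the returned strings are hypothetical, and the sample-size cut-offs are the ones stated in the tree.

```python
def recommend_method(question: str, n_patients: int,
                     outcome: str = "continuous",
                     distributed: bool = False) -> str:
    """Return the approach suggested by the decision tree.

    question:    "voxels", "tracts", or "networks" (hypothetical labels)
    outcome:     "continuous" or "binary"
    distributed: True if distributed representations are expected
    """
    if question == "tracts":
        return "Disconnection analysis"
    if question == "networks":
        return "Lesion network mapping"
    if question != "voxels":
        raise ValueError(f"unknown question: {question!r}")

    if n_patients < 50:
        # Underpowered for VLSM according to the tree.
        return "ROI-based or case-series descriptive analysis"
    if n_patients >= 100 and distributed:
        return "SVR-LSM (multivariate)"
    if outcome == "binary":
        return "VLSM with a test for binary outcomes"
    return "VLSM (mass-univariate)"


suggestion = recommend_method("voxels", n_patients=120, distributed=True)
```

The function only encodes the tree; it does not replace the judgment calls discussed in the sections that follow.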

Lesion Segmentation

Methods

Method	Description	Accuracy	Time per patient	Source
Manual tracing	Expert traces lesion on each slice	Gold standard	30--60 min	Brett et al., 2001
Semi-automated (lesion_gnb)	Gaussian naive Bayes on FLAIR	Good for chronic WM lesions	5--15 min	Pustina et al., 2016
LINDA	Random forest on T1	Good for chronic stroke	5--10 min	Pustina et al., 2016
U-net / deep learning	Trained CNNs	Approaching manual accuracy	< 1 min	Kamnitsas et al., 2017

Domain judgment: Manual tracing remains the gold standard. Semi-automated methods should always be visually inspected and manually corrected. Never trust a fully automated segmentation without visual QC of every patient (Brett et al., 2001).

Registration to Standard Space

Cost function masking (CRITICAL): When registering a lesioned brain to MNI template, the lesion MUST be masked out of the cost function. Without masking, the registration algorithm warps healthy tissue to fill the lesion, distorting the spatial normalization (Brett et al., 2001).
Recommended pipeline:

Segment lesion on native T1 (or FLAIR)
Create binary lesion mask
Register T1 to MNI using nonlinear registration (e.g., ANTs SyN) with lesion mask as cost function mask
Apply the same warp to the lesion mask
Verify registration quality visually
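As a minimal sketch of step 3, the cost function mask is usually the inverse of the binary lesion mask: 1 where the image may drive the registration, 0 inside the lesion. The example below uses a toy NumPy volume; the function name is hypothetical, and how the mask is passed to the registration tool (and whether the tool expects it in template or patient space) depends on the software and is not shown here.

```python
import numpy as np


def cost_function_mask(lesion: np.ndarray) -> np.ndarray:
    """Return 1 for voxels that may contribute to the cost function, 0 inside the lesion."""
    lesion_bin = lesion > 0              # tolerate non-binary input masks
    return (~lesion_bin).astype(np.uint8)


# Toy 4 x 4 x 4 volume with a 2 x 2 x 2 lesion in the middle.
lesion = np.zeros((4, 4, 4), dtype=np.uint8)
lesion[1:3, 1:3, 1:3] = 1

mask = cost_function_mask(lesion)
```

In practice the lesion mask is often dilated or smoothed slightly before inversion so that the lesion border does not influence the fit; that refinement is left out of the sketch.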

Enantiomorphic normalization: For large unilateral lesions, fill the damaged region with mirrored tissue from the intact hemisphere before registration. This gives the registration algorithm plausible anatomy to match when masking alone would exclude too much of the image (Nachev et al., 2008).
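One simplified way to picture the enantiomorphic step is to copy mirrored intensities into the lesion voxels. The sketch below assumes the image has already been aligned so that the midsagittal plane sits at the centre of the left-right axis; the function name and the `lr_axis` parameter are hypothetical, and real implementations handle midline alignment and smoothing at the lesion border.

```python
import numpy as np


def enantiomorphic_fill(t1: np.ndarray, lesion: np.ndarray, lr_axis: int = 0) -> np.ndarray:
    """Replace lesion voxels with the values of their mirror-image voxels.

    Assumes the volume is symmetric about the centre of `lr_axis`.
    """
    mirrored = np.flip(t1, axis=lr_axis)   # left-right flipped copy
    filled = t1.copy()
    inside = lesion > 0
    filled[inside] = mirrored[inside]      # borrow tissue from the intact side
    return filled


# Toy volume: intensities 0..15, lesion in one voxel on the "left".
t1 = np.arange(16, dtype=float).reshape(4, 2, 2)
lesion = np.zeros_like(t1, dtype=np.uint8)
lesion[0, 0, 0] = 1

repaired = enantiomorphic_fill(t1, lesion)
```

The repaired image is used only to estimate the warp; the warp is then applied to the original image and lesion mask.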

Voxel-Based Lesion-Symptom Mapping (VLSM)

Prerequisites and Sample Size

Requirement	Minimum	Recommended	Source
Sample size	N >= 50	N >= 100	Sperber, 2020; Kimberg et al., 2007
Lesion overlap per voxel	>= 10% of sample (or N >= 10)	>= 15%	Kimberg et al., 2007
Behavioral measure	Continuous preferred	--	Bates et al., 2003

Domain judgment: VLSM with N < 50 is severely underpowered. Sperber (2020) showed that with N = 30, VLSM produces maps that fail to replicate and have unacceptably high false discovery rates. If N < 50, consider ROI-based analyses with a priori regions or descriptive lesion overlap approaches.
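The minimum-overlap requirement can be applied as a voxel mask before any statistics are computed. The sketch below assumes lesions are stored as a patients-by-voxels binary array and, as an assumption about how to combine the two criteria in the table, uses the stricter of "10% of the sample" and "10 patients". Function and parameter names are hypothetical.

```python
import numpy as np


def sufficient_overlap_mask(lesions: np.ndarray,
                            min_fraction: float = 0.10,
                            min_count: int = 10) -> np.ndarray:
    """Boolean mask of voxels lesioned in enough patients to be tested.

    lesions: (n_patients, n_voxels) binary array.
    """
    n_patients = lesions.shape[0]
    counts = (lesions > 0).sum(axis=0)                 # patients lesioned per voxel
    threshold = max(min_count, int(np.ceil(min_fraction * n_patients)))
    return counts >= threshold


# Toy data: 60 patients, 3 voxels lesioned in 12, 5 and 0 patients.
lesions = np.zeros((60, 3), dtype=np.uint8)
lesions[:12, 0] = 1
lesions[:5, 1] = 1

testable = sufficient_overlap_mask(lesions)
```

Only voxels where `testable` is true would enter the voxelwise tests, which also reduces the number of comparisons to correct for.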

Statistical Tests

Test	When to Use	Advantages	Source
t-test	Continuous behavior, binary lesion status per voxel	Simple, widely used	Bates et al., 2003
Brunner-Munzel	Non-normal behavioral data, unequal variances	Robust to non-normality and unequal group sizes	Rorden et al., 2007
Regression	Controlling for covariates (age, lesion volume)	Flexible; includes confound control	Sperber, 2020
Liebermeister	Binary behavioral outcome (impaired/spared)	For binary classification	Rorden et al., 2007
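To make the first row of the table concrete, the following is a minimal sketch of a voxelwise t-test, computed for all voxels at once with NumPy. It uses the Welch (unequal variance) form as an assumption, and it expects every tested voxel to have at least two lesioned and two spared patients. Names are hypothetical; higher behavioral scores are taken to mean better performance.

```python
import numpy as np


def voxelwise_t(lesions: np.ndarray, behavior: np.ndarray) -> np.ndarray:
    """Welch t per voxel: mean(spared) - mean(lesioned), scaled by its standard error.

    lesions:  (n_patients, n_voxels) binary array
    behavior: (n_patients,) scores, higher = better
    """
    les = lesions.astype(bool)
    b = np.asarray(behavior, dtype=float)[:, None]

    n_les = les.sum(axis=0)
    n_spa = (~les).sum(axis=0)
    mean_les = (b * les).sum(axis=0) / n_les
    mean_spa = (b * ~les).sum(axis=0) / n_spa

    # Unbiased group variances, computed per voxel.
    var_les = ((b - mean_les) ** 2 * les).sum(axis=0) / (n_les - 1)
    var_spa = ((b - mean_spa) ** 2 * ~les).sum(axis=0) / (n_spa - 1)

    return (mean_spa - mean_les) / np.sqrt(var_les / n_les + var_spa / n_spa)


# Toy data: 6 patients, 2 voxels. Damage to voxel 0 goes with low scores.
lesions = np.array([[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1]])
behavior = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

t_map = voxelwise_t(lesions, behavior)
```

A positive value means patients with damage at that voxel scored lower. The resulting map still needs correction for multiple comparisons and control for lesion volume.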

Multiple Comparison Correction

Method	Description	Recommended?	Source
Permutation-based FWE	Permute behavioral scores 5000+ times; threshold at the 95th percentile of the permutation distribution of the maximum statistic (alpha = 0.05)	YES -- gold standard	Kimberg et al., 2007
FDR (Benjamini-Hochberg)	Controls false discovery rate	Acceptable alternative but assumes independence	Kimberg et al., 2007
Bonferroni	Divide alpha by number of voxels	Too conservative; almost never detects effects	Expert consensus
Uncorrected	No correction	NEVER for publication	Expert consensus

Domain judgment: Permutation testing is strongly preferred because lesion maps violate the independence assumptions of FDR and parametric corrections. The spatial correlation structure of lesions means neighboring voxels are highly non-independent. Permutation testing implicitly accounts for this correlation structure (Kimberg et al., 2007).
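A minimal sketch of the maximum-statistic permutation procedure follows. For brevity it uses the negated Pearson correlation between lesion status and behavior as the voxelwise statistic, which is an assumption of this example rather than the statistic of any particular toolbox. It also assumes voxels with constant lesion status have already been removed. All names are hypothetical.

```python
import numpy as np


def voxel_stat(lesions: np.ndarray, behavior: np.ndarray) -> np.ndarray:
    """Negated Pearson r per voxel: larger = damage goes with lower scores."""
    x = lesions - lesions.mean(axis=0)
    y = behavior - behavior.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    return -(x * y[:, None]).sum(axis=0) / denom


def permutation_fwe_threshold(lesions, behavior, n_perm=5000, alpha=0.05, seed=0):
    """Threshold controlling family-wise error via the max-statistic null."""
    rng = np.random.default_rng(seed)
    max_stats = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(behavior)          # break the lesion-behavior link
        max_stats[i] = voxel_stat(lesions, shuffled).max()
    return np.quantile(max_stats, 1.0 - alpha)        # e.g. 95th percentile


# Toy data: 60 patients, 200 voxels; only voxel 0 affects behavior.
rng = np.random.default_rng(42)
lesions = (rng.random((60, 200)) < 0.3).astype(float)
behavior = rng.normal(size=60) - 3.0 * lesions[:, 0]

observed = voxel_stat(lesions, behavior)
threshold = permutation_fwe_threshold(lesions, behavior, n_perm=1000)
significant = observed > threshold
```

Because the lesion matrix is left intact and only the behavioral scores are shuffled, the spatial correlation between voxels is preserved in every permutation, which is why the threshold adapts to it.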

Controlling for Lesion Volume

Lesion volume MUST be controlled. Methods (DeMarco & Turkeltaub, 2018):

Direct regression: Include total lesion volume as a covariate in the voxelwise regression model
Behavioral residualization: Regress behavior on lesion volume first; use residuals as the dependent variable in VLSM
Both yield similar results, but direct regression is preferred for interpretability (DeMarco & Turkeltaub, 2018)

CRITICAL: Failing to control for lesion volume is the single most common error in VLSM studies. Large lesions damage more regions and produce worse deficits, creating a spurious correlation between any frequently-damaged voxel and behavioral impairment (DeMarco & Turkeltaub, 2018).
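Both approaches can be sketched with ordinary least squares. The example below is an illustration on toy data for a single voxel, not the implementation of any toolbox; the function names are hypothetical, and lesion volume is assumed to be a per-patient number (for example, the count of lesioned voxels).

```python
import numpy as np


def residualize(behavior: np.ndarray, lesion_volume: np.ndarray) -> np.ndarray:
    """Behavioral residualization: remove the linear effect of lesion volume."""
    X = np.column_stack([np.ones(len(behavior)), lesion_volume])
    beta, *_ = np.linalg.lstsq(X, behavior, rcond=None)
    return behavior - X @ beta


def voxel_beta_with_volume(behavior, voxel_status, lesion_volume) -> float:
    """Direct regression: effect of damage at one voxel with volume as a covariate."""
    X = np.column_stack([np.ones(len(behavior)), voxel_status, lesion_volume])
    beta, *_ = np.linalg.lstsq(X, behavior, rcond=None)
    return float(beta[1])                 # coefficient for voxel damage


# Toy data: behavior depends on voxel damage (-2) and on lesion volume (-0.5 per unit).
voxel_status = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
lesion_volume = np.array([10.0, 12.0, 30.0, 25.0, 50.0, 8.0])
behavior = 100.0 - 2.0 * voxel_status - 0.5 * lesion_volume

residuals = residualize(behavior, lesion_volume)
beta_voxel = voxel_beta_with_volume(behavior, voxel_status, lesion_volume)
```

In a full analysis the direct-regression model would be fitted at every voxel, and the permutation procedure would be run on the statistic for the voxel term.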

Multivariate Lesion-Symptom Mapping

SVR-LSM (Zhang et al., 2014)

Support vector regression-based lesion-symptom mapping considers the full lesion pattern simultaneously, addressing the collinearity problem of mass-univariate VLSM.

Parameter	Recommended Value	Source
Kernel	Linear	Zhang et al., 2014
C parameter	30 (default for LSM)	Zhang et al., 2014
Feature reduction	Remove voxels with < 10% lesion overlap	Zhang et al., 2014
Statistical inference	Permutation testing (5000+ permutations)	Zhang et al., 2014

Advantages over VLSM (Zhang et al., 2014):

Considers spatial covariance of lesion damage (voxels are analyzed jointly, not independently)
Better handles the collinearity created by vascular territories
Can detect distributed patterns

Limitations:

Computationally expensive (hours per analysis with permutation testing)
Requires larger samples (N >= 80--100 recommended) for stable SVR weights
Interpretation of SVR weight maps is less straightforward than VLSM t-maps

Machine Learning-Based Mapping (MLBM)

Other multivariate approaches (random forests, LASSO regression) can be applied:

LASSO is useful for variable selection among brain regions
Random forests handle nonlinear relationships
All require permutation-based significance testing
Cross-validation (leave-one-out or k-fold) is mandatory
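The cross-validation requirement can be illustrated with a k-fold loop. To keep the sketch dependency-free, it uses ridge regression solved in closed form as a stand-in for LASSO, random forests, or SVR; that substitution, the penalty value, and all names are assumptions of this example. The key point is that centring and model fitting use the training folds only.

```python
import numpy as np


def kfold_predictions(lesions, behavior, k=5, ridge=1.0, seed=0):
    """Out-of-sample predictions from k-fold cross-validated ridge regression."""
    n = len(behavior)
    order = np.random.default_rng(seed).permutation(n)
    predicted = np.empty(n)
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        # Centre using training data only, to avoid leakage into the test fold.
        x_mean = lesions[train_idx].mean(axis=0)
        y_mean = behavior[train_idx].mean()
        Xc = lesions[train_idx] - x_mean
        # Dual-form ridge solution: convenient when voxels outnumber patients.
        gram = Xc @ Xc.T + ridge * np.eye(len(train_idx))
        weights = Xc.T @ np.linalg.solve(gram, behavior[train_idx] - y_mean)
        predicted[test_idx] = (lesions[test_idx] - x_mean) @ weights + y_mean
    return predicted


# Toy data: 120 patients, 40 voxels; voxels 0 and 1 drive the deficit.
rng = np.random.default_rng(1)
lesions = (rng.random((120, 40)) < 0.3).astype(float)
behavior = rng.normal(size=120) - 3.0 * lesions[:, 0] - 3.0 * lesions[:, 1]

predicted = kfold_predictions(lesions, behavior)
cv_correlation = np.corrcoef(predicted, behavior)[0, 1]
```

The correlation between held-out predictions and observed scores gives an estimate of predictive performance; its significance would still be assessed by permutation, as stated above.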

Disconnection Analysis

Rationale

A focal lesion disrupts not only the damaged tissue but also white matter pathways passing through the lesion, disconnecting distant brain regions (Foulon et al., 2018). VLSM maps only the lesion site; disconnection analysis maps the affected structural connections.

Methods

Tool	Approach	Data Required	Source
BCBToolkit	Maps lesion to disconnected tracts using normative tractography atlas	Lesion mask in MNI space	Foulon et al., 2018
Disconnectome maps	Pre-computed: for each brain voxel, which tracts are disconnected	Lesion mask in MNI space	Thiebaut de Schotten et al., 2015
Individual tractography	DTI/DWI tractography in each patient	Patient DWI data	Expert consensus

BCBToolkit Pipeline (Foulon et al., 2018)

Register lesion mask to MNI space
Use the normative tractography atlas (from 170 healthy controls) to identify tracts passing through each lesion
Generate a disconnection probability map: for each voxel outside the lesion, the probability that it is disconnected
Use disconnection maps (instead of or in addition to lesion masks) as predictors in VLSM-like analyses
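The idea behind steps 2 and 3 can be sketched on toy data. The example assumes each healthy control contributes a set of streamlines, each represented as a list of voxel indices in a flattened MNI grid; this data layout and the function name are hypothetical and are not BCBToolkit's internal format.

```python
import numpy as np


def disconnection_probability(lesion, streamlines_by_control, n_voxels):
    """Proportion of controls in whom each voxel lies on a streamline crossing the lesion.

    lesion: boolean array of shape (n_voxels,)
    streamlines_by_control: one list per control, each holding voxel-index lists
    """
    hits = np.zeros(n_voxels)
    for streamlines in streamlines_by_control:
        touched = np.zeros(n_voxels, dtype=bool)
        for voxels in streamlines:
            if lesion[voxels].any():        # streamline passes through the lesion
                touched[voxels] = True      # mark its whole trajectory
        hits += touched
    return hits / len(streamlines_by_control)


# Toy grid of 6 voxels with a lesion at voxel 2 and two controls.
lesion = np.array([False, False, True, False, False, False])
controls = [
    [[0, 1, 2], [3, 4]],    # control A: first streamline crosses the lesion
    [[2, 5], [0, 1]],       # control B: first streamline crosses the lesion
]

disconnection = disconnection_probability(lesion, controls, n_voxels=6)
```

Note that this sketch also assigns values to voxels inside the lesion; for analyses restricted to remote effects, those voxels can be masked out afterwards.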

Advantages Over VLSM

Captures remote effects (diaschisis) that VLSM misses entirely
Accounts for the fact that two lesions overlapping at the same voxel can disconnect different pathways depending on their overall extent
Particularly important for white matter lesions, where the deficit arises from interrupted connections rather than from loss of gray matter at the lesion site

Network Lesion Mapping

Lesion Network Mapping (Boes et al., 2015; Fox, 2018)

Maps a lesion to the brain-wide functional network it disrupts, using normative resting-state fMRI data:

Use the lesion as a seed region in a normative resting-state fMRI connectome (e.g., 1000-subject dataset)
Compute the functional connectivity map of the lesion location in healthy brains
The resulting map shows which brain regions are functionally connected to the lesion site
Across patients, identify the network that is commonly disrupted
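Steps 1 to 3 can be sketched as a seed-based correlation averaged over normative subjects. The example assumes preprocessed time series are available as timepoints-by-voxels arrays in the same space as the lesion mask; the function name and data layout are hypothetical, and preprocessing, thresholding, and the across-patient overlap in step 4 are not shown.

```python
import numpy as np


def lesion_connectivity_map(lesion, timeseries_by_subject):
    """Mean Fisher-z connectivity between the lesion seed and every voxel.

    lesion: boolean array (n_voxels,)
    timeseries_by_subject: list of (n_timepoints, n_voxels) arrays from healthy controls
    """
    z_maps = []
    for ts in timeseries_by_subject:
        seed = ts[:, lesion].mean(axis=1)             # average signal within the lesion
        ts_c = ts - ts.mean(axis=0)
        seed_c = seed - seed.mean()
        r = (ts_c * seed_c[:, None]).sum(axis=0) / (
            np.sqrt((ts_c ** 2).sum(axis=0)) * np.sqrt((seed_c ** 2).sum())
        )
        # Clip so that r = 1 at the seed itself does not give an infinite z.
        z_maps.append(np.arctanh(np.clip(r, -0.999999, 0.999999)))
    return np.mean(z_maps, axis=0)


# Toy normative data: 3 subjects, 200 timepoints, 5 voxels; voxel 1 is coupled to voxel 0.
rng = np.random.default_rng(0)
subjects = []
for _ in range(3):
    ts = rng.normal(size=(200, 5))
    ts[:, 1] = ts[:, 0] + 0.3 * rng.normal(size=200)
    subjects.append(ts)

lesion = np.array([True, False, False, False, False])
network = lesion_connectivity_map(lesion, subjects)
```

Each patient's lesion yields one such map; step 4 then looks for regions where these maps agree across patients who share the symptom.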

Key Considerations

Normative connectome: Results depend on the quality and size of the normative dataset. Larger datasets (N >= 500 healthy controls) provide more reliable connectivity estimates (Fox, 2018).
Seed definition: Use the entire lesion as a seed, not just the peak voxel. Consider weighting by lesion probability if lesion masks are probabilistic.
Validation: Compare network maps to known functional anatomy. The disrupted network should be biologically plausible for the observed deficit.
Limitations: Assumes that healthy-brain connectivity predicts the effect of a lesion. Does not account for reorganization or compensation (Fox, 2018).

Common Confounds

1. Lesion Volume Correlation

Larger lesions produce worse behavioral deficits and damage more voxels. Without controlling for lesion volume, VLSM maps reflect voxels that are part of large lesions rather than voxels critical for the behavior (DeMarco & Turkeltaub, 2018).

2. Non-Random Lesion Distribution (MCA Territory Bias)

Stroke lesions are not uniformly distributed across the brain. The MCA territory (lateral frontal, temporal, parietal, insular cortex) is disproportionately affected. This means:

High statistical power in MCA territory, low power elsewhere
Voxels outside MCA territory may be critical but undetectable
Collinearity between MCA-territory voxels inflates false positives for non-critical regions (Sperber, 2020)

3. Time Post-Onset

Phase (time post-onset)	Tissue state	Implication for analysis	Source
Acute (< 2 weeks)	Edema, diaschisis, penumbra	Lesion extent overestimated; behavior worst	Karnath et al., 2004
Subacute (2 weeks -- 3 months)	Recovery, reorganization	Lesion stabilizing; behavior improving	Expert consensus
Chronic (> 3 months)	Stable lesion	Preferred for VLSM; most stable brain-behavior relationship	Sperber, 2020

Domain judgment: Chronic-phase data (> 3 months post-onset) is strongly preferred for VLSM because both lesion extent and behavioral deficits have stabilized. Acute-phase data confounds true lesion effects with transient diaschisis and edema (Karnath et al., 2004; Sperber, 2020).

4. Covariates

Always consider controlling for:

Age: Older patients have worse outcomes independent of lesion
Education: Affects cognitive test performance
Time post-onset: If sample includes mixed acute/chronic patients
Handedness: Affects lateralization of language
Lesion hemisphere: If analyzing bilateral samples, hemisphere effects must be modeled

Software

Software	Methods	Language	Source
NiiStat	VLSM, ROI analysis	MATLAB	Rorden et al., 2007
VLSM2	VLSM with permutation testing	MATLAB	Bates et al., 2003
SVR-LSM toolbox	Multivariate SVR-LSM	MATLAB	Zhang et al., 2014
BCBToolkit	Disconnection analysis, disconnectome	GUI/Python	Foulon et al., 2018
LESYMAP	VLSM, SVR-LSM, SCCAN	R	Pustina et al., 2018
ANTs	Registration with cost function masking	C++/Python	Avants et al., 2011
FSL	Registration, lesion masking	Python/C++	Jenkinson et al., 2012

Common Pitfalls

1. Not Controlling for Lesion Volume

The most frequent and most damaging error. Always include lesion volume as a covariate or use residualized behavioral scores (DeMarco & Turkeltaub, 2018).

2. Too Small a Sample

VLSM with N < 50 produces unreliable maps. With N = 30, false positive rates can exceed 50% in some simulations (Sperber, 2020). If your sample is small, use ROI-based approaches or descriptive methods.

3. Not Using Cost Function Masking During Registration

Registering lesioned brains to template without masking the lesion distorts the normalization, warping healthy tissue into the lesion cavity and misaligning the rest of the brain (Brett et al., 2001).

4. Using Uncorrected or Bonferroni Correction

Uncorrected thresholds produce massive false positives. Bonferroni is too conservative due to spatial correlation. Use permutation-based FWE (Kimberg et al., 2007).

5. Ignoring the Vascular Architecture

Interpreting a VLSM map as showing "the region responsible for function X" ignores that vascular territory collinearity may have driven the result. Consider supplementing with disconnection analysis or using multivariate methods (Sperber, 2020).

6. Mixing Acute and Chronic Patients Without Controlling for Time

Acute patients have larger effective lesions (edema) and worse behavior, confounding time-post-onset with lesion severity. Analyze chronic patients separately or include time as a covariate (Karnath et al., 2004).

Minimum Reporting Checklist

Based on Sperber (2020), DeMarco & Turkeltaub (2018), and Kimberg et al. (2007):

Key References

Bates, E., Wilson, S. M., Saygin, A. P., Dick, F., Sereno, M. I., Knight, R. T., & Dronkers, N. F. (2003). Voxel-based lesion-symptom mapping. Nature Neuroscience, 6(5), 448--450.
Boes, A. D., Prasad, S., Liu, H., Liu, Q., Pascual-Leone, A., Caviness, V. S., & Fox, M. D. (2015). Network localization of neurological symptoms from focal brain lesions. Brain, 138(10), 3061--3075.
Brett, M., Leff, A. P., Rorden, C., & Ashburner, J. (2001). Spatial normalization of brain images with focal lesions using cost function masking. NeuroImage, 14(2), 486--500.
DeMarco, A. T., & Turkeltaub, P. E. (2018). A multivariate lesion symptom mapping toolbox and examination of lesion-volume biases and correction methods in lesion-symptom mapping. Human Brain Mapping, 39(11), 4169--4182.
Foulon, C., Cerliani, L., Kinkingnehun, S., Levy, R., Rosso, C., Urbanski, M., Volle, E., & Thiebaut de Schotten, M. (2018). Advanced lesion symptom mapping analyses and implementation as BCBtoolkit. GigaScience, 7(3), giy004.
Fox, M. D. (2018). Mapping symptoms to brain networks with the human connectome. New England Journal of Medicine, 379(23), 2237--2245.
Kamnitsas, K., Ledig, C., Newcombe, V. F. J., Simpson, J. P., Kane, A. D., Menon, D. K., Rueckert, D., & Glocker, B. (2017). Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36, 61--78.
Karnath, H.-O., Fruhmann Berger, M., Kuker, W., & Rorden, C. (2004). The anatomy of spatial neglect based on voxelwise statistical analysis: A study of 140 patients. Cerebral Cortex, 14(10), 1164--1172.
Kimberg, D. Y., Coslett, H. B., & Schwartz, M. F. (2007). Power in voxel-based lesion-symptom mapping. Journal of Cognitive Neuroscience, 19(7), 1067--1080.
Nachev, P., Coulthard, E., Jager, H. R., Kennard, C., & Husain, M. (2008). Enantiomorphic normalization of focally lesioned brains. NeuroImage, 39(3), 1215--1226.
Pustina, D., Coslett, H. B., Turkeltaub, P. E., Tustison, N., Schwartz, M. F., & Avants, B. (2016). Automated segmentation of chronic stroke lesions using LINDA: Lesion identification with neighborhood data analysis. Human Brain Mapping, 37(4), 1405--1421.
Rorden, C., Karnath, H.-O., & Bonilha, L. (2007). Improving lesion-symptom mapping. Journal of Cognitive Neuroscience, 19(7), 1081--1088.
Sperber, C. (2020). Rethinking causality and data complexity in brain lesion-behaviour inference and its implications for lesion-behaviour modelling. Cortex, 126, 49--62.
Zhang, Y., Kimberg, D. Y., Coslett, H. B., Schwartz, M. F., & Wang, Z. (2014). Multivariate lesion-symptom mapping using support vector regression. Human Brain Mapping, 35(12), 5861--5876.

See references/vlsm-pipeline.md for step-by-step VLSM analysis workflow. See references/disconnection-guide.md for detailed BCBToolkit and disconnection analysis procedures.
