skills/mims-harvard/tooluniverse/tooluniverse-spatial-transcriptomics

tooluniverse-spatial-transcriptomics

SKILL.md

Spatial Transcriptomics Analysis

Comprehensive analysis of spatially-resolved transcriptomics data to understand gene expression patterns in tissue architecture context. Combines expression profiling with spatial coordinates to reveal tissue organization, cell-cell interactions, and spatially variable genes.

When to Use This Skill

Triggers:

  • User has spatial transcriptomics data (Visium, MERFISH, seqFISH, etc.)
  • Questions about tissue architecture or spatial organization
  • Spatial gene expression pattern analysis
  • Cell-cell proximity or neighborhood analysis requests
  • Tumor microenvironment spatial structure questions
  • Integration of spatial with single-cell data
  • Spatial domain identification
  • Tissue morphology correlation with expression

Example Questions:

  1. "Analyze this 10x Visium dataset to identify spatial domains"
  2. "Which genes show spatially variable expression in this tissue?"
  3. "Map the tumor microenvironment spatial organization"
  4. "Find genes enriched at tissue boundaries"
  5. "Identify cell-cell interactions based on spatial proximity"
  6. "Integrate spatial transcriptomics with scRNA-seq annotations"

Core Capabilities

Capability Description
Data Import 10x Visium, MERFISH, seqFISH, Slide-seq, STARmap, Xenium formats
Quality Control Spot/cell QC, spatial alignment verification, tissue coverage
Normalization Spatial-aware normalization accounting for tissue heterogeneity
Spatial Clustering Identify spatial domains with similar expression profiles
Spatial Variable Genes Find genes with non-random spatial patterns
Neighborhood Analysis Cell-cell proximity, spatial neighborhoods, niche identification
Spatial Patterns Gradients, boundaries, hotspots, expression waves
Integration Merge with scRNA-seq for cell type mapping
Ligand-Receptor Spatial Map cell communication in tissue context
Visualization Spatial plots, heatmaps on tissue, 3D reconstruction

Supported Platforms

Platform Resolution Genes Notes
10x Visium 55um spots (~50 cells/spot) Genome-wide Most common, includes H&E image
MERFISH/seqFISH Single-cell 100-10,000 (targeted) Imaging-based, absolute coordinates
Slide-seq/V2 10um beads Genome-wide Higher resolution than Visium
Xenium Single-cell, subcellular 300+ (targeted) 10x single-cell spatial

Workflow Overview

Input: Spatial Transcriptomics Data + Tissue Image
    |
    v
Phase 1: Data Import & QC
    |-- Load spatial coordinates + expression matrix
    |-- Load tissue histology image
    |-- Quality control per spot/cell (min 200 genes, 500 UMI, <20% MT)
    |-- Align spatial coordinates to tissue
    |
    v
Phase 2: Preprocessing
    |-- Normalization (spatial-aware methods)
    |-- Highly variable gene selection (top 2000)
    |-- Dimensionality reduction (PCA)
    |-- Spatial lag smoothing (optional)
    |
    v
Phase 3: Spatial Clustering
    |-- Build spatial neighbor graph (squidpy)
    |-- Graph-based clustering with spatial constraints (Leiden)
    |-- Annotate domains with marker genes (Wilcoxon)
    |-- Visualize domains on tissue
    |
    v
Phase 4: Spatial Variable Genes
    |-- Test spatial autocorrelation (Moran's I, Geary's C)
    |-- Filter significant spatial genes (FDR < 0.05)
    |-- Classify pattern types (gradient, hotspot, boundary, periodic)
    |
    v
Phase 5: Neighborhood Analysis
    |-- Define spatial neighborhoods (k-NN, radius)
    |-- Calculate neighborhood composition (squidpy nhood_enrichment)
    |-- Identify interaction zones between domains
    |
    v
Phase 6: Integration with scRNA-seq
    |-- Cell type deconvolution (Cell2location, Tangram, SPOTlight)
    |-- Map cell types to spatial locations
    |-- Validate with marker genes
    |
    v
Phase 7: Spatial Cell Communication
    |-- Identify proximal cell type pairs
    |-- Query ligand-receptor database (OmniPath)
    |-- Score spatial interactions (squidpy ligrec)
    |-- Map communication hotspots
    |
    v
Phase 8: Generate Spatial Report
    |-- Tissue overview with domains
    |-- Spatially variable genes
    |-- Cell type spatial maps
    |-- Interaction networks in tissue context

Phase Summaries

Phase 1: Data Import & QC

Load platform-specific data (scanpy read_visium for Visium). Apply QC filters: min 200 genes/spot, min 500 UMI/spot, max 20% mitochondrial. Verify spatial alignment with tissue image overlay.

Phase 2: Preprocessing

Normalize to median total counts, log-transform, select top 2000 HVGs. Optional spatial smoothing via neighbor averaging (useful for noisy data but blurs boundaries).

Phase 3: Spatial Clustering

PCA (50 components) followed by spatial neighbor graph construction (squidpy). Leiden clustering with spatial constraints yields spatial domains. Find domain markers via Wilcoxon rank-sum test.

Phase 4: Spatially Variable Genes

Moran's I statistic tests spatial autocorrelation: I > 0 = clustering, I ~ 0 = random, I < 0 = checkerboard. Filter by FDR < 0.05. Classify patterns as gradient, hotspot, boundary, or periodic.

Phase 5: Neighborhood Analysis

Neighborhood enrichment analysis (squidpy) tests whether cell types/domains are co-localized beyond random expectation. Identify interaction zones at domain boundaries using k-NN spatial graphs.

Phase 6: scRNA-seq Integration

Cell type deconvolution maps single-cell annotations to spatial spots. Methods: Cell2location (recommended for Visium), Tangram, SPOTlight. Produces cell type fraction estimates per spot.

Phase 7: Spatial Cell Communication

Combine spatial proximity with ligand-receptor databases (OmniPath). Score interactions by co-expression of L-R pairs in proximal cells. Map hotspots where interaction scores peak.

Phase 8: Report Generation

See report_template.md for full example output.


Integration with ToolUniverse Skills

Skill Used For Phase
tooluniverse-single-cell scRNA-seq reference for deconvolution Phase 6
tooluniverse-single-cell (Phase 10) L-R database for communication Phase 7
tooluniverse-gene-enrichment Pathway enrichment for spatial domains Phase 3
tooluniverse-multi-omics-integration Integrate with other omics Phase 8

Example Use Cases

Use Case 1: Tumor Microenvironment Mapping

Question: "Map the spatial organization of tumor, immune, and stromal cells" Workflow: Load Visium -> QC -> Spatial clustering -> Deconvolution -> Interaction zones -> L-R analysis -> Report

Use Case 2: Developmental Gradient Analysis

Question: "Identify spatial gene expression gradients in developing tissue" Workflow: Load spatial data -> SVG analysis -> Classify gradient patterns -> Map morphogens -> Correlate with cell fate -> Report

Use Case 3: Brain Region Identification

Question: "Automatically segment brain tissue into anatomical regions" Workflow: Load Visium brain -> High-resolution clustering -> Match to known regions -> Validate with Allen Brain Atlas -> Report


Quantified Minimums

Component Requirement
Spots/cells At least 500 spatial locations
QC Filter low-quality spots, verify alignment
Spatial clustering At least one method (graph-based or spatial)
Spatial genes Moran's I or similar spatial test
Visualization Spatial plots on tissue images
Report Domains, spatial genes, visualizations

Limitations

  • Resolution: Visium spots contain multiple cells (not single-cell)
  • Gene coverage: Imaging methods have limited gene panels
  • 3D structure: Most platforms are 2D sections
  • Tissue quality: Requires well-preserved tissue for imaging
  • Computational: Large datasets require significant memory
  • Reference dependency: Deconvolution quality depends on scRNA-seq reference

References

Methods:

Platforms:


Reference Files

Weekly Installs
111
GitHub Stars
1.1K
First Seen
Feb 19, 2026
Installed on
gemini-cli108
codex108
opencode107
github-copilot107
cursor105
amp104