skills/mims-harvard/tooluniverse/tooluniverse-variant-interpretation

tooluniverse-variant-interpretation

SKILL.md

Clinical Variant Interpreter

Systematic variant interpretation using ToolUniverse - from raw variant calls to ACMG-classified clinical recommendations with structural impact analysis.

Triggers

Use this skill when users:

  • Ask about variant interpretation, classification, or pathogenicity
  • Have VCF data needing clinical annotation
  • Need ACMG classification for variants
  • Want structural impact analysis for missense variants

Key Principles

  1. ACMG-Guided - Follow ACMG/AMP 2015 guidelines with explicit evidence codes
  2. Structural Evidence - Use AlphaFold2 for novel structural impact analysis
  3. Population Context - gnomAD frequencies with ancestry-specific data
  4. Actionable Output - Clear recommendations, not just classifications
  5. English-first queries - Always use English terms in tool calls; respond in user's language

Workflow Overview

Phase 1: VARIANT IDENTITY        → Normalize HGVS, map gene/transcript/consequence
Phase 2: CLINICAL DATABASES       → ClinVar, gnomAD, OMIM, ClinGen, COSMIC, SpliceAI
Phase 2.5: REGULATORY CONTEXT     → ChIPAtlas, ENCODE (non-coding variants only)
Phase 3: COMPUTATIONAL PREDICTIONS → CADD, AlphaMissense, EVE, SIFT/PolyPhen
Phase 4: STRUCTURAL ANALYSIS      → PDB/AlphaFold2, domains, functional sites (VUS/novel)
Phase 4.5: EXPRESSION CONTEXT     → CELLxGENE, GTEx tissue expression
Phase 5: LITERATURE EVIDENCE      → PubMed, EuropePMC, BioRxiv, MedRxiv
Phase 6: ACMG CLASSIFICATION      → Evidence codes, classification, recommendations

Phase 1: Variant Identity

Tools: myvariant_query, Ensembl_get_variant_info, NCBI_gene_search

Capture: HGVS notation (c. and p.), gene symbol, canonical transcript (MANE Select), consequence type, amino acid change, exon/intron location.

Phase 2: Clinical Databases

Tools: clinvar_search, gnomad_search, OMIM_search, OMIM_get_entry, ClinGen_search_gene_validity, ClinGen_search_dosage_sensitivity, ClinGen_search_actionability, COSMIC_search_mutations, COSMIC_get_mutations_by_gene, DisGeNET_search_gene, DisGeNET_get_vda, SpliceAI_predict_splice, SpliceAI_get_max_delta

Use SpliceAI for: intronic variants near splice sites, synonymous variants, exonic variants near splice junctions.

See CODE_PATTERNS.md for implementation details.

Phase 2.5: Regulatory Context (Non-Coding Only)

Apply for intronic (non-splice), promoter, UTR, or intergenic variants near disease genes.

Tools: ChIPAtlas_enrichment_analysis, ChIPAtlas_get_peak_data, ENCODE_search_experiments, ENCODE_get_experiment

Phase 3: Computational Predictions

Tools: CADD_get_variant_score (PHRED 0-99), AlphaMissense_get_variant_score (0-1, needs UniProt ID), EVE_get_variant_score (0-1), myvariant_query (SIFT/PolyPhen), Ensembl_get_variant_info (VEP)

Consensus: Run CADD (all variants) + AlphaMissense + EVE (missense). 2+ concordant damaging = strong PP3; 2+ concordant benign = strong BP4.

See ACMG_CLASSIFICATION.md for thresholds.

Phase 4: Structural Analysis (VUS/Novel Missense)

Tools: PDB_search_by_uniprot, NvidiaNIM_alphafold2, alphafold_get_prediction, InterPro_get_protein_domains, UniProt_get_protein_function

Workflow: Get structure -> map residue -> assess domain/functional site -> predict destabilization.

Phase 4.5: Expression Context

Tools: CELLxGENE_get_expression_data, CELLxGENE_get_cell_metadata, GTEx_get_median_gene_expression

Confirms gene expression in disease-relevant tissues. Supports PP4 if highly restricted; challenges classification if not expressed in affected tissue.

Phase 5: Literature Evidence

Tools: PubMed_search, EuropePMC_search, BioRxiv_search_preprints, MedRxiv_search_preprints, openalex_search_works, SemanticScholar_search_papers

Always flag preprints as NOT peer-reviewed.

Phase 6: ACMG Classification

Apply all relevant evidence codes (PVS1, PS1, PS3, PM1, PM2, PM5, PP3, PP5 for pathogenic; BA1, BS1, BS3, BP4, BP7 for benign). See ACMG_CLASSIFICATION.md for the complete algorithm.


Special Scenarios

Novel Missense VUS: Check PM5 (other pathogenic at same residue), get AlphaFold2 structure, apply PM1/PP3 as appropriate.

Truncating Variant: Check LOF mechanism, NMD escape, alternative isoforms, ClinGen LOF curation. Apply PVS1 at appropriate strength.

Splice Variant: Run SpliceAI, assess canonical splice distance, in-frame skipping potential. Apply PP3/BP7 based on scores.


Output Structure

# Variant Interpretation Report: {GENE} {VARIANT}
## Executive Summary
## 1. Variant Identity
## 2. Population Data
## 3. Clinical Database Evidence
## 4. Computational Predictions
## 5. Structural Analysis
## 6. Literature Evidence
## 7. ACMG Classification
## 8. Clinical Recommendations
## 9. Limitations & Uncertainties
## Data Sources

File naming: {GENE}_{VARIANT}_interpretation_report.md


Clinical Recommendations

Pathogenic/Likely Pathogenic: Enhanced screening, risk-reducing options, drug dosing adjustment, reproductive counseling, family cascade screening.

VUS: Do not use for medical decisions. Reinterpret in 1-2 years. Pursue functional studies and segregation data.

Benign/Likely Benign: Not expected to cause disease. No cascade testing needed.


Quantified Minimums

Section Requirement
Population frequency gnomAD overall + at least 3 ancestry groups
Predictions At least 3 computational predictors
Literature search At least 2 search strategies
ACMG codes All applicable codes listed

References

  • ACMG_CLASSIFICATION.md - Evidence codes, classification algorithm, prediction thresholds, structural/regulatory impact tables
  • CODE_PATTERNS.md - Reusable code patterns for each workflow phase
  • CHECKLIST.md - Pre-delivery verification
  • EXAMPLES.md - Sample interpretations
  • TOOLS_REFERENCE.md - Tool parameters and fallbacks
Weekly Installs
100
GitHub Stars
1.1K
First Seen
Feb 20, 2026
Installed on
codex98
opencode96
gemini-cli96
github-copilot96
amp94
kimi-cli94