GWAS SNP Interpretation Skill

Overview

Interpret genetic variants (SNPs) reported in GWAS by aggregating evidence from the GWAS Catalog and Open Targets Genetics to provide clinical and biological context.

Use Cases:

  • "Interpret rs7903146" (TCF7L2 diabetes variant)
  • "What diseases is rs429358 associated with?" (APOE Alzheimer's variant)
  • "Clinical significance of rs1801133" (MTHFR variant)
  • "Is rs12913832 in any fine-mapped loci?" (Eye color variant)

What It Does

The skill provides a comprehensive interpretation of SNPs by:

  1. SNP Annotation: Retrieves basic variant information including genomic coordinates, alleles, functional consequence, and mapped genes
  2. Association Discovery: Finds all GWAS trait/disease associations with statistical significance
  3. Fine-Mapping Evidence: Identifies credible sets the variant belongs to (fine-mapped causal loci)
  4. Gene Mapping: Uses Locus-to-Gene (L2G) predictions to identify likely causal genes
  5. Clinical Summary: Aggregates the evidence above into a summary of clinical significance and actionability

Workflow

User Input: rs7903146
[1] SNP Lookup
    → Get location, consequence, MAF
    → gwas_get_snp_by_id
[2] Association Search
    → Find all trait/disease associations
    → gwas_get_associations_for_snp
[3] Fine-Mapping (Optional)
    → Get credible set membership
    → OpenTargets_get_variant_credible_sets
[4] Gene Predictions
    → Extract L2G scores for causal genes
    → (embedded in credible sets)
[5] Clinical Summary
    → Aggregate evidence
    → Identify key traits and genes
Output: Comprehensive Interpretation Report
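
The five steps above can be sketched as one small orchestration function. This is a minimal illustration, not the skill's actual implementation: `interpret_snp`, the `call_tool` callback, the argument names (`rs_id`, `variant_id`), and the response keys are all assumptions made for the example. Only the three tool names come from this document.

```python
from typing import Any, Callable, Dict, List

# Hypothetical callback: takes a tool name and an arguments dict, returns parsed data.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def interpret_snp(rs_id: str, call_tool: ToolCaller,
                  include_credible_sets: bool = True) -> Dict[str, Any]:
    """Run workflow steps [1]-[5] and return a plain-dict report."""
    # [1] SNP lookup: location, consequence, MAF
    snp = call_tool("gwas_get_snp_by_id", {"rs_id": rs_id})

    # [2] Association search
    associations = call_tool("gwas_get_associations_for_snp", {"rs_id": rs_id})

    # [3] Fine-mapping (optional); the variant ID uses the chr_pos_ref_alt form
    credible_sets: List[Dict[str, Any]] = []
    if include_credible_sets:
        variant_id = "{chromosome}_{position}_{ref_allele}_{alt_allele}".format(**snp)
        credible_sets = call_tool("OpenTargets_get_variant_credible_sets",
                                  {"variant_id": variant_id})

    # [4] Gene predictions: L2G entries are embedded in each credible set
    genes = sorted({g["gene"] for cs in credible_sets
                    for g in cs.get("predicted_genes", [])})

    # [5] Clinical summary inputs: unique traits and predicted genes
    traits = sorted({a["trait"] for a in associations})
    return {"snp": snp, "associations": associations,
            "credible_sets": credible_sets,
            "key_traits": traits, "predicted_genes": genes}
```

Passing the tool caller in as a parameter keeps the sketch independent of any particular client library and makes it easy to exercise with stubbed responses.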

Data Sources

GWAS Catalog (EMBL-EBI)

  • SNP annotations: Functional consequences, mapped genes, population frequencies
  • Associations: P-values, effect sizes, study metadata
  • Coverage: thousands of curated publications, 670,000+ associations

Open Targets Genetics

  • Fine-mapping: Statistical credible sets from SuSiE, FINEMAP methods
  • L2G predictions: Machine learning-based gene prioritization
  • Colocalization: QTL evidence for causal genes
  • Coverage: UK Biobank, FinnGen, and other large cohorts

Input Parameters

Required

  • rs_id (str): dbSNP rs identifier
    • Format: "rs" + number (e.g., "rs7903146")
    • Must be a valid rsID that is present in the GWAS Catalog

Optional

  • include_credible_sets (bool, default=True): Query fine-mapping data
    • True: Complete interpretation (slower, ~10-30s)
    • False: Fast associations only (~2-5s)
  • p_threshold (float, default=5e-8): Genome-wide significance threshold
  • max_associations (int, default=100): Maximum associations to retrieve
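
As a rough sketch, the parameters above could be bundled and validated before any API call is made. The `InterpretationParams` class is hypothetical and not part of the skill. It checks only the rsID format and value ranges; it cannot tell whether the rsID actually exists in the GWAS Catalog.

```python
import re
from dataclasses import dataclass

# "rs" followed by one or more digits, e.g. "rs7903146"
_RS_ID = re.compile(r"rs\d+")


@dataclass(frozen=True)
class InterpretationParams:
    """Hypothetical container mirroring the documented inputs and defaults."""
    rs_id: str
    include_credible_sets: bool = True
    p_threshold: float = 5e-8
    max_associations: int = 100

    def __post_init__(self) -> None:
        if not _RS_ID.fullmatch(self.rs_id):
            raise ValueError(f"rs_id must look like 'rs7903146', got {self.rs_id!r}")
        if not 0 < self.p_threshold <= 1:
            raise ValueError("p_threshold must be in (0, 1]")
        if self.max_associations < 1:
            raise ValueError("max_associations must be at least 1")
```

For example, `InterpretationParams("rs7903146", include_credible_sets=False)` describes a fast, associations-only query, while `InterpretationParams("7903146")` raises `ValueError`.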

Output Format

Returns an SNPInterpretationReport containing:

1. SNP Basic Info

{
    'rs_id': 'rs7903146',
    'chromosome': '10',
    'position': 112998590,
    'ref_allele': 'C',
    'alt_allele': 'T',
    'consequence': 'intron_variant',
    'mapped_genes': ['TCF7L2'],
    'maf': 0.293
}

2. Trait Associations

[
    {
        'trait': 'Type 2 diabetes',
        'p_value': 1.2e-128,
        'beta': '0.28 unit increase',
        'study_id': 'GCST010555',
        'pubmed_id': '33536258',
        'effect_allele': 'T'
    },
    ...
]

3. Credible Sets (Fine-Mapping)

[
    {
        'study_id': 'GCST90476118',
        'trait': 'Renal failure',
        'finemapping_method': 'SuSiE-inf',
        'p_value': 3.5e-42,
        'predicted_genes': [
            {'gene': 'TCF7L2', 'score': 0.863}
        ],
        'region': '10:112950000-113050000'
    },
    ...
]

4. Clinical Significance

Genome-wide significant associations with 100 traits/diseases:
  - Type 2 diabetes
  - Diabetic retinopathy
  - HbA1c levels
  ...

Identified in 20 fine-mapped loci.
Predicted causal genes: TCF7L2
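
The clinical significance text in section 4 can be derived from sections 2 and 3. The function below is one way to do that, assuming the dictionary keys shown in the examples above (`trait`, `p_value`, `predicted_genes`, `gene`). The name `clinical_summary` and the "none" fallback are illustrative choices, not the skill's confirmed behavior.

```python
from typing import Any, Dict, List


def clinical_summary(associations: List[Dict[str, Any]],
                     credible_sets: List[Dict[str, Any]],
                     p_threshold: float = 5e-8) -> str:
    """Format a short text summary from associations and credible sets."""
    significant = [a for a in associations if a["p_value"] < p_threshold]
    # Unique traits, keeping the order in which they were first seen
    traits = list(dict.fromkeys(a["trait"] for a in significant))
    genes = sorted({g["gene"] for cs in credible_sets
                    for g in cs.get("predicted_genes", [])})

    lines = [f"Genome-wide significant associations with {len(traits)} traits/diseases:"]
    lines += [f"  - {t}" for t in traits]
    lines.append("")
    lines.append(f"Identified in {len(credible_sets)} fine-mapped loci.")
    lines.append("Predicted causal genes: " + (", ".join(genes) or "none"))
    return "\n".join(lines)
```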

Example Usage

See QUICK_START.md for platform-specific examples.

Tools Used

GWAS Catalog Tools

  1. gwas_get_snp_by_id: Get SNP annotation
  2. gwas_get_associations_for_snp: Get all trait associations

Open Targets Tools

  1. OpenTargets_get_variant_info: Get variant details with population frequencies
  2. OpenTargets_get_variant_credible_sets: Get fine-mapping credible sets with L2G

Interpretation Guide

P-value Significance Levels

  • p < 5e-8: Genome-wide significant (strong evidence)
  • p < 5e-6: Suggestive (moderate evidence)
  • p < 0.05: Nominal (weak evidence)
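
Read as tiers, each level applies only when the stricter one above it does not. A small classifier makes that explicit; the function name and the "not significant" label for p ≥ 0.05 are additions for the example.

```python
def classify_p_value(p: float) -> str:
    """Map a p-value onto the significance tiers listed above."""
    if not 0 <= p <= 1:
        raise ValueError(f"p-value must be in [0, 1], got {p}")
    if p < 5e-8:
        return "genome-wide significant"
    if p < 5e-6:
        return "suggestive"
    if p < 0.05:
        return "nominal"
    return "not significant"
```

A value exactly at a cutoff falls into the weaker tier, because every comparison is a strict less-than: `classify_p_value(5e-8)` returns "suggestive".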

L2G Score Interpretation

  • > 0.5: High confidence causal gene
  • 0.1-0.5: Moderate confidence
  • < 0.1: Low confidence
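
When a variant sits in several credible sets, the same gene can receive several L2G scores. One possible approach, sketched below, keeps the best score per gene and labels it with the bands above. Taking the maximum is an assumption made here, as are the function names; the `predicted_genes`, `gene`, and `score` keys follow the credible set example earlier in this document.

```python
from typing import Any, Dict, List, Tuple


def l2g_confidence(score: float) -> str:
    """Label an L2G score using the bands above (exactly 0.5 counts as moderate)."""
    if score > 0.5:
        return "high"
    if score >= 0.1:
        return "moderate"
    return "low"


def rank_genes(credible_sets: List[Dict[str, Any]]) -> List[Tuple[str, float, str]]:
    """Return (gene, best score, label) tuples, highest score first."""
    best: Dict[str, float] = {}
    for cs in credible_sets:
        for g in cs.get("predicted_genes", []):
            best[g["gene"]] = max(g["score"], best.get(g["gene"], 0.0))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [(gene, score, l2g_confidence(score)) for gene, score in ranked]
```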

Clinical Actionability

  1. High: Multiple genome-wide significant associations + in credible sets + high L2G scores
  2. Moderate: Genome-wide significant associations but limited fine-mapping
  3. Low: Suggestive associations or limited replication
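
The three tiers can be approximated in code. The rule below is a simplified reading of the list above, not the skill's confirmed logic: it treats "multiple" as at least two genome-wide significant associations and "high L2G" as any score above 0.5. Both cutoffs are exposed as parameters because they are assumptions.

```python
from typing import Any, Dict, List


def clinical_actionability(associations: List[Dict[str, Any]],
                           credible_sets: List[Dict[str, Any]],
                           gws_threshold: float = 5e-8,
                           l2g_high: float = 0.5,
                           min_gws_hits: int = 2) -> str:
    """Return 'High', 'Moderate', or 'Low' from the evidence collected so far."""
    gws_hits = sum(1 for a in associations if a["p_value"] < gws_threshold)
    has_high_l2g = any(g["score"] > l2g_high
                       for cs in credible_sets
                       for g in cs.get("predicted_genes", []))

    if gws_hits >= min_gws_hits and credible_sets and has_high_l2g:
        return "High"
    if gws_hits >= 1:
        return "Moderate"  # significant, but fine-mapping support is limited
    return "Low"           # suggestive or weaker associations only
```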

Limitations

  1. Variant ID Conversion: Open Targets identifies variants as chr_pos_ref_alt (e.g., 10_112998590_C_T for rs7903146), so an rsID must first be resolved to its position and alleles
  2. Population Specificity: Associations may vary by ancestry
  3. Effect Sizes: Beta values are study-dependent (different phenotype scales)
  4. Causality: Associations don't prove causation; fine-mapping improves confidence
  5. Currency: Data reflects published GWAS; latest studies may not be included
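
For the variant ID conversion in limitation 1, a helper along these lines builds the chr_pos_ref_alt string from the fields in the SNP Basic Info output. It is a sketch under two assumptions: the coordinates from both sources refer to the same genome build, and the alleles are plain nucleotide strings. The function name is hypothetical.

```python
import re

_ALLELE = re.compile(r"[ACGT]+")


def to_variant_id(chromosome, position, ref_allele: str, alt_allele: str) -> str:
    """Build a chr_pos_ref_alt identifier, e.g. '10_112998590_C_T'."""
    chrom = str(chromosome)
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # accept 'chr10' as well as '10'

    ref, alt = ref_allele.upper(), alt_allele.upper()
    for allele in (ref, alt):
        if not _ALLELE.fullmatch(allele):
            raise ValueError(f"unsupported allele: {allele!r}")

    return f"{chrom}_{int(position)}_{ref}_{alt}"
```

With the rs7903146 values shown earlier, `to_variant_id("10", 112998590, "C", "T")` gives `"10_112998590_C_T"`.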

Best Practices

  1. Use Full Interpretation: Enable include_credible_sets=True for clinical decisions
  2. Check Multiple Variants: Look at other variants in the same locus
  3. Validate Populations: Consider ancestry-specific effect sizes
  4. Review Publications: Check original studies for context
  5. Integrate Evidence: Combine with functional data, eQTLs, pQTLs

Technical Notes

Performance

  • Fast mode (no credible sets): 2-5 seconds
  • Full mode (with credible sets): 10-30 seconds
  • Bottleneck: OpenTargets GraphQL API rate limits

Error Handling

  • Invalid rs_id: Returns error message
  • No associations: Returns empty list with note
  • API failures: Graceful degradation (returns partial results)
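
Graceful degradation can be illustrated with a small wrapper that runs one step, and on failure records a warning and substitutes a default so the remaining steps still produce a partial report. The helper name `run_step` and the warning text are illustrative; the skill's real error handling may differ.

```python
from typing import Any, Callable, List


def run_step(name: str, func: Callable[[], Any], default: Any,
             warnings: List[str]) -> Any:
    """Run one workflow step; on failure, note it and return the default."""
    try:
        return func()
    except Exception as exc:  # broad on purpose: any API failure degrades this step only
        warnings.append(f"{name} unavailable: {exc}")
        return default
```

A caller might use `run_step("credible_sets", fetch_credible_sets, [], warnings)`, where `fetch_credible_sets` is whatever zero-argument function performs the Open Targets query, and then attach `warnings` to the report so users can see which sections are incomplete.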

Related Skills

  • Gene Function Analysis: Interpret predicted causal genes
  • Disease Ontology Lookup: Understand trait classifications
  • PubMed Literature Search: Find original GWAS publications
  • Variant Effect Prediction: Functional consequence analysis

References

  1. GWAS Catalog: https://www.ebi.ac.uk/gwas/
  2. Open Targets Genetics: https://genetics.opentargets.org/
  3. GWAS Significance Thresholds: Fadista et al. 2016
  4. L2G Method: Mountjoy et al. 2021 (Nature Genetics)

Version

  • Version: 1.0.0
  • Last Updated: 2026-02-13
  • ToolUniverse Version: >= 1.0.0
  • Tools Required: gwas_get_snp_by_id, gwas_get_associations_for_snp, OpenTargets_get_variant_credible_sets