skills/wu-yc/labclaw/tooluniverse-gwas-drug-discovery

tooluniverse-gwas-drug-discovery

SKILL.md

GWAS-to-Drug Target Discovery

Transform genome-wide association studies (GWAS) into actionable drug targets and repurposing opportunities.

Overview

This skill bridges genetic discoveries from GWAS with drug development by:

  1. Identifying genetic risk factors - Finding genes associated with diseases
  2. Assessing druggability - Evaluating which genes can be targeted by drugs
  3. Prioritizing targets - Ranking candidates by genetic evidence strength
  4. Finding existing drugs - Discovering approved/investigational compounds
  5. Identifying repurposing opportunities - Matching drugs to new indications

Why This Matters

From Genetics to Therapeutics: GWAS has identified thousands of disease-associated variants, but most haven't been translated into therapies. This skill accelerates that translation.

Success Stories:

  • PCSK9 (cholesterol) → Alirocumab, Evolocumab (approved 2015)
  • IL-6R (rheumatoid arthritis) → Tocilizumab (approved 2010)
  • CTLA4 (autoimmunity) → Abatacept (approved 2005)
  • CFTR (cystic fibrosis) → Ivacaftor (approved 2012)

Genetic Evidence Doubles Success Rate: Targets with genetic support have 2x higher probability of clinical approval (Nelson et al., Nature Genetics 2015).

Core Concepts

1. GWAS Evidence Strength

Not all genetic associations are equal. Consider:

  • P-value - Statistical significance (genome-wide: p < 5×10⁻⁸)
  • Effect size (beta/OR) - Magnitude of genetic effect
  • Replication - Confirmed in multiple studies
  • Sample size - Larger studies = more reliable
  • Population diversity - Validated across ancestries

2. Druggability Criteria

A good drug target must be:

  • Accessible - Protein location allows drug binding (extracellular > intracellular)
  • Modality match - Target class fits drug type (GPCR → small molecule, receptor → antibody)
  • Tractable - Binding pocket suitable for drug design
  • Safe - Minimal off-target effects, not essential in all tissues

3. Target Prioritization Framework

GWAS Evidence (40%):

  • Multiple independent SNPs = stronger signal
  • Functional variants (missense > intronic)
  • Tissue-specific expression matches disease

Druggability (30%):

  • Known druggable protein family
  • Structural data available
  • Existing chemical matter

Clinical Evidence (20%):

  • Prior safety data
  • Validated disease models
  • Biomarker availability

Commercial Factors (10%):

  • Patent landscape
  • Market size
  • Competitive positioning

4. Drug Repurposing Logic

Repurposing works when:

  1. Shared genetic architecture - Same gene implicated in multiple diseases
  2. Pathway overlap - Related biological mechanisms
  3. Opposite effects - Drug's mechanism counteracts disease pathology
  4. Proven safety - Approved drug = de-risked

Example: Metformin (T2D drug) being tested for:

  • Cancer (AMPK activation)
  • Aging (mitochondrial effects)
  • PCOS (insulin sensitization)

Workflow Steps

Step 1: GWAS Gene Discovery

Input: Disease/trait name (e.g., "type 2 diabetes", "Alzheimer disease")

Process:

  • Query GWAS Catalog for associations
  • Filter by significance threshold (p < 5×10⁻⁸)
  • Map variants to genes (nearest, eQTL, fine-mapping)
  • Aggregate evidence across studies

Output: List of genes with genetic support

Tools Used:

  • gwas_get_associations_for_trait - Get associations by disease
  • gwas_search_associations - Flexible search
  • gwas_get_associations_for_snp - SNP-specific associations
  • OpenTargets_search_gwas_studies_by_disease - Curated GWAS data
  • OpenTargets_get_variant_credible_sets - Fine-mapped loci with L2G predictions

Step 2: Druggability Assessment

Input: Gene list from Step 1

Process:

  • Check target class (GPCR, kinase, ion channel, etc.)
  • Assess tractability (antibody, small molecule)
  • Evaluate safety (expression profile, essentiality)
  • Check for tool compounds or crystal structures

Output: Druggability score (0-1) + modality recommendations

Tools Used:

  • OpenTargets_get_target_tractability_by_ensemblID - Druggability assessment
  • OpenTargets_get_target_classes_by_ensemblID - Target classification
  • OpenTargets_get_target_safety_profile_by_ensemblID - Safety data
  • OpenTargets_get_target_genomic_location_by_ensemblID - Genomic context

Step 3: Target Prioritization

Input: Genes with GWAS + druggability data

Process:

  • Calculate composite score: genetic evidence × druggability
  • Rank targets by score
  • Add qualitative factors (novelty, competitive landscape)
  • Generate target dossiers

Output: Ranked list of drug target candidates

Scoring Formula:

Target Score = (GWAS Score × 0.4) + (Druggability × 0.3) + (Clinical Evidence × 0.2) + (Novelty × 0.1)

Step 4: Existing Drug Search

Input: Prioritized target list

Process:

  • Search drug-target associations (ChEMBL, DGIdb)
  • Find approved drugs, clinical candidates, tool compounds
  • Get mechanism of action, indication, phase
  • Check for off-label use or failed trials

Output: Drug-target pairs with development status

Tools Used:

  • OpenTargets_get_associated_drugs_by_disease_efoId - Known drugs for disease
  • OpenTargets_get_drug_mechanisms_of_action_by_chemblId - Drug MOA
  • ChEMBL_get_target_activities - Bioactivity data
  • ChEMBL_get_drug_mechanisms - Drug mechanisms
  • ChEMBL_search_drugs - Drug search

Step 5: Clinical Evidence

Input: Drug candidates

Process:

  • Check clinical trial history (ClinicalTrials.gov)
  • Review safety profile (FDA labels, adverse events)
  • Assess pharmacology (PK/PD, formulation)
  • Evaluate regulatory path

Output: Clinical risk assessment

Tools Used:

  • FDA_get_adverse_reactions_by_drug_name - Safety data
  • FDA_get_active_ingredient_info_by_drug_name - Drug composition
  • OpenTargets_get_drug_warnings_by_chemblId - Drug warnings

Step 6: Repurposing Opportunities

Input: Approved drugs + new disease associations

Process:

  • Match drug targets to new disease genes
  • Assess mechanistic fit (agonist vs antagonist)
  • Check contraindications
  • Estimate repurposing probability

Output: Repurposing candidates with rationale

Repurposing Score:

  • Genetic overlap: Gene targeted by drug = gene implicated in new disease
  • Clinical feasibility: Dosing, route, safety profile compatible
  • Regulatory path: Faster approval (Phase II vs Phase I)

Use Cases

Use Case 1: Novel Target Discovery for Rare Disease

Scenario: Identify druggable targets for Huntington's disease

Steps:

  1. Get GWAS hits for Huntington's → HTT, PDE10A, MSH3
  2. Assess druggability → PDE10A (phosphodiesterase) = high
  3. Find existing PDE10A inhibitors → Multiple tool compounds
  4. Recommendation: Develop selective PDE10A inhibitor

Clinical Context:

  • HTT (huntingtin) = difficult to drug (large, scaffold protein)
  • PDE10A = modifier gene, GPCR-coupled, small molecule tractable
  • Precedent: PDE5 inhibitors (sildenafil) already approved

Use Case 2: Drug Repurposing for Common Disease

Scenario: Find repurposing opportunities for Alzheimer's disease

Steps:

  1. Get GWAS targets → APOE, CLU, CR1, PICALM, BIN1, TREM2
  2. Find drugs targeting these → Anti-inflammatory drugs (CR1, TREM2)
  3. Match approved drugs → Anakinra (IL-1R antagonist)
  4. Rationale: TREM2 links inflammation to neurodegeneration

Example Output:

Repurposing Candidate: Anakinra
- Target: IL-1R → affects TREM2 pathway
- Current use: Rheumatoid arthritis (approved)
- AD rationale: 3 GWAS genes in immune pathway
- Clinical phase: Phase II trial in progress
- Safety: Known profile, subcutaneous injection

Use Case 3: Target Validation for Existing Drug Class

Scenario: Validate new diabetes targets related to GLP-1 pathway

Steps:

  1. Get T2D GWAS genes → TCF7L2, PPARG, KCNJ11, GLP1R
  2. GLP1R validated → Existing drug class (semaglutide, liraglutide)
  3. Check related genes → GIP, GIPR (glucose-dependent insulinotropic polypeptide)
  4. Outcome: Dual GLP-1/GIP agonists (tirzepatide, approved 2022)

Druggability Assessment Deep Dive

Target Classes (by Druggability)

Tier 1: High Druggability

  • GPCRs (33% of approved drugs) - Extracellular binding, established chemistry
  • Kinases (18% of approved drugs) - ATP-competitive inhibitors, allosteric sites
  • Ion channels (15% of approved drugs) - Blocking/opening channels
  • Nuclear receptors - Ligand-binding domains

Tier 2: Moderate Druggability

  • Proteases - Active site inhibitors
  • Phosphatases - Challenging selectivity
  • Epigenetic targets - Readers, writers, erasers

Tier 3: Difficult to Drug

  • Transcription factors - No obvious binding pocket
  • Scaffold proteins - Large, flat surfaces
  • RNA targets - Emerging modality

Modality Selection

Small Molecules:

  • Target: Intracellular proteins, enzymes
  • Advantages: Oral bioavailability, CNS penetration
  • Disadvantages: Off-target effects, development time
  • Examples: Kinase inhibitors, GPCR antagonists

Antibodies:

  • Target: Extracellular proteins, receptors
  • Advantages: High specificity, long half-life
  • Disadvantages: Expensive, injection-only, no CNS
  • Examples: PD-1 inhibitors, TNF-α blockers

Antisense/RNAi:

  • Target: mRNA (any gene)
  • Advantages: Sequence-specific, undruggable targets
  • Disadvantages: Delivery challenges, liver-centric
  • Examples: Patisiran (TTR), nusinersen (SMN)

Gene Therapy:

  • Target: Genetic defects
  • Advantages: One-time treatment, curative potential
  • Disadvantages: Immunogenicity, manufacturing complexity
  • Examples: Luxturna (RPE65), Zolgensma (SMN1)

Clinical Translation Considerations

Regulatory Requirements

IND (Investigational New Drug) Application:

  • Pharmacology and toxicology
  • Manufacturing information
  • Clinical protocols and investigator information

Clinical Trial Phases:

  • Phase I: Safety, dosing (20-100 healthy volunteers)
  • Phase II: Efficacy, side effects (100-300 patients)
  • Phase III: Confirmatory trials (1,000-3,000 patients)
  • Phase IV: Post-market surveillance

Repurposing Advantages:

  • Skip Phase I if dosing similar
  • Shorter timelines (2-4 years vs 10-15)
  • Lower costs ($50M vs $2B)

Success Rate Benchmarks

Traditional Drug Development (Wong et al., Biostatistics 2019):

  • Phase I → II: 63%
  • Phase II → III: 31%
  • Phase III → Approval: 58%
  • Overall: 12% (from Phase I to approval)

With Genetic Evidence (King et al., PLOS Genetics 2019):

  • Phase I → Approval: 24% (2× improvement)
  • Phase II → Approval: 38% vs 18% (no genetic support)

Cost and Timeline

Traditional Development:

  • Pre-clinical: 3-6 years, $500M
  • Clinical trials: 6-7 years, $1-1.5B
  • Total: 10-15 years, $2-2.5B

Repurposing:

  • Pre-clinical: 1-2 years, $50M
  • Clinical trials: 2-3 years, $100-200M
  • Total: 3-5 years, $150-250M

Best Practices

1. Multi-Ancestry GWAS

Why: Genetic architecture varies across populations

Approach:

  • Include trans-ethnic meta-analyses
  • Check replication in multiple ancestries
  • Consider population-specific variants

Example: APOL1 kidney disease variants (African ancestry-specific)

2. Functional Validation

GWAS alone is not enough - need mechanistic support:

  • eQTL analysis: Variant affects gene expression?
  • pQTL analysis: Variant affects protein levels?
  • Colocalization: GWAS + eQTL signals overlap?
  • Fine-mapping: Which variant(s) are causal?

Tools for validation:

  • GTEx (tissue-specific expression)
  • ENCODE (regulatory elements)
  • gnomAD (variant frequency, constraint)

3. Network and Pathway Analysis

Beyond Single Genes:

  • Group GWAS hits by pathway (KEGG, Reactome)
  • Identify druggable nodes in disease network
  • Consider combination therapies

Example: Alzheimer's GWAS →

  • Immune cluster (TREM2, CR1, CLU)
  • Lipid cluster (APOE, ABCA7)
  • Endocytosis (BIN1, PICALM)

4. Safety Liability Assessment

Red Flags:

  • Essential gene (loss-of-function lethal)
  • Broad expression (on-target toxicity)
  • Off-target kinase panel (promiscuity)
  • hERG inhibition (cardiotoxicity)
  • CYP450 interactions (drug-drug interactions)

Tools:

  • gnomAD pLI (intolerance to loss-of-function)
  • GTEx expression (tissue specificity)
  • PharmaGKB (pharmacogenomics)

5. Intellectual Property Landscape

Patent Considerations:

  • Target patents (composition of matter)
  • Method of use patents (indication-specific)
  • Formulation patents (delivery)

Freedom to Operate:

  • Existing patents on target
  • Blocking patents on drug class
  • Expired patents (generic opportunity)

Limitations and Caveats

GWAS Limitations

1. Association ≠ Causation

  • Linkage disequilibrium = true causal variant may differ
  • Pleiotropy = gene affects multiple traits
  • Confounding = population stratification

Solution: Fine-mapping, functional studies, Mendelian randomization

2. Missing Heritability

  • Common variants explain ~10-50% of heritability
  • Rare variants, structural variants, epigenetics matter
  • Gene-environment interactions

Solution: Whole-genome sequencing, family studies

3. Druggable ≠ Effective

  • Can bind target ≠ modulates disease
  • Right direction (agonist vs antagonist)?
  • Right tissue (CNS penetration)?

Solution: Experimental validation, disease models

Target Validation Challenges

1. Mouse Models ≠ Humans

  • 95% of drugs work in mice, 5% in humans
  • Species differences (immune system)
  • Acute models ≠ chronic disease

Solution: Human cell models (iPSCs, organoids), humanized mice

2. Genetic Perturbation ≠ Pharmacology

  • Knockout = complete loss, drug = partial inhibition
  • Timing matters (developmental vs adult)
  • Compensation in knockout

Solution: Inducible knockouts, tool compounds

3. Efficacy ≠ Safety

  • On-target toxicity (essential gene)
  • Off-target effects (selectivity)
  • Dose-limiting side effects

Solution: Therapeutic index assessment, biomarkers

Ethical and Regulatory Considerations

Human Genetics Research

Informed Consent:

  • Secondary use of GWAS data
  • Return of results policies
  • Privacy protections (de-identification)

Equity:

  • Most GWAS = European ancestry (78%)
  • Risk: Drugs may not work equally across populations
  • Solution: Diversify GWAS cohorts

Clinical Trials

Study Design:

  • Stratification by genetics (precision medicine)
  • Adaptive trials (basket, umbrella designs)
  • Real-world evidence (pragmatic trials)

Patient Selection:

  • Enrichment by genotype (higher response rate)
  • Ethics of genetic testing for trial entry
  • Cost-effectiveness of stratified medicine

Regulatory Pathways

FDA Breakthrough Therapy:

  • Substantial improvement over existing
  • Expedited review (6 months vs 10 months)
  • Examples: CAR-T therapies, gene therapies

Accelerated Approval:

  • Based on surrogate endpoints
  • Post-market confirmation required
  • Risk: Approval withdrawal if confirmatory fails

Resources and References

Databases

GWAS:

Drugs:

  • ChEMBL - Bioactivity database
  • DrugBank - Comprehensive drug information
  • DGIdb - Drug-gene interactions

Targets:

Clinical:

Key Literature

Genetic Evidence for Drug Targets:

  • Nelson et al. (2015) Nature Genetics - Genetic support doubles clinical success
  • King et al. (2019) PLOS Genetics - Systematic analysis of target success

GWAS to Function:

  • Visscher et al. (2017) American Journal of Human Genetics - 10 years of GWAS
  • Claussnitzer et al. (2020) Nature Reviews Genetics - From GWAS to biology

Drug Repurposing:

  • Pushpakom et al. (2019) Nature Reviews Drug Discovery - Repurposing opportunities
  • Shameer et al. (2018) Nature Biotechnology - Computational repurposing

Success Stories:

  • Plenge et al. (2013) Nature Reviews Drug Discovery - IL-6R to tocilizumab
  • Cohen et al. (2006) Science - PCSK9 to evolocumab

Disclaimer

For Research Purposes Only

This skill is designed for:

  • Target discovery and validation
  • Drug repurposing hypothesis generation
  • Preclinical research planning

NOT for:

  • Clinical decision-making
  • Patient treatment recommendations
  • Regulatory submissions (without validation)

Important Notes:

  • All targets require experimental validation
  • GWAS evidence is correlational, not causal
  • Regulatory approval requires extensive preclinical and clinical data
  • Consult domain experts (geneticists, pharmacologists, clinicians)

Liability: The authors assume no liability for actions taken based on this analysis. All therapeutic development requires rigorous validation and regulatory oversight.

Version History

  • v1.0.0 (2026-02-13): Initial release with GWAS-to-drug workflow
    • Support for GWAS Catalog, Open Targets, ChEMBL, FDA tools
    • Target discovery, druggability assessment, repurposing identification
    • Comprehensive documentation with examples

Future Enhancements

Planned Features:

  • Integration with UK Biobank for larger-scale GWAS
  • PheWAS (phenome-wide association studies) for pleiotropic effects
  • Mendelian randomization for causal inference
  • Network-based target prioritization
  • AI-powered structure-activity relationship (SAR) prediction
  • Clinical trial matching for repurposing candidates

Tool Additions:

  • PDB (Protein Data Bank) for structural druggability
  • STRING for protein-protein interaction networks
  • DisGeNET for disease-gene associations
  • ClinVar for pathogenic variant interpretation

Contact

For questions, issues, or contributions:

  • GitHub: [ToolUniverse Repository]
  • Documentation: [skills/tooluniverse-gwas-drug-discovery/]
  • Email: tooluniverse@example.com
Weekly Installs
3
Repository
wu-yc/labclaw
GitHub Stars
646
First Seen
3 days ago
Installed on
gemini-cli3
claude-code3
github-copilot3
codex3
kimi-cli3
cursor3