tooluniverse-rare-disease-diagnosis
Rare Disease Diagnosis Advisor
Systematic diagnosis support for rare diseases using phenotype matching, gene panel prioritization, and variant interpretation across Orphanet, OMIM, HPO, ClinVar, and structure-based analysis.
KEY PRINCIPLES:
- Report-first - Create report file FIRST, update progressively
- Phenotype-driven - Convert symptoms to HPO terms before searching
- Multi-database triangulation - Cross-reference Orphanet, OMIM, OpenTargets
- Evidence grading - Grade diagnoses by supporting evidence strength
- English-first queries - Always use English terms in tool calls
LOOK UP, DON'T GUESS
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory.
COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Clinical Reasoning Framework (BEFORE Tools)
Apply these strategies to form a 3-5 candidate differential, then use tools to confirm/refute:
- Multi-system involvement - Symptoms spanning 2+ organ systems = strongest rare disease signal. Ask: what single pathway explains ALL features?
- Regression question - Losing abilities vs never acquired? Regression = neurodegenerative/metabolic storage. Stable = developmental/structural.
- Trigger question - Episodic/triggered (fasting, illness, exercise) = metabolic disorder (often treatable). Constitutive = structural/degenerative.
- Rarest feature first - Build differential from most specific finding, not most prominent. Check remaining features for consistency.
- Treatable-first - Move treatable conditions to top for urgent workup (enzyme replacement, dietary, chelation, vitamin-responsive).
- Occupational/environmental exposure - Latency up to 50 years. Asbestos/silica/heavy metals/solvents/farming. Always ask about PAST jobs.
- Autoimmune differential - Which joints? Symmetric? Extra-articular? Serologic pattern? Organ under attack?
- Rare syndrome signals - Named triads, common diagnoses failing to explain ALL findings, failed standard treatment, unusual lab findings.
- Tools verify, not generate - Form hypothesis first, then use databases to confirm.
Common pitfalls: Felty's (RA+splenomegaly+neutropenia) mimics infection; SLE nephritis mimics PSGN (check ASO); occupational exposures trigger autoimmunity (silica→scleroderma/RA/SLE).
Tool Parameter Corrections
| Tool | WRONG | CORRECT |
|---|---|---|
| OpenTargets_get_associated_drugs_by_target_ensemblID | ensemblID | ensemblId |
| ClinVar_get_variant_details | variant_id | id |
| MyGene_query_genes | gene | q |
| gnomad_get_variant | variant | variant_id |
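The corrections listed above can be applied mechanically before a tool call. The following is a minimal sketch, not part of ToolUniverse: `PARAM_FIXES` and `fix_params` are hypothetical names introduced here, and only the four tool/parameter pairs from the table are encoded.

```python
# Parameter-name corrections copied from the table: tool -> {wrong: correct}.
PARAM_FIXES = {
    "OpenTargets_get_associated_drugs_by_target_ensemblID": {"ensemblID": "ensemblId"},
    "ClinVar_get_variant_details": {"variant_id": "id"},
    "MyGene_query_genes": {"gene": "q"},
    "gnomad_get_variant": {"variant": "variant_id"},
}


def fix_params(tool_name, params):
    """Return a copy of params with known-wrong keys renamed (hypothetical helper)."""
    fixes = PARAM_FIXES.get(tool_name, {})
    # Keys without a known correction pass through unchanged.
    return {fixes.get(key, key): value for key, value in params.items()}


corrected = fix_params("MyGene_query_genes", {"gene": "CFTR"})
print(corrected)
```

Tools that are not in the mapping are returned untouched, so the helper is safe to wrap around every call.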
Workflow
Phase 0: Clinical Reasoning → 3-5 candidate differential
Phase 1: Phenotype → HPO terms (HPO_search_terms), core vs variable, onset, family history
Phase 2: Disease Matching → Orphanet_search_diseases, OMIM_search, DisGeNET_search_gene
Phase 3: Gene Panel → ClinGen validation, GTEx expression, prioritization scoring
Phase 3.5: Expression Context → CELLxGENE, ChIPAtlas for tissue/cell-type confirmation
Phase 3.6: Pathway Analysis → KEGG, IntAct for convergent pathways
Phase 4: Variant Interpretation → ClinVar, gnomAD frequency, CADD/AlphaMissense/EVE/SpliceAI, ACMG criteria
Phase 5: Structure Analysis → AlphaFold2, InterPro domains (for VUS)
Phase 6: Literature → PubMed, BioRxiv/MedRxiv, OpenAlex
Phase 7: Report Synthesis → Prioritized differential with next steps
Key Phase Details
Phase 2 - Disease Matching: Orphanet_search_diseases(operation="search_diseases", query=keyword) then Orphanet_get_genes(operation="get_genes", orpha_code=code). Score overlap: Excellent >80%, Good 60-80%, Possible 40-60%.
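The overlap bands in Phase 2 can be computed directly once the patient's HPO terms and a disease's annotated HPO terms are in hand. This sketch assumes overlap means the fraction of patient terms that appear in the disease's annotation set; the source does not define the formula, so treat that as an assumption. The function names, the "Below threshold" label, and the HPO IDs in the usage lines are illustrative only.

```python
def phenotype_overlap(patient_terms, disease_terms):
    """Fraction of patient HPO terms found in the disease annotation (assumed definition)."""
    patient = set(patient_terms)
    if not patient:
        return 0.0
    return len(patient & set(disease_terms)) / len(patient)


def overlap_label(score):
    """Map an overlap fraction to the bands listed for Phase 2."""
    if score > 0.80:
        return "Excellent"
    if score >= 0.60:
        return "Good"
    if score >= 0.40:
        return "Possible"
    return "Below threshold"  # label assumed; the source gives no name for <40%


# Illustrative IDs only; real term lists come from the HPO and Orphanet lookups.
patient = ["HP:0001250", "HP:0001263", "HP:0000252", "HP:0001252"]
disease = ["HP:0001250", "HP:0001263", "HP:0000252", "HP:0000365"]
score = phenotype_overlap(patient, disease)
print(f"{score:.0%} -> {overlap_label(score)}")
```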
Phase 3 - Gene Panel: ClinGen classification drives inclusion (Definitive/Strong/Moderate = include; Limited = flag; Disputed/Refuted = exclude). Scoring: Tier 1 (top disease gene +5), Tier 2 (multi-disease +3), Tier 3 (ClinGen Definitive +3), Tier 4 (tissue expression +2), Tier 5 (pLI >0.9 +1).
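A minimal sketch of the Phase 3 inclusion rule and scoring follows. It assumes the five tier bonuses are additive, which the source implies but does not state, and it flags any unrecognized ClinGen label rather than guessing. `GeneEvidence`, `panel_status`, and `priority_score` are hypothetical names, and `GENE_A` is a placeholder symbol.

```python
from dataclasses import dataclass

INCLUDE = {"Definitive", "Strong", "Moderate"}
EXCLUDE = {"Disputed", "Refuted"}


@dataclass
class GeneEvidence:
    symbol: str
    clingen: str                    # ClinGen gene-disease validity label
    top_disease_gene: bool = False  # Tier 1
    multi_disease: bool = False     # Tier 2
    tissue_expressed: bool = False  # Tier 4
    pli: float = 0.0                # Tier 5 uses pLI > 0.9


def panel_status(clingen):
    """Include, flag, or exclude a gene based on its ClinGen classification."""
    if clingen in INCLUDE:
        return "include"
    if clingen in EXCLUDE:
        return "exclude"
    return "flag"  # "Limited" per the source; unknown labels are flagged too (assumption)


def priority_score(gene):
    """Sum the tier bonuses (additivity is an assumption)."""
    score = 0
    score += 5 if gene.top_disease_gene else 0
    score += 3 if gene.multi_disease else 0
    score += 3 if gene.clingen == "Definitive" else 0
    score += 2 if gene.tissue_expressed else 0
    score += 1 if gene.pli > 0.9 else 0
    return score


candidate = GeneEvidence("GENE_A", "Definitive", top_disease_gene=True, pli=0.95)
print(panel_status(candidate.clingen), priority_score(candidate))
```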
Phase 4 - Variants: gnomAD frequency classes: ultra-rare <0.00001, rare <0.0001, low-freq <0.01. ACMG: PVS1 (null), PS1 (same AA), PM2 (absent pop), PP3 (computational), BA1 (>5% AF). 2+ concordant predictors strengthen PP3.
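The frequency classes and the PP3 concordance rule reduce to a few comparisons. In this sketch the cut-offs come from the Phase 4 text; the "absent" and "common" labels, the handling of a missing frequency, and both function names are assumptions added for illustration. Predictor calls are passed in as booleans because the source gives no score thresholds for CADD, AlphaMissense, EVE, or SpliceAI.

```python
def frequency_class(allele_frequency):
    """Classify a gnomAD allele frequency using the Phase 4 cut-offs."""
    if allele_frequency is None or allele_frequency == 0:
        return "absent"      # candidate for PM2 (absent from population data)
    if allele_frequency > 0.05:
        return "BA1"         # >5% AF, stand-alone benign
    if allele_frequency < 0.00001:
        return "ultra-rare"
    if allele_frequency < 0.0001:
        return "rare"
    if allele_frequency < 0.01:
        return "low-freq"
    return "common"          # 1% to 5%: label assumed, not named in the source


def pp3_supported(predictor_calls):
    """True when 2+ predictors concordantly call the variant damaging."""
    return sum(1 for damaging in predictor_calls.values() if damaging) >= 2


print(frequency_class(0.000005))
print(pp3_supported({"CADD": True, "AlphaMissense": True, "EVE": False}))
```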
Evidence Grading
| Tier | Criteria |
|---|---|
| T1 (High) | Phenotype match >80% + gene match |
| T2 (Medium-High) | Phenotype match 60-80% OR likely pathogenic variant |
| T3 (Medium) | Phenotype match 40-60% OR VUS in candidate gene |
| T4 (Low) | Phenotype <40% OR uncertain gene |
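The tiers can be assigned by checking the criteria from strongest to weakest and returning the first match. This is a sketch under stated assumptions: the table does not say how overlapping criteria are resolved, so a candidate with >80% phenotype match but no gene match falls through to T2 here. The function name and the `variant_class` string values are hypothetical.

```python
def evidence_tier(overlap, gene_match=False, variant_class=None):
    """Return T1-T4 for a candidate diagnosis; the first satisfied tier wins (assumed order).

    overlap: phenotype match as a fraction between 0 and 1.
    variant_class: "likely_pathogenic", "vus", or None (illustrative values).
    """
    if overlap > 0.80 and gene_match:
        return "T1"
    if overlap >= 0.60 or variant_class == "likely_pathogenic":
        return "T2"
    if overlap >= 0.40 or variant_class == "vus":
        return "T3"
    return "T4"


print(evidence_tier(0.85, gene_match=True))
print(evidence_tier(0.30, variant_class="vus"))
```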
Fallback Chains
| Primary | Fallback 1 | Fallback 2 |
|---|---|---|
| get_joint_associated_diseases_by_HPO_ID_list | Orphanet_search_diseases | PubMed phenotype search |
| ClinVar_get_variant_details | gnomad_get_variant | VEP annotation |
| GTEx_get_expression_summary | HPA_search_genes_by_query | Tissue-specific literature |
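A fallback chain can be run as an ordered list of callables, moving on when a step raises or returns nothing. The sketch below is tool-agnostic: `run_with_fallbacks` is a hypothetical helper, and the stub functions stand in for real tool calls, whose invocation style is not shown in this document.

```python
def run_with_fallbacks(steps):
    """Try each (name, callable) in order; return the first non-empty result.

    Returns (name, result, skipped), where skipped records why earlier steps failed.
    """
    skipped = []
    for name, call in steps:
        try:
            result = call()
        except Exception as exc:  # a failed tool call should not abort the chain
            skipped.append((name, f"error: {exc}"))
            continue
        if result:
            return name, result, skipped
        skipped.append((name, "empty result"))
    return None, None, skipped


# Stubs standing in for real tool calls (assumptions for illustration).
def primary_lookup():
    raise RuntimeError("service unavailable")


def first_fallback():
    return []


def second_fallback():
    return [{"source": "stub", "note": "placeholder record"}]


used, data, skipped = run_with_fallbacks([
    ("ClinVar_get_variant_details", primary_lookup),
    ("gnomad_get_variant", first_fallback),
    ("VEP annotation", second_fallback),
])
print(used, skipped)
```

Recording the skipped steps keeps the report honest about which source actually supplied each finding.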
Reference Files
- DIAGNOSTIC_WORKFLOW.md - Code examples and algorithms per phase
- REPORT_TEMPLATE.md - Report template and examples
- CHECKLIST.md - Interactive completeness checklist
- scripts/clinical_patterns.py - Clinical pattern lookup (syndromes, differentials, red flags, occupational exposures)