tooluniverse-infectious-disease
COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Infectious Disease Outbreak Intelligence
Rapid response system for emerging pathogens using taxonomy analysis, target identification, structure prediction, and computational drug repurposing.
KEY PRINCIPLES:
- Speed is critical - Optimize for rapid actionable intelligence
- Target essential proteins - Focus on conserved, essential viral/bacterial proteins
- Leverage existing drugs - Prioritize FDA-approved compounds for repurposing
- Structure-guided - Use NvidiaNIM for rapid structure prediction and docking
- Evidence-graded - Grade repurposing candidates by evidence strength
- Actionable output - Prioritized drug candidates with rationale
- English-first queries - Always use English terms in tool calls; respond in user's language
REASONING STRATEGY — Start Here: Start with pathogen identification: What type of organism? (virus, bacteria, fungus, parasite). Then ask:
- What are the essential proteins? (required for replication or viability — cannot be mutated away)
- Which are surface-exposed? (accessible to drugs and antibodies)
- Which are conserved across strains? (targeting conserved regions limits resistance escape)
These three questions define your drug targets and vaccine candidates. Organisms in the same genus often share targets, so look up drug precedent for related pathogens before predicting from scratch.
LOOK UP DON'T GUESS: Never assume a pathogen's taxonomy, genome size, or protein function. Always call BVBRC_search_taxonomy or UniProt_search first. Even well-known pathogens have strains with different drug susceptibility profiles — look up the specific strain when known.
When to Use
Apply when user asks:
- "New pathogen detected - what drugs might work?"
- "Emerging virus [X] - therapeutic options?"
- "Drug repurposing candidates for [pathogen]"
- "What do we know about [novel coronavirus/bacteria]?"
- "Essential targets in [pathogen] for drug development"
- "Can we repurpose [drug] against [pathogen]?"
Critical Workflow Requirements
1. Report-First Approach (MANDATORY)
- Create `[PATHOGEN]_outbreak_intelligence.md` FIRST with section headers
- Progressively update as data is gathered
- Output separate files: `[PATHOGEN]_drug_candidates.csv`, `[PATHOGEN]_target_proteins.csv`
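The report-first step can be sketched as a small scaffolding helper. This is a minimal sketch, not the skill's own implementation: the section headers below are an assumption derived from the workflow phases (the authoritative headers are in report_template.md), and the CSV column names are hypothetical.

```python
import csv
from pathlib import Path

# Assumed section headers, loosely following the workflow phases.
# The authoritative list lives in report_template.md.
REPORT_SECTIONS = [
    "Pathogen Profile",
    "Target Proteins",
    "Target Structures",
    "Drug Repurposing Candidates",
    "Pathway Analysis",
    "Literature Intelligence",
    "Recommendations",
]

# Hypothetical CSV columns; adjust to whatever the report actually needs.
CSV_COLUMNS = {
    "drug_candidates": ["drug", "target", "docking_score", "evidence_tier", "source"],
    "target_proteins": ["protein", "uniprot_id", "score", "source"],
}


def init_report(pathogen: str, out_dir: str = ".") -> dict:
    """Create the report skeleton and header-only CSVs before gathering data."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    report = out / f"{pathogen}_outbreak_intelligence.md"
    lines = [f"# {pathogen} Outbreak Intelligence", ""]
    for section in REPORT_SECTIONS:
        lines += [f"## {section}", "", "_Pending_", ""]
    report.write_text("\n".join(lines), encoding="utf-8")

    paths = {"report": report}
    for stem, columns in CSV_COLUMNS.items():
        path = out / f"{pathogen}_{stem}.csv"
        with path.open("w", newline="", encoding="utf-8") as fh:
            csv.writer(fh).writerow(columns)
        paths[stem] = path
    return paths
```

Calling `init_report("PATHOGEN_X")` at the start of a run leaves placeholders that later phases overwrite, so a partial report exists even if a later tool call fails.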
2. Citation Requirements (MANDATORY)
Every finding must have inline source attribution:
### Target: RNA-dependent RNA polymerase (RdRp)
- **UniProt**: P0DTD1 (NSP12)
- **Essentiality**: Required for replication
*Source: UniProt via `UniProt_search`, literature review*
Phase 0: Tool Verification
Known Parameter Corrections
| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| NCBIDatasets_get_taxonomy | name | tax_id (integer) or use BVBRC_search_taxonomy for keyword search |
| UniProt_search | name | query |
| ChEMBL_search_targets | query, target | pref_name__contains (substring match) |
| get_diffdock_info | protein_file | protein (content) |
| drugbank_full_search | (may fail) | Use drugbank_vocab_search as primary DrugBank lookup |
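One way to apply the corrections above is a pre-flight check that flags known-wrong parameter names before a tool is called. This is a minimal sketch: `lint_tool_call` is a hypothetical helper, not part of ToolUniverse, and it only reports problems rather than renaming arguments, because some corrections change the value type (for example a name string versus an integer `tax_id`).

```python
# Known-wrong parameter names per tool, copied from the table above.
KNOWN_WRONG_PARAMS = {
    "NCBIDatasets_get_taxonomy": {
        "name": "tax_id (integer), or use BVBRC_search_taxonomy for keyword search",
    },
    "UniProt_search": {"name": "query"},
    "ChEMBL_search_targets": {
        "query": "pref_name__contains",
        "target": "pref_name__contains",
    },
    "get_diffdock_info": {"protein_file": "protein (content)"},
}


def lint_tool_call(tool: str, arguments: dict) -> list:
    """Return human-readable problems for a planned tool call (empty if none)."""
    problems = []
    for wrong, correct in KNOWN_WRONG_PARAMS.get(tool, {}).items():
        if wrong in arguments:
            problems.append(f"{tool}: replace '{wrong}' with {correct}")
    if tool == "drugbank_full_search":
        problems.append("drugbank_full_search may fail; prefer drugbank_vocab_search")
    return problems


# Usage: check the arguments before dispatching the call.
for issue in lint_tool_call("UniProt_search", {"name": "spike glycoprotein"}):
    print(issue)
```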
PubMed tip: Use `sort="relevance"` (default), not `sort="pub_date"`; date-sorted queries can return empty for narrow topics. Tool name: `PubMed_search_articles`.
FDA labels: Use `FDA_get_drug_label_info_by_field_value` with targeted `return_fields` to avoid oversized responses from `OpenFDA_search_drug_labels`.
Workflow Overview
Phase 1: Pathogen Identification
├── Taxonomic classification (NCBI Taxonomy)
├── Closest relatives (for knowledge transfer)
├── Genome/proteome availability
└── OUTPUT: Pathogen profile
|
Phase 2: Target Identification
├── Essential genes/proteins (UniProt)
├── Conservation across strains
├── Druggability assessment (ChEMBL)
└── OUTPUT: Prioritized target list (scored by essentiality/conservation/druggability/precedent)
|
Phase 3: Structure Prediction (NvidiaNIM)
├── AlphaFold2/ESMFold for targets
├── Binding site identification
├── Quality assessment (pLDDT)
└── OUTPUT: Target structures (docking-ready if pLDDT > 70)
|
Phase 4: Drug Repurposing Screen
├── Approved drugs for related pathogens (ChEMBL)
├── Broad-spectrum antivirals/antibiotics
├── Docking screen (get_diffdock_info)
└── OUTPUT: Ranked candidate drugs
|
Phase 4.5: Pathway Analysis
├── KEGG: Pathogen metabolism pathways
├── Essential metabolic targets
├── Host-pathogen interaction pathways
└── OUTPUT: Pathway-based drug targets
|
Phase 5: Literature Intelligence
├── PubMed: Published outbreak reports
├── BioRxiv/MedRxiv: Recent preprints (CRITICAL for outbreaks)
├── ArXiv: Computational/ML preprints
├── OpenAlex: Citation tracking
├── ClinicalTrials.gov: Active trials
└── OUTPUT: Evidence synthesis
|
Phase 6: Report Synthesis
├── Top drug candidates with evidence grades
├── Clinical trial opportunities
├── Recommended immediate actions
└── OUTPUT: Final report
Phase Summaries
Phase 1: Pathogen Identification
Classify via NCBI Taxonomy (query param). Identify related pathogens with existing drugs for knowledge transfer. Determine genome/proteome availability.
Knowledge transfer principle: Drugs effective against related pathogens are the highest-priority repurposing candidates. A protease inhibitor for SARS-CoV-1 is immediately relevant to SARS-CoV-2. Look up the related pathogen's approved drugs in ChEMBL before generating candidates from first principles.
Phase 2: Target Identification
Search UniProt for pathogen proteins (reviewed). Check ChEMBL for drug precedent. Score targets by: Essentiality (30%), Conservation (25%), Druggability (25%), Drug precedent (20%). Aim for 5+ targets.
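The weighted scoring can be sketched as follows. The weights come from the paragraph above; the assumption that each component is normalized to the range 0 to 1 is mine, as are the function names.

```python
# Weights from the Phase 2 scoring scheme.
WEIGHTS = {
    "essentiality": 0.30,
    "conservation": 0.25,
    "druggability": 0.25,
    "precedent": 0.20,
}


def target_score(components: dict) -> float:
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    for name in WEIGHTS:
        if name not in components:
            raise ValueError(f"missing component: {name}")
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {components[name]}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def rank_targets(targets: dict) -> list:
    """Map of target name -> components; returns (name, score) pairs, best first."""
    scored = [(name, round(target_score(c), 3)) for name, c in targets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder inputs for illustration only, not real assessments.
example = {
    "target_A": {"essentiality": 1.0, "conservation": 0.8, "druggability": 0.6, "precedent": 1.0},
    "target_B": {"essentiality": 0.5, "conservation": 0.9, "druggability": 0.4, "precedent": 0.0},
}
for name, score in rank_targets(example):
    print(name, score)
```

Raising on a missing component, rather than defaulting it to zero, keeps an unassessed criterion from silently lowering a target's rank.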
Phase 3: Structure Prediction
Use NvidiaNIM AlphaFold2 for top 3 targets. Assess pLDDT confidence. Only dock structures with pLDDT > 70 (active site > 90 preferred). Fallback: alphafold_get_prediction or ESMFold_predict_structure.
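The pLDDT gate above can be expressed as a small check. This sketch assumes per-residue pLDDT values on a 0 to 100 scale and interprets "pLDDT > 70" as the mean over the whole chain; the source does not say whether a mean or another summary is intended, so treat that as an assumption. `docking_readiness` is a hypothetical helper name.

```python
from statistics import mean


def docking_readiness(plddt, active_site_indices=None) -> dict:
    """Summarize confidence for one predicted structure.

    plddt: per-residue pLDDT values (0-100 scale assumed).
    active_site_indices: optional 0-based residue indices of the binding site.
    """
    overall = mean(plddt)
    result = {
        "mean_plddt": overall,
        "dock_ready": overall > 70,       # only dock structures above 70
        "active_site_mean": None,
        "active_site_preferred": None,    # above 90 at the active site is preferred
    }
    if active_site_indices:
        site = mean(plddt[i] for i in active_site_indices)
        result["active_site_mean"] = site
        result["active_site_preferred"] = site > 90
    return result


# Placeholder values for illustration only.
print(docking_readiness([95, 92, 50, 50], active_site_indices=[0, 1]))
```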
Phase 4: Drug Repurposing Screen
Source candidates from: related pathogen drugs, broad-spectrum antivirals, target class drugs (DGIdb). Dock top 20+ candidates via get_diffdock_info. Rank by docking score and evidence tier.
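A possible ranking step is sketched below. Two assumptions are made that the text does not state: evidence tier takes priority over docking score, and a higher docking score is better (as with a confidence-style score). If the docking tool reports an energy where lower is better, flip the sign.

```python
# Lower number = stronger evidence tier.
TIER_ORDER = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}


def rank_candidates(candidates: list) -> list:
    """Sort by evidence tier first, then by docking score (higher assumed better)."""
    return sorted(
        candidates,
        key=lambda c: (TIER_ORDER[c["tier"]], -c["docking_score"]),
    )


# Placeholder candidates for illustration only.
candidates = [
    {"drug": "drug_A", "tier": "T4", "docking_score": 0.9},
    {"drug": "drug_B", "tier": "T2", "docking_score": 0.1},
    {"drug": "drug_C", "tier": "T2", "docking_score": 0.4},
]
for c in rank_candidates(candidates):
    print(c["drug"], c["tier"], c["docking_score"])
```

Putting the tier first means a purely computational hit never outranks a drug with clinical evidence, however well it docks.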
Phase 4.5: Pathway Analysis
Use KEGG to identify essential metabolic pathways. Map host-pathogen interaction points. Identify pathway-based drug targets beyond direct protein inhibition.
Phase 5: Literature Intelligence
Search PubMed (peer-reviewed), BioRxiv/MedRxiv (preprints - critical for outbreaks), ArXiv (computational), ClinicalTrials.gov (active trials). Track citations via OpenAlex. Note: preprints are NOT peer-reviewed.
Phase 6: Report Synthesis
Aggregate all findings into final report. Grade every candidate. Provide 3+ immediate actions, clinical trial opportunities, and research priorities.
Evidence Grading
| Tier | Symbol | Criteria | Example |
|---|---|---|---|
| T1 | [T1] | FDA approved for this pathogen | Remdesivir for COVID-19 |
| T2 | [T2] | Clinical trial evidence OR approved for related pathogen | Favipiravir |
| T3 | [T3] | In vitro activity OR strong docking + mechanism | Sofosbuvir |
| T4 | [T4] | Computational prediction only | Novel docking hits |
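The grading table can be read as a first-match rule, checked from the strongest evidence down. The sketch below encodes that reading; the flag names are hypothetical and a candidate with no supporting flags falls through to T4.

```python
def evidence_tier(
    approved_for_pathogen: bool = False,
    clinical_trial_evidence: bool = False,
    approved_for_related_pathogen: bool = False,
    in_vitro_activity: bool = False,
    strong_docking_with_mechanism: bool = False,
) -> str:
    """Return the strongest tier whose criteria are met."""
    if approved_for_pathogen:
        return "T1"
    if clinical_trial_evidence or approved_for_related_pathogen:
        return "T2"
    if in_vitro_activity or strong_docking_with_mechanism:
        return "T3"
    return "T4"  # computational prediction only


# Usage: tag a candidate before writing it to the report.
print(f"[{evidence_tier(in_vitro_activity=True)}]")
```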
Completeness Checklist
Phase 1: Pathogen ID
- Taxonomic classification complete
- Related pathogens identified
- Genome/proteome availability noted
Phase 2: Targets
- 5+ targets identified
- Essentiality documented
- Conservation assessed
- Drug precedent checked
Phase 3: Structures
- Structures predicted for top 3 targets
- pLDDT confidence reported
- Binding sites identified
Phase 4: Drug Screen
- 20+ candidates screened
- FDA-approved drugs prioritized
- Docking scores reported
- Top 5 candidates detailed
Phase 5: Literature
- Recent papers summarized
- Active trials listed
- Resistance data noted
Phase 6: Recommendations
- 3+ immediate actions
- Clinical trial opportunities
- Research priorities
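The numeric items in the checklist can be verified mechanically before delivery. This is a minimal sketch: the minimums are taken from the checklist above, while the key names and the `unmet_requirements` helper are hypothetical. Non-numeric items (for example "Resistance data noted") still need a manual check.

```python
# Minimum counts taken from the checklist; key names are illustrative.
MINIMUMS = {
    "targets_identified": 5,
    "structures_predicted": 3,
    "candidates_screened": 20,
    "candidates_detailed": 5,
    "immediate_actions": 3,
}


def unmet_requirements(counts: dict) -> list:
    """List every checklist minimum that the current counts fall short of."""
    return [
        f"{key}: {counts.get(key, 0)} of {minimum} required"
        for key, minimum in MINIMUMS.items()
        if counts.get(key, 0) < minimum
    ]


# Usage: an empty list means all numeric minimums are satisfied.
print(unmet_requirements({"targets_identified": 6, "candidates_screened": 12}))
```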
Fallback Chains
| Primary Tool | Fallback 1 | Fallback 2 |
|---|---|---|
| NvidiaNIM_alphafold2 | alphafold_get_prediction | ESMFold_predict_structure |
| get_diffdock_info | NvidiaNIM_boltz2 | Manual docking |
| NCBIDatasets_suggest_taxonomy | UniProtTaxonomy_get_taxon | Manual classification |
| ChEMBL_search_drugs | drugbank_vocab_search | PubChem bioassays |
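The fallback chains above can be driven by a generic runner. In this sketch, `call_tool` is a caller-supplied function (hypothetical; it stands in for however tools are invoked in your environment), and arguments are supplied per tool because each tool in a chain may expect different parameters. Only entries that are tool names are included; manual steps and "PubChem bioassays" are left to the caller once the runner raises.

```python
# Tool-name fallbacks from the table above; manual steps are not automated.
FALLBACK_CHAINS = {
    "NvidiaNIM_alphafold2": ["alphafold_get_prediction", "ESMFold_predict_structure"],
    "get_diffdock_info": ["NvidiaNIM_boltz2"],
    "NCBIDatasets_suggest_taxonomy": ["UniProtTaxonomy_get_taxon"],
    "ChEMBL_search_drugs": ["drugbank_vocab_search"],
}


def run_with_fallback(primary: str, arguments_for: dict, call_tool):
    """Try the primary tool, then each fallback; return (tool_name, result).

    arguments_for: map of tool name -> arguments dict for that tool.
    call_tool: callable(tool_name, arguments) supplied by the caller.
    """
    errors = []
    for tool in [primary, *FALLBACK_CHAINS.get(primary, [])]:
        if tool not in arguments_for:
            errors.append(f"{tool}: no arguments supplied")
            continue
        try:
            result = call_tool(tool, arguments_for[tool])
        except Exception as exc:  # a failed tool should not abort the chain
            errors.append(f"{tool}: {exc}")
            continue
        if result:
            return tool, result
        errors.append(f"{tool}: empty result")
    raise RuntimeError("all tools failed, use the manual fallback: " + "; ".join(errors))
```

Returning the name of the tool that succeeded lets the report cite the actual source, which the citation requirement depends on.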
References
| File | Contents |
|---|---|
| TOOLS_REFERENCE.md | Complete tool documentation |
| phase_details.md | Detailed code examples and procedures for each phase |
| report_template.md | Report template with section headers, checklist, and evidence grading |
| CHECKLIST.md | Pre-delivery verification checklist (quality, citations, docking) |
| EXAMPLES.md | Full worked examples (coronavirus, CRKP, limited-info scenarios) |