ToolUniverse Tool Description Optimization

Optimize tool descriptions in ToolUniverse JSON configuration files to ensure they are clear, complete, and user-friendly.

When to Apply This Skill

Use when:

Reviewing newly created tool descriptions
User asks "are these tools easy to understand?"
Improving existing tool documentation
Adding new tools to ToolUniverse
User mentions tool usability, clarity, or documentation

Quick Optimization Checklist

Tool Description Review:
- [ ] Prerequisites stated (packages, API keys, accounts)
- [ ] Critical abbreviations expanded on first use
- [ ] Required vs optional parameters clear
- [ ] Mutually exclusive options numbered/labeled
- [ ] Parameter guidance includes trade-offs
- [ ] Filter syntax shows available fields
- [ ] File size warnings where relevant
- [ ] Examples show realistic usage

Critical Improvements (Fix Immediately)

1. Clarify Required Input Requirements

Problem: Users don't know if they need ONE input or ALL inputs.

Fix: Use "Required: Provide ONE input type" for mutually exclusive options.

// Before
"description": "Process BED regions, motifs, or gene lists..."

// After
"description": "Process genomic data. **Required: Provide ONE input type** - (1) BED regions, (2) DNA motif, or (3) gene list. Analyzes..."

Number the options and use bold for "Required".

2. Add Prerequisites to First Tool

Problem: Users don't know what to install/configure before use.

Fix: Add prerequisites note to first tool in each family.

"description": "Query single-cell data. Prerequisites: Requires 'package-name' (install: pip install tooluniverse[extra]). Returns..."

Include:

Package installation command
API key requirements
Account creation instructions

3. Expand Critical Abbreviations

Problem: New users don't understand technical terms.

Fix: Expand on first use with format: "Abbreviation (Full Name)".

Common abbreviations to expand:

H5AD → HDF5-based AnnData
RPM → Reads Per Million
TSS → Transcription Start Site
TAD → Topologically Associating Domain
DRS → Data Repository Service
API names (MACS2, IUPAC, etc.)

// Before
"description": "Download H5AD files..."

// After  
"description": "Download H5AD (HDF5-based AnnData) files..."

High-Priority Improvements

4. Enhance Filter Parameter Descriptions

Problem: Users don't know what fields are available or what syntax to use.

Fix: List operators, common fields, and provide multiple examples.

"parameter_name": {
  "type": "string",
  "description": "Filter using SQL-like syntax. Format: 'field == \"value\"'. Operators: ==, !=, in, <, >, <=, >=. Combine with 'and'/'or'. Common fields: tissue, cell_type, disease, assay, sex, ethnicity. Examples: 'tissue == \"lung\"', 'disease == \"COVID-19\" and tissue == \"lung\"', 'cell_type in [\"T cell\", \"B cell\"]'."
}

Include:

Syntax format
Available operators
List of 5-10 common fields
2-3 diverse examples

5. Improve Parameter Guidance

Problem: Users don't know which value to choose or what trade-offs exist.

Fix: Explain what each value means and provide recommendations.

// Before
"threshold": "Q-value threshold (05=1e-5, 10=1e-10, 20=1e-20)"

// After
"threshold": "Peak calling stringency. '05'=1e-5 (permissive, more peaks, broad features), '10'=1e-10 (moderate, balanced), '20'=1e-20 (strict, high confidence, narrow peaks). Default '05' suitable for most analyses. Higher values = fewer but more confident peaks."

For each parameter option, explain:

What it means practically
When to use it
Trade-offs involved
Recommended default

6. Number Mutually Exclusive Options

Problem: Users provide multiple options when only one is allowed.

Fix: Label options as "Option 1", "Option 2", etc.

"bed_data": {
  "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). Example: 'chr1\\t1000\\t2000'."
},
"motif": {
  "description": "**Option 2**: DNA sequence motif in IUPAC notation. Use: A/T/G/C, W=A|T, S=G|C. Example: 'CANNTG'."
},
"gene_list": {
  "description": "**Option 3**: Gene symbols as array. Example: ['TP53', 'MDM2']."
}

Medium-Priority Improvements

7. Add File Size Warnings

For tools that download or return large files:

"description": "Download contact matrices. Note: Files can be large (GBs), check file_size in metadata before downloading. Returns..."

8. Clarify Web Form vs API Results

When tool returns submission URL instead of direct results:

"description": "Perform enrichment analysis. Note: Returns submission URL (web form-based analysis). Analyzes..."

9. Explain File Type Differences

For tools with multiple format options:

"file_type": "File format. Common types: 'cooler' (multi-resolution contact matrices), 'pairs' (aligned read pairs), 'hic' (Juicer format), 'mcool' (multi-resolution cooler)."

Description Structure Template

{
  "name": "Tool_operation_name",
  "type": "ToolClassName",
  "description": "[Action verb] to [purpose]. [Prerequisites if first tool]. [Key data/features]. [Required inputs if mutually exclusive]. [Note about limitations/requirements]. Use for: [use case 1], [use case 2], [use case 3].",
  "parameter": {
    "properties": {
      "param_name": {
        "type": "string",
        "description": "[What it does]. [Format/syntax if applicable]. [Options with trade-offs]. [Examples]. [Recommendation if applicable]."
      }
    }
  }
}

Description Quality Checklist

Clarity Checks

Purpose clear in first sentence
Technical terms expanded
Prerequisites stated upfront
Examples show realistic usage
"Use for:" section lists 3-5 concrete use cases

Completeness Checks

Required inputs clearly marked
Parameter choices explained
Limitations noted (file size, web form, etc.)
Available fields listed for filters
Default values recommended

Usability Checks

New users can understand without external docs
Users know what to provide
Users can make informed parameter choices
Error prevention (mutually exclusive options labeled)

Testing Description Quality

To verify description quality, ask:

Can a new user understand what the tool does?
- Read only the description (no docs)
- Should be clear within 30 seconds
Can a user provide correct inputs on first try?
- Required inputs obvious
- Format/syntax clear
- Mutually exclusive options labeled
Can a user choose appropriate parameters?
- Trade-offs explained
- Recommendations provided
- Defaults justified
Are prerequisites obvious?
- Installation instructions
- API keys/accounts
- File size warnings

Common Patterns by Tool Type

API Query Tools

"description": "Query [data type] from [source]. [Prerequisites]. Filter by [criteria]. Returns [output]. [Data scale]. Use for: [discovery], [analysis], [specific research tasks]."

Key elements:

What you're querying
How to filter
What you get back
Scale of data
Prerequisites

Data Download Tools

"description": "Download [file types] from [source]. [Format details]. [File size warning]. [Authentication requirement]. Use for: [offline analysis], [custom processing], [integration]."

Key elements:

File formats available
Size warning
Authentication needs
What's in the files

Enrichment/Analysis Tools

"description": "Analyze [input type] to find [results]. **Required: Provide ONE input type** - (1) [option], (2) [option], (3) [option]. Compares against [database/background]. [Result format]. Use for: [identifying], [discovering], [predicting]."

Key elements:

Input requirements clear
Options numbered
What gets compared
What you learn

Validation Commands

After updating descriptions, validate JSON syntax:

# Validate all tool JSONs
python3 -m json.tool src/tooluniverse/data/your_tools.json > /dev/null && echo "✓ Valid"

# Check all tools in category
for f in src/tooluniverse/data/*_tools.json; do
  python3 -m json.tool "$f" > /dev/null && echo "✓ $f valid" || echo "✗ $f invalid"
done

Example: Before and After

Before (Unclear):

{
  "name": "Tool_enrichment",
  "description": "Perform enrichment with tool to find factors.",
  "parameter": {
    "properties": {
      "bed": {"description": "BED data"},
      "motif": {"description": "Motif"},
      "genes": {"description": "Genes"},
      "threshold": {"description": "Threshold value"}
    }
  }
}

After (Clear):

{
  "name": "Tool_enrichment_analysis",
  "description": "Identify transcription factors enriched in your data. **Required: Provide ONE input type** - (1) BED genomic regions, (2) DNA sequence motif (IUPAC notation), or (3) gene symbol list. Compares against 400,000+ ChIP-seq experiments. Returns ranked proteins with enrichment scores. Note: Returns submission URL (web-based analysis). Use for: identifying regulators of regions, finding proteins bound to motifs, discovering transcription factors regulating genes.",
  "parameter": {
    "properties": {
      "bed_data": {
        "description": "**Option 1**: BED format regions (tab-separated: chr, start, end). For finding proteins bound to genomic regions. Example: 'chr1\\t1000\\t2000'."
      },
      "motif": {
        "description": "**Option 2**: DNA motif in IUPAC notation (A/T/G/C, W=A|T, S=G|C, M=A|C, K=G|T, R=A|G, Y=C|T). Example: 'CANNTG' (E-box)."
      },
      "gene_list": {
        "description": "**Option 3**: Gene symbols as array or single gene. Example: ['TP53', 'MDM2', 'CDKN1A']."
      },
      "threshold": {
        "description": "Peak stringency. '05'=1e-5 (permissive, more peaks), '10'=1e-10 (moderate), '20'=1e-20 (strict, high confidence). Default '05' suitable for most analyses."
      }
    }
  }
}

Summary

Priority order for optimization:

Critical (fix immediately):
- Clarify required inputs
- Add prerequisites
- Expand abbreviations
High (fix soon):
- Enhance filter descriptions
- Improve parameter guidance
- Number mutually exclusive options
Medium (nice to have):
- Add file size warnings
- Clarify web form vs API
- Explain file type differences

Expected impact: 50-75% reduction in user errors, 50-67% faster time to first successful use.

devtu-optimize-descriptions

ToolUniverse Tool Description Optimization

When to Apply This Skill

Quick Optimization Checklist

Critical Improvements (Fix Immediately)

1. Clarify Required Input Requirements

2. Add Prerequisites to First Tool

3. Expand Critical Abbreviations

High-Priority Improvements

4. Enhance Filter Parameter Descriptions

5. Improve Parameter Guidance

6. Number Mutually Exclusive Options

Medium-Priority Improvements

7. Add File Size Warnings

8. Clarify Web Form vs API Results

9. Explain File Type Differences

Description Structure Template

Description Quality Checklist

Clarity Checks

Completeness Checks

Usability Checks

Testing Description Quality

Common Patterns by Tool Type

API Query Tools

Data Download Tools

Enrichment/Analysis Tools

Validation Commands

Example: Before and After

Summary